Jun Yang

D308 Levine Science Research Center
Box 90129
Duke University
Durham, North Carolina 27708-0129
Tel: 919-660-6587
Fax: 919-660-6519
Web: http://www.cs.duke.edu/~junyang/
Email: <cs.duke.edu, junyang>

Research Interests


Professional Experience


Published work:
  1. Haibo Xiu, Pankaj K. Agarwal, and Jun Yang. "PARQO: penalty-aware robust plan selection in query optimization." Proceedings of the VLDB Endowment, 17(13), 2024.
  2. Rickard Stureborg, Jenna Nichols, Bhuwan Dhingra, Jun Yang, Walter Orenstein, Robert A. Bednarczyk, and Lavanya Vasudevan. "Development and validation of VaxConcerns: a taxonomy of vaccine concerns and misinformation with crowdsource-viability." Vaccine, 2024. [link]
  3. Vincent Capol, Yuxi Liu, Haibo Xiu, and Jun Yang. "CrypQ: a database benchmark based on dynamic, ever-evolving Ethereum data." In Proceedings of the 2024 TPC Technology Conference on Performance Evaluation and Benchmarking, Guangzhou, China, August 2024. [link]
  4. Jun Yang, Amir Gilad, Yihao Hu, Hanze Meng, Zhengjie Miao, Sudeepa Roy, and Kristin Stephens-Martinez. "What teaching databases taught us about researching databases: extended talk abstract." In Proceedings of the 2024 International Workshop on Data Systems Education: Bridging Education Practice with Education Research, pages 1-6, Santiago, Chile, June 2024.
  5. Rickard Stureborg, Sanxing Chen, Roy Xie, Aayushi Patel, Christopher Li, Chloe Zhu, Tingnan Hu, Jun Yang, and Bhuwan Dhingra. "Tailoring vaccine messaging with common-ground opinions." In Findings of the Association for Computational Linguistics: NAACL 2024, pages 2553-2575, Mexico City, Mexico, June 2024.
  6. Pankaj K. Agarwal, Xiao Hu, Stavros Sintos, and Jun Yang. "On reporting durable patterns in temporal proximity graphs." In Proceedings of the 2024 ACM Symposium on Principles of Database Systems, Santiago, Chile, June 2024.
  7. Yihao Hu, Amir Gilad, Kristin Stephens-Martinez, Sudeepa Roy, and Jun Yang. "Qr-Hint: actionable hints towards correcting wrong SQL queries." Proceedings of the ACM on Management of Data, 2(3), May 2024.
  8. Sudeepa Roy, Amir Gilad, Yihao Hu, Hanze Meng, Zhengjie Miao, Kristin Stephens-Martinez, and Jun Yang. "How database theory helps teach relational queries in database education (invited talk)." In Proceedings of the 2024 International Conference on Database Theory, Paestum, Italy, March 2024.
  9. Pankaj Agarwal, Rahul Raychaudhury, Stavros Sintos, and Jun Yang. "Computing data distribution from query selectivities." In Proceedings of the 2024 International Conference on Database Theory, Paestum, Italy, March 2024.
  10. Georgia Koutrika and Jun Yang, ed. Proceedings of the VLDB Endowment, 2023. 16(1–13). September 2022 - September 2023.
  11. Hanze Meng, Zhengjie Miao, Amir Gilad, Sudeepa Roy, and Jun Yang. "Characterizing and verifying queries via CInsGen." In Proceedings of the 2023 ACM SIGMOD International Conference on Management of Data, Seattle, Washington, USA, June 2023. Demonstration track.
  12. Rickard Stureborg, Bhuwan Dhingra, and Jun Yang. "Interface design for crowdsourcing hierarchical multi-label text annotations." In Proceedings of the 2023 International Conference on Human Factors in Computing Systems, Hamburg, Germany, April 2023.
  13. Sudeepa Roy and Jun Yang, ed. Special Issue on Widening the Impact of Data Engineering through Innovations in Education, Interfaces, and Features, IEEE Data Engineering Bulletin, September 2022. 45(3). [link]
  14. Xiao Hu, Stavros Sintos, Junyang Gao, Pankaj Agarwal, and Jun Yang. "Computing complex temporal join queries efficiently." In Proceedings of the 2022 ACM SIGMOD International Conference on Management of Data, Philadelphia, Pennsylvania, USA, June 2022.
  15. Xiao Hu, Yuxi Liu, Haibo Xiu, Pankaj Agarwal, Debmalya Panigrahi, Sudeepa Roy, and Jun Yang. "Selectivity functions of range queries are learnable." In Proceedings of the 2022 ACM SIGMOD International Conference on Management of Data, Philadelphia, Pennsylvania, USA, June 2022.
  16. Amir Gilad, Zhengjie Miao, Sudeepa Roy, and Jun Yang. "Understanding queries by conditional instances." In Proceedings of the 2022 ACM SIGMOD International Conference on Management of Data, Philadelphia, Pennsylvania, USA, June 2022.
  17. Yihao Hu, Zhengjie Miao, Zhiming Leong, Haechan Lim, Zachary Zheng, Sudeepa Roy, Kristin Stephens-Martinez, and Jun Yang. "I-Rex: an interactive relational query debugger for SQL." In Proceedings of the 2022 ACM Technical Symposium on Computer Science Education, Providence, Rhode Island, USA, March 2022. Demonstration track.
  18. Guoliang Li, Guo Yu, Jun Yang, and Ju Fan, ed. Special Topic on New Techniques of Database Systems, Journal of Software (Ruanjian Xuebao), March 2022. 33(3).
  19. Chengkai Li and Jun Yang, ed. Special Issue on Data Engineering Challenges in Combating Misinformation, IEEE Data Engineering Bulletin, September 2021. 44(3). [link]
  20. Pankaj K. Agarwal, Xiao Hu, Stavros Sintos, and Jun Yang. "Dynamic enumeration of similarity joins." In Proceedings of the 2021 International Colloquium on Automata, Languages, and Programming, pages 11:1-11:19, Glasgow, Scotland, July 2021.
  21. Junyang Gao, Yifan Xu, Pankaj Agarwal, and Jun Yang. "Efficiently answering durability prediction queries." In Proceedings of the 2021 ACM SIGMOD International Conference on Management of Data, pages 591-604, Xi'an, China, June 2021.
  22. Junyang Gao, Stavros Sintos, Pankaj K. Agarwal, and Jun Yang. "Durable top-k instant-stamped temporal records with user-specified scoring functions." In Proceedings of the 2021 International Conference on Data Engineering, pages 720-731, Chania, Greece, April 2021.
  23. Chenghong Wang, David Pujol, Yanping Zhang, Johes Bater, Matthew Lentz, Ashwin Machanavajjhala, Kartik Nayak, Lavanya Vasudevan, and Jun Yang. "Poirot: private contact summary aggregation." In Proceedings of the 2020 Privacy Preserving Machine Learning (NeurIPS Workshop), Virtual, December 2020.
  24. Yanping Zhang, Chenghong Wang, David Pujol, Johes Bater, Matthew Lentz, Ashwin Machanavajjhala, Kartik Nayak, Lavanya Vasudevan, and Jun Yang. "Poirot: private contact summary aggregation." In Proceedings of the 2020 ACM Conference on Embedded Networked Sensor Systems, Yokohama, Japan, November 2020. Part of the research poster track on COVID-19 pandemic response.
  25. Zhengjie Miao, Tiangang Chen, Alexander Bendeck, Kevin Day, Sudeepa Roy, and Jun Yang. "I-Rex: an interactive relational query explainer for SQL." Proceedings of the VLDB Endowment, 13(12):2997-3000, August 2020. Demonstration description.
  26. Brett Walenz, Stavros Sintos, Sudeepa Roy, and Jun Yang. "Learning to sample: counting with complex queries." Proceedings of the VLDB Endowment, 13(3):390-402, 2019.
  27. Stavros Sintos, Pankaj Agarwal, and Jun Yang. "Selecting data to clean for fact checking: minimizing uncertainty vs. maximizing surprise." Proceedings of the VLDB Endowment, 12(13):2408-2421, 2019.
  28. Junyang Gao, Xian Li, Yifan Ethan Xu, Bunyamin Sisman, Xin Luna Dong, and Jun Yang. "Efficient knowledge graph accuracy evaluation." Proceedings of the VLDB Endowment, 12(11):1679-1691, 2019. [report]
  29. Naeemul Hassan, Chengkai Li, Jun Yang, and Cong Yu, ed. Special Issue on Combating Digital Misinformation and Disinformation, ACM Journal of Data and Information Quality, July 2019. 11(3). [link]
  30. Zhengjie Miao, Sudeepa Roy, and Jun Yang. "RATest: explaining wrong relational queries using small examples." In Proceedings of the 2019 ACM SIGMOD International Conference on Management of Data, pages 1961-1964, Amsterdam, Netherlands, June 2019. Demonstration track. [paper]
  31. Zhengjie Miao, Sudeepa Roy, and Jun Yang. "Explaining wrong queries using small examples." In Proceedings of the 2019 ACM SIGMOD International Conference on Management of Data, pages 503-520, Amsterdam, Netherlands, June 2019. [paper]
  32. Guoliang Li, Jun Yang, João Gama, Juggapong Natwichai, and Yongxin Tong, ed. Proceedings of the 2019 International Conference on Database Systems for Advanced Applications, Chiang Mai, Thailand, April 2019. Lecture Notes in Computer Science 11447. Springer. ISBN: 978-3-030-18578-7.
  33. Matthias Boehm, Arun Kumar, and Jun Yang. Data management in machine learning systems. Morgan & Claypool Publishers, February 2019. [paper]
  34. Bill Adair, Chengkai Li, Jun Yang, and Cong Yu. "Automated pop-up fact-checking: challenges and progress." In Proceedings of the 2019 Computation+Journalism Symposium, Miami, Florida, USA, February 2019. Informal publication. [paper]
  35. Jun Yang, Pankaj K. Agarwal, Sudeepa Roy, Brett Walenz, You Wu, Cong Yu, and Chengkai Li. "Query perturbation analysis: an adventure of database researchers in fact-checking." IEEE Data Engineering Bulletin, 41(3):28-42, 2018. Invited contribution. [paper]
  36. Yuhao Wen, Xiaodan Zhu, Sudeepa Roy, and Jun Yang. "Interactive summarization and exploration of top aggregate query answers." Proceedings of the VLDB Endowment, 11(13):2196-2208, 2018. [paper]
  37. Junyang Gao, Pankaj Agarwal, and Jun Yang. "Durable top-k queries on temporal data." Proceedings of the VLDB Endowment, 11(13):2223-2235, 2018. [paper]
  38. Yuhao Wen, Xiaodan Zhu, Sudeepa Roy, and Jun Yang. "QAGView: interactively summarizing high-valued aggregate query answers." In Proceedings of the 2018 ACM SIGMOD International Conference on Management of Data, pages 1709-1712, Houston, Texas, USA, June 2018. Demonstration track. [paper]
  39. Bill Adair, Chengkai Li, Jun Yang, and Cong Yu. "Progress toward “the holy grail”: the continued quest to automate fact-checking." In Proceedings of the 2017 Computation+Journalism Symposium, Evanston, Illinois, USA, October 2017. Informal publication.
  40. Brett Walenz, Sudeepa Roy, and Jun Yang. "Optimizing iceberg queries with complex joins." In Proceedings of the 2017 ACM SIGMOD International Conference on Management of Data, pages 1243-1258, Chicago, Illinois, USA, May 2017. [paper]
  41. Arun Kumar, Matthias Boehm, and Jun Yang. "Data management in machine learning: challenges, techniques, and systems." In Proceedings of the 2017 ACM SIGMOD International Conference on Management of Data, pages 1717-1722, Chicago, Illinois, USA, May 2017. [paper]
  42. Semih Salihoglu, Wenchao Zhou, Rada Chirkova, Jun Yang, and Dan Suciu, ed. Proceedings of the 2017 ACM SIGMOD International Conference on Management of Data, Chicago, Illinois, USA, May 2017.
  43. Risi Thonangi and Jun Yang. "On log-structured merge for solid-state drives." In Proceedings of the 2017 International Conference on Data Engineering, pages 683-694, San Diego, California, USA, April 2017. [paper]
  44. Botong Huang and Jun Yang. "Cümülön-D: data analytics in a dynamic spot market." Proceedings of the VLDB Endowment, 10(8):865-876, April 2017. [paper]
  45. You Wu, Junyang Gao, Pankaj K. Agarwal, and Jun Yang. "Finding diverse, high-value representatives on a surface of answers." Proceedings of the VLDB Endowment, 10(7):793-804, March 2017. [paper]
  46. You Wu, Pankaj K. Agarwal, Chengkai Li, Jun Yang, and Cong Yu. "Computational fact checking through query perturbations." ACM Transactions on Database Systems, 42(1):4:1-4:41, March 2017. [paper]
  47. Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Top-k preferences in high dimensions." IEEE Transactions on Knowledge and Data Engineering, 28(2):311-325, 2016. Invited as a special selection from ICDE 2014. [paper]
  48. Brett Walenz and Jun Yang. "Perturbation analysis of database queries." Proceedings of the VLDB Endowment, 9(14):1635-1646, September 2016. [paper and report]
  49. Brett Walenz, Junyang Gao, Emre Sonmez, Yubo Tian, Yuhao Wen, Charles Xu, Bill Adair, and Jun Yang. "Fact checking congressional voting claims." In Proceedings of the 2016 Computation+Journalism Symposium, Stanford, California, USA, September 2016. Informal publication. [paper]
  50. Naeemul Hassan, Bill Adair, James T. Hamilton, Chengkai Li, Mark Tremayne, Jun Yang, and Cong Yu. "The quest to automate fact-checking." In Proceedings of the 2015 Computation+Journalism Symposium, New York City, New York, USA, October 2015. Informal publication. [paper]
  51. Botong Huang, Nicholas W. D. Jarrett, Shivnath Babu, Sayan Mukherjee, and Jun Yang. "Cümülön: matrix-based data analytics in the cloud with spot instances." Proceedings of the VLDB Endowment, 9(3):156-167, September 2015. [paper and report]
  52. You Wu, Boulos Harb, Jun Yang, and Cong Yu. "Efficient evaluation of object-centric exploration queries for visualization." Proceedings of the VLDB Endowment, 8(12):1752-1763, August 2015. [paper]
  53. Jun Yang, ed. Special Issue on Visionary Ideas in Data Management, ACM SIGMOD Record, June 2015. 44(2). [link]
  54. You Wu, Pankaj K. Agarwal, Chengkai Li, Jun Yang, and Cong Yu. "Toward computational fact-checking." Proceedings of the VLDB Endowment, 7(7):589-600, 2014. [paper]
  55. Naeemul Hassan, Afroza Sultana, You Wu, Gensheng Zhang, Chengkai Li, Jun Yang, and Cong Yu. "Data in, fact out: automated monitoring of facts by FactWatcher." Proceedings of the VLDB Endowment, 7(13), 2014. Demonstration track. Winner of the Excellent Demonstration Award. [paper]
  56. Brett Walenz, You Wu, Seokhyun Song, Emre Sonmez, Eric Wu, Kevin Wu, Pankaj K. Agarwal, Jun Yang, Naeemul Hassan, Afroza Sultana, Gensheng Zhang, Chengkai Li, and Cong Yu. "Finding, monitoring, and checking claims computationally based on structured data." In Proceedings of the 2014 Computation+Journalism Symposium, New York City, New York, USA, October 2014. Informal publication, with contents drawn from SIGMOD 2014 and VLDB 2014 demos. [paper]
  57. Bill Adair, Jun Yang, and the uclaim/icheck Team. "Turning computers into fact-checkers." American Journalism Review, October 2014. Invited contribution. [link and paper]
  58. Botong Huang, Nicholas W. D. Jarrett, Shivnath Babu, Sayan Mukherjee, and Jun Yang. "Cumulon: cloud-based statistical analysis from users' perspective." IEEE Data Engineering Bulletin, 37(3):77-89, September 2014. Invited contribution. [paper]
  59. Rada Chirkova and Jun Yang, ed. Proceedings of the 2014 International Workshop on Bringing the Value of Big Data to Users, Hangzhou, China, September 2014. [link]
  60. You Wu, Brett Walenz, Peggy Li, Andrew Shim, Emre Sonmez, Pankaj K. Agarwal, Chengkai Li, Jun Yang, and Cong Yu. "iCheck: computationally combating “lies, d—ned lies, and statistics”." In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, Snowbird, Utah, USA, June 2014. Demonstration track. [paper]
  61. Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Top-k preferences in high dimensions." In Proceedings of the 2014 International Conference on Data Engineering, Chicago, Illinois, USA, March 2014. Results in this paper are subsumed by those in the TKDE 2016 paper by the same authors.
  62. Afroza Sultana, Naeemul Hassan, Chengkai Li, Jun Yang, and Cong Yu. "Incremental discovery of prominent situational facts." In Proceedings of the 2014 International Conference on Data Engineering, Chicago, Illinois, USA, March 2014. [paper and slides]
  63. Risi Thonangi and Jun Yang. "Permuting data on random-access block storage." Proceedings of the VLDB Endowment, 6(9):721-732, 2013. [errata, paper, and report]
  64. Botong Huang, Shivnath Babu, and Jun Yang. "Cumulon: optimizing statistical data analysis in the cloud." In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, New York City, New York, USA, June 2013. [paper and slides]
  65. Yi Zhang, Kristian Lum, and Jun Yang. "Failure-aware cascaded suppression in wireless sensor networks." IEEE Transactions on Knowledge and Data Engineering, 25(5):1042-1055, May 2013. [paper and supplemental]
  66. Pankaj K. Agarwal, Lars Arge, Sathish Govindarajan, Jun Yang, and Ke Yi. "Efficient external memory structures for range-aggregate queries." Computational Geometry: Theory and Applications, 46(3):358-370, April 2013. [paper]
  67. Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Subscriber assignment for wide-area content-based publish/subscribe." IEEE Transactions on Knowledge and Data Engineering, 24(10):1833-1847, 2012. Invited as a special selection from ICDE 2011. [paper and supplemental]
  68. S. N. Lahiri, XuanLong Nguyen, Jun Yang, Zhengyuan Zhu, and P. Banerjee. "Wireless sensor networks: statistical issues and challenges." Journal of the Indian Statistical Association, 50(1–2):151-191, 2012.
  69. Rada Chirkova and Jun Yang. "Materialized views." Foundations and Trends in Databases, 4(4):295-405, 2012. [paper]
  70. Risi Thonangi, Shivnath Babu, and Jun Yang. "A practical concurrent index for solid-state drives." In Proceedings of the 2012 International Conference on Information and Knowledge Management, pages 1332-1341, Maui, Hawaii, USA, October 2012. Databases track. [paper and report]
  71. You Wu, Pankaj K. Agarwal, Chengkai Li, Jun Yang, and Cong Yu. "On “one of the few” objects." In Proceedings of the 2012 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1487-1495, Beijing, China, August 2012. [paper and report]
  72. Yi Zhang and Jun Yang. "Optimizing I/O for big array analytics." Proceedings of the VLDB Endowment, 5(8):764-775, June 2012. [paper]
  73. Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Processing a large number of continuous preference top-k queries." In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pages 397-408, Scottsdale, Arizona, USA, May 2012. [paper]
  74. Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Processing and notifying range top-k subscriptions." In Proceedings of the 2012 International Conference on Data Engineering, pages 810-821, Washington DC, USA, April 2012. [paper and report]
  75. Yi Zhang, Kamesh Munagala, and Jun Yang. "Storing matrices on disk: theory and practice revisited." Proceedings of the VLDB Endowment, 4(11):1075-1086, August 2011. [paper and report]
  76. James S. Clark, Pankaj K. Agarwal, David M. Bell, Paul G. Flikkema, Alan Gelfand, Xuanlong Nguyen, Eric Ward, and Jun Yang. "Inferential ecosystem models, from network data to prediction." Ecological Applications, 21(5):1523-1536, July 2011.
  77. Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Subscriber assignment for wide-area content-based publish/subscribe." In Proceedings of the 2011 International Conference on Data Engineering, pages 267-278, Hannover, Germany, April 2011. Results in this paper are subsumed by those in the TKDE 2012 paper by the same authors. [paper and report]
  78. Sarah Cohen, Chengkai Li, Jun Yang, and Cong Yu. "Computational journalism: a call to arms to database researchers." In Proceedings of the 2011 Conference on Innovative Data Systems Research, Asilomar, California, USA, January 2011. Outrageous ideas and vision track. Third-place winner of the Best Outrageous Ideas and Vision Track Paper Competition sponsored by the Computing Community Consortium. [paper and slides]
  79. Lei Chen, Changjie Tang, Jun Yang, and Yunjun Gao, ed. Proceedings of the 2010 International Conference on Web-Age Information Management, Jiuzhaigou, Sichuan, China, July 2010. Lecture Notes in Computer Science 6184. Springer. ISBN: 978-3-642-14245-1.
  80. Yi Zhang, Weiping Zhang, and Jun Yang. "I/O-efficient statistical computing with RIOT." In Proceedings of the 2010 International Conference on Data Engineering, pages 1157-1160, Long Beach, California, USA, March 2010. Demonstration track. [paper and poster]
  81. Jun Yang, Kamesh Munagala, and Adam Silberstein. "Data aggregation in sensor networks." In Encyclopedia of Database Systems. Ling Liu and M. Tamer Özsu, ed. Springer. 2009. Invited contribution.
  82. Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Generating wide-area content-based publish/subscribe workloads." In Proceedings of the 2009 Workshop on Networking Meets Databases, Big Sky, Montana, USA, October 2009. [paper]
  83. Pankaj K. Agarwal, Junyi Xie, Jun Yang, and Hai Yu. "Input-sensitive scalable continuous join query processing." ACM Transactions on Database Systems, 34(3):1-41, August 2009. [paper]
  84. Fei Chen, Byron J. Gao, AnHai Doan, Jun Yang, and Raghu Ramakrishnan. "Optimizing complex extraction programs over evolving text data." In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pages 321-334, Providence, Rhode Island, USA, June 2009. [paper]
  85. Risi Thonangi, Hao He, AnHai Doan, Haixun Wang, and Jun Yang. "Weighted proximity best-joins for information retrieval." In Proceedings of the 2009 International Conference on Data Engineering, pages 234-245, Shanghai, China, March 2009. [paper]
  86. Yi Zhang, Herodotos Herodotou, and Jun Yang. "RIOT: I/O-efficient numerical computing without SQL." In Proceedings of the 2009 Conference on Innovative Data Systems Research, Asilomar, California, USA, January 2009. [paper and slides]
  87. Badrish Chandramouli and Jun Yang. "End-to-end support for joins in large-scale publish/subscribe systems." In Proceedings of the 2008 International Conference on Very Large Data Bases, pages 434-450, Auckland, New Zealand, August 2008. Infrastructure track. [paper]
  88. Badrish Chandramouli, Jun Yang, Pankaj K. Agarwal, Albert Yu, and Ying Zheng. "ProSem: scalable wide-area publish/subscribe." In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pages 1315-1318, Vancouver, Canada, June 2008. Demonstration track. Acceptance rate: 31.9 percent. [paper]
  89. Junyi Xie, Jun Yang, Yuguo Chen, Haixun Wang, and Philip S. Yu. "A sampling-based approach to information recovery." In Proceedings of the 2008 International Conference on Data Engineering, pages 476-485, Cancun, Mexico, April 2008. Short presentation track. Acceptance rate: 19.2 percent of 715. Full paper. [paper]
  90. Fei Chen, AnHai Doan, Jun Yang, and Raghu Ramakrishnan. "Efficient information extraction over evolving text data." In Proceedings of the 2008 International Conference on Data Engineering, pages 943-952, Cancun, Mexico, April 2008. Acceptance rate: 12.1 percent of 715. [paper]
  91. Magdalena Balazinska, Amol Deshpande, Alexandros Labrinidis, Qiong Luo, Samuel Madden, and Jun Yang. "Report on the fourth international workshop on data management for sensor networks (DMSN 2007)." ACM SIGMOD Record, 36(4):53-55, 2007.
  92. Adam Silberstein, Gavino Puggioni, Alan E. Gelfand, Kamesh Munagala, and Jun Yang. "Suppression and failures in sensor data: a Bayesian approach." In Proceedings of the 2007 International Conference on Very Large Data Bases, pages 842-853, Vienna, Austria, September 2007. Infrastructure track. Acceptance rate: 45 out of 275. [paper]
  93. Badrish Chandramouli, Jeff M. Phillips, and Jun Yang. "Value-based notification conditions in large-scale publish/subscribe systems." In Proceedings of the 2007 International Conference on Very Large Data Bases, pages 878-889, Vienna, Austria, September 2007. Infrastructure track. Acceptance rate: 45 out of 275. [paper]
  94. Magdalena Balazinska, Amol Deshpande, Qiong Luo, and Jun Yang, ed. Proceedings of the 2007 International Workshop on Data Management for Sensor Networks, Vienna, Austria, September 2007.
  95. Hao He, Haixun Wang, Jun Yang, and Philip S. Yu. "BLINKS: ranked keyword searches on graphs." In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, pages 305-316, Beijing, China, June 2007. Acceptance rate: 70 out of 480. [paper and report]
  96. Badrish Chandramouli, Christopher N. Bond, Shivnath Babu, and Jun Yang. "Query suspend and resume." In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, pages 557-568, Beijing, China, June 2007. Acceptance rate: 70 out of 480. [paper and report]
  97. Adam Silberstein and Jun Yang. "Many-to-many aggregation for sensor networks." In Proceedings of the 2007 International Conference on Data Engineering, pages 986-995, Istanbul, Turkey, April 2007. Acceptance rate: 122 out of 659. [paper and report]
  98. Badrish Chandramouli, Christopher Bond, Shivnath Babu, and Jun Yang. "On suspending and resuming dataflows." In Proceedings of the 2007 International Conference on Data Engineering, pages 1289-1291, Istanbul, Turkey, April 2007. Poster track. Acceptance rate: 60(+122) out of 659. Results in this paper are subsumed by those in the SIGMOD 2007 paper by the same authors.
  99. Adam Silberstein, Gregory Filpus, Kamesh Munagala, and Jun Yang. "Data-driven processing in sensor networks." In Proceedings of the 2007 Conference on Innovative Data Systems Research, pages 10-21, Asilomar, California, USA, January 2007. Acceptance rate: 34 out of 98. [paper]
  100. Junyi Xie and Jun Yang. "A survey of join processing in data streams." In Data Streams: Models and Algorithms. Charu C. Aggarwal, ed. Springer. November 2006. Invited contribution. [paper]
  101. Pankaj K. Agarwal, Junyi Xie, Jun Yang, and Hai Yu. "Scalable continuous query processing by tracking hotspots." In Proceedings of the 2006 International Conference on Very Large Data Bases, pages 31-42, Seoul, Korea, September 2006. Core database track. Acceptance rate: 46 out of 334. Results in this paper are subsumed by those in the 2009 TODS paper by the same authors. [paper and report]
  102. Adam Silberstein, Kamesh Munagala, and Jun Yang. "Energy-efficient monitoring of extreme values in sensor networks." In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pages 169-180, Chicago, Illinois, USA, June 2006. Acceptance rate: 58 out of 446. [paper]
  103. Adam Silberstein, Rebecca Braynard, and Jun Yang. "Constraint chaining: on energy-efficient continuous monitoring in sensor networks." In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pages 157-168, Chicago, Illinois, USA, June 2006. Acceptance rate: 58 out of 446. [paper]
  104. Badrish Chandramouli, Junyi Xie, and Jun Yang. "On the database/network interface in large-scale publish/subscribe systems." In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pages 587-598, Chicago, Illinois, USA, June 2006. Acceptance rate: 58 out of 446. [paper and report]
  105. Paul G. Flikkema, Pankaj K. Agarwal, James S. Clark, Carla Schlatter Ellis, Alan Gelfand, Kamesh Munagala, and Jun Yang. "Model-driven dynamic control of embedded wireless sensor networks." In Proceedings of the 2006 International Conference on Computational Science, pages 409-416, Reading, United Kingdom, May 2006.
  106. Haixun Wang, Hao He, Jun Yang, Philip S. Yu, and Jeffrey Xu Yu. "Dual labeling: answering graph reachability queries in constant time." In Proceedings of the 2006 International Conference on Data Engineering, Atlanta, Georgia, USA, April 2006. Acceptance rate: 89 out of 456. [paper]
  107. Adam Silberstein, Rebecca Braynard, and Jun Yang. "Energy-efficient continuous isoline queries in sensor networks." In Proceedings of the 2006 International Conference on Data Engineering, Atlanta, Georgia, USA, April 2006. Poster track. Results in this paper are subsumed by those in the SIGMOD 2006 paper by the same authors. [paper]
  108. Adam Silberstein, Rebecca Braynard, Carla Ellis, Kamesh Munagala, and Jun Yang. "A sampling-based approach to optimizing top-k queries in sensor networks." In Proceedings of the 2006 International Conference on Data Engineering, Atlanta, Georgia, USA, April 2006. Acceptance rate: 89 out of 456. [paper]
  109. Badrish Chandramouli, Jun Yang, and Amin Vahdat. "Distributed network querying with bounded approximate caching." In Proceedings of the 2006 International Conference on Database Systems for Advanced Applications, pages 374-388, Singapore, April 2006. Acceptance rate: 24.5 percent. [paper and report]
  110. Pankaj K. Agarwal, Junyi Xie, Jun Yang, and Hai Yu. "Monitoring continuous band-join queries over dynamic data." In Proceedings of the 2005 International Symposium on Algorithms and Computation, pages 349-359, Sanya, Hainan, China, December 2005. [paper]
  111. Hao He, Haixun Wang, Jun Yang, and Philip S. Yu. "Compact reachability labeling for graph-structured data." In Proceedings of the 2005 International Conference on Information and Knowledge Management, pages 594-601, Bremen, Germany, November 2005. Acceptance rate: 76 out of 425. [paper and report]
  112. Kamesh Munagala, Jun Yang, and Hai Yu. "Online view maintenance under a response-time constraint." In Proceedings of the 2005 European Symposium on Algorithms, pages 677-688, Palma de Mallorca, Spain, October 2005. [paper]
  113. Wenfei Fan, Zhaohui Wu, and Jun Yang, ed. Proceedings of the 2005 International Conference on Web-Age Information Management, Hangzhou, China, October 2005. Lecture Notes in Computer Science 3739. Springer. ISBN: 3-540-29227-6.
  114. Junyi Xie, Jun Yang, and Yuguo Chen. "On joining and caching stochastic streams." In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, pages 359-370, Baltimore, Maryland, USA, June 2005. Acceptance rate: 65 out of 431. [paper and report]
  115. Adam Silberstein, Hao He, Ke Yi, and Jun Yang. "BOXes: efficient maintenance of order-based labeling for dynamic XML data." In Proceedings of the 2005 International Conference on Data Engineering, pages 285-296, Tokyo, Japan, April 2005. Acceptance rate: 67 out of 521. [paper and report]
  116. Hao He, Junyi Xie, Jun Yang, and Hai Yu. "Asymmetric batch incremental view maintenance." In Proceedings of the 2005 International Conference on Data Engineering, pages 106-117, Tokyo, Japan, April 2005. Acceptance rate: 67 out of 521. [paper]
  117. Junfei Geng and Jun Yang. "AutoBib: automatic extraction of bibliographic information on the Web." In Proceedings of the 2004 International Database Engineering and Applications Symposium, pages 193-204, Coimbra, Portugal, July 2004. [paper]
  118. Ke Yi, Hao He, Ioana Stanoi, and Jun Yang. "Incremental maintenance of XML structural indexes." In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, pages 491-502, Paris, France, June 2004. Acceptance rate: 69 out of 431. [paper]
  119. Adam Silberstein and Jun Yang. "NeXSort: sorting XML in external memory." In Proceedings of the 2004 International Conference on Data Engineering, pages 695-706, Boston, Massachusetts, USA, April 2004. Acceptance rate: 63 out of 441. [paper and report]
  120. Hao He and Jun Yang. "Multiresolution indexing of XML for frequent queries." In Proceedings of the 2004 International Conference on Data Engineering, pages 683-694, Boston, Massachusetts, USA, April 2004. Acceptance rate: 63 out of 441. [paper and report]
  121. Jun Yang and Jennifer Widom. "Incremental computation and maintenance of temporal aggregates." The VLDB Journal, 12(3):262-283, 2003. [paper]
  122. Zhiyuan Chen, Li Chen, Jian Pei, Yufei Tao, Haixun Wang, Wei Wang, Jiong Yang, Jun Yang, and Donghui Zhang. "Recent progress on selected topics in database research: a report by nine young Chinese researchers working in the United States." Journal of Computer Science and Technology, 18(5):538-552, September 2003.
  123. Pankaj K. Agarwal, Lars Arge, Jun Yang, and Ke Yi. "I/O-efficient structures for orthogonal range-max and stabbing-max queries." In Proceedings of the 2003 European Symposium on Algorithms, pages 7-18, Budapest, Hungary, September 2003.
  124. Xiao Huang, Qiang Xue, and Jun Yang. "TupleRank and implicit relationship discovery in relational databases." In Proceedings of the 2003 International Conference on Web-Age Information Management, pages 445-457, Chengdu, China, August 2003. Acceptance rate: 30 out of 258. [paper and report]
  125. Ke Yi, Hai Yu, Jun Yang, Gangqiang Xia, and Yuguo Chen. "Efficient maintenance of materialized top-k views." In Proceedings of the 2003 International Conference on Data Engineering, pages 189-200, Bangalore, India, March 2003. Acceptance rate: 51 out of 378. [paper and report]
  126. Jun Yang. "Temporal data warehousing." Ph.D. Dissertation, Stanford University, August 2001.
  127. Jun Yang and Jennifer Widom. "Incremental computation and maintenance of temporal aggregates." In Proceedings of the 2001 International Conference on Data Engineering, pages 51-60, Heidelberg, Germany, April 2001. Acceptance rate: 14 percent. Results in this paper are subsumed by those in the 2003 VLDB Journal paper by the same authors.
  128. Wilburt Juan Labio, Jun Yang, Yingwei Cui, Hector Garcia-Molina, and Jennifer Widom. "Performance issues in incremental warehouse maintenance." In Proceedings of the 2000 International Conference on Very Large Data Bases, pages 461-472, Cairo, Egypt, September 2000. Acceptance rate: 53 out of 351.
  129. Jun Yang, Huacheng C. Ying, and Jennifer Widom. "TIP: a temporal extension to Informix." In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, page 596, Dallas, Texas, USA, May 2000. Demonstration track.
  130. Jun Yang, Huacheng C. Ying, and Jennifer Widom. "TIP: a temporal extension to Informix." In Proceedings of the 2000 International Conference on Extending Database Technology, Konstanz, Germany, March 2000. Demonstration track. An improved version was shown in SIGMOD 2000.
  131. Jun Yang and Jennifer Widom. "Temporal view self-maintenance." In Proceedings of the 2000 International Conference on Extending Database Technology, pages 395-412, Konstanz, Germany, March 2000. Acceptance rate: 16.7 percent.
  132. Hector Garcia-Molina, Wilburt Juan Labio, and Jun Yang. "Expiring data in a warehouse." In Proceedings of the 1998 International Conference on Very Large Data Bases, pages 500-511, New York City, New York, USA, August 1998. Acceptance rate: 16 percent.
  133. Jun Yang and Jennifer Widom. "Maintaining temporal views over non-temporal information sources for data warehousing." In Proceedings of the 1998 International Conference on Extending Database Technology, pages 389-403, Valencia, Spain, March 1998. Acceptance rate: 32 out of 191.
  134. Laura M. Haas, Donald Kossmann, Edward L. Wimmers, and Jun Yang. "Optimizing queries across diverse data sources." In Proceedings of the 1997 International Conference on Very Large Data Bases, pages 276-285, Athens, Greece, August 1997. Acceptance rate: 16 percent.
  135. Laura M. Haas, Donald Kossmann, Edward L. Wimmers, and Jun Yang. "An optimizer for heterogeneous systems with non-standard data and search capabilities." IEEE Data Engineering Bulletin, 19(4):37-44, December 1996.
  136. Steve G. Steinberg, Jun Yang, and Katherine A. Yelick. "Performance modeling and composition: a case study in cell simulation." In Proceedings of the 1996 International Parallel Processing Symposium, pages 68-74, Honolulu, Hawaii, USA, April 1996. Acceptance rate: 35 percent.
Technical reports:


Current funding:

Past funding:

Honors and Awards

External Presentations and Demonstrations

  1. "Revisiting Robust Query Optimization," invited talk at the Microsoft Gray Systems Lab Talk Series (10/08/2024), October 2024.
  2. "What Teaching Databases Taught me about Researching Databases," invited talk at the Northwest Database Society Seminar Series, University of Washington, Seattle (04/11/2024), invited talk at the DATA Lab Seminar Series, Northeastern University (05/10/2024), and keynote talk at the 3rd International Workshop on Data Systems Education (DataEd) at SIGMOD 2024, Santiago, Chile (06/09/2024), April 2024 - June 2024.
  3. Panel on Law and AI in China and the US, Duke University Law School, March 18, 2024.
  4. Panel on Good Reviewing Habits, VLDB 2023 PhD Workshop, Vancouver, Canada, August 28, 2023.
  5. "Adventure of a Computer Scientist in Fact-Checking," distinguished lecture, School of Computer Science, Georgia Institute of Technology, January 17, 2023.
  6. Round table on Systems for ML at the 2021 International Conference on Very Large Data Bases (VLDB 2021), August 2021.
  7. "Adventure of a Computer Scientist in Fact-Checking," seminar for Duke University Scholars, April 2021.
  8. "From Answering Questions to Questioning Answers," talk at the Natural and Applied Sciences Research Colloquia Series, Duke Kunshan University, November 2020.
  9. "Squash, Gardener and the Future of AI in Political Fact-Checking," talk with Bill Adair at the +DS (+DataScience) Virtual Learning Experiences Series, Duke University, October 2020.
  10. "Computational Fact Checking through Query Perturbations," talk at the Workshop on AI and Information Disorder, Global Forum on AI for Humanity, Paris, France, October 8, 2019.
  11. "Adventure of a Computer Scientist in Fact-Checking," talks at the School of Journalism and Communication, Renmin University (11/21/2018), and the School of Communication, Hong Kong Baptist University (5/7/2019), November 2018 - May 2019.
  12. "From Answering Questions to Questioning Answers," talks at the School of Information, Renmin University (11/21/2018), Department of Computer Science, Hong Kong Baptist University (5/8/2019), and Big Data Institute and Department of Computer Science and Engineering, Hong Kong University of Science and Technology (5/9/2019), November 2018 - May 2019.
  13. "Automated Pop-Up Fact-Checking: Challenges and Progress," presentation and system demonstration at the 2019 Computation+Journalism Symposium (COMPJ 2019), February 1, 2019 - February 2, 2019.
  14. "Do Numbers Lie," talk at TEDxDuke 2018, March 4, 2018.
  15. "Data Analytics in a Public Cloud: A User-Centric Perspective," presentation at the Huawei Innovation Research Program Exploratory collocated with the 2017 International Conference on Very Large Data Bases (VLDB 2017), August 2017.
  16. "Cumulon-D: Data Analytics in a Dynamic Spot Market," presentation at the 2017 International Conference on Very Large Data Bases (VLDB 2017), August 2017.
  17. "Data Management in Machine Learning: Challenges, Techniques, and Systems," tutorial at the 2017 ACM SIGMOD International Conference on Management of Data (SIGMOD 2017) with Arun Kumar and Matthias Boehm, May 2017.
  18. "Do Numbers Lie," Science Cafe presentation (with Brett Walenz) at North Carolina Museum of Natural Sciences, October 27, 2016.
  19. "Cumulon: Matrix-based Data Analytics in the Cloud with Spot Instances," presentation at the 2016 International Conference on Very Large Data Bases (VLDB 2016), September 2016.
  20. "From Answering Questions to Questioning Answers (and Questions): Toward Computational Fact-Checking," talk at Duke Computer Science Summer Undergraduate Researchers Lunch Series, July 2016.
  21. "Cumulon: Simplifying Matrix-Based Data Analysis in the Cloud," talks at University of Texas at Arlington, University of Texas at Dallas, University of North Texas, and Wuhan University, April 2016.
  22. "From Answering Questions to Questioning Answers (and Questions): Toward Computational Fact-Checking," talk at Tsinghua University, June 2015.
  23. "Computational Journalism and Big Data," presentation at the Workshop on Journalism and Public Policy for Nanjing Media Professionals, Media Fellows Program, Sanford School of Public Policy, Duke University, December 2014.
  24. "Can Technology Change Fact-Checking?" panel at the American Press Institute "Truth in Politics 2014" Summit, December 2014.
  25. "Thoughts on TAR and Recent Computing Advances," presentation at the Duke Law School Master of Judicial Studies Program, June 2014.
  26. "From Answering Questions to Questioning Answers (and Questions): Toward Computational Fact-Checking," talk at MIT, Big Data Initiative, May 2014.
  27. "Big and Useful: What's in the Data for Me?" panel at the 2013 International Conference on Very Large Data Bases (VLDB 2013), August 2013.
  28. "Big Data: Not Just about the Size," presentation at the Forum of Future Data, Wuyishan, China, July 2012.
  29. "Problems in Computational Journalism," presentation at HP Labs, Beijing, China, June 2012.
  30. "Fun with Arrays and Matrices in RIOT," informal talk at Stanford InfoLab lunch, August 2011.
  31. "Computational Journalism: A Call to Arms to Database Researchers," presentation at the 2011 Conference on Innovative Data Systems Research (CIDR 2011), January 2011.
  32. "Scalable Continuous Query Processing and Result Dissemination," seminar at HP Labs, Beijing, China, August 2010.
  33. "Data-Driven Processing in Sensor Networks," seminar at Stanford University, January 2009.
  34. "A Sampling-Based Approach to Information Recovery," presentation at the 2008 Annual Meeting of the Institute for Operations Research and the Management Sciences (INFORMS 2008), October 2008.
  35. "Thoughts on Data Sharing: A Database Researcher's Perspective," presentation at the Primate Life History Working Group Meeting, NESCent (National Evolutionary Synthesis Center), August 2007.
  36. "Query Suspend and Resume," presentation at the 2007 ACM SIGMOD International Conference on Management of Data (SIGMOD 2007), June 2007.
  37. "Data-Driven Processing in Sensor Networks," seminars at University of Pennsylvania, University of Waterloo, and New England Database Society, April 2007 - October 2007.
  38. "Scalable Continuous Query Processing and Result Dissemination," seminars at IBM T. J. Watson Research Center, University of Maryland at College Park, University of Pittsburgh/Carnegie Mellon University Joint Database Seminar, Brown University, University of Illinois at Urbana-Champaign, and University of California at Berkeley, February 2006 - December 2006.
  39. "Continuous Query Processing over Networked Data," presentation at IBM Research Triangle Park University Day, October 2006.
  40. Panel discussion at SIGMOD '06 Life after Graduation Symposium, June 2006.
  41. "Scalable Continuous Query Processing and Result Dissemination," talk at the 2006 Southeast Workshop on Data and Information Management (SEWDIM 2006), March 2006.
  42. "Querying Networked Data," presentation at IBM Research Triangle Park University Day, October 2005.
  43. "An Overview of Database Research at Duke," presentation at inDuke Meeting, Duke University, May 2005.
  44. "Caching for Network Querying," presentation at SIGMOD '05 Program Committee Workshop, Stanford, California, February 2005.
  45. "Layers and Boxes: Efficient and Maintainable Indexes for XML," seminar at IBM T. J. Watson Research Center, July 2004.
  46. "AutoBib: Automatic Extraction of Bibliographic Information on the Web," presentation at the 2004 International Database Engineering and Applications Symposium (IDEAS 2004).
  47. "Post-Web-Age Information Management," panel discussion at the 2003 International Conference on Web-Age Information Management (WAIM 2003).
  48. "TupleRank and Implicit Relationship Discovery in Databases," presentation at the 2003 International Conference on Web-Age Information Management (WAIM 2003).
  49. "Problems in Database View Maintenance and Web Data Extraction," seminar at University of North Carolina at Greensboro, April 2003.
  50. "Efficient Maintenance of Materialized Top-k Views," presentation at the 2003 International Conference on Data Engineering (ICDE 2003).
  51. "Incremental Computation and Maintenance of Temporal Aggregates," presentation at the 2001 International Conference on Data Engineering (ICDE 2001).
  52. "Query Processing in Kidar," guest lecture for a course on database system implementation at Stanford University, Stanford, California, November 2000.
  53. "Performance Issues in Incremental Warehouse Maintenance," presentation at the 2000 International Conference on Very Large Data Bases (VLDB 2000).
  54. "TIP: A Temporal Extension to Informix," system demonstration at the 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD 2000).
  55. "Temporal Data Warehousing," colloquia at Brown University, Cornell University, Duke University, Harvard University, Santa Clara University, State University of New York at Stony Brook, University of California at Santa Barbara, University of California at Santa Cruz, University of Southern California, Yale University, and IBM Almaden Research Center, February 2000 - May 2000.
  56. "TIP: A Temporal Extension to Informix," presentation and system demonstration at Stanford Database Workshop, Stanford, California, March 2000.
  57. "TIP: A Temporal Extension to Informix," presentation and system demonstration at Informix Corporation, Oakland, California, March 2000.
  58. "Temporal View Self-Maintenance," presentation at the 2000 International Conference on Extending Database Technology (EDBT 2000).
  59. "TIP: A Temporal Extension to Informix," system demonstration at the 2000 International Conference on Extending Database Technology (EDBT 2000).
  60. "Maintaining Temporal Views Over Non-Temporal Information Sources For Data Warehousing," presentation at the 1998 International Conference on Extending Database Technology (EDBT 1998).
  61. "Performance Modeling and Composition: A Case Study in Cell Simulation," presentation at the 1996 International Parallel Processing Symposium (IPPS 1996).


Student Advising

Former Postdoctoral Advisees:
Current Ph.D. student(s):
Graduated Ph.D. students:
Current M.S. Student(s):
Graduated M.S. students:
Undergraduate theses supervised:
Undergraduate research internship:
Undergraduate independent studies:
Ph.D. defense committee (not as primary advisor):
Ph.D. preliminary exam committee (not as primary advisor):
Ph.D. research initiation project committee (not as primary advisor):
M.S. committee (not as primary advisor):
Undergraduate thesis committee (not as primary advisor):
Internships for high school students:


Service to the professional community:
  1. Managing Editor, Proceedings of the VLDB Endowment (PVLDB), and Chair, PVLDB Advisory Committee, January 2025 - present.
  2. Trustee of the VLDB Endowment, January 2024 - December 2029.
  3. Associate Editor, ACM SIGMOD International Conference on Management of Data (SIGMOD), January 2024 - January 2026.
  4. Review Quality Co-Chair, the 2025 International Conference on Data Engineering (ICDE 2025).
  5. Editor, Foundations and Trends (FnT) in Databases, July 2023 - present.
  6. Member, PVLDB Advisory Committee, September 2023 - present.
  7. Associate Editor, ACM Transactions on Database Systems (TODS), February 2015 - present.
  8. Editor-in-Chief, Proceedings of the VLDB Endowment (PVLDB), Vol. 16, April 2022 - September 2023.
  9. Associate Editor, Proceedings of the VLDB Endowment (PVLDB), April 2020 - March 2021 and April 2021 - March 2022.
  10. Guest Editor, IEEE Data Engineering Bulletin (DEBULL), January 2021 - September 2021 and January 2022 - September 2022.
  11. Associate Editor, ACM SIGMOD Record (SIGMODREC), January 2015 - present.
  12. Member, ACM SIGMOD Research Highlight Award Committee, 2019 - 2022.
  13. Program Committee, the 2020 ACM SIGMOD International Conference on Management of Data (SIGMOD 2020).
  14. Co-Chair, 2019 American Statistical Association (ASA) President's Initiative on The Role of Statistics and Computer Science in Fake News.
  15. Program Committee Member, 2019 Summer School Series on Methods for Computational Social Science, GESIS Leibniz Institute for Social Sciences.
  16. Program Committee Member, SIGKDD 2019 Workshop on Truth Discovery and Fact Checking: Theory and Practice.
  17. Steering Committee Member, 2019 International Workshop on Misinformation, Computational Fact-Checking and Credible Web.
  18. Program Committee Co-Chair, the 2019 International Conference on Database Systems for Advanced Applications (DASFAA 2019).
  19. Core Program Committee, the 2019 ACM SIGMOD International Conference on Management of Data (SIGMOD 2019).
  20. Program Vice Chair (on Data Science), the 2019 International Conference on Data Engineering (ICDE 2019).
  21. Panel Co-Chair, the 2019 International Conference on Data Engineering (ICDE 2019).
  22. Guest Editor, Special Issue on Combating Digital Misinformation, ACM Journal of Data and Information Quality (JDIQ), September 2017 - July 2019.
  23. Program Committee, the 2018 Workshop on Data Management for End-to-End Machine Learning (DEEM 2018).
  24. Program Committee, the 2018 International Workshop on the Web and Databases (WEBDB 2018).
  25. Program Committee, the 2018 International Conference on World Wide Web (WWW 2018).
  26. General Co-Chair, the 2017 ACM SIGMOD International Conference on Management of Data (SIGMOD 2017).
  27. Tutorial Program Committee, the 2016 ACM SIGMOD International Conference on Management of Data (SIGMOD 2016).
  28. Guest Editor, Special Issue on Visionary Ideas in Data Management, ACM SIGMOD Record (SIGMODREC), October 2014 - July 2015.
  29. Subject Area Editor (Database and Knowledge-Based Systems), Journal of Computer Science and Technology (JCST), December 2011 - December 2018.
  30. Program Committee, the 2016 Workshop on Human-In-the-Loop Data Analytics (HILDA 2016).
  31. Senior Program Committee, the 2015 International Conference on Information and Knowledge Management (CIKM 2015).
  32. Best Paper Selection Committee, the 2015 National Database Conference of China (NDBC 2015).
  33. Program Committee Group Leader, the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD 2015).
  34. General Co-Chair, the 2015 International Conference on Web-Age Information Management (WAIM 2015).
  35. Program Committee, the 2014 International Conference on Information and Knowledge Management (CIKM 2014).
  36. Review Board, Proceedings of the VLDB Endowment, August 2008 - March 2012, April 2013 - March 2015, and April 2018 - March 2019.
  37. Program Committee Co-Chair, the 2014 International Workshop on Bringing the Value of Big Data to Users (DATA4U 2014).
  38. Best Paper Selection Committee, the 2014 National Database Conference of China (NDBC 2014).
  39. Program Committee, the 2014 ACM SIGMOD International Conference on Management of Data (SIGMOD 2014).
  40. Program Committee, the 2014 International Workshop on Exploratory Search in Databases and the Web (EXPLOREDB 2014).
  41. Senior Program Committee, the 2013 International Conference on Information and Knowledge Management (CIKM 2013).
  42. Demonstration Program Committee Co-Chair, the 2013 International Conference on Very Large Data Bases (VLDB 2013).
  43. Best Paper Selection Committee, the 2013 National Database Conference of China (NDBC 2013).
  44. Program Committee Area Chair (Streams, Sensor Networks, Complex Event Processing), the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD 2013).
  45. Associate Editor, IEEE Transactions on Knowledge and Data Engineering (TKDE), March 2009 - March 2013.
  46. Publicity Co-Chair, the 2013 International Conference on Database Systems for Advanced Applications (DASFAA 2013).
  47. Panel Co-Chair, the 2013 International Conference on Data Engineering (ICDE 2013).
  48. Senior Program Committee, the 2012 International Conference on Information and Knowledge Management (CIKM 2012).
  49. Best Paper Selection Committee, the 2012 National Database Conference of China (NDBC 2012).
  50. Program Committee, the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD 2012).
  51. Program Committee, the 2012 International Conference on Data Engineering (ICDE 2012).
  52. Program Committee, the 2011 International Conference on Data Engineering (ICDE 2011).
  53. Program Committee, the 2011 Conference on Innovative Data Systems Research (CIDR 2011).
  54. Program Committee, the 2010 International Conference on Very Large Data Bases (VLDB 2010).
  55. Program Committee, the 2010 International Workshop on Data Management for Sensor Networks (DMSN 2010).
  56. Program Committee Co-Chair, the 2010 International Conference on Web-Age Information Management (WAIM 2010).
  57. Program Committee, the 2010 International Conference on Data Engineering (ICDE 2010).
  58. Program Committee, the 2010 International Workshop on Ranking in Databases (DBRANK 2010).
  59. Program Committee, the 2009 International Workshop on Cloud Data Management (CLOUDDB 2009).
  60. Program Committee, the 2009 IFIP/ACM International Conference on Distributed Systems Platforms (MIDDLEWARE 2009).
  61. Program Committee, the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD 2009).
  62. Program Committee, the 2009 ACM Workshop on Data Engineering for Wireless and Mobile Access (MOBIDE 2009).
  63. Program Committee, the 2009 International Workshop on Scalable Stream Processing Systems (SSPS 2009).
  64. Regional Chair (America), the 2009 International Conference on Database Systems for Advanced Applications (DASFAA 2009).
  65. Program Committee, the 2009 International Conference on World Wide Web (WWW 2009).
  66. Program Committee, the 2009 International Workshop on Ranking in Databases (DBRANK 2009).
  67. Program Committee, the 2009 International Conference on Data Engineering (ICDE 2009).
  68. Program Committee, the 2009 Conference on Innovative Data Systems Research (CIDR 2009).
  69. Steering Committee Member, International Conference on Web-Age Information Management (WAIM), September 2008 - present.
  70. General Co-Chair and Program Committee Member, the 2008 International Workshop on Data Management for Sensor Networks (DMSN 2008).
  71. Program Committee, the 2008 International Conference on Information and Knowledge Management (CIKM 2008).
  72. Program Committee, the 2008 ACM Workshop on Data Engineering for Wireless and Mobile Access (MOBIDE 2008).
  73. Program Committee, the 2008 International Conference on Web-Age Information Management (WAIM 2008).
  74. Program Committee, the 2008 IEEE International Conference on Computational Science and Engineering (CSE 2008).
  75. Program Committee, the 2008 International Workshop on Scalable Stream Processing Systems (SSPS 2008).
  76. Program Committee, the 2008 International Conference on Very Large Data Bases (VLDB 2008).
  77. Program Committee, the 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD 2008).
  78. Program Committee, the 2008 International Conference on Data Engineering (ICDE 2008).
  79. Program Committee Co-Chair, the 2007 International Workshop on Data Management for Sensor Networks (DMSN 2007).
  80. Demonstration Program Committee, the 2007 International Conference on Very Large Data Bases (VLDB 2007).
  81. Program Committee, the 2007 International Conference on Scalable Information Systems (INFOSCALE 2007).
  82. Program Committee, the 2007 International Symposium on Large Spatio-Temporal Databases (SSTD 2007).
  83. Program Committee, the 2007 Joint Conference of the Asia-Pacific Web Conference and the International Conference on Web-Age Information Management (APWEBWAIM 2007).
  84. Program Committee, the 2007 ACM SIGMOD International Conference on Management of Data (SIGMOD 2007).
  85. Program Committee, the 2007 ACM SIGMOD International Conference on Management of Data (SIGMOD 2007) Ph.D. Workshop on Innovative Database Research.
  86. Program Committee, the 2007 Workshop on Networking Meets Databases (NETDB 2007).
  87. Program Committee, the 2007 International Workshop on Scalable Stream Processing Systems (SSPS 2007).
  88. Program Committee, the 2007 International Conference on Data Engineering (ICDE 2007).
  89. Program Committee, the 2006 International Conference on Information and Knowledge Management (CIKM 2006).
  90. Program Committee, the 2006 International Conference on Geosensor Networks (GSN 2006).
  91. Program Committee, the 2006 International Workshop on Data Management for Sensor Networks (DMSN 2006).
  92. Program Committee, the 2006 International XML Database Symposium (XSYM 2006).
  93. Program Committee, the 2006 International Conference on Very Large Data Bases (VLDB 2006) Ph.D. Workshop.
  94. Program Committee Co-Chair, the 2006 Southeast Workshop on Data and Information Management (SEWDIM 2006).
  95. Program Committee, the 2006 International Conference on Web-Age Information Management (WAIM 2006).
  96. Program Committee, the 2005 International Conference on Data Mining (ICDM 2005).
  97. Program Committee, the 2005 ACM International Workshop on Web Information and Data Management (WIDM 2005).
  98. Program Committee, the 2005 ACM SIGMOD International Conference on Management of Data (SIGMOD 2005).
  99. Program Committee, the 2005 International XML Database Symposium (XSYM 2005).
  100. Program Committee, the 2005 International Conference on Very Large Data Bases (VLDB 2005) Ph.D. Workshop.
  101. Program Committee, the 2005 International Conference on Database Systems for Advanced Applications (DASFAA 2005).
  102. Publications Chair, the 2005 International Conference on Web-Age Information Management (WAIM 2005).
  103. Program Committee, the 2004 International Conference on Data Mining (ICDM 2004).
  104. Program Committee, the 2004 International Conference on Very Large Data Bases (VLDB 2004).
  105. Program Committee, the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD 2004).
  106. Demonstration Program Committee, the 2004 ACM SIGMOD International Conference on Management of Data (SIGMOD 2004).
  107. Participant in the Summer Workshop on Developing the Field of Computational Journalism, Center for Advanced Study in Behavioral Sciences, Stanford, California, July 2009.
  108. Panelist for NSF, 2003, 2004, 2005, 2008, 2009, 2010, 2011, 2013, 2014 (twice), and 2016.
  109. Panelist for Department of Homeland Security, 2006.
  110. Expert Panelist on Cancer Reporting Information Technology, Office of the Assistant Secretary for Planning and Evaluation, Department of Health and Human Services, 2008 - 2009.
  111. Reviewer for Research Grants Council of Hong Kong, 2010, 2012.
  112. Reviewer for Natural Sciences and Engineering Research Council of Canada, 2008.
  113. Reviewer for Netherlands Organisation for Scientific Research, 2006.
  114. Associate Information Director, ACM SIGMOD, 2003 - present.
  115. Started the Carolina Database Research Group (CDB) in 2003 with a group of database researchers in North Carolina, and continues to be one of its main organizers.
  116. Publicity Chair, the 2004 International Conference on Mobile Data Management (MDM 2004).
  117. Reviewer for journals: ACM Transactions on Database Systems (TODS), The VLDB Journal (VLDBJ), IEEE Transactions on Knowledge and Data Engineering (TKDE), ACM Transactions on Programming Languages and Systems (TOPLAS), ACM SIGMOD Record (SIGMODREC), The Computer Journal (CJ), Information and Computation (IC), Information Processing Letters (IPL), IEEE Transactions on Mobile Computing (TMC), Data and Knowledge Engineering (DKE), IEEE Internet Computing (INTERNET), Information and Software Technology (IST), Journal of Systems and Software (JSS), Knowledge and Information Systems (KAIS), Ad Hoc and Sensor Wireless Networks (AHSWN), Journal of Research and Practice in Information Technology (JRPIT), Journal of Computer Science and Technology (JCST), Distributed and Parallel Databases (DPDB), International Journal of Computer Systems Science and Engineering (CSSE), LNCS Journal on Data Semantics (JODS), Electronics and Telecommunications Research Institute Journal (ETRI), Proceedings of the IEEE (PIEEE).
  118. Reviewer for conferences: ACM SIGMOD International Conference on Management of Data (SIGMOD), International Conference on Very Large Data Bases (VLDB), International Conference on Data Engineering (ICDE), ACM Symposium on Principles of Database Systems (PODS), International Conference on World Wide Web (WWW), International Conference on Information and Knowledge Management (CIKM), International Workshop on the Web and Databases (WEBDB), ACM Symposium on Cloud Computing (SOCC), International Symposium on Theoretical Aspects of Computer Science (STACS), European Symposium on Algorithms (ESA), International Conference on Distributed Computing Systems (ICDCS), International Conference on Mobile Systems, Applications, and Services (MOBISYS), USENIX Annual Technical Conference (USENIX), ACM Symposium on Parallel Algorithms and Architectures (SPAA).
  119. Designer of the ACM SIGMOD logo, IEEE Data Engineering logo, Stanford InfoLab's old logo, VLDB 2011 logo, and a number of others.
Service to Duke University and the Department of Computer Science:

Other activities: