Selected Publications

A list of Selected Publications and their descriptions can be found here.

All Publications (Chronological Order)


  • Cedric Renggli, Xiaozhe Yao, Luka Kolar, Luka Rimanic, Ana Klimovic, Ce Zhang. SHiFT: an efficient, flexible search engine for transfer learning. VLDB 2023.


  • Binhang Yuan, Yongjun He, Jared Quincy Davis, Tianyi Zhang, Tri Dao, Beidi Chen, Percy Liang, Christopher Re, Ce Zhang. Decentralized Training of Foundation Models in Heterogeneous Environments. NeurIPS 2022 (Oral Presentation 186/9600 = 1.9% submissions)

  • Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Re, Ce Zhang. Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees. NeurIPS 2022.

  • Jiawei Jiang, Lukas Burkhalter, Fangcheng Fu, Bolin Ding, Bo Du, Anwar Hithnawi, Bo Li, Ce Zhang. VF-PS: How to Select Important Participants in Vertical Federated Learning, Efficiently and Securely? NeurIPS 2022.

  • Zhuolin Yang, Zhikuan Zhao, Boxin Wang, Jiawei Zhang, Linyi Li, Hengzhi Pei, Bojan Karlaš, Ji Liu, Heng Guo, Ce Zhang, Bo Li. Improving Certified Robustness via Statistical Learning with Logical Reasoning. NeurIPS 2022.

  • Mintong Kang, Linyi Li, Maurice Weber, Yang Liu, Ce Zhang, Bo Li. Certifying Some Distributional Fairness with Subpopulation Decomposition. NeurIPS 2022.

  • Maurice Weber, Linyi Li, Boxin Wang, Bo Li, Ce Zhang. Certifying Out-of-Domain Generalization for Blackbox Functions. ICML 2022.

  • Shaoduo Gan, Xiangru Lian, Rui Wang, Jianbin Chang, Chengjun Liu, Hongmei Shi, Shengzhuo Zhang, Xianghong Li, Tengxu Sun, Jiawei Jiang, Binhang Yuan, Sen Yang, Ji Liu, Ce Zhang. BAGUA: Scaling up Distributed Learning with System Relaxations. VLDB 2022.

  • Susie Xi Rao, Shuai Zhang, Zhichao Han, Zitao Zhang, Wei Min, Zhiyao Chen, Yinan Shan, Yang Zhao, Ce Zhang. xFraud: Explainable Fraud Transaction Detection. VLDB 2022.

  • Gyeong-In Yu, Saeed Amizadeh, Sehoon Kim, Artidoro Pagnoni, Ce Zhang, Byung-Gon Chun, Markus Weimer, Matteo Interlandi. WindTunnel: Towards Differentiable ML Pipelines Beyond a Single Model. VLDB 2022.

  • Yang Li, Yu Shen, Huaijun Jiang, Wentao Zhang, Jixiang Li, Ji Liu, Ce Zhang, Bin Cui. Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale. VLDB 2022.

  • Zitao Li, Bolin Ding, Ce Zhang, Ninghui Li, Jingren Zhou. Federated Matrix Factorization with Privacy Guarantee. VLDB 2022.

  • Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, Ce Zhang. In-Database Machine Learning with CorgiPile: Stochastic Gradient Descent without Full Data Shuffle. SIGMOD 2022.

  • Baoqing Cai, Yu Liu, Ce Zhang, Guangyu Zhang, Ke Zhou, Li Liu, Chunhua Li, Bin Cheng, Jie Yang, Jiashu Xing. HUNTER: An Online Cloud Database Hybrid Tuning System for Personalized Requirements. SIGMOD 2022.

  • DAPHNE Consortium. DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines. CIDR 2022.

  • Alfonso Amayuelas, Shuai Zhang, Xi Susie Rao, Ce Zhang. Neural Methods for Logical Reasoning over Knowledge Graphs. ICLR 2022.

  • Yuexiang Xie, Zhen Wang, Yaliang Li, Ce Zhang, Jingren Zhou, Bolin Ding. iFlood: A Stable and Effective Regularizer. ICLR 2022.

  • Gyri Reiersen, David Dao, Björn Lütjens, Konstantin Klemmer, Kenza Amara, Attila Steinegger, Ce Zhang, Xiaoxiang Zhu. ReforesTree: A Dataset for Estimating Tropical Forest Carbon Stock with Deep Learning and Aerial Imagery. AAAI 2022.

  • Yilmazcan Özyurt, Tobias Hatt, Ce Zhang, Stefan Feuerriegel. A Deep Markov Model for Clickstream Analytics in Online Shopping. WWW 2022.


  • Bojan Karlas, Peng Li, Renzhi Wu, Nezihe Merve Gürel, Xu Chu, Wentao Wu, Ce Zhang. Nearest Neighbor Classifiers over Incomplete Information: From Certain Answers to Certain Predictions. VLDB 2021.

  • Yang Li, Yu Shen, Wentao Zhang, Jiawei Jiang, Yaliang Li, Bolin Ding, Jingren Zhou, Zhi Yang, Wentao Wu, Ce Zhang, Bin Cui. VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition. VLDB 2021.

  • Jiawei Jiang, Shaoduo Gan, Yue Liu, Fanlin Wang, Gustavo Alonso, Ana Klimovic, Ankit Singla, Wentao Wu, Ce Zhang. Towards Demystifying Serverless Machine Learning Training. SIGMOD 2021.

  • Leonel Aguilar, David Dao, Shaoduo Gan, Nezihe Merve Gurel, Nora Hollenstein, Jiawei Jiang, Bojan Karlas, Thomas Lemmin, Tian Li, Yang Li, Susie Rao, Johannes Rausch, Cedric Renggli, Luka Rimanic, Maurice Weber, Shuai Zhang, Zhikuan Zhao, Kevin Schawinski, Wentao Wu, Ce Zhang. Ease.ML: A Lifecycle Management System for Machine Learning. CIDR 2021.

  • Peng Li, Xi Rao, Jennifer Blase, Yue Zhang, Xu Chu, Ce Zhang. CleanML: A Benchmark for Evaluating the Impact of Data Cleaning on ML Classification Tasks. ICDE 2021.

  • Nezihe Merve Gürel, Xiangyu Qi, Luka Rimanic, Ce Zhang, Bo Li. Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks. ICML 2021.

  • Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He. 1-bit Adam: Communication Efficient Large-Scale Training with Adam’s Convergence Speed. ICML 2021.

  • Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong. Evolving Attention with Residual Convolutions. ICML 2021.

  • Xupeng Miao, Nezihe Merve Gürel, Wentao Zhang*, Zhichao Han, Bo Li, Wei Min, Susie Xi Rao, Hansheng Ren, Yinan Shan, Yingxia Shao, Yujie Wang, Fan Wu, Hui Xue, Yaming Yang, Zitao Zhang, Yang Zhao, Shuai Zhang, Yujing Wang, Bin Cui, Ce Zhang. DeGNN: Improving Graph Neural Networks with Graph Decomposition. KDD 2021.

  • Yang Li, Yu Shen, Wentao Zhang, Yuanwei Chen, Huai Jun Jiang, Ming Chao Liu, Jiawei Jiang, Jinyang Gao, Wentao Wu, Zhi Yang, Ce Zhang, Bin Cui. OpenBox: A Generalized Black-box Optimization Service. KDD 2021.

  • Yuexiang Xie, Zhen Wang, Yaliang Li, Bolin Ding, Nezihe Merve Gürel, Ce Zhang, Minlie Huang, Wei Lin, Jingren Zhou. FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data. KDD 2021.

  • Wenqi Jiang, Zhenhao He, Shuai Zhang, Kai Zeng, Liang Feng, Jiansong Zhang, Tongxuan Liu, Yong Li, Jingren Zhou, Ce Zhang, Gustavo Alonso. FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters. KDD 2021.

  • Wenqi Jiang, Zhenhao He, Thomas B. Preußer, Shuai Zhang, Kai Zeng, Liang Feng, Jiansong Zhang, Tongxuan Liu, Yong Li, Jingren Zhou, Ce Zhang, and Gustavo Alonso. Accelerating Deep Recommendation Systems to Microseconds by Hardware and Data Structure Solutions. MLSys 2021.

  • Mohammad Reza Karimi, Nezihe Merve Gürel, Bojan Karlaš, Johannes Rausch, Ce Zhang, and Andreas Krause. Online Active Model Selection for Pre-trained Classifiers. AISTATS 2021.

  • Linyi Li, Maurice Weber, Xiaojun Xu, Luka Rimanic, Bhavya Kailkhura, Tao Xie, Ce Zhang, Bo Li. TSS: Transformation-Specific Smoothing for Robustness Certification. CCS 2021.

  • Boxin Wang, Fan Wu, Yunhui Long, Luka Rimanic, Ce Zhang, Bo Li. DataLens: Scalable Privacy Preserving Training via Gradient Compression and Aggregation. CCS 2021.

  • Johannes Rausch, Octavio Martinez, Fabian Bissig, Ce Zhang, Stefan Feuerriegel. DocParser: Hierarchical Structure Parsing of Document Renderings. AAAI 2021.

  • Yang Li, Yu Shen, Jiawei Jiang, Jinyang Gao, Ce Zhang, Bin Cui. MFES-HB: Efficient Hyperband with Multi-Fidelity Quality Measurements. AAAI 2021.

  • Nora Hollenstein, Federico Pirovano, Ce Zhang, Lena Jäger, Lisa Beinborn. Multilingual language models predict human reading behavior. NAACL 2021.

  • Ruoxi Jia, Fan Wu, Xuehui Sun, Jiacen Xu, David Dao, Bhavya Kailkhura, Ce Zhang, Bo Li, Dawn Song. Scalability vs. Utility: Do We Have to Sacrifice One for the Other in Data Importance Quantification? CVPR 2021.

  • Shuai Zhang, Huoyu Liu, Aston Zhang, Yue Hu, Ce Zhang, Yumeng Li, Tanchao Zhu, Shaojian He, Wenwu Ou. Learning User Representations with Hypercuboids for Recommender Systems. WSDM 2021.

  • Cedric Renggli, Luka Rimanic, Nezihe Merve Gurel, Bojan Karlas, Wentao Wu, and Ce Zhang. A Data Quality-Driven View of MLOps. IEEE Data Engineering Bulletin 2021.

  • Maurice Weber, Nana Liu, Bo Li, Ce Zhang, Zhikuan Zhao. Optimal Provable Robustness of Quantum Classification via Quantum Hypothesis Testing. npj Quantum Information 2021.

  • Nora Hollenstein, Cedric Renggli, Benjamin Glaus, Maria Barrett, Marius Troendle, Nicolas Langer, Ce Zhang. Decoding EEG brain activity for multi-modal natural language processing. Frontiers in Human Neuroscience 2021.


  • Ji Liu, Ce Zhang. Distributed Learning Systems with First-Order Methods. (Foundations and Trends® in Databases series) 2020.

  • Nezihe Merve Gürel, Kaan Kara, Alen Stojanov, Tyler Smith, Thomas Lemmin, Dan Alistarh, Markus Püschel and Ce Zhang. Compressive Sensing Using Iterative Hard Thresholding with Low Precision Data Representation: Theory and Applications. IEEE Transactions on Signal Processing 2020.

  • Fangcheng Fu, Yuzheng Hu, Yihan He, Jiawei Jiang, Yingxia Shao, Ce Zhang, Bin Cui. Don’t Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript. ICML 2020.

  • Luka Rimanic, Cedric Renggli, Bo Li, Ce Zhang. On Convergence of Nearest Neighbor Classifiers over Feature Transformations. NeurIPS 2020.

  • Zhiqiang Tao, Yaliang Li, Bolin Ding, Ce Zhang, Jingren Zhou, Yun Fu. Learning to Mutate with Hypergradient Guided Population. NeurIPS 2020.

  • Defu Cao, Yujing Wang, Juanyong Duan, Ce Zhang, Xia Zhu, Congrui Huang, Yunhai Tong, Bixiong Xu, Jing Bai, Jie Tong, Qi Zhang. Spectral Temporal Graph Neural Network for Multivariate Time-series Forecasting. NeurIPS 2020. (Spotlight Presentation: 280/9454 = 2.96% submissions)

  • Bojan Karlaš, Matteo Interlandi, Cedric Renggli, Wentao Wu, Ce Zhang, Deepak Mukunthu Iyappan Babu, Jordan Edwards, Chris Lauren, Andy Xu and Markus Weimer. Building Continuous Integration Services for Machine Learning. KDD 2020 (Oral Presentation, Applied Data Science 44/756 = 5.8% submissions)

  • Zhipeng Zhang, Wentao Wu, Jiawei Jiang, Lele Yu, Bin Cui, Ce Zhang. ColumnSGD: A Column-oriented Framework for Distributed Stochastic Gradient Descent. ICDE 2020.

  • Giuseppe Russo, Nora Hollenstein, Claudiu Musat, Ce Zhang. Control, Generate, Augment: A Scalable Framework for Multi-Attribute Text Generation. Findings of EMNLP 2020.

  • Nora Hollenstein, Marius Troendle, Ce Zhang and Nicolas Langer. ZuCo 2.0: A Dataset of Physiological Recordings During Natural Reading and Annotation. LREC 2020.

  • Maurice Weber, Cedric Renggli, Helmut Grabner, Ce Zhang. Observer Dependent Lossy Image Compression. GCPR 2020.

  • Yang Li, Jiawei Jiang, Jinyang Gao, Yingxia Shao, Ce Zhang, Bin Cui. Efficient Automatic CASH via Rising Bandits. AAAI 2020.

  • Yujing Wang, Yaming Yang, Yiren Chen, Jing Bai, Ce Zhang, Guinan Su, Xiaoyu Kou, Yunhai Tong, Mao Yang, Lidong Zhou. TextNAS: A Neural Architecture Search Space tailored for Text Representation. AAAI 2020.

  • Yunyan Guo, Zhipeng Zhang, Jiawei Jiang, Wentao Wu, Ce Zhang, Bin Cui, Jianzhong Li. Model Averaging in Distributed Machine Learning: A Case Study with Apache Spark. VLDB Journal 2020.

  • Tianhao Wang, Johannes Rausch, Ce Zhang, Ruoxi Jia, Dawn Song. A Principled Approach to Data Valuation for Federated Learning. Book Chapter in Federated Learning: Privacy and Incentive 2020.

  • Christian Pfeiffer, Nora Hollenstein, Ce Zhang, Nicolas Langer. Neural dynamics of sentiment processing during naturalistic sentence reading. NeuroImage Journal 2020.

  • Hussein Hassan-Harrirou, Ce Zhang, and Thomas Lemmin. RosENet: Improving Binding Affinity Prediction by Leveraging Molecular Mechanics Energies with an Ensemble of 3D Convolutional Neural Networks. J. Chem. Inf. Model. 2020.


  • Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nezihe Merve Gurel, Bo Li, Ce Zhang, Costas J. Spanos, Dawn Song. Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms. VLDB 2019.

  • Kaan Kara, Ken Eguro, Ce Zhang, Gustavo Alonso. ColumnML: Column-Store Machine Learning with On-The-Fly Data Transformation. VLDB 2019.

  • Zeke Wang, Kaan Kara, Hantian Zhang, Gustavo Alonso, Onur Mutlu, and Ce Zhang. Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning. VLDB 2019.

  • Nora Hollenstein, Antonio De La Torre, Nicolas Langer and Ce Zhang. CogniVal: A Framework for Cognitive Word Embedding Evaluation. CoNLL 2019.

  • Chen Yu, Hanlin Tang, Cedric Renggli, Simon Kassing, Ankit Singla, Dan Alistarh, Ce Zhang, Ji Liu. Distributed Learning over Unreliable Networks. ICML 2019.

  • Marc Fischer, Mislav Balunovic, Dana Drachsler-Cohen, Timon Gehr, Ce Zhang, Martin Vechev. DL2: Training and Querying Neural Networks with Logic. ICML 2019.

  • Cedric Renggli, Bojan Karlas, Bolin Ding, Feng Liu, Kevin Schawinski, Wentao Wu, Ce Zhang. Continuous Integration of Machine Learning Models: A Rigorous Yet Practical Treatment. SysML 2019.

  • Vojislav Dukic, Sangeetha Abdu Jyothi, Bojan Karlas, Muhsen Owaida, Ce Zhang, Ankit Singla. Is advance knowledge of flow sizes a plausible assumption? NSDI 2019.

  • Nora Hollenstein and Ce Zhang. Entity Recognition at First Sight: Improving NER with Eye Movement Information. NAACL 2019.

  • Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Merve Gurel, Nick Hynes, Bo Li, Ce Zhang, Dawn Song, Costas J. Spanos. Towards Efficient Data Valuation Based on the Shapley Value. AISTATS 2019.

  • Chen Yu, Bojan Karlas, Jie Zhong, Ce Zhang, Ji Liu. AutoML from Service Provider’s Perspective: Multi-device, Multi-tenant Model Selection with GP-EI. AISTATS 2019.

  • Zhipeng Zhang, Bin Cui, Wentao Wu, Ce Zhang, Lele Yu, Jiawei Jiang. MLlib*: Fast Training of GLMs using Spark MLlib. ICDE (Industry) 2019.

  • John T Halloran, Hantian Zhang, Kaan Kara, Cedric Renggli, Matthew The, Ce Zhang, David M Rocke, Lukas Käll, William Stafford Noble. Speeding up Percolator. Journal of Proteome Research 2019.

  • Nina Glaser, O. Ivy Wong, Kevin Schawinski, and Ce Zhang. RadioGAN – Translations between different radio surveys with generative adversarial networks. Monthly Notices of the Royal Astronomical Society 2019.


  • T Li, J Zhong, J Liu, W Wu, Ce Zhang. Towards Multi-tenant Resource Sharing for Machine Learning Workloads. VLDB 2018.

  • Yu Liu, Hantian Zhang, Luyuan Zeng, Wentao Wu, Ce Zhang. MLBench: Benchmarking Machine Learning Services Against Human Experts. VLDB 2018.

  • Hanlin Tang, Shaoduo Gan, Ce Zhang, Ji Liu. Communication Compression for Decentralized Training. NIPS 2018.

  • J Jiang, B Cui, Ce Zhang, F Fu. DimBoost: Boosting Gradient Boosting Tree to Higher Dimensions. SIGMOD 2018.

  • Hanlin Tang, Xiangru Lian, Ming Yan, Ce Zhang, Ji Liu. D2: Decentralized Training over Decentralized Data. ICML 2018.

  • X Lian, W Zhang, Ce Zhang, J Liu. Asynchronous Decentralized Parallel Stochastic Gradient Descent. ICML 2018.

  • H Guo, K Kara, Ce Zhang. Layerwise Systematic Scan: Deep Boltzmann Machines and Beyond. AISTATS 2018.

  • Jonathan Rotsztejn, Nora Hollenstein and Ce Zhang. ETH-DS3Lab at SemEval-2018 Task 7: Effectively Combining Recurrent and Convolutional Neural Networks for Relation Classification and Extraction. SemEval 2018. (SemEval Task 7 Top Ranked System; Task 7 Best Paper)

  • D Grubic, L Tam, D Alistarh, Ce Zhang. Synchronous Multi-GPU Deep Learning with Low-Precision Communication: An Experimental Study. EDBT 2018.

  • Nora Hollenstein, Jonathan Rotsztejn, Marius Tröndle, Andreas Pedroni, Ce Zhang, and Nicolas Langer. ZuCo, a simultaneous EEG and eye-tracking resource for natural sentence reading. Scientific Data 2018.

  • H Huang, C Zheng, J Zeng, W Zhou, S Zhu, P Liu, I Molloy, S Chari, Ce Zhang, Q Guan. A Large-scale Study of Android Malware Development Phenomenon on Public Malware Submission and Scanning Platform. IEEE Transactions on Big Data 2018.

  • Dominic Stark, Barthelemy Launet, Kevin Schawinski, Ce Zhang, Michael Koss, M Dennis Turp, Lia F Sartori, Hantian Zhang, Yiru Chen, Anna K Weigel. PSFGAN: a generative adversarial network system for separating quasar point sources and host galaxy light. Monthly Notices of the Royal Astronomical Society 2018.

  • Lia F Sartori, Kevin Schawinski, Benny Trakhtenbrot, Neven Caplar, Ezequiel Treister, Michael J Koss, C Megan Urry, Ce Zhang. A model for AGN variability on multiple time-scales. Monthly Notices of the Royal Astronomical Society 2018.

  • Sandro Ackermann, Kevin Schawinski, Ce Zhang, Anna K. Weigel, M. Dennis Turp. Using transfer learning to detect galaxy mergers. Monthly Notices of the Royal Astronomical Society 2018.

  • M. Dennis Turp, Kevin Schawinski, Ce Zhang. Exploring galaxy evolution with generative models. Astronomy and Astrophysics 2018.


  • L Yu, B Cui, Ce Zhang, Y Shao. LDA*: A Robust and Large-scale Topic Modeling System. VLDB 2017.

  • Z Zhang, Y Shao, B Cui, Ce Zhang. An experimental evaluation of simrank-based similarity search algorithms. VLDB 2017.

  • X Lian, Ce Zhang, H Zhang, CJ Hsieh, W Zhang, J Liu. Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent. NIPS 2017. (Oral Presentation 40/3240 = 1.2% submissions)

  • H Zhang, J Li, K Kara, D Alistarh, J Liu, Ce Zhang. The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning. ICML 2017.

  • J Jiang, B Cui, Ce Zhang, L Yu. Heterogeneity-aware distributed parameter servers. SIGMOD 2017.

  • K M Owaida, H Zhang, G Alonso, Ce Zhang. Scalable Inference of Decision Tree Ensembles: Flexible Design for CPU-FPGA Platforms. FPL 2017.

  • K Kara, D Alistarh, G Alonso, O Mutlu, Ce Zhang. FPGA-accelerated Dense Linear Machine Learning: A Precision-Convergence Trade-off. FCCM 2017.

  • J Jiang, J Jiang, B Cui, Ce Zhang. TencentBoost: A Gradient Boosting Tree System with Parameter Server. ICDE (Industrial Track) 2017.

  • K Schawinski, Ce Zhang, H Zhang, L Fowler, GK Santhanam. Generative Adversarial Networks recover features in astrophysical images of galaxies beyond the deconvolution limit. Monthly Notices of the Royal Astronomical Society 2017.


  • H Huang, C Zheng, J Zeng, W Zhou, S Zhu, P Liu, S Chari, Ce Zhang. Android malware development on public malware scanning platforms: A large-scale data-driven study. IEEE Big Data 2016.

  • Kun-Hsing Yu, Ce Zhang, Gerald J Berry, Russ B Altman, Christopher Re, Daniel L Rubin, Michael Snyder. Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features. Nature Communications 2016.

  • Christopher De Sa, Alex Ratner, Christopher Re, Jaeho Shin, Feiran Wang, Sen Wu, Ce Zhang. DeepDive: declarative knowledge base construction. ACM SIGMOD Record 2016.

  • Christopher De Sa, Alex Ratner, Christopher Re, Jaeho Shin, Feiran Wang, Sen Wu, Ce Zhang. Incremental knowledge base construction using DeepDive. The VLDB Journal 2016. (Bests of VLDB 2015)

  • Xinghao Pan, Maximilian Lam, Stephen Tu, Dimitris S. Papailiopoulos, Ce Zhang, Michael I. Jordan, Kannan Ramchandran, Christopher Ré, Benjamin Recht. CYCLADES: Conflict-free Asynchronous Machine Learning. NIPS 2016.

  • Ce Zhang, Jaeho Shin, Christopher Re, Michael Cafarella, Feng Niu. Extracting databases from dark data with DeepDive. SIGMOD (Industry Track) 2016.

  • Ioannis Mitliagkas, Ce Zhang, Stefan Hadjis, Christopher Ré. Asynchrony begets momentum, with an application to deep learning. Allerton 2016.

  • Ce Zhang, Arun Kumar, and Christopher Re. Materialization optimizations for feature selection workloads. TODS 2016. (Bests of SIGMOD 2014)


  • Jaeho Shin, Sen Wu, Feiran Wang, Christopher De Sa, Ce Zhang, and Christopher Re. Incremental knowledge base construction using DeepDive. VLDB 2015. (SIGMOD Research Highlight Award)

  • Emily Mallory, Ce Zhang, Christopher Re, and Russ Altman. Large-scale extraction of gene interactions from full text literature using DeepDive. Bioinformatics 2015.

  • Christopher De Sa, Ce Zhang, Kunle Olukotun, and Christopher Re. Rapidly mixing Gibbs sampling for a class of factor graphs using hierarchy width. NIPS 2015.

  • Christopher De Sa, Ce Zhang, Kunle Olukotun, and Christopher Re. Taming the wild: A unified analysis of Hogwild!-style algorithms. NIPS 2015.

  • Gabor Angeli, Sonal Gupta, Melvin Johnson Premkumar, Christopher D Manning, Christopher Re, Julie Tibshirani, Jean Y Wu, Sen Wu, and Ce Zhang. Stanford’s distantly supervised slot filling systems for KBP 2014. Text Analysis Conference Proceedings 2015.


  • Ce Zhang and Christopher Re. DimmWitted: A study of main-memory statistical analytics. VLDB 2014.

  • Shanan Peters, Ce Zhang, Miron Livny, and Christopher Re. A machine-compiled macroevolutionary history of Phanerozoic life. PLoS One 2014.

  • Ce Zhang, Arun Kumar, and Christopher Re. Materialization optimizations for feature selection workloads. SIGMOD 2014. (SIGMOD Best Paper Award)

  • Yingbo Zhou, Utkarsh Porwal, Ce Zhang, Hung Q. Ngo, Long Nguyen, Christopher Re, and Venu Govindaraju. Parallel feature selection inspired by group testing. NIPS 2014.


  • Ce Zhang and Christopher Re. Towards high-throughput Gibbs sampling at scale: a study across storage managers. SIGMOD 2013.

  • Srikrishna Sridhar, Stephen J. Wright, Christopher Re, Ji Liu, Victor Bittorf, and Ce Zhang. An approximate, efficient LP solver for LP rounding. NIPS 2013.

  • Michael Anderson, Dolan Antenucci, Victor Bittorf, Matthew Burgess, Michael J. Cafarella, Arun Kumar, Feng Niu, Yongjoo Park, Christopher Re, and Ce Zhang. Brainwash: A data system for feature engineering. CIDR 2013.

  • Young Chol Song, Henry A. Kautz, James F. Allen, Mary D. Swift, Yuncheng Li, Jiebo Luo, and Ce Zhang. A Markov logic framework for recognizing complex events from multimodal data. ICMI 2013.

  • John R. Frank, Max Kleiman-Weiner, Daniel A. Roberts, Feng Niu, Ce Zhang, Christopher Re, Ian Soboroff. Building an entity-centric stream filtering test collection for TREC 2012. TREC 2013.


  • Ce Zhang, Feng Niu, Christopher Re, and Jude W. Shavlik. Big data versus the crowd: Looking for relationships in all the right places. ACL 2012.

  • Feng Niu, Ce Zhang, Christopher Re, and Jude W. Shavlik. Scaling inference for Markov logic via dual decomposition. ICDM 2012.

  • John R. Frank, Max Kleiman-Weiner, Daniel A. Roberts, Feng Niu, Ce Zhang, Christopher Ré, Ian Soboroff. Building an Entity-Centric Stream Filtering Test Collection for TREC 2012. TREC 2012.

  • Feng Niu, Ce Zhang, Christopher Re, and Jude W. Shavlik. Elementary: Large-scale knowledge-base construction via machine learning and statistical inference. Int. J. Semantic Web Inf. Syst 2012.


  • Junjie Yao, Bin Cui, Qiaosha Han, Ce Zhang, Yanhong Zhou. Modeling User Expertise in Folksonomies by Fusing Multi-type Features. DASFAA 2011.


  • Bin Cui, Anthony K. H. Tung, Ce Zhang, Zhe Zhao. Multiple feature fusion for social media applications. SIGMOD 2010.

  • Bin Cui, Ce Zhang, Gao Cong. Content-enriched classifier for web video classification. SIGIR 2010.


  • Xin Cao, Gao Cong, Bin Cui, Christian S. Jensen, Ce Zhang. The use of categorization information in language models for question retrieval. CIKM 2009.

  • Ce Zhang, Bin Cui, Gao Cong, Yu-Jing Wang. A Revisit of Query Expansion with Different Semantic Levels. DASFAA 2009.


  • Bin Cui, Bei Pan, Heng Tao Shen, Ying Wang, Ce Zhang. Video Annotation System Based on Categorizing and Keyword Labelling. DASFAA 2009.