About Me
I am an Associate Professor in Computer Science at ETH Zurich. The goal of my research is to make machine learning techniques widely accessible—while being cost-efficient and trustworthy—to everyone who wants to use them to make our world a better place.
Before ETH, I finished my PhD at the University of Wisconsin-Madison and spent another year as a postdoctoral researcher at Stanford, both advised by Chris Ré. I did my undergraduate studies at Peking University, advised by Bin Cui.
With a background in data management, we believe in a systems approach. Our current research focuses on building next-generation machine learning platforms and systems that are data-centric, human-centric, and declaratively scalable.
[Email: ce.zhang@inf.ethz.ch] [Google Scholar] [Twitter]
Check out a summary of our research here. We are also building two exciting start-ups:
- Together Computer: A decentralized cloud for artificial intelligence, with the mission to bring the world’s computation together to support the open models ecosystem.
- Modulos.AI: A data-centric AI enterprise platform that helps you find the errors, noise, and bias in your data so you can build fairer and better AI, faster.
As a research group, one of our most important duties is to nurture the next generation of leaders, which arguably gives us our most significant and long-lasting impact on society. Over the years, students who graduated from our group have become professors at great universities and researchers and engineers at leading companies. See a list of them and their achievements here.
We are super lucky to have been involved in several community-building efforts, from machine learning systems and data management for ML to data-centric AI. Here are a few recent position papers on various topics (joint work with many fellow researchers):
- Advances, challenges and opportunities in creating data for trustworthy AI (Nature Machine Intelligence), with Weixin Liang, Girmaw Tadesse, Daniel Ho, Li Fei-Fei, Matei Zaharia, and James Zou.
- DataPerf: Benchmarks for Data-Centric AI Development (MLCommons), with a great consortium of researchers who are passionate about data quality for ML and data iteration.
- A Data Quality-Driven View of MLOps (IEEE Data Engineering Bulletin).
- MLSys: The New Frontier of Machine Learning Systems, the position paper for the first MLSys conference, with an awesome consortium of researchers from machine learning, systems, data management, security, computer architecture, and beyond.
We are always looking for top PhD and Postdoc candidates with a background in systems or theory in data management, mathematical optimization, or machine learning. Feel free to reach out!
News and Student Highlights
- Binhang Yuan is joining Hong Kong University of Science and Technology as an assistant professor;
- Nezihe Merve Gurel joins TU Delft as an assistant professor;
- ERC Grant. Blog: What are we going to build with an ERC?
- Generation Google Scholarship for Nezihe Merve Gurel;
- Jiawei Jiang joins Wuhan University as a full professor;
- Nezihe Merve Gurel joins the Board of Directors of WiML;
- ICLR Outstanding Paper for Shuai Zhang;
- Nora Hollenstein joins University of Copenhagen as an assistant professor;
- Google Focused Research Award, 2018.
- MIT Technology Review Latin American Innovators under 35 for Leonel Aguilar;
- SNSF Eccellenza Professorial Fellowship for Thomas Lemmin, joins University of Bern as assistant professor;
- IBM Q Best paper award for Zhikuan Zhao;
- CoNLL special award for the best paper on research inspired by human language learning and processing for Nora Hollenstein;
- SIGMOD Research Highlight Award, 2015.
- SIGMOD Best Paper Award, 2014.
Recent/Upcoming Events
NeurIPS 2022
- 11/29/2022 09:30-11:30 (PST): Come by our WiML roundtable if you are interested in ML Systems.
- We will also present five papers, including one Oral and two Spotlights, on decentralized learning, vertical federated learning, reasoning, and robustness:
- Binhang Yuan, Yongjun He: Decentralized Training of Foundation Models in Heterogeneous Environments (Oral)
- 11/30/2022, 09:30-11:00 (PST): Poster Session
- 12/07/2022, 17:45-18:00 (PST): Virtual Panel for Oral Papers
- Binhang Yuan, Jue Wang: Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees
- 11/29/2022, 09:30-11:00 (PST): Poster Session
- Jiawei Jiang: VF-PS: How to Select Important Participants in Vertical Federated Learning, Efficiently and Securely?
- 12/06/2022, 09:00-11:00 (PST): Spotlight Presentation
- Zhuolin Yang, Zhikuan Zhao: Improving Certified Robustness via Statistical Learning with Logical Reasoning
- 12/01/2022, 14:30-16:00 (PST): Poster Session
- Mintong Kang, Linyi Li: Certifying Some Distributional Fairness with Subpopulation Decomposition
- 11/30/2022, 14:30-16:00 (PST): Poster Session
- 12/06/2022, 17:00-19:00 (PST): Spotlight Presentation
VLDB 2022
- We will present five papers, including the Bagua system that enables decentralization, asynchronization, and communication compression for scalable distributed learning, and other topics on fraud detection (with eBay), differentiable ML pipelines, AutoML, and federated learning:
- Shaoduo Gan, Xiangru Lian: BAGUA: Scaling up Distributed Learning with System Relaxations
- 09/08/2022, 15:30-17:00: [Paper] [System] [Pytorch Lightning]
- Susie Rao, Shuai Zhang: xFraud: Explainable Fraud Transaction Detection
- 09/07/2022, 23:30-01:00: [Paper]
- Yang Li: Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale
- 09/06/2022, 13:30-15:00: [Paper]
- Zitao Li: Federated Matrix Factorization with Privacy Guarantee
- 09/06/2022, 23:30-01:00: [Paper]
- Gyeong-in Yu: WindTunnel: Towards Differentiable ML Pipelines Beyond a Single Model
- 09/08/2022, 10:30-12:00: [Paper]