PAPER Production Machine Learning Pipelines: Empirical Analysis and Optimization Opportunities. Doris Xin, Hui Miao, Aditya Parameswaran, Neoklis Polyzotis. Proceedings of the 2021 International Conference on Management of Data (SIGMOD), Xi’an, Shaanxi, China and Virtual. June 2021.

PAPER Enhancing the Interactivity of Dataframe Queries by Leveraging Think Time. Doris Xin, Devin Petersohn, Dixin Tang, Yifan Wu, Joseph E. Gonzalez, Joseph M. Hellerstein, Anthony D. Joseph, Aditya G. Parameswaran. IEEE Data Engineering Bulletin, Issue on Data Validation for Machine Learning. May 2021.

PAPER Whither AutoML? Understanding the Role of Automation in Machine Learning Workflows. Doris Xin, Eva Wu, Doris Lee, Niloufar Salehi, Aditya Parameswaran. International Conference on Human Factors in Computing Systems (CHI), Yokohama, Japan and Virtual. May 2021.

PAPER Towards Scalable Dataframe Systems. Devin Petersohn, Stephen Macke, Doris Xin, William Ma, Doris Lee, Xiangxi Mo, Joseph E. Gonzalez, Joseph M. Hellerstein, Anthony D. Joseph, Aditya Parameswaran. 46th International Conference on Very Large Data Bases (VLDB). September 2020.

PAPER Demystifying a Dark Art: Understanding Real-World Machine Learning Model Development. Angela Lee*, Doris Xin*, Doris Lee*, Aditya Parameswaran (* equal contribution). HILDA Workshop at SIGMOD International Conference on Management of Data. June 2020.

PAPER Extending Relational Query Processing with ML Inference. Konstantinos Karanasos, Matteo Interlandi, Doris Xin, Fotis Psallidas, Rathijit Sen, Kwanghyun Park, Ivan Popivanov, Supun Nakandala, Subru Krishnan, Markus Weimer, Yuan Yu, Raghu Ramakrishnan, Carlo Curino. The Conference on Innovative Data Systems Research (CIDR). January 2020.

PAPER Helix: Holistic Optimization for Accelerating Iterative Machine Learning. Doris Xin, Stephen Macke, Litian Ma, Jialin Liu, Shuchen Song, Aditya Parameswaran. 45th International Conference on Very Large Data Bases (VLDB), Los Angeles, USA. August 2019.

PAPER A Human-in-the-loop Perspective on AutoML: Milestones and the Road Ahead. Doris Jung-Lin Lee*, Stephen Macke*, Doris Xin*, Angela Lee, Silu Huang, Aditya Parameswaran (* equal contribution). IEEE Data Engineering Bulletin, Issue on DB4AI and AI4DB. June 2019.

PAPER Active learning on heterogeneous information networks: A multi-armed bandit approach. Doris Xin, Ahmed El-Kishky, De Liao, Brandon Norick, Jiawei Han. IEEE International Conference on Data Mining (ICDM). November 2018.

PAPER Helix: Accelerating Human-in-the-loop Machine Learning (Demo). Doris Xin, Litian Ma, Jialin Liu, Stephen Macke, Shuchen Song, Aditya Parameswaran. 44th International Conference on Very Large Data Bases (VLDB), Rio de Janeiro, Brazil. September 2018.

PAPER How Developers Iterate on Machine Learning Workflows – A Survey of the Applied Machine Learning Literature. Doris Xin, Litian Ma, Shuchen Song, Aditya Parameswaran. IDEA Workshop at KDD International Conference on Knowledge Discovery and Data Mining, London, UK. August 2018.

PAPER Accelerating Human-in-the-loop Machine Learning: Challenges and Opportunities. Doris Xin, Litian Ma, Jialin Liu, Stephen Macke, Shuchen Song, Aditya Parameswaran. DEEM Workshop at SIGMOD International Conference on Management of Data, Houston, USA. June 2018.

PAPER Folding: Why Good Models Sometimes Make Spurious Recommendations. Doris Xin, Nicolas Mayoraz, Hubert Pham, Karthik Lakshmanan, John R Anderson. Proceedings of the Eleventh ACM Conference on Recommender Systems, Como, Italy. August 2017.

PAPER MLlib: Machine Learning in Apache Spark. Xiangrui Meng, Joseph Bradley, Burak Yavuz, Evan Sparks, Shivaram Venkataraman, Davies Liu, Jeremy Freeman, DB Tsai, Manish Amde, Sean Owen, Doris Xin, Reynold Xin, Michael J Franklin, Reza Zadeh, Matei Zaharia, Ameet Talwalkar. The Journal of Machine Learning Research (JMLR). January 2016.

PAPER Parallel Computation Using Active Self-Assembly. Moya Chen, Doris Xin, Damien Woods. Natural Computing. June 2015.

PAPER LASER: A Scalable Response Prediction Platform for Online Advertising. Deepak Agarwal, Bo Long, Jonathan Traupman, Doris Xin, Liang Zhang (authors listed alphabetically by last name). Proceedings of the 7th ACM International Conference on Web Search and Data Mining (WSDM), New York, USA. February 2014.