Pratyush Maini

Carnegie Mellon University / DatologyAI

📍 70% Pitt. | 20% Cali | 10% Delhi

~~
I like to Observe. Look for Patterns. Ponder over these Generalizations. Try to Refute them.
   Or otherwise prove their Validity. And re-imagine their Applications in alternate spheres.
~~

I am a PhD student in the Machine Learning Department at Carnegie Mellon University, and a founding member of DatologyAI. I am advised by Prof. Zico Kolter and Prof. Zachary Lipton. My research goal is to make Machine Learning systems trustworthy to the extent that they can be safely and reliably deployed outside the comfort of our research labs. Previously,

Collaborate? I am always excited to exchange research perspectives and hop on to new research endeavors. If you are interested, reach out via email!

Bio: If you need a bio for a talk, please use this:

Talks

Publications

(15) Understanding Hallucinations in Diffusion Models through Mode Interpolation
Sumukh Aithal, Pratyush Maini, Zack Lipton, Zico Kolter
Conference on Neural Information Processing Systems (NeurIPS) 2024
DMLR @ International Conference on Machine Learning (ICML) 2024
TLDR | Paper | Citation

(14) LLM Dataset Inference: Did you train on my dataset?
Pratyush Maini*, Hengrui Jia*, Nicolas Papernot, Adam Dziedzic
Conference on Neural Information Processing Systems (NeurIPS) 2024
Oral @ Private NLP Workshop, ACL 2024
TLDR | Paper | Citation

(13) Rethinking LLM Memorization through the Lens of Adversarial Compression
Avi Schwarzschild*, Zhili Feng*, Pratyush Maini, Zack Lipton, Zico Kolter
Conference on Neural Information Processing Systems (NeurIPS) 2024
Best Paper @ Data Contamination Detection and Auditing Workshop, ACL 2024
TLDR | Paper | Citation

(12) TOFU: A Task of Fictitious Unlearning for LLMs
Pratyush Maini*, Zhili Feng*, Avi Schwarzschild*, Zack Lipton, Zico Kolter
Set-LLM @ ICLR 2024
Conference on Language Modeling (COLM) 2024
TLDR | Paper | Website | Citation

(11) Scaling Laws for Data Filtering—Data Curation cannot be Compute Agnostic
Sachin Goyal*, Pratyush Maini*, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter
Conference on Computer Vision and Pattern Recognition (CVPR) 2024
Best Paper @ Data Problems for Foundation Models (ICLR) 2024 Workshop
TLDR | Paper | Citation

(10) Rephrasing the Web: A Recipe for Compute & Data-Efficient Language Modeling
Pratyush Maini*, Skyler Seto*, He Bai, David Grangier, Yizhe Zhang, Navdeep Jaitly
Association for Computational Linguistics (ACL) 2024
TLDR | Paper | Citation

(9) Can Neural Network Memorization be Localized?
Pratyush Maini, Michael Curtis Mozer, Hanie Sedghi, Zachary Chase Lipton, J Zico Kolter, Chiyuan Zhang
International Conference on Machine Learning (ICML) 2023
TLDR | Paper | Website | Slides | Poster | Citation

(8) T-MARS: Improving Visual Representations by Circumventing Text Feature Learning
Pratyush Maini*, Sachin Goyal*, Zachary C. Lipton, Zico Kolter, Aditi Raghunathan
International Conference on Learning Representations (ICLR) 2024
DMLR @ International Conference on Machine Learning (ICML) 2023
Datacomp Workshop @ ICCV 2023
TLDR | Paper | Website | Poster | Citation

(7) Model-tuning Via Prompts Makes NLP Models Adversarially Robust
Mrigank Raman*, Pratyush Maini*, Zico Kolter, Zachary C. Lipton, Danish Pruthi
Empirical Methods in Natural Language Processing (EMNLP) 2023
AdvML-Frontiers @ International Conference on Machine Learning (ICML) 2023
TLDR | Paper | Slides | Poster | Citation

(6) Characterizing Datapoints via Second-Split Forgetting
Pratyush Maini, Saurabh Garg, Zachary C. Lipton, Zico Kolter
Advances in Neural Information Processing Systems (NeurIPS) 2022
SCIS @ International Conference on Machine Learning (ICML) 2022
TLDR | Paper | Slides | Poster | Citation

(5) Dataset Inference: Ownership Resolution in Machine Learning
Pratyush Maini, Mohammad Yaghini, Nicolas Papernot
International Conference on Learning Representations (ICLR) 2021
Privacy Preserving Machine Learning (PPML) Workshop at NeurIPS 2020
Workshop on Dataset Curation and Security (WDCS) at NeurIPS 2020
TLDR | Paper | Video | Slides | Poster | Citation

(4) Data-Free Model Extraction
Jean-Baptiste Truong*, Pratyush Maini*, Robert Walls, Nicolas Papernot
Conference on Computer Vision and Pattern Recognition (CVPR) 2021
TLDR | Paper | Code | Poster | Citation

(3) Perturbation Type Categorization for Multiple $\ell_p$ Bounded Adversarial Robustness
Pratyush Maini, Xinyun Chen, Bo Li, Dawn Song
Conference on Uncertainty in Artificial Intelligence (UAI) 2022
ICML Workshop on Uncertainty and Robustness in Deep Learning
TLDR | Paper | Citation

(2) Adversarial Robustness Against the Union of Multiple Perturbation Models
Pratyush Maini, Eric Wong, Zico Kolter
International Conference on Machine Learning (ICML) 2020
TLDR | Paper | Video | Slides | Code | Citation

(1) Why and when should you pool? Analyzing Pooling in Recurrent Architectures
Pratyush Maini, Keshav Kolluru, Danish Pruthi, Mausam
Findings of the Association for Computational Linguistics: EMNLP 2020
BlackBoxNLP 2020
TLDR | Paper | Slides | Code | Blog | Poster | Citation


* = equal contribution

Academic Service

Reviewer for:
ML: NeurIPS 2020-2024*; ICLR 2021-2024 (Highlighted Reviewer in 2022)*; ICML 2022-2024
NLP: NAACL 2021; EMNLP 2021, 2022
Others: IEEE S&P 2021*, CVPR 2022, AISTATS 2022


* = external reviewer

