~~ I like to Observe. Look for Patterns. Ponder over these Generalizations. Try to Refute them.
Or otherwise prove their Validity. And re-imagine their Applications in alternate spheres ~~
Pratyush Maini
I am a PhD student in the Machine Learning Department at Carnegie Mellon University, and a founding member of DatologyAI. I am advised by Prof. Zico Kolter and Prof. Zachary Lipton. My research goal is to make Machine Learning systems trustworthy to the extent that they can be safely and reliably deployed outside the comfort of our research labs. Previously,
Collaborate? I am always excited to exchange research perspectives and hop on to new research endeavors. If you are interested, reach out via email!
Bio: If you need a bio for a talk, please use this:
Pratyush is a Ph.D. student in the Machine Learning Department at Carnegie Mellon University, and a founding member of DatologyAI. In his work, he has developed scalable and performant methods for improving the quality of the data that machine learning models are trained on. He has also developed methods to evaluate, locate, and mitigate the memorization of data points by neural networks. His work has been recognized with a best paper award nomination at NeurIPS, and multiple oral and spotlight talks at major ML conferences.
Talks
- October 2024: Mentorship Panel at COLM @ Penn-MLR
- September 2024: Guest Lecture on Data Curation @ CMU-10605
- August 2024: LLM Dataset Inference @ Private-NLP, ACL 2024
- August 2024: Rethink Memorization with Adversarial Compression @ CONDA Workshop, ACL 2024 (Best Paper Talk)
- August 2024: LLM Dataset Inference @ Google Privacy Seminar
- June 2024: Rephrasing The Web @ Princeton NLP Group
- April 2024: TOFU @ Responsible AI Reading Group at AWS
- April 2024: Rephrasing The Web @ Together AI Research Group
- March 2024: Rephrasing The Web @ SambaNova Research Group
- February 2024: Can Neural Network Memorization be Localized @ ML PDG Karlsruhe
- November 2023: Can Neural Network Memorization be Localized @ Ellis Reading Group on Mathematics of Deep Learning
- October 2023: T-MARS @ ICCV 2023, Datacomp Workshop
- September 2023: T-MARS @ Ludwig Schmidt’s lab
- June 2022: Characterizing Datapoints via Second-split Forgetting @ SCIS ICML 2022
Publications
(15) Understanding Hallucinations in Diffusion Models through Mode Interpolation
Sumukh Aithal, Pratyush Maini, Zack Lipton, Zico Kolter
Conference on Neural Information Processing Systems (NeurIPS) 2024
DMLR @ International Conference on Machine Learning (ICML) 2024
(14) LLM Dataset Inference: Did you train on my dataset?
Pratyush Maini*, Hengrui Jia*, Nicolas Papernot, Adam Dziedzic
Conference on Neural Information Processing Systems (NeurIPS) 2024
Oral @ Private NLP Workshop, ACL 2024
(13) Rethinking LLM Memorization through the Lens of Adversarial Compression
Avi Schwarzschild*, Zhili Feng*, Pratyush Maini, Zack Lipton, Zico Kolter
Conference on Neural Information Processing Systems (NeurIPS) 2024
Best Paper @ Data Contam Workshop, ACL 2024
(12) TOFU: A Task of Fictitious Unlearning for LLMs
Pratyush Maini*, Zhili Feng*, Avi Schwarzschild*, Zack Lipton, Zico Kolter
Set-LLM @ ICLR 2024
COLM 2024 (Conference on Language Modeling)
(11) Scaling Laws for Data Filtering—Data Curation cannot be Compute Agnostic
Sachin Goyal*, Pratyush Maini*, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter
CVPR 2024
Data Problems for Foundation Models (ICLR) 2024
@inproceedings{goyal2024scaling, title={Scaling Laws for Data Filtering---Data Curation cannot be Compute Agnostic}, author={Goyal, Sachin and Maini, Pratyush and Lipton, Zachary C and Raghunathan, Aditi and Kolter, J Zico}, booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2024} }
(10) Rephrasing the Web: A Recipe for Compute & Data-Efficient Language Modeling
Pratyush Maini*, Skyler Seto*, He Bai, David Grangier, Yizhe Zhang, Navdeep Jaitly
Association for Computational Linguistics (ACL) 2024
@inproceedings{maini2024rephrasing, title={Rephrasing the Web: A Recipe for Compute \& Data-Efficient Language Modeling}, author={Maini, Pratyush and Seto, Skyler and Bai, He and Grangier, David and Zhang, Yizhe and Jaitly, Navdeep}, booktitle={Annual Meeting of the Association for Computational Linguistics (ACL)}, year={2024} }
(9) Can Neural Network Memorization be Localized?
Pratyush Maini, Michael Curtis Mozer, Hanie Sedghi, Zachary Chase Lipton, J Zico Kolter, Chiyuan Zhang
International Conference on Machine Learning (ICML) 2023
- We show that memorization is typically not localized to specific model layers, but is instead confined to a small fraction of neurons dispersed across the model.
- We propose Example-Tied Dropout, which can confine memorization to a pre-defined set of neurons that can then be discarded at test time.
@inproceedings{maini2023memorization, title={Can Neural Network Memorization Be Localized?}, author={Maini, Pratyush and Mozer, Michael C and Sedghi, Hanie and Lipton, Zachary C and Kolter, J Zico and Zhang, Chiyuan}, booktitle={International Conference on Machine Learning}, year={2023} }
(8) T-MARS: Improving Visual Representations by Circumventing Text Feature Learning
Pratyush Maini*, Sachin Goyal*, Zachary C. Lipton, Zico Kolter, Aditi Raghunathan
ICLR 2024
DMLR @ International Conference on Machine Learning (ICML) 2023
Datacomp Workshop @ ICCV 2023
@inproceedings{maini2023tmars, title={T-MARS: Improving Visual Representations by Circumventing Text Feature Learning}, author={Maini, Pratyush and Goyal, Sachin and Lipton, Zachary C and Kolter, J Zico and Raghunathan, Aditi}, booktitle={International Conference on Learning Representations (ICLR)}, year={2024} }
(7) Model-tuning Via Prompts Makes NLP Models Adversarially Robust
Mrigank Raman*, Pratyush Maini*, Zico Kolter, Zachary C. Lipton, Danish Pruthi
EMNLP 2023
AdvML-Frontiers @ International Conference on Machine Learning (ICML) 2023
(6) Characterizing Datapoints via Second-Split Forgetting
Pratyush Maini, Saurabh Garg, Zachary C. Lipton, Zico Kolter
Conference on Neural Information Processing Systems (NeurIPS) 2022
SCIS @ International Conference on Machine Learning (ICML) 2022
- We analyze the forgetting and learning dynamics of neural networks to characterize different types of hard examples as belonging to mislabeled, rare and complex categories.
- Mislabeled Examples : Learnt Late, Forgotten Early
- Rare Examples: Learnt Late, Forgotten Late
- Complex Examples: Learnt Late, Never Forgotten
@inproceedings{ maini2022characterizing, title={Characterizing Datapoints via Second-Split Forgetting}, author={Pratyush Maini and Saurabh Garg and Zachary Chase Lipton and J Zico Kolter}, booktitle={Advances in Neural Information Processing Systems}, editor={Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho}, year={2022}, url={https://openreview.net/forum?id=yKDKNzjHg8N} }
(5) Dataset Inference: Ownership Resolution in Machine Learning
Pratyush Maini, Mohammad Yaghini, Nicolas Papernot
International Conference on Learning Representations (ICLR) 2021
Privacy Preserving Machine Learning (PPML) Workshop at NeurIPS 2020
Workshop on Dataset Curation and Security (WDCS) at NeurIPS 2020
- Dataset Inference (DI) resolves model ownership without the need for retraining, and does not trade off against task accuracy.
- We prove that the success of Membership Inference decreases as overfitting reduces, whereas the success of DI is independent of overfitting.
- We introduce a new method for black-box ownership resolution that requires less than 50 private training points from the victim’s dataset.
@inproceedings{maini2021dataset, title={Dataset Inference: Ownership Resolution in Machine Learning}, author={Pratyush Maini and Mohammad Yaghini and Nicolas Papernot}, booktitle={International Conference on Learning Representations (ICLR)}, year={2021}, url={https://openreview.net/forum?id=hvdKKV2yt7T}, note={Spotlight at ICLR 2021} }
(4) Data-Free Model Extraction
Jean-Baptiste Truong*, Pratyush Maini*, Robert Walls, Nicolas Papernot
Conference on Computer Vision and Pattern Recognition (CVPR) 2021
@inproceedings{truong2021data, title={Data-Free Model Extraction}, author={Jean-Baptiste Truong and Pratyush Maini and Robert J. Walls and Nicolas Papernot}, booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2021}, url={https://arxiv.org/abs/2011.14779} }
(3) Perturbation Type Categorization for Multiple $\ell_p$ Bounded Adversarial Robustness
Pratyush Maini, Xinyun Chen, Bo Li, Dawn Song
Conference on Uncertainty in Artificial Intelligence (UAI) 2022
ICML Workshop on Uncertainty and Robustness in Deep Learning
@InProceedings{maini2022perturbation, title = {Perturbation Type Categorization for Multiple $\ell_p$ Bounded Adversarial Robustness}, author = {Pratyush Maini and Xinyun Chen and Bo Li and Dawn Song}, booktitle = {Proceedings of The 38th Uncertainty in Artificial Intelligence Conference}, year = {2022}, series = {Proceedings of Machine Learning Research}, url={https://openreview.net/pdf?id=Oe2XI-Aft-k}, }
(2) Adversarial Robustness Against the Union of Multiple Perturbation Models
Pratyush Maini, Eric Wong, Zico Kolter
International Conference on Machine Learning (ICML) 2020
@inproceedings{maini2020adversarial, title={Adversarial Robustness Against the Union of Multiple Perturbation Models}, author={Pratyush Maini and Eric Wong and J. Zico Kolter}, booktitle={International Conference on Machine Learning}, year={2020}, url = "https://arxiv.org/abs/1909.04068" }
(1) Why and when should you pool? Analyzing Pooling in Recurrent Architectures
Pratyush Maini, Keshav Kolluru, Danish Pruthi, Mausam
EMNLP (Findings) 2020
BlackBoxNLP 2020
- Pooling (and attention) help improve learning ability and positional invariance of BiLSTMs.
- Pooling helps improve sample efficiency (low-resource settings) and is particularly beneficial when important words lie away from the end of the sentence.
- Our proposed pooling technique, max-attention (MaxAtt), helps improve upon past approaches on standard accuracy metrics, and is more robust to distribution shift.
@inproceedings{maini2020pool, title = "Why and when should you pool? Analyzing Pooling in Recurrent Architectures", author = "Maini, Pratyush and Kolluru, Keshav and Pruthi, Danish and {Mausam}", booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020", year = "2020", address = "Online", publisher = "Association for Computational Linguistics", url = "https://www.aclweb.org/anthology/2020.findings-emnlp.410", note = {Also presented at BlackBoxNLP'20} }
* = equal contribution
Academic Service
Reviewer for:
ML: NeurIPS 2022, 2021, 2020; ICLR 2022 (Highlighted Reviewer), 2021*; ICML 2022
NLP: NAACL 2021; EMNLP 2022, 2021
Others: IEEE S&P 2021*, CVPR 2022, AISTATS 2022
* = external reviewer