arXiv 2025
Safety Pretraining: Toward the Next Generation of Safe AI
Pratyush Maini*, Sachin Goyal*, Dylan Sam*, Alex Robey, Yash Savani, Yiding Jiang, Andy Zou, Matt Fredrikson, Zachary C. Lipton, J. Zico Kolter
A framework for training AI systems to be inherently safer through specialized pretraining.
ACL 2024
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Pratyush Maini, Skyler Seto, He Bai, David Grangier, Yizhe Zhang, Navdeep Jaitly
Paraphrasing web data into Q/A pairs significantly improves language model training efficiency.
CVPR 2024
Best Paper (DPFM Workshop)
Scaling Laws for Data Filtering — Data Curation cannot be Compute Agnostic
Sachin Goyal, Pratyush Maini, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter
Training compute-optimal models requires data filtering that scales with available compute.
NeurIPS 2024
Oral (Private-NLP Workshop)
LLM Dataset Inference: Did you train on my dataset?
Pratyush Maini*, Hengrui Jia*, Nicolas Papernot, Adam Dziedzic
Black-box detection of whether a dataset was used to train an LLM.
COLM 2024
Oral (Set-LLM Workshop)
TOFU: A Task of Fictitious Unlearning for LLMs
Pratyush Maini*, Zhili Feng*, Avi Schwarzschild*, Zachary C. Lipton, J. Zico Kolter
Benchmarking machine unlearning methods for large language models.
ICML 2023
Can Neural Network Memorization Be Localized?
Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang
Individual neurons and layers do not solely determine what a neural network memorizes.
ICLR 2021
Spotlight
Dataset Inference: Ownership Resolution in Machine Learning
Pratyush Maini, Mohammad Yaghini, Nicolas Papernot
First work on dataset inference: determining if a dataset was used for training.
ICML 2025
Memorization Sinks: Isolating Memorization during LLM Training
Gaurav R. Ghosal, Pratyush Maini, Aditi Raghunathan
ICML 2025
Oral (Dig-BUGS Workshop)
Unlocking Post-hoc Dataset Inference with Synthetic Data
Bihe Zhao, Pratyush Maini, Franziska Boenisch, Adam Dziedzic
ICML 2025
STAMP Your Content: Proving Dataset Membership via Watermarked Rephrasings
Saksham Rastogi, Pratyush Maini, Danish Pruthi
arXiv 2025
OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics
Vineeth Dorna, Anmol Mekala, Wenlong Zhao, Andrew McCallum, Zachary C. Lipton, J. Zico Kolter, Pratyush Maini
Workshop 2025
Oral (MemFM)
MAGIC: Diffusion Model Memorization Auditing via Generative Image Compression
Gunjan Dhanuka, Sumukh K. Aithal, Avi Schwarzschild, Zhili Feng, J. Zico Kolter, Zachary C. Lipton, Pratyush Maini
ICLR Blogpost 2025
Reassessing EMNLP 2024's Best Paper: Does Divergence-Based Calibration for MIAs Hold Up?
Pratyush Maini, Anshuman Suri
NeurIPS 2024
Best Paper (CONDA Workshop)
Rethinking LLM Memorization through the Lens of Adversarial Compression
Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter
NeurIPS 2024
Understanding Hallucinations in Diffusion Models through Mode Interpolation
Sumukh K. Aithal, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter
arXiv 2025
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining
Pratyush Maini, Vineeth Dorna, Parth Doshi, Aldo Carranza, Fan Pan, Jack Urbanek, Paul Burstein, Alex Fang, Alvin Deng, Amro Abbas, Brett Larsen, Cody Blakeney, Charvi Bannur, Christina Baek, Darren Teh, David Schwab, Haakon Mongstad, Haoli Yin, Josh Wills, Kaleigh Mentzer, Luke Merrick, Ricardo Monti, Rishabh Adiga, Siddharth Joshi, Spandan Das, Zhengping Wang, Bogdan Gaza, Ari Morcos, Matthew Leavitt
ICLR Blogpost 2025
Peeking Behind Closed Doors: Risks of LLM Evaluation by Private Data Curators
Harshay Bansal, Pratyush Maini
ICLR 2024
Oral (Datacomp Workshop)
T-MARS: Improving Visual Representations by Circumventing Text Feature Learning
Pratyush Maini, Sachin Goyal, Zachary C. Lipton, J. Zico Kolter, Aditi Raghunathan
NeurIPS 2022
Best Paper Nominee; Oral (SCIS Workshop)
Characterizing Datapoints via Second-Split Forgetting
Pratyush Maini, Saurabh Garg, Zachary C. Lipton, J. Zico Kolter
EMNLP 2023
Model-tuning via Prompts Makes NLP Models Adversarially Robust
Mrigank Raman, Pratyush Maini, J. Zico Kolter, Zachary C. Lipton, Danish Pruthi
UAI 2022
Perturbation Type Categorization for Multiple l_p Bounded Adversarial Robustness
Pratyush Maini, Xinyun Chen, Bo Li, Dawn Song
ICML 2020
Adversarial Robustness Against the Union of Multiple Perturbation Models
Pratyush Maini, Eric Wong, J. Zico Kolter
CVPR 2021
Data-free Model Extraction
Jean-Baptiste Truong*, Pratyush Maini*, Robert J. Walls, Nicolas Papernot
EMNLP 2020
Why and When Should You Pool? Analyzing Pooling in Recurrent Architectures
Pratyush Maini, Keshav Kolluru, Danish Pruthi, Mausam