Publications

arXiv 2025

Safety Pretraining: Toward the Next Generation of Safe AI

Pratyush Maini*, Sachin Goyal*, Dylan Sam*, Alex Robey, Yash Savani, Yiding Jiang, Andy Zou, Matt Fredrikson, Zachary C. Lipton, J. Zico Kolter

A framework that builds safety into models during pretraining, via safety-focused data curation, rather than relying solely on post-hoc alignment.

ACL 2024

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

Pratyush Maini, Skyler Seto, He Bai, David Grangier, Yizhe Zhang, Navdeep Jaitly

Paraphrasing web data into Q/A pairs significantly improves language model training efficiency.

CVPR 2024 Best Paper (DPFM Workshop)

Scaling Laws for Data Filtering — Data Curation cannot be Compute Agnostic

Sachin Goyal, Pratyush Maini, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter

Training compute-optimal models requires data filtering that scales with available compute.

NeurIPS 2024 Oral (Private-NLP Workshop)

LLM Dataset Inference: Did you train on my dataset?

Pratyush Maini*, Hengrui Jia*, Nicolas Papernot, Adam Dziedzic

Black-box detection of whether a dataset was used to train an LLM.

COLM 2024 Oral (Set-LLM Workshop)

TOFU: A Task of Fictitious Unlearning for LLMs

Pratyush Maini*, Zhili Feng*, Avi Schwarzschild*, Zachary C. Lipton, J. Zico Kolter

Benchmarking machine unlearning methods for large language models.

ICML 2023

Can Neural Network Memorization Be Localized?

Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang

Memorization is not confined to individual layers; it is localized to a small set of neurons distributed across the network.

ICLR 2021 Spotlight

Dataset Inference: Ownership Resolution in Machine Learning

Pratyush Maini, Mohammad Yaghini, Nicolas Papernot

First work on dataset inference: determining if a dataset was used for training.