Pratyush Maini
I like to observe. Look for patterns. Ponder over these generalizations. Try to refute them. Or otherwise prove their validity. And re-imagine their applications in alternate spheres.
I am a PhD candidate in Machine Learning at Carnegie Mellon University, advised by Zico Kolter and Zachary Lipton, and a founding member of DatologyAI.
My research studies how training data shapes the behavior, memorization, and reliability of foundation models, with the goal of making them safe to deploy beyond controlled research settings. This work is partially supported by the OpenAI Cybersecurity Award.
Research Highlights
Foundations of Synthetic Pretraining
How rewriting web data became an industry standard.
Redefining AI Safety
Embedding safety natively into pretraining.
Machine Unlearning
The most widely used benchmark for LLM unlearning.
Foundations of Synthetic Pretraining
I proposed that LLMs should be pretrained on synthetic rephrases of web data: the same content, rewritten into styles that we care about at deployment time. Rephrasing the Web (ACL 2024) showed a 3x training speedup with no quality loss. BeyondWeb scaled this to trillion-token regimes. This is now standard practice, publicly reported by NVIDIA (Nemotron-CC), Microsoft (Phi-4), Moonshot AI (Kimi K2), xAI (Grok), and Arcee AI (Trinity Large, the strongest American open-weight model, with data curated by DatologyAI).
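For readers who want the idea in code, here is a minimal sketch of the rephrasing pipeline, assuming an OpenAI-compatible chat endpoint. The prompts, style list, and mixing strategy below are illustrative placeholders, not the exact recipe from the papers.

```python
# Sketch: rewrite raw web documents into target styles with an
# instruction-following model, then mix the rephrases back in with the
# originals to form the pretraining corpus.
from openai import OpenAI  # assumes an OpenAI-compatible chat endpoint

client = OpenAI()

STYLES = {
    "qa": "Convert the passage into a series of question-answer pairs.",
    "wikipedia": "Rewrite the passage in a clear, encyclopedic style.",
}

def rephrase(document: str, style: str, model: str = "gpt-4o-mini") -> str:
    """Return a style-conditioned rephrase that preserves the content."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": STYLES[style]},
            {"role": "user", "content": document},
        ],
    )
    return resp.choices[0].message.content

def build_corpus(web_docs: list[str]) -> list[str]:
    # Keep the raw text and add one rephrase per style; real pipelines
    # tune this mix against downstream evaluations.
    corpus = list(web_docs)
    for doc in web_docs:
        for style in STYLES:
            corpus.append(rephrase(doc, style))
    return corpus
```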
Related work: visual data curation (ICLR 2024), scaling laws for data filtering (CVPR 2024), dataset inference (ICLR 2021 Spotlight), LLM dataset inference (NeurIPS 2024), watermarked rephrasings (ICML 2025).
Redefining AI Safety
Safety Pretraining (NeurIPS 2025) embeds safety directly into the pretraining process, not as a post-hoc patch. This reduces attack success rates from 38.8% to 8.3% with no performance cost, and the gains hold up after fine-tuning. OpenAI, Anthropic, the UK AI Safety Institute, and Cambridge/Oxford have since published related efforts.
This builds on Memorization Sinks (ICML 2025) and localizing memorization (ICML 2023). See also: when to introduce safety interventions during training.
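As a hedged sketch of the core idea: curate the pretraining corpus itself, keeping safe text verbatim and recontextualizing, rather than simply deleting, risky text, so the model still learns what harm looks like without learning to reproduce it. Both helpers below are toy placeholders, not the paper's actual classifier or rewriting prompts.

```python
# Sketch: score every pretraining document for harm; keep safe documents
# as-is, and rewrite (not drop) risky ones into a safer framing.

RISKY_TERMS = {"synthesize explosives", "bypass authentication"}  # toy list

def score_harm(document: str) -> float:
    """Toy stand-in for a learned safety classifier returning a [0, 1] score."""
    hits = sum(term in document.lower() for term in RISKY_TERMS)
    return min(1.0, hits / 2)

def recontextualize(document: str) -> str:
    """Toy stand-in for an LLM rewrite that reframes harmful content,
    e.g. as an educational warning or a refusal demonstration."""
    return ("[The passage below discusses a harmful topic; a safe model "
            "should refuse to operationalize it.]\n" + document)

def curate(web_docs: list[str], threshold: float = 0.5) -> list[str]:
    curated = []
    for doc in web_docs:
        if score_harm(doc) < threshold:
            curated.append(doc)                   # safe: keep verbatim
        else:
            curated.append(recontextualize(doc))  # risky: rewrite, don't drop
    return curated
```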
Machine Unlearning
TOFU (COLM 2024) is the most widely used benchmark for LLM unlearning. OpenUnlearning (NeurIPS 2025) packages 12+ methods and 10+ metrics into a single framework, tested across 450+ models. Adversarial Compression (NeurIPS 2024, Best Paper at ACL Workshop) is an information-theoretic measure that catches memorization other metrics miss; it has since been applied to copyright auditing and to auditing diffusion models (ICML Workshop 2025 Oral).
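The decision rule behind Adversarial Compression fits in a few lines: a string counts as memorized if some prompt shorter than the string reliably elicits it, i.e. the model acts as a compressor for that string. The sketch below shows only the ratio test; the hard part, searching for the shortest eliciting prompt, is done with gradient-based prompt optimization in the paper and is omitted here, and the token counts in the example are made up.

```python
# Sketch: Adversarial Compression Ratio (ACR) = |target| / |shortest
# eliciting prompt|, measured in tokens; memorized iff ACR > 1.

def compression_ratio(target_tokens: int, prompt_tokens: int) -> float:
    """ACR for a target string and the shortest prompt found to elicit it."""
    return target_tokens / prompt_tokens

def is_memorized(target_tokens: int, prompt_tokens: int) -> bool:
    # The model "compresses" the target only if the prompt is shorter.
    return compression_ratio(target_tokens, prompt_tokens) > 1.0

# Illustrative numbers: a 120-token passage elicited by a 9-token prompt.
print(compression_ratio(120, 9))  # ~13.3 -> strongly memorized
print(is_memorized(120, 9))       # True
```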