10-799: Data Privacy, Memorization and Copyright in Generative AI

Fall 2024

Operation Veritas: Watermarking and Detection of AI-Generated Content

Situation

With the 2028 elections approaching, there’s a high risk of AI-generated disinformation flooding information channels. We need to develop methods to watermark AI-generated content and detect it, even when attempts have been made to conceal its origin.

What We Have

  • Access to the FLUX.1-dev model via Hugging Face (black-forest-labs/FLUX.1-dev)
  • Starter code for basic watermarking and detection
  • High-performance computing resources for model training and evaluation
  • A dataset of human-written and AI-generated text samples
  • Access to other language models for comparison and potential use in watermark removal

Blue Team Mission

Your mission is to develop a robust system for watermarking and detecting AI-generated content.

Objectives:

  1. Design and implement an invisible watermarking technique for the FLUX model’s outputs (a minimal embed-and-detect baseline is sketched after this list)
  2. Develop reliable methods to detect the presence of these watermarks
  3. Create advanced AI-generated content detection algorithms that work even in the absence of watermarks
  4. Ensure watermarking doesn’t significantly impact the quality or meaning of the generated content
  5. Develop a public education strategy to help citizens identify AI-generated content
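
As a naive starting point for objectives 1 and 2, the sketch below hides a fixed secret bit string in the least-significant bits of a generated image and checks how many bits survive at detection time. It is deliberately simple (and trivially removable), assumes a recent diffusers library with FLUX support plus access to the gated FLUX.1-dev weights, and the helper function names are illustrative rather than part of the starter code.

```python
# Naive LSB watermark baseline: hide a fixed secret bit string in the
# least-significant bits of the first pixels, then check how many bits match.
# FluxPipeline requires a recent `diffusers` and access to the gated
# FLUX.1-dev weights; helper names here are illustrative, not starter-code APIs.
import numpy as np
import torch
from diffusers import FluxPipeline
from PIL import Image

SIGNATURE = np.random.default_rng(799).integers(0, 2, size=256, dtype=np.uint8)  # shared secret bits

def embed_lsb_watermark(img: Image.Image, bits: np.ndarray = SIGNATURE) -> Image.Image:
    arr = np.array(img)
    flat = arr.reshape(-1)
    flat[: len(bits)] = (flat[: len(bits)] & 0xFE) | bits  # overwrite least-significant bits
    return Image.fromarray(arr)

def detect_lsb_watermark(img: Image.Image, bits: np.ndarray = SIGNATURE, min_match: float = 0.9) -> bool:
    recovered = np.array(img).reshape(-1)[: len(bits)] & 1
    return (recovered == bits).mean() >= min_match  # fraction of recovered bits that match

if __name__ == "__main__":
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")
    image = pipe("a city street on election day", num_inference_steps=28).images[0]
    watermarked = embed_lsb_watermark(image)
    print("watermark detected:", detect_lsb_watermark(watermarked))
```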

Red Team Mission

Your mission is to challenge the Blue Team’s solutions, simulating the actions of malicious actors trying to spread undetected AI-generated content.

Objectives:

  1. Develop methods to remove or obscure the watermarks without significantly altering the content (simple removal baselines are sketched after this list)
  2. Create AI-generated content that can evade the Blue Team’s detection methods
  3. Use other language models or techniques to paraphrase watermarked content while preserving its meaning
  4. Explore potential weaknesses in the watermarking or detection systems
  5. Simulate disinformation campaigns to test the resilience of the Blue Team’s solutions
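
For objective 1, a reasonable first pass is to run the Blue Team’s detector on images that have been put through common, content-preserving transformations. The sketch below applies JPEG recompression, rescaling, and additive Gaussian noise; it assumes only Pillow and NumPy, and the function names are illustrative.

```python
# Content-preserving transformations that often strip fragile watermarks:
# JPEG recompression, down/up-scaling, and additive Gaussian noise.
# Run the Blue Team's detector on the outputs to measure robustness.
import io
import numpy as np
from PIL import Image

def jpeg_roundtrip(img: Image.Image, quality: int = 75) -> Image.Image:
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)  # lossy re-encode
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def rescale(img: Image.Image, factor: float = 0.5) -> Image.Image:
    w, h = img.size
    small = img.resize((int(w * factor), int(h * factor)), Image.Resampling.BICUBIC)
    return small.resize((w, h), Image.Resampling.BICUBIC)  # back to original size

def add_noise(img: Image.Image, sigma: float = 2.0) -> Image.Image:
    arr = np.array(img).astype(np.float32)
    arr += np.random.normal(0.0, sigma, arr.shape)
    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

# Example: report whether each attack defeats a (hypothetical) detector.
# for name, attack in {"jpeg": jpeg_roundtrip, "rescale": rescale, "noise": add_noise}.items():
#     print(name, detect_lsb_watermark(attack(watermarked)))
```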

Evaluation Criteria

  • Invisibility of watermarks to human readers
  • Accuracy of watermark detection
  • Robustness of watermarks against removal attempts
  • Effectiveness of AI-generated content detection, with and without watermarks
  • Practicality and scalability of the proposed solutions
  • Quality and potential impact of the public education strategy

The success of this mission is crucial for maintaining the integrity of information during the upcoming elections. Both teams play a vital role in creating a robust system to combat AI-generated disinformation.

Overview

This course will cover various topics concerning data privacy, such as differential privacy, extracting training data from models, unlearning techniques to remove such data, and legal issues related to data memorization and copyright. The class blends theory and practice, starting with an understanding of why data privacy matters, looking at past legal cases, and building a foundation in privacy and machine learning (concepts such as differential privacy and membership inference).

The highlight of the course will be two case studies, where students will be divided into blue and red teams to find and defend against privacy vulnerabilities in a contemporary generative model.

Learning Objectives

  • Gain an understanding of data privacy in the context of machine learning, its importance, and the techniques used
  • Explore privacy challenges in generative AI, LLMs, and diffusion models
  • Work in teams to identify, mitigate, and defend against privacy vulnerabilities in AI models
  • Analyze legal, ethical, and practical aspects of AI-generated content, including copyright

Prerequisites

  • Basic machine learning concepts, background in deep learning
  • Familiarity with Python programming (PyTorch)
  • Interest in data privacy and legal issues in AI

Course Information

  • Instructor: Pratyush Maini (pratyushmaini@cmu.edu) | Course Advisors: Zack Lipton, Zico Kolter, and Daphne Ippolito
  • Schedule: Tuesdays and Thursdays, 5:00 PM - 6:20 PM
  • Location: GHC-4301
  • Office Hours: Thursdays, 10:00 AM - 11:00 AM
  • Elective: This course is an official 6-unit elective for MS and PhD students in ML@CMU. Any SCS student can take it.

Frequently Asked Questions

Why should I take this course?

  1. Too many students at CMU are focused on building state-of-the-art models, but we don’t talk enough about those models’ societal impacts, the data they’re trained on, or the artists who are stakeholders in this process.
  2. Plus, breaking things is fun! The assignments in this course will be quite different from what you expect in a general course, and will attempt to gamify learning.

How much time would it consume?

  1. Expect it to take about as much time as any typical 12-unit course, but for half the semester.
  2. Most of the evaluation is experiment-based, through team battles between defenders and attackers. These competitions have a low entry bar, but no defined ceiling on how well you can do. There’s no one right answer to the assignments. It’s up to you to channel your curiosity and push yourself to do the best you can.

Is this course for me?

  1. If you have a background in PyTorch, have trained models before, and understand basic algebra, backpropagation, gradients, etc., you should have the necessary pre-reqs to follow along. If you’ve worked on adversarial attacks before, you’ll be in a great spot.
  2. From the legal side, I don’t expect people to come with a lot of background. The goal is to build that understanding together, have open conversations, and share thoughts.

Course Structure

Three main themes; the latter two are explored through in-depth case studies, and the first through an in-class activity:

  1. Data Privacy and Differential Privacy
  2. Data Memorization and Unlearning Copyrighted Content
  3. Legal Issues, Ethics, and Detecting AI-Generated Content

Assessment

  • Red Team Projects (40%): Two challenges
  • Blue Team Projects (50%): Two defenses
  • Class Participation (10%): Discussions

Schedule

Theme 1: Data Privacy and Differential Privacy

Oct 22 (Tue): The Birth of the Printing Press & The Anatomy of a Threat Model
  • Reading: The Work of Art in the Age of Mechanical Reproduction by Walter Benjamin; The Protection of Information in Computer Systems
  • Lecture and Discussion: Overview of privacy issues in ML, privacy breaches
  • Announcement: Sign up for GPU credits, HW 1 Release

Oct 24 (Thu): Differential Privacy
  • Reading: Deep Learning with Differential Privacy; The Algorithmic Foundations of Differential Privacy (Chapters 1-3); Privacy in Machine Learning: A Survey; Robust De-anonymization of Large Sparse Datasets
  • Hands-on Exercise: Implementing differential privacy in simple models

Oct 29 (Tue): Data Privacy Class Activity
  • Reading: Materials provided in class
  • In-Class Activity: Simulating privacy attacks and defenses
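
The Oct 24 hands-on exercise asks you to implement differential privacy in simple models. As a rough preview in the spirit of the Deep Learning with Differential Privacy reading, here is a toy DP-SGD step: clip each per-example gradient, sum, add Gaussian noise, and take a step. Hyperparameters are placeholders, privacy accounting is omitted, and in practice a library such as Opacus would handle this bookkeeping.

```python
# Toy DP-SGD step: clip every per-example gradient to norm <= clip_norm,
# sum, add Gaussian noise with std sigma*clip_norm, average, then step.
# Privacy accounting is omitted; all hyperparameters are placeholders.
import torch
import torch.nn as nn

def dp_sgd_step(model, loss_fn, xb, yb, lr=0.1, clip_norm=1.0, sigma=1.0):
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xb, yb):                              # per-example gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        total_norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters()))
        scale = torch.clamp(clip_norm / (total_norm + 1e-12), max=1.0)  # clipping factor
        for acc, p in zip(summed, model.parameters()):
            acc += p.grad * scale
    with torch.no_grad():
        for acc, p in zip(summed, model.parameters()):
            noisy = (acc + sigma * clip_norm * torch.randn_like(acc)) / len(xb)
            p -= lr * noisy                               # noisy averaged gradient step

model = nn.Linear(10, 2)
xb, yb = torch.randn(8, 10), torch.randint(0, 2, (8,))
dp_sgd_step(model, nn.CrossEntropyLoss(), xb, yb)
```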

Theme 2: Data Memorization and Unlearning Copyrighted Content

Case Study: Operation Poké-Purge

Oct 31 (Thu): Data Memorization in ML Models
  • Reading: The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks; Membership Inference Attacks Against Machine Learning Models
  • Extra Reading: Extracting Training Data from Large Language Models
  • Lecture and Discussion: Understanding memorization and its impacts

Nov 5 (Tue): Measuring and Mitigating Memorization
  • Reading: A Closer Look at Memorization in Deep Networks; Rethinking LLM Memorization through the Lens of Adversarial Compression; Extracting Memorized Training Data via Decomposition
  • Lecture and Discussion: Techniques to measure and reduce memorization
  • Announcement: Homework for Theme 3 assigned

Nov 7 (Thu): Unlearning Techniques
  • Reading: Machine Unlearning; Towards Making Systems Forget with Machine Unlearning
  • Extra Reading: Task of Fictitious Unlearning
  • Hands-on Exercise: Implementing unlearning methods
  • Resource: Erasing Concepts from Diffusion Models

Nov 12 (Tue): Case Study 1: Operation Poké-Purge
  • Reading: Case Study 1 Discussion
  • Case Study Discussion: Pokémon unlearning challenge, team matchups & strategy discussion
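
To make the Oct 31 membership-inference reading concrete: the simplest attack in that line of work thresholds a trained model’s per-example loss, since training members tend to have lower loss than unseen points. The sketch below assumes you already have a trained classifier and labeled member/non-member tensors; it is an illustration, not assignment starter code.

```python
# Loss-thresholding membership inference: an example whose loss under the
# trained model falls below a threshold is predicted to be a training member.
# `model`, the data tensors, and the threshold choice below are illustrative.
import torch
import torch.nn.functional as F

@torch.no_grad()
def per_example_loss(model, x, y):
    return F.cross_entropy(model(x), y, reduction="none")  # one loss per example

def predict_membership(model, x, y, threshold):
    return per_example_loss(model, x, y) < threshold        # True -> predicted member

# Typical evaluation: compare the true-positive rate on known training points
# with the false-positive rate on held-out points at the same threshold.
# member_losses = per_example_loss(model, x_train, y_train)
# nonmember_losses = per_example_loss(model, x_test, y_test)
# threshold = nonmember_losses.median()
# tpr = (member_losses < threshold).float().mean()
# fpr = (nonmember_losses < threshold).float().mean()
```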

Case Study: Operation Veritas

Nov 14 (Thu): Guest Lecture: Legal and Ethical Issues in AI
  • Reading: Generative AI Lawsuits Timeline: Legal Cases vs. OpenAI, Microsoft, Anthropic, Nvidia, Intel and More; The Files are in the Computer: On Copyright, Memorization, and Generative AI; Notice from US Copyright
  • Lecture and Discussion: AI ethics, legal implications, and copyright issues

Nov 19 (Tue): What is Fair Learning?
  • Reading: Fair Learning; Unfair Learning
  • Extra Reading: Talkin’ ‘Bout AI Generation: Copyright and the Generative-AI Supply Chain
  • Class Activity: Open Discussion Floor

Nov 21 (Thu): Detecting & Watermarking AI-Generated Content
  • Reading: Defending Against Neural Fake News; Automatic Detection of Machine Generated Text: A Critical Survey; A Watermark for Large Language Models
  • Extra Reading: Deepfake Detection
  • Lecture and Discussion: Watermarking methods

Nov 26 (Tue): Case Study 2: Operation Veritas
  • Reading: Case Study 2 Briefing
  • Case Study Discussion: Election integrity challenge, team matchups and strategy discussion
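
The Nov 21 reading A Watermark for Large Language Models proposes biasing generation toward a pseudorandom “green list” of tokens seeded by the previous token, then detecting the watermark with a z-test on how many green tokens appear. The sketch below follows that recipe using gpt2 as a small stand-in model; the key, green-list fraction, and bias strength are illustrative choices, not values from the paper or the starter code.

```python
# Green-list watermark sketch (after Kirchenbauer et al.): hash the previous
# token to seed a pseudorandom vocabulary split, add a bias delta to the
# "green" half while sampling, then test whether green tokens are over-represented.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
GAMMA, DELTA, KEY = 0.5, 4.0, 799  # green fraction, logit bias, secret key (illustrative)

def green_ids(prev_token: int, vocab_size: int) -> torch.Tensor:
    gen = torch.Generator().manual_seed(KEY * 1_000_003 + prev_token)
    return torch.randperm(vocab_size, generator=gen)[: int(GAMMA * vocab_size)]

@torch.no_grad()
def generate_watermarked(prompt: str, max_new_tokens: int = 60) -> torch.Tensor:
    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        logits = model(ids).logits[0, -1]
        logits[green_ids(int(ids[0, -1]), logits.numel())] += DELTA  # favor green tokens
        next_id = torch.multinomial(torch.softmax(logits, dim=-1), 1)
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    return ids[0]

def detection_z_score(token_ids: torch.Tensor) -> float:
    vocab = model.config.vocab_size
    hits = sum(
        int(token_ids[i].item() in set(green_ids(int(token_ids[i - 1]), vocab).tolist()))
        for i in range(1, len(token_ids))
    )
    n = len(token_ids) - 1
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))  # large z => watermarked

ids = generate_watermarked("Breaking news:")
print(tok.decode(ids), "| z =", round(detection_z_score(ids), 2))
```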

Final Presentations

Nov 28 (Thu): Thanksgiving Break (no class)

Dec 3 (Tue): Wrap Up
  • Presentations: Students present case study results

Note: This schedule is subject to change. Please check regularly for updates.


Additional Notes:

  • GPU Credits Sign-Up: Please ensure you sign up for GPU credits by Oct 24 (Thu) to participate in hands-on exercises.

  • Homework Assignments: Homework for each theme is announced during the preceding theme; see the schedule for exact release dates.

  • Extra Reading: Optional materials are provided for deeper understanding and exploration of topics.

Case Studies

This course features two in-depth case studies that allow students to apply theoretical knowledge to practical challenges. Each case study corresponds to one of the main themes of the course.

Case Study 1: Operation Poké-Purge

Theme: Data Memorization and Unlearning Copyrighted Content

This case study challenges you to tackle the problem of unlearning copyrighted content, specifically Nintendo’s top 100 Pokémon, from an advanced diffusion model while maintaining its core functionality.
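
One way to sanity-check an unlearning attempt is to generate images from prompts that name the erased concept and score them with an off-the-shelf CLIP model. The sketch below is only an illustration: the CLIP checkpoint is an example choice, and `unlearned_pipe` is a hypothetical stand-in for your edited diffusion model, not part of the provided starter code.

```python
# CLIP-based check of whether a concept was actually unlearned: compare how
# strongly CLIP matches the image to a caption naming the concept vs. a
# generic caption. Checkpoint name is an example; `unlearned_pipe` is hypothetical.
import torch
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

@torch.no_grad()
def concept_probability(image, concept: str) -> float:
    """Probability CLIP assigns to the concept caption over a generic caption."""
    inputs = proc(
        text=[f"a picture of {concept}", "a picture of a generic cartoon creature"],
        images=image,
        return_tensors="pt",
        padding=True,
    )
    return clip(**inputs).logits_per_image.softmax(dim=-1)[0, 0].item()

# Scores near 0 after unlearning (and near 1 before) suggest the edit worked.
# image = unlearned_pipe("a drawing of Pikachu").images[0]
# print("P(concept):", concept_probability(image, "Pikachu"))
```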

Case Study 2: Operation Veritas

Theme: Legal Issues, Copyright, and Detecting AI-Generated Content

In this final case study, you’ll develop and test watermarking and detection systems for AI-generated content in the context of potential election disinformation.
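
A watermark-free detection baseline worth knowing before this case study: machine-generated text tends to have lower perplexity under a reference language model than human writing, so thresholding the average token loss is a crude but common starting point. In the sketch below, gpt2 and the threshold value are illustrative assumptions; the threshold would be tuned on the provided dataset of human-written and AI-generated samples.

```python
# Watermark-free detection baseline: threshold average token loss
# (log-perplexity) under a reference language model. gpt2 and the
# threshold are illustrative; tune the threshold on the course dataset.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")

@torch.no_grad()
def avg_token_loss(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    return lm(ids, labels=ids).loss.item()   # mean next-token cross-entropy

def looks_ai_generated(text: str, threshold: float = 3.5) -> bool:
    return avg_token_loss(text) < threshold  # low loss (perplexity) -> likely machine text

print(looks_ai_generated("Officials confirmed the results of the vote on Tuesday."))
```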

For each case study, students will be divided into red teams (challengers) and blue teams (defenders) to apply concepts learned in practical situations. Detailed instructions and resources for each case study will be provided in the respective briefing documents.

Class Activity: Data Privacy and Differential Privacy

While not a formal case study, we will explore the theme of Data Privacy and Differential Privacy through in-class activities and discussions. These exercises will provide hands-on experience with privacy-preserving methods and differential privacy techniques.

  • Activity Date: October 29, 2024
  • Details will be provided in class

Logistics

Late Submissions

  • 4 total grace days with a maximum of 2 grace days per assignment.
  • After grace days: 50% penalty up to 24 hours late, no credit after.
  • Extensions available for medical, family/personal emergencies, or university-approved absences.
  • For planned absences, request an extension at least 5 days before the deadline.

Audit and Pass/Fail

  • No official auditing; unofficial attendance welcome
  • Pass/Fail allowed (check program requirements)

Academic Integrity

  • Use Generative AI tools. Disclose!
  • Group study encouraged, but no sharing of written notes/code between teams
  • Searching for prior solutions/research papers is encouraged. Disclose!
  • Protect your work from copying
  • Violations result in grade reduction or course failure