CSCI 3388: Fall 2025

Overview

Generative AI (GenAI) is reshaping the future of humanity. From ChatGPT changing how people work to self-driving cars navigating complex environments, GenAI is driving change across every facet of society. At its core, GenAI constructs probability density functions to model language, images, proteins, molecules, and beyond.

This course not only teaches you how to build DeepSeek-R1 and Stable Diffusion from scratch, but also takes you on a deep dive into the mathematical foundations that power GenAI:

🎲 Probability & Statistics – To model uncertainty with functions.
🤖 Non-Convex Optimization (Deep Learning) – To train neural networks to approximate intricate functions.
📡 Information Theory – To enforce low-dimensional structures within the function.
🌀 Dynamical Systems – To model the evolution of the function over time and space.
🌪️ Stochastic Processes – To model the evolution of the function with controlled randomness.

This ambitious course culminates in a final project where you design a startup mockup with GenAI tools, aiming to create impact for the common good at the intersection of technology and business.

Prerequisites:

MATH: Linear Algebra (MATH 2210, MATH 2211), Multivariate Calculus (MATH 2202, MATH 2203)
CSCI: Randomness (CSCI 2244), one 3000-level AI course

Schedule

Theme	Date	Topic	Materials	Assignments	Final Project
	Tue, Aug 26	Lec 1. Introduction What: demos, formulation Why: opportunities, risks How: syllabus, logistics	The Social Dilemma	lab1
Language Generation (Week 1-7)	Symbolic Representation
	Thu, Aug 28	Lec 2. Logic and Grammars Logic: propositional logic, first-order logic Grammar: lexicon, syntax, Eliza	Eliza	hw1 (sol)
	Statistical Representation
	Tue, Sep 2	Lec 3. N-Gram Language Models Probability: language modeling Statistics: MLE, MAP N-Gram: smoothing, perplexity	Sasha Rush: LLM In 5 Formulas	lab2	survey
	Neural Representation
	Thu, Sep 4	Lec 4. Linear Models Task: regression, classification Data: (X, Y-continuous/discrete) Model: linear layer, sigmoid/softmax Loss: MSE/cross entropy Optimization: analytical/SGD	3Blue1Brown: Essence of Linear Algebra, Essence of Calculus	hw2 (sol)
	Tue, Sep 9	Lec 5. Multi-Layer Perceptrons Model: Activation, Dropout layer Optimization: Backpropagation		lab3
	Thu, Sep 11	Lec 6. Word Embeddings Counting: Word-context matrix Learning: NLM, Word2Vec (Skip-gram, CBOW) Vector Properties: Analogies	Papers: NLM, Word2Vec	hw3 (sol)
	Tue, Sep 16	Lec 7. Recurrent Neural Networks RNN: Hidden state & sequential modeling LSTM: Gates & cell state for long memory NLP applications: encoder, decoder	Papers: RNN, LSTM	lab4
	Thu, Sep 18	Lec 8. Attention Module Machine Translation: Seq2Seq Attention Mechanism: Query, Key, Value	Papers: Seq2Seq, Attention	hw4 (sol)	team info due
	Tue, Sep 23	Lec 9. Transformer Model Encoder: MHA, FFN, LayerNorm, Residual Connection Decoder: Masked Attn, Cross Attn Embedding: BPE, Sinusoidal Positional Encoding	Attention is All You Need! Paper: Transformer Code: Annotated Transformer
	Thu, Sep 25	Lec 10. GPT 1-3 GPT-1: Finetuning GPT-2: Zero-shot GPT-3: Few-shot (In-context Learning)	Scaling is all you need! Papers: GPT-1, GPT-2, GPT-3	lab5	problem statement due (Fri)
	Tue, Sep 30	Lec 11. InstructGPT Supervised Fine-Tuning Reinforcement Learning from Human Feedback	Alignment is all you need! Paper: InstructGPT
	Thu, Oct 2	Lec 12. Reinforcement Learning Basics MDP Framework: state, action, reward Value-based RL: Bellman Equation (Q-learning) Policy-based RL: ε-greedy		hw5 (sol)
	Tue, Oct 7	Lec 13. ChatGPT REINFORCE RLHF: PPO (ChatGPT), GRPO (DeepSeek-R1) Supervised HF: DPO
	Thu, Oct 9	Mid-term Exam	topics, practice (sol)
	Tue, Oct 14	No Class (Happy Fall Break)
Image Generation (Week 8-12)
	Thu, Oct 16	Lec 14. Image Basics and Filtering Digital Representation: 2D arrays, grayscale, RGB channels Filtering: Impulse, Box, Laplacian		lab6
	Statistical Representation
	Tue, Oct 21	Lec 15. Statistical Image Modeling Independent n-gram: Efros & Leung, image quilting Gaussian: Artistic style transfer
	Neural Representation
	Thu, Oct 23	Lec 16. CNN and U-Net Models 1D: PixelRNN 2D: CNN, U-Net		lab7
	Tue, Oct 28	Lec 17. Generative Adversarial Networks (GANs) Framework: Generator vs. discriminator game Architecture Theoretical results: JS divergence Challenges: Mode collapse, training instability, solutions
	Thu, Oct 30	Lec 18. Variational Autoencoders (VAEs) VAE Architecture: Encoder q(z\|x), decoder p(x\|z), prior p(z) ELBO Objective: Reconstruction + KL divergence Training: Reparameterization trick, KL vanishing		lab8
	Tue, Nov 4	Lec 19. Normalizing Flows Change of Variables: Exact likelihood computation Flow Architectures: RealNVP, Glow, continuous flows Properties: Invertibility, differentiability, composition
	Thu, Nov 6	Lec 20. Energy-based Models Energy-based model: Potential energy function, Boltzmann distribution
	Tue, Nov 11	Lec 21. Diffusion Models (VAE view) Text-to-Image: DALL·E 2, Midjourney, Imagen Advanced Guidance: Classifier & classifier-free guidance Latent Diffusion: Stable Diffusion, faster training
	Thu, Nov 13	Lec 22. Diffusion Models (EBM view) Score-based model: Score matching, denoising diffusion
Guest Lectures (Week 13-14)
	Tue, Nov 18	Lec 23. Image Style Transfer Siyu Huang (Clemson University)
	Thu, Nov 20	Lec 24. Molecule Generation for Drug Discovery Wengong Jin (Northeastern University)
	Tue, Nov 25	Lec 25. Biomedical 3D Generation Jiancheng Yang (Aalto University)
	Thu, Nov 27	No Class (Happy Thanksgiving)
Finals (Week 15-16)	Tue, Dec 2	Project Presentations I
	Thu, Dec 4	Project Presentations II
	Thu, Dec 11				report/code due

Staff & Office Hours

Donglai Wei

Instructor

Omer Yurekli

Zimeng Yang

Name	Office hours
Donglai Wei	(Tu/W) 3-4 pm @ Rm 528F, 245 Beacon ST
Omer Yurekli	(M) 10-11 am, (F) 12-1 pm @ Rm 122, 245 Beacon ST
Zimeng Yang	(W) 2-4 pm @ Rm 122, 245 Beacon ST

Email or Slack Donglai about anything else that would help you succeed in the course.

Course information

1. Get help (besides office hours)

Slack: For labs/psets/final projects, we will create dedicated channels for you to ask public questions. If you cannot make your post public (e.g., due to revealing problem set solutions), please create a room with instructors and TAs separately, or come to office hours. Please note, however, that the course staff cannot provide help debugging code, and there is no guarantee that they'll be able to answer last-minute assignment questions before the deadline. We also appreciate it when you respond to questions from other students! If you have an important question that you would prefer to discuss over email, you may email the course staff, or the instructor directly.
Support: The university counseling services center provides a variety of programs and activities.
Accommodations for students with disabilities: If you are a student with a documented disability seeking reasonable accommodations in this course, please contact Kathy Duggan, (617) 552-8093, dugganka@bc.edu, at the Connors Family Learning Center regarding learning disabilities and ADHD, or Rory Stein, (617) 552-3470, steinr@bc.edu, in the Disability Services Office regarding all other types of disabilities, including temporary disabilities. Advance notice and appropriate documentation are required for accommodations.

2. Assignments/Grading

Submission: Submit the required files to Canvas.
10% - Attendance: Each week has a coding exercise in Colab to help you gain a hands-on understanding of the material.
15% - Labs (weekly, Colab): Hands‑on GenAI implementations.
15% - HWs (weekly, handwritten): Mathematical exercises (derivations/proofs) to solidify theory.
25% - Midterm (closed‑book, handwritten): Focused on mathematical concepts from HWs; one double-page notes sheet allowed.
35% - Final project (startup mockup with GenAI): Proposal (5%), Milestone demo (10%), Final demo (10%), Written report (5%), Individual reflection/peer eval (5%). (More info)
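
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is summarized as a 0-100 average and that the course grade is a plain weighted sum; the function name `course_grade` and the example scores are hypothetical, and any curving or rounding the staff may apply is not modeled.

```python
# Weights copied from the grading breakdown above (percent of the course grade).
WEIGHTS = {
    "attendance": 10,
    "labs": 15,
    "hws": 15,
    "midterm": 25,
    "final_project": 35,
}


def course_grade(scores):
    """Return the weighted course grade in percent.

    `scores` maps each component name to a 0-100 average (an assumption;
    the syllabus does not say how component scores are aggregated).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {
    "attendance": 100,
    "labs": 90,
    "hws": 80,
    "midterm": 70,
    "final_project": 85,
}
print(course_grade(example))  # 82.75
```

The final project's five sub-items (proposal, milestone demo, final demo, written report, reflection/peer eval) add up to the 35% shown, so they could be listed as separate entries in the same table if finer tracking is wanted.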

3. Academic policy

Late policy: You'll have 10 late days (counting weekends) for labs and a separate 10 late days for HWs over the course of the semester. Each time you use one, you may submit an assignment one day late without penalty. You are allowed to use multiple late days on a single assignment. For example, you can use seven late days at once to turn in one assignment a week late. You do not need to notify us when you use a late day; we'll deduct it automatically. If you run out of late days and still submit late, your assignment will be penalized at a rate of 10% per day. We will not provide additional late time, except under exceptional circumstances, and for these we'll require documentation (e.g., a doctor's note). Please note that the late days are provided to help you deal with minor setbacks, such as routine illness or injury, paper deadlines, interviews, and computer problems; these do not generally qualify for an additional extension.
Academic integrity: While you are encouraged to discuss homework assignments with GenAI or other students, your programming work must be completed individually. Thus it is acceptable to learn from GenAI or another student the general idea for writing program code to perform a particular task, or the technique for solving a mathematical problem, but unacceptable for two students to prepare their assignments together and submit what are essentially two copies of identical work. If you have any uncertainty about the application of this policy, please check with me. Failure to comply with these guidelines will be considered a violation of the University policies on academic integrity. Please make sure that you are familiar with these policies.
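
The late-day accounting described in the late policy can be sketched as follows. This is only an illustration under stated assumptions: the function name `late_penalty` is hypothetical, the 10%-per-day penalty is treated as additive and capped at 100%, and free late days are spent before any penalty applies. Labs and HWs have separate pools, so the function would be called with whichever balance matches the assignment type.

```python
def late_penalty(days_late, late_days_left, rate_per_day=10):
    """Return (penalty_percent, late_days_left_after) for one submission.

    Assumptions (not stated in the syllabus): free late days are consumed
    first, the per-day penalty is additive, and it is capped at 100%.
    """
    if days_late < 0 or late_days_left < 0:
        raise ValueError("days_late and late_days_left must be non-negative")
    used = min(days_late, late_days_left)      # free late days spent
    penalized_days = days_late - used          # days not covered by the pool
    penalty = min(100, penalized_days * rate_per_day)
    return penalty, late_days_left - used


# A lab submitted 3 days late with a full pool of 10 late days: no penalty.
print(late_penalty(3, 10))  # (0, 7)

# A HW submitted 3 days late with only 1 late day left: 2 penalized days.
print(late_penalty(3, 1))   # (20, 0)
```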

4. Additional resources (Free student accounts!)

LLM: ChatGPT, Claude, Gemini
IDE: Cursor

Acknowledgements: This course draws heavily from ...

CSCI 3388 Generative AI: Mathematics and Applications

Instructor: Donglai Wei Fall 2025 (TT 1:30-2:45 pm) 245 Beacon ST Room 125
