Diffusion LLM Part 1: Diffusion Fundamentals -- From DDPM to Score Matching
Forward/Reverse Process, ELBO, Simplified Loss, Score Function -- the mathematical principles of diffusion models explained intuitively.

To understand Diffusion-based language models, you first need to understand Diffusion models themselves. In this post, we cover the core principles of Diffusion models, which were first proven out in image generation. There is some math involved, but I have included intuitive explanations alongside the formulas, so you can follow the flow even if the equations feel unfamiliar.
This is the first installment of the Diffusion LLM series. See the Hub post for a series overview.
The Core Idea Behind Diffusion
The idea behind Diffusion models is surprisingly simple.
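That idea, as the subtitle hints, is to gradually corrupt data with noise (the forward process) and train a model to undo the corruption (the reverse process). As a minimal sketch of the forward side, here is the standard DDPM closed-form sample of x_t given x_0, using only plain Python; the function name and list-based tensors are illustrative, not from any particular library:

```python
import math

def forward_sample(x0, t, betas, noise):
    """Sample x_t from q(x_t | x_0) using the DDPM closed form:
        x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    where alpha_bar_t is the product of (1 - beta_s) for s = 1..t.
    x0 and noise are equal-length lists of floats; betas is the
    noise schedule (one beta per step)."""
    alpha_bar = 1.0
    for s in range(t):
        alpha_bar *= 1.0 - betas[s]
    return [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
            for x, e in zip(x0, noise)]
```

Two sanity checks capture the intuition: with an all-zero schedule, x_t is just x_0 (no corruption); with many steps of a nonzero beta, alpha_bar shrinks toward zero and x_t becomes almost pure noise. Part 1 builds on exactly this property when deriving the reverse process.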