LLM Reasoning Failures Part 1: Structural Limitations -- Scaling Won't Fix These
Reversal Curse, Counting, Compositional Reasoning — fundamental Transformer failures tested across 7 models.

This is the first installment in our series dissecting LLM reasoning failures. In this post, we cover three fundamental limitations that persist no matter how much you scale the model or expand the training data (a minimal probe sketch for all three follows the list):
- The Reversal Curse
- Counting Failures
- The Compositional Reasoning Wall
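
To make the three failure modes concrete before we dig in, here is a minimal, self-checking probe sketch. Treat it as an illustration, not the harness behind the 7-model comparison: the `ask` function is a hypothetical stub you would wire to whatever chat client you use (OpenAI, vLLM, Ollama, and so on), the reversal pair is the canonical example from the Reversal Curse paper (Berglund et al.), and the counting and multiplication probes compute their ground truth in plain Python.

```python
def ask(prompt: str) -> str:
    """Hypothetical stub: send `prompt` to your model, return its text reply."""
    raise NotImplementedError("wire this to your LLM client of choice")


def run_probes() -> None:
    probes = [
        # Reversal Curse: models that answer the forward question often
        # miss the reversed one (the canonical pair from Berglund et al.).
        ("Who is Tom Cruise's mother?", "Mary Lee Pfeiffer"),
        ("Who is Mary Lee Pfeiffer's famous son?", "Tom Cruise"),
        # Counting: exact character counts cut across subword token
        # boundaries; ground truth is computed in Python, not hardcoded.
        ("How many times does the letter 'r' appear in 'strawberry'? "
         "Reply with a single number.",
         str("strawberry".count("r"))),
        # Compositional wall: multi-digit multiplication chains many
        # intermediate steps, which is where accuracy collapses.
        ("What is 1234 * 5678? Reply with a single number.",
         str(1234 * 5678)),
    ]
    for prompt, expected in probes:
        reply = ask(prompt)
        verdict = "PASS" if expected.lower() in reply.lower() else "FAIL"
        print(f"[{verdict}] {prompt}")
        print(f"  expected: {expected} | got: {reply}")


if __name__ == "__main__":
    run_probes()
```

The substring check on the expected answer is deliberately crude; for a real evaluation you would normalize the reply and score many sampled items per category rather than one prompt each.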
Related Posts

From Evaluation to Deployment — The Complete Fine-tuning Guide
Evaluate with perplexity, KoBEST, and ROUGE-L; merge adapters with merge_and_unload(), convert to GGUF, and deploy via vLLM or Ollama. Covers overfitting prevention, data quality, and hyperparameter tuning.

QLoRA + Custom Dataset — Fine-tune 7B on a Single T4 GPU
Fine-tune Qwen 2.5 7B on a 16 GB T4 using QLoRA (4-bit NormalFloat + LoRA). Includes a Korean dataset preparation guide, explanations of NF4, double quantization, and paged optimizers, and Wandb monitoring.

Mastering LoRA — Fine-tune a 7B Model on a Single Notebook
From LoRA theory to hands-on Qwen 2.5 7B fine-tuning: train only 0.18% of the parameters while reaching 98% of full fine-tuning performance, with VRAM reduced from 130 GB to 18 GB.