LLM Reasoning Failures Part 2: Cognitive Biases — Inherited from Human Data
Anchoring, Order Bias, Sycophancy, Confirmation Bias — cognitive biases from RLHF and training data, tested across 7 models.

LLMs learn from human-generated text. The problem is, they inherit human biases along with it.
In Part 1 we examined structural limitations such as the Reversal Curse, counting failures, and compositional reasoning breakdowns: fundamental architectural constraints that persist regardless of scale. Part 2 focuses on robustness. The model doesn't necessarily give wrong answers; rather, its answers shift depending on how you ask the question.
Structural limitations cannot be fixed by scaling models up. Cognitive biases are different: they stem from biased training data and RLHF, so in principle they can be reduced. In practice, they are still observed in every model we tested.
We tested 4 cognitive biases across 7 models: GPT-4o, GPT-4o-mini, o3-mini, Claude Sonnet 4.5, Claude Haiku 4.5, Gemini 2.5 Flash, and Gemini 2.5 Flash-Lite.
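To make "answers shift depending on how you ask" concrete, here is a minimal sketch of how an order-bias check could be scored. It is not the harness used for the results in this post. The function `order_consistency` and the two stub "models" are hypothetical names introduced for illustration, and the stubs stand in for a real API call so the example runs offline.

```python
from collections import Counter
from itertools import permutations
from typing import Callable, Sequence


def order_consistency(
    ask: Callable[[str], str], question: str, options: Sequence[str]
) -> float:
    """Return the share of option orderings on which the model gives its
    most frequent answer. 1.0 means the order of options never changed
    the answer; lower values mean the answer moved with the ordering."""
    picks = []
    for perm in permutations(options):
        # Present the same options in a different order each time.
        labeled = "\n".join(f"{chr(65 + i)}. {opt}" for i, opt in enumerate(perm))
        prompt = f"{question}\n{labeled}\nAnswer with the option text only."
        picks.append(ask(prompt).strip())
    top_count = Counter(picks).most_common(1)[0][1]
    return top_count / len(picks)


# Hypothetical stand-ins for a model call (assumption: a real test would
# send the prompt to an LLM API and parse the reply instead).
def first_option_model(prompt: str) -> str:
    """Always picks whatever is listed as option A (maximal order bias)."""
    for line in prompt.splitlines():
        if line.startswith("A. "):
            return line[3:]
    return ""


def stable_model(prompt: str) -> str:
    """Gives the same answer no matter how the options are ordered."""
    return "Paris"


if __name__ == "__main__":
    question = "Which city is the capital of France?"
    options = ["Paris", "Lyon", "Nice"]
    print(order_consistency(stable_model, question, options))
    print(order_consistency(first_option_model, question, options))
```

The same scoring idea extends to the other biases by swapping what varies between prompts (an anchor number, a stated user opinion, a leading hypothesis) while holding the underlying question fixed.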