Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training

What it is
Think of a transformer model as a tower of processing floors (its layers). The researchers found that certain consecutive floors, say floors 12, 13, and 14, form a complete "reasoning department." Copy that entire department and paste it back in directly above the original, and the model literally runs its logical-reasoning pipeline twice in sequence. No retraining, just architectural cloning; a sketch of the operation follows below.
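Here is a rough sketch of that operation in Python, assuming a Hugging Face decoder-only model (Llama/Mistral style) whose blocks live at `model.model.layers`. This is a minimal illustration, not the repo's implementation; the checkpoint name and layer indices are placeholders, since the post only says "a 24B LLM":

```python
import copy

import torch
from torch import nn
from transformers import AutoModelForCausalLM

def duplicate_layers(model, start, end):
    """Insert deep copies of decoder layers [start, end] (inclusive)
    immediately after the originals, so that block runs twice on every
    forward pass. No weights change; no training happens."""
    layers = model.model.layers  # nn.ModuleList of decoder blocks
    clones = [copy.deepcopy(layers[i]) for i in range(start, end + 1)]
    new_layers = list(layers[: end + 1]) + clones + list(layers[end + 1 :])
    # Re-number layers so per-layer KV-cache indexing stays consistent.
    for idx, layer in enumerate(new_layers):
        if hasattr(layer, "self_attn"):
            layer.self_attn.layer_idx = idx
    model.model.layers = nn.ModuleList(new_layers)
    model.config.num_hidden_layers = len(new_layers)
    return model

# Placeholder checkpoint: the post does not name the 24B model used.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Small-24B-Instruct-2501", torch_dtype=torch.bfloat16
)
model = duplicate_layers(model, start=12, end=14)  # clone "floors" 12-14
```

Deep-copying the block costs extra VRAM (roughly 3/N of the model's parameters for an N-layer model, a few GB at bf16 at this scale). Sharing the modules instead would be free, but the same layer object appearing twice breaks per-layer KV-cache bookkeeping in transformers, which is why the sketch copies and re-numbers.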
Why it matters
This changes the economics of model improvement. Instead of training a bigger model from scratch (expensive and slow), you might boost reasoning by identifying and duplicating the right three-layer circuit. If you're running local models on consumer hardware, that's a path to better performance without buying H100s: the technique reportedly worked on AMD gaming GPUs (an RX 7900 XT and an RX 6950 XT), which keeps it accessible.
Key details
- Method replicates David Ng's RYS (Replicate YourSelf) technique on a 24B-parameter model
- Logical-deduction benchmark score rose from 0.22 to 0.76, a ~245% relative improvement
- "Reasoning circuits" identified as contiguous 3-4 layer blocks that function as cognitive units (see the sweep sketch after this list)
- Runs on consumer AMD hardware: an RX 7900 XT and an RX 6950 XT were tested successfully
- Code available on GitHub at alainnothere/llm-circuit-finder
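The summary doesn't say how the repo actually locates a "reasoning circuit." One plausible brute-force version is a sweep: duplicate each contiguous block in turn and keep whichever start index scores best on a small deduction probe set. This reuses `duplicate_layers` from the sketch above; `score_model` is a hypothetical evaluation helper, not something from the repo:

```python
def find_best_block(model_name, n_layers, width=3, probe_set=None):
    """Hedged sketch: duplicate every contiguous `width`-layer block and
    return the start index that scores best. The base model is reloaded
    each iteration so edits don't compound."""
    best_start, best_acc = None, -1.0
    for start in range(n_layers - width + 1):
        m = AutoModelForCausalLM.from_pretrained(
            model_name, torch_dtype=torch.bfloat16, device_map="auto"
        )
        m = duplicate_layers(m, start, start + width - 1)
        # Hypothetical scorer: accuracy on held-out logical-deduction puzzles.
        acc = score_model(m, probe_set)
        if acc > best_acc:
            best_start, best_acc = start, acc
        del m
        torch.cuda.empty_cache()
    return best_start, best_acc
```

Reloading a 24B checkpoint per candidate is slow; in practice you'd keep the base weights resident and splice copies in and out, but the loop above shows the shape of the search.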