SmolGPT - GPT From Scratch
Modern decoder-only transformer (RoPE, RMSNorm, SwiGLU, flash attention) built from scratch as an optimisation sandbox.
A foundation for research into small, locally hosted reasoning models: architectures and training recipes you can take apart and modify, instead of treating an API as a black box.
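
To give a feel for the building blocks named above, here is a minimal pure-Python sketch of RMSNorm, SwiGLU gating, and RoPE on plain lists. It is illustrative only and is not taken from the SmolGPT source: the function names (`rms_norm`, `swiglu`, `rope`), the `eps` and `base` defaults, and the interleaved pairing of dimensions in RoPE are all assumptions made for this example. Flash attention is left out because it is a kernel-level optimisation of standard attention rather than a different formula.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root-mean-square, then apply a per-dimension gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * inv_rms for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def swiglu(gate, up):
    """SwiGLU gating: SiLU(gate) * up, elementwise.

    In a full feed-forward layer, gate and up would be two separate
    linear projections of the same input (assumed, not shown here).
    """
    return [silu(g) * u for g, u in zip(gate, up)]

def rope(x, position, base=10000.0):
    """Rotary position embedding on one vector of even length.

    Each pair (x[i], x[i+1]) is rotated by position * base**(-i/d),
    so lower dimensions rotate faster than higher ones.
    """
    d = len(x)
    out = list(x)
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

if __name__ == "__main__":
    q = [1.0, 0.5, -0.3, 2.0]
    k = [0.2, -1.0, 0.7, 0.1]
    # The attention score depends only on the relative offset (here 2).
    print(dot(rope(q, 5), rope(k, 3)))
    print(dot(rope(q, 2), rope(k, 0)))
```

The two printed scores should agree up to floating-point error, which is the property that makes RoPE a relative position encoding. A real model would apply these operations to batched tensors rather than Python lists.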