Section 14

Transformer Architectures

Deep dives into model internals: building multi-head attention mechanisms from the ground up.

Projects in this section: 2

Multi-Head Attention from Scratch

Building the attention mechanism tensor by tensor.
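The core of this project, scaled dot-product attention split across heads, can be sketched in a few tensor operations. This is a minimal illustrative NumPy version, not the repository's actual code; all names (`multi_head_attention`, the weight matrices `Wq`/`Wk`/`Wv`/`Wo`) are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads):
    # x: (seq_len, d_model); each weight matrix: (d_model, d_model).
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    # Project, then split the feature axis into heads:
    # (seq_len, d_model) -> (n_heads, seq_len, d_head)
    def split(m):
        return m.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ Wq), split(x @ Wk), split(x @ Wv)

    # Per-head attention weights: (n_heads, seq_len, seq_len)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores)

    # Weighted sum of values, merge heads, apply output projection.
    out = (weights @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo

rng = np.random.default_rng(0)
d_model, n_heads, seq_len = 8, 2, 4
Ws = [rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4)]
y = multi_head_attention(rng.normal(size=(seq_len, d_model)), *Ws, n_heads)
print(y.shape)  # (4, 8)
```

The reshape/transpose pair is the "tensor by tensor" step: one big projection is viewed as `n_heads` independent attention computations, then concatenated back.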

Transformer LLM from scratch

Complete transformer-based language model built from scratch.
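A GPT-style decoder block, the repeated unit of such a model, can be sketched as causal self-attention plus an MLP, each wrapped in a pre-norm residual connection. This is an illustrative single-head sketch with identity projections, not the project's actual implementation; `decoder_block` and its parameter names are assumptions for the example.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each token's feature vector to zero mean, unit variance.
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def causal_self_attention(x):
    # Single-head causal attention (identity Q/K/V for brevity):
    # each position attends only to itself and earlier positions.
    T, d = x.shape
    scores = x @ x.T / np.sqrt(d)
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    scores = scores - scores.max(-1, keepdims=True)
    w = np.exp(scores)
    w = w / w.sum(-1, keepdims=True)
    return w @ x

def decoder_block(x, W1, b1, W2, b2):
    # Pre-norm residual layout, as in GPT-style models.
    x = x + causal_self_attention(layer_norm(x))
    h = np.maximum(0, layer_norm(x) @ W1 + b1)  # ReLU MLP, width 4*d
    return x + h @ W2 + b2

rng = np.random.default_rng(1)
d, T = 8, 5
W1, b1 = rng.normal(size=(d, 4 * d)) * 0.1, np.zeros(4 * d)
W2, b2 = rng.normal(size=(4 * d, d)) * 0.1, np.zeros(d)
y = decoder_block(rng.normal(size=(T, d)), W1, b1, W2, b2)
print(y.shape)  # (5, 8)
```

A full language model stacks blocks like this between a token/position embedding layer and a final projection to vocabulary logits.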