Model Size Calculator
Estimate transformer parameter counts and GPU memory requirements. Break the weights down across attention layers, feed-forward networks, and embeddings, and plan GPU infrastructure for training or inference.
Plan Your LLM Infrastructure
Running large language models requires understanding their memory footprint. Our Model Size Calculator helps you estimate parameter counts and GPU memory requirements for transformers, whether you're training a custom model or deploying for inference. Estimates are based on EleutherAI's Transformer Math and Kipply's parameter-counting formulas.
Understanding Model Size and Memory
Transformer models consist of attention layers, feed-forward networks, and embeddings. The classic formula P ≈ 12Ld² estimates parameters from layers (L) and hidden dimension (d). Memory requirements depend on precision (FP32/FP16/INT8) and whether you're training (requires optimizer states and gradients) or running inference (requires KV cache).
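As a minimal sketch of that formula in code (the 32-layer, 4096-dimensional, 32k-vocabulary configuration below is purely illustrative, roughly the shape of a 7B-class model):

```python
def estimate_params(num_layers: int, d_model: int, vocab_size: int) -> int:
    """Approximate decoder-only transformer parameters: P ≈ 12·L·d² + V·d.

    The 12·L·d² term covers attention (~4·d² per layer) plus the FFN
    (~8·d² per layer, assuming a 4·d intermediate size); V·d covers the
    token embedding matrix.
    """
    block_params = 12 * num_layers * d_model**2
    embedding_params = vocab_size * d_model
    return block_params + embedding_params


# Illustrative 7B-class configuration (hypothetical values)
params = estimate_params(num_layers=32, d_model=4096, vocab_size=32000)
print(f"~{params / 1e9:.1f}B parameters")           # ≈ 6.6B
print(f"FP16 weights: ~{params * 2 / 1e9:.1f} GB")  # 2 bytes per parameter
```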
Parameter Formula
P ≈ 12 × L × d_model² + V × d_model, where V is the vocabulary size.
Why Calculate Model Size?
GPU Planning
Determine if your model fits on a single GPU or requires multi-GPU setups with tensor/pipeline parallelism.
Cost Estimation
GPU memory requirements directly impact cloud compute costs. Right-size your infrastructure to avoid overspending.
Architecture Design
When designing custom models, understand the parameter/memory tradeoffs of different layer configurations.
Quantization Planning
See how INT8 or INT4 quantization reduces memory requirements, enabling larger models on consumer GPUs.
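As a rough illustration of how precision changes the weight footprint, here is a small sketch; the bytes-per-parameter figures and the 70B example are illustrative round numbers, and they cover weights only (KV cache, activations, and framework overhead come on top):

```python
# Approximate bytes per parameter at common precisions (weights only).
BYTES_PER_PARAM = {"FP32": 4.0, "FP16/BF16": 2.0, "INT8": 1.0, "INT4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Weight memory in GB for a given parameter count and precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# Example: a 70B-parameter model at each precision
for precision in BYTES_PER_PARAM:
    print(f"{precision:>9}: {weight_memory_gb(70e9, precision):.0f} GB")
# FP32 ~280 GB, FP16 ~140 GB, INT8 ~70 GB, INT4 ~35 GB of weight memory.
```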
How to Use This Calculator
Frequently Asked Questions
Why does training need so much more memory than inference?
Training requires: 1) model weights, 2) optimizer states (AdamW stores momentum and variance, 8 bytes/param), 3) gradients (4 bytes/param), and 4) activations for backpropagation. Rule of thumb: training needs ~16-20 bytes per parameter in mixed precision, while inference needs only 2 bytes per parameter in FP16.
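A minimal sketch of that rule of thumb, assuming 18 bytes/param for mixed-precision training (the midpoint of the ~16-20 range above) and excluding activation memory, which scales with batch size and sequence length:

```python
BYTES_PER_PARAM_TRAINING = 18   # midpoint of the ~16-20 B/param rule of thumb
BYTES_PER_PARAM_INFERENCE = 2   # FP16 weights only; KV cache is extra

def training_memory_gb(num_params: float) -> float:
    """Mixed-precision training: weights + optimizer states + gradients.
    Activation memory is excluded (it depends on batch size and sequence length)."""
    return num_params * BYTES_PER_PARAM_TRAINING / 1e9

def inference_memory_gb(num_params: float) -> float:
    """FP16 weight memory for inference; the KV cache comes on top of this."""
    return num_params * BYTES_PER_PARAM_INFERENCE / 1e9

# Example: a 7B-parameter model
print(f"7B model, training:  ~{training_memory_gb(7e9):.0f} GB")   # ~126 GB
print(f"7B model, inference: ~{inference_memory_gb(7e9):.0f} GB")  # ~14 GB
```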