# NVIDIA Tensor Cores are specialized GPU units for matrix operations - the foundation of neural networks

**Date:** 2025-12-18  
**Tags:** GPU, AI, Hardware  
**URL:** https://kelexine.is-a.dev/til/tensor-cores-gpu

---

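TIL: NVIDIA Tensor Cores are specialized GPU units for matrix operations - the foundation of neural networks. They perform mixed-precision matrix multiply-accumulate, D = A × B + C, on small matrix tiles as a single hardware operation: in the classic mode the inputs A and B are FP16 and the products are accumulated in FP32. Newer generations add further input formats such as BF16, TF32, and FP8. H100 Tensor Cores are rated at well over 500 TFLOPS of dense FP16 throughput (the exact datasheet figure depends on the SXM or PCIe variant), enabling efficient LLM training and inference.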
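
To see why the accumulator precision matters, here is a minimal pure-Python sketch that emulates the idea in software. It is not how the hardware is programmed: the helper names (`to_fp16`, `to_fp32`, `dot`) and the test vectors are made up for illustration, and rounding is emulated with the standard `struct` module.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def dot(a, b, round_acc):
    """Dot product of FP16-rounded inputs; round_acc sets the accumulator precision."""
    acc = 0.0
    for x, y in zip(a, b):
        # The product of two FP16 values is exactly representable in a double.
        product = to_fp16(x) * to_fp16(y)
        # Round the running sum after every add, as a narrow accumulator would.
        acc = round_acc(acc + product)
    return acc

n = 4096  # hypothetical vector length, chosen to make the effect visible
a = [0.1] * n
b = [0.1] * n

exact = n * to_fp16(0.1) ** 2        # reference computed in double precision
fp16_acc = dot(a, b, to_fp16)        # FP16 inputs, FP16 accumulate
fp32_acc = dot(a, b, to_fp32)        # FP16 inputs, FP32 accumulate (Tensor Core style)

print(f"exact    = {exact:.4f}")
print(f"fp16 acc = {fp16_acc:.4f}")
print(f"fp32 acc = {fp32_acc:.4f}")
```

With an FP16 accumulator, each small product eventually falls below half the spacing between representable values and the running sum stops growing. The FP32 accumulator stays close to the reference, which is the motivation for the FP16-in, FP32-accumulate design.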

```python
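# Enable Tensor Core acceleration
import torch
import torch.nn as nn

# Placeholder model and batch so the snippet runs on its own (needs a CUDA GPU)
model = nn.Linear(1024, 1024).cuda()
criterion = nn.MSELoss()
inputs = torch.randn(64, 1024, device="cuda")
target = torch.randn(64, 1024, device="cuda")

# TensorFloat-32 for FP32 matmuls on Ampere and newer GPUs (A100/H100)
torch.backends.cuda.matmul.allow_tf32 = True

# Automatic mixed precision runs eligible ops in FP16 on Tensor Cores
with torch.autocast(device_type="cuda", dtype=torch.float16):
    # Speedup over FP32 depends on the model, shapes, and GPU
    output = model(inputs)
    loss = criterion(output, target)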
```
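
The `allow_tf32` flag trades precision for speed: TF32 keeps FP32's 8-bit exponent but only 10 explicit mantissa bits. The sketch below emulates that rounding in pure Python to give a feel for the precision involved. `to_tf32` is a hypothetical helper, not a PyTorch function, and it uses a simplified rounding rule that ignores NaN and infinity handling.

```python
import struct

def to_tf32(x: float) -> float:
    """Approximate TF32 rounding: keep 10 of FP32's 23 mantissa bits.

    Simplified illustration only: rounds half away from zero and does not
    treat NaN or infinity specially.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add half of the dropped range (2**12), then clear the low 13 bits.
    bits = (bits + 0x1000) & 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

for value in (1.0, 1.0 + 2**-10, 1.0 + 2**-12, 0.1):
    print(f"{value!r:>22} -> {to_tf32(value)!r}")
```

Values that need more than 10 mantissa bits, such as `1.0 + 2**-12`, collapse to their nearest TF32 neighbour. That is usually acceptable for deep learning matmuls but can matter for numerically sensitive code.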

---

*This content is available at [kelexine.is-a.dev/til/tensor-cores-gpu](https://kelexine.is-a.dev/til/tensor-cores-gpu)*
