Machine Learning — Hard
Key points
- BERT uses a bidirectional Transformer encoder pretrained with masked language modeling (MLM)
- GPT uses a unidirectional (left-to-right) Transformer decoder pretrained with causal language modeling
- BERT also pretrains with next sentence prediction; GPT simply predicts the next token
- As a result, BERT suits understanding tasks (classification, QA, tagging); GPT suits text generation
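The causal-vs-bidirectional distinction comes down to the attention mask, and the MLM objective comes down to how the input is corrupted. A minimal numpy sketch of both ideas (not either model's actual implementation; the function names are illustrative, and the 15% default masking rate matches the rate reported in the BERT paper):

```python
import numpy as np

def causal_mask(n):
    # GPT-style causal attention: position i may attend only to positions <= i
    return np.tril(np.ones((n, n), dtype=bool))

def bidirectional_mask(n):
    # BERT-style attention: every position attends to every other position
    return np.ones((n, n), dtype=bool)

def mlm_corrupt(tokens, rng, p=0.15):
    # BERT-style masked LM input: hide a random subset of tokens;
    # the model must reconstruct them using context from both sides
    return [tok if rng.random() >= p else "[MASK]" for tok in tokens]

if __name__ == "__main__":
    print(causal_mask(4).astype(int))          # lower-triangular matrix
    rng = np.random.default_rng(0)
    print(mlm_corrupt("the cat sat on the mat".split(), rng, p=0.3))
```

Because GPT's mask hides the future, its training loss at every position is next-token prediction; BERT's full mask lets every prediction use left and right context, which is why it cannot generate text left-to-right the way GPT does.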
