Natural Language Processing
Text Classification using Transformers
Build a Transformer from scratch
7 min read · Mar 12, 2021
1. Coding a Transformer network in PyTorch
In this part, we walk through the encoder-decoder architecture of the multi-head self-attention Transformer network using PyTorch code. There won’t be any theory…
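To give a feel for the central building block before going further, here is a minimal sketch of a multi-head self-attention layer in PyTorch. It is not necessarily the implementation used in the rest of this series: the class name `MultiHeadSelfAttention`, the fused `qkv` projection, and the parameter names `embed_dim` and `num_heads` are assumptions chosen for illustration, and masking and dropout are left out to keep it short.

```python
import math

import torch
import torch.nn as nn


class MultiHeadSelfAttention(nn.Module):
    """Illustrative multi-head self-attention layer (no masking, no dropout)."""

    def __init__(self, embed_dim: int, num_heads: int):
        super().__init__()
        if embed_dim % num_heads != 0:
            raise ValueError("embed_dim must be divisible by num_heads")
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        # Assumption: a single linear layer produces queries, keys and values together.
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)
        self.out = nn.Linear(embed_dim, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq_len, embed_dim = x.shape
        # (batch, seq, 3 * embed) -> (batch, seq, 3, heads, head_dim)
        qkv = self.qkv(x).view(batch, seq_len, 3, self.num_heads, self.head_dim)
        # Move the q/k/v axis to the front: each is (batch, heads, seq, head_dim).
        q, k, v = qkv.permute(2, 0, 3, 1, 4)
        # Scaled dot-product attention scores: (batch, heads, seq, seq).
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)
        weights = scores.softmax(dim=-1)
        # Weighted sum of values, then merge the heads back into embed_dim.
        context = (weights @ v).transpose(1, 2).reshape(batch, seq_len, embed_dim)
        return self.out(context)


# Usage: the layer keeps the (batch, sequence, embedding) shape of its input.
layer = MultiHeadSelfAttention(embed_dim=16, num_heads=4)
tokens = torch.randn(2, 5, 16)
attended = layer(tokens)
```

Each head attends over the full sequence but works on only `embed_dim / num_heads` features, which is why the embedding size has to be divisible by the number of heads. PyTorch also ships a built-in `nn.MultiheadAttention` module; the hand-written version here is only meant to make the tensor shapes visible.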