GELU, the ReLU Successor? Gaussian Error Linear Unit Explained

Konstantinos Poulinakis
Published in Towards AI · 4 min read · Aug 30, 2022



In this tutorial, we aim to comprehensively explain how the Gaussian Error Linear Unit (GELU) activation works.

Can we combine regularization and activation functions? In 2016, Dan Hendrycks and Kevin Gimpel published a paper exploring exactly this question; it has since been updated four times. In it, the authors introduced a new activation function, the Gaussian Error Linear Unit (GELU).
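Before diving deeper, it helps to see the function itself. The paper defines GELU(x) = x · Φ(x), where Φ is the standard Gaussian CDF, and also gives a cheaper tanh-based approximation. Below is a minimal NumPy sketch of both forms; the function names are ours for illustration, not from the paper.

```python
import numpy as np
from scipy.special import erf

def gelu(x):
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF,
    # written via the error function: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))

def gelu_tanh(x):
    # Tanh approximation from the paper:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

x = np.linspace(-4.0, 4.0, 9)
print(gelu(x))       # smooth curve: ~0 for large negative x, ~x for large positive x
print(gelu_tanh(x))  # closely tracks the exact form
```

Running this shows the key qualitative behavior: unlike ReLU's hard cutoff at zero, GELU bends smoothly through the origin and can take small negative values for slightly negative inputs.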
