GELU, the ReLU Successor? Gaussian Error Linear Unit Explained
4 min read · Aug 30, 2022
In this tutorial we aim to explain comprehensively how the Gaussian Error Linear Unit (GELU) activation works.
Can we combine regularization and activation functions? In 2016, Dan Hendrycks and Kevin Gimpel published a paper addressing exactly this question; it has since been updated four times. In it, the authors introduced a new activation function, the Gaussian Error Linear Unit (GELU).
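Before going further, it helps to see GELU concretely. The function is defined as GELU(x) = x · Φ(x), where Φ is the standard normal CDF; the paper also gives a faster tanh-based approximation. A minimal sketch in plain Python (function names are my own):

```python
import math

def gelu(x: float) -> float:
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF,
    # expressed via the error function: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # tanh approximation from the paper:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```

Note how GELU behaves like ReLU for large |x| (identity for large positive inputs, near zero for large negative ones) but is smooth around the origin, e.g. gelu(1.0) ≈ 0.841 rather than 1.0.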