Bayesian Layers: A Module for Neural Network Uncertainty
16 Dec 2018, Prathyush SP

As a demonstration, we fit a 10-billion parameter 'Bayesian Transformer' on 512 TPUv2 cores, which replaces attention layers with their Bayesian counterparts. 🔥
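To give a flavor of the drop-in idea, here is a minimal sketch using TensorFlow Probability's `tfp.layers.DenseFlipout`, an API analogous to the Bayesian Layers module described in the paper (the paper's own implementation lives in the edward2 codebase). The model, shapes, and hyperparameters below are illustrative assumptions, not the paper's setup:

```python
import tensorflow as tf
import tensorflow_probability as tfp

NUM_TRAIN_EXAMPLES = 60000  # illustrative; used to scale the KL term

# A small classifier where the final layer is Bayesian:
# tfp.layers.DenseFlipout keeps a variational distribution over its
# weights and adds the KL(q || p) penalty to `model.losses`.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),  # deterministic layer
    tfp.layers.DenseFlipout(10),                    # Bayesian drop-in
])

def elbo_loss(labels, logits):
    # Negative ELBO = expected negative log-likelihood + scaled KL.
    nll = tf.keras.losses.sparse_categorical_crossentropy(
        labels, logits, from_logits=True)
    kl = sum(model.losses) / NUM_TRAIN_EXAMPLES
    return nll + kl

model.compile(optimizer="adam", loss=elbo_loss, metrics=["accuracy"])
```

Because the Bayesian layer samples its weights on each call, repeated forward passes on the same input yield different logits; averaging several passes gives a predictive distribution, which is the uncertainty signal these layers exist to provide.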
For more details, visit the source.