Differentiable parallel approximate sorting networks

  

Simple (approximate) differentiable sorting and argsorting of vectors, so you can backpropagate through a sort() or argsort(), using softmax-based sorting networks.
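The core trick can be sketched as a softmax-weighted compare-and-swap: each hard min/max in a sorting network is replaced by a smooth, temperature-controlled blend, so every step stays differentiable. A minimal NumPy sketch of the idea (not the repo's actual API; `soft_cswap`, `soft_sort`, and `temp` are illustrative names):

```python
import numpy as np

def soft_cswap(a, b, temp=0.1):
    # Differentiable compare-and-swap: softmax weights favour the
    # smaller value, so lo ≈ min(a, b) and hi ≈ max(a, b)
    v = np.array([a, b])
    w = np.exp(-(v - v.min()) / temp)
    w = w / w.sum()
    lo = w[0] * a + w[1] * b
    hi = w[1] * a + w[0] * b  # lo + hi == a + b, so mass is conserved
    return lo, hi

def soft_sort(x, temp=0.1):
    # Bubble-sort network built from soft compare-and-swaps;
    # gradients flow through every comparison
    x = list(x)
    for _ in range(len(x)):
        for j in range(len(x) - 1):
            x[j], x[j + 1] = soft_cswap(x[j], x[j + 1], temp)
    return x
```

Lower temperatures approach a hard sort; higher temperatures give smoother (but less accurate) gradients.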

Write With Transformer

  

See how a modern neural network auto-completes your text 🤗

Visualizing and Measuring the Geometry of BERT

  

Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT.

https://arxiv.org/abs/1906.02715

Speech2Face: Learning the Face Behind a Voice

  

How much can we infer about a person’s looks from the way they speak? In this paper, we study the task of reconstructing a facial image of a person from a short audio recording of that person speaking. We design and train a deep neural network to perform this task using millions of natural videos of people speaking from the Internet (YouTube).

Few-Shot Adversarial Learning of Realistic Neural Talking Head Models

  

Several recent works have shown how highly realistic human head images can be obtained by training convolutional neural networks to generate them. In order to create a personalized talking head model, these works require training on a large dataset of images of a single person. However, in many practical scenarios, such personalized talking head models need to be learned from a few image views of a person, potentially even a single image.

MS Interpret - Fit interpretable models. Explain black-box ML

  

InterpretML is an open-source Python package for training interpretable models and explaining black-box systems. Interpretability is essential for:

Model debugging - Why did my model make this mistake?
Detecting bias - Does my model discriminate?
Human-AI cooperation - How can I understand and trust the model’s decisions?
Regulatory compliance - Does my model satisfy legal requirements?
High-risk applications - Healthcare, finance, judicial, …
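As a self-contained illustration of the black-box side (InterpretML itself also ships glassbox models such as Explainable Boosting Machines), here is a minimal permutation-importance sketch in plain NumPy. The `model` callable and the MSE metric are hypothetical stand-ins, not InterpretML's API:

```python
import numpy as np

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    # Model-agnostic explanation: shuffle one feature column at a time
    # and measure how much the error (MSE) increases versus baseline
    rng = np.random.default_rng(seed)
    baseline = np.mean((model(X) - y) ** 2)
    importances = []
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # break the feature/target relationship
            drops.append(np.mean((model(Xp) - y) ** 2) - baseline)
        importances.append(np.mean(drops))
    return np.array(importances)
```

Features whose shuffling barely changes the error contribute little to the black box's predictions.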

Distilling the Knowledge in a Neural Network

  

A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets.
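Distillation compresses the ensemble by softening both the teacher's and the student's outputs with a temperature T and training the student to match the soft targets. A minimal NumPy sketch of the softened cross-entropy (function names are illustrative; the T² factor keeps gradient magnitudes comparable when mixing with a hard-label loss, as the paper suggests):

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T gives softer distributions
    z = logits / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # Cross-entropy between the teacher's softened targets
    # and the student's softened predictions, scaled by T^2
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -T * T * np.sum(p_teacher * np.log(p_student + 1e-12))
```

The soft targets carry the teacher's "dark knowledge" about relative class similarities that hard one-hot labels discard.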

Think Globally, Act Locally

  

A Deep Neural Network Approach to High-Dimensional Time Series Forecasting

“DeepGLO outperforms state-of-the-art approaches on various datasets; for example, we see more than 30% improvement in WAPE over other methods”
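WAPE (Weighted Absolute Percentage Error), the metric quoted above, is simply the total absolute error normalized by the total absolute actuals. A minimal sketch:

```python
import numpy as np

def wape(y_true, y_pred):
    # Weighted Absolute Percentage Error: sum of absolute errors
    # divided by sum of absolute actuals (lower is better)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.abs(y_true - y_pred).sum() / np.abs(y_true).sum()
```

Unlike MAPE, WAPE weights each point by its magnitude, so near-zero actuals don't blow up the score.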

MixMatch - A Holistic Approach to Semi-Supervised Learning

  

Semi-supervised learning has proven to be a powerful paradigm for leveraging unlabeled data to mitigate the reliance on large labeled datasets. In this work, we unify the current dominant approaches for semi-supervised learning to produce a new algorithm, MixMatch, that works by guessing low-entropy labels for data-augmented unlabeled examples and mixing labeled and unlabeled data using MixUp.
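Two of MixMatch's ingredients are easy to sketch: sharpening a guessed label distribution toward low entropy, and MixUp's convex combination of examples (MixMatch additionally takes max(λ, 1−λ) so the mix stays closer to its first argument). An illustrative NumPy sketch, not the paper's reference code:

```python
import numpy as np

def sharpen(p, T=0.5):
    # Lower the entropy of a guessed label distribution:
    # raise to the power 1/T and renormalize
    p = p ** (1.0 / T)
    return p / p.sum()

def mixup(x1, y1, x2, y2, alpha=0.75):
    # Convex combination of two (example, label) pairs;
    # lam = max(lam, 1 - lam) keeps the result nearer (x1, y1)
    lam = np.random.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)
    return lam * x1 + (1.0 - lam) * x2, lam * y1 + (1.0 - lam) * y2
```

Sharpened guesses become training targets for the unlabeled examples, and MixUp blends labeled and unlabeled batches together.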