Differentiable parallel approximate sorting networks
17 Jun 2019, Prathyush SP
Simple (approximate) differentiable sorting and argsorting of vectors, so you can backpropagate through a sort() or argmax(), using softmax sorting networks.
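The core trick behind a softmax sorting network can be sketched in plain Python: replace each compare-and-swap in a bubble-style sorting network with a softmax-weighted blend, so every output is a smooth function of every input. This is a minimal illustration, not the linked library's actual API; the names `soft_minmax` and `soft_sort` and the temperature parameter `t` are assumptions made here.

```python
import math

def soft_minmax(a, b, t=0.1):
    # Softmax-weighted blend of two values: a differentiable stand-in for
    # (min, max). Subtracting the larger value keeps exp() from overflowing.
    m = max(a, b)
    wa = math.exp((a - m) / t)
    wb = math.exp((b - m) / t)
    smax = (a * wa + b * wb) / (wa + wb)
    smin = a + b - smax  # the pair's sum is preserved exactly
    return smin, smax

def soft_sort(xs, t=0.1):
    # Approximate ascending sort: a bubble-style sorting network in which
    # every compare-and-swap is the smooth soft_minmax above, so the whole
    # pipeline is differentiable with respect to the inputs.
    xs = list(xs)
    n = len(xs)
    for _ in range(n):
        for i in range(n - 1):
            xs[i], xs[i + 1] = soft_minmax(xs[i], xs[i + 1], t)
    return xs
```

As the temperature `t` goes to zero, `soft_minmax` approaches the exact min/max pair and `soft_sort` approaches a true sort; larger `t` trades accuracy for smoother gradients.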
See how a modern neural network auto-completes your text 🤗
Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT.
https://arxiv.org/abs/1906.02715
How much can we infer about a person’s looks from the way they speak? In this paper, we study the task of reconstructing a facial image of a person from a short audio recording of that person speaking. We design and train a deep neural network to perform this task using millions of natural videos of people speaking, collected from the Internet and YouTube.
Several recent works have shown how highly realistic human head images can be obtained by training convolutional neural networks to generate them. In order to create a personalized talking head model, these works require training on a large dataset of images of a single person. However, in many practical scenarios, such personalized talking head models need to be learned from a few image views of a person, potentially even a single image.
InterpretML is an open-source Python package for training interpretable models and explaining blackbox systems. Interpretability is essential for:
Model debugging - Why did my model make this mistake?
Detecting bias - Does my model discriminate?
Human-AI cooperation - How can I understand and trust the model’s decisions?
Regulatory compliance - Does my model satisfy legal requirements?
High-risk applications - Healthcare, finance, judicial, …
A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets.
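The passage above describes the motivation behind knowledge distillation: compress an ensemble (or one large model) into a single small model by training the student to match the teacher's softened output distribution. A minimal sketch of that mechanism, with illustrative function names (`softened_softmax`, `distillation_loss` are assumptions made here, not an API from the paper):

```python
import math

def softened_softmax(logits, T=2.0):
    # Softmax with temperature T. T > 1 flattens the distribution, exposing
    # the teacher's relative confidence in the wrong classes, which is the
    # extra signal the student learns from.
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Cross-entropy between the teacher's softened distribution (target)
    # and the student's softened distribution. Minimized when the student
    # reproduces the teacher's distribution exactly.
    p = softened_softmax(teacher_logits, T)
    q = softened_softmax(student_logits, T)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

In practice this soft-target loss is combined with the ordinary hard-label cross-entropy, and gradients through the softened term are rescaled to account for the temperature.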
A Deep Neural Network Approach to High-Dimensional Time Series Forecasting
“DeepGLO outperforms state-of-the-art approaches on various datasets; for example, we see more than 30% improvement in WAPE (weighted absolute percentage error) over other methods”
Semi-supervised learning has proven to be a powerful paradigm for leveraging unlabeled data to mitigate the reliance on large labeled datasets. In this work, we unify the current dominant approaches for semi-supervised learning to produce a new algorithm, MixMatch, that works by guessing low-entropy labels for data-augmented unlabeled examples and mixing labeled and unlabeled data using MixUp.
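The two named ingredients, guessing low-entropy labels (sharpening) and MixUp, can each be sketched in a few lines. This is an illustration of the general mechanisms under assumed defaults (`T=0.5` for sharpening, `alpha=0.75` for MixUp are representative values, and the function names are assumptions made here):

```python
import random

def sharpen(p, T=0.5):
    # Lower the entropy of a guessed label distribution: raise each
    # probability to the power 1/T and renormalize. T < 1 pushes mass
    # toward the most likely class.
    powered = [pi ** (1.0 / T) for pi in p]
    s = sum(powered)
    return [pi / s for pi in powered]

def mixup(x1, y1, x2, y2, alpha=0.75):
    # MixUp: convexly combine two examples and their labels. MixMatch
    # additionally takes lam = max(lam, 1 - lam) so the mixture stays
    # closer to the first example.
    lam = random.betavariate(alpha, alpha)
    lam = max(lam, 1.0 - lam)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y
```

In MixMatch, sharpening is applied to the averaged predictions over several augmentations of each unlabeled example, and the resulting (example, guessed-label) pairs are mixed with labeled data via `mixup` before training.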