Differentiable parallel approximate sorting networks
17 Jun 2019, Prathyush SP
Simple (approximate) differentiable sorting and argsorting of vectors, so you can backpropagate through a sort() or argmax(), using softmax sorting networks.
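The core trick behind a softmax sorting network can be sketched in plain Python: replace each compare-and-swap in a bubble-style sorting network with a softmax-weighted blend, so every output is a smooth function of every input. This is a minimal illustration, not the linked library's actual API; the names `soft_minmax` and `soft_sort` and the temperature parameter `t` are assumptions made here.

```python
import math

def soft_minmax(a, b, t=0.1):
    # Softmax-weighted blend of two values: a differentiable stand-in for
    # (min, max). Subtracting the larger value keeps exp() from overflowing.
    m = max(a, b)
    wa = math.exp((a - m) / t)
    wb = math.exp((b - m) / t)
    smax = (a * wa + b * wb) / (wa + wb)
    smin = a + b - smax  # the pair's sum is preserved exactly
    return smin, smax

def soft_sort(xs, t=0.1):
    # Approximate ascending sort: a bubble-style sorting network in which
    # every compare-and-swap is the smooth soft_minmax above, so the whole
    # pipeline is differentiable with respect to the inputs.
    xs = list(xs)
    n = len(xs)
    for _ in range(n):
        for i in range(n - 1):
            xs[i], xs[i + 1] = soft_minmax(xs[i], xs[i + 1], t)
    return xs
```

As the temperature `t` goes to zero, `soft_minmax` approaches the exact min/max pair and `soft_sort` approaches a true sort; larger `t` trades accuracy for smoother gradients.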
See how a modern neural network auto-completes your text 🤗
Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT.
https://arxiv.org/abs/1906.02715
How much can we infer about a person’s looks from the way they speak? In this paper, we study the task of reconstructing a facial image of a person from a short audio recording of that person speaking. We design and train a deep neural network to perform this task using millions of natural videos of people speaking, collected from the Internet and YouTube.
Several recent works have shown how highly realistic human head images can be obtained by training convolutional neural networks to generate them. In order to create a personalized talking head model, these works require training on a large dataset of images of a single person. However, in many practical scenarios, such personalized talking head models need to be learned from a few image views of a person, potentially even a single image.
InterpretML is an open-source Python package for training interpretable models and explaining blackbox systems. Interpretability is essential for:
Model debugging - Why did my model make this mistake?
Detecting bias - Does my model discriminate?
Human-AI cooperation - How can I understand and trust the model’s decisions?
Regulatory compliance - Does my model satisfy legal requirements?
High-risk applications - Healthcare, finance, judicial, …
A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets.
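The passage above describes the motivation behind knowledge distillation: compress an ensemble (or one large model) into a single small model by training the student to match the teacher's softened output distribution. A minimal sketch of that mechanism, with illustrative function names (`softened_softmax`, `distillation_loss` are assumptions made here, not an API from the paper):

```python
import math

def softened_softmax(logits, T=2.0):
    # Softmax with temperature T. T > 1 flattens the distribution, exposing
    # the teacher's relative confidence in the wrong classes, which is the
    # extra signal the student learns from.
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Cross-entropy between the teacher's softened distribution (target)
    # and the student's softened distribution. Minimized when the student
    # reproduces the teacher's distribution exactly.
    p = softened_softmax(teacher_logits, T)
    q = softened_softmax(student_logits, T)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

In practice this soft-target loss is combined with the ordinary hard-label cross-entropy, and gradients through the softened term are rescaled to account for the temperature.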
A Deep Neural Network Approach to High-Dimensional Time Series Forecasting
“DeepGLO outperforms state-of-the-art approaches on various datasets; for example, we see more than 30% improvement in WAPE (weighted absolute percentage error) over other methods”
Semi-supervised learning has proven to be a powerful paradigm for leveraging unlabeled data to mitigate the reliance on large labeled datasets. In this work, we unify the current dominant approaches for semi-supervised learning to produce a new algorithm, MixMatch, that works by guessing low-entropy labels for data-augmented unlabeled examples and mixing labeled and unlabeled data using MixUp.
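The two named ingredients, guessing low-entropy labels (sharpening) and MixUp, can each be sketched in a few lines. This is an illustration of the general mechanisms under assumed defaults (`T=0.5` for sharpening, `alpha=0.75` for MixUp are representative values, and the function names are assumptions made here):

```python
import random

def sharpen(p, T=0.5):
    # Lower the entropy of a guessed label distribution: raise each
    # probability to the power 1/T and renormalize. T < 1 pushes mass
    # toward the most likely class.
    powered = [pi ** (1.0 / T) for pi in p]
    s = sum(powered)
    return [pi / s for pi in powered]

def mixup(x1, y1, x2, y2, alpha=0.75):
    # MixUp: convexly combine two examples and their labels. MixMatch
    # additionally takes lam = max(lam, 1 - lam) so the mixture stays
    # closer to the first example.
    lam = random.betavariate(alpha, alpha)
    lam = max(lam, 1.0 - lam)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y
```

In MixMatch, sharpening is applied to the averaged predictions over several augmentations of each unlabeled example, and the resulting (example, guessed-label) pairs are mixed with labeled data via `mixup` before training.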