Faster Neural Network Training with Data Echoing

  

In the twilight of Moore’s law, GPUs and other specialized hardware accelerators have dramatically sped up neural network training. However, earlier stages of the training pipeline, such as disk I/O and data preprocessing, do not run on accelerators. As accelerators continue to improve, these earlier stages will increasingly become the bottleneck.
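
One common way to prototype the data-echoing idea is inside a tf.data input pipeline: echo (repeat) each example produced by the slow upstream stages so the accelerator can take several training steps per example read from disk. The sketch below is illustrative rather than the post's implementation; `build_pipeline`, `parse_and_augment`, and `echo_factor` are assumed names, and the echo factor trades data freshness for accelerator utilization.

```python
import tensorflow as tf

def build_pipeline(file_pattern, parse_and_augment, batch_size, echo_factor=2):
    """Sketch of example-level data echoing in a tf.data pipeline."""
    ds = tf.data.TFRecordDataset(tf.io.gfile.glob(file_pattern))         # slow disk I/O stage
    ds = ds.map(parse_and_augment, num_parallel_calls=tf.data.AUTOTUNE)  # slow CPU preprocessing stage
    # Echo: emit each preprocessed example `echo_factor` times so the downstream,
    # accelerator-bound steps run more than once per upstream example.
    ds = ds.flat_map(lambda x: tf.data.Dataset.from_tensors(x).repeat(echo_factor))
    # Shuffling after the echo keeps repeated copies from landing in the same batch.
    return ds.shuffle(10_000).batch(batch_size).prefetch(tf.data.AUTOTUNE)
```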

Deep TabNine

  

Amazing!! Deep Learning-based NLP techniques are going to revolutionize the way we write software. Here’s Deep TabNine, a GPT-2 model trained on around 2 million files from GitHub. #nlproc

🥁🥁🥁 Welcome to 'pytorch-transformers', the 👾 library for Natural Language Processing!

  

PyTorch-Transformers is a library of state-of-the-art pre-trained models for Natural Language Processing (NLP).

The library currently contains PyTorch implementations, pre-trained model weights, usage scripts and conversion utilities for the following models:

  • BERT (from Google) released with the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  • GPT (from OpenAI) released with the paper Improving Language Understanding by Generative Pre-Training
  • GPT-2 (from OpenAI) released with the paper Language Models are Unsupervised Multitask Learners
  • Transformer-XL (from Google/CMU) released with the paper Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
  • XLNet (from Google/CMU) released with the paper XLNet: Generalized Autoregressive Pretraining for Language Understanding
  • XLM (from Facebook) released together with the paper Cross-lingual Language Model Pretraining
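
A minimal usage sketch, assuming the package is installed; the bert-base-uncased checkpoint and the example sentence are chosen purely for illustration. It loads a pre-trained model and tokenizer, then extracts contextual hidden states for a sentence.

```python
import torch
from pytorch_transformers import BertModel, BertTokenizer

# Load a pre-trained tokenizer and model (weights are downloaded on first use).
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

# Encode a sentence and extract its contextual hidden states.
input_ids = torch.tensor([tokenizer.encode("Here is some text to encode")])
with torch.no_grad():
    last_hidden_states = model(input_ids)[0]  # shape: (1, sequence_length, 768)
print(last_hidden_states.shape)
```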

The Staggering Cost of Training SOTA AI Models

  

Synced recently reported on XLNet, a new language model developed by CMU and Google Research that outperforms the previous SOTA model BERT (Bidirectional Encoder Representations from Transformers) on 20 language tasks, including SQuAD, GLUE, and RACE, and achieves SOTA results on 18 of them.

DeepSDF

  

The computer graphics, 3D computer vision, and robotics communities have produced multiple approaches to representing 3D geometry for rendering and reconstruction, each trading off fidelity, efficiency, and compression capability. In this work, we introduce DeepSDF, a learned continuous Signed Distance Function (SDF) representation of a class of shapes that enables high-quality shape representation, interpolation, and completion from partial and noisy 3D input data.
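
A rough sketch of the core idea, with layer sizes and names chosen for illustration rather than matching the paper's exact architecture: a decoder MLP maps a per-shape latent code z and a query point x in R^3 to an approximate signed distance, so the shape's surface is the zero level set of f(z, x), and training uses a clamped L1 loss against sampled ground-truth distances.

```python
import torch
import torch.nn as nn

class DeepSDFDecoder(nn.Module):
    """Illustrative DeepSDF-style decoder: (latent code, 3D point) -> signed distance."""
    def __init__(self, latent_dim=256, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Tanh(),  # signed distance squashed to [-1, 1]
        )

    def forward(self, latent, xyz):
        # latent: (N, latent_dim) shape code, xyz: (N, 3) query points
        return self.net(torch.cat([latent, xyz], dim=-1)).squeeze(-1)

def clamped_l1_loss(pred, target, delta=0.1):
    # Clamping the distances concentrates the supervision near the surface.
    return torch.mean(torch.abs(torch.clamp(pred, -delta, delta)
                                - torch.clamp(target, -delta, delta)))
```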

Gen

  

A general-purpose probabilistic programming system with programmable inference.

Literature on graph representation learning

  

This is a paper list about deep learning for graphs.

XLNet: Generalized Autoregressive Pretraining for Language Understanding

  

With the capability of modeling bidirectional contexts, denoising-autoencoding-based pretraining approaches like BERT achieve better performance than pretraining approaches based on autoregressive language modeling. However, because it relies on corrupting the input with masks, BERT neglects dependencies between the masked positions and suffers from a pretrain-finetune discrepancy.
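
To make the contrast concrete, here is a sketch of the two objectives in notation along the lines of the XLNet paper (M is the set of masked positions, x̂ the corrupted input, Z_T the set of factorization orders): BERT predicts the masked tokens independently of each other given the corrupted input, while the permutation-LM objective conditions each predicted token on the others under a sampled ordering.

```latex
% BERT-style denoising objective: masked positions are predicted
% independently of one another, given the corrupted input \hat{x}
\max_\theta \; \sum_{t \in \mathcal{M}} \log p_\theta\!\left(x_t \mid \hat{x}\right)

% XLNet-style permutation language modeling objective: tokens are predicted
% autoregressively under a factorization order z sampled from \mathcal{Z}_T
\max_\theta \; \mathbb{E}_{z \sim \mathcal{Z}_T}\!\left[\sum_{t=1}^{T} \log p_\theta\!\left(x_{z_t} \mid x_{z_{<t}}\right)\right]
```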

A Growing Neural Gas Network Learns Topologies

  

The paper proposes growing neural gas (GNG), an incremental network model that learns topological relations using a Hebbian-like learning rule.
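
For readers new to the algorithm, below is a compact sketch of a standard growing neural gas training loop; the hyperparameter names and defaults are illustrative, and the usual removal of isolated units is omitted for brevity.

```python
import numpy as np

def growing_neural_gas(data, max_nodes=50, steps=5000, eps_b=0.05, eps_n=0.006,
                       age_max=50, lam=100, alpha=0.5, d=0.995,
                       rng=np.random.default_rng(0)):
    """Illustrative GNG sketch: learns a graph of units that follows the data topology."""
    dim = data.shape[1]
    nodes = [rng.standard_normal(dim), rng.standard_normal(dim)]  # reference vectors
    error = [0.0, 0.0]
    edges = {}  # {(i, j): age}

    def key(i, j):
        return (min(i, j), max(i, j))

    for step in range(1, steps + 1):
        x = data[rng.integers(len(data))]
        # Find the nearest (s1) and second-nearest (s2) units to the input.
        dists = [np.sum((x - w) ** 2) for w in nodes]
        s1, s2 = np.argsort(dists)[:2]
        error[s1] += dists[s1]
        # Hebbian-like step: move the winner and its topological neighbors
        # toward the input, age the winner's edges, and refresh edge (s1, s2).
        nodes[s1] += eps_b * (x - nodes[s1])
        for (i, j) in list(edges):
            if s1 in (i, j):
                n = j if i == s1 else i
                nodes[n] += eps_n * (x - nodes[n])
                edges[(i, j)] += 1
        edges[key(s1, s2)] = 0
        edges = {e: a for e, a in edges.items() if a <= age_max}  # drop old edges
        # Periodically insert a new unit between the highest-error node
        # and its highest-error neighbor.
        if step % lam == 0 and len(nodes) < max_nodes:
            q = int(np.argmax(error))
            neighbors = [j if i == q else i for (i, j) in edges if q in (i, j)]
            if neighbors:
                f = max(neighbors, key=lambda n: error[n])
                nodes.append(0.5 * (nodes[q] + nodes[f]))
                error[q] *= alpha
                error[f] *= alpha
                error.append(error[q])
                r = len(nodes) - 1
                edges.pop(key(q, f), None)
                edges[key(q, r)] = 0
                edges[key(f, r)] = 0
        error = [e * d for e in error]  # global error decay
    return np.array(nodes), list(edges)
```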


POET

  

The Paired Open-Ended Trailblazer (POET) endlessly generates increasingly complex and diverse learning environments together with their solutions.
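
As a high-level sketch of the loop the title describes: POET maintains a population of paired (environment, agent) entries, periodically mutates environments into harder variants that pass a minimal criterion, keeps optimizing each agent in its paired environment, and occasionally transfers agents between environments. Everything below is an assumed skeleton, not the paper's algorithm: `mutate_env`, `optimize`, `passes_minimal_criterion`, and `evaluate` are placeholder callables the user would supply, and details such as transfer with fine-tuning and novelty-based environment selection are omitted.

```python
import random

def poet(init_env, init_agent, mutate_env, optimize,
         passes_minimal_criterion, evaluate,
         iterations=1000, mutate_every=50, transfer_every=25,
         max_pairs=20, rng=random.Random(0)):
    """Simplified POET-style loop over co-evolving (environment, agent) pairs."""
    pairs = [(init_env, init_agent)]
    for t in range(1, iterations + 1):
        # 1) Occasionally spawn a harder child environment from an existing pair,
        #    keeping it only if it is neither trivial nor unsolvable for its parent agent.
        if t % mutate_every == 0 and len(pairs) < max_pairs:
            env, agent = rng.choice(pairs)
            child_env = mutate_env(env)
            if passes_minimal_criterion(child_env, agent):
                pairs.append((child_env, agent))
        # 2) Keep optimizing each agent inside its own paired environment.
        pairs = [(env, optimize(env, agent)) for env, agent in pairs]
        # 3) Occasionally transfer: let the best-performing agent from any pair
        #    take over an environment if it outperforms the incumbent there.
        if t % transfer_every == 0:
            for i, (env, _) in enumerate(pairs):
                best = max((a for _, a in pairs), key=lambda a: evaluate(env, a))
                pairs[i] = (env, best)
    return pairs
```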