Faster Neural Network Training with Data Echoing

  

In the twilight of Moore’s law, GPUs and other specialized hardware accelerators have dramatically sped up neural network training. However, earlier stages of the training pipeline, such as disk I/O and data preprocessing, do not run on accelerators. As accelerators continue to improve, these earlier stages will increasingly become the bottleneck.
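
One common way to prototype the data-echoing idea is inside a tf.data input pipeline: echo (repeat) each example produced by the slow upstream stages so the accelerator can take several training steps per example read from disk. The sketch below is illustrative rather than the post's implementation; `build_pipeline`, `parse_and_augment`, and `echo_factor` are assumed names, and the echo factor trades data freshness for accelerator utilization.

```python
import tensorflow as tf

def build_pipeline(file_pattern, parse_and_augment, batch_size, echo_factor=2):
    """Sketch of example-level data echoing in a tf.data pipeline."""
    ds = tf.data.TFRecordDataset(tf.io.gfile.glob(file_pattern))         # slow disk I/O stage
    ds = ds.map(parse_and_augment, num_parallel_calls=tf.data.AUTOTUNE)  # slow CPU preprocessing stage
    # Echo: emit each preprocessed example `echo_factor` times so the downstream,
    # accelerator-bound steps run more than once per upstream example.
    ds = ds.flat_map(lambda x: tf.data.Dataset.from_tensors(x).repeat(echo_factor))
    # Shuffling after the echo keeps repeated copies from landing in the same batch.
    return ds.shuffle(10_000).batch(batch_size).prefetch(tf.data.AUTOTUNE)
```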

Deep TabNine

  

Amazing!! Deep Learning-based NLP techniques are going to revolutionize the way we write software. Here’s Deep TabNine, a GPT-2 model trained on around 2 million files from GitHub. #nlproc

🥁🥁🥁 Welcome to 'pytorch-transformers', the 👾 library for Natural Language Processing!

  

PyTorch-Transformers is a library of state-of-the-art pre-trained models for Natural Language Processing (NLP).

The library currently contains PyTorch implementations, pre-trained model weights, usage scripts and conversion utilities for the following models:

  • BERT (from Google) released with the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  • GPT (from OpenAI) released with the paper Improving Language Understanding by Generative Pre-Training
  • GPT-2 (from OpenAI) released with the paper Language Models are Unsupervised Multitask Learners
  • Transformer-XL (from Google/CMU) released with the paper Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
  • XLNet (from Google/CMU) released with the paper XLNet: Generalized Autoregressive Pretraining for Language Understanding
  • XLM (from Facebook) released together with the paper Cross-lingual Language Model Pretraining
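
A minimal usage sketch, assuming the package is installed; the bert-base-uncased checkpoint and the example sentence are chosen purely for illustration. It loads a pre-trained model and tokenizer, then extracts contextual hidden states for a sentence.

```python
import torch
from pytorch_transformers import BertModel, BertTokenizer

# Load a pre-trained tokenizer and model (weights are downloaded on first use).
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

# Encode a sentence and extract its contextual hidden states.
input_ids = torch.tensor([tokenizer.encode("Here is some text to encode")])
with torch.no_grad():
    last_hidden_states = model(input_ids)[0]  # shape: (1, sequence_length, 768)
print(last_hidden_states.shape)
```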

The Staggering Cost of Training SOTA AI Models

  

Synced recently reported on XLNet, a new language model developed by CMU and Google Research that outperforms the previous SOTA model BERT (Bidirectional Encoder Representations from Transformers) on 20 language tasks, including SQuAD, GLUE, and RACE, and achieves SOTA results on 18 of them.

DeepSDF

  

The computer graphics, 3D computer vision, and robotics communities have produced multiple approaches to representing 3D geometry for rendering and reconstruction, each trading off fidelity, efficiency, and compression capability. In this work, we introduce DeepSDF, a learned continuous Signed Distance Function (SDF) representation of a class of shapes that enables high-quality shape representation, interpolation, and completion from partial and noisy 3D input data.
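
A rough sketch of the core idea, with layer sizes and names chosen for illustration rather than matching the paper's exact architecture: a decoder MLP maps a per-shape latent code z and a query point x in R^3 to an approximate signed distance, so the shape's surface is the zero level set of f(z, x), and training uses a clamped L1 loss against sampled ground-truth distances.

```python
import torch
import torch.nn as nn

class DeepSDFDecoder(nn.Module):
    """Illustrative DeepSDF-style decoder: (latent code, 3D point) -> signed distance."""
    def __init__(self, latent_dim=256, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Tanh(),  # signed distance squashed to [-1, 1]
        )

    def forward(self, latent, xyz):
        # latent: (N, latent_dim) shape code, xyz: (N, 3) query points
        return self.net(torch.cat([latent, xyz], dim=-1)).squeeze(-1)

def clamped_l1_loss(pred, target, delta=0.1):
    # Clamping the distances concentrates the supervision near the surface.
    return torch.mean(torch.abs(torch.clamp(pred, -delta, delta)
                                - torch.clamp(target, -delta, delta)))
```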

Gen

  

A general-purpose probabilistic programming system with programmable inference.

Literature on graph representation learning

  

This is a paper list about deep learning for graphs.

XLNet: Generalized Autoregressive Pretraining for Language Understanding

  

With the capability of modeling bidirectional contexts, denoising-autoencoding-based pretraining approaches like BERT achieve better performance than pretraining approaches based on autoregressive language modeling. However, because it relies on corrupting the input with masks, BERT neglects dependencies between the masked positions and suffers from a pretrain-finetune discrepancy.
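
To make the contrast concrete, here is a sketch of the two objectives in notation along the lines of the XLNet paper (M is the set of masked positions, x̂ the corrupted input, Z_T the set of factorization orders): BERT predicts the masked tokens independently of each other given the corrupted input, while the permutation-LM objective conditions each predicted token on the others under a sampled ordering.

```latex
% BERT-style denoising objective: masked positions are predicted
% independently of one another, given the corrupted input \hat{x}
\max_\theta \; \sum_{t \in \mathcal{M}} \log p_\theta\!\left(x_t \mid \hat{x}\right)

% XLNet-style permutation language modeling objective: tokens are predicted
% autoregressively under a factorization order z sampled from \mathcal{Z}_T
\max_\theta \; \mathbb{E}_{z \sim \mathcal{Z}_T}\!\left[\sum_{t=1}^{T} \log p_\theta\!\left(x_{z_t} \mid x_{z_{<t}}\right)\right]
```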

A Growing Neural Gas Network Learns Topologies

  

The paper proposes growing neural gas (GNG), an incremental network model that learns topological relations using a Hebbian-like learning rule.
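
For readers new to the algorithm, below is a compact sketch of a standard growing neural gas training loop; the hyperparameter names and defaults are illustrative, and the usual removal of isolated units is omitted for brevity.

```python
import numpy as np

def growing_neural_gas(data, max_nodes=50, steps=5000, eps_b=0.05, eps_n=0.006,
                       age_max=50, lam=100, alpha=0.5, d=0.995,
                       rng=np.random.default_rng(0)):
    """Illustrative GNG sketch: learns a graph of units that follows the data topology."""
    dim = data.shape[1]
    nodes = [rng.standard_normal(dim), rng.standard_normal(dim)]  # reference vectors
    error = [0.0, 0.0]
    edges = {}  # {(i, j): age}

    def key(i, j):
        return (min(i, j), max(i, j))

    for step in range(1, steps + 1):
        x = data[rng.integers(len(data))]
        # Find the nearest (s1) and second-nearest (s2) units to the input.
        dists = [np.sum((x - w) ** 2) for w in nodes]
        s1, s2 = np.argsort(dists)[:2]
        error[s1] += dists[s1]
        # Hebbian-like step: move the winner and its topological neighbors
        # toward the input, age the winner's edges, and refresh edge (s1, s2).
        nodes[s1] += eps_b * (x - nodes[s1])
        for (i, j) in list(edges):
            if s1 in (i, j):
                n = j if i == s1 else i
                nodes[n] += eps_n * (x - nodes[n])
                edges[(i, j)] += 1
        edges[key(s1, s2)] = 0
        edges = {e: a for e, a in edges.items() if a <= age_max}  # drop old edges
        # Periodically insert a new unit between the highest-error node
        # and its highest-error neighbor.
        if step % lam == 0 and len(nodes) < max_nodes:
            q = int(np.argmax(error))
            neighbors = [j if i == q else i for (i, j) in edges if q in (i, j)]
            if neighbors:
                f = max(neighbors, key=lambda n: error[n])
                nodes.append(0.5 * (nodes[q] + nodes[f]))
                error[q] *= alpha
                error[f] *= alpha
                error.append(error[q])
                r = len(nodes) - 1
                edges.pop(key(q, f), None)
                edges[key(q, r)] = 0
                edges[key(f, r)] = 0
        error = [e * d for e in error]  # global error decay
    return np.array(nodes), list(edges)
```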


POET

  

The Paired Open-Ended Trailblazer (POET) endlessly generates increasingly complex and diverse learning environments together with their solutions.
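
As a high-level sketch of the loop the title describes: POET maintains a population of paired (environment, agent) entries, periodically mutates environments into harder variants that pass a minimal criterion, keeps optimizing each agent in its paired environment, and occasionally transfers agents between environments. Everything below is an assumed skeleton, not the paper's algorithm: `mutate_env`, `optimize`, `passes_minimal_criterion`, and `evaluate` are placeholder callables the user would supply, and details such as transfer with fine-tuning and novelty-based environment selection are omitted.

```python
import random

def poet(init_env, init_agent, mutate_env, optimize,
         passes_minimal_criterion, evaluate,
         iterations=1000, mutate_every=50, transfer_every=25,
         max_pairs=20, rng=random.Random(0)):
    """Simplified POET-style loop over co-evolving (environment, agent) pairs."""
    pairs = [(init_env, init_agent)]
    for t in range(1, iterations + 1):
        # 1) Occasionally spawn a harder child environment from an existing pair,
        #    keeping it only if it is neither trivial nor unsolvable for its parent agent.
        if t % mutate_every == 0 and len(pairs) < max_pairs:
            env, agent = rng.choice(pairs)
            child_env = mutate_env(env)
            if passes_minimal_criterion(child_env, agent):
                pairs.append((child_env, agent))
        # 2) Keep optimizing each agent inside its own paired environment.
        pairs = [(env, optimize(env, agent)) for env, agent in pairs]
        # 3) Occasionally transfer: let the best-performing agent from any pair
        #    take over an environment if it outperforms the incumbent there.
        if t % transfer_every == 0:
            for i, (env, _) in enumerate(pairs):
                best = max((a for _, a in pairs), key=lambda a: evaluate(env, a))
                pairs[i] = (env, best)
    return pairs
```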