- Deep Gaussian Mixture Models
- Variational Dropout Sparsifies Deep Neural Networks
- MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks
- A Resizable Mini-batch Gradient Descent based on a Randomized Weighted Majority
- Simple And Efficient Architecture Search for Convolutional Neural Networks
- High-dimensional dynamics of generalization error in neural networks
- Matrix capsules with EM routing
- Data Augmentation Generative Adversarial Networks
- Variance Reduced methods for Non-convex Composition Optimization
- Learning and Real-time Classification of Hand-written Digits With Spiking Neural Networks
- The (Un)reliability of saliency methods
- Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net
- Deep Neural Networks as Gaussian Processes
- Deep Recurrent Gaussian Process with Variational Sparse Spectrum Approximation
- Fraternal Dropout
- Don't Decay the Learning Rate, Increase the Batch Size
- Reparameterizing the Birkhoff Polytope for Variational Permutation Inference
- Variational Inference based on Robust Divergences
- Binary Classification from Positive-Confidence Data
- Kronecker Recurrent Units
- Bayesian GAN
- Learning how to explain neural networks: PatternNet and PatternAttribution
- Bayesian Recurrent Neural Networks
- Bayesian Optimization with Gradients
- Dataset Augmentation in Feature Space
- Revisiting Distributionally Robust Supervised Learning in Classification
- Make SVM great again with Siamese kernel for few-shot learning
- Structure Discovery in Nonparametric Regression through Compositional Kernel Search
- Bayesian Hypernetworks
- Efficient Methods and Hardware for Deep Learning
- Seven neurons memorizing sequences of alphabetical images via spike-timing dependent plasticity
- Understanding Deep Learning Generalization by Maximum Entropy
- Deep Bayesian Active Learning with Image Data
- Aggregated Wasserstein Metric and State Registration for Hidden Markov Models
- Opening the Black Box of Deep Neural Networks via Information
- Neural Network Gradient Hamiltonian Monte Carlo
- Fixing Weight Decay Regularization in Adam
- Unsupervised Deep Embedding for Clustering Analysis
- Unsupervised Learning by Predicting Noise
- Preparing for the Unknown: Learning a Universal Policy with Online System Identification
- Deep Hyperspherical Learning
- Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
- Metric Learning-based Generative Adversarial Network
- Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks
- Information-theoretic analysis of generalization capability of learning algorithms
- A Bayesian Perspective on Generalization and Stochastic Gradient Descent
- Training Neural Networks Without Gradients: A Scalable ADMM Approach
- Neural Discrete Representation Learning
- Bayesian Uncertainty Estimation for Batch Normalized Deep Networks
- The Implicit Bias of Gradient Descent on Separable Data
- Adaptive Sampling Strategies for Stochastic Optimization
- Learning One-hidden-layer Neural Networks with Landscape Design
- Attacking Binarized Neural Networks
- Training GANs with Optimism
- The Riemannian Geometry of Deep Generative Models
- Normalized Direction-preserving Adam
- Deep Gaussian Covariance Network
- mixup: Beyond Empirical Risk Minimization
- Learned Optimizers that Scale and Generalize
- Improving training of deep neural networks via Singular Value Bounding
- Supervised Classification: Quite a Brief Overview
- Distributed Second-Order Optimization using Kronecker-Factored Approximations
- First-order Methods Almost Always Avoid Saddle Points
- Generalization in Deep Learning
- Training Feedforward Neural Networks with Standard Logistic Activations is Feasible
- Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks
- Energy-Based Unsupervised Learning
- Deep Variational Information Bottleneck
- Learning Deep Architectures via Generalized Whitened Neural Networks
- First Efficient Convergence for Streaming k-PCA: a Global, Gap-Free, and Near-Optimal Rate
- Learning Scalable Deep Kernels with Recurrent Structure
- On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
- Optimizing Neural Networks with Kronecker-factored Approximate Curvature
- A Tutorial on Energy-Based Learning
- Detecting and Explaining Crisis
- Breaking the Softmax Bottleneck: A High-Rank RNN Language Model
- Improving Distributional Similarity with Lessons Learned from Word Embeddings
- Language as a Latent Variable: Discrete Generative Models for Sentence Compression
- Compressing Word Embeddings via Deep Compositional Code Learning
- Improving Negative Sampling for Word Representation using Self-embedded Features
- Unsupervised Neural Machine Translation
- Unsupervised Machine Translation Using Monolingual Corpora Only
- Using k-way Co-occurrences for Learning Word Embeddings
- Paraphrase Generation with Deep Reinforcement Learning
- One-shot and few-shot learning of word embeddings
- Understanding Neural Networks through Representation Erasure
- Baselines and Bigrams: Simple, Good Sentiment and Topic Classification
- Generating Natural Adversarial Examples
- Deconvolutional Paragraph Representation Learning
- Entity Embeddings with Conceptual Subspaces as a Basis for Plausible Reasoning
- Construction and Evaluation of Japanese Word Vectors
- Word Translation Without Parallel Data
- Learning the Dimensionality of Word Embeddings
- Using matrices to model symbolic relationships
- Implementing the Deep Q-Network
- A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
- Rainbow: Combining Improvements in Deep Reinforcement Learning
- TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning
- Distributed Prioritized Experience Replay
- Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
- Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
- Safe Model-based Reinforcement Learning with Stability Guarantees
- Representation Learning by Learning to Count
- Unsupervised learning of object frames by dense equivariant image labelling
- DiracNets: Training Very Deep Neural Networks Without Skip-Connections
- Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations
- Genetic CNN
- Semi and Weakly Supervised Semantic Segmentation Using Generative Adversarial Network
- Dynamic Routing Between Capsules
- Progressive Growing of GANs for Improved Quality, Stability, and Variation
- One pixel attack for fooling deep neural networks
- Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence At Every Step
- Unsupervised Cross-Domain Image Generation
- StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks
- Learning Wasserstein Embeddings
- Do Convolutional Neural Networks Learn Class Hierarchy?
- Residual Connections Encourage Iterative Inference
- What Information in the Brain Is Compared Between the Two Eyes for Stereoscopic Vision?
- A Kronecker-factored approximate Fisher matrix for convolution layers
- One Millisecond Face Alignment with an Ensemble of Regression Trees
- The Numerics of GANs
- Neural Color Transfer between Images
- Casual 3D Photography
- The Mondrian Process
- Bayesian Policy Search with Policy Priors
- Variational MCMC
- Shared Segmentation of Natural Scenes Using Dependent Pitman-Yor Processes
- Pitman-Yor Diffusion Trees
- A Stochastic Memoizer for Sequence Data
- Parallel WaveNet: Fast High-Fidelity Speech Synthesis
- High-fidelity speech synthesis with WaveNet
- Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention
- Deep Voice 3: 2000-Speaker Neural Text-to-Speech
- Tacotron: Towards End-to-End Speech Synthesis
- Exploring Speech Enhancement with Generative Adversarial Networks for Robust Speech Recognition
- Robust Speech Recognition Using Generative Adversarial Networks
- Improving speech recognition by revising gated recurrent units
- Acoustic Modeling for Google Home
- Singing Voice Separation with Deep U-Net Convolutional Networks
- A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement
- Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction
- Ternary Residual Networks
- Fast Information-theoretic Bayesian Optimisation
- An Information-Theoretic Analysis of Deep Latent-Variable Models
- An Information Theoretic Perspective on Multiple Classifier Systems
- Information Geometry Connecting Wasserstein Distance and Kullback-Leibler Divergence via the Entropy-Relaxed Transportation Problem
- Deep Scattering: Rendering Atmospheric Clouds with Radiance-Predicting Neural Networks
- Interactive Wood Combustion for Botanical Tree Models
- Learning Explanatory Rules from Noisy Data
- Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions
- Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
- In-Place Initializable Arrays
- On Tensor Train Rank Minimization: Statistical Efficiency and Scalable Algorithm
- Robust Transient Dynamics and Brain Functions
- A neural algorithm for a fundamental computing problem
- Cortical microcircuits as gated-recurrent neural networks
- Learning with three factors: modulating Hebbian plasticity with errors
- A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs
- Comparing Hebbian Semantic Vectors Across Language
- The hippocampus as a predictive map
- Building Machines that Learn and Think for Themselves
- Building Machines That Learn and Think Like People
- A Gentle Introduction to Calculating the BLEU Score for Text in Python
- Understanding the Mixture of Softmaxes (MoS)
- An On-device Deep Neural Network for Face Detection
- Why Deep Learning and NLP Don't Get Along Well?
- Using Machine Learning to predict parking difficulty
- Using machine learning for insurance pricing optimization
- Statistical Machine Learning: Spring 2017
- How to unit test machine learning code.
- Optimizing deeper networks with KFAC in PyTorch.
- How Adversarial Attacks Work
- Deep learning and Backprop in the Brain (CCN 2017)