This is a place to save the deep learning references that I believe are valuable and helpful.
- How to read and understand a scientific paper: a guide for non-scientists
Paper | Authors | Application | Comment |
---|---|---|---|
Efficient BackProp | Yann LeCun | - | π |
Practical recommendations for gradient-based training of deep architectures | Yoshua Bengio | - | π |
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift | Sergey Ioffe, Christian Szegedy | - | π |
Understanding the difficulty of training deep feedforward neural networks | Xavier Glorot, Yoshua Bengio | - | π |
Visualizing Data using t-SNE | Laurens van der Maaten, Geoffrey Hinton | - | π |
Accelerating t-SNE using Tree-Based Algorithms | Laurens van der Maaten | - | π |
- Tradeoff batch size vs. number of iterations to train a neural network
- Deep Learning Book π
- Neural Networks and Deep Learning π
- Tuning the learning rate in Gradient Descent π
- Softmax Regression π (see the sketch after this list)
- Visualizing MNIST: An Exploration of Dimensionality Reduction π
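Since softmax regression appears in the list above, here is a minimal NumPy sketch of the model together with its cross-entropy gradient; the toy data, shapes, and the learning rate of 0.5 are arbitrary illustrations, not values taken from any of the references.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max for numerical stability before exponentiating.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the true class.
    n = probs.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()

# Toy data: 4 samples, 3 features, 3 classes (hypothetical shapes).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
y = np.array([0, 2, 1, 0])
W = np.zeros((3, 3))
b = np.zeros(3)

for step in range(100):
    probs = softmax(X @ W + b)
    # Gradient of cross-entropy w.r.t. the logits is (probs - one_hot(y)) / n.
    grad = probs.copy()
    grad[np.arange(len(y)), y] -= 1.0
    grad /= len(y)
    W -= 0.5 * X.T @ grad   # learning rate 0.5, chosen arbitrarily
    b -= 0.5 * grad.sum(axis=0)

print(cross_entropy(softmax(X @ W + b), y))
```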
Paper | Authors | Application | Comment |
---|---|---|---|
Image Style Transfer Using Convolutional Neural Networks | Leon A. Gatys, Alexander S. Ecker, Matthias Bethge | Style Transfer | |
Depth Map Prediction from a Single Image using a Multi-Scale Deep Network | David Eigen, Christian Puhrsch, Rob Fergus | - | |
Dynamic Routing Between Capsules | Sara Sabour, Nicholas Frosst, Geoffrey E. Hinton | - | |
Densely Connected Convolutional Networks | Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger | - | My own implementation of DenseNet as a Python module can be found here. |
Gradient Based Learning Applied to Document Recognition | Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner | - | |
How transferable are features in deep neural networks? | Jason Yosinski, Jeff Clune, Yoshua Bengio, Hod Lipson | - | π |
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | - | π |
Image Segmentation Using Deep Learning: A Survey | Shervin Minaee, Yuri Boykov, Fatih Porikli, Antonio Plaza, Nasser Kehtarnavaz, Demetri Terzopoulos | - | |
Deep Residual Learning for Image Recognition | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | - | π |
The Importance of Skip Connections in Biomedical Image Segmentation | Michal Drozdzal, Eugene Vorontsov, Gabriel Chartrand, Samuel Kadoury, Chris Pal | - | π |
Fully Convolutional Networks for Semantic Segmentation | Evan Shelhamer, Jonathan Long, Trevor Darrell | - | π |
U-Net: Convolutional Networks for Biomedical Image Segmentation | Olaf Ronneberger, Philipp Fischer, and Thomas Brox | Semantic Segmentation | ππ |
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs | Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille | Semantic Segmentation | π Impl. |
Gated-SCNN: Gated Shape CNNs for Semantic Segmentation | Towaki Takikawa, David Acuna, Varun Jampani, Sanja Fidler | Semantic Segmentation | π Impl. |
FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation | Huikai Wu, Junge Zhang, Kaiqi Huang, Kongming Liang, Yizhou Yu | Semantic Segmentation | π Impl. |
On Power Jaccard Losses for Semantic Segmentation | David Duque-Arias, Santiago Velasco-Forero, Jean-Emmanuel Deschaud, Francois Goulette, Andres Serna, Etienne Decenciere and Beatriz Marcotegui | Semantic Segmentation | π Loss functions for segmentation tasks (see the sketch after this table) |
Locating Objects Without Bounding Boxes | Javier Ribera, David Güera, Yuhao Chen, Edward J. Delp | Object Location (loss function) | π Implementation |
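As mentioned in the power-Jaccard row above, the choice of loss matters for segmentation. Below is a minimal PyTorch sketch of a soft Jaccard (IoU) loss with an optional exponent p on the denominator terms, in the spirit of the power-Jaccard idea; the eps value, shapes, and exact formulation are my assumptions rather than the paper's code.

```python
import torch

def soft_jaccard_loss(pred, target, p=1.0, eps=1e-7):
    """Soft Jaccard (IoU) loss for binary segmentation.

    pred:   probabilities in [0, 1], shape (N, H, W)
    target: binary ground truth, same shape
    p:      exponent on the denominator terms; p = 1 gives the classic
            soft Jaccard loss, p > 1 follows the power-Jaccard idea.
    """
    inter = (pred * target).sum(dim=(1, 2))
    union = pred.pow(p).sum(dim=(1, 2)) + target.pow(p).sum(dim=(1, 2)) - inter
    return (1.0 - inter / (union + eps)).mean()

# Hypothetical usage with sigmoid outputs from a segmentation net:
logits = torch.randn(2, 64, 64)
target = (torch.rand(2, 64, 64) > 0.5).float()
loss = soft_jaccard_loss(torch.sigmoid(logits), target, p=2.0)
```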
- Review of Deep Learning Algorithms for Object Detection
- Depth Map Prediction from a Single Image using a Multi-Scale Deep Network
- Capsule Networks
- Deconvolution and Checkerboard Artifacts
- Convolutional Neural Networks (CNNs / ConvNets)
- An Overview of ResNet and its Variants π
- Losses used for Image Segmentation Problems
Paper | Authors | Application | Comment |
---|---|---|---|
Visualizing and Understanding Recurrent Networks | Andrej Karpathy, Justin Johnson, Li Fei-Fei | - | π |
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling | Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, Yoshua Bengio | - | |
An Empirical Exploration of Recurrent Network Architectures | Rafal Jozefowicz, Wojciech Zaremba, Ilya Sutskever | - | |
LSTM: A Search Space Odyssey | Klaus Greff, Rupesh K. Srivastava, Jan Koutník, Bas R. Steunebrink, Jürgen Schmidhuber | - | |
Massive Exploration of Neural Machine Translation Architectures | Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le | - | |
WaveNet: A Generative Model for Raw Audio | Aäron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu | Deep generative model of raw audio waveforms | |
How to Generate a Good Word Embedding? | Siwei Lai, Kang Liu, Liheng Xu, Jun Zhao | - | |
Systematic evaluation of CNN advances on the ImageNet | Dmytro Mishkin, Nikolay Sergievskiy, Jiri Matas | - | π |
Efficient Estimation of Word Representations in Vector Space | Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean | - | π |
Distributed Representations of Words and Phrases and their Compositionality | Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean | - | π |
Neural Machine Translation by Jointly Learning to Align and Translate | Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio | - | |
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation | Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio | - | |
Effective Approaches to Attention-based Neural Machine Translation | Minh-Thang Luong, Hieu Pham, Christopher D. Manning | - | |
Training Tips for the Transformer Model | Martin Popel, Ondřej Bojar | - | Scaled dot-product attention is sketched below. |
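To complement the attention and Transformer entries above, a minimal sketch of scaled dot-product attention; the shapes, the mask convention, and the helper name are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k); weights sum to 1 over the key axis.
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v, weights

# Hypothetical shapes: batch of 2, sequence length 5, dimension 8 (self-attention).
q = torch.randn(2, 5, 8)
out, attn = scaled_dot_product_attention(q, q, q)
```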
Application | Cell | Layers | Size | Vocabulary | Embedding | Learning Rate | Paper |
---|---|---|---|---|---|---|---|
Speech Recognition (large vocabulary) | LSTM | 5, 7 | 600, 1000 | 82K, 500K | -- | -- | Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition |
Speech Recognition | LSTM | 1, 3, 5 | 250 | -- | -- | 0.001 | Speech Recognition with Deep Recurrent Neural Networks |
Machine Translation (seq2seq) | LSTM | 4 | 1000 | Source: 160K, Target: 80K | 1,000 | -- | Sequence to Sequence Learning with Neural Networks |
Image Captioning | LSTM | -- | 512 | -- | 512 | (fixed) | Show and Tell: A Neural Image Caption Generator |
Image Generation | LSTM | -- | 256, 400, 800 | -- | -- | -- | DRAW: A Recurrent Neural Network For Image Generation |
Question Answering | LSTM | 2 | 500 | -- | 300 | -- | A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering |
Text Summarization | GRU | -- | 200 | Source: 119K, Target: 68K | 100 | 0.001 | Sequence-to-Sequence RNNs for Text Summarization |
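To make the hyperparameters above concrete, here is a sketch of an encoder configured like the machine-translation row of the table (4 LSTM layers of size 1000, a 160K source vocabulary, and 1000-dimensional embeddings); the module structure is a plausible PyTorch rendering, not the original paper's code.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Encoder mirroring the machine-translation row of the table above:
    4 LSTM layers of size 1000, a 160K source vocabulary, and
    1000-dimensional embeddings (Sutskever et al., 2014)."""
    def __init__(self, vocab=160_000, emb=1000, hidden=1000, layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, num_layers=layers, batch_first=True)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids
        x = self.embed(tokens)
        outputs, (h, c) = self.lstm(x)
        return outputs, (h, c)

enc = Encoder()
out, state = enc(torch.randint(0, 160_000, (2, 12)))
```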
- The Unreasonable Effectiveness of Recurrent Neural Networks π
- Preprocessing text before using an RNN π
- Sentiment Analysis - A very good tutorial! π
- Understanding LSTM Networks
- WaveNet: A generative model for raw audio
- Word2Vec Tutorial - The Skip-Gram Model π (see the sketch after this list)
- Applying word2vec to Recommenders and Advertising
- Natural Language Processing Key Terms, Explained
- Attention and Augmented Recurrent Neural Networks
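The skip-gram sketch referenced in the Word2Vec bullet above: skip-gram with negative sampling in PyTorch. The vocabulary size, embedding dimension, and batch contents are made up; only the objective follows Mikolov et al.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkipGram(nn.Module):
    """Skip-gram with negative sampling (word2vec). A minimal sketch;
    vocabulary size and dimensions are hypothetical."""
    def __init__(self, vocab=10_000, dim=128):
        super().__init__()
        self.center = nn.Embedding(vocab, dim)   # input (center-word) vectors
        self.context = nn.Embedding(vocab, dim)  # output (context-word) vectors

    def forward(self, center_ids, context_ids, negative_ids):
        c = self.center(center_ids)                       # (B, D)
        pos = (c * self.context(context_ids)).sum(-1)     # (B,)
        neg = torch.bmm(self.context(negative_ids),       # (B, K, D)
                        c.unsqueeze(-1)).squeeze(-1)      # (B, K)
        # Maximize log sigma(pos) and log sigma(-neg): the standard SGNS objective.
        return -(F.logsigmoid(pos).mean() + F.logsigmoid(-neg).mean())

model = SkipGram()
loss = model(torch.tensor([1, 2]), torch.tensor([3, 4]),
             torch.randint(0, 10_000, (2, 5)))
```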
Paper | Authors | Application | Comment |
---|---|---|---|
Generative Adversarial Nets | Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio | - | π (training-step sketch after this table) |
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks | Alec Radford, Luke Metz, Soumith Chintala | - | π |
Improved Techniques for Training GANs | Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen | - | π |
Fine-Grained Car Detection for Visual Census Estimation | Timnit Gebru, Jonathan Krause, Yilun Wang, Duyun Chen, Jia Deng, Li Fei-Fei | - | |
CycleGAN Face-off | Xiaohan Jin, Ye Qi, Shangxuan Wu | - | π |
Image-to-Image Translation with Conditional Adversarial Networks | Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros | - | π |
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs | Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro | - | |
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks | Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros | - | |
Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data | Amjad Almahairi, Sai Rajeswar, Alessandro Sordoni, Philip Bachman, Aaron Courville | - | π |
StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation | Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, Jaegul Choo | - | π |
Least Squares Generative Adversarial Networks | Xudong Mao, Qing Li, Haoran Xie, Raymond Y.K. Lau, Zhen Wang, Stephen Paul Smolley | - | π |
Sampling Generative Networks | Tom White | - | π π |
Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network | Wenzhe Shi, Jose Caballero, Ferenc Huszár, Johannes Totz, Andrew P. Aitken, Rob Bishop, Daniel Rueckert, Zehan Wang | - | π |
Instance Normalization: The Missing Ingredient for Fast Stylization | Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky | Replacement for BatchNorm | |
Taming Transformers for High-Resolution Image Synthesis | Patrick Esser, Robin Rombach, Björn Ommer | - | |
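As flagged in the first row of the table, here is a minimal sketch of one GAN training step using the common non-saturating generator loss; the MLP architectures, batch size, and learning rates are arbitrary choices, not the paper's setup.

```python
import torch
import torch.nn as nn

# Minimal GAN training step (Goodfellow et al., 2014). Sizes are arbitrary:
# a 64-dim latent mapped to 784-dim "images" (e.g., flattened 28x28).
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(32, 784)  # stand-in for a real data batch

# Discriminator step: push real toward 1, fake toward 0.
fake = G(torch.randn(32, 64)).detach()
loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: make D classify fakes as real (non-saturating loss).
fake = G(torch.randn(32, 64))
loss_g = bce(D(fake), torch.ones(32, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```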
- Improved GAN (Semi-supervised GAN)
- Semi-Supervised Learning π
- Attacking Machine Learning with Adversarial Examples
- iGAN: Interactive Image Generation via Generative Adversarial Networks
- Image to Image Demo
- StarGAN - Official PyTorch Implementation π
- CycleGAN and pix2pix in PyTorch
- ganhacks ππ
- Taming Transformers for High-Resolution Image Synthesis
Paper | Authors | Application | Comment |
---|---|---|---|
Feedback Control For Cassie With Deep Reinforcement Learning | Zhaoming Xie, Glen Berseth, Patrick Clary, Jonathan Hurst, Michiel van de Panne | - | π |
Convergence of Optimistic and Incremental Q-Learning | Eyal Even-Dar, Yishay Mansour | Q-Table initialization | π |
Issues in Using Function Approximation for Reinforcement Learning | Sebastian Thrun, Anton Schwartz | - | π |
Deep Reinforcement Learning with Double Q-learning | Hado van Hasselt, Arthur Guez, David Silver | - | π (target computation sketched after this table) |
Prioritized Experience Replay | Tom Schaul, John Quan, Ioannis Antonoglou, David Silver | - | π |
Dueling Network Architectures for Deep Reinforcement Learning | Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas | - | π |
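The Double Q-learning target flagged in the table above, as a minimal PyTorch sketch: the online network selects the next action and the target network evaluates it. The toy linear networks and shapes are assumptions.

```python
import torch

def double_q_targets(q_online, q_target, rewards, next_states, dones, gamma=0.99):
    """Double Q-learning target (van Hasselt et al.): the online network
    chooses the next action, the target network evaluates it."""
    with torch.no_grad():
        next_actions = q_online(next_states).argmax(dim=1, keepdim=True)
        next_values = q_target(next_states).gather(1, next_actions).squeeze(1)
        return rewards + gamma * (1.0 - dones) * next_values

# Hypothetical networks over 4-dim states and 2 actions:
q_online = torch.nn.Linear(4, 2)
q_target = torch.nn.Linear(4, 2)
targets = double_q_targets(q_online, q_target,
                           rewards=torch.zeros(8),
                           next_states=torch.randn(8, 4),
                           dones=torch.zeros(8))
```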
- Deep Traffic
- DeepLearningFlappyBird
- SnakeAI
- Markov Chain Monte Carlo Without all the Bullshit
- World scale inverse reinforcement learning in Google Maps
Paper | Authors | Application | Comment |
---|---|---|---|
SGDR: Stochastic Gradient Descent with Warm Restarts | Ilya Loshchilov & Frank Hutter | - | π (schedule sketched below) |
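The schedule from the SGDR row is sketched below using PyTorch's built-in CosineAnnealingWarmRestarts; the values of T_0 and T_mult are examples, not recommendations from the paper.

```python
import torch

# Cosine annealing with warm restarts (SGDR). T_0 is the length of the
# first cycle in epochs; T_mult doubles each subsequent cycle here.
model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
sched = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(opt, T_0=10, T_mult=2)

for epoch in range(70):
    # ... one epoch of training with opt ...
    sched.step()  # anneals the lr toward 0, then restarts it at 0.1
```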
- Optimizers Explained - Adam, Momentum and Stochastic Gradient Descent π
- Tuning the learning rate in Gradient Descent π
- Loss functions
- deepmind research papers
- Reading Barcodes on Hooves: How Deep Learning Is Helping Save Endangered Zebras
- Tesla autopilot
- Attacking Machine Learning with Adversarial Examples
- AI, Deep Learning, and Machine Learning: A Primer
- Deep Learning State of the Art (2020) | MIT Deep Learning Series π
- Better Deep Learning - Train Faster, Reduce Overfitting, and Make Better Predictions
- How to attack a machine learning model?
- Reading game frames in Python with OpenCV - Python Plays GTA V
- CLIP: Connecting Text and Images π
- Introduction to VQGAN+CLIP π π