minseok0809/classic-ai-paper

Classic AI Paper


Talos
Greek & Roman Mythology


Automatic Slave
Aristotle. Politics (BC 350)


Automata
Hero. About Automata (1589)


Tree of Porphyry
Porphyry (3 Century)


Takwin
Jabir ibn Hayyan. Book of Stones (7-8 Century)


Ars magna
Ramon Llull (1232–1315)


Golem
Rabbi Loew (15-16th century)


Problem of Points
Pascal’s Letters to Fermat on the "Problem of Points" (1654)


Chain Rule
Gottfried Wilhelm Leibniz (1676)


Binary arithmetic
Godefroy-Guillaume Leibnitz. Explication de l’arithmétique binaire, qui se sert des seuls caractères O et I avec des remarques sur son utilité et sur ce qu’elle donne le sens des anciennes figures chinoises de Fohy (1703)


Laputa
Jonathan Swift. Gulliver's Travels (1726)


Bayes' theorem
Thomas Bayes. An Essay Towards Solving a Problem in the Doctrine of Chances (1763)


Ordinary Least Squares (OLS)
Legendre (1805)
Gauss (1809)


Boolean Algebra
George Boole. The Laws of Thought (1854)


Entropy
Ludwig Boltzmann. On the Relationship between the Second Fundamental Theorem of the Mechanical Theory of Heat and Probability Calculations Regarding the Conditions for Thermal Equilibrium (1877)
Claude E. Shannon. A Mathematical Theory of Communication (1948)


First-Order Logic
Gottlob Frege. Concept Writing (1879)


Anomaly Detection
K. Pearson. On lines and planes of closest fit to systems of points in space (Philosophical Magazine 1901)


Principia Mathematica
Russell and Whitehead. Principia Mathematica (1910-1913)


El Ajedrecista
Torres Quevedo (1912)


Ising model
Ernst Ising and Wilhelm Lenz. The Ising model (or Lenz–Ising model) (1925)


Incompleteness Theorems
Kurt Friedrich Gödel. On Formally Undecidable Propositions of Principia Mathematica and Related Systems (1931)


Lambda Calculus
Alonzo Church. An Unsolvable Problem of Elementary Number Theory (1936)


Logic Gate
Claude E. Shannon. A Symbolic Analysis of Relay and Switching Circuits (1937)


Z3
Konrad Zuse (1938-1941)
R. Rojas. How to make Zuse's Z3 a universal computer (1998)


Nimatron
Edward Condon (1940)


Process Control Computer
Konrad Zuse (1941)


McCulloch & Pitts Model
Warren McCulloch and Walter Pitts. A Logical Calculus of the Ideas Immanent in Nervous Activity (1943)


Gradient Descent (GD)
C. Lemarechal. Cauchy and the Gradient Method. Doc Math Extra, pp. 251-254. (2012)


Von Neumann Architecture
J Von Neumann. First Draft of a Report on EDVAC (1945)


As We May Think
Vannevar Bush. As We May Think (1945)


Cybernetics
Norbert Wiener. Cybernetics: Or Control and Communication in the Animal and the Machine (1948)


Turing Machine
A. M. Turing. On Computable Numbers, with an Application to the Entscheidungsproblem (1936)
A. M. Turing. Intelligent Machinery (1948)


Hebbian Learning
Donald O. Hebb. The Organization of Behavior (1949)


Playing Chess
Claude Elwood Shannon. Programming a Computer for Playing Chess (Philosophical Magazine 1950)


Turing Test
A. M. Turing. Computing Machinery and Intelligence (1950)


Artificial Intelligence
John McCarthy, Marvin L. Minsky, Nathaniel Rochester, and Claude E. Shannon. A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence (1955)


Bounded Rationality
Simon, Herbert A. A Behavioral Model of Rational Choice (1955)


Logic Theorist
By Allen Newell, Herbert A. Simon, and Cliff Shaw (1955)


Perceptron
Frank Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain (Psychological Review 1958)
Marvin Minsky and Seymour Papert. Perceptrons (1969)


LISP (List Processing)
By John McCarthy (1958)


Logical AI
John McCarthy. Programs with Common Sense (1958)


Machine Learning
Arthur L. Samuel. Some Studies in Machine Learning Using the Game of Checkers (1959)


Dependency Grammar
Lucien Tesnière (1959)


Alpha-Beta Pruning
Arthur L. Samuel. Some studies in machine learning using the game of checkers (1959)


GPS(General Problem Solver)
Allen Newell. A Guide to The General Problem-solver Program GPS-2-2 (1963)


Decision Tree
Morgan, J.N. & Sonquist, J.A. Problems in the analysis of survey data, and a proposal. (1963)


Iris
R. A. Fisher. The Use of Multiple Measurements in Taxonomic Problems (1936)


Alchemy and Artificial Intelligence
Hubert Dreyfus. Alchemy and Artificial Intelligence (1965)


ELIZA
Joseph Weizenbaum. ELIZA—a computer program for the study of natural language communication between man and machine (1966)


Automata
J Von Neumann, AW Burks. Theory of self-reproducing automata. (1966)
Bingbin Liu. Transformers Learn Shortcuts to Automata. (ICLR 2023)


K-Nearest Neighbors (K-NN)
T. M. Cover and P. E. Hart. Nearest Neighbor Pattern Classification (1967)


Student
Daniel Bobrow et al. Binary Message Forms in Computer Networks (1968)


Symbolic AI
Allen Newell, J. C. Shaw, Herbert A. Simon. Empirical explorations of the logic theory machine: a case study in heuristic (1957)
Allen Newell and Herbert A. Simon. Human problem solving (1972)
Allen Newell and Herbert A. Simon. Computer science as empirical inquiry: symbols and search (1976)


Rescorla–Wagner Model
Rescorla, R.A. & Wagner, A.R. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement (1972)


Emergent Ability
P. W. Anderson. More Is Different (1972)
Rylan Schaeffer et al. Are Emergent Abilities of Large Language Models a Mirage? (2023)


Eligibility Traces
A. Klopf. Brain Function and Adaptive Systems: A Heterostatic Theory (1972)
Satinder P. Singh & Richard S. Sutton. Reinforcement learning with replacing eligibility traces (1996)


Frame
Marvin Minsky. Framework for Representing Knowledge (1974)


Cognitron
Kunihiko Fukushima. Cognitron: A self-organizing multilayered neural network (Biological Cybernetics 1975)


Beam Search
B. T. Lowerre. The harpy speech recognition system. Carnegie Mellon University. (1976)
Peng Si Ow et al. Filtered beam search in scheduling. (1986)


EM Algorithm
A. P. Dempster et al. Maximum Likelihood from Incomplete Data via the EM Algorithm (Journal of the Royal Statistical Society 1977)


Bayesian Optimization
J Mockus et al. The application of Bayesian methods for seeking the extremum (1978)
Jasper Snoek et al. Practical Bayesian Optimization of Machine Learning Algorithms (NeurIPS 2012)


Constraint Satisfaction Problem
Geoffrey E Hinton. Using Relaxation to Find a Puppet. (1979)


Chinese Room Argument
John Searle. Minds, brains, and programs (1980)


Outliers Detection
D. Hawkins. Identification of Outliers (1980)


Temporal Difference Learning
Sutton, Richard S. and Barto, Andrew G. Toward a modern theory of adaptive networks (Psychological Review 1981)


Shallow Learning(Least Squares)
Stephen M. Stigler. Gauss and the Invention of Least Squares. (1981)


Neuroscience
David Hubel, Torsten Wiesel. Receptive fields of single neurons in the cat’s striate cortex (1959)
Lawrence Roberts. Machine perception of three-dimensional solids (1963)
David Marr. Vision: A computational investigation into the human representation and processing of visual information (1982)


Neocognitron
Kunihiko Fukushima. Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position (1980)


K-means Clustering
Stuart P. Lloyd. Least squares quantization in PCM (1982)


Hopfield Network
J J Hopfield. Neural networks and physical systems with emergent collective computational abilities (1982)


Cognitive wheels
Daniel Dennett. Cognitive wheels: The frame problem of AI (1984)


Eurisko
by Douglas Lenat (1984)


CART
Breiman, L et al. Classification and Regression Trees. (Belmont, CA: Wadsworth International Group 1984)


Boltzmann Machines
Geoffrey E. Hinton et al. A Learning Algorithm for Boltzmann Machines (1985)


KL-ONE
Ronald J. Brachman et al. An overview of the KL-ONE Knowledge Representation System (Cognitive Science 1985)


On Bullshit
Harry Frankfurt (1986)


Distributed representations
Geoffrey E. Hinton et al. Distributed representations (1986)


Iterative Dichotomiser 3
Quinlan, R. Induction of decision trees. (Machine Learning 1986)


NETtalk
Terrence J. Sejnowski et al. NETtalk: a parallel network that learns to read aloud (1986)


Backpropagation
H. J. Kelley. Gradient Theory of Optimal Flight Paths. ARS Journal, Vol. 30, No. 10, pp. 947-954. (1960)
Stuart Dreyfus. The numerical solution of variational problems (1962)
S. Linnainmaa. The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors (1970)
David E. Rumelhart et al. Learning representations by back-propagating errors (1986)


Katz's back-off model
Katz, S. M. Estimation of probabilities from sparse data for the language model component of a speech recognizer (1987)


Connect-Four
Victor Allis. A Knowledge-Based Approach of Connect-Four (ICGA Journal 1988)


Hidden Markov Models
Rabiner, L. R. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. (Proceedings of the IEEE 1989)


Genetic Algorithms (GA)
W. Siedlecki et al. A note on genetic algorithms for large-scale feature selection (Pattern Recognition Letters 1989)


Chinook
Jonathan Schaeffer et al. (1989)


Symbol Grounding
Stevan Harnad. The symbol grounding problem (1990)


Nouvelle AI
Rodney A. Brooks. Elephants Don't Play Chess (Robotics and Autonomous Systems 1990)


MARS(Multivariate Adaptive Regression Splines)
Friedman, J. H. Multivariate adaptive regression splines. (The Annals of Statistics 1991)


Dyslexia
Geoffrey E. Hinton et al. Lesioning an attractor network: Investigations of acquired dyslexia (1991)


IBM alignment models
Mays, E., F. J. Damerau et al. Context based spelling correction (Information Processing and Management 1991)
Peter F. Brown et al. The Mathematics of Statistical Machine Translation: Parameter Estimation. (Computational Linguistics, 1993)


Mixture of Experts
Robert A. Jacobs et al. Adaptive Mixtures of Local Experts (MIT Press 1991)
Noam Shazeer et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (ICLR 2017)
Dmitry Lepikhin et al. GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding (ICLR 2021)
William Fedus et al. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (JMLR 2022)
Barret Zoph et al. ST-MoE: Designing Stable and Transferable Sparse Expert Models (2022)
Trevor Gale et al. MegaBlocks: Efficient Sparse Training with Mixture-of-Experts (2022)
Sheng Shen et al. Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models (ICLR 2024)
Albert Q. Jiang et al. Mixtral of Experts (2024)


Object Recognition
David Lowe. Object Recognition from Local Scale-Invariant Features. (1999)


Singular Value Decomposition
G. W. Stewart. Early History of the Singular Value Decomposition (1993)


Penn Treebank
Mitchell P. Marcus et al. Building a Large Annotated Corpus of English: The Penn Treebank (1993)


Locally Linear Embedding (LLE)
Sam T. Roweis and Lawrence K. Saul. Nonlinear Dimensionality Reduction by Locally Linear Embedding (2000)


Word Co-occurrence Probabilities
Ido Dagan et al. Similarity-Based Estimation of Word Cooccurrence Probabilities (ACL 1994)


Maximum Entropy
Adwait Ratnaparkhi. A Maximum Entropy Model for POS tagging (1994)


Complementary priors
Geoffrey E. Hinton et al. A fast learning algorithm for deep belief nets (2006)


Kernel Learning
Vladimir Vapnik. The Nature of Statistical Learning Theory (1995)
Christopher J.C. Burges et al. Advances in Kernel Methods: Support Vector Learning (1998)
Scholkopf and Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (2001)


Kneser-Ney Smoothing
Reinhard Kneser and Hermann Ney. Improved backing-off for M-gram language modeling (ICASSP 1995)


Speech Recognition
LeCun, Yann et al. Convolutional networks for images, speech, and time series. (The handbook of brain theory and neural networks 1995)


BM25
Stephen Robertson et al. Okapi at TREC-3. In Overview of the Third Text REtrieval Conference(TREC-3). pages 109–126. (1995)


SVM(Support Vector Machine)
Corinna Cortes, Vladimir Vapnik. Support-vector networks (1995)


Statistical Machine Learning
Vladimir Vapnik. The Nature of Statistical Learning Theory (1995)


NER(Named-Entity Recognition)
Lance Ramshaw, Mitch Marcus. Text Chunking using Transformation-Based Learning (VLC-WS 1995)


TF-IDF
Thorsten Joachims. A probabilistic analysis of the rocchio algorithm with tfidf for text categorization (1996)


Logistello
Michael Buro. The othello match of the year: Takeshi murakami vs. logistello (ICGA Journal 1997)


LeNet
Yann LeCun et al. Gradient-Based Learning Applied to Document Recognition (IEEE 1998)


MNIST
LeCun et al. Gradient-based learning applied to document recognition (IEEE 1998)


MEMM
McCallum et al. Maximum Entropy Markov Models for Information Extraction and Segmentation (ICML 2000)


CRFs
J. Lafferty et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data (ICML 2001)


DBSCAN
Martin Ester et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise (KDD 1996)


Adaboost
Yoav Freund, Robert E. Schapire. Experiments with a New Boosting Algorithm (1996)


Graph Neural Network
Alessandro Sperduti et al. Supervised neural networks for the classification of structures (1997)


DNN
Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner. Gradient-Based Learning Applied to Document Recognition (1998)


RNN
Rumelhart, David E; Hinton, Geoffrey E, and Williams, Ronald J. Learning internal representations by error propagation (Sept. 1985)
Jordan, Michael I. Serial order: a parallel distributed processing approach (1986)
Williams, R. J et al. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Back-propagation: Theory, Architectures and Applications (1992)


DENDRAL
Edward A. Feigenbaum, Bruce G. Buchanan. DENDRAL and Meta-DENDRAL roots of knowledge systems and expert system applications (1993)


Variational AutoEncoder(VAE)
Geoffrey E Hinton et al. Autoencoders, minimum description length, and helmholtz free energy (NeurIPS 1994)
Diederik P Kingma, Max Welling. Auto-Encoding Variational Bayes (2013)


LSTM
S. Hochreiter and J. Schmidhuber. Long Short-Term Memory (1995)


EQP(Equation Prover)
William McCune. Solution of the Robbins Problem. (1997)


Kernel PCA
Sebastian Mika et al. Kernel PCA and De-Noising in Feature Spaces (NIPS 1998)


Support Vector Method for Novelty Detection (SVND)
Bernhard Scholkopf. Support Vector Method for Novelty Detection. (NIPS 1999)


Local Outlier Factor (LOF)
Markus M. Breunig et al. LOF: identifying density-based local outliers (ACM SIGMOD Record 2000)


Isometric Feature Mapping (ISOmap)
J. B. Tenenbaum et al. A Global Geometric Framework for Nonlinear Dimensionality Reduction. (Science 2000)


Random Forest
Leo Breiman. Random Forests. Machine Learning, Volume 45, pages 5–32. (2001)


Deep Blue
M Campbell. Deep Blue. (2002)


Kismet
Cynthia Breazeal. Emotion and sociable humanoid robots (IJHCS 2003)


NPLM(Neural Probabilistic Language Model)
Yoshua Bengio et al. A neural probabilistic language model (Journal of Machine Learning Research 2003)


LDA(Latent Dirichlet Allocation)
David M. Blei et al. Latent Dirichlet Allocation (Journal of Machine Learning Research 2003)


CoNLL-2003
Sang et al. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition (NAACL 2003)


Kernel Fisher Discriminant (KFD)
Jian Yang et al. Essence of kernel Fisher discriminant: KPCA plus LDA (Pattern Recognition 2004)


Support Vector Data Description (SVDD)
David M.J. Tax et al. Support Vector Data Description (Machine Learning 2004)


DRIVE (Digital Retinal Images for Vessel Extraction)
Joes Staal et al. Ridge-based vessel segmentation in color images of the retina (IEEE 2004)


Feature
David G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints (IJCV 2004)
Navneet Dalal, Bill Triggs. Histograms of Oriented Gradients for Human Detection (CVPR 2005)


Reconstruction
Noah Snavely, Steven M. Seitz, Richard Szeliski. Photo Tourism: Exploring Photo Collections in 3D (ACM 2006)


Connectionist Temporal Classification (CTC)
Alex Graves et al. Connectionist Temporal Classification, Labelling Unsegmented Sequence Data with RNN (ICML 2006)


Deep Belief Network (DBN)
Geoffrey E. Hinton et al. A fast learning algorithm for deep belief nets (2006)
Lee Honglak et al. Sparse deep belief net model for visual area V2 (NeurIPS 2007)
Hinton, Geoffrey E et al. Reducing the dimensionality of data with neural networks (Science 2006)
Hinton, Geoffrey E et al. Training products of experts by minimizing contrastive divergence (Neural computation 2002)


Autoencoder
Hinton, Geoffrey E et al. Reducing the dimensionality of data with neural networks (2006)


Support Vector Regression (SVR)
Harris Drucker et al. Support Vector Regression Machines (Statistics and Computing 2007)


Greedy layer-wise training
Bengio, Yoshua, et al. Greedy layer-wise training of deep networks. (NeurIPS 2007)


SLAM
Davison et al. MonoSLAM: Real-Time Single Camera SLAM (TPAMI 2007)


Knowledge Graph
Fabian M. Suchanek et al. YAGO: A Core of Semantic Knowledge Unifying WordNet and Wikipedia (WWW 2007)


t-SNE
Laurens van der Maaten et al. Visualizing Data using t-SNE (JMLR 2008)


Denoising Autoencoder
Pascal Vincent et al. Extracting and Composing Robust Features with Denoising Autoencoders (ICML 2008)


The Four-Color Theorem
Georges Gonthier. Formal Proof—The Four-Color Theorem (2008)


IEMOCAP (The Interactive Emotional Dyadic Motion Capture (IEMOCAP) Database)
Carlos Busso et al. IEMOCAP: interactive emotional dyadic motion capture database (2008)


Deformable Part Model
Pedro Felzenszwalb, David McAllester, Deva Ramanan. A discriminatively trained, multiscale, deformable part model. (2008)


Pubmed
Prithviraj Sen et al. Collective Classification in Network Data (AAAI 2008)


Isolation Forest
Fei Tony Liu et al. Isolation Forest (ICDM 2008)
Sahand Hariri et al. Extended Isolation Forest (2018)


Relation Extraction
Mintz et al. Distant supervision for relation extraction without labeled data (ACL | IJCNLP 2009)


ImageNet
Jia Deng et al. ImageNet: A Large-Scale Hierarchical Image Database (CVPR 2009)


Domain Adaptation
Shai Ben-David et al. A theory of learning from different domains (Mach Learn 2010)


ReLU
Vinod Nair and Geoffrey E. Hinton. Rectified Linear Units Improve Restricted Boltzmann Machines (ICML 2010)


PASCAL VOC
Mark Everingham et al. The PASCAL Visual Object Classes (VOC) Challenge (IJCV 2010)


Fold It (Rosetta-based game)
Seth Cooper et al. Predicting protein structures with a multiplayer online game (Nature 2010)


Graphical Models
Sebastian Nowozin and Christoph H. Lampert. Structured Learning and Prediction in Computer Vision (2011)


CUB-200-2011 (Caltech-UCSD Birds-200-2011)
Wah et al. The Caltech-UCSD Birds-200-2011 Dataset (2011)


HMDB51
Hildegard Kuehne et al. HMDB: A large video database for human motion recognition (IEEE 2011)


SVHN (Street View House Numbers)
Netzer et al. Reading digits in natural images with unsupervised feature learning (2011)


Scikit-learn
Fabian Pedregosa et al. Scikit-learn: Machine Learning in Python (2011)
Lars Buitinck et al. API design for machine learning software: experiences from the scikit-learn project (2013)


NumPy
Stefan Van Der Walt et al. The NumPy array: a structure for efficient numerical computation (2011)
Charles R. Harris et al. Array Programming with NumPy (2020)


MuJoCo
Emanuel Todorov et al. MuJoCo: A physics engine for model-based control (IEEE/RSJ IROS 2012)


CIFAR
Alex Krizhevsky et al. Learning Multiple Layers of Features from Tiny Images (2012)


NYUv2 (NYU-Depth V2)
Nathan Silberman et al. Indoor Segmentation and Support Inference from RGBD Images (LNIP 2012)


KITTI-360
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D (PAMI 2012)


UCF101 (UCF101 Human Actions dataset)
Soomro et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012)


KITTI
Andreas Geiger et al. Are we ready for autonomous driving? The KITTI vision benchmark suite (IEEE 2012)


LIDC-IDRI
Armato et al. The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans (2011)


Random Search
J. Bergstra and Y. Bengio. Random search for hyper-parameter optimization (2012)


CNN(Alexnet)
A. Krizhevsky, I. Sutskever, G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks (2012)
Matthew D. Zeiler and Rob Fergus. Visualizing and Understanding Convolutional Networks (ECCV 2014)


SST (Stanford Sentiment Treebank)
Socher et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank (EMNLP 2013)


Human3.6M
Ionescu et al. Human3.6m: Large scale datasets and predictive methods for 3D human sensing in natural environments (IEEE 2013)


ConvGNN
Joan Bruna et al. Spectral Networks and Locally Connected Networks on Graphs (2013)


R-CNN
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation (2013)


Word2Vec
T. Mikolov et al. Efficient estimation of word representations in vector space (2013)


Linear representation Hypothesis
T. Mikolov et al. Linguistic regularities in continuous space word representations (NAACL 2013)


Anomaly Detection
Charu C Aggarwal. An introduction to outlier analysis (2013)


Never Ending Image Learner (NEIL)
Xinlei Chen et al. NEIL: Extracting Visual Knowledge from Web Data (ICCV 2013)


Dropout
N. Srivastava et al. Dropout: A simple way to prevent neural networks from overfitting (2014)


Word Representation
Omer Levy et al. Neural Word Embedding as Implicit Matrix Factorization (2014)


Adam
D. Kingma and J. Ba. Adam: A method for stochastic optimization (2014)


COCO (Microsoft Common Objects in Context)
Tsung-Yi Lin et al. Microsoft COCO: Common Objects in Context (ECCV 2014)


Caffe
Yangqing Jia et al. Caffe: Convolutional Architecture for Fast Feature Embedding (ACM 2014)


GRU(Gated Recurrent Unit)
Kyunghyun Cho et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (EMNLP 2014)


PASCAL3D+
Yu Xiang et al. Beyond PASCAL: A benchmark for 3D object detection in the wild (IEEE 2014)


DeCAF
Jeff Donahue et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition (ICML 2014)


GAN
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative adversarial nets (2014)


FCN
Jonathan Long, Evan Shelhamer, Trevor Darrell. Fully Convolutional Networks for Semantic Segmentation. (2014)


DeepFace
Y. Taigman et al. DeepFace: Closing the gap to human-level performance in face verification (2014)


Seq2Seq
I. Sutskever et al. Sequence to sequence learning with neural networks. (2014)


DQN (Deep Q-Network)
Volodymyr Mnih et al. Playing Atari with Deep Reinforcement Learning (NeurIPS 2014)
Volodymyr Mnih et al. Human-level control through deep reinforcement learning (Nature 2015)


Robotics: OpenAI Gym
Matthias Plappert et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research (NeurIPS 2014)


GloVe
Jeffrey Pennington et al. GloVe: Global Vectors for Word Representation (EMNLP 2014)


Text Classification with CNN
Yoon Kim. Convolutional Neural Networks for Sentence Classification (EMNLP 2014)


Autonomous Weapons Open Letter: AI & Robotics Researchers
Barbara J. Grosz, Demis Hassabis, Stephen Hawking, Kathryn McElroy, Elon Musk, Steve Wozniak et al. (IJCAI 2015)


CAM(Class-Activation Map)
Maxime Oquab et al. Is Object Localization for Free? - Weakly-Supervised Learning With Convolutional Neural Networks (CVPR 2015)


Unsupervised Domain Adaptation
Yaroslav Ganin et al. Unsupervised Domain Adaptation by Backpropagation (2015)


ResNet
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. (2015)


Batch Normalization
S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. (2015)


YOLO
Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi. You Only Look Once: Unified, Real-Time Object Detection (2015)


ArcFace
Jiankang Deng, Jia Guo, Jing Yang, Niannan Xue, Irene Kotsia, and Stefanos Zafeiriou. ArcFace: Additive Angular Margin Loss for Deep Face Recognition (CVPR 2019)


PReLU(Parametric Rectified Linear Unit)
Kaiming He et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (2015)


SUN RGB-D
Song et al. SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite (IEEE 2015)


MovieLens
F. Maxwell Harper et al. The MovieLens Datasets: History and Context (ACL 2015)


ModelNet
Wu et al. 3D ShapeNets: A Deep Representation for Volumetric Shapes (CVPR 2015)


LibriSpeech
Vassil Panayotov et al. Librispeech: An ASR corpus based on public domain audio books (IEEE 2015)


SNLI (Stanford Natural Language Inference)
Bowman et al. A large annotated corpus for learning natural language inference (EMNLP 2015)


Visual Question Answering (VQA)
Agrawal et al. VQA: Visual Question Answering (ICCV 2015)


ShapeNet
Chang et al. ShapeNet: An Information-Rich 3D Model Repository (2015)


Model Compression
Cristian Buciluă et al. Model Compression (ACM SIGKDD 2006)
O. Vinyals, J. A. Dean, G. E. Hinton. Distilling the Knowledge in a Neural Network. (2015)


CelebA (CelebFaces Attributes Dataset)
Liu et al. Deep Learning Face Attributes in the Wild (IEEE 2015)


ActivityNet
Heilbron et al. ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding (IEEE 2015)


Deep learning
Yann LeCun, Yoshua Bengio, Geoffrey Hinton. Deep learning (Nature 2015)


TRPO
Schulman, John et al. Trust Region Policy Optimization (2015)


Eugene Goostman 2014
Kevin Warwick and Huma Shah. Can machines think? A report on Turing test experiments at the Royal Society (JEAIL 2016)
Kevin Warwick and Huma Shah. Passing the Turing Test Does Not Mean the End of Humanity (Cognitive Computation 2016)


A3C (Asynchronous Advantage Actor Critic)
Volodymyr Mnih et al. Asynchronous Methods for Deep Reinforcement Learning (ICML 2016)


MCTS
Silver, D et al. Mastering the game of Go with deep neural networks and tree search (Nature 2016)


DeepLab
Liang-Chieh Chen et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs (TPAMI 2016)


Face Rigging
Reconstruction of Personalized 3D Face Rigs from Monocular Video (ACM 2016)


R-FCN
Jifeng Dai et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks (2016)


LIME(Local Interpretable Model-agnostic Explanations)
Marco Tulio Ribeiro et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier (NAACL 2016)


Subword Model
Rico Sennrich et al. Neural Machine Translation of Rare Words with Subword Units (ACL 2016)


Monte Carlo Dropout
Yarin Gal et al. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (ICML 2016)


Graph Autoencoders
Thomas N. Kipf et al. Variational Graph Auto-Encoders (NeurIPS 2016)


Document Classification
Z Yang et al. Hierarchical Attention Networks for Document Classification (NAACL 2016)


Visual Intelligence
Brenden M. Lake et al. Building Machines That Learn and Think Like People (NeurIPS 2016)


mini-ImageNet
Vinyals et al. Matching Networks for One Shot Learning (NeurIPS 2016)


IoU Loss
Jiahui Yu et al. UnitBox: An Advanced Object Detection Network (ACM MM 2016)


XGBoost
Tianqi Chen et al. XGBoost: A Scalable Tree Boosting System (KDD 2016)


DAVIS (Densely Annotated VIdeo Segmentation)
Perazzi et al. A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation (IEEE 2016)


S3DIS (Stanford 3D Indoor Scene Dataset (S3DIS))
Armeni et al. 3D Semantic Parsing of Large-Scale Indoor Spaces (IEEE 2016)


Universal Dependencies
Nivre et al. Universal Dependencies v1: A Multilingual Treebank Collection (LREC 2016)


CheXpert
Irvin et al. CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison (AAAI 2019)


VCTK (CSTR VCTK Corpus)
Veaux et al. CSTR VCTK corpus: English multi-speaker corpus for CSTR voice cloning toolkit (2016)


MIMIC-III (The Medical Information Mart for Intensive Care III)
Johnson et al. MIMIC-III, a freely accessible critical care database (2016)


SQuAD (Stanford Question Answering Dataset)
Rajpurkar et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text (EMNLP 2016)


Cityscapes
Cordts et al. The Cityscapes Dataset for Semantic Urban Scene Understanding (CVPR 2016)


MS MARCO (Microsoft Machine Reading Comprehension Dataset)
Bajaj et al. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset (EMNLP 2016)


TensorFlow
Martín Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems (2016)


OpenAI Gym
Brockman et al. OpenAI Gym (2016)


Image Style Transfer
Leon A. Gatys et al. Image Style Transfer Using Convolutional Neural Networks (CVPR 2016)


Deep speech 2
Amodei, D et al. Deep speech 2: End-to-end speech recognition in english and mandarin. (ICML 2016)


Continual Learning
James Kirkpatrick et al. Overcoming catastrophic forgetting in neural networks (2016)


Federated Learning
H. Brendan McMahan et al. Communication-Efficient Learning of Deep Networks from Decentralized Data (2016)
Jakub Konečný et al. Federated Learning: Strategies for Improving Communication Efficiency (2018)
Tian Li et al. Federated Learning: Challenges, Methods, and Future Directions (2020)


Random Erasing
Sachin Ravi et al. Optimization as a Model For Few-Shot Learning (2016)
Zhun Zhong et al. Random Erasing Data Augmentation (2017)


Sophia
Ben Goertzel et al. Loving AI: Humanoid Robots as Agents of Human Consciousness Expansion (summary of early research progress) (2017)


RetinaNet
Tsung-Yi Lin et al. Focal loss for dense object detection (2017)


Mask R-CNN
Kaiming He et al. Mask R-CNN for Object Detection and Segmentation (2017)


PPO
Schulman, John, et al. Proximal policy optimization algorithms (2017)


NAS(Neural Architecture Search)
Barret Zoph et al. Neural Architecture Search with Reinforcement Learning (ICLR 2017)


SHAP(Shapley Additive Explanations)
Scott Lundberg et al. A Unified Approach to Interpreting Model Predictions (NeurIPS 2017)


Graph Convolutional Networks(GCN)
Thomas N. Kipf et al. Semi-Supervised Classification with Graph Convolutional Networks (ICLR 2017)


Image Restoration
Dmitry Ulyanov et al. Deep Image Prior (2017)


Open-Domain Question Answering
Danqi Chen et al. Reading Wikipedia to Answer Open-Domain Questions. (ACL 2017)
Karpukhin et al. Dense Passage Retrieval for Open-Domain Question Answering. (EMNLP, 2020)


GraphSAGE
William L. Hamilton et al. Inductive Representation Learning on Large Graphs (NeurIPS 2017)


Loss Function
Dong Yu et al. Permutation Invariant Training of Deep Models for Speaker-independent Multi-talker Speech Separation (IEEE 2017)


Places
Zhou et al. Places: A 10 Million Image Database for Scene Recognition (IEEE 2017)


Meta Learning
Chelsea Finn et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (ICML 2017)


Weight & Activation Quantizer
Shuchang Zhou et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients (IEEE 2017)


OpenNMT
Guillaume Klein et al. OpenNMT: Open-Source Toolkit for Neural Machine Translation (ACL 2017)


CARLA (Car Learning to Act)
Dosovitskiy et al. CARLA: An Open Urban Driving Simulator (2017)


MobileNet
Andrew G. Howard et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications (2017)


VoxCeleb1
Nagrani et al. VoxCeleb: a large-scale speaker identification dataset (Interspeech 2017)


Kinetics (Kinetics Human Action Video Dataset)
Kay et al. The Kinetics Human Action Video Dataset (2017)


ScanNet
Dai et al. ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes (CVPR 2017)


AudioSet
Jort F. Gemmeke et al. Audio Set: An ontology and human-labeled dataset for audio events (2017)


Fashion-MNIST
Xiao et al. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms (2017)


Visual Genome
Krishna et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations (2017)


AlphaGo
Silver, D et al. Mastering the game of Go without human knowledge (2017)


AlphaZero
David Silver et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (2017)


Text Style Transfer
Tianxiao Shen et al. Style Transfer from Non-Parallel Text by Cross-Alignment (NeurIPS 2017)


Transformer
A. Vaswani et al. Attention is all you need (2017)


BERT
J. Devlin et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018)


GPT
Alec Radford et al. Improving Language Understanding by Generative Pre-Training (2018)


GPT-2
Alec Radford et al. Language Models are Unsupervised Multitask Learners (2019)


RoBERTa
Yinhan Liu et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach (2019)


Reinforcement Learning
Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction (2018)
David Silver et al. Reward is enough (Artificial Intelligence 2021)


CornerNet
Hei Law et al. CornerNet: Detecting Objects as Paired Keypoints (ECCV 2018)


AutoEncoder-based Recommendation System
Dawen Liang et al. Variational Autoencoders for Collaborative Filtering. (WWW 2018)


VoxCeleb2
Chung et al. VoxCeleb2: Deep Speaker Recognition (ISCA 2018)


GLUE
Wang et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (EMNLP 2018)


GAT
Petar Veličković et al. Graph Attention Networks (ICLR 2018)


fastMRI
Zbontar et al. fastMRI: An Open Dataset and Benchmarks for Accelerated MRI (2018)


Speech Commands
Pete Warden. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition (2018)


MultiNLI (Multi-Genre Natural Language Inference)
Williams et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference (NAACL 2018)


Session-based Recommendation System
Self-Attentive Sequential Recommendation (ICDM 2018)


BLAS (Basic Linear Algebra Subprograms)
C. Nugteren. CLBlast: A tuned OpenCL BLAS library (2018)


Low Distortion & Good Perceptual Quality
Yochai Blau et al. The Perception-Distortion Tradeoff (CVPR 2018)


Albumentations
Alexander Buslaev et al. Albumentations: fast and flexible image augmentations (2018)


PINN
Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations (Journal of Computational Physics 2019)


DeepXDE
DeepXDE: A Deep Learning Library for Solving Differential Equations (Journal of Computational Physics 2019)


ERNIE
Yu Sun et al. ERNIE: Enhanced Representation through Knowledge Integration (2019)


AlphaStar
Google DeepMind. AlphaStar: Mastering the real-time strategy game StarCraft II (2019)


EfficientNet
Mingxing Tan et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019)


Backdoor Attack
Tianyu Gu et al. BadNets: Evaluating Backdooring Attacks on Deep Neural Networks (NeurIPS 2019)


SentencePiece
Taku Kudo et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing (ACL 2019)


SpecAugment
Daniel S. Park et al. SpecAugment: A simple data augmentation method for automatic speech recognition (Interspeech 2019)


EDA
Jason Wei et al. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks (EMNLP-IJCNLP 2019)


Label Smoothing
Rafael Müller et al. When Does Label Smoothing Help? (NeurIPS 2019)


GIoU Loss
Hamid Rezatofighi et al. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression (CVPR 2019)


AutoAugment
Ekin D. Cubuk et al. AutoAugment: Learning Augmentation Policies from Data (CVPR 2019)


SciPy
Pauli Virtanen et al. SciPy 1.0--Fundamental Algorithms for Scientific Computing in Python (2019)


Natural Questions
Kwiatkowski et al. Natural Questions: a Benchmark for Question Answering Research (TACL 2019)


CoLA (Corpus of Linguistic Acceptability)
Warstadt et al. Neural Network Acceptability Judgments (TACL 2019)


Capture of Facial Geometry
Thabo Beeler et al. High-Quality Single-Shot Capture of Facial Geometry (ACM 2019)


Data Augmentation
Jason Wei et al. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks (EMNLP-IJCNLP 2019)


SuperGLUE
Wang et al. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems (NeurIPS 2019)


Hugging Face
Thomas Wolf et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing (EMNLP 2019)


GPU
Jeff Johnson et al. Billion-scale similarity search with GPUs (IEEE 2019)


FFHQ (Flickr-Faces-HQ)
Karras et al. A Style-Based Generator Architecture for Generative Adversarial Networks (CVPR 2019)


T5
Colin Raffel et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (The Journal of Machine Learning Research 2019)


RAG
Patrick Lewis et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (NeurIPS 2020)


MoCo
Kaiming He et al. Momentum Contrast for Unsupervised Visual Representation Learning (2019)


Robotics: Vision and Touch Representation
Michelle A. Lee et al. Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks (ICRA 2019)


OCGAN
Pramuditha Perera et al. OCGAN: One-class Novelty Detection Using GANs with Constrained Latent Representations (CVPR 2019)


MuZero
Julian Schrittwieser et al. Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model (Nature 2020)


wav2vec
S. Schneider et al. wav2vec: Unsupervised pre-training for speech recognition (Interspeech 2019)
Alexei Baevski et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (NeurIPS 2020)


GPT-3, Prompt Tuning
Brown et al. Language Models are Few-Shot Learners (NeurIPS 2020)


UMAP(Uniform Manifold Approximation and Projection)
Leland McInnes et al. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction (2020)


DETR
Nicolas Carion et al. End-to-End Object Detection with Transformers (2020)


SimCLR
Ting Chen et al. A Simple Framework for Contrastive Learning of Visual Representations (PMLR 2020)
Ting Chen et al. Big Self-Supervised Models are Strong Semi-Supervised Learners (2020)


UDA(Unsupervised Data Augmentation)
Qizhe Xie et al. Unsupervised Data Augmentation for Consistency Training (NeurIPS 2020)


MLIR (Multi Level Intermediate Representation)
C. Lattner et al. MLIR: A compiler infrastructure for the end of Moore’s law (2020)


GNN-based Recommendation System
Xiangnan He et al. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation (SIGIR 2020)


OGB (Open Graph Benchmark)
Hu et al. Open Graph Benchmark: Datasets for Machine Learning on Graphs (NeurIPS 2020)


nuScenes
Caesar et al. nuScenes: A multimodal dataset for autonomous driving (CVPR 2020)


CORD-19
Wang et al. CORD-19: The COVID-19 Open Research Dataset (ACL 2020)


NeRF(Neural Radiance Fields)
Mildenhall et al. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (ECCV 2020)


Retrieval-Augmented Generation
Patrick Lewis et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (NeurIPS 2020)


Vision Transformer(ViT)
Alexey Dosovitskiy et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (2020)


Reinforcement Learning from Human Feedback (RLHF)
Nisan Stiennon et al. Learning to summarize from human feedback (NeurIPS 2020)
Long Ouyang et al. Training language models to follow instructions with human feedback (2022)


BART
Mike Lewis et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (ACL 2020)


Scaling Law
Jared Kaplan et al. Scaling Laws for Neural Language Models (2020)
Tom Henighan et al. Scaling Laws for Autoregressive Generative Modeling (2020)
Leo Gao et al. Scaling Laws for Reward Model Overoptimization (2022)
Aidan Clark et al. Unified Scaling Laws for Routed Language Models (2022)
Yi Tay et al. Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers (ICLR 2022)
Jordan Hoffmann et al. Training Compute-Optimal Large Language Models (NeurIPS 2022)
Jason Wei et al. Emergent Abilities of Large Language Models (TMLR 2022)


DeepONet
Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators (Nature machine intelligence 2021)


Prefix Tuning
Xiang Lisa Li and Percy Liang. Prefix-tuning: Optimizing continuous prompts for generation. (ACL 2021)


Generative Artificial Intelligence(GAI)
R Bommasani et al. On the opportunities and risks of foundation models (2021)


Swin Transformer
Ze Liu et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows (2021)


Reinforcement Learning
David Silver et al. Reward is enough (Artificial Intelligence 2021)


Text to Image Generation
Aditya Ramesh et al. DALL-E: Zero-Shot Text-to-Image Generation (JMLR 2021)
Aditya Ramesh et al. Hierarchical Text-Conditional Image Generation with CLIP Latents (2022)
James Betker et al. Improving Image Generation with Better Captions (2023)


ROUGE
Chin-Yew Lin. ROUGE: A Package for Automatic Evaluation of Summaries (2004)


Stable Diffusion
Rombach et al. High-Resolution Image Synthesis with Latent Diffusion Models (2021)
Lvmin Zhang et al. Adding Conditional Control to Text-to-Image Diffusion Models (2023)


AlphaFold
Richard Evans et al. Protein complex prediction with AlphaFold-Multimer. (2021)
John Jumper et al. Highly accurate protein structure prediction with AlphaFold (Nature 2021)
Patrick Bryant et al. Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search (Nature Communications 2022)
Josh Abramson et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3 (Nature 2024)


AlphaProteo
Vinicius Zambaldi et al. De novo design of high-affinity protein binders with AlphaProteo (2024)


IMDb Movie Reviews
Andrew L. Maas et al. Learning Word Vectors for Sentiment Analysis (ACL 2011)


PyTorch
Adam Paszke et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library (NeurIPS 2019)


Stochastic Parrot
Emily M. Bender et al. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜 (2021)


Superposition Hypothesis
Elhage et al. Toy Models of Superposition (2022)


LangChain
Shunyu Yao et al. ReAct: Synergizing Reasoning and Acting in Language Models (ICLR 2022)


AlphaCode
Yujia Li et al. Competition-Level Code Generation with AlphaCode (2022)


GraphCast
Remi Lam et al. GraphCast: Learning skillful medium-range global weather forecasting (2022)


OPT
Susan Zhang et al. OPT: Open Pre-trained Transformer Language Models (2022)


InstructGPT
Long Ouyang et al. Training language models to follow instructions with human feedback (2022)


PaLM
Aakanksha Chowdhery et al. PaLM: Scaling Language Modeling with Pathways (2022)


Chain-of-Thought Prompting
Jason Wei et al. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022)


Non-language Task
Tuan Dinh et al. LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks (NeurIPS 2022)


3D Generator
Eric R. Chan et al. EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks (CVPR 2022)


Joint Embedding Predictive Architecture(JEPA)
Yann LeCun et al. A Path Towards Autonomous Machine Intelligence (2022)


Neural Jacobian Fields (NJF)
Noam Aigerman et al. Neural Jacobian Fields: Learning Intrinsic Mappings of Arbitrary Meshes (ACM 2022)
Sizhe Lester Li et al. Controlling diverse robots by inferring Jacobian fields with deep networks (Nature 2025)


Qwen-VL
Jinze Bai et al. Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond (2023)
Peng Wang et al. Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution (2024)
Shuai Bai et al. Qwen2.5-VL Technical Report (2025)


Gemini
Google. Gemini: A Family of Highly Capable Multimodal Models (2023)
Google. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context (2024)
Google DeepMind. Gemini Robotics: Bringing AI into the Physical World (2025)
Google. Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities (2025)


GPT-4
Baolin Peng et al. Instruction Tuning with GPT-4 (EMNLP 2023)
OpenAI. GPT-4 Technical Report (2023)


LLaMA
Touvron et al. LLaMA: Open and Efficient Foundation Language Models (2023)


LLaVA
Haotian Liu et al. Visual Instruction Tuning (NeurIPS 2023)


Alpaca, Instruction Tuning
Rohan Taori et al. Alpaca: A Strong, Replicable Instruction-Following Model (2023)


Large Language Model (LLM)
Tyna Eloundou et al. GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models (2023)
Brandon C. Roy et al. Predicting the birth of a spoken word (2015)
Tom McCoy et al. Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference (ACL 2019)
Qihuang Zhong et al. Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT (2023)
Lukas Berglund et al. The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A" (2023)


Quantization
Tim Dettmers et al. LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale (NeurIPS 2022)
Tim Dettmers et al. QLoRA: Efficient Finetuning of Quantized LLMs (NeurIPS 2023)
Hongyu Wang et al. BitNet: Scaling 1-bit Transformers for Large Language Models (2023)
Shuming Ma et al. The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits (2024)


DPO
Rafael Rafailov et al. Direct Preference Optimization: Your Language Model is Secretly a Reward Model (NeurIPS 2023)


PINN (Physics-informed neural network) VS FEM(Finite Element Method)
T.G. Grossmann et al. Can Physics-Informed Neural Networks beat the Finite Element Method? (2024)
A.S. Krishnapriyan et al. Characterizing possible failure modes in physics-informed neural networks. (NeurIPS 2021)


Generalist Medical AI (GMAI)
Michael Moor et al. Foundation models for generalist medical artificial intelligence (Nature 2023)


Med-PaLM
Karan Singhal et al. Large language models encode clinical knowledge (Nature 2023)
Karan Singhal et al. Towards Expert-Level Medical Question Answering with Large Language Models (2023)
Tao Tu et al. Towards Generalist Biomedical AI (2023)


Provide responses to patient questions
John W. Ayers et al. Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum (JAMA Intern Med 2023)


Clinical Text Summarization
Dave Van Veen et al. Adapted large language models can outperform medical experts in clinical text summarization (Nature 2024)


Titans
Ali Behrouz et al. Titans: Learning to Memorize at Test Time (2024)


Semantic Planner
Ma et al. Eureka: Human-Level Reward Design via Coding Large Language Models (ICLR 2024)


AlphaGeometry
Trieu H. Trinh et al. Solving olympiad geometry without human demonstrations (2024)


AlphaProof
Google DeepMind. AI achieves silver-medal standard solving International Mathematical Olympiad problems (2024)


Habsburg AI
Ilia Shumailov et al. The Curse of Recursion: Training on Generated Data Makes Models Forget (2023)
Ilia Shumailov et al. AI models collapse when trained on recursively generated data (Nature 2024)


Machine Unlearning
Weijia Shi et al. Detecting Pretraining Data from Large Language Models (ICLR 2024)
Martin Pawelczyk et al. In-Context Unlearning: Language Models as Few-Shot Unlearners (ICML 2024)
Michael Duan et al. Do Membership Inference Attacks Work on Large Language Models? (COLM 2024)


Claude
Adly Templeton et al. Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet (2024)


Hunyuan3D
Tencent. Hunyuan3D 1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation (2024)
Tencent. Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation (2025)
Tencent. Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material (2025)


Visual AutoRegressive Modeling
Keyu Tian et al. Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction (NeurIPS 2024)


Veo & Imagen
Google DeepMind. State-of-the-art video and image generation with Veo 2 and Imagen 3 (2024)


Antagonistic AI
Alice Cai et al. Antagonistic AI (2024)


Semantic Entropy
Sebastian Farquhar et al. Detecting hallucinations in large language models using semantic entropy (Nature 2024)


RT-X
Abby O’Neill et al. Open X-Embodiment: Robotic Learning Datasets and RT-X Models (2024)


SmolLM
Loubna Ben Allal et al. SmolLM - blazingly fast and remarkably powerful (2024)
Loubna Ben Allal et al. SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (2025)


MobileLLM
Zechun Liu et al. MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024)


The Nobel Prize in Physics 2024
John J. Hopfield, Geoffrey E. Hinton


The Nobel Prize in Chemistry 2024
David Baker, Demis Hassabis, John M. Jumper


Rule-Based Reward
Yecheng Jason Ma et al. Eureka: Human-Level Reward Design via Coding Large Language Models (ICLR 2024)


GRPO(Group Relative Policy Optimization)
Zhihong Shao et al. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (2024)


DeepSeek
DeepSeek-AI et al. DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence (2024)
DeepSeek-AI. DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (2024)
DeepSeek-AI. DeepSeek-V3 Technical Report (2024)
DeepSeek-AI. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (2025)


WinoGrande
Keisuke Sakaguchi et al. WinoGrande: An Adversarial Winograd Schema Challenge at Scale (2019)


HellaSwag
Rowan Zellers et al. HellaSwag: Can a Machine Really Finish Your Sentence? (ACL 2019)


Measuring Massive Multitask Language Understanding (MMLU)
Dan Hendrycks et al. Measuring Massive Multitask Language Understanding (ICLR 2021)
Yubo Wang et al. MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark (NeurIPS 2024)


Codex, HumanEval
Mark Chen et al. Evaluating Large Language Models Trained on Code (2021)


TruthfulQA
Stephanie Lin et al. TruthfulQA: Measuring How Models Mimic Human Falsehoods (ACL 2022)


FrontierMath
Elliot Glazer et al. FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI (2024)


Social Sycophancy
Myra Cheng et al. Social Sycophancy: A Broader Understanding of LLM Sycophancy (2025)


Cognitive Debt
Nataliya Kosmyna et al. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task (2025)


SWE-Lancer
Samuel Miserendino et al. SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? (2025)


ARC-AGI
Francois Chollet. On the Measure of Intelligence (2019)
Francois Chollet et al. ARC Prize 2024: Technical Report (2024)
Francois Chollet et al. ARC-AGI-2: A New Challenge for Frontier AI Reasoning Systems (2025)


Humanity's Last Exam (HLE)
Long Phan et al. Humanity's Last Exam (2025)


Agent2Agent Protocol (A2A)
Rao Surapaneni et al. Announcing the Agent2Agent Protocol (A2A) (2025)


Model Context Protocol (MCP)
Narajala, V. S. et al. Enterprise-grade security for the Model Context Protocol (MCP): Frameworks and mitigation strategies (2025)


Aurora
Cristian Bodnar et al. A Foundation Model for the Earth System (Nature 2025)


The Common Pile v0.1
Nikhil Kandpal et al. The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text (2025)


Misaligned persona
Miles Wang et al. Persona Features Control Emergent Misalignment (2025)


Agentic Misalignment
Anthropic. Agentic Misalignment: How LLMs could be insider threats (2025)


SeRL
Wenkai Fang et al. SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data (2025)


Strategic Deception
Kai Wang et al. When Thinking LLMs Lie: Unveiling the Strategic Deception in Representations of Reasoning Models (2025)


AlphaGenome
Žiga Avsec et al. AlphaGenome: advancing regulatory variant effect prediction with a unified DNA sequence model (2025)


Amazon DeepFleet
Amazon launches a new AI foundation model to power its robotic fleet and deploys its 1 millionth robot (2025)
Amazon’s tiny robot drives do the heavy lifting (2022)


Choice-Supportive Bias
Dharshan Kumaran et al. How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models (2025)


Arch-Router
Co Tran et al. Arch-Router: Aligning LLM Routing with Human Preferences (2025)


Monte Carlo Tree Diffusion (MCTD)
Jaesik Yoon et al. Monte Carlo Tree Diffusion for System 2 Planning (2025)


Inverse Scaling
Aryo Pradipta Gema et al. Inverse Scaling in Test-Time Compute (2025)


Alignment Audits
Samuel Marks et al. Auditing language models for hidden objectives (2025)


Subliminal Learning
Alex Cloud et al. Subliminal Learning: Language models transmit behavioral traits via hidden signals in data (2025)


AI Collusion Driven by “Artificial Stupidity.”
Winston Wei Dou et al. AI-Powered Trading, Algorithmic Collusion, And Price Efficiency (NBER 2025)


The General-Purpose AI Code of Practice
European Commission (2025)


MaViLa
Haolin Fan et al. MaViLa: Unlocking new potentials in smart manufacturing through vision language models (Journal of Manufacturing Systems 2025)


Wan
Alibaba. Wan: Open and Advanced Large-Scale Video Generative Models (2025)


MAI-DxO
Microsoft. The Path to Medical Superintelligence (2025)


HealthBench
Rahul K. Arora et al. HealthBench: Evaluating Large Language Models Towards Improved Human Health (2025)


AlphaEarth Foundations
Christopher F. Brown et al. AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data (2025)


Persona Vectors
Runjin Chen et al. Persona Vectors: Monitoring and Controlling Character Traits in Language Models (2025)




Reference

Annotated History of Modern AI and Deep Learning (Juergen Schmidhuber)
Classical Paper List on Machine Learning and Natural Language Processing (Zhiyuan Liu)
Award-winning classic papers in ML and NLP (Desh Raj)
Computer Vision: 10 Papers to Start (Chenxi Liu)
Awesome - Most Cited Deep Learning Papers (Terryum)
Papers With Code Machine Learning Datasets
야사와 만화로 배우는 인공지능 강의 (Artificial Intelligence Lectures Through Anecdotal History and Comics)
History of AI
The Nobel Prize in Physics 2024
The Nobel Prize in Chemistry 2024