Favorite Papers
Attention Is All You Need
#01
Vaswani et al.
NeurIPS 2017
The death of RNNs and the birth of the parallelizable sequence model.
Language Models are Few-Shot Learners
#02
Brown et al.
NeurIPS 2020
GPT-3 and the realization that scale has a quality all its own.
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
#03
Wei et al.
NeurIPS 2022
A simple prompting trick that unlocked emergent logical capabilities.
Training language models to follow instructions with human feedback
#04
Ouyang et al.
NeurIPS 2022
The core mechanics of RLHF; aligning predicted text with human intent.
Constitutional AI: Harmlessness from AI Feedback
#05
Bai et al.
Anthropic
Scalable oversight: using a model to supervise another model's safety.
Computing Machinery and Intelligence
#06
Alan Turing
Mind (1950)
The original question: 'Can machines think?' and the imitation game.
A Mathematical Theory of Communication
#07
Claude Shannon
The Bell System Technical Journal (1948)
Information entropy defined. The bedrock of every bit we transmit.
Mastering the game of Go with deep neural networks and tree search
#08
Silver et al.
Nature 2016
AlphaGo and the triumph of pairing MCTS with deep reinforcement learning.
Scaling Laws for Neural Language Models
#09
Kaplan et al.
OpenAI
The empirical predictability of loss as compute, data, and model size grow.
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
#10
Rafailov et al.
NeurIPS 2023
Removing the complex reward model from RLHF for simpler alignment.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
#11
Devlin et al.
NAACL 2019
The masked language modeling revolution for context-aware embeddings.
Deep Residual Learning for Image Recognition
#12
He et al.
CVPR 2016
ResNets and the identity mapping that enabled training 1000+ layers.
Generative Adversarial Nets
#13
Goodfellow et al.
NeurIPS 2014
The zero-sum game that redefined synthetic data generation.
LoRA: Low-Rank Adaptation of Large Language Models
#14
Hu et al.
ICLR 2022
Fine-tuning billions of parameters by updating only a tiny fraction.
Chinchilla: Training Compute-Optimal Large Language Models
#15
Hoffmann et al.
DeepMind
Challenging the assumption that bigger is always better; data matters.
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
#16
Yao et al.
NeurIPS 2023
Enabling models to branch, look ahead, and backtrack during reasoning.
Voyager: An Open-Ended Embodied Agent with Large Language Models
#17
Wang et al.
ICLR 2024
Agents that learn continuously in Minecraft through a code-based skill library.
Self-Instruct: Aligning Language Models with Self-Generated Instructions
#18
Wang et al.
ACL 2023
Bootstrapping an instruction-tuned model from a raw base model.
LLaMA: Open and Efficient Foundation Language Models
#19
Touvron et al.
Meta AI
Democratizing state-of-the-art performance for the open-source community.
DALL·E: Zero-Shot Text-to-Image Generation
#20
Ramesh et al.
ICML 2021
Bridging the gap between conceptual text and high-fidelity visuals.
Deep Reinforcement Learning from Human Preferences
#21
Christiano et al.
NeurIPS 2017
Teaching a simulated agent to backflip using only human preferences.
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
#22
Silver et al.
Science 2018
AlphaZero: one algorithm reaching superhuman play across chess, shogi, and Go.
The Second Law of Thermodynamics
#23
Clausius / Kelvin
Historical
The inevitable heat death of every system. My favorite physics constraint.
The Bitter Lesson
#24
Rich Sutton
Essay (2019)
The hard truth: compute-heavy methods eventually crush human-engineered cleverness.
ImageNet Classification with Deep Convolutional Neural Networks
#25
Krizhevsky et al.
NeurIPS 2012
AlexNet: The Big Bang of the modern deep learning era.