Favorite Papers
Attention Is All You Need
#01
Vaswani et al.
NeurIPS 2017
The death of RNNs and the birth of the parallelizable sequence model.
Language Models are Few-Shot Learners
#02
Brown et al.
NeurIPS 2020
GPT-3 and the realization that scale is a quality all its own.
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
#03
Wei et al.
NeurIPS 2022
A simple prompting trick that unlocked emergent logical capabilities.
Training language models to follow instructions with human feedback
#04
Ouyang et al.
NeurIPS 2022
The core mechanics of RLHF; aligning predicted text with human intent.
Constitutional AI: Harmlessness from AI Feedback
#05
Bai et al.
Anthropic
Scalable oversight: using a model to supervise another model's safety.
Computing Machinery and Intelligence
#06
Alan Turing
Mind (1950)
The original question: 'Can machines think?' and the imitation game.
A Mathematical Theory of Communication
#07
Claude Shannon
The Bell System Technical Journal (1948)
Information entropy defined. The bedrock of every bit we transmit.
Mastering the game of Go with deep neural networks and tree search
#08
Silver et al.
Nature 2016
AlphaGo and the triumph of MCTS paired with deep reinforcement learning.
Scaling Laws for Neural Language Models
#09
Kaplan et al.
OpenAI
The empirical predictability of error as compute and data grow.
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
#10
Rafailov et al.
NeurIPS 2023
Removing the complex reward model from RLHF for simpler alignment.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
#11
Devlin et al.
NAACL 2019
The masked language modeling revolution for context-aware embeddings.
Deep Residual Learning for Image Recognition
#12
He et al.
CVPR 2016
ResNets and the identity mapping that enabled training 1000+ layers.
Generative Adversarial Nets
#13
Goodfellow et al.
NeurIPS 2014
The zero-sum game that redefined synthetic data generation.
LoRA: Low-Rank Adaptation of Large Language Models
#14
Hu et al.
ICLR 2022
Fine-tuning billions of parameters by updating only a tiny fraction.
Chinchilla: Training Compute-Optimal Large Language Models
#15
Hoffmann et al.
DeepMind
Challenging the assumption that bigger is always better; data matters.
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
#16
Yao et al.
NeurIPS 2023
Enabling models to branch, look ahead, and backtrack during reasoning.
Voyager: An Open-Ended Embodied Agent with Large Language Models
#17
Wang et al.
ICLR 2024
Agents that learn continuously in Minecraft through a code-based skill library.
Self-Instruct: Aligning Language Models with Self-Generated Instructions
#18
Wang et al.
ACL 2023
Bootstrapping an instruction-tuned model from a raw base model.
LLaMA: Open and Efficient Foundation Language Models
#19
Touvron et al.
Meta AI
Democratizing state-of-the-art performance for the open-source community.
DALL·E: Zero-Shot Text-to-Image Generation
#20
Ramesh et al.
ICML 2021
Bridging the gap between conceptual text and high-fidelity visuals.
Deep Reinforcement Learning from Human Preferences
#21
Christiano et al.
NeurIPS 2017
Teaching a model to backflip using only human preferences.
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
#22
Silver et al.
Science 2018
AlphaZero and the endgame of superhuman general game-playing intelligence.
The Second Law of Thermodynamics
#23
Clausius / Kelvin
Historical
The inevitable heat death of every system. My favorite physics constraint.
The Bitter Lesson
#24
Rich Sutton
Essay (2019)
The hard truth: compute-heavy methods eventually crush human-engineered cleverness.
ImageNet Classification with Deep Convolutional Neural Networks
#25
Krizhevsky et al.
NeurIPS 2012
AlexNet: the Big Bang of the modern deep learning era.