Summarizing books with human feedback

To safely deploy powerful, general-purpose artificial intelligence in the future, we need to ensure that machine learning models act in accordance with human intentions. This challenge has become known as the _alignment problem_.

A scalable solution to the alignment problem needs to work on tasks where model outputs are difficult or time-consuming for humans to evaluate. To test scalable alignment techniques, we trained a model to summarize entire books, as shown in the following samples.[A] Our model works by first summarizing small sections of a book, then summarizing those summaries into a higher-level summary, and so on.

Our best model is fine-tuned from GPT‑3 and generates sensible summaries of entire books, sometimes even matching the average quality of human-written summaries: it achieves a 6/7 rating (similar to the average human-written summary) from humans who have read the book 5% of the time, and a 5/7 rating 15% of the time. Our model also achieves state-of-the-art results on the BookSum dataset for book-length summarization. A zero-shot question-answering model can use our model’s summaries to obtain competitive results on the NarrativeQA dataset for book-length question answering.[B]

## Our approach: combining reinforcement learning from human feedback and recursive task decomposition

Consider the task of summarizing a piece of text. Large pretrained models aren’t very good at summarization. In the past we found that training a model with reinforcement learning from human feedback helped align model summaries with human preferences on short posts and articles. But judging summaries of entire books directly takes a lot of effort, since a human would need to read the entire book, which takes many hours.

To address this problem, we additionally make use of _recursive task decomposition_: we procedurally break up a difficult task into easier ones. In this case we break up summarizing a long piece of text into summarizing several shorter pieces. Compared to an end-to-end training procedure, recursive task decomposition has the following advantages:

1. Decomposition allows humans to evaluate model summaries more quickly by using summaries of smaller parts of the book rather than reading the source text.
2. It is easier to trace the summary-writing process. For example, you can trace where in the original text certain events from the summary happen. See for yourself on our summary explorer!
3. Our method can be used to summarize books of unbounded length, unrestricted by the context length of the transformer models we use.
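The recursive procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the training setup from the paper: `summarize_chunk` is a hypothetical placeholder standing in for a call to the fine-tuned model (here it simply keeps each chunk’s first sentence), and the chunking is a naive greedy split by sentence.

```python
def summarize_chunk(text: str) -> str:
    """Placeholder for a model call: keep only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."

def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack sentences into chunks of at most max_chars."""
    chunks, current = [], ""
    for sentence in text.split(". "):
        sentence = sentence.rstrip(".") + "."
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = (current + " " + sentence).strip()
    if current:
        chunks.append(current)
    return chunks

def recursive_summarize(text: str, max_chars: int = 200) -> str:
    """Summarize each chunk, concatenate the summaries, and repeat
    on the result until it fits within a single chunk."""
    while len(text) > max_chars:
        summaries = [summarize_chunk(c) for c in split_into_chunks(text, max_chars)]
        new_text = " ".join(summaries)
        if len(new_text) >= len(text):  # guard against non-shrinking passes
            break
        text = new_text
    return text
```

Because each pass only ever sees chunks of bounded size, the same loop handles books of any length; this is the property that frees the method from the transformer’s context-length limit.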

## Why we are working on this

This work is part of our ongoing research into aligning advanced AI systems, which is key to our mission. As we train our models to do increasingly complex tasks, making informed evaluations of the models’ outputs will become increasingly difficult for humans. This makes it harder to detect subtle problems in model outputs that could lead to negative consequences when these models are deployed. Therefore, we want our ability to evaluate our models to increase as their capabilities increase.

Our current approach to this problem is to empower humans to evaluate machine learning model outputs using assistance from other models. In this case, to evaluate book summaries we give humans individual chapter summaries written by our model, which saves them time relative to reading the source text. Our progress on book summarization is the first large-scale empirical work on scaling alignment techniques.

Going forward, we are researching better ways to assist humans in evaluating model behavior, with the goal of finding techniques that scale to aligning artificial general intelligence.

_We’re always looking for more talented people to join us; if this work interests you, please apply to join our team!_

[A] These samples were selected from works in the public domain, and are part of GPT‑3’s pretraining data. To control for this effect, and purely for research purposes, our paper evaluates summaries of books the model has never seen before.

[B] We’ve amended our original claim about results on NarrativeQA after being made aware of prior work with better results than ours.

Jeffrey Wu, Ryan Lowe, Jan Leike

We’d like to acknowledge our paper co-authors: Long Ouyang, Daniel Ziegler, Nisan Stiennon, and Paul Christiano.

Thanks to the following for feedback on this release: Steve Dowling, Hannah Wong, Miles Brundage, Gretchen Krueger, Ilya Sutskever, and Sam Altman.

Design: Justin Jay Wang

Book Cover Artwork: DALL·E

Originally published on OpenAI News.