OpenAI Data Partnerships

We are introducing OpenAI Data Partnerships, where we’ll work together with organizations to produce public and private datasets for training AI models.

Modern AI technology learns skills and aspects of our world—of people, our motivations, interactions, and the way we communicate—by making sense of the data on which it’s trained. To ultimately make AGI that is safe and beneficial to all of humanity, we’d like AI models to deeply understand all subject matters, industries, cultures, and languages, which requires as broad a training dataset as possible.

Including your content can make AI models more helpful to you by increasing their understanding of your domain. We’re already working with many partners who are eager to represent data from their country or industry. For example, we recently partnered with the Icelandic Government and Miðeind ehf to improve GPT‑4’s ability to speak Icelandic by integrating their curated datasets. We also partnered with the non-profit organization Free Law Project, which aims to democratize access to legal understanding, to include their large collection of legal documents in AI training. We know there may be many more who also want to contribute to the future of AI research while discovering the potential of their unique data.

Data Partnerships are intended to enable more organizations to help steer the future of AI and benefit from models that are more useful to them, by including content they care about.

## The kinds of data we’re seeking

We’re interested in large-scale datasets that reflect human society and that are not already easily accessible online to the public today. We can work with any modality, including text, images, audio, or video. We’re particularly looking for data that expresses human intention (e.g., long-form writing or conversations rather than disconnected snippets), across any language, topic, and format.

We can work with data in almost any form and can use our next-generation in-house AI technology to help you digitize and structure your data. For example, we have world-class optical character recognition (OCR) technology to digitize files like PDFs, and automatic speech recognition (ASR) to transcribe spoken words. If the data needs cleaning (e.g., has lots of auto-generated artifacts or transcription errors), we can work with your team to process it into the most useful form. We are not seeking datasets with sensitive or personal information, or information that belongs to a third party; we can work with you to remove this information if you need help.

## Ways to partner with us

We currently have two ways to partner, and may expand in the future:

Overall, we are seeking partners who want to help us teach AI to understand our world in order to be maximally helpful to everyone. Together, we can move towards AGI that benefits all of humanity.

Originally published on OpenAI News.