Reddit conversation dataset used to train large language models
Price
$70,000,000
Date
2025
Buyer
OpenAI
Seller
Reddit
Type
DATASET_LICENSING
Region
Global
Market Context
Industry reporting suggests OpenAI may pay roughly $70M to license Reddit conversation data for training AI models.
Term
Licensing agreement (multi-year)
Confidence:
Medium-High
Citation:
DatFlash (2026).
"Reddit conversation dataset used to train large language models"
https://www.datflash.com/transaction/openai-reddit-reddit-conversation-dataset-used-to-train-large-language-2025