DatFlash™ tracks real-world dataset transactions and supply signals, normalized for AI decision-making.

Reddit conversation dataset used to train large language models

Price: $70,000,000
Date: 2025
Buyer: OpenAI
Seller: Reddit
Type: DATASET_LICENSING
Region: Global
Market Context: Industry reporting suggests OpenAI may pay roughly $70M to license Reddit conversation data for training AI models.
Term: Licensing agreement (Multi-year)
Confidence: Medium-High
Citation: DatFlash (2026). "Reddit conversation dataset used to train large language models"
https://www.datflash.com/transaction/openai-reddit-reddit-conversation-dataset-used-to-train-large-language-2025