OpenAI to get access to Reddit data to train its AI models
Reddit, the popular social news aggregation and discussion platform that bills itself as the ‘front page of the internet’, has entered into a partnership with OpenAI to integrate its vast repository of content into OpenAI’s widely used ChatGPT.
“Keeping the internet open is crucial, and part of being open means Reddit content needs to be accessible to those fostering human learning and researching ways to build community, belonging, and empowerment online. Reddit is a uniquely large and vibrant community that has long been an important space for conversation on the internet. Additionally, using LLMs, ML, and AI allow Reddit to improve the user experience for everyone. In line with this, Reddit and OpenAI today announced a partnership to benefit both the Reddit and OpenAI user communities in a number of ways,” the two companies said in a joint statement.
Traditionally, Reddit has relied heavily on its advertising business. However, this deal, along with a similar agreement with Alphabet, is among Reddit’s recent efforts to monetize its extensive user-generated content through data licensing. The partnership with Alphabet, which allows Google to use Reddit content to train its AI models, is reportedly worth around $60 million annually. With the new partnership, OpenAI will also become an advertising partner for “the front page of the internet.”
We’re partnering with Reddit to bring its content to ChatGPT and new products: https://t.co/xHgBZ8ptOE
— OpenAI (@OpenAI) May 16, 2024
Under the terms of the partnership, OpenAI will be granted access to Reddit’s Data API, essentially a digital gateway to a vast library of “real-time, structured, and unique content,” as described in a joint statement released by both companies. This dataset is intended to enhance OpenAI’s AI tools, including ChatGPT, by improving their understanding and generation of relevant content. For Reddit, the benefits are twofold. The company gains a new and potentially significant advertising partner in OpenAI, offering an avenue to diversify its revenue streams beyond traditional advertising models. Additionally, Reddit users can expect the future integration of “new AI-powered features,” the specifics of which remain undisclosed for now.
This partnership could be a game-changer for ChatGPT, helping it up the ante against rival chatbots such as Google’s Gemini. Reddit’s data provides valuable fuel for OpenAI’s research and development efforts: it can be used to train more powerful AI models and to improve their understanding of human language, social trends, and online humor. This, in turn, could lead to the development of more nuanced and contextually aware AI tools. Reddit seems a good choice in this regard, since its vast collection of real-time discussions, memes, and user-generated content offers OpenAI a window into the constantly evolving online conversation.
The announcement of the partnership has had a positive impact on Reddit’s market performance. Since its initial public offering (IPO) in March, Reddit has seen a steady increase in its stock price, driven by strong revenue growth and improved profitability. In its first earnings report as a public company, Reddit reported a 450% year-over-year increase in non-advertising revenue. Reddit’s shares are currently priced at $56.38.
“Reddit has become one of the internet’s largest open archives of authentic, relevant, and always up to date human conversations about anything and everything,” Steve Huffman, Reddit co-founder and CEO, said in the statement. “Including it in ChatGPT upholds our belief in a connected internet, helps people find more of what they’re looking for, and helps new audiences find community on Reddit.”