Gleaning insight from conversations with others is what has made Reddit such a hit in the digital space. Whatever your area of interest, you can always find at least one subreddit on it, and usually you’ll have more than a few to look through. It’s one of the most engaging social networking sites if you like to take in information by reading text, and its simplicity is really nice too. 57 million people visit Reddit every day to chat and broaden their horizons on whatever subject they’re interested in, and it’s a great resource for that.
Looking at it from a different angle, you might be surprised to learn that Reddit chats have also served as a free teaching aid for companies like Google, Microsoft and, most notably these days, OpenAI. Reddit conversations have been used in the development of the giant artificial intelligence systems created by these three, and we’re seeing how those systems are already becoming a big deal in the tech industry.
This is definitely a development that everyone in our industry will take note of, given the connection. Here at 4GoodHosting we’re like any other quality Canadian web hosting provider in that seeing an online chat forum become an open-source type of asset for these big players in the digital world is something we’ll be keen to follow, and to share here on our blog.
Charging for Access
It is definitely interesting to see that Reddit now wants to be paid for access to its application programming interface (A.P.I.). That’s the method through which outside entities can download and process the social network’s vast collection of person-to-person conversations, and the heads of the company don’t think they should be making that value available for free.
So what we’ll see now is Reddit charging for access to the conversations it hosts when the purpose is developing A.I. systems like ChatGPT, OpenAI’s popular program. Reddit is also acting as it prepares for a possible initial public offering on Wall Street this year. As of now it makes most of its money through advertising and e-commerce transactions on its platform, and the discussion is still ongoing around what it will charge for A.P.I. access.
The New Bots
Here are the chatbots powered by artificial intelligence that have been utilizing Reddit as a development resource:
ChatGPT – the MUCH talked about artificial intelligence language model from the research lab OpenAI. It is able to respond to complex questions, write poetry, generate code, translate languages or even plan vacations. GPT-4, introduced in mid-March, is even able to respond to images.
Bing – Microsoft has a similar chatbot, capable of having open-ended text conversations on virtually any topic. When it first came out, though, it apparently gave occasional inaccurate, misleading and weird responses that put it in a bit of a negative light in comparison.
Bard – This is Google’s chatbot, originally conceived as a creative tool designed to draft emails and poems. It is notable for the way it is able to generate ideas, answer questions with facts or opinions, or write blog posts entirely on its own.
Ernie – This is the lesser light of the group for sure, but Chinese search giant Baidu came out with its rival to ChatGPT in March. The name is short for Enhanced Representation through Knowledge Integration, but it hasn’t made anything near the splash that the other three have.
LLMs
L.L.M.s (large language models) are sophisticated algorithms that companies like Google and OpenAI have developed. The algorithms are able to take Reddit conversations as data, and that data is then added to the vast pool of material being fed into the L.L.M.s to develop them.
Other kinds of companies are finding that their data is valuable too, and not only text. Most people will know Shutterstock, an image hosting service. It sold image data to OpenAI to help create DALL-E, the A.I. program that generates vivid graphical imagery from nothing more than a text-based prompt.
Artificial intelligence makers need two significant things to ensure their models continue to improve. The first is a staggeringly large amount of computing power, and the second is an enormous amount of data. Many of the major A.I. developers have plenty of computing power but still need to go outside their own networks for the data needed to improve their algorithms.
Other sources utilized alongside Reddit are Wikipedia, millions of digitized books, and academic articles. Reddit has also had a long-standing symbiotic relationship with the search engines of companies like Google and Microsoft, which have been crawling Reddit’s web pages in order to index information and make it available in search results.
With L.L.M.s, though, the dynamic is different, as they take in as much data as they can to create new A.I. systems like the chatbots. Reddit’s A.P.I. is still going to be free to developers who want to build applications that help people use Reddit, and there’s also an aim to incorporate more machine learning into how the site operates. One possible benefit is that it might identify A.I.-generated text on Reddit and label it as such.