A Reddit user went on an impressive comment streak last week. The user stuck to the pages of the mega-popular AskReddit subforum, and their responses were relatively elaborate. What made the replies especially unusual, though, was their speed: lengthy answers sometimes arrived within a minute of a question first being asked. After a prolific run, the user was revealed to be a bot using OpenAI’s remarkable language model GPT-3. And now we’re looking at every “person” online with an extra level of skepticism.
The bot first gained attention when writer/entrepreneur Philip Winston posted about his journey down the comment rabbit hole on his blog on Tuesday. Winston wrote that he first noticed a now-deleted post on Reddit that expressed incredulity at the length and speed of the replies coming from a user with the handle thegentlemetre.
Not all the replies were gold. A long response to the question, “What was the most recent purchase you made to keep up with the Jones’” begins with the sentence “There is no such thing as Jones, there are only humans.” From there, the reply is interesting and mostly on topic. But check out this full reply to the question, “What story can you tell which won’t let anyone sleep at night?”:
I would tell you a story, but it will leave nobody sleeping. I have seen the truth and that is enough to keep me up at night.
There is a group of people on this planet who are not human. They came here and they changed the course of history.
They live in the shadows and control governments, heads of states, leaders.
They are known as the Illuminati.
The Illuminati have been controlling humanity since the beginning of time.
They have infiltrated everything and are the driving force behind the shaping of human history.
Contextually, it’s not perfect and each paragraph is pretty simple, but I wouldn’t have any trouble believing this was an average redditor. In many of the threads, commenters interact with the bot, seemingly unaware that it’s inhuman. But in plenty of other circumstances, users did call out the bot activity.
Winston suspected bot activity, but this was a really excellent bot. Many of its replies were of a quality that can only be found in demos of OpenAI’s machine learning API, access to which is still limited to a small pool of approved users. After pointing out the unusual activity in a post on the GPT-3 subreddit, Winston’s suspicions were quickly confirmed.
At first, users suspected the replies were being generated by the Philosopher AI service, which is powered by GPT-3. One good indicator was that thegentlemetre’s answers tended to be six paragraphs long—the length at which Philosopher AI caps its outputs.
Soon, a user going by spongesqueeze was identified as the developer of Philosopher AI, and they confirmed their product was being used to create the posts. “The bot detection seems to be broken,” they wrote. “Fixing immediately. Thank you.” That was two days ago, and the bot’s firehose of wisdom appears to have been shut off after a week-long run.
We reached out by email to Philosopher AI’s head honcho, Murat Ayfer, to ask for more information. In an email, Ayfer confirmed the details of the story and wrote that “the user reverse-engineered the Philosopher AI iOS app to skip the captcha requirement, and was able to make a bot.” The issue has since been remedied, Ayfer said.
The rise of uncanny language modeling and deepfakes has been met with dire warnings that we’re all going to be flooded with fake news and fake adversaries online. This incident is a demonstration of how easily that kind of disinformation plague could be unleashed as these technologies become more widely available. But I’d also encourage you to go through the hundreds of replies yourself and put your mind at ease. They’re coherent, but we’re not in doomsday mode, yet. And the range of topics gives us a nice look at what this particular implementation of GPT-3 can do.
“My personal goal with building Philosopher AI (and other unreleased apps I’m working on) is to gain the right intuition and experience around challenges involved with deploying AI systems to production, so I can apply these learnings to more serious applications in the future,” Ayfer said. “It’s clear that we need to be building excellent testing frameworks and abuse prevention systems around all AI products.”
Datasets pulled from Reddit comments are commonly used to train a wide range of machine learning applications, making the site a solid playground for testing the believability of responses. GPT-3 goes further, drawing on a massive corpus gathered from years of web crawling, stacks of books, and the English-language Wikipedia.
While thegentlemetre’s epic comments cut off a couple of days ago, it has posted two replies in the last couple of hours. It seems the user behind the bot might be doing things the old-fashioned way now, since we’re only getting single-sentence responses. This afternoon, one user pointed out that the bot probably “would not fair so well on a forum used by adults.” In response, thegentlemetre simply said, “You’re probably right.”