As the “chatbot wars” rage in Silicon Valley, the growing proliferation of artificial intelligence (AI) tools designed specifically to generate human-like text has left many stunned.
Educators in particular are struggling to adapt to the availability of software that can produce a reasonably competent essay on any subject in an instant. Should we go back to pen and paper assessments? Increase exam supervision? Ban the use of AI altogether?
All of these and more have been suggested. However, none of these less-than-ideal measures would be necessary if educators could reliably distinguish AI-generated from human-written text.
We’ve delved into several suggested methods and tools for recognizing AI-generated text. None of them are foolproof, all are vulnerable to workarounds, and they are unlikely to ever be as reliable as we’d like them to be.
You may wonder why the world’s leading AI companies cannot reliably distinguish the products of their own machines from the work of humans. The reason is ridiculously simple: the corporate mission in today’s high-stakes AI arms race is to train natural language processing (NLP) AIs to produce output that resembles human writing as closely as possible. The public’s demand for an easy way to spot such AIs in the wild may seem paradoxical, as if we’re missing the whole point of the exercise.
A mediocre effort
OpenAI – the creator of ChatGPT – launched a “classifier for indicating AI-written text” at the end of January.
The classifier is trained on external AIs as well as on the company’s own text-generating engines. In theory, this means it should be able to flag essays generated by Bloom AI or similar, not just those made by ChatGPT.
We give this classifier a C grade at best. OpenAI admits that it accurately identifies only 26% of AI-generated text (true positives), while human prose is mislabeled as AI-generated 9% of the time (false positives).
OpenAI has not shared its research on the rate at which AI-generated text is mislabeled as human-generated text (false negative).
A promising contender
A more promising contender is GPTZero, a classifier created by Princeton University student Edward Tian during his Christmas break.
This app identifies AI authorship based on two factors: perplexity and burstiness. Perplexity measures how complex a text is, while burstiness compares the variation between sentences. The lower the values for these two factors, the more likely it is that a text has been produced by an AI.
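As a toy illustration of the burstiness idea (this is not GPTZero’s actual code, and the sample texts are invented for the example), we can score a passage by the spread of its sentence lengths. Human prose tends to mix long and short sentences, so it scores higher:

```python
import statistics

def burstiness(text: str) -> float:
    """Toy burstiness score: standard deviation of sentence lengths
    (in words). Uniform sentence lengths, typical of AI-generated
    text, produce a low score; varied lengths produce a high one."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)

# Invented sample texts for illustration only.
human = ("I walked to the shop. It rained, hard, all the way there, "
         "and I regretted leaving my umbrella at home. Typical.")
ai = ("Justice is important. Justice helps society function. "
      "Justice ensures fairness for all people.")

print(burstiness(human) > burstiness(ai))  # varied sentences score higher
```

A real detector would combine this with perplexity, which requires a language model to estimate how “surprised” it is by each word, but the intuition is the same: low variation and low surprise point toward machine authorship.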
We pitted this humble David against ChatGPT’s Goliath.
First, we asked ChatGPT to generate a short essay on justice. We then copied the essay – unchanged – into GPTZero. Tian’s tool correctly determined that the text was probably written entirely by an AI, as the average perplexity and burstiness scores were very low.
Fooling the classifiers
An easy way to trick AI classifiers is simply to replace a few words with synonyms. Websites offering tools that paraphrase AI-generated text for this purpose are already popping up all over the internet.
Many of these tools display their own tell-tale AI giveaways, such as peppering human prose with “tortured phrases” (e.g. using “counterfeit consciousness” instead of “AI”).
To further test GPTZero, we copied ChatGPT’s justice essay into GPT-Minus1, a website that offers to “scramble” ChatGPT text with synonyms. The image on the left shows the original essay. The image on the right shows GPT-Minus1’s changes. It changed about 14% of the text.
We then copied the GPT-Minus1 version of the justice essay into GPTZero. Its verdict?
Your text is most likely written by humans, but there are some sentences with low perplexity.
It highlighted just one sentence it thought was likely written by an AI (see bottom left image), along with a report on the essay’s overall perplexity and burstiness scores, which were much higher (see bottom right image).
Tools like Tian’s are promising, but they aren’t perfect and are also vulnerable to workarounds. For example, a recently released YouTube tutorial explains how to prompt ChatGPT to produce text with a high degree of – you guessed it – perplexity and burstiness.
Another proposal is for AI-written text to contain a “watermark” that is invisible to human readers but can be picked up by software.
Natural language models generate text one word at a time, selecting each word based on statistical probability.
However, they don’t always choose words with the greatest chance of appearing together. Instead, they randomly choose one from a list of probable words (although words with higher probability scores are more likely to be selected).
This explains why users get a different output every time they generate text with the same prompt.
Simply put, watermarking means putting some of the likely words on a “black list” and allowing the AI to select words only from a “white list”. Since a human-written text is likely to contain “blacklisted” words, this could make it possible to distinguish it from an AI-generated text.
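To make the detection side of this concrete, here is a toy sketch, not any real watermarking scheme: hash each word to split the vocabulary into a “white list” and a “black list”, then measure what fraction of a text’s words fall on the white list. Watermarked AI output would draw only from the white list, so its fraction approaches 1.0, while ordinary human text hovers near 0.5:

```python
import hashlib

def greenlisted(word):
    """Toy white-list rule: hash each word and keep roughly half the
    vocabulary. Real schemes re-derive the list from preceding tokens,
    but the detection idea is the same."""
    digest = hashlib.sha256(word.lower().encode()).digest()
    return digest[0] % 2 == 0

def whitelist_fraction(text):
    """Fraction of words in the text that are on the white list."""
    words = [w.strip(".,") for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(greenlisted(w) for w in words) / len(words)
```

A detector only needs the white-list rule, not the model itself: any sufficiently long text whose fraction is far above 0.5 is statistically very unlikely to be unwatermarked human prose.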
However, watermarking also has limitations. The quality of AI-generated text may decrease if its vocabulary is limited. Furthermore, each text generator would probably have a different watermarking system, so a text would need to be checked against all of them.
Watermarks can also be bypassed by paraphrasing tools, which can insert blacklisted words or reword essay questions.
An ongoing arms race
AI-generated text detectors will become more and more sophisticated. Anti-plagiarism service Turnitin recently announced an upcoming AI writing detector with a claimed accuracy of 97%.
It will never be possible to make AI text detectors perfect, as even OpenAI acknowledges, and there will always be new ways to trick them.
As this arms race continues, we may see the rise of “contract paraphrasing”: instead of paying someone to write your assignment, you pay someone to rework your AI-generated assignment to get it past the detectors.
There are no easy answers for educators here. Technical solutions can be part of the solution, but so can new ways of teaching and assessment (which may include harnessing the power of AI).
We don’t know exactly what this will look like yet. However, we’ve spent the past year prototyping open-source AI tools for education and research in an effort to find a path between the old and the new – and you can access beta versions at Safe to fail AI.