OpenAI, the artificial intelligence (AI) research company behind ChatGPT and the DALL-E 2 art generator, has unveiled the highly anticipated GPT-4 model. Notably, the company has also made it readily available to the public through a paid service.
GPT-4 is a large language model (LLM), a neural network trained on massive amounts of data to understand and generate text. It is the successor to GPT-3.5, the model behind ChatGPT.
The GPT-4 model introduces a series of improvements over its predecessors. These include more creativity, more advanced reasoning, better performance in multiple languages, the ability to accept visual input, and the capacity to process significantly more text.
More powerful than the wildly popular ChatGPT, GPT-4 will no doubt inspire an in-depth exploration of its capabilities and further accelerate adoption of generative AI.
Among the numerous results highlighted by OpenAI, what immediately stands out is GPT-4’s performance on a set of standardized tests. For example, GPT-4 scores in the top 10% of test takers on a simulated US bar exam, while GPT-3.5 scores in the bottom 10%.
GPT-4 also outperforms GPT-3.5 on a range of writing, reasoning, and coding tasks. The following examples illustrate how GPT-4 displays more reliable common sense than GPT-3.5.
An AI model that sees the world
Another important development is that GPT-4 is multimodal, unlike previous GPT models. This means that it accepts both text and image input.
Examples from OpenAI show that GPT-4 is able to interpret images, explain visual humor, and provide reasoning based on visual input. Such abilities are beyond the scope of previous GPT models.
This ability to “see” could give GPT-4 a more comprehensive view of how the world works – just as humans gain more knowledge through observation. This is believed to be a key ingredient for the development of advanced AI that could bridge the gap between current models and human-level intelligence.
That said, GPT-4 is not the first language model with these capabilities. A few weeks ago, Microsoft announced Kosmos-1, a language model that accepts visual input in the same way as GPT-4. Google also recently extended its PaLM language model to take in image data and sensor data from robots. Multimodality is a growing trend in AI research.
GPT-4 can take in and generate up to 25,000 words of text, which is far more than ChatGPT’s limit of around 3,000 words.
It can handle more complex and detailed prompts and generate more elaborate pieces of writing. This allows for richer stories, more in-depth analysis, summaries of long pieces of text, and deeper conversational interactions.
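To make the difference in capacity concrete, here is a minimal Python sketch, assuming the approximate word limits quoted above. The function names are hypothetical and chosen for illustration. In practice, model limits are measured in tokens rather than words, so word counts are only a rough proxy.

```python
# Approximate word budgets quoted in the article (assumption: real limits
# are counted in tokens, so these figures are only a rough proxy).
GPT4_WORD_LIMIT = 25_000
CHATGPT_WORD_LIMIT = 3_000


def split_into_chunks(text: str, word_limit: int) -> list[str]:
    """Split text into consecutive chunks of at most word_limit words."""
    if word_limit <= 0:
        raise ValueError("word_limit must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + word_limit])
        for start in range(0, len(words), word_limit)
    ]


def chunks_needed(text: str, word_limit: int) -> int:
    """Return how many separate prompts are needed to cover the whole text."""
    return len(split_into_chunks(text, word_limit))
```

Under these assumptions, a 20,000-word document would have to be split across seven prompts at a 3,000-word limit, but fits into a single prompt at a 25,000-word limit.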
In the example below, I gave the new ChatGPT (which uses GPT-4) the full Wikipedia article on artificial intelligence and asked a specific question, which it answered accurately.
Although the GPT-4 technical report controversially gives no details on how the model was developed, all signs point to it being essentially a scaled-up version of GPT-3.5 with safety improvements. In other words, it is not a new paradigm in AI research.
OpenAI itself has said that GPT-4 is subject to the same limitations as previous language models, such as being prone to reasoning errors and biases, and fabricating false information.
That said, OpenAI’s results on GPT-4 suggest it is at least more reliable than previous GPT models.
OpenAI used human feedback to fine-tune GPT-4 to produce more useful and less problematic output. GPT-4 is much better at declining inappropriate requests and avoiding harmful content than the original ChatGPT release.
Its arrival will continue a crucial debate among critics: whether alternative approaches are needed to fundamentally solve issues of veracity and reliability, or whether throwing more data and resources at language models will eventually do the job.
It could be argued that GPT-4 represents only an incremental improvement over its predecessors in many practical scenarios. The results showed that human raters preferred GPT-4 outputs to the most advanced variant of GPT-3.5 only about 61% of the time.
GPT-4 also shows no improvement over GPT-3.5 in some tests, including English language and art history exams.
Shortly after the launch of GPT-4, Microsoft revealed that the highly controversial Bing chatbot had been running on GPT-4 all along. The announcement confirmed speculation by commentators who had noted that it was more powerful than ChatGPT.
This means that Bing offers an alternative way to use GPT-4, since it is a search engine rather than just a chatbot.
However, as anyone following AI news knows, Bing started to go a little crazy. I don’t think the new ChatGPT will follow suit, as it seems to have been heavily fine-tuned using human feedback.
In its technical report, OpenAI shows how GPT-4 can indeed go completely off the rails without this human feedback training.
My new favorite thing: Bing’s new ChatGPT bot argues with a user, winds them up about the current year being 2022, says their phone may have a virus, and says “You haven’t been a good user”
Why? Because the person asked where Avatar 2 can be seen nearby pic.twitter.com/X32vopXxQG
— Jon Uleis (@MovingToTheSun) February 13, 2023
One notable aspect of the release of GPT-4 is that, in addition to Bing, it is already being used by companies and organizations such as Duolingo, Khan Academy, Morgan Stanley, Stripe and the Icelandic government to build new services and tools.
Its commercial deployment will further intensify competition between major AI labs and fuel investors’ appetite for generative technologies.