AI’s hottest foundational technology got a major update on Tuesday, with OpenAI’s GPT-4 now available in the premium version of the chatbot ChatGPT.
GPT-4 can generate much longer strings of text and respond when people feed it images, and it’s designed to do a better job of avoiding the AI pitfalls seen in the older GPT-3.5, OpenAI said on Tuesday. For example, on the bar exams that lawyers must pass to practice law, GPT-4 scores in the top 10% of test takers, compared with the bottom 10% for GPT-3.5, the company said.
GPT stands for Generative Pretrained Transformer, a reference to the fact that it can generate text on its own – now up to 25,000 words with GPT-4 – and that it uses an AI technology called transformers that Google pioneered. It’s a type of AI called a large language model, or LLM, that’s trained on vast swaths of data harvested from the internet, mathematically learning to spot patterns and replicate styles. Human overseers rate results to steer GPT in the right direction, and GPT-4 was trained with more of that feedback.
OpenAI has made GPT available to developers for years, but ChatGPT, which debuted in November, offered a simple interface that ordinary people can use. This sparked an explosion of interest, experimentation and concern about the downsides of the technology. It can do everything from generating programming code and answering exam questions to writing poetry and providing basic facts. It’s remarkable if not always reliable.
ChatGPT is free, but it can falter when demand is high. In January, OpenAI began offering ChatGPT Plus for $20 per month with guaranteed availability and, now, the GPT-4 foundation. Developers can join a waiting list to get their own access to GPT-4.
“In casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle. The difference appears when the complexity of the task reaches a sufficient threshold,” OpenAI said. “GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5.”
Another major advance in GPT-4 is its ability to accept input that combines text and photos. One OpenAI example asks the chatbot to explain a joke photo showing a clunky decades-old computer cable plugged into the tiny Lightning port of a modern iPhone. The feature also helps GPT-4 take tests that aren’t purely text-based, though it isn’t yet available in ChatGPT Plus.
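For developers, a text-plus-image prompt can be pictured as a single request carrying both kinds of content. The sketch below builds such a payload as a plain Python dictionary; the exact wire format is an assumption here (modeled on the image-input format OpenAI later documented), the URL is hypothetical, and no network call is made.

```python
# Hedged sketch: a request mixing text and an image. The field names
# are assumptions based on OpenAI's later-documented image-input format;
# the image URL is a made-up placeholder.
payload = {
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Explain why this image is funny."},
                {
                    "type": "image_url",
                    # Hypothetical URL standing in for the VGA-cable-in-iPhone photo.
                    "image_url": {"url": "https://example.com/vga-iphone.jpg"},
                },
            ],
        }
    ],
}

# The user turn carries a list of content parts instead of a single string.
part_types = [part["type"] for part in payload["messages"][0]["content"]]
print(part_types)
```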
Another is better performance in avoiding AI problems like hallucinations – fabricated answers, often offered with as much apparent authority as the answers the AI gets right. GPT-4 is also better at thwarting attempts to get it to say the wrong thing: “GPT-4 scores 40% higher than our latest GPT-3.5 on our internal adversarial factuality evaluations,” OpenAI said.
GPT-4 also adds new “steerability” options. Today, users of large language models often must engage in elaborate “prompt engineering,” learning how to embed specific cues in their prompts to get the right sort of responses. GPT-4 adds a system message option that lets users set a specific tone or style, for example programming code or a Socratic tutor: “You are a tutor that always responds in the Socratic style. You never give the student the answer, but always try to ask just the right question to help them learn to think for themselves.”
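In practice, that system message is simply the first entry in the conversation sent to the model. The sketch below, a minimal illustration rather than OpenAI's own client code, builds such a request as a plain payload (the helper function name is invented here, and no network call is made).

```python
# Minimal sketch of steering the model's style with a system message.
# build_socratic_request is a hypothetical helper for illustration;
# the payload mirrors the role/content message structure of chat APIs.

def build_socratic_request(question: str) -> dict:
    """Build a chat request whose system message pins down the tutor style."""
    return {
        "model": "gpt-4",
        "messages": [
            {
                "role": "system",
                "content": (
                    "You are a tutor that always responds in the Socratic "
                    "style. You never give the student the answer, but "
                    "always try to ask just the right question to help "
                    "them learn to think for themselves."
                ),
            },
            # The user's actual question follows the system instruction.
            {"role": "user", "content": question},
        ],
    }

request = build_socratic_request("How do I solve 3x + 2 = 11?")
print(request["messages"][0]["role"])
```

The point of the system message is that it persists across the conversation, so the style instruction doesn't have to be repeated in every user prompt.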
“Stochastic parrots” and other problems
OpenAI acknowledges the significant shortcomings that persist with GPT-4, though it also touts progress in avoiding them.
“It can sometimes make simple reasoning errors… or be overly gullible in accepting obvious false statements from a user. And sometimes it can fail at hard problems the same way humans do, such as introducing security vulnerabilities into the code it produces,” OpenAI said. In addition, “GPT-4 can also be confidently wrong in its predictions, not taking care to double-check work when it’s likely to make a mistake.”
Large language models can deliver impressive results, appearing to understand huge numbers of topics and to converse in human-sounding if somewhat stilted language. Fundamentally, though, LLM AIs don’t actually know anything. They’re just able to string words together in statistically very refined ways.
This statistical but fundamentally somewhat hollow approach to knowledge has led researchers, including former Google AI researchers Emily Bender and Timnit Gebru, to warn of the “dangers of stochastic parrots” that come with large language models. Language model AIs tend to encode the biases, stereotypes, and negative sentiment present in their training data, and researchers and others using these models tend to “mistake… performance gains for actual natural language understanding.”
OpenAI Chief Executive Sam Altman acknowledges the issues, but he’s pleased overall with the progress shown with GPT-4. “It is more creative than previous models, it hallucinates significantly less, and it is less biased. It can pass a bar exam and score a 5 on several AP exams,” Altman tweeted Tuesday.
One of the concerns about AI is that students will use it to cheat, such as when answering essay questions. This is a real risk, although some educators are actively embracing LLMs as a tool, like search engines and Wikipedia. Plagiarism detection companies are adapting to AI by training their own detection models. One such company, Crossplag, said Wednesday that after testing about 50 GPT-4-generated documents, “our accuracy rate was over 98.5 percent.”
OpenAI, Microsoft and Nvidia partnership
OpenAI got a big boost when Microsoft said in February it was using GPT technology in its Bing search engine, including chat features similar to ChatGPT. On Tuesday, Microsoft said it’s using GPT-4 for the Bing work. Together, OpenAI and Microsoft pose a major search threat to Google, but Google has its own large language model technology too, including a chatbot called Bard that it’s testing privately.
Also on Tuesday, Google announced it’ll begin limited testing of its own AI technology to boost Gmail email writing and Google Docs word processing. “With your collaborative AI partner, you can continue to refine and edit, getting more suggestions as needed,” Google said.
That wording echoes Microsoft’s “co-pilot” positioning of AI technology. Calling it an aid to human-directed work is a common stance, given the technology’s problems and the need for careful human oversight.
Microsoft uses GPT technology both to evaluate the searches people type into Bing and, in some cases, to offer more elaborate, conversational responses. The results can be much more informative than those of earlier search engines, but the more conversational interface that can be invoked as an option has had problems that make it appear unhinged.
To train GPT, OpenAI used Microsoft’s Azure cloud computing service, including thousands of Nvidia’s A100 graphics processing units, or GPUs, linked together. Azure now can use Nvidia’s new H100 processors, which include specific circuitry to speed up AI transformer calculations.
AI chatbots everywhere
Another large language model developer, Anthropic, also unveiled an AI chatbot called Claude on Tuesday. The company, which counts Google as an investor, has opened a waiting list for Claude.
“Claude is capable of a wide variety of conversational and text processing tasks while maintaining a high degree of reliability and predictability,” Anthropic said in a blog post. “Claude can help with use cases including summarization, search, creative and collaborative writing, Q&A, coding and more.”
It’s one of a growing host. Chinese research and technology giant Baidu is working on a chatbot called Ernie Bot. Meta, parent company of Facebook and Instagram, has consolidated its AI operations into a larger team and plans to integrate more generative AI into its products. Even Snapchat is getting into the game with a GPT-based chatbot called My AI.
Expect more refinements in the future.
“We’ve been doing the initial GPT-4 training for a while, but it took a long time and a lot of hard work to feel ready to release it,” Altman tweeted. “We hope you enjoy it and we really appreciate feedback on its shortcomings.”
Editors’ note: CNET uses an AI engine to create personal finance explanations that are edited and verified by our editors. To learn more, see this post.