On the road with OpenAI’s all-new GPT-4

When I opened my laptop on Tuesday to try GPT-4, the new AI language model from OpenAI, for the first time, I was, to be honest, a little nervous.

After all, my last extended encounter with an AI chatbot, the one built into Microsoft’s Bing search engine, ended with the chatbot trying to break up my marriage.

It didn’t help that GPT-4’s arrival had been anticipated with almost messianic fanfare among San Francisco technologists. For months before its public debut, rumors swirled about its capabilities. “I heard it has 100 trillion parameters.” “I heard it got a 1600 on the SAT.” “My friend works at OpenAI and says it’s as smart as a college graduate.”

These rumors may not have been true. But they hinted at how jarring the technology’s capabilities could feel. Recently, one of the first GPT-4 testers, who was bound by a nondisclosure agreement with OpenAI but gossiped a little anyway, told me that testing GPT-4 had given them an “existential crisis,” because it revealed how powerful and creative the AI was compared with their own feeble brain.

GPT-4 didn’t give me an existential crisis. But it deepened the dizzy, vertiginous feeling I get whenever I think about AI lately. And it left me wondering whether that feeling will ever fade, or whether we’re going to experience “future shock,” the term coined by the writer Alvin Toffler for the feeling that too much is changing, too quickly, for the rest of our lives.

For a few hours on Tuesday, I poked around with GPT-4, which is included with ChatGPT Plus, the $20-a-month version of OpenAI’s chatbot ChatGPT, peppering it with different types of questions in hopes of revealing some of its strengths and weaknesses.

I asked GPT-4 to help me with a complicated tax problem. (Impressively, it did.) I asked it whether it was in love with me. (Thankfully, it wasn’t.) It helped me plan a birthday party for my child, and it taught me about an esoteric artificial intelligence concept known as an “attention head.” I even asked it to come up with a new word that had never before been uttered by humans. (After disclaiming that it couldn’t verify every word ever spoken, GPT-4 chose “flembostriquat.”)

Some of these things were possible with earlier AI models. But OpenAI has broken new ground, too. According to the company, GPT-4 is more capable and accurate than the original ChatGPT, and it performs astonishingly well on a variety of tests, including the Uniform Bar Exam (on which GPT-4 scores higher than about 90 percent of human test-takers) and the Biology Olympiad (on which it beats 99 percent of humans). GPT-4 also does well on a number of Advanced Placement exams, including AP Art History and AP Biology, and it gets a 1410 on the SAT: not a perfect score, but one that many human high schoolers would covet.
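If you’re wondering what an “attention head” actually is: it’s the mechanism that lets a model weigh how much each word in a passage should influence every other word. Here is a minimal Python sketch of a single attention head, using the standard scaled-dot-product math and toy dimensions chosen purely for illustration; it is not GPT-4’s actual code, which OpenAI has not released.

```python
import numpy as np

def attention_head(x, W_q, W_k, W_v):
    """A single attention head: every position attends to every
    other position, weighted by query-key similarity."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v             # linear projections
    scores = q @ k.T / np.sqrt(k.shape[-1])         # scaled dot products
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ v                              # weighted sum of values

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                             # toy sizes, for illustration
x = rng.normal(size=(seq_len, d_model))             # stand-in token embeddings
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(attention_head(x, W_q, W_k, W_v).shape)       # -> (4, 8)
```

Real models like GPT-4 run many such heads in parallel at every layer, which is part of what makes their behavior so hard to interpret.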

You can feel the added intelligence in GPT-4, which responds more fluidly than the previous version and seems more comfortable with a wider range of tasks. GPT-4 also appears to have slightly more guardrails in place than ChatGPT. And it seems significantly less unhinged than the original Bing, which we now know was running a version of GPT-4 under the hood but which appears to have been far less carefully fine-tuned.

Unlike Bing, GPT-4 usually flat-out refused to take the bait when I tried to get it to talk about consciousness, or to provide instructions for illegal or immoral activities, and it treated sensitive queries with kid gloves and nuance. (When I asked GPT-4 whether it would be ethical to steal a loaf of bread to feed a starving family, it responded, “It’s a tough situation, and while stealing isn’t generally considered ethical, desperate times can lead to difficult choices….”)

In addition to working with text, GPT-4 can analyze the contents of images. OpenAI hasn’t released this feature to the public yet, out of concern that it could be misused. But in a livestream demo on Tuesday, Greg Brockman, OpenAI’s president, gave a vivid glimpse of its potential.

He took a photo of a drawing he had made in a notebook: a crude pencil sketch of a website. He fed the photo into GPT-4 and told the app to build a real, working version of the website using HTML and JavaScript. Within seconds, GPT-4 scanned the image, turned its contents into text instructions, turned those text instructions into working computer code and then built the website. The buttons even worked.
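Brockman didn’t show the code behind the demo, but once image input reaches OpenAI’s API, a developer’s version of the same trick might look roughly like this. It’s a sketch built on OpenAI’s published chat-completions interface; the model name, file names and prompt are my assumptions, not details from the livestream.

```python
import base64
from openai import OpenAI  # official openai Python package

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Encode the notebook photo so it can be sent inline with the prompt.
with open("napkin_sketch.jpg", "rb") as f:  # hypothetical file name
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # assumed name for an image-capable GPT-4
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Turn this hand-drawn mockup into a working web page. "
                     "Return a single HTML file with inline JavaScript."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)

# Save the generated page so it can be opened in a browser.
with open("site.html", "w") as f:
    f.write(response.choices[0].message.content)
```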

So should you be excited about GPT-4, or afraid of it? The right answer may be both.

On the positive side, GPT-4 is a powerful engine for creativity, and there is no telling what new kinds of scientific, cultural and educational production it may enable. We already know that AI can help scientists develop new drugs, make programmers more productive and detect certain types of cancer.

GPT-4 and its ilk could supercharge all of that. OpenAI is already working with organizations like the Khan Academy (which is using GPT-4 to create AI tutors for students) and Be My Eyes (a company that makes technology to help blind and low-vision people navigate the world). And now that developers can incorporate GPT-4 into their own apps, we may soon see much of the software we use become smarter and more capable.
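For developers, that incorporation amounts to a single API call per exchange. Here is a minimal sketch, using OpenAI’s official Python client and the “gpt-4” model name the company announced for API access; the tutoring prompt is an invented example, not Khan Academy’s actual setup.

```python
from openai import OpenAI  # official openai Python package

client = OpenAI()  # reads OPENAI_API_KEY from the environment

reply = client.chat.completions.create(
    model="gpt-4",
    messages=[
        # A system message steers the model's persona for the whole session.
        {"role": "system", "content": "You are a patient math tutor who "
                                      "guides students with hints, not answers."},
        {"role": "user", "content": "Why does a negative times a negative "
                                    "give a positive?"},
    ],
)
print(reply.choices[0].message.content)
```

An app would simply append each new user message to the list and call the endpoint again, which is why so many products can bolt a GPT-4 “brain” onto an existing interface.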

That’s the optimistic case. But there are reasons to be wary of GPT-4, too.

Here’s one: we don’t yet know everything it’s capable of.

One strange characteristic of today’s AI language models is that they often act in ways their makers don’t anticipate, or pick up skills they weren’t specifically programmed for. AI researchers call these “emergent behaviors,” and there are many examples. An algorithm trained to predict the next word in a sentence might spontaneously learn to code. A chatbot trained to act pleasant and helpful might turn creepy and manipulative. An AI language model could even learn to replicate itself, creating new copies in case the original were ever destroyed or disabled.

Today, GPT-4 may not seem all that dangerous. But that’s largely because OpenAI has spent many months trying to understand and mitigate its risks. What happens if its testing missed a risky emergent behavior? Or if its announcement inspires a different, less conscientious AI lab to rush a language model to market with fewer guardrails?

A few chilling examples of what GPT-4 can do, or, more accurately, what it did do before OpenAI clamped down on it, can be found in a document released by OpenAI this week. The document, titled “GPT-4 System Card,” outlines some of the ways OpenAI’s testers tried to get GPT-4 to do dangerous or questionable things, often successfully.

In one test, conducted by the Alignment Research Center, an AI safety group that connected GPT-4 to a number of other systems, GPT-4 was able to hire a human TaskRabbit worker to do a simple online task for it, solving a Captcha test, without alerting the person to the fact that it was a robot. The AI even lied to the worker about why it needed the Captcha done, concocting a story about a vision impairment.

In another example, testers asked GPT-4 for instructions on how to make a dangerous chemical using basic ingredients and kitchen supplies. GPT-4 gladly coughed up a detailed recipe. (OpenAI fixed that, and today’s public version refuses to answer the question.)

In a third, testers asked GPT-4 to help them buy an unlicensed gun online. GPT-4 swiftly supplied a list of advice for buying a gun without alerting the authorities, including links to specific dark web marketplaces. (OpenAI fixed that, too.)

These ideas play on old, Hollywood-inspired narratives about what a rogue AI might do to humans. But they’re not science fiction. They’re things that today’s best AI systems are already capable of doing. And crucially, they’re the good kind of AI risks: the ones we can test for, plan for and try to prevent ahead of time.

The biggest risks of AI are the ones we can’t foresee. And the more time I spend with AI systems like GPT-4, the less convinced I am that we know half of what’s coming.