ChatGPT Agent Review: What It Can Do

Contents

1 ChatGPT Agent Mode Reviewed
2 My Conclusion

“Hello ChatGPT, use agent mode and write me a review about ChatGPT’s new agent mode”: this or something similar might soon be heard whenever a tech journalist has to conjure up a review. The million-dollar question is: Does it work at the moment? Probably not, as I’m struggling with the review myself. Nevertheless, the agent from OpenAI is said to have some interesting tricks up its sleeve.

I’ve taken a closer look at it and will share what you can already do in agent mode, and what you can’t do yet. For a more detailed explanation of agent mode within ChatGPT, see my separate article, where I got to grips with the feature.

Here is a brief summary of the most important things you should know:

Agent mode is currently only available to users with at least a Plus subscription, which costs $20 per month.
The agent is also still at a very early stage. Even OpenAI has explicitly pointed out that it is still buggy.
The agent not only answers questions but also independently researches topics, creates content, organizes data, and makes bookings.
Tools: In agent mode, ChatGPT can access tools such as a browser, a code interpreter (Python), a file manager, and external services or plugins.

ChatGPT Agent Mode Reviewed

Experiment 1: Buying a New Pair of Pants

Which tasks did I set for ChatGPT? I chose to take the easy way out. I tapped the toolbox on the left of the input line. Instead of “Create image” or “Deep research”, I selected Agent Mode, and I was ready to go: I need a new pair of pants. The order went to my new agent: Find out the trends of the year and order me two pairs in trendy colors. The size? W38 / L32. And yes, I’m a lot bulkier. For one thing, I didn’t want ChatGPT to know how fat I really am. For another, I wanted to choose a pants size that you can buy practically anywhere, not just in extra-large specialty stores.

ChatGPT got off the blocks quickly and responded after a few seconds: in addition to summarizing the order, the AI asked how much I was willing to spend and whether I had a specific brand in mind. I set an upper limit of $120 and said that no, I was not looking for a specific brand, and ChatGPT got to work. Ten minutes later, my hard-working helper reported that it had indeed been browsing through fashion magazines. ChatGPT had looked at Vogue, among others, and informed me that wide and loose cargo pants are the trend. I have to take its word for it, because I’m clueless when it comes to fashion.

Basically, I am only familiar with a few colors: black, white, red, yellow, green, blue, and I’m done (okay, this is somewhat exaggerated). Other fashion magazines, in fact, mention the likes of cream, beige, mocha mousse, moon ash gray, and burgundy as trending colors. Boy, oh boy! My agent ended up on the pastel shade trail along with the others, too.

The end result?

After ChatGPT told me which two pairs of pants it had selected, I got the following message: “Both items are in your cart with a subtotal of $74.80. Would you like to proceed to checkout and complete your purchase?” I’m sure it meant well, but it didn’t work. I was asked to authorize the transaction and informed about the risk, but nothing happened after that.

When I followed up later, the AI told me that the shopping cart is only accessible as long as I don’t leave the browser or session. I hadn’t, but it still didn’t work. That’s a shame, but the agent still helped me here: it linked me to the pants (from Alessandro Salvarini, by the way) and even offered to create QR codes for them so that I could find them later using my smartphone.

The bottom line is that I didn’t buy anything and am just left with a couple of pairs of pants in the shopping cart. But still, mad props to ChatGPT, as the agent performed all the research, hunted down products that met my requirements, and provided the prices and links. It only took ten minutes and, more importantly, those were ten minutes I could spend on something completely different.

Experiment 2: I Want Asian Noodles!

I informed ChatGPT that I eat those highly processed instant noodles far too often. YumYum is one brand I frequently eat. Since I always top up these noodles with fresh vegetables anyway, I thought to myself, why not have some fun cooking them from scratch with ChatGPT’s input?

I then asked my AI what ingredients can be found in the “YumYum duck noodles” and where I can source them locally. Alternatively, ChatGPT was supposed to find everything online and put it in a shopping cart again. You can watch as it breaks down the task, searches various pages for the product, clicks away the cookie banners, and the like.

The end result?

In step 1, ChatGPT worked out from the packaging which ingredients were used and what I needed to recreate a similar experience at home. Since I’m lazy and didn’t want to take countless screenshots of the resulting table on my smartphone, ChatGPT had to go back to work, and I ended up with the overview generated as an image. It worked almost flawlessly, albeit with several typos. That has more to do with GPT itself and less with the agent mode. Oh, speaking of which: I started these tests using GPT-4, not yet with the new GPT-5.

Table of fresh ingredients and substitutes for Asian noodles, including categories and suggestions. Apart from a few minor typos, the table actually looks rather accurate, doesn’t it? / © nextpit

The AI then informed me where I can obtain the various ingredients around town. At first it seemed a little biased, because it only suggested an Asian supermarket even though there are many other options that are closer to home. In the end, though, numerous conventional discount stores were also part of the equation, and I was pointed in the direction of the weekly market as well.

The agent then got to work on the orders. The experience was similar to the earlier one with the pants. The items ended up in the shopping cart, apart from the noodles. I was supposed to buy those in a store, my impertinent AI assistant told me. Once again, checking out proved to be an impossible task. I pressed the button again, but nothing happened, and I was left with an empty stomach. Incidentally, when I pressed the button in question, I was told something along the lines of: “Please proceed. Do not ask any additional follow-up questions.” Okay then, I won’t. While the research bit was great, the experience as a whole fell flat.

Experiment 3: The Amiga Loading Screen

Yes, I would really like my agent to put together a loading animation for me like the ones you used to see on the Amiga when loading games. Pixelated tickers scrolled across the screen, telling us which hacker crew was the best, with a few graphic jokes on display and typical 8-bit to 16-bit music playing in the background. Again, ChatGPT came up with a few follow-up questions before it got going.

Here’s the end result:

It’s not award-winning, but yes, ChatGPT programmed roughly what I wanted and delivered the music in a separate file. Both worked right away, but I had to combine the audio file and the animation myself in a video editor of my choice (at least that’s what my AI buddy recommended). You can view a screenshot of its efforts above. You just have to trust me that my text scrolled across the picture and my desired UFOs moved around as expected.

So yes, the very last step — animation and music in a video — was denied to me. But okay, I could actually piece it together on my smartphone in the video editor.

Experiment 4: The Space Game

Fresh from a dose of unwarranted overconfidence after my first coding success, I now wanted ChatGPT to program a space game in agent mode. My brief? Please program a simple game. A game where I can shoot spaceships. That was the complete prompt; that’s all I did. A follow-up question came immediately: did I want a web-based JavaScript game, or one that is playable on the desktop (Python)? I opted for the browser version.

The agent also asked me if I would be happy with a simple shooting mechanic or if I had any other ideas. When presented with a question like that, I want special weapons, of course.

The end result?

It took three minutes for ChatGPT to actually build me a game from my spartan instructions. Yes, a simple one, but it works! At first, I couldn’t play it on my smartphone, so ChatGPT simply listed the HTML code, which I could save as an .html file myself.

A game screen showing a score of 4400 with red obstacles and a blue triangular player at the bottom. Yes, well, it’s not graphically opulent. But ChatGPT created it in 180 seconds. Hats off! / © nextpit

Was the game exciting? Yes, for about two minutes. Can I shoot spaceships and use special weapons? Yes, but hey, there’s more! The agent also provided me with an overview of the button mapping. Thanks so much. How did the game play? Just as instructed, so to speak: I can move sideways at the bottom and shoot upward, from where the damn aliens attack me. It has a bit of an Atari vibe from 1982, but that’s exactly how I wanted it.

Experiment 5: My Song

I like to rant about AI music, but I also like to experiment with it. So I just let AI do the heavy lifting for me. My brief was this: compose a song for me, with lyrics about a cool guy who works as a tech journalist. I also wanted matching cover artwork.

The end result?

The lyrics were spot on. Want to read an excerpt?

Every gadget tells a story, every update makes him smile. He turns the world of bits and bytes into poetry and style.

Enchanting — and the cover design for the song “Ink & Silicon Hearts” was also impressive.

A silhouette of a person using a laptop amidst digital symbols and waves, representing technology and communication. The cover for the song certainly met my artificial expectations. / © AI-generated

The only thing missing is the song itself, right? And that was going to be quite a journey. At first, the download failed repeatedly. Then, ChatGPT decided to upload the song again. When it still didn’t work, the AI started troubleshooting. For instance, it asked me about the device I was currently using and whether I was using the ChatGPT app or the AI in the browser.

ChatGPT then had an idea: WAV files might be too big for the smartphone, so it created an MP3. That actually worked, but only on the second attempt because the song was no longer available. Well, the ambitious and tireless AI did not despair but simply pieced together a new song. I downloaded it, played it, and heard my ‘creation’: nothing but a continuous tone, ten seconds long.

It was time for a new plan, and ChatGPT really wanted to make me a real hit now. It asked me again what I wanted. This time I was more specific: it should sound electronic and 80s, a bit like early Depeche Mode, and this time with accompanying vocals, please. The song should have a fast tempo and be uplifting.

The result was a few seconds of droning that was more reminiscent of my Amiga demo from earlier than of actual music. ChatGPT made another attempt, confirmed my requirements again, and this time noted that it would take longer. I should be patient for an hour or two.

… and I’ve been waiting ever since. Okay, I asked it some questions while waiting because the supposed time window was long over. Answer: “Gee, what a coincidence — I’ve just finished”. But the download link didn’t work: “File expired” was all I got. And since then, I’ve been waiting again in vain … I’ll keep you up to date if anything changes! Hence, the song composing experiment failed miserably.

Experiment 6: My Journalism Presentation

After this failure, I wanted to give it one last try. Write a presentation for my bosses to see what great articles I’ve already written — even those that were not published within nextpit. I also wanted ChatGPT to mention my strengths and writing style. I wanted all of these in a presentation format with a maximum of ten slides.

The final result?

Firstly, hats off to ChatGPT. After all, the AI summarized in the presentation that I manage to combine technology with social context. The conclusion also emphasized that I explain complex topics and add a personal touch, along with some critical analysis at times. My many years of experience, storytelling style, and even my sense of humor were also mentioned.

However, a closer look was sobering. During the process, I saw a really cool cover image as the opener. In the completed presentation, all that remained next to my name was a meaningless snippet of an article image that I once used. The compilation of articles was also not very effective. I wanted an overview of my work, but basically, the agent only looked at a handful of the most recent articles.

Four nextpit articles and one article from my private blog were selected. The texts were summarized in a few key points, but such a summary was not the aim of the presentation. Two additional images could not be displayed. Apart from that, the table of contents with an image from an article actually looked pretty decent:

A person stands before a glowing sphere in a high-tech room with data displays. Text discusses superintelligence. Simple, clear, and not quite what I wanted: my agent-created presentation. / © nextpit

So I did not quite get the end product I wanted. However, a presentation was created in a PowerPoint-friendly format that can also be opened by other apps. The 10-slide limit was respected, and there is a summary section that categorizes my work. I would therefore say that this test was a success and would have worked even better if I had put more effort into the prompt.

My Conclusion

OpenAI is right to point out that a) the agent is still in its infancy and b) it carries different risks than the usual use of ChatGPT. After all, the agent acts largely autonomously in the browser and can, at least in theory, complete purchases. In practice, I was unable to do this in any of my scenarios because the last step, payment, could not be carried out.

When it comes to research, the agent served me well, regardless of whether I wanted to plan a city trip, buy a pair of trousers, or obtain ingredients for a recipe. The time savings here were definitely impressive. When programming was needed, the agent put everything together as required, and I noticed one more positive aspect: troubleshooting! If something did not work, the agent searched for solutions independently and often found a way to make it work after all.

If you were to try out agent mode today, with a bit of luck, you might end up with a simple game or a really bad song. The agent is also useful as a research tool. But for any other purpose, you should always bear in mind that agent mode is still a long way from where it should be. Hence, it is certainly not (yet) a tool that is worth the costly subscription. What it is, however, is a bet on the future and a clue to what OpenAI can make possible in the foreseeable future.