LLMs & Autonomous Agents

This article explores the world of autonomous agents, explaining how LLMs make them possible, what they're capable of, and what we might expect from them in the future.
Shanif Dhanani
Read time: 4.4 minutes

---

One of the most frustrating aspects of work is knowing that your plate is full of tasks that don’t require much effort but will still eat up a lot of your time. That’s a real drag on time management and productivity, and it’s one of the main reasons businesses find large language models (LLMs) so appealing.

LLMs like the models behind ChatGPT have been trained on billions of words of text from all over the internet. Training on this content teaches them to detect patterns in language and to model how words and ideas fit together. When you ask an LLM a question, it draws on those learned patterns to generate a response that is relevant to your query.

An interesting consequence of training these tools on the world’s textual data is that they appear to have developed a “model of the world.” At least that’s what Andrew Ng, one of the best-known researchers in A.I., seems to think. As a result, LLMs can “reason” through a complex concept or decision, which makes them well suited to large-scale, automated decision-making, something we’ve never had before, at least not with this level of sophistication.

What are autonomous agents?

There's been a lot of buzz recently around the idea of "autonomous agents," or applications that can act without human input to accomplish a complex task. One popular open-source tool on GitHub, "AutoGPT," leverages agents to carry out a user's request, and it wowed a lot of people.

The recent advent of large language models (LLMs) has enabled software to function in ways that were never really possible before.

You may have heard the term "agent" before, and if you have, you might be confused about what we actually mean by LLM-powered autonomous agents. The word "agent" has become a catchall term in the world of A.I. for any software application that can take an action, or a series of actions, to accomplish a larger goal. A few years ago, the term became very popular when advancements in reinforcement learning led to new software that could beat the world's best players in StarCraft and Go.

In those cases, researchers had built "agents" that could play each game by choosing a series of actions that maximized their chances of winning, which ultimately let software beat some of the world's top players.

In the world of LLMs, an "agent" is a software application that can take a series of actions to accomplish a certain task. Unlike in the world of reinforcement learning, LLM agents aren't optimized to maximize a specific objective (like winning a game of StarCraft). Rather, they can perform an action, like calling an API or searching the internet, as part of a series of tasks that are needed to accomplish some user-specified objective.
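
To make the idea above a little more concrete, here is a minimal sketch of an agent that keeps choosing actions until it decides the objective is handled. Everything in it is hypothetical: `scripted_llm` is a hard-coded stand-in for a real model call, and `search_web` and `call_api` are placeholder tools that return canned strings instead of touching the network.

```python
from typing import Callable

# Hypothetical tools: real ones would query a search engine or an HTTP API.
def search_web(query: str) -> str:
    return f"(stub) top result for '{query}'"

def call_api(endpoint: str) -> str:
    return f"(stub) response from {endpoint}"

TOOLS: dict[str, Callable[[str], str]] = {
    "search_web": search_web,
    "call_api": call_api,
}

def scripted_llm(objective: str, history: list[str]) -> tuple[str, str]:
    """Stand-in for an LLM call: returns (action, argument).

    A real agent would send the objective and history to a model and
    parse its reply; here the decisions are hard-coded by step count.
    """
    if len(history) == 0:
        return ("search_web", objective)
    if len(history) == 1:
        return ("call_api", "/calendar/events")
    return ("finish", "objective handled")

def run_agent(objective: str) -> list[str]:
    """Ask the 'model' for the next action, run it, and record the result."""
    history: list[str] = []
    while True:
        action, argument = scripted_llm(objective, history)
        if action == "finish":
            history.append(f"finish: {argument}")
            return history
        observation = TOOLS[action](argument)
        history.append(f"{action}: {observation}")

if __name__ == "__main__":
    for step in run_agent("find a meeting slot"):
        print(step)
```

The point of the sketch is the shape of the loop: the model picks the next action, the application executes it, and the result is fed back in before the next decision.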

LLMs give us a lot more flexibility when it comes to accomplishing an objective. Rather than trying to specify a strict set of rules for a particular use case, we can now leverage LLMs to "understand" what needs to be done at a certain point in time, and tell us what to do next.

The power of autonomous agents & LLMs

For years, your only options for handling a large volume of low-effort but time-consuming tasks were to a) outsource the work to another person, b) find the time to do everything yourself, or c) create or buy rules-based, highly focused pieces of software, each designed to complete one small part of a task. But with the emergence of LLMs like ChatGPT, you can now use general-purpose software applications to manage these tasks instead.

One place where something like this might come in handy is a virtual assistant app that's responsible for handling a wide variety of instructions from you. You might want an app that plugs into your calendar to schedule meetings for you, plugs into your email service to send emails from you, and connects to your social media accounts on demand to write a new post when you tell it to do so. An application like this would need to be able to handle a wide variety of commands.

Before LLMs, to create an app like this, you'd have to write elaborate keyword-matching code to help the application work out what task it needed to accomplish next, and you'd need a lot of state management and error handling to cover edge cases. However, with the advent of LLMs, you can now "outsource" a lot of the hard work to a language model, which will act as an orchestration layer.
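
The contrast between the two approaches can be sketched roughly as follows. This is only an illustration under stated assumptions: `fake_llm` is a hypothetical stand-in that returns a canned JSON reply, and the prompt wording, tool names, and JSON shape are made up for the example rather than taken from any particular product.

```python
import json

ALLOWED_TOOLS = ("calendar", "email", "social")

def keyword_router(command: str) -> str:
    """Pre-LLM approach: brittle keyword matching."""
    text = command.lower()
    if "meeting" in text or "schedule" in text:
        return "calendar"
    if "email" in text:
        return "email"
    if "post" in text or "tweet" in text:
        return "social"
    return "unknown"

# Hypothetical prompt; doubled braces survive str.format as literal braces.
ROUTER_PROMPT = (
    "Pick the tool for the user's request. Reply with JSON like "
    '{{"tool": "calendar" | "email" | "social", "reason": "..."}}.\n'
    "Request: {command}"
)

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; always returns this reply."""
    return '{"tool": "calendar", "reason": "user wants time with Dana"}'

def llm_router(command: str, llm=fake_llm) -> str:
    """LLM-as-orchestrator: ask the model which tool fits, then validate."""
    reply = llm(ROUTER_PROMPT.format(command=command))
    try:
        parsed = json.loads(reply)
    except json.JSONDecodeError:
        return "unknown"
    tool = parsed.get("tool") if isinstance(parsed, dict) else None
    # Never trust the model's output blindly: only accept known tools.
    return tool if tool in ALLOWED_TOOLS else "unknown"

if __name__ == "__main__":
    command = "Can you find 30 minutes with Dana next week?"
    print(keyword_router(command))
    print(llm_router(command))
```

The request in the usage example contains none of the hard-coded keywords, so the keyword router gives up on it. Because the model reply here is canned, the sketch shows the plumbing (prompt, parse, validate, dispatch) rather than how well a real model would route.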

Automation now and in the future

Despite their incredible potential, we’re still figuring out how to fully optimize these tools. Creating high-functioning agents that interface with LLMs requires significant engineering work; programmers have to ensure both the agent and the LLM are prepared to tackle tasks one step at a time, evaluate and access various tools, and overcome errors and knowledge gaps.
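
Some of that engineering work can be sketched in a few lines. The example below is a simplified illustration, not a production design: `scripted_llm` is a hypothetical stand-in for a model that makes two mistakes before recovering, and `flaky_lookup` is a made-up tool. It shows three common guardrails: a step limit, a check for tools that don't exist, and errors recorded in the history so the model can see them on its next turn.

```python
from typing import Callable

def flaky_lookup(arg: str) -> str:
    """Hypothetical tool that fails on empty input."""
    if not arg:
        raise ValueError("empty query")
    return f"(stub) found data for '{arg}'"

TOOLS: dict[str, Callable[[str], str]] = {"lookup": flaky_lookup}

def scripted_llm(history: list[str]) -> tuple[str, str]:
    """Stand-in for a model: makes two mistakes, then recovers."""
    script = [
        ("send_fax", "hello"),   # tool that does not exist
        ("lookup", ""),          # valid tool, bad argument
        ("lookup", "Q3 sales"),  # corrected call
        ("finish", "done"),
    ]
    return script[min(len(history), len(script) - 1)]

def run_guarded_agent(max_steps: int = 5) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):  # step limit prevents endless loops
        action, arg = scripted_llm(history)
        if action == "finish":
            history.append("finish: " + arg)
            return history
        tool = TOOLS.get(action)
        if tool is None:
            # Record the mistake so the next model call can see it.
            history.append(f"error: unknown tool '{action}'")
            continue
        try:
            history.append(f"{action}: {tool(arg)}")
        except Exception as exc:
            history.append(f"error: {action} failed ({exc})")
    history.append("stopped: step limit reached")
    return history

if __name__ == "__main__":
    for step in run_guarded_agent():
        print(step)
```

Calling `run_guarded_agent(max_steps=2)` ends with the step-limit message instead of a finished task, which is the kind of failure mode a real agent has to surface rather than hide.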

That’s why, for now, the most reliable and effective agents focus on automating just a few use cases. In an ideal world, LLMs and autonomous agents would work seamlessly in tandem to automate a wide variety of requests, responses, and actions. As we develop more and more sophisticated software, we could even create systems where autonomous agents communicate with one another to handle higher-level, more complex tasks.

Moving forward with agents, one step at a time

Using LLMs to power autonomous agents is just one way we’re bringing AI capabilities closer to those of a human. LLMs can create strategic plans, give instructions, and provide feedback, while autonomous agents can put those plans, instructions, and feedback into action.

We’re always on the lookout for ways to make our lives a little easier. With LLMs becoming more widely used, we’re sure to continue to see a push for more sophisticated and varied autonomous agents that streamline our most tedious tasks. They may not be able to do it all just yet…but hey, even one task off your plate when you’re feeling overwhelmed can really make a difference.