
Advent Calendar Day 2: How AI Agents Improve Naive Chatbots by Understanding Context Shifts

· 11 min read
Norah Sakal


This December, I'm highlighting the limitations of simple AI chatbots in online retail and demonstrating how AI agents enhance customer interactions.

In yesterday's issue, we saw how naive chatbots fail immediately due to lack of contextual reasoning. Today, we highlight a slightly different but equally challenging scenario: dealing with context shifts during the conversation.

A customer often changes their mind mid-conversation. A naive Retrieval Augmented Generation (RAG) chatbot may struggle to incorporate these shifts, while an AI agent will dynamically adapt, refining its recommendations based on updated customer input.

Source Code

For a deeper dive into this example, check out my GitHub repository for a detailed, step-by-step implementation in Jupyter Notebooks, breaking down the code and methodology used by the AI agent in today's scenario.

Introducing SoleMates

SoleMates is our fictional online shoe store we'll use to illustrate these concepts:

SoleMates is our fictional online shoe store

As with Day 1, we'll explore how the customer's changing needs confuse a naive chatbot, while an AI agent gracefully handles the shift.

Today's Challenge: Failure to Adapt After Context Shift

Scenario

A customer initiates a chat with SoleMates:

Customer: "I'm looking for women's casual shoes"

A customer initiates a chat with SoleMates and asks about women's casual shoes

The naive chatbot vectorizes the customer query, retrieves matching products, and recommends a variety of women's casual shoes:

The naive chatbot correctly pulls casual women's shoes

Sudden context shift

However, the customer then shifts the context and suddenly says "Actually, I need something more formal":

The customer is suddenly looking for formal shoes

Naive Chatbot Response

When the customer updates their request, the naive RAG system processes it as a brand-new query without considering the earlier conversation.

Hence the chatbot replies with irrelevant formal shoes:

A naive RAG chatbot processes each user message independently and replies with irrelevant shoes

Here's what happens step-by-step:

1. First Query:
The customer says, "I'm looking for women's casual shoes"
The naive chatbot takes this exact sentence, turns it into a vector, and searches the database. It then shows women's casual shoes.

2. Second Query (Context Shift):
The customer then says, "Actually, I need something more formal"
Instead of remembering the customer wanted women's shoes, the chatbot treats this as a completely separate query.

It takes "Actually, I need something more formal", vectorizes it on its own, and searches again.

The results it returns are formal, but not necessarily women's shoes; in fact, it surfaces some formal men's shoes.
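
To make the failure concrete, here's a minimal sketch of a naive RAG reply loop. The embed_query helper and vector_db store are hypothetical stand-ins for whatever embedding model and vector database the chatbot uses; the point is that each call only ever sees the latest message:

# Minimal sketch of a naive RAG chatbot (embed_query and vector_db are hypothetical stand-ins)
def naive_chatbot_reply(user_message):
    # Only the latest message is embedded; nothing from earlier turns is carried over
    embedded_query = embed_query(user_message)
    products = vector_db.search(embedded_query, top_k=5)
    return products

naive_chatbot_reply("I'm looking for women's casual shoes")    # returns women's casual shoes
naive_chatbot_reply("Actually, I need something more formal")  # "women's" is lost, so men's formal shoes slip in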

Why Did the Naive Chatbot Fail?

The naive chatbot doesn't carry over important details from the first message when handling the second one. Each time, it starts fresh:

  • No Ongoing Memory: Doesn't connect "Actually, I need something more formal" to the earlier "women's casual shoes" request
  • Independent Vectorization: Treats each message on its own, losing details like "women's" or "shoes"
  • No Context Linking: Doesn't adjust its search to include both past and present requirements

Limitations Highlighted

  • Lack of Conversation Memory: Forgets earlier information when the user's request changes
  • Rigid Query Handling: Each message is processed as if it's the very first
  • Inconsistent Results: The second answer doesn't build on the first, causing confusion and irrelevant product suggestions

Explanation

By not linking the two messages, the chatbot forgets the earlier details and shows results that no longer fit the full picture: the customer wanted women's formal shoes, but got formal shoes regardless of gender.

AI Agent Solution

The AI agent keeps track of the whole conversation.

When the customer changes their mind, the AI agent doesn't forget the original focus on women's shoes. Instead, it updates the search from "casual" to "formal" while still looking for women's footwear.

Agent's Interaction:

  1. Customer: "I'm looking for women's casual shoes"
  2. Agent: "Got it. Here are some options:"
  3. Customer: "Actually, I need something more formal"
  4. Agent: "Understood. Let's switch to women's formal shoes:"

The AI agent doesn't forget the original focus on women's shoes


Conversation memory

The AI agent has a conversation memory and keeps track of the whole conversation with the customer.
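
Conceptually (this is just an illustrative sketch, not the LlamaIndex implementation shown further down), you can think of that memory as a running set of constraints that each new message updates rather than replaces; the names below are hypothetical:

# Illustrative sketch: a running filter state that each turn updates instead of replacing
conversation_filters = {}

def update_filters(new_filters):
    # Merge the latest constraints into the running state:
    # "gender: women" from turn 1 survives, only "usage" flips from casual to formal
    conversation_filters.update(new_filters)
    return conversation_filters

update_filters({"gender": "women", "usage": "casual"})  # turn 1
update_filters({"usage": "formal"})  # turn 2 -> {"gender": "women", "usage": "formal"}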

How Did the AI Agent Succeed?

  1. Conversation Memory: Remembers the earlier detail - women's shoes - and updates only the "casual" part to "formal"
  2. Flexible Reasoning: Adapts the vectorized query without starting from zero each time
  3. Accurate Results: Finds women's formal shoes that match the new requirement

Key Takeaways

Naive Chatbot Limitation

  • Doesn't remember earlier requests and treats new messages as unrelated searches

AI Agent Advantages

  • Conversation Memory: Maintains a running understanding of the conversation
  • Flexible Reasoning: Updates the search based on the latest input without losing previous details
  • Accurate Results: Delivers results that stay relevant as the user's needs evolve

Conclusion

This example shows how a naive RAG chatbot fails when the user changes their mind mid-conversation. By not connecting the dots, it provides unhelpful results.

An AI agent, on the other hand, can smoothly adapt to the changing request, keeping track of what was said before and making sure the recommendations stay on target.

Stay tuned for tomorrow's issue, where we'll explore another challenge and see how AI agents handle it better than simple chatbots.

About This Series

In this series, we highlight common problems with naive RAG chatbots and show how AI agents solve them. By understanding these differences, developers and businesses can implement smarter, more helpful AI solutions for online retail.

Learn to Build Your Own AI Agent Chatbot

Excited about building smarter AI agents? I'm launching a course soon where you'll learn to build and deploy your own AI agent chatbot. Sign up here!

Coming Up: Requests for Specific Measurements

In tomorrow's issue, we'll explore how AI agents handle requests for specific measurements.


Behind the Scenes: Code Snippets

Here's a simplified illustration of how the AI agent processes the query.

We're giving the agent access to two tools:

  1. Vector database metadata filtering
  2. Vector database query

1. Vector database metadata filtering tool

Allows the agent to create filters based on available metadata (e.g., gender, product type, usage).

It informs the agent of what metadata is filterable.

def create_metadata_filter(filter_string):
    # Parses the filter_string and returns a list of filters
    filters = parse_filters(filter_string)

    # Example: [{"key": "gender", "value": "women", "operator": "=="},
    #           {"key": "usage", "value": "formal", "operator": "=="}]
    return filters

2. Vector database query

The agent selects what query to vectorize (e.g., "shoes") and uses the metadata filter to retrieve relevant products.

import boto3

def search_footwear_database(query_str, filters_json):
    # Embeds the query string and searches the vector database with filters
    embedded_query = embed_query_aws_titan(query_str)
    results = vector_db.search(embedded_query, filters=filters_json)

    return results

AWS Titan Multimodal Embeddings in Action

I use AWS Titan, a multimodal embedding model that converts both product text and images into vectors. It's integrated into the AI agent tool search_footwear_database via the function embed_query_aws_titan.

This means the AI agent can process a query like "red heels" and match it to not only product descriptions but also actual images of red heels in the database.

By combining text and image data, the model helps the AI agent provide more relevant and visually aligned recommendations based on the customer's input.
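
For reference, here's a rough sketch of what embed_query_aws_titan could look like using boto3 and the Bedrock runtime. The model ID and request shape are assumptions based on the Titan Multimodal Embeddings model, not the exact code from the repository:

import json
import boto3

# Assumption: Titan Multimodal Embeddings served through Amazon Bedrock
bedrock = boto3.client("bedrock-runtime")

def embed_query_aws_titan(query_str):
    # Embed a text query; the same model can also embed product images
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-image-v1",  # assumed model ID, adjust to your setup
        body=json.dumps({"inputText": query_str}),
        accept="application/json",
        contentType="application/json",
    )
    return json.loads(response["body"].read())["embedding"]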

Agent workflow

Construct agent worker with access to the two tools:

from llama_index.core.agent import FunctionCallingAgentWorker

agent_worker = FunctionCallingAgentWorker.from_tools(
    [
        create_metadata_filters_tool,
        query_vector_database_tool,
    ]
)
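
The create_metadata_filters_tool and query_vector_database_tool objects passed in above are simply the two Python functions wrapped as LlamaIndex tools. Here's a minimal sketch of that wrapping (the descriptions are illustrative, not the exact ones from the repository):

from llama_index.core.tools import FunctionTool

# Wrap the plain Python functions as tools the agent can call
create_metadata_filters_tool = FunctionTool.from_defaults(
    fn=create_metadata_filter,
    description="Build vector database metadata filters (gender, product type, usage) from the customer's request.",
)
query_vector_database_tool = FunctionTool.from_defaults(
    fn=search_footwear_database,
    description="Search the footwear vector database with a query string and metadata filters.",
)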

Initialize the agent:

from llama_index.core.agent import AgentRunner

agent = AgentRunner(
    agent_worker,
)

Initiate conversation with agent:

agent.chat("I'm looking for women's casual shoes")

The agent decides to apply filters (women's + casual) and receives this JSON from the create_metadata_filter tool:

[
    {"key": "gender", "value": "women", "operator": "=="},
    {"key": "usage", "value": "casual", "operator": "=="}
]

The AI agent decides to apply a filter and pulls casual shoes from the vector database

Here's the full agent output:

Added user message to memory: I'm looking for women's casual shoes
=== Calling Function ===
Calling function: create_metadata_filter with args: {"filter_string": "women's casual shoes"}
=== Function Output ===
[{"key": "gender", "value": "women", "operator": "=="}, {"key": "usage", "value": "casual", "operator": "=="}]
=== Calling Function ===
Calling function: search_footwear_database with args: {"query_str": "shoes", "filters_json": [{"key": "gender", "value": "women", "operator": "=="}, {"key": "usage", "value": "casual", "operator": "=="}]}
=== LLM Response ===
Here are some women's casual shoes you might like:

1. **Hm Women Brown Shoes**
- Type: Flats
- Brand: Hm
- Price: $155

2. **Gliders Women Brown Shoes**
- Type: Casual Shoes
- Brand: Gliders
- Price: $75

3. **Catwalk Women Red Shoes**
- Type: Casual Shoes
- Brand: Catwalk
- Price: $75

4. **Rocia Women Multi-Coloured Shoes**
- Type: Flats
- Brand: Rocia
- Price: $130

5. **Skechers Women Black Casual Shoes**
- Type: Casual Shoes
- Brand: Skechers
- Price: $185

6. **Reebok Women Black Casual Shoes**
- Type: Flats
- Brand: Reebok
- Price: $85

7. **Nike Women Flyclave Black Casual Shoes**
- Type: Casual Shoes
- Brand: Nike
- Price: $175

8. **Catwalk Women Turquoise Casual Shoes**
- Type: Casual Shoes
- Brand: Catwalk
- Price: $65

Let me know if you need more information or if there's anything else I can help with!

We suddenly change our mind:

agent_response = agent.chat("Actually, I need something more formal")

  • Agent updates filters: women's + formal (instead of casual)
  • It doesn't start over; it adjusts the existing context

The new JSON filter from the create_metadata_filter tool is:

[
    {"key": "gender", "value": "women", "operator": "=="},
    {"key": "usage", "value": "formal", "operator": "=="}
]

And here's the full thought process of the agent after the context shift:

Added user message to memory: Actually, I need something more formal
=== Calling Function ===
Calling function: create_metadata_filter with args: {"filter_string": "women's formal shoes"}
=== Function Output ===
[{"key": "gender", "value": "women", "operator": "=="}, {"key": "usage", "value": "formal", "operator": "=="}]
=== Calling Function ===
Calling function: search_footwear_database with args: {"query_str": "shoes", "filters_json": [{"key": "gender", "value": "women", "operator": "=="}, {"key": "usage", "value": "formal", "operator": "=="}]}

Here's the final reply from the agent:

=== LLM Response ===
Here are some options for women's formal shoes:

1. **Catwalk Women Corporate Leather Black Heels**
- Type: Heels
- Price: $155
- Brand: Catwalk

2. **Carlton London Women Black Heels**
- Type: Heels
- Price: $200
- Brand: Carlton London

3. **Carlton London Women Nude Heels**
- Type: Heels
- Price: $200
- Brand: Carlton London

If you have any specific preferences or need further assistance, feel free to let me know!

The AI agent doesn't forget the original focus on women's shoes

Additional Resources

For a deeper dive into this example, check out my GitHub repository where I break down the code and methodology used by the AI agent in today's scenario.