This December, I'm highlighting the limitations of simple AI chatbots in online retail and demonstrating how AI agents enhance customer interactions.
In yesterday's issue, we saw how naive chatbots fail immediately due to lack of contextual reasoning. Today, we highlight a slightly different but equally challenging scenario: dealing with context shifts during the conversation.
A customer often changes their mind mid-conversation. A naive Retrieval Augmented Generation (RAG) chatbot may struggle to incorporate these shifts, while an AI agent will dynamically adapt, refining its recommendations based on updated customer input.
Source Code
For a deeper dive into this example, check out my GitHub repository for a detailed, step-by-step implementation in Jupyter Notebooks, breaking down the code and methodology used by the AI agent in today's scenario.
Introducing SoleMates
SoleMates is our fictional online shoe store we'll use to illustrate these concepts:
SoleMates is our fictional online shoe store
As with Day 1, we'll explore how the customer's changing needs confuse a naive chatbot, while an AI agent gracefully handles the shift.
Today's Challenge: Failure to Adapt After Context Shift
Scenario
A customer initiates a chat with SoleMates:
Customer: "I'm looking for women's casual shoes"
A customer initiates a chat with SoleMates and asks about casual women's shoes
The naive chatbot vectorizes the customer's query, retrieves matching products, and recommends a variety of women's casual shoes:
The naive chatbot correctly pulls casual women's shoes
Sudden context shift
However, the customer then shifts the context and suddenly says "Actually, I need something more formal":
The customer is suddenly looking for formal shoes
Naive Chatbot Response
When the customer updates their request, the naive RAG system processes it as a brand-new query without considering the earlier conversation.
Hence the chatbot replies with irrelevant formal shoes:
A naive RAG chatbot processes each user message independently and replies with irrelevant shoes
Here's what happens step-by-step:
1. First Query:
The customer says, "I'm looking for women's casual shoes"
The naive chatbot takes this exact sentence, turns it into a vector, and searches the database. It then shows women's casual shoes.
2. Second Query (Context Shift):
The customer then says, "Actually, I need something more formal"
Instead of remembering the customer wanted women's shoes, the chatbot treats this as a completely separate query.
It takes "Actually, I need something more formal", vectorizes it on its own, and searches again.
The results are indeed formal, but not necessarily women's shoes; in fact, it surfaces some formal men's shoes.
Why Did the Naive Chatbot Fail?
The naive chatbot doesn't carry over important details from the first message when handling the second one. Each time, it starts fresh:
- No Ongoing Memory: Doesn't connect "Actually, I need something more formal" to the earlier "women's casual shoes" request
- Independent Vectorization: Treats each message on its own, losing details like "women's" or "shoes"
- No Context Linking: Doesn't adjust its search to include both past and present requirements
Limitations Highlighted
- Lack of Conversation Memory: Forgets earlier information when the user's request changes
- Rigid Query Handling: Each message is processed as if it's the very first
- Inconsistent Results: The second answer doesn't build on the first, causing confusion and irrelevant product suggestions
By not linking the two messages, the chatbot forgets the earlier details and shows results that no longer fit the full picture: the customer actually wants women's formal shoes.
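To make the failure concrete, here is a minimal, self-contained sketch, not the actual chatbot code: each message is scored against a tiny hypothetical catalog on its own, with no memory of earlier turns, using simple keyword overlap as a stand-in for vector similarity.

```python
import re

# Toy catalog standing in for the vector database (hypothetical products).
CATALOG = [
    {"name": "Gliders Women Brown Shoes", "gender": "women", "usage": "casual"},
    {"name": "Catwalk Women Black Heels", "gender": "women", "usage": "formal"},
    {"name": "Arrow Men Black Formal Shoes", "gender": "men", "usage": "formal"},
]

def naive_search(message):
    """Score each product against this message alone -- no conversation memory."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    def score(p):
        return (p["usage"] in tokens) + (p["gender"] in tokens)
    best = max(score(p) for p in CATALOG)
    return [p["name"] for p in CATALOG if score(p) == best]

first = naive_search("I'm looking for women's casual shoes")
second = naive_search("Actually, I need something more formal")
# `second` matches on "formal" alone, so men's formal shoes slip in --
# the "women's" constraint from the first turn is lost.
```

Because the second message contains no gender information on its own, formal men's shoes score just as well as formal women's shoes.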
AI Agent Solution
The AI agent keeps track of the whole conversation.
When the customer changes their mind, the AI agent doesn't forget the original focus on women's shoes. Instead, it updates the search from "casual" to "formal" while still looking for women's footwear.
Agent's Interaction:
- Customer: "I'm looking for women's casual shoes"
- Agent: "Got it. Here are some options:"
- Customer: "Actually, I need something more formal"
- Agent: "Understood. Let's switch to women's formal shoes:"
The AI agent doesn't forget the original focus on women's shoes
The AI agent has a conversation memory and keeps track of the whole conversation with the customer.
How Did the AI Agent Succeed?
- Conversation Memory: Remembers the earlier detail - women's shoes - and updates only the "casual" part to "formal"
- Flexible Reasoning: Adapts the vectorized query without starting from zero each time
- Accurate Results: Finds women's formal shoes that match the new requirement
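As a rough sketch of the idea (not the agent's actual implementation), conversation memory can be thought of as a set of filter slots: a new message overwrites only the slots it mentions and leaves the rest intact.

```python
def update_filters(filters, message):
    """Overwrite only the slots the new message mentions (toy slot-filling)."""
    text = message.lower()
    if "women" in text:        # check "women" before "men", since
        filters["gender"] = "women"  # "women" contains "men" as a substring
    elif "men" in text:
        filters["gender"] = "men"
    if "casual" in text:
        filters["usage"] = "casual"
    elif "formal" in text:
        filters["usage"] = "formal"
    return filters

state = {}
update_filters(state, "I'm looking for women's casual shoes")
update_filters(state, "Actually, I need something more formal")
# state == {"gender": "women", "usage": "formal"} -- "women" carried over
```

The second message only touches the "usage" slot, so the gender constraint from the first turn survives the context shift.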
Key Takeaways
Naive Chatbot Limitation
- Doesn't remember earlier requests and treats new messages as unrelated searches
AI Agent Advantages
- Conversation Memory: Maintains a running understanding of the conversation
- Flexible Reasoning: Updates the search based on the latest input without losing previous details
- Accurate Results: Delivers results that stay relevant as the user's needs evolve
Conclusion
This example shows how a naive RAG chatbot fails when the user changes their mind mid-conversation. By not connecting the dots, it provides unhelpful results.
An AI agent, on the other hand, can smoothly adapt to the changing request, keeping track of what was said before and making sure the recommendations stay on target.
Stay tuned for tomorrow's issue, where we'll explore another challenge and see how AI agents handle it better than simple chatbots.
About This Series
In this series, we highlight common problems with naive RAG chatbots and show how AI agents solve them. By understanding these differences, developers and businesses can implement smarter, more helpful AI solutions for online retail.
Excited about building smarter AI agents? I'm launching a course soon where you'll learn to build and deploy your own AI agent chatbot. Sign up here!
Coming Up: Handling Requests for Specific Measurements
In tomorrow's issue, we'll explore how AI agents handle requests for specific measurements.
Behind the Scenes: Code Snippets
Here's a simplified illustration of how the AI agent processes the query.
We're giving the agent access to two tools:
- Vector database metadata filtering
- Vector database query
1. Vector database metadata filtering tool
Allows the agent to create filters based on available metadata (e.g., gender, product type, usage).
It informs the agent of what metadata is filterable.
def create_metadata_filter(filter_string):
    # Parses the filter string and returns a list of metadata filters,
    # e.g. [{"key": "gender", "value": "women", "operator": "=="},
    #       {"key": "usage", "value": "formal", "operator": "=="}]
    filters = parse_filters(filter_string)
    return filters
2. Vector database query
The agent selects what query to vectorize (e.g., "shoes") and uses the metadata filter to retrieve relevant products.
import boto3

def search_footwear_database(query_str, filters_json):
    # Embeds the query string and searches the vector database with filters
    embedded_query = embed_query_aws_titan(query_str)
    results = vector_db.search(embedded_query, filters=filters_json)
    return results
I use AWS Titan, a multimodal embedding model that converts both product texts and images into vectors, integrated as the function embed_query_aws_titan inside the AI agent tool search_footwear_database.
This means the AI agent can process a query like "red heels" and match it to not only product descriptions but also actual images of red heels in the database.
By combining text and image data, the model helps the AI agent provide more relevant and visually aligned recommendations based on the customer's input.
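The payoff of a shared text-image vector space can be illustrated with a toy example. The 3-d vectors below are made up purely for illustration (real Titan embeddings are high-dimensional); the point is that a text query and product images are compared with the same similarity measure.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend embeddings -- hypothetical 3-d stand-ins for Titan vectors.
query_vec = [0.9, 0.1, 0.0]                   # text query: "red heels"
image_vecs = {
    "red_heels.jpg":     [0.8, 0.2, 0.1],     # image embeddings live in
    "blue_sneakers.jpg": [0.1, 0.9, 0.3],     # the same vector space
}

best = max(image_vecs, key=lambda k: cosine(query_vec, image_vecs[k]))
# best == "red_heels.jpg" -- the text query matched an image directly
```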
Agent workflow
Construct agent worker with access to the two tools:
from llama_index.core.agent import FunctionCallingAgentWorker
agent_worker = FunctionCallingAgentWorker.from_tools(
[
create_metadata_filters_tool,
query_vector_database_tool,
])
Initialize the agent:
from llama_index.core.agent import AgentRunner
agent = AgentRunner(
agent_worker,
)
Initiate conversation with agent:
agent.chat("I'm looking for women's casual shoes")
The agent decides to apply filters (women's + casual) and receives this JSON from the create_metadata_filter tool:
[
{"key": "gender", "value": "women", "operator": "=="},
{"key": "usage", "value": "formal", "operator": "=="}
]
The AI agent decides to apply a filter and pulls casual shoes from the vector database
Here's the full agent output:
Added user message to memory: I'm looking for women's casual shoes
=== Calling Function ===
Calling function: create_metadata_filter with args: {"filter_string": "women's casual shoes"}
=== Function Output ===
[{"key": "gender", "value": "women", "operator": "=="}, {"key": "usage", "value": "casual", "operator": "=="}]
=== Calling Function ===
Calling function: search_footwear_database with args: {"query_str": "shoes", "filters_json": [{"key": "gender", "value": "women", "operator": "=="}, {"key": "usage", "value": "casual", "operator": "=="}]}
=== LLM Response ===
Here are some women's casual shoes you might like:
1. **Hm Women Brown Shoes**
- Type: Flats
- Brand: Hm
- Price: $155
2. **Gliders Women Brown Shoes**
- Type: Casual Shoes
- Brand: Gliders
- Price: $75
3. **Catwalk Women Red Shoes**
- Type: Casual Shoes
- Brand: Catwalk
- Price: $75
4. **Rocia Women Multi-Coloured Shoes**
- Type: Flats
- Brand: Rocia
- Price: $130
5. **Skechers Women Black Casual Shoes**
- Type: Casual Shoes
- Brand: Skechers
- Price: $185
6. **Reebok Women Black Casual Shoes**
- Type: Flats
- Brand: Reebok
- Price: $85
7. **Nike Women Flyclave Black Casual Shoes**
- Type: Casual Shoes
- Brand: Nike
- Price: $175
8. **Catwalk Women Turquoise Casual Shoes**
- Type: Casual Shoes
- Brand: Catwalk
- Price: $65
Let me know if you need more information or if there's anything else I can help with!
We suddenly change our mind:
agent_response = agent.chat("Actually, I need something more formal")
- The agent updates the filters: women's + formal (instead of casual)
- It doesn't start over; it adjusts the existing context
The new JSON filter from the create_metadata_filter tool is:
[
{"key": "gender", "value": "women", "operator": "=="},
{"key": "usage", "value": "formal", "operator": "=="}
]
And here's the full thought process of the agent after the context shift:
Added user message to memory: Actually, I need something more formal
=== Calling Function ===
Calling function: create_metadata_filter with args: {"filter_string": "women's formal shoes"}
=== Function Output ===
[{"key": "gender", "value": "women", "operator": "=="}, {"key": "usage", "value": "formal", "operator": "=="}]
=== Calling Function ===
Calling function: search_footwear_database with args: {"query_str": "shoes", "filters_json": [{"key": "gender", "value": "women", "operator": "=="}, {"key": "usage", "value": "formal", "operator": "=="}]}
Here's the final reply from the agent:
=== LLM Response ===
Here are some options for women's formal shoes:
1. **Catwalk Women Corporate Leather Black Heels**
- Type: Heels
- Price: $155
- Brand: Catwalk
2. **Carlton London Women Black Heels**
- Type: Heels
- Price: $200
- Brand: Carlton London
3. **Carlton London Women Nude Heels**
- Type: Heels
- Price: $200
- Brand: Carlton London
If you have any specific preferences or need further assistance, feel free to let me know!
The AI agent doesn't forget the original focus on women's shoes
Additional Resources
For a deeper dive into this example, check out my GitHub repository where I break down the code and methodology used by the AI agent in today's scenario.