Advent Calendar Day 1: How AI Agents Improve Naive Chatbots by Asking Clarifying Questions

December 6, 2024 · 9 min read

AI Consultant & Developer

This December, I'm highlighting the limitations of simple AI chatbots in online retail and demonstrating how AI agents enhance customer interactions.

Each day, we'll explore a common challenge faced by naive Retrieval Augmented Generation (RAG) chatbot systems and show how AI agents overcome it.

Today's topic is how AI agents improve naive chatbots by asking clarifying questions.

Source Code

For a deeper dive into this example, check out my GitHub repository for a detailed, step-by-step implementation in Jupyter Notebooks, breaking down the code and methodology used by the AI agent in today's scenario.

Introducing SoleMates

SoleMates is the fictional online shoe store we'll use to illustrate these concepts.

We'll examine interactions between customers and chatbots at SoleMates to demonstrate the practical differences between simple chatbots and advanced AI agents.

Today's Challenge: No Reflection - Simple Chatbots Can't Infer from Context

Scenario

A customer initiates a chat with SoleMates:

Customer: "I need shoes for a black-tie event"

A customer initiates a chat with SoleMates and asks about shoes for a black-tie event

Naive Chatbot Response

A simple chatbot processes the query based on direct keyword matching and semantic search by vectorizing the entire customer query:

Chatbot: "Hi there! Sure! How about these?"

And the chatbot goes ahead and shows a bunch of casual shoes:

A simple chatbot vectorizes the entire query and therefore pulls products associated with "black" and "shoes", without understanding the importance of "black-tie event."

Note: These are the recommended shoes retrieved from the vector database based on the customer query. By vectorizing the entire customer query, the chatbot pulls products associated with "black" and "shoes," without understanding the context of a "black-tie event."

Provogue men black shoes
Skechers women black casual shoes
Reebok women black casual shoes
Nike women flyclave black casual shoes

Why Did the Naive Chatbot Fail?

The naive chatbot lacks the ability to reflect on the context of the query.

It focuses on keywords like "shoes" and "black", missing that a "black-tie event" requires formal attire.

Naive RAG systems vectorize the entire customer query and therefore retrieve the nearest neighbors of "black shoes" from the vector database.
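To make the failure concrete, here is a minimal sketch of whole-query retrieval. It is not the chatbot's actual code: the embed function is a toy bag-of-words stand-in for a real embedding model (an assumption for illustration), and the product list mixes the casual and formal product names that appear in this post. A real dense embedding captures more meaning than word counts, so this exaggerates the effect, but the mechanism is the same: the words "black" and "shoes" dominate the match.

```python
import math
import re
from collections import Counter

# Product names taken from this post; the catalogue itself is a toy example.
PRODUCTS = [
    "Provogue men black shoes",
    "Skechers women black casual shoes",
    "Reebok women black casual shoes",
    "Nike women flyclave black casual shoes",
    "Arrow men formal black shoe",
    "Enroute men leather brown formal shoes",
]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def naive_search(query, products, top_k=4):
    """Vectorize the whole query and return the nearest product names."""
    query_vector = embed(query)
    ranked = sorted(
        products,
        key=lambda name: cosine(query_vector, embed(name)),
        reverse=True,
    )
    return ranked[:top_k]


top = naive_search("I need shoes for a black-tie event", PRODUCTS)
for name in top:
    print(name)
```

In this toy setup the four best matches are all products containing both "black" and "shoes", and neither formal shoe makes the cut, because nothing in the pipeline knows that "black-tie" implies "formal".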

Limitations Highlighted

No Reflection: Cannot infer that the occasion necessitates formal shoes
Lack of Context Understanding: Treats all keywords equally without prioritizing the event's significance
Vectorizes Entire Query: Unable to rewrite the query or extract key concepts, it is forced to vectorize the entire customer query, leading to irrelevant results

Explanation

By vectorizing the entire query, the chatbot pulls products associated with "black" and "shoes", without understanding the importance of "black-tie event".

It lacks the reasoning capabilities to infer that the customer needs formal shoes suitable for such an event.

AI Agent Solution

An AI agent approaches the query by interpreting the context and inferring the customer's true needs.

Agent's Interaction:

Customer: "I need shoes for a black-tie event"
Agent: "To help you find the perfect shoes for a black-tie event, could you please specify any preferences you might have? For example, are you looking for men's or women's shoes, a specific color, or any particular style?"
Customer: "Men's shoes for a black-tie event."
Agent: "Here are some options for men's shoes suitable for a black-tie event: ..."

The AI agent proactively asks the customer for more information to ensure accurate recommendations

How Did the AI Agent Succeed?

The AI agent performs several intelligent steps:

Contextual Understanding: Recognizes that "black-tie event" is a formal occasion requiring appropriate attire
Clarification: Asks the customer for additional information to refine the search (e.g., gender, specific preferences)
Inference: Determines that the customer needs men's formal shoes suitable for a black-tie event
Query Refinement: Adjusts the search parameters to focus on formal men's shoes, rather than relying solely on semantic search and keyword matching
Relevant Recommendations: Suggests products that match the event's dress code and the customer's clarified preferences
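The steps above can be sketched as a small decision function. In the real agent the LLM makes this decision; the version below is a rule-based stand-in, and the names extract_slots, next_action, OCCASION_TO_USAGE and GENDER_TERMS are hypothetical helpers introduced only for this illustration. The idea is that the agent searches only once it knows every piece of metadata it needs, and asks otherwise.

```python
import re

# Hypothetical lookup tables; a real agent would rely on the LLM instead.
OCCASION_TO_USAGE = {"black-tie": "formal"}
GENDER_TERMS = {
    "men": "men", "men's": "men", "mens": "men",
    "women": "women", "women's": "women", "womens": "women",
}
REQUIRED_SLOTS = ("gender", "usage")


def extract_slots(message):
    """Pull the metadata we can recognise out of a customer message."""
    text = message.lower()
    slots = {}
    # Match whole words so that "women" is never mistaken for "men".
    for word in re.findall(r"[a-z']+", text):
        if word in GENDER_TERMS:
            slots["gender"] = GENDER_TERMS[word]
    for occasion, usage in OCCASION_TO_USAGE.items():
        if occasion in text:
            slots["usage"] = usage
    return slots


def next_action(slots):
    """Ask for whatever is missing, otherwise build the search filters."""
    missing = [key for key in REQUIRED_SLOTS if key not in slots]
    if missing:
        return ("ask", missing)
    filters = [
        {"key": key, "value": slots[key], "operator": "=="}
        for key in REQUIRED_SLOTS
    ]
    return ("search", filters)


first = next_action(extract_slots("I need shoes for a black-tie event"))
second = next_action(extract_slots("Men's shoes for a black-tie event"))
print(first)
print(second)
```

The first message yields an "ask" action because the gender is unknown; the second yields a "search" action with the same filter structure shown later in this post.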

Key Takeaways

Naive Chatbot Limitation

Fails to reflect on context and cannot infer the customer's true needs, leading to irrelevant suggestions.

AI Agent Advantages

Contextual Understanding: Interprets the significance of "black-tie event" as a formal occasion
Reasoning Ability: Infers that formal shoes are appropriate, even though the customer never explicitly asks for "formal" shoes
Interactive Clarification: Proactively asks the customer for more information to ensure accurate recommendations
Query Optimization: Rather than vectorizing the entire query, the agent extracts key elements to refine the search effectively

Conclusion

This example illustrates why simple chatbots often fall short when handling customer inquiries that require contextual understanding and reasoning.

AI agents, with their ability to interpret context, ask clarifying questions, and refine queries, are essential for providing accurate and helpful responses in modern e-commerce settings.

Stay tuned for the next issue, where we'll explore another limitation of naive RAG systems and how AI agents address it!

About This Series

In this advent calendar, we're exploring the practical benefits of AI agents over simple chatbots in e-commerce settings. By understanding these differences, developers and businesses can implement more effective AI solutions.

Coming Up: Understanding Context Shifts

In tomorrow's issue, we'll explore how AI agents improve naive chatbots by understanding context shifts.

Behind the Scenes: Code Snippets

Here's a simplified illustration of how the AI agent processes the query.

We're giving the agent access to two tools:

Vector database metadata filtering
Vector database query

1. Vector database metadata filtering tool

Allows the agent to create filters based on available metadata (e.g., gender, product type, usage).

It informs the agent of what metadata is filterable.

def create_metadata_filter(filter_string):
    """Turn a filter description into a list of metadata filters."""
    # parse_filters is not shown in this simplified snippet
    filters = parse_filters(filter_string)

    return filters  # Example: [{'key': 'gender', 'value': 'men', 'operator': '=='}]

2. Vector database query

The agent selects what query to vectorize (e.g., "shoes") and uses the metadata filter to retrieve relevant products.

import boto3  # AWS SDK for Python

def search_footwear_database(query_str, filters_json):
    """Embed the query string and search the vector database with metadata filters."""
    # embed_query_aws_titan and vector_db are not shown in this simplified snippet
    embedded_query = embed_query_aws_titan(query_str)
    results = vector_db.search(embedded_query, filters=filters_json)

    return results
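To show what a filtered vector search does conceptually, here is a self-contained sketch. It is an in-memory stand-in, not the vector database used in this post: the search function, the TOY_INDEX data, its two-dimensional vectors and its metadata values are all assumptions made for illustration. The filter format matches the one returned by create_metadata_filter.

```python
import math
import operator

# Maps the operator strings used in the filters to Python comparisons.
OPERATORS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}

# Toy index: made-up 2-d vectors and illustrative metadata.
TOY_INDEX = [
    {"vector": [1.0, 0.0], "metadata": {"text": "Skechers women black casual shoes", "gender": "women", "usage": "casual"}},
    {"vector": [0.95, 0.05], "metadata": {"text": "Provogue men black shoes", "gender": "men", "usage": "casual"}},
    {"vector": [0.9, 0.1], "metadata": {"text": "Arrow men formal black shoe", "gender": "men", "usage": "formal"}},
    {"vector": [0.8, 0.3], "metadata": {"text": "Lee cooper men black shoe", "gender": "men", "usage": "formal"}},
]


def matches(metadata, filters):
    """Return True only if the metadata satisfies every filter."""
    for f in filters:
        if f["key"] not in metadata:
            return False
        if not OPERATORS[f["operator"]](metadata[f["key"]], f["value"]):
            return False
    return True


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(index, query_vector, filters, top_k=3):
    """Filter by metadata first, then rank the survivors by similarity."""
    candidates = [item for item in index if matches(item["metadata"], filters)]
    ranked = sorted(
        candidates,
        key=lambda item: cosine(item["vector"], query_vector),
        reverse=True,
    )
    return [item["metadata"]["text"] for item in ranked[:top_k]]


filters = [
    {"key": "gender", "value": "men", "operator": "=="},
    {"key": "usage", "value": "formal", "operator": "=="},
]
unfiltered = search(TOY_INDEX, [1.0, 0.0], filters=[])
filtered = search(TOY_INDEX, [1.0, 0.0], filters=filters)
print(unfiltered)
print(filtered)
```

Without filters, the casual shoes rank highest because their vectors are closest to the query. With the filters applied, they never enter the ranking, which is why the agent can safely vectorize a short query such as "shoes".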

AWS Titan Multimodal Embeddings in Action

I use Amazon Titan Multimodal Embeddings, a model that converts both product text and images into vectors. It is wrapped in the function embed_query_aws_titan, which the AI agent tool search_footwear_database calls.

This means the AI agent can process a query like "red heels" and match it to not only product descriptions but also actual images of red heels in the database.

By combining text and image data, the model helps the AI agent provide more relevant and visually aligned recommendations based on the customer's input.

Agent workflow

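Construct the agent worker with access to the two tools:

from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.tools import FunctionTool

# Assumed wrapping of the two functions above; see the repository for the exact definitions
create_metadata_filters_tool = FunctionTool.from_defaults(fn=create_metadata_filter)
query_vector_database_tool = FunctionTool.from_defaults(fn=search_footwear_database)

agent_worker = FunctionCallingAgentWorker.from_tools(
    [
        create_metadata_filters_tool,
        query_vector_database_tool,
    ])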

Initialize the agent:

from llama_index.core.agent import AgentRunner

agent = AgentRunner(
    agent_worker,
)

Start a conversation with the agent:

agent.chat("I need shoes for a black-tie event")

As seen in the illustration, the agent asks for more information:

The AI agent proactively asks the customer for more information to ensure accurate recommendations

Here's the full clarifying question the agent asked the customer:

Added user message to memory: I need shoes for a black-tie event
=== LLM Response ===
To help you find the perfect shoes for a black-tie event, could you please specify any preferences you might have? For example, are you looking for men's or women's shoes, a specific color, or any particular style? This will help me narrow down the options for you.

We provide additional information:

agent_response = agent.chat("Men's shoes for a black-tie event")

And here's the full thought process of the agent after receiving more information:

Added user message to memory: Men's shoes for a black-tie event
=== Calling Function ===
Calling function: create_metadata_filter with args: {"filter_string": "men's black-tie shoes"}

=== Function Output ===
[{"key": "gender", "value": "men", "operator": "=="}, {"key": "usage", "value": "formal", "operator": "=="}]

=== Calling Function ===
Calling function: search_footwear_database with args: {"query_str": "shoes", "filters_json": [{"key": "gender", "value": "men", "operator": "=="}, {"key": "usage", "value": "formal", "operator": "=="}]}

=== Function Output ===
[{'color': 'brown', 'text': 'Enroute men leather brown formal shoes', 'gender': 'men', 'product_type': 'formal shoes', 'usage': 'formal', 'price': 70, 'similarity_score': 1.7008}, {'color': 'brown', 'text': 'Enroute men leather brown formal shoes', 'gender': 'men', 'product_type': 'formal shoes', 'usage': 'formal', 'price': 70, 'similarity_score': 1.69490409}, {'color': 'black', 'text': 'Arrow men formal black shoe', 'gender': 'men', 'product_type': 'formal shoes', 'usage': 'formal', 'price': 180, 'similarity_score': 1.18878198}, {'color': 'black', 'text': 'Lee cooper men black shoe', 'gender': 'men', 'product_type': 'formal shoes', 'usage': 'formal', 'price': 155, 'similarity_score': 1.18653321}]

Here's the final reply from the agent:

=== LLM Response ===
Here are some options for men's shoes suitable for a black-tie event:

1. Arrow Men Formal Black Shoe
   - Color: Black
   - Price: $180

2. Lee Cooper Men Black Shoe
   - Color: Black
   - Price: $155

3. Enroute Men Leather Brown Formal Shoes
   - Color: Brown
   - Price: $70

These options are perfect for formal occasions.
Let me know if you need more information or if you'd like to explore other options!
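The raw function output above contains the same Enroute product twice, while the final reply lists it only once. In this run the LLM handled that when writing its answer. If you would rather not depend on the model for this, one option is to deduplicate the results in code before they reach the LLM. The sketch below assumes the product text is a good enough identity key, and unique_products is a hypothetical helper, not part of the original implementation.

```python
def unique_products(results):
    """Drop repeated products, keeping the first occurrence of each text."""
    seen = set()
    unique = []
    for product in results:
        if product["text"] not in seen:
            seen.add(product["text"])
            unique.append(product)
    return unique


# Trimmed copy of the function output shown in this post.
results = [
    {"text": "Enroute men leather brown formal shoes", "color": "brown", "price": 70},
    {"text": "Enroute men leather brown formal shoes", "color": "brown", "price": 70},
    {"text": "Arrow men formal black shoe", "color": "black", "price": 180},
    {"text": "Lee cooper men black shoe", "color": "black", "price": 155},
]

names = [product["text"] for product in unique_products(results)]
print(names)
```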

Additional Resources

For a deeper dive into this example, check out my GitHub repository where I break down the code and methodology used by the AI agent in today's scenario.
