Hybrid search in practice
Let's reuse the customer inquiry, "I'm looking for women's jeans for a summer party".
All images/dataset used throughout this guide are from: Aggarwal, P. (2022). Fashion Product Images (Small). Available online: https://www.kaggle.com/datasets/paramaggarwal/fashion-product-images-dataset
When we queried our AI model using our optimized jeans descriptions, it returned these jeans recommendations:
[Image table: the jeans ranked 1 to 5 for each vectorization: text vectors, image vectors, and text+image vectors]
These recommendations are promising, with most jeans in light colors and all labeled as summer jeans.
Interpreting results
In the following sections, you'll see a slider and images representing the recommended jeans for each hybrid search ratio. Each slider starts at 0.5, which theoretically represents a 50/50 balance between keyword search and semantic search.
Values closer to 0 lean more towards keyword search, focusing on the actual search terms, while values closer to 1 lean more towards semantic search, considering the contextual meaning behind the inquiry.
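To make the ratio concrete, here is a minimal sketch of one common way to apply such a weight: scale the dense (semantic) vector by alpha and the sparse (keyword) vector by 1 - alpha before querying. The function name `weight_by_alpha`, the dictionary representation of the sparse vector, and all the numbers are illustrative assumptions, not necessarily how the sliders in this guide are implemented.

```python
def weight_by_alpha(dense, sparse, alpha):
    """Scale a dense vector by alpha and a sparse vector by (1 - alpha).

    alpha = 0 keeps only the keyword (sparse) signal,
    alpha = 1 keeps only the semantic (dense) signal.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [value * alpha for value in dense]
    scaled_sparse = {index: weight * (1.0 - alpha) for index, weight in sparse.items()}
    return scaled_dense, scaled_sparse


# Made-up query vectors, for illustration only.
dense_query = [0.2, 0.4, 0.6]       # hypothetical semantic embedding
sparse_query = {17: 1.0, 42: 2.0}   # hypothetical keyword weights keyed by term id

for alpha in (0.0, 0.5, 1.0):
    print(alpha, weight_by_alpha(dense_query, sparse_query, alpha))
```

At 0.5 both vectors are halved, so neither signal dominates; at the extremes one of the two vectors is zeroed out entirely.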
Let's examine whether these recommendations change when we introduce hybrid search.
Text vectors
Let's start by looking at the recommendations when the AI model only has access to product description text.
Use the slider to transition from full keyword search (0) to full semantic search (1).
Observe how the recommendations shift as you adjust the slider from 0 to 1.
What are keyword search and semantic search?
Keyword search:
Focuses on exact matches of the search terms within the text. It is precise but can miss the context.
Semantic search:
Understands the context and meaning behind the search terms, providing contextually relevant results even if they don't contain the exact keywords.
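The difference between the two can be sketched with toy scoring functions. Below, keyword search is approximated by counting shared words, and semantic search by cosine similarity between embeddings. The helper names and the three-dimensional embeddings are made up for illustration; real systems use learned sparse and dense vectors with far more dimensions.

```python
import math


def keyword_overlap(query, text):
    """Count the distinct query words that also appear in the text (exact match)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words)


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = "women's jeans for a summer party"

# Exact word matching: "party" and "jeans" both match the men's product,
# while "women's" does not match the word "Women".
print(keyword_overlap(query, "Lee Men Blue Party Jeans"))  # 2
print(keyword_overlap(query, "ONLY Women Peach Jeans"))    # 1

# Hypothetical embeddings: vectors pointing in similar directions
# are treated as similar in meaning, whatever words were used.
query_embedding = [0.9, 0.8, 0.1]
womens_summer_embedding = [0.8, 0.9, 0.2]
mens_party_embedding = [0.1, 0.3, 0.9]
print(cosine_similarity(query_embedding, womens_summer_embedding))
print(cosine_similarity(query_embedding, mens_party_embedding))
```

The keyword counts show why exact matching can favor the wrong product, while the similarity scores depend only on how close the (here invented) embeddings are.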
Here are the recommended jeans when only using text vectors with hybrid search:
Search Adjustment: 0.50
- Rank 1: Lee Men Blue Party Jeans
- Rank 2: Jealous 21 Women's Blue Jegging
- Rank 3: Peter England Men Blue Party Jeans
- Rank 4: Peter England Men Navy Blue Party Jeans
- Rank 5: Peter England Men Party Blue Jeans
With hybrid search, we no longer rely solely on dense vectors. Keyword search with sparse vectors rewards direct word matches, such as "party" in our case.
Some men's jeans include the word "party" in their descriptions:
[Table: product image and product description for Lee Men Blue Party Jeans]
So when balancing sparse and dense vectors equally, a few men's jeans may appear in the recommendations.
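The effect of the balance on ranking can be sketched by blending a keyword score and a semantic score per product. The scores below are invented for illustration (normalized to the range 0 to 1) and are not measurements from this guide's dataset; `hybrid_rank` is a hypothetical helper, and real engines may combine the two signals differently.

```python
def hybrid_rank(candidates, alpha):
    """Rank products by alpha * semantic + (1 - alpha) * keyword, best first.

    candidates: list of (name, keyword_score, semantic_score) tuples.
    """
    scored = [
        (alpha * semantic + (1.0 - alpha) * keyword, name)
        for name, keyword, semantic in candidates
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Invented scores: the men's jeans match the keyword "party" strongly,
# while the women's jeans match the meaning of the inquiry more closely.
candidates = [
    ("Lee Men Blue Party Jeans", 1.0, 0.55),
    ("ONLY Women Peach Jeans", 0.4, 0.90),
]

for alpha in (0.5, 0.8):
    print(alpha, hybrid_rank(candidates, alpha))
```

With these assumed scores, the men's jeans rank first at 0.5, and the women's jeans overtake them at 0.8, because the strong keyword match counts for less as alpha grows.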
Try moving the slider to the right to see how the recommendations change as we give more weight to dense vectors and contextual search.
Image vectors
Here are the recommended jeans when only using product image vectors with hybrid search:
Search Adjustment: 0.50
- Rank 1: Flying Machine Men Blue Slim Fit Mid-Rise Clean Look Jeans
- Rank 2: John Players Men Blue Slim Fit Low-Rise Clean Look Stretchable Jeans
- Rank 3: Lee Men Blue Party Jeans
- Rank 4: Jealous 21 Women's Blue Jegging
- Rank 5: Peter England Men Party Blue Jeans
Image + text vectors
Here are the recommended jeans when using both product image and text vectors with hybrid search:
Search Adjustment: 0.50
- Rank 1: Lee Men Blue Party Jeans
- Rank 2: Jealous 21 Women's Blue Jegging
- Rank 3: ONLY Women Peach Jeans
- Rank 4: Lee Women Mid Stone Blue Maxi Fit Jeans
- Rank 5: ONLY Women Blue Jeans
Compare all 3 models
This slider includes all three models: solely text vectors, solely image vectors, and the combination of text and image vectors:
Search Adjustment: 0.50
Text vectors:
- Rank 1: Lee Men Blue Party Jeans
- Rank 2: Jealous 21 Women's Blue Jegging
- Rank 3: Peter England Men Blue Party Jeans
- Rank 4: Peter England Men Navy Blue Party Jeans
- Rank 5: Peter England Men Party Blue Jeans
Image vectors:
- Rank 1: Flying Machine Men Blue Slim Fit Mid-Rise Clean Look Jeans
- Rank 2: John Players Men Blue Slim Fit Low-Rise Clean Look Stretchable Jeans
- Rank 3: Lee Men Blue Party Jeans
- Rank 4: Jealous 21 Women's Blue Jegging
- Rank 5: Peter England Men Party Blue Jeans
Text+image vectors:
- Rank 1: Lee Men Blue Party Jeans
- Rank 2: Jealous 21 Women's Blue Jegging
- Rank 3: ONLY Women Peach Jeans
- Rank 4: Lee Women Mid Stone Blue Maxi Fit Jeans
- Rank 5: ONLY Women Blue Jeans
Conclusions
When the slider is moved to the right, to around 0.7 or 0.8, all the recommendations shift to women's jeans, whereas at lower values some of them are men's jeans.
The probable reason is that some men's jeans have the word "party" in their descriptions, which gives them a direct keyword match with our customer inquiry, "I'm looking for women's jeans for a summer party".
Sliding the scale to the right gives more weight to semantic search, resulting in recommendations for light-colored women's jeans.
Moving forward, we'll use an alpha (scale) of 0.8, which leans more heavily towards dense/semantic search.
Here are our product recommendations for a hybrid search with 0.8 alpha, favoring semantic search over keyword search:
Search Adjustment: 0.80
- Rank 1: ONLY Women Peach Jeans
- Rank 2: Lee Women Mid Stone Blue Maxi Fit Jeans
- Rank 3: ONLY Women Blue Jeans
- Rank 4: ONLY Women Blue Jeans
- Rank 5: Lee Womens Blue Maxi Fit Jeans
Let's look at reranking in the next section.