Vectorizing data
Before we dive deeper into enhancing our online jeans store, it's crucial to understand why and how we vectorize product data.
The images and dataset used throughout this guide are from: Aggarwal, P. (2022). Fashion Product Images (Small). Available online: https://www.kaggle.com/datasets/paramaggarwal/fashion-product-images-dataset
Vectorization, or creating embeddings, is the process of converting product data, like product descriptions and images, into a numerical format that AI models can process and compare. This involves transforming the data into vectors, which are ordered lists of numbers:
Embeddings are mathematical representations of data
Why Vectorize Jeans Data?
In any online store, the variety and specifics of products like jeans - different colors, styles, and materials - need to be searchable in a way that matches customer inquiries with the most relevant products:
The customer inquiry needs to be vectorized to be matched with the most relevant products
Traditionally, systems might rely on simple keyword matches, which can miss nuances in customer preferences or product descriptions.
Vectorizing Jeans
The illustration below shows how an embedding model converts items like jeans and a customer inquiry into numerical form. Each item is represented by a dense 300-dimensional vector, a compact array of real numbers whose values together capture the item's characteristics:
Mathematical representations of jeans and a customer inquiry
Visualizing Jeans in Vector Space
In our vector space, each point represents a unique pair of jeans. The closer two points are, the more similar the jeans are in characteristics such as fit, color, and style.
In the illustration, you can see clusters of jeans:
Clusters of jeans in vector space
Light blue jeans form one cluster, indicating their similarity to each other, while being distinct from clusters of dark blue and grey jeans. The vector space also reveals the relationships between different styles - notice how slim-fit jeans are positioned relative to boot-cut ones, reflecting their shared attributes and differences.
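To make the idea of clusters concrete, here is a minimal sketch that measures how far apart products sit in a toy vector space. The three-dimensional vectors and their axis labels (lightness, blueness, slimness) are hand-made assumptions for illustration only; a real embedding model produces hundreds of learned dimensions that do not carry such labels.

```python
import math

# Hypothetical hand-made vectors: (lightness, blueness, slimness).
# These values are invented for illustration, not produced by a model.
jeans = {
    "light blue slim-fit": (0.9, 0.8, 0.6),
    "light blue boot-cut": (0.9, 0.8, 0.4),
    "dark blue slim-fit": (0.2, 0.9, 0.6),
    "dark blue boot-cut": (0.2, 0.9, 0.4),
    "grey slim-fit": (0.5, 0.1, 0.6),
}


def distance(name_a, name_b):
    """Euclidean distance between two products in the toy vector space."""
    return math.dist(jeans[name_a], jeans[name_b])


# Two products of the same color versus two products of different colors.
same_cluster = distance("light blue slim-fit", "light blue boot-cut")
other_cluster = distance("light blue slim-fit", "dark blue slim-fit")

print(f"same color cluster:      {same_cluster:.2f}")
print(f"different color cluster: {other_cluster:.2f}")
```

In this toy data, the two light blue pairs end up closer to each other than a light blue pair is to a dark blue pair, which is the property a cluster describes.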
This organized layout in vector space is not just a theoretical concept; it's a practical tool that our AI model uses to identify and recommend products. When a customer searches for light blue slim-fit jeans, the system can easily locate this cluster and suggest closely related options:
Product recommendations mapped in vector space
It can also show alternatives from nearby clusters, perhaps a pair of grey slim-fit jeans that the customer may also like, thus broadening their choices without straying too far from their original intent:
Exploring related options in vector space clusters
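The lookup described above can be sketched as a nearest-neighbor search. This is a simplified illustration, not the guide's actual implementation: the five-dimensional vectors, the `catalog` and `query` values, and the `search` helper are all hypothetical, and cosine similarity is used here as one common way to compare vectors.

```python
import math

# Hypothetical toy vectors with axes:
# (light blue, dark blue, grey, slim-fit, boot-cut).
# A real embedding model would produce these vectors from text or images.
catalog = {
    "light blue slim-fit": [1, 0, 0, 1, 0],
    "light blue boot-cut": [1, 0, 0, 0, 1],
    "dark blue slim-fit": [0, 1, 0, 1, 0],
    "grey slim-fit": [0, 0, 1, 1, 0],
    "grey boot-cut": [0, 0, 1, 0, 1],
}

# Assumed vector for the inquiry "light blue slim-fit jeans".
query = [0.9, 0.1, 0.0, 0.8, 0.1]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, products, top_k=3):
    """Return the top_k products most similar to the query vector."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in products.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


results = search(query, catalog)
for name, score in results:
    print(f"{score:.2f}  {name}")
```

The best match comes from the light blue slim-fit cluster. Raising `top_k` pulls in options from neighboring clusters, such as grey slim-fit jeans, before unrelated items like grey boot-cut jeans.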
The Limitations of Keyword Matching
Keyword matches can fail or underperform in several scenarios:
The limitations of keyword matching
Relying only on keyword matches can lead to poor customer experiences. Here's why:
1. Typos
Even a small typo can derail a search. When "genes" is typed instead of "jeans," a keyword match might return irrelevant products or no results at all.
2. Synonyms
Different words for the same item, like "denims" for "jeans," might not be recognized by a strict keyword-matching system, narrowing the search results.
3. Context
A search for "navy jeans" could return military-style clothing if the system doesn't recognize that "navy" is a color in this context.
4. Slang
Fashion terms evolve, and what's known as "skinnies" might not be matched if the system only knows "skinny jeans."
5. Descriptive Searches
A request for "jeans comfortable for a long flight" describes a specific use case, which keyword search cannot interpret because no product is likely to contain those exact words.
6. Trends
Keywords like "80s retro jeans" imply a style that might be lost on a simple search algorithm not tuned to fashion trends.
Each of these examples shows the common problems of basic keyword-matching systems. They highlight the need for a smarter approach that can understand and process the nuances of human language and intent in retail.
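A few of the failure cases above can be reproduced with a minimal sketch of exact keyword matching. The `catalog` entries and the `keyword_search` function are hypothetical and deliberately naive; production search engines add stemming, spell correction, and synonym lists, but the underlying weakness is the same.

```python
# Hypothetical product titles for illustration.
catalog = [
    "light blue slim-fit jeans",
    "dark blue boot-cut jeans",
    "grey skinny jeans",
]


def keyword_search(query, products):
    """Return products whose title contains every word of the query exactly."""
    words = query.lower().split()
    return [
        product
        for product in products
        if all(word in product.lower().split() for word in words)
    ]


# One correct query, then a typo, a synonym, and a slang term.
for query in ["jeans", "genes", "denims", "skinnies"]:
    matches = keyword_search(query, catalog)
    print(f"{query!r}: {len(matches)} match(es)")
```

Only the exact word "jeans" finds any products. The typo, the synonym, and the slang term all return nothing, even though a shopper typing them clearly wants the same items.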
Overcoming Keyword Limitations with Vectorization
By vectorizing data, we can overcome these limitations. In a multi-dimensional vector space, a product is not an isolated set of keywords but a point positioned relative to every other product and to the customer's inquiry, so closeness can capture subtleties of meaning and use case that exact word matches miss.
This is why our online jeans store will use the power of vectorization for an improved and intuitive shopping experience.
Let's vectorize our jeans in the next section.