Distance functions are mathematical formulas used to measure the similarity or dissimilarity between vectors. Common examples include Manhattan distance, Euclidean distance, cosine similarity, and dot product. These measurements are crucial for determining how closely related two pieces of data are.
To learn more about vectors, see vector search.
Manhattan distance
Manhattan distance measures the distance between two points by adding up the absolute differences between their coordinates. Imagine walking through a city laid out on a grid, like much of Manhattan: the distance is the total number of blocks you walk north-south and east-west.
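The grid-walking idea can be sketched in a few lines of Python. This is a minimal illustration that assumes vectors are plain lists of numbers; the name `manhattan_distance` is hypothetical and isn't part of any Azure Cosmos DB API.

```python
def manhattan_distance(a, b):
    """Sum of absolute coordinate differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(abs(x - y) for x, y in zip(a, b))


# Walking from (1, 2) to (4, 6): 3 blocks along one axis plus 4 along the other.
print(manhattan_distance([1, 2], [4, 6]))  # 7
```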
Euclidean distance
Euclidean distance measures the straight-line distance between two points. It's named after the ancient mathematician Euclid, who is often referred to as the "father of geometry."
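As a rough sketch, the straight-line distance is the square root of the sum of squared coordinate differences. The example below assumes vectors are plain Python lists; the name `euclidean_distance` is illustrative only.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The same two points used for a grid walk of 7 blocks are 5 units apart
# in a straight line (a 3-4-5 right triangle).
print(euclidean_distance([1, 2], [4, 6]))  # 5.0
```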
Cosine similarity
Cosine similarity measures the cosine of the angle between two vectors in a multidimensional space, so it depends on the vectors' directions rather than their magnitudes. Two documents might be far apart by Euclidean distance because they differ in size, yet still have a small angle between them and therefore a high cosine similarity.
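The following sketch illustrates why document size doesn't affect cosine similarity. It assumes vectors are plain Python lists, and the names `cosine_similarity`, `short_doc`, and `long_doc` are made up for illustration; the toy vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, nonzero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical vectors: long_doc points the same way as short_doc
# but is ten times larger in every dimension.
short_doc = [1, 2, 3]
long_doc = [10, 20, 30]

# The angle between them is zero, so the similarity is approximately 1.0
# even though the two points are far apart in Euclidean terms.
print(cosine_similarity(short_doc, long_doc))

# Perpendicular vectors have a similarity of 0.
print(cosine_similarity([1, 0], [0, 1]))  # 0.0
```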
Dot product
The dot product multiplies two vectors component by component and sums the results, returning a single number. Its value equals the product of the two vectors' magnitudes and the cosine of the angle between them, so it shows how much one vector points in the direction of the other.
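A minimal Python sketch of the dot product, assuming vectors are plain lists of numbers. The name `dot_product` is illustrative and isn't taken from any specific library.

```python
def dot_product(a, b):
    """Sum of the pairwise products of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


# 1*4 + 2*5 + 3*6
print(dot_product([1, 2, 3], [4, 5, 6]))  # 32

# Perpendicular vectors share no direction, so the result is 0.
print(dot_product([1, 0], [0, 1]))  # 0
```

Unlike cosine similarity, the result grows with the vectors' magnitudes: doubling either input doubles the dot product.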
Related content
- VectorDistance system function in Azure Cosmos DB NoSQL
- What is a vector database?
- Retrieval-augmented generation (RAG)
- Vector search in Azure Cosmos DB
- Vector search in Azure Cosmos DB for NoSQL
- Vector store in Azure Cosmos DB for MongoDB vCore
- Large language model tokens
- Vector embeddings in Azure Cosmos DB
- kNN vs ANN vector search algorithms