COSINE_DISTANCE()
Function Description
The COSINE_DISTANCE() function calculates the cosine distance between two vectors.
Cosine distance measures the difference in direction between two vectors and is usually defined as 1 minus the cosine similarity. Its value ranges from 0 to 2: 0 means the two vectors point in exactly the same direction (minimum distance), and 2 means they point in exactly opposite directions (maximum distance). In text analysis, cosine distance can be used to measure how similar documents are. Because it depends only on the direction of the vectors and not on their length, it compares long and short texts fairly.
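Written out, the definition above takes the following form. The symbols $a$, $b$, and $n$ are introduced here for illustration only: $a$ and $b$ stand for the two input vectors and $n$ for their dimension.

```latex
\[
\operatorname{cosine\_distance}(a, b)
  = 1 - \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
  = 1 - \frac{\sum_{i=1}^{n} a_i b_i}
             {\sqrt{\sum_{i=1}^{n} a_i^{2}} \; \sqrt{\sum_{i=1}^{n} b_i^{2}}}
\]
% Worked case: a = [1, 2, 3], b = [4, 5, 6]
\[
a \cdot b = 32, \qquad
\lVert a \rVert = \sqrt{14}, \qquad
\lVert b \rVert = \sqrt{77}, \qquad
1 - \frac{32}{\sqrt{14}\,\sqrt{77}} \approx 0.02537
\]
```

Because the fraction is the cosine of the angle between the vectors, it lies between -1 and 1, which is why the distance lies between 0 and 2.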
Function Syntax
> SELECT COSINE_DISTANCE(vector1, vector2) FROM tbl;
Example
drop table if exists vec_table;
create table vec_table(a int, b vecf32(3), c vecf64(3));
insert into vec_table values(1, "[1,2,3]", "[4,5,6]");
mysql> select * from vec_table;
+------+-----------+-----------+
| a    | b         | c         |
+------+-----------+-----------+
|    1 | [1, 2, 3] | [4, 5, 6] |
+------+-----------+-----------+
1 row in set (0.01 sec)
mysql> select cosine_distance(b,c) from vec_table;
+-----------------------+
| cosine_distance(b, c) |
+-----------------------+
|    0.0253681538029239 |
+-----------------------+
1 row in set (0.00 sec)
mysql> select cosine_distance(b,"[1,2,3]") from vec_table;
+-----------------------------+
| cosine_distance(b, [1,2,3]) |
+-----------------------------+
|                           0 |
+-----------------------------+
1 row in set (0.00 sec)
mysql> select cosine_distance(b,"[-1,-2,-3]") from vec_table;
+--------------------------------+
| cosine_distance(b, [-1,-2,-3]) |
+--------------------------------+
|                              2 |
+--------------------------------+
1 row in set (0.00 sec)
Limitations
When using the COSINE_DISTANCE() function, neither input vector may be a zero vector, because this results in a division by zero, which is mathematically undefined. In practical applications, the cosine similarity between a zero vector and any other vector is generally treated as 0, because a zero vector has no direction to compare against.
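To make the zero-vector restriction concrete, here is a minimal Python sketch of the same calculation outside the database. It is an illustration only, not the database's implementation: the function name `cosine_distance`, the choice to raise `ValueError`, and the dimension check are all assumptions made for this example.

```python
import math


def cosine_distance(v1, v2):
    """Return 1 minus the cosine similarity of two equal-length vectors.

    Hypothetical helper for illustration; raises ValueError when the
    result is undefined (a zero vector) or the lengths differ.
    """
    if len(v1) != len(v2):
        # Assumption for this sketch: vectors must share a dimension.
        raise ValueError("vectors must have the same dimension")

    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(y * y for y in v2))

    if norm1 == 0 or norm2 == 0:
        # A zero vector would put 0 in the denominator below.
        raise ValueError("cosine distance is undefined for a zero vector")

    return 1 - dot / (norm1 * norm2)


if __name__ == "__main__":
    print(cosine_distance([1, 2, 3], [4, 5, 6]))  # approximately 0.02537
    try:
        cosine_distance([0, 0, 0], [1, 2, 3])
    except ValueError as err:
        print("rejected:", err)
```

Checking the norms before dividing means a zero vector is reported explicitly instead of surfacing as a division error. A caller that prefers the convention of treating the similarity as 0 could return a distance of 1 in that branch instead.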