How to measure the similarity between two strings with Dart
Levenshtein distance
In Dart, strings are mainly used to represent text. We can compare two strings using the == operator. The expression string1 == string2 evaluates to true if string2 is a String with the same sequence of code units as string1, and to false otherwise.
We can also use the compareTo method: string1.compareTo(string2);
This method compares string1 to string2. It returns a negative value if string1 is ordered before string2, a positive value if string1 is ordered after string2, or zero if string1 and string2 are equal.
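To make the two comparisons concrete, here is a minimal sketch. The sample values 'apple' and 'banana' are illustrative assumptions, not part of the original text. Because the documentation only promises the sign of the compareTo result, the sketch checks the sign rather than an exact value.

```dart
void main() {
  final String string1 = 'apple';
  final String string2 = 'apple';
  final String string3 = 'banana';

  // == is true only when both strings hold the same sequence of code units.
  print(string1 == string2); // true
  print(string1 == string3); // false

  // compareTo reports ordering through the sign of its result.
  print(string1.compareTo(string3) < 0); // true: 'apple' is ordered before 'banana'
  print(string3.compareTo(string1) > 0); // true: 'banana' is ordered after 'apple'
  print(string1.compareTo(string2) == 0); // true: the strings are equal
}
```

Both operations only tell us whether two strings are identical or how they are ordered. Neither says how close two different strings are.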
These two operations are certainly useful. However, when you type a query into a search engine or a database, you often make spelling mistakes. An effective program should at least help us notice the error and correct it.
Several distances exist to measure the similarity between two strings. The best known for strings of the same length is the Hamming distance. It measures the minimum number of substitutions needed to change one string into the other, or equivalently the minimum number of errors that could have transformed one string into the other. It is named after the American mathematician Richard Hamming.
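A Dart implementation of the Hamming distance
int hamming_distance(String string1, String string2) {
  if (string1.length != string2.length) {
    throw ArgumentError('Both strings must have the same length.');
  }
  int dist_counter = 0;
  for (int n = 0; n < string1.length; n++) {
    if (string1[n] != string2[n]) dist_counter += 1; // count mismatched positions
  }
  return dist_counter;
}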
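Indexing a Dart string with [] compares UTF-16 code units, so a character outside the Basic Multilingual Plane (an emoji, for instance) occupies two positions. As a hedged sketch, the variant below compares Unicode code points through the runes property instead. The name hammingDistanceByRunes and the sample strings are hypothetical and do not come from the original article.

```dart
/// Hypothetical variant: counts the positions at which the two strings
/// differ, comparing Unicode code points rather than UTF-16 code units.
int hammingDistanceByRunes(String string1, String string2) {
  final List<int> runes1 = string1.runes.toList();
  final List<int> runes2 = string2.runes.toList();

  // The Hamming distance is only defined for sequences of equal length.
  if (runes1.length != runes2.length) {
    throw ArgumentError('Both strings must have the same number of code points.');
  }

  int distance = 0;
  for (int n = 0; n < runes1.length; n++) {
    if (runes1[n] != runes2[n]) distance += 1;
  }
  return distance;
}

void main() {
  // Positions 2, 3 and 4 differ: r/t, o/h, l/r.
  print(hammingDistanceByRunes('karolin', 'kathrin')); // 3

  // Each emoji is a single code point, so only one position differs.
  print(hammingDistanceByRunes('a😀c', 'a🤔c')); // 1
}
```

For plain ASCII text, comparing code units and comparing code points give the same result, so the choice only matters when the input may contain such characters.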
Another long-established, well-known, and effective measure of similarity, one that works for two strings of any length, is the Levenshtein distance.
The Levenshtein distance between two strings is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other.
For example, the Levenshtein distance between “adil” and “amily” is 2, since the following two edits are required to change “adil” into “amily”, and no single edit can do it:
- adil → amil (substitution of “m” for “d”),
- amil → amily (insertion of “y”).
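The definition above can be turned into code with the classic dynamic-programming approach (often called the Wagner-Fischer algorithm). What follows is only a minimal sketch under stated assumptions: the function name levenshteinSketch is hypothetical, the code compares UTF-16 code units, and it keeps two rows of the table rather than the full matrix.

```dart
import 'dart:math' as math;

/// Hypothetical helper: Levenshtein distance between [source] and [target],
/// computed with a two-row dynamic-programming table.
int levenshteinSketch(String source, String target) {
  // Turning an empty string into the other takes one insertion per code unit.
  if (source.isEmpty) return target.length;
  if (target.isEmpty) return source.length;

  // previous[j] holds the distance between the first i - 1 code units of
  // source and the first j code units of target.
  List<int> previous = List<int>.generate(target.length + 1, (j) => j);
  List<int> current = List<int>.filled(target.length + 1, 0);

  for (int i = 1; i <= source.length; i++) {
    current[0] = i; // i deletions turn the source prefix into ''
    for (int j = 1; j <= target.length; j++) {
      final int cost = source[i - 1] == target[j - 1] ? 0 : 1;
      current[j] = math.min(
        math.min(previous[j] + 1, current[j - 1] + 1), // deletion, insertion
        previous[j - 1] + cost, // substitution, or a free match
      );
    }
    // Reuse the two rows: the row just filled becomes the previous one.
    final List<int> swap = previous;
    previous = current;
    current = swap;
  }
  return previous[target.length];
}

void main() {
  print(levenshteinSketch('adil', 'amily')); // 2
}
```

Keeping two rows brings memory use down to the length of the second string, at the cost of not being able to recover the sequence of edits afterwards.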