Estimate distance between two strings by Levenshtein Distance

In some cases you are required to find the distance between two strings, so here is a note for that.

The edit distance – Levenshtein distance, is a numeric distance that shows how much two strings are different from each other.
Wikipedia has a good explanation.

LevenshteinDistance
LevenshteinDistance

It is determined by how many insertions, deletions, and substitutions of a single character must be made in one string to equal the other string.

Below is implementation by Python

def Ldistance(x, y):
    #Estimate Levenstein distance

    # Handle empty
    if not x: return len(y)
    if not y: return len(x)

    # If first character is same, continue
    if x[0] == y[0]: return Ldistance(x[1:], y[1:])

    # Append, delete and replace 
    l1 = Ldistance(x, y[1:])
    l2 = Ldistance(x[1:], y)
    l3 = Ldistance(x[1:], y[1:])

    # Add min distance of the ramaining character
    return 1 + min(l1, l2, l3)

def LdNorm(x, y):
    return Ldistance(x, y) / max(len(x), len(y))


s1 = 'chevron'
s2 = 'chevolet'

print('L-dist = ', Ldistance(s1, s2))
print('L-dist Normalized = ', LdNorm(s1, s2))

L-dist = 4
L-dist Normalized = 0.5

Leave a Reply

Your email address will not be published. Required fields are marked *