This is a sandbox of Levenshtein distance implementations.
using LevenshteinToolkit
distance_matrix("Saturday", "Sunday")
# ==> 3
distance_row("Saturday", "Sunday")
# ==> 3
The distance_matrix
function will compute the Levenshtein distance between two strings by building a matrix of distances between the prefixes. The costs associated with deletion, insertion, and substitution can be tailored using the keyword arguments. The distance_row
function is similar, but only keeps two matrix rows, thereby reducing the memory footprint.
The nfa
function builds a non-deterministic finite automaton (NFA) that recognizes the set of words that are within a given Levenshtein distance from a word:
nfa_automaton = nfa("food", 1)
The draw
function can be used to generate a GraphViz input file:
open(f -> draw(f, nfa_automaton), "nfa.dot", "w")
The resulting file can be rendered using the dot
command-line tool:
dot -Tsvg nfa.dot -o nfa.svg
The dfa
function builds a deterministic finite automaton (DFA) from an NFA.
dfa_automaton = dfa(nfa_automaton)
As for the case of NFAs, the draw
function can be used to generate a GraphViz input file.
The check
function can be used to run a DFA against a string:
check(dfa_automaton, "food")
# ==> true
check(dfa_automaton, "xfood")
# ==> true ... we're one edit away.
check(dfa_automaton, "xfoodx")
# ==> false ... we're two edits away.