Please contact the project administrators of this project at the project summary page.
Sources are in the Mercurial repository.
This is the cl-string-match project ("clstringmatch")
Common Lisp provides a SEARCH function to lookup given keys in a string, but its implementation can be subotimal: it can be beaten in some cases by even the simpliest brute-force algorithm. This project is aimed at implementing robust substring search (subsequence search) algorithms for Common Lisp.
Along with subsequence search (matching) algorithms this package provides implementations of the following data structures and utility algorithms:
Check also the API reference.
At the moment Cl-String-Match is not supported by Quicklisp. Installation is manual.
Cl-String-Match exports functions in cl-string-match
package (nicknamed sm
).
Shortcuts look for given pattern pat
in text txt
. They are usually much slower
but are easier to use:
string-contains-brute pat txt
Brute-forcestring-contains-bm pat txt
Boyer-Moorestring-contains-rk pat txt
Rabin-Karpstring-contains-kmp pat txt
Knuth-Morris-Prattstring-contains-ac pat txt
Aho-CorasickA more robust approach is to use pre-calculated index data
that is processed by a pair of initialize
and search
functions:
initialize-bm pat
and search-bm bm txt
initialize-rk pat
and search-rk rk txt
initialize-kmp pat
and search-kmp kmp txt
initialize-ac pat
and search-ac ac txt
. Here initialize-ac
can accept a list of patterns that are compiled into a trie.Following example looks for a given
substring pat
in a given line of
text txt
using Rabin-Karp algorithm
implementation:
(let ((idx (initialize-rk pat))) (search-rk idx txt))
This project is still a work in progress. Any contributions are welcome!