Implements an approximate string matching version of R's native 'match' function. Also offers fuzzy text search based on various string distance measures. Can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), q-grams (q-gram, cosine, Jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of Soundex is provided as well. Distances can be computed between character vectors while taking proper care of encoding, or between integer vectors representing generic sequences. This package is built for speed and runs in parallel by using 'OpenMP'. An API for C or C++ is exposed as well. Reference: MPJ van der Loo (2014) <doi:10.32614/RJ-2014-011>.
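To make the description concrete, the following is a minimal Python sketch of two of the distance families it names (the Levenshtein edit distance and the q-gram distance), plus a toy approximate-match lookup. It is an illustration of the concepts only, not the package's code or API: the function names `levenshtein`, `qgram_distance` and `approx_match`, and the `max_dist` parameter, are hypothetical and chosen for this example.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def qgram_distance(a: str, b: str, q: int = 2) -> int:
    """Sum of absolute differences between the q-gram counts of a and b."""
    grams_a = Counter(a[i:i + q] for i in range(len(a) - q + 1))
    grams_b = Counter(b[i:i + q] for i in range(len(b) - q + 1))
    return sum(abs(grams_a[g] - grams_b[g]) for g in grams_a.keys() | grams_b.keys())


def approx_match(x: str, table: list, max_dist: int = 1):
    """Hypothetical helper: index of the closest entry in table, or None.

    Only entries within max_dist (Levenshtein) qualify; ties go to the
    first such entry.
    """
    best_index, best_dist = None, max_dist + 1
    for idx, candidate in enumerate(table):
        d = levenshtein(x, candidate)
        if d < best_dist:
            best_index, best_dist = idx, d
    return best_index


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(qgram_distance("leia", "leela", q=2))
    print(approx_match("leia", ["uhura", "leela"], max_dist=2))
```

The lookup mirrors the general idea of an approximate 'match': instead of requiring equality, it accepts the nearest table entry whose distance stays within a chosen bound.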
Version: | 0.9.12 |
Depends: | R (≥ 2.15.3) |
Imports: | parallel |
Suggests: | tinytest |
Published: | 2023-11-28 |
DOI: | 10.32614/CRAN.package.stringdist |
Author: | Mark van der Loo [aut, cre], Jan van der Laan [ctb], R Core Team [ctb], Nick Logan [ctb], Chris Muir [ctb], Johannes Gruber [ctb], Brian Ripley [ctb] |
Maintainer: | Mark van der Loo <mark.vanderloo at gmail.com> |
BugReports: | https://github.com/markvanderloo/stringdist/issues |
License: | GPL-3 |
URL: | https://github.com/markvanderloo/stringdist |
NeedsCompilation: | yes |
Citation: | stringdist citation info |
Materials: | README NEWS |
In views: | NaturalLanguageProcessing, OfficialStatistics |
CRAN checks: | stringdist results |
Reference manual: | stringdist.pdf |
Vignettes: | RJournal 6 111-122 (2014), stringdist C/C++ API |
Package source: | stringdist_0.9.12.tar.gz |
Windows binaries: | r-devel: stringdist_0.9.12.zip, r-release: stringdist_0.9.12.zip, r-oldrel: stringdist_0.9.12.zip |
macOS binaries: | r-release (arm64): stringdist_0.9.12.tgz, r-oldrel (arm64): stringdist_0.9.12.tgz, r-release (x86_64): stringdist_0.9.12.tgz, r-oldrel (x86_64): stringdist_0.9.12.tgz |
Old sources: | stringdist archive |
Please use the canonical form https://CRAN.R-project.org/package=stringdist to link to this page.