.\" Hey, EMACS: -*- nroff -*-
.\" (C) Copyright 2015 NOKUBI Takatsugu ,
.\"
.\" First parameter, NAME, should be all caps
.\" Second parameter, SECTION, should be 1-8, maybe w/ subsection
.\" other parameters are allowed: see man(7), man(1)
.TH SIMSTRING 1 "January 26, 2015"
.\" Please adjust this date whenever revising the manpage.
.\"
.\" Some roff macros, for reference:
.\" .nh        disable hyphenation
.\" .hy        enable hyphenation
.\" .ad l      left justify
.\" .ad b      justify to both left and right margins
.\" .nf        disable filling
.\" .fi        enable filling
.\" .br        insert line break
.\" .sp <n>    insert n+1 empty lines
.\" for manpage-specific macros, see man(7)
.SH NAME
simstring \- build database and find similar words
.SH SYNOPSIS
.B simstring
.RI [ OPTIONS ]
.br
.SH DESCRIPTION
This utility reads query strings from STDIN and, for each query, finds the
strings in the database (DB) whose similarity to the query, under the
similarity measure (SIM), is no smaller than the threshold (TH).
When the \fB\-b\fR (\fB\-\-build\fR) option is specified, this utility
instead builds a database (DB) from the strings read from STDIN.
.SH OPTIONS
This program follows the usual GNU command line syntax, with long
options starting with two dashes (`\-\-').
A summary of options is included below.
For a complete description, see the Info files.
.TP
.B \-b, \-\-build
build a database for strings read from STDIN
.TP
.B \-d, \-\-database=DB
specify a database file
.TP
.B \-u, \-\-unicode
use Unicode (wchar_t) for representing characters
.TP
.B \-n, \-\-ngram=N
specify the unit of n-grams (DEFAULT=3)
.TP
.B \-m, \-\-mark
include marks for the beginnings and ends of strings
.TP
.B \-s, \-\-similarity=SIM
specify a similarity measure (DEFAULT='cosine'):
.TS
tab (@);
l lx.
\fBexact\fR@T{
exact match
T}
\fBdice\fR@T{
Dice coefficient
T}
\fBcosine\fR@T{
cosine coefficient
T}
\fBjaccard\fR@T{
Jaccard coefficient
T}
\fBoverlap\fR@T{
overlap coefficient
T}
.TE
.TP
.B \-t, \-\-threshold=TH
specify the threshold (DEFAULT=0.7)
.TP
.B \-e, \-\-echo\-back
echo back query strings to the output
.TP
.B \-q, \-\-quiet
suppress supplemental information from the output
.TP
.B \-p, \-\-benchmark
show benchmark result (retrieved strings are suppressed)
.TP
.B \-v, \-\-version
show version information and exit
.TP
.B \-h, \-\-help
show summary of options and exit
.SH SEE ALSO
.BR /usr/share/doc/simstring\-dev/examples
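.SH NOTES
The threshold test in DESCRIPTION can be illustrated with a short Python
sketch.
It is a simplified illustration, not the implementation used by simstring:
it assumes that each string is treated as a set of distinct character
n-grams, without begin and end marks, and the helper names (ngrams, cosine,
search) are hypothetical.
.PP
.nf
import math

def ngrams(text, n=3):
    # Assumption: distinct character ngrams, no begin/end marks.
    # Strings shorter than n are kept whole so the set is never empty.
    if len(text) < n:
        return {text}
    return {"".join(chars) for chars in zip(*(text[i:] for i in range(n)))}

def cosine(a, b, n=3):
    # Cosine coefficient over ngram sets: |X & Y| / sqrt(|X| * |Y|).
    x, y = ngrams(a, n), ngrams(b, n)
    return len(x & y) / math.sqrt(len(x) * len(y))

def search(query, words, threshold=0.7, n=3):
    # Keep the words whose similarity is no smaller than the threshold.
    return [w for w in words if cosine(query, w, n) >= threshold]
.fi
.PP
Under these assumptions, "abcd" and "abce" each have two trigrams and share
one of them, so their cosine coefficient is 0.5 and the pair falls below the
default threshold of 0.7.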