.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "elvish-str" "7" "Jul 18, 2021" "Elvish 0.15.0" "Miscellaneous Information Manual"
.hy
.SH Introduction
.PP
The \f[C]str:\f[R] module provides string manipulation functions.
.PP
Function usages are given in the same format as in the reference doc for
the builtin module.
.SH Functions
.SS str:compare
.IP
.nf
\f[C]
str:compare $a $b
\f[R]
.fi
.PP
Compares two strings and outputs an integer: 0 if \f[C]$a\f[R] ==
\f[C]$b\f[R], -1 if \f[C]$a\f[R] < \f[C]$b\f[R], and 1 if \f[C]$a\f[R] >
\f[C]$b\f[R].
.IP
.nf
\f[C]
\[ti]> str:compare a a
\[u25B6] 0
\[ti]> str:compare a b
\[u25B6] -1
\[ti]> str:compare b a
\[u25B6] 1
\f[R]
.fi
.SS str:contains
.IP
.nf
\f[C]
str:contains $str $substr
\f[R]
.fi
.PP
Outputs whether \f[C]$str\f[R] contains \f[C]$substr\f[R] as a
substring.
.IP
.nf
\f[C]
\[ti]> str:contains abcd x
\[u25B6] $false
\[ti]> str:contains abcd bc
\[u25B6] $true
\f[R]
.fi
.SS str:contains-any
.IP
.nf
\f[C]
str:contains-any $str $chars
\f[R]
.fi
.PP
Outputs whether \f[C]$str\f[R] contains any Unicode code points in
\f[C]$chars\f[R].
.IP
.nf
\f[C]
\[ti]> str:contains-any abcd x
\[u25B6] $false
\[ti]> str:contains-any abcd xby
\[u25B6] $true
\f[R]
.fi
.SS str:count
.IP
.nf
\f[C]
str:count $str $substr
\f[R]
.fi
.PP
Outputs the number of non-overlapping instances of \f[C]$substr\f[R] in
\f[C]$str\f[R].
If \f[C]$substr\f[R] is an empty string, outputs 1 plus the number of
Unicode code points in \f[C]$str\f[R].
.IP
.nf
\f[C]
\[ti]> str:count abcdefabcdef bc
\[u25B6] 2
\[ti]> str:count abcdef \[aq]\[aq]
\[u25B6] 7
\f[R]
.fi
.SS str:equal-fold
.IP
.nf
\f[C]
str:equal-fold $str1 $str2
\f[R]
.fi
.PP
Outputs whether \f[C]$str1\f[R] and \f[C]$str2\f[R], interpreted as UTF-8
strings, are equal under Unicode case-folding.
.IP
.nf
\f[C]
\[ti]> str:equal-fold ABC abc
\[u25B6] $true
\[ti]> str:equal-fold abc ab
\[u25B6] $false
\f[R]
.fi
.SS str:from-codepoints
.IP
.nf
\f[C]
str:from-codepoints $number...
\f[R]
.fi
.PP
Outputs a string consisting of the given Unicode codepoints.
Example:
.IP
.nf
\f[C]
\[ti]> str:from-codepoints 0x61
\[u25B6] a
\[ti]> str:from-codepoints 0x4f60 0x597d
\[u25B6] \[u4F60]\[u597D]
\f[R]
.fi
.PP
See also \f[C]str:to-codepoints\f[R].
.SS str:from-utf8-bytes
.IP
.nf
\f[C]
str:from-utf8-bytes $number...
\f[R]
.fi
.PP
Outputs a string consisting of the given bytes, interpreted as UTF-8.
Examples:
.IP
.nf
\f[C]
\[ti]> str:from-utf8-bytes 0x61
\[u25B6] a
\[ti]> str:from-utf8-bytes 0xe4 0xbd 0xa0 0xe5 0xa5 0xbd
\[u25B6] \[u4F60]\[u597D]
\f[R]
.fi
.PP
See also \f[C]str:to-utf8-bytes\f[R].
.SS str:has-prefix
.IP
.nf
\f[C]
str:has-prefix $str $prefix
\f[R]
.fi
.PP
Outputs whether \f[C]$str\f[R] begins with \f[C]$prefix\f[R].
.IP
.nf
\f[C]
\[ti]> str:has-prefix abc ab
\[u25B6] $true
\[ti]> str:has-prefix abc bc
\[u25B6] $false
\f[R]
.fi
.SS str:has-suffix
.IP
.nf
\f[C]
str:has-suffix $str $suffix
\f[R]
.fi
.PP
Outputs whether \f[C]$str\f[R] ends with \f[C]$suffix\f[R].
.IP
.nf
\f[C]
\[ti]> str:has-suffix abc ab
\[u25B6] $false
\[ti]> str:has-suffix abc bc
\[u25B6] $true
\f[R]
.fi
.SS str:index
.IP
.nf
\f[C]
str:index $str $substr
\f[R]
.fi
.PP
Outputs the index of the first instance of \f[C]$substr\f[R] in
\f[C]$str\f[R], or -1 if \f[C]$substr\f[R] is not present in
\f[C]$str\f[R].
.IP
.nf
\f[C]
\[ti]> str:index abcd cd
\[u25B6] 2
\[ti]> str:index abcd xyz
\[u25B6] -1
\f[R]
.fi
.SS str:index-any
.IP
.nf
\f[C]
str:index-any $str $chars
\f[R]
.fi
.PP
Outputs the index of the first instance of any Unicode code point from
\f[C]$chars\f[R] in \f[C]$str\f[R], or -1 if no Unicode code point from
\f[C]$chars\f[R] is present in \f[C]$str\f[R].
.IP
.nf
\f[C]
\[ti]> str:index-any \[dq]chicken\[dq] \[dq]aeiouy\[dq]
\[u25B6] 2
\[ti]> str:index-any l33t aeiouy
\[u25B6] -1
\f[R]
.fi
.SS str:join
.IP
.nf
\f[C]
str:join $sep $input-list?
\f[R]
.fi
.PP
Joins inputs with \f[C]$sep\f[R].
Examples:
.IP
.nf
\f[C]
\[ti]> put lorem ipsum | str:join ,
\[u25B6] lorem,ipsum
\[ti]> str:join , [lorem ipsum]
\[u25B6] lorem,ipsum
\[ti]> str:join \[aq]\[aq] [lorem ipsum]
\[u25B6] loremipsum
\[ti]> str:join \[aq]...\[aq] [lorem ipsum]
\[u25B6] lorem...ipsum
\f[R]
.fi
.PP
Etymology: Various languages, in particular
Python (https://docs.python.org/3.6/library/stdtypes.html#str.join).
.PP
See also \f[C]str:split\f[R].
.SS str:last-index
.IP
.nf
\f[C]
str:last-index $str $substr
\f[R]
.fi
.PP
Outputs the index of the last instance of \f[C]$substr\f[R] in
\f[C]$str\f[R], or -1 if \f[C]$substr\f[R] is not present in
\f[C]$str\f[R].
.IP
.nf
\f[C]
\[ti]> str:last-index \[dq]elven speak elvish\[dq] elv
\[u25B6] 12
\[ti]> str:last-index \[dq]elven speak elvish\[dq] romulan
\[u25B6] -1
\f[R]
.fi
.SS str:replace
.IP
.nf
\f[C]
str:replace &max=-1 $old $repl $source
\f[R]
.fi
.PP
Replaces all occurrences of \f[C]$old\f[R] with \f[C]$repl\f[R] in
\f[C]$source\f[R].
If \f[C]$max\f[R] is non-negative, it determines the max number of
substitutions.
.PP
\f[B]Note\f[R]: This command does not support searching by regular
expressions; \f[C]$old\f[R] is always interpreted as a plain string.
Use \f[C]re:replace\f[R] if you need to search by regex.
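.PP
The following short session is an illustrative sketch, with inputs made
up for this purpose. It shows the default behavior, where every
occurrence is replaced, and the effect of limiting substitutions with
\f[C]&max\f[R]:
.IP
.nf
\f[C]
\[ti]> str:replace a A banana
\[u25B6] bAnAnA
\[ti]> str:replace &max=1 a A banana
\[u25B6] bAnana
\f[R]
.fi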
.SS str:split
.IP
.nf
\f[C]
str:split $sep $string
\f[R]
.fi
.PP
Splits \f[C]$string\f[R] by \f[C]$sep\f[R].
If \f[C]$sep\f[R] is an empty string, splits \f[C]$string\f[R] into
codepoints.
.IP
.nf
\f[C]
\[ti]> str:split , lorem,ipsum
\[u25B6] lorem
\[u25B6] ipsum
\[ti]> str:split \[aq]\[aq] \[u4F60]\[u597D]
\[u25B6] \[u4F60]
\[u25B6] \[u597D]
\f[R]
.fi
.PP
\f[B]Note\f[R]: This command does not support splitting by regular
expressions; \f[C]$sep\f[R] is always interpreted as a plain string.
Use \f[C]re:split\f[R] if you need to split by regex.
.PP
Etymology: Various languages, in particular
Python (https://docs.python.org/3.6/library/stdtypes.html#str.split).
.PP
See also \f[C]str:join\f[R].
.SS str:title
.IP
.nf
\f[C]
str:title $str
\f[R]
.fi
.PP
Outputs \f[C]$str\f[R] with all Unicode letters that begin words mapped
to their Unicode title case.
.IP
.nf
\f[C]
\[ti]> str:title \[dq]her royal highness\[dq]
\[u25B6] Her Royal Highness
\f[R]
.fi
.SS str:to-codepoints
.IP
.nf
\f[C]
str:to-codepoints $string
\f[R]
.fi
.PP
Outputs the value of each codepoint in \f[C]$string\f[R], in hexadecimal.
Examples:
.IP
.nf
\f[C]
\[ti]> str:to-codepoints a
\[u25B6] 0x61
\[ti]> str:to-codepoints \[u4F60]\[u597D]
\[u25B6] 0x4f60
\[u25B6] 0x597d
\f[R]
.fi
.PP
The output format is subject to change.
.PP
See also \f[C]str:from-codepoints\f[R].
.SS str:to-lower
.IP
.nf
\f[C]
str:to-lower $str
\f[R]
.fi
.PP
Outputs \f[C]$str\f[R] with all Unicode letters mapped to their
lower-case equivalent.
.IP
.nf
\f[C]
\[ti]> str:to-lower \[aq]ABC!123\[aq]
\[u25B6] abc!123
\f[R]
.fi
.SS str:to-title
.IP
.nf
\f[C]
str:to-title $str
\f[R]
.fi
.PP
Outputs \f[C]$str\f[R] with all Unicode letters mapped to their Unicode
title case.
.IP
.nf
\f[C]
\[ti]> str:to-title \[dq]her royal highness\[dq]
\[u25B6] HER ROYAL HIGHNESS
\[ti]> str:to-title \[dq]\[u0445]\[u043B]\[u0435]\[u0431]\[dq]
\[u25B6] \[u0425]\[u041B]\[u0415]\[u0411]
\f[R]
.fi
.SS str:to-upper
.IP
.nf
\f[C]
str:to-upper $str
\f[R]
.fi
.PP
Outputs \f[C]$str\f[R] with all Unicode letters mapped to their
upper-case equivalent.
.IP
.nf
\f[C]
\[ti]> str:to-upper \[aq]abc!123\[aq]
\[u25B6] ABC!123
\f[R]
.fi
.SS str:to-utf8-bytes
.IP
.nf
\f[C]
str:to-utf8-bytes $string
\f[R]
.fi
.PP
Outputs the value of each byte in \f[C]$string\f[R], in hexadecimal.
Examples:
.IP
.nf
\f[C]
\[ti]> str:to-utf8-bytes a
\[u25B6] 0x61
\[ti]> str:to-utf8-bytes \[u4F60]\[u597D]
\[u25B6] 0xe4
\[u25B6] 0xbd
\[u25B6] 0xa0
\[u25B6] 0xe5
\[u25B6] 0xa5
\[u25B6] 0xbd
\f[R]
.fi
.PP
The output format is subject to change.
.PP
See also \f[C]str:from-utf8-bytes\f[R].
.SS str:trim
.IP
.nf
\f[C]
str:trim $str $cutset
\f[R]
.fi
.PP
Outputs \f[C]$str\f[R] with all leading and trailing Unicode code points
contained in \f[C]$cutset\f[R] removed.
.IP
.nf
\f[C]
\[ti]> str:trim \[dq]\[r!]\[r!]\[r!]Hello, Elven!!!\[dq] \[dq]!\[r!]\[dq]
\[u25B6] \[aq]Hello, Elven\[aq]
\f[R]
.fi
.SS str:trim-left
.IP
.nf
\f[C]
str:trim-left $str $cutset
\f[R]
.fi
.PP
Outputs \f[C]$str\f[R] with all leading Unicode code points contained in
\f[C]$cutset\f[R] removed.
To remove a prefix string use \f[C]str:trim-prefix\f[R].
.IP
.nf
\f[C]
\[ti]> str:trim-left \[dq]\[r!]\[r!]\[r!]Hello, Elven!!!\[dq] \[dq]!\[r!]\[dq]
\[u25B6] \[aq]Hello, Elven!!!\[aq]
\f[R]
.fi
.SS str:trim-prefix
.IP
.nf
\f[C]
str:trim-prefix $str $prefix
\f[R]
.fi
.PP
Outputs \f[C]$str\f[R] minus the leading \f[C]$prefix\f[R] string.
If \f[C]$str\f[R] doesn\[cq]t begin with \f[C]$prefix\f[R],
\f[C]$str\f[R] is output unchanged.
.IP
.nf
\f[C]
\[ti]> str:trim-prefix \[dq]\[r!]\[r!]\[r!]Hello, Elven!!!\[dq] \[dq]\[r!]\[r!]\[r!]Hello, \[dq]
\[u25B6] Elven!!!
\[ti]> str:trim-prefix \[dq]\[r!]\[r!]\[r!]Hello, Elven!!!\[dq] \[dq]\[r!]\[r!]\[r!]Hola, \[dq]
\[u25B6] \[aq]\[r!]\[r!]\[r!]Hello, Elven!!!\[aq]
\f[R]
.fi
.SS str:trim-right
.IP
.nf
\f[C]
str:trim-right $str $cutset
\f[R]
.fi
.PP
Outputs \f[C]$str\f[R] with all trailing Unicode code points contained in
\f[C]$cutset\f[R] removed.
To remove a suffix string use \f[C]str:trim-suffix\f[R].
.IP
.nf
\f[C]
\[ti]> str:trim-right \[dq]\[r!]\[r!]\[r!]Hello, Elven!!!\[dq] \[dq]!\[r!]\[dq]
\[u25B6] \[aq]\[r!]\[r!]\[r!]Hello, Elven\[aq]
\f[R]
.fi
.SS str:trim-space
.IP
.nf
\f[C]
str:trim-space $str
\f[R]
.fi
.PP
Outputs \f[C]$str\f[R] with all leading and trailing white space, as
defined by Unicode, removed.
.IP
.nf
\f[C]
\[ti]> str:trim-space \[dq] \[rs]t\[rs]n Hello, Elven \[rs]n\[rs]t\[rs]r\[rs]n\[dq]
\[u25B6] \[aq]Hello, Elven\[aq]
\f[R]
.fi
.SS str:trim-suffix
.IP
.nf
\f[C]
str:trim-suffix $str $suffix
\f[R]
.fi
.PP
Outputs \f[C]$str\f[R] minus the trailing \f[C]$suffix\f[R] string.
If \f[C]$str\f[R] doesn\[cq]t end with \f[C]$suffix\f[R], \f[C]$str\f[R]
is output unchanged.
.IP
.nf
\f[C]
\[ti]> str:trim-suffix \[dq]\[r!]\[r!]\[r!]Hello, Elven!!!\[dq] \[dq], Elven!!!\[dq]
\[u25B6] \[r!]\[r!]\[r!]Hello
\[ti]> str:trim-suffix \[dq]\[r!]\[r!]\[r!]Hello, Elven!!!\[dq] \[dq], Klingons!!!\[dq]
\[u25B6] \[aq]\[r!]\[r!]\[r!]Hello, Elven!!!\[aq]
\f[R]
.fi