.TH UTF8GEN 1 "2018 Jun 30"
.SH NAME
utf8gen \- Generate UTF-8 output from hexadecimal input
.SH SYNOPSIS
.br
\fButf8gen\fP [ [-e \fIformat1\fP] | [-E \fIformat2\fP] ]
[-r \fIformatr\fP] [ [-u \fIutf8_format\fP] | -n] [-c] [-s]
[-i \fIinput_file\fP] [-o \fIoutput_file\fP]
.SH DESCRIPTION
.B utf8gen
reads a list of hexadecimal ASCII values in the range 0 through 10FFFF,
one per line, and prints the UTF-8 encoding of each number,
interpreted as a Unicode code point.
.PP
Each input line must begin with a hexadecimal number.
A string may follow the number; it can be echoed to the output
as the "remainder" (see the \-r option below).
The total input line length, including the ending newline,
is limited to 4096 bytes.
.SH OPTIONS
.TP 6
\-c
After the UTF-8 codes are printed, print a space followed by
the character that the hexadecimal code point represents.
.TP
\-e
Echo the input code point in one format, using the printf(3)
format string \fIformat1\fP.
.TP
\-E
Echo the input code point in two formats, using the printf(3)
format string \fIformat2\fP.
.TP
\-n
Do \fInot\fP print the UTF-8 byte values.
This can be useful if only the printed character itself is desired;
see the \-c option.
.TP
\-r
Print the remainder of the input string after the initial
hexadecimal digits, using the printf(3) format string \fIformatr\fP.
.TP
\-s
Swap the order of output: print the UTF-8 output portion first,
then print the input string portion.
This can be useful for generating code containing a UTF-8 encoding
followed by a comment that contains the input hexadecimal digits.
.TP
\-u
Print the UTF-8 encoded value of the input hexadecimal number,
as numeric codes for each UTF-8 byte, using the printf(3)
format string \fIutf8_format\fP.
If no format string is specified, each byte is printed in the
default format: a backslash followed by three octal digits.
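.PP
The byte values that \-u prints follow the standard UTF-8 encoding rules,
which can be illustrated with a minimal C sketch.
It is not taken from the
.B utf8gen
source: the function name utf8_encode is hypothetical, and the sketch
does not special-case surrogate code points (D800 through DFFF),
since this page does not say how
.B utf8gen
treats them.
.PP
.nf
```c
/*
 * Hypothetical helper, not part of utf8gen: encode one Unicode code
 * point (0 through 0x10FFFF) as UTF-8.  Stores the bytes in out[] and
 * returns the number of bytes written, or 0 if cp is out of range.
 */
int utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp <= 0x7FUL) {                 /* 1 byte:  0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp <= 0x7FFUL) {                /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFFUL) {               /* 3 bytes: 1110xxxx + 2 trailing */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFFUL) {             /* 4 bytes: 11110xxx + 3 trailing */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                           /* beyond the Unicode range */
}
```
.fi
.PP
Under these assumptions, code point 41 yields the single byte 0x41,
which the default \-u format would print as a backslash followed by
the octal digits 101.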
.SH EXAMPLES
.RS
.PP
utf8gen -e "0x%04X " -u "\\%03o"
.PP
utf8gen -E "U+%04x = 0%02o = "
.PP
utf8gen -s -e " /* U+%04X */" -u "\\%03o"
.RE
.SH FILES
Files contain lines that each begin with an ASCII hexadecimal code
in the valid Unicode range 0 through 10FFFF, inclusive.
This hexadecimal code may optionally be followed by a space
followed by an arbitrary string ending with a newline,
up to the limit of 4096 bytes per input line.
An example line could be the following (with no indent):
.PP
.RS
41 Letter 'A'
.RE
.SH "SEE ALSO"
For more detailed explanations and examples of common usage,
consult the \fButf8gen\fP texinfo manual.
.SH AUTHOR
.B utf8gen
was written by Paul Hardy.
.SH LICENSE
.B utf8gen
is Copyright \(co 2018 Paul Hardy.
.PP
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
.SH BUGS
No known bugs exist.