.TH "XML2ASC" "1" "10 Jul 2011" "7.x" "HTML-XML-utils"
.de d \" begin display
.sp
.in +4
.nf
.ft CR
.CDS
..
.de e \" end display
.CDE
.in -4
.fi
.ft R
.sp
..
.SH NAME
xml2asc \- convert UTF-8 to &#nnn; entities
.SH SYNOPSIS
.B xml2asc
.SH DESCRIPTION
.LP
Reads a UTF-8 encoded text from standard input and writes to standard
output, converting all non-ASCII characters to &#nnn; entities, so
that the result is ASCII-encoded.
.LP
One example use is to convert ISO-8859-1 to ASCII with &#nnn;
entities, by first running
.B asc2xml
to convert ISO-8859-1 to UTF-8 and then piping the result into
.B xml2asc
to convert to ASCII with &#nnn; entities for all accented characters.
.LP
To test whether a file is valid UTF-8, discard the output and test the
exit code, e.g., in Bash:
.d
xml2asc <file >/dev/null && echo "OK" || echo "Fail"
.e
.SH "DIAGNOSTICS"
.B xml2asc
returns a non-zero exit code if the input was not valid UTF-8.
.SH "SEE ALSO"
.BR asc2xml (1),
.BR UTF-8 " (RFC 2279)"
.SH BUGS
.LP
Doesn't distinguish mark-up from content, so if the input uses
non-ASCII characters in XML element names, they will be output with
numerical entities in them, which is not legal in XML.
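.SH EXAMPLES
.LP
The two uses mentioned in DESCRIPTION can be wrapped in small shell
functions. This is only a sketch: it assumes that
.B asc2xml
and
.B xml2asc
are both on the PATH, and the function names
(latin1_to_ascii, is_utf8) are hypothetical helpers that are not
part of the package.
.d
```shell
# Sketch only: assumes asc2xml and xml2asc are installed and on PATH.
# Both function names are made up for this example.

# latin1_to_ascii IN OUT
# Convert an ISO-8859-1 file to ASCII with &#nnn; entities,
# using the two-step pipeline from DESCRIPTION.
latin1_to_ascii() {
    asc2xml <"$1" | xml2asc >"$2"
}

# is_utf8 FILE
# Exit status 0 if xml2asc accepts FILE as UTF-8, non-zero otherwise.
# The converted output is discarded.
is_utf8() {
    xml2asc <"$1" >/dev/null
}
```
.e
.LP
With these definitions, the test from DESCRIPTION could be written as
is_utf8 file && echo "OK" || echo "Fail".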