.\" Automatically generated by Pod::Man 4.14 (Pod::Simple 3.42) .\" .\" Standard preamble: .\" ======================================================================== .de Sp \" Vertical space (when we can't use .PP) .if t .sp .5v .if n .sp .. .de Vb \" Begin verbatim text .ft CW .nf .ne \\$1 .. .de Ve \" End verbatim text .ft R .fi .. .\" Set up some character translations and predefined strings. \*(-- will .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left .\" double quote, and \*(R" will give a right double quote. \*(C+ will .\" give a nicer C++. Capital omega is used to do unbreakable dashes and .\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, .\" nothing in troff, for use with C<>. .tr \(*W- .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' .ie n \{\ . ds -- \(*W- . ds PI pi . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch . ds L" "" . ds R" "" . ds C` "" . ds C' "" 'br\} .el\{\ . ds -- \|\(em\| . ds PI \(*p . ds L" `` . ds R" '' . ds C` . ds C' 'br\} .\" .\" Escape single quotes in literal strings from groff's Unicode transform. .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" .\" If the F register is >0, we'll generate index entries on stderr for .\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index .\" entries marked with X<> in POD. Of course, you'll have to process the .\" output yourself in some meaningful fashion. .\" .\" Avoid warning from groff about undefined register 'F'. .de IX .. .nr rF 0 .if \n(.g .if rF .nr rF 1 .if (\n(rF:(\n(.g==0)) \{\ . if \nF \{\ . de IX . tm Index:\\$1\t\\n%\t"\\$2" .. . if !\nF==2 \{\ . nr % 0 . nr F 2 . \} . \} .\} .rr rF .\" ======================================================================== .\" .IX Title "Lingua::Translit 3pm" .TH Lingua::Translit 3pm "2022-10-13" "perl v5.34.0" "User Contributed Perl Documentation" .\" For nroff, turn off justification. 
Always turn off hyphenation; it makes .\" way too many mistakes in technical documents. .if n .ad l .nh .SH "NAME" Lingua::Translit \- transliterates text between writing systems .SH "SYNOPSIS" .IX Header "SYNOPSIS" .Vb 1 \& use Lingua::Translit; \& \& my $tr = Lingua::Translit\->new("ISO 843"); \& \& my $text_tr = $tr\->translit("character oriented string"); \& \& if ($tr\->can_reverse()) { \& $text_tr = $tr\->translit_reverse("character oriented string"); \& } .Ve .SH "DESCRIPTION" .IX Header "DESCRIPTION" Lingua::Translit can be used to convert text from one writing system to another, based on national or international transliteration tables. Where possible, reverse transliteration is supported. .PP The term \f(CW\*(C`transliteration\*(C'\fR describes the conversion of text from one writing system or alphabet to another. Ideally the conversion is unique, mapping one character to exactly one character, so the original spelling can be reconstructed. In practice this is not always the case: a single letter of the original alphabet can be transcribed as two, three or even more letters. .PP Furthermore, a writing system can have more than one transliteration scheme. It is therefore essential to know which scheme will be or has been used to transliterate a text, in order to work in an integrated fashion and to be able to reconstruct the original data. .PP Reconstruction is a problem for non-unique transliterations, however: if no language-specific knowledge is available, the resulting clusters of letters may be ambiguous. For example, the Greek character \*(L"\s-1PSI\*(R"\s0 maps to \*(L"ps\*(R", but \*(L"ps\*(R" could also result from the sequence \*(L"\s-1PI\*(R", \*(L"SIGMA\*(R"\s0 since \*(L"\s-1PI\*(R"\s0 maps to \*(L"p\*(R" and \*(L"\s-1SIGMA\*(R"\s0 maps to \*(L"s\*(R". If a transliteration table leads to ambiguous conversions, the table cannot be used in reverse. .PP Otherwise, the table can be used in both directions if desired. 
For example, \s-1ISO 9\s0 was originally created to convert Cyrillic letters to the Latin alphabet, so its reverse transliteration transforms Latin letters to Cyrillic. .SH "METHODS" .IX Header "METHODS" .ie n .SS "new(\fI""name of table""\fP)" .el .SS "new(\fI``name of table''\fP)" .IX Subsection "new(name of table)" Initializes an object with the specified transliteration table, e.g. \*(L"\s-1ISO 9\*(R".\s0 .ie n .SS "translit(\fI""character oriented string""\fP)" .el .SS "translit(\fI``character oriented string''\fP)" .IX Subsection "translit(character oriented string)" Transliterates the given text according to the object's transliteration table. Returns the transliterated text. .ie n .SS "translit_reverse(\fI""character oriented string""\fP)" .el .SS "translit_reverse(\fI``character oriented string''\fP)" .IX Subsection "translit_reverse(character oriented string)" Transliterates the given text according to the object's transliteration table, but applies the table in the reverse direction. For example, table \s-1ISO 9\s0 is a transliteration scheme for the conversion of Cyrillic letters to the Latin alphabet. If it is used in reverse, Latin letters are mapped to Cyrillic ones. .PP Returns the transliterated text. .SS "\fBcan_reverse()\fP" .IX Subsection "can_reverse()" Returns true (1) if and only if reverse transliteration is possible, and false (0) otherwise. .SS "\fBname()\fP" .IX Subsection "name()" Returns the name of the chosen transliteration table, e.g. \*(L"\s-1ISO 9\*(R".\s0 .SS "\fBdesc()\fP" .IX Subsection "desc()" Returns a description of the transliteration, e.g. \*(L"\s-1ISO 9:1995,\s0 Cyrillic to Latin\*(R". 
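.PP
Whether \fBcan_reverse()\fR reports true depends on the ambiguity discussed in the \s-1DESCRIPTION\s0. The following is a minimal sketch of that ambiguity in plain Perl. It does not use Lingua::Translit: the three-entry table and the names %toy_table and toy_translit() are hypothetical, made up for illustration, and are neither the module's \s-1ISO 843\s0 data nor its implementation.

```perl
use strict;
use warnings;

# Hypothetical three-entry table for illustration only; this is NOT the
# module's ISO 843 data. Keys are built with chr() from Unicode code points.
my %toy_table = (
    chr(0x03C8) => 'ps',    # GREEK SMALL LETTER PSI
    chr(0x03C0) => 'p',     # GREEK SMALL LETTER PI
    chr(0x03C3) => 's',     # GREEK SMALL LETTER SIGMA
);

# Map each character through the table; unknown characters pass through.
sub toy_translit {
    my ($text) = @_;
    my $out = '';
    for my $char (split //, $text) {
        $out .= exists $toy_table{$char} ? $toy_table{$char} : $char;
    }
    return $out;
}

my $from_psi      = toy_translit(chr(0x03C8));
my $from_pi_sigma = toy_translit(chr(0x03C0) . chr(0x03C3));

# Both inputs collapse to the same Latin cluster, so a reverse lookup of
# that cluster cannot tell which Greek spelling was the original.
print "forward results are identical; reversal would be ambiguous\n"
    if $from_psi eq $from_pi_sigma;
```

With a real table, the safer pattern is the one shown in the \s-1SYNOPSIS\s0: ask the object with \fBcan_reverse()\fR before calling \fBtranslit_reverse()\fR.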
.SH "SUPPORTED TRANSLITERATIONS" .IX Header "SUPPORTED TRANSLITERATIONS" .IP "Cyrillic" 4 .IX Item "Cyrillic" \&\fIALA-LC \s-1RUS\s0\fR, not reversible, \s-1ALA\-LC:1997,\s0 Cyrillic to Latin, Russian .Sp \&\fI\s-1ISO 9\s0\fR, reversible, \s-1ISO 9:1995,\s0 Cyrillic to Latin .Sp \&\fI\s-1ISO/R 9\s0\fR, reversible, \s-1ISO 9:1954,\s0 Cyrillic to Latin .Sp \&\fI\s-1DIN 1460 RUS\s0\fR, reversible, \s-1DIN 1460:1982,\s0 Cyrillic to Latin, Russian .Sp \&\fI\s-1DIN 1460 UKR\s0\fR, reversible, \s-1DIN 1460:1982,\s0 Cyrillic to Latin, Ukrainian .Sp \&\fI\s-1DIN 1460 BUL\s0\fR, reversible, \s-1DIN 1460:1982,\s0 Cyrillic to Latin, Bulgarian .Sp \&\fIStreamlined System \s-1BUL\s0\fR, not reversible, The Streamlined System: 2006, Cyrillic to Latin, Bulgarian .Sp \&\fI\s-1GOST 7.79 RUS\s0\fR, reversible, \s-1GOST 7.79:2000\s0 (table B), Cyrillic to Latin, Russian .Sp \&\fI\s-1GOST 7.79 RUS OLD\s0\fR, not reversible, \s-1GOST 7.79:2000\s0 (table B), Cyrillic to Latin with support for Old Russian (pre 1918), Russian .Sp \&\fI\s-1GOST 7.79 UKR\s0\fR, reversible, \s-1GOST 7.79:2000\s0 (table B), Cyrillic to Latin, Ukrainian .Sp \&\fI\s-1BGN/PCGN RUS\s0 Standard\fR, not reversible, \s-1BGN/PCGN:1947\s0 (Standard Variant), Cyrillic to Latin, Russian .Sp \&\fI\s-1BGN/PCGN RUS\s0 Strict\fR, not reversible, \s-1BGN/PCGN:1947\s0 (Strict Variant), Cyrillic to Latin, Russian .IP "Greek" 4 .IX Item "Greek" \&\fI\s-1ISO 843\s0\fR, not reversible, \s-1ISO 843:1997,\s0 Greek to Latin .Sp \&\fI\s-1DIN 31634\s0\fR, not reversible, \s-1DIN 31634:1982,\s0 Greek to Latin .Sp \&\fIGreeklish\fR, not reversible, Greeklish (Phonetic), Greek to Latin .IP "Latin" 4 .IX Item "Latin" \&\fICommon \s-1CES\s0\fR, not reversible, Czech without diacritics .Sp \&\fICommon \s-1DEU\s0\fR, not reversible, German without umlauts .Sp \&\fICommon \s-1POL\s0\fR, not reversible, Unaccented Polish .Sp \&\fICommon \s-1RON\s0\fR, not reversible, Romanian without diacritics as commonly used .Sp \&\fICommon \s-1SLK\s0\fR, 
not reversible, Slovak without diacritics .Sp \&\fICommon \s-1SLV\s0\fR, not reversible, Slovenian without diacritics .Sp \&\fI\s-1ISO 8859\-16 RON\s0\fR, reversible, Romanian with appropriate diacritics .IP "Arabic" 4 .IX Item "Arabic" \&\fICommon \s-1ARA\s0\fR, not reversible, Common Romanization of Arabic .IP "Sanskrit" 4 .IX Item "Sanskrit" \&\fI\s-1IAST\s0 Devanagari\fR, not reversible, \s-1IAST\s0 Romanization to Devanāgarī .Sp \&\fIDevanagari \s-1IAST\s0\fR, not reversible, Devanāgarī to \s-1IAST\s0 Romanization .SH "ADDING NEW TRANSLITERATIONS" .IX Header "ADDING NEW TRANSLITERATIONS" In case you want to add your own transliteration tables to Lingua::Translit, have a look at the developer documentation at . .PP A template of a transliteration table is provided as well (\fIxml/template.xml\fR) so you can easily start developing. .SH "RESTRICTIONS" .IX Header "RESTRICTIONS" Lingua::Translit is suited to handle \fBUnicode\fR and utilizes comparisons and regular expressions that rely on \fBcode points\fR. Therefore, any input is supposed to be \fBcharacter oriented\fR (\f(CW\*(C`use utf8;\*(C'\fR, ...) instead of byte oriented. .PP However, if your data is byte oriented, be sure to pass it \&\fB\s-1UTF\-8\s0 encoded\fR to \fBtranslit()\fR and/or \fBtranslit_reverse()\fR \- it will be converted internally. .SH "BUGS" .IX Header "BUGS" None known. .PP Please report bugs using \s-1CPAN\s0's request tracker at . .SH "SEE ALSO" .IX Header "SEE ALSO" Lingua::Translit::Tables, Encode, perlunicode .PP \&\f(CW\*(C`translit\*(C'\fR's manpage .PP .SH "CREDITS" .IX Header "CREDITS" Thanks to Dr. Daniel Eiwen, Romanisches Seminar, Universitaet Koeln for his help on Romanian transliteration. .PP Thanks to Dmitry Smal and Rusar Publishing for contributing the \*(L"ALA-LC \s-1RUS\*(R"\s0 transliteration table. .PP Thanks to Ahmed Elsheshtawy for his help implementing the \*(L"Common \s-1ARA\*(R"\s0 Arabic transliteration. 
.PP Thanks to Dusan Vuckovic for contributing the \*(L"\s-1ISO/R 9\*(R"\s0 transliteration table. .PP Thanks to Ștefan Suciu for contributing the \*(L"\s-1ISO 8859\-16 RON\*(R"\s0 transliteration table. .PP Thanks to Philip Kime for contributing the \*(L"\s-1IAST\s0 Devanagari\*(R" and \*(L"Devanagari \&\s-1IAST\*(R"\s0 transliteration tables. .PP Thanks to Nikola Lečić for contributing the \*(L"\s-1BGN/PCGN RUS\s0 Standard\*(R" and \&\*(L"\s-1BGN/PCGN RUS\s0 Strict\*(R" transliteration tables. .SH "AUTHORS" .IX Header "AUTHORS" Alex Linke .PP Rona Linke .SH "LICENSE AND COPYRIGHT" .IX Header "LICENSE AND COPYRIGHT" Copyright (C) 2007\-2008 Alex Linke and Rona Linke .PP Copyright (C) 2009\-2016 Lingua-Systems Software GmbH .PP Copyright (C) 2016\-2017 Netzum Sorglos, Lingua-Systems Software GmbH .PP Copyright (C) 2017\-2022 Netzum Sorglos Software GmbH .PP This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.