Text::Undiacritic(3pm) | User Contributed Perl Documentation | Text::Undiacritic(3pm) |
NAME¶
Text::Undiacritic - remove diacritics from a string
VERSION¶
This document describes Text::Undiacritic 0.01
SYNOPSIS¶
    use Text::Undiacritic qw(undiacritic);
    $ascii_string = undiacritic( $czech_string );
DESCRIPTION¶
Changes characters with diacritics into their base characters.
Characters are also mapped to their base character in cases where Unicode does not provide a decomposition.
For example, none of the '... WITH STROKE' characters, such as 'LATIN SMALL LETTER L WITH STROKE', has a decomposition. For that character the result is 'LATIN SMALL LETTER L'.
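The two-stage behaviour described above can be sketched with the core modules listed under DEPENDENCIES. This is an illustration under stated assumptions, not the module's actual source: strip_diacritics_sketch and base_char_sketch are hypothetical names, and cutting the Unicode character name at ' WITH ' is only one plausible way to find a base character when no decomposition exists.

```perl
use strict;
use warnings;
use charnames ();
use Unicode::Normalize qw(NFD);

# Hypothetical helper: map one character to a base character by name.
# Assumption: the base name is whatever precedes ' WITH ' in the
# official Unicode character name.
sub base_char_sketch {
    my ($char) = @_;
    return $char if ord($char) < 128;            # ASCII stays as it is
    my $name = charnames::viacode( ord $char );  # e.g. 'LATIN SMALL LETTER L WITH STROKE'
    return $char unless defined $name && $name =~ /\A(.+?) WITH /;
    my $code = charnames::vianame($1);           # code point of 'LATIN SMALL LETTER L'
    return defined $code ? chr($code) : $char;
}

# Hypothetical helper: decompose, drop combining marks, then fall back
# to the name-based lookup for characters without a decomposition.
sub strip_diacritics_sketch {
    my ($text) = @_;
    my $decomposed = NFD($text);     # canonical decomposition
    $decomposed =~ s/\p{M}//g;       # remove combining marks
    return join '', map { base_char_sketch($_) } split //, $decomposed;
}
```

With this sketch, a character such as U+00E9 is handled by the decomposition step, while U+0142 (LATIN SMALL LETTER L WITH STROKE) only changes in the name-based fallback.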
Removing diacritics is useful for matching text independent of spelling variants.
SUBROUTINES/METHODS¶
undiacritic¶
$ascii_string = undiacritic( $characters );
Removes diacritics from $characters and returns a simplified character string.
The input string must be in character mode, i.e. a string of decoded Unicode code points rather than encoded bytes.
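As a minimal sketch of what this requirement means in practice, text read as raw UTF-8 bytes can be decoded with the core Encode module before it is handed to undiacritic. The variable names and sample bytes are illustrative assumptions, and the call to undiacritic is shown only as a comment so that the fragment runs without the module installed.

```perl
use strict;
use warnings;
use Encode qw(decode);

# UTF-8 encoded bytes for a four-letter word containing
# U+0142, U+00F3, 'd' and U+017A (sample data, chosen for illustration).
my $bytes = "\xC5\x82\xC3\xB3d\xC5\xBA";

# Decode the bytes into a character string of Unicode code points.
my $characters = decode( 'UTF-8', $bytes );

# $bytes holds 7 bytes, $characters holds 4 code points.
# The decoded string is what would be passed on:
#   my $ascii_string = undiacritic($characters);
```

Passing the undecoded bytes instead would make each byte look like a separate Latin-1 character, so the result would not be the intended base letters.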
DIAGNOSTICS¶
CONFIGURATION AND ENVIRONMENT¶
DEPENDENCIES¶
- version
- charnames
- Unicode::Normalize
INCOMPATIBILITIES¶
BUGS AND LIMITATIONS¶
It is not known whether this module gives useful results for scripts other than Latin.
AUTHOR¶
Helmut Wollmersdorfer "<WOLLMERS@cpan.org>"
LICENSE AND COPYRIGHT¶
Copyright (c) 2007, Helmut Wollmersdorfer "<WOLLMERS@cpan.org>". All rights reserved.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
2023-02-05 | perl v5.36.0 |