NAME
Unicode::LineBreak~[ja] - UAX #14 Unicode XXXXXXXXX
SYNOPSIS
use Unicode::LineBreak;
$lb = Unicode::LineBreak->new();
$broken = $lb->break($string);
DESCRIPTION
Unicode::LineBreak XXUnicode XXXXXX14 [UAX #14] XXXX Unicode XXXXXXXXXXXXXXX
XXXXXXXXXXXXXXX11 [UAX #11] XXXXXX East_Asian_Width XXXXXXXXXX
XXXXXXXXXXXXX
XXXXXmandatory breakXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX [UAX #14] XXXXXXXXXXXX
XXXXXdirect breakXX
XXXXXindirect breakXXXXXX
XXXXXXXXXalphabetic charactersXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXideographic charactersXXXXXXXXXXXXXXXXXXXX [UAX #14]
XXXXXXXXXXXXXXXX AL XXXXXXXXXXXXXXXX ID XXXXXXX (XXXXXXXXXXXXXXXXXXXXXXXX)X
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXX
XXXwideXXX
XXXnarrowXXXXXXXXXXXnonspacingXXXXXXXXXXXXX 2 XX1 XX0
XXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXX
PUBLIC INTERFACE
XXXX
- new ([KEY => VALUE, ...])
- XXXXXXXX KEY => VALUE XXXXXXX "XXXXX" XXXX
- break (STRING)
- XXXXXXXXXXX Unicode XXX STRING XXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXX
- break_partial (STRING)
- XXXXXXXXXXX break() XXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXSTRING XXX "undef" XXXXX
- config (KEY)
- config (KEY => VALUE, ...)
- XXXXXXXXXXX XXXXXXXXXXXXX KEY => VALUE XXXXXXX "XXXXX"
XXXX
- copy
- XXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX
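The methods listed above can be combined as in the following minimal sketch. It is an illustration only: the sample text, the variable names, and the chunk size of 25 characters are assumptions, not part of the module. It relies on the getter form config (KEY), the setter form config (KEY => VALUE), copy, break, and break_partial with "undef" marking the end of input.

```perl
use strict;
use warnings;
use Unicode::LineBreak;

# Hypothetical sample text; break() expects a Unicode (character) string.
my $text = join ' ', ('The quick brown fox jumps over the lazy dog.') x 4;

my $lb = Unicode::LineBreak->new(ColMax => 40);

# copy() returns a separate object, so reconfiguring it leaves $lb as it was.
my $narrow = $lb->copy;
$narrow->config(ColMax => 20);
printf "ColMax: original=%d, copy=%d\n",
    $lb->config('ColMax'), $narrow->config('ColMax');

# One-shot breaking of the whole text.
my $whole = $lb->break($text);

# Incremental breaking: feed the text in 25-character chunks,
# then pass undef to signal that the input is complete.
my $pieces = '';
for my $chunk ($text =~ /.{1,25}/gs) {
    $pieces .= $narrow->break_partial($chunk) // '';
}
$pieces .= $narrow->break_partial(undef) // '';

print $whole, "\n---\n", $pieces, "\n";
```

The `// ''` guards are a defensive choice for this sketch in case a call yields no output for a given chunk.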
XXXXX
- breakingRule (BEFORESTR, AFTERSTR)
- XXXXXXXXXXX XXX BEFORESTR X AFTERSTR XXXXXXXXXXXXX XXXXXXX
"XX" XXXX
X: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXbreak()
XXXXXXXXXXXXXX
- context ([Charset => CHARSET], [Language => LANGUAGE])
- XXX XXXXXXXX CHARSET XXXXXXXX LANGUAGE XXXXXXXXXX/XXXXXXXXX
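A minimal sketch of the two calls above follows. It assumes that `use Unicode::LineBreak qw(:all)` imports the context function and the constants "MANDATORY", "DIRECT", "INDIRECT" and "PROHIBITED" listed later in this manual; the string pairs and the language tag are arbitrary illustrations. No expected output is shown, because the result depends on the line breaking classes of the characters involved.

```perl
use strict;
use warnings;
use Unicode::LineBreak qw(:all);   # assumption: exports constants and context()

# Map the numeric rule constants back to readable names.
my %rule_name = (
    MANDATORY()  => 'MANDATORY',
    DIRECT()     => 'DIRECT',
    INDIRECT()   => 'INDIRECT',
    PROHIBITED() => 'PROHIBITED',
);

my $lb = Unicode::LineBreak->new();

# Hypothetical string pairs (BEFORESTR, AFTERSTR).
for my $pair (['a', 'b'], ['a', '-'], ['-', 'b'], ['(', 'a']) {
    my $rule = $lb->breakingRule(@$pair);
    my $name = defined $rule ? ($rule_name{$rule} // "other ($rule)") : 'undef';
    printf "%-3s | %-3s : %s\n", $pair->[0], $pair->[1], $name;
}

# context() is a plain function; here it is asked about a language tag.
my $context = context(Language => 'ja');
print "Context for 'ja': ", (defined $context ? $context : 'undef'), "\n";
```

The value returned by context() is meant to be passed on as the "Context" option of "new" or "config".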
XXXXX
"new"X"config" XXXXXXXXXXXXXXXXXXX XXXXX ([E])XXXXXXXXXX ([G])
(Unicode::GCString~[ja] XXX)XXXXXX ([L]) XXXXXXXXXXX
- BreakIndent => "YES" | "NO"
- [L] XXX SPACE XXX (XXXXX) XXXXXXXXXXXX [UAX #14] X SPACE
XXXXXXXXXXXXXXXXX XXXX "YES"X
X: XXXXXXXXXXXX 1.011 XXXXXXX
- CharMax => NUMBER
- [L] XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXX XXXX
998X 0 XXXXXXX
- ColMin => NUMBER
- [L] XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXX 0X
- ColMax => NUMBER
- [L] XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXX 76X
"Urgent" XXXXXXXX "XXXXXXXXXXX" XXXX
- ComplexBreaking => "YES" | "NO"
- [L] XXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXX
"YES"X
- Context => CONTEXT
- [E][L] XX/XXXXXXXXXXX XXXXXXXX "EASTASIAN" X
"NONEASTASIAN"X XXXXXX "NONEASTASIAN"X
"EASTASIAN" XXXXXEast_Asian_Width XXXXX (A) XXXXXXXXXXXXXXXXXXXXX
AI XXXXXXXXX (ID) XXXXX
"NONEASTASIAN" XXXXXEast_Asian_Width XXXXX (A)
XXXXXXXXXXXXXXXXXXXXX AI XXXXXXXXX (AL) XXXXX
- EAWidth => "[" ORD "=>" PROPERTY "]"
- EAWidth => "undef"
- [E] XXXXXX East_Asian_Width XXXXXXXXX ORD XXXX UCS
XXXXXXXXXXXXXXXXXXX PROPERTY X East_Asian_Width XXXXXXXXXXXX
("XX" XXX)X XXXXXXXXXXXXXXXXX "undef"
XXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXEast_Asian_width XXXXXXXXXXX "XXXXXXXXX" XXXX
- Format => METHOD
- [L] XXXXXXXXXXXXXXXXXX
- "SIMPLE"
- XXXXXX XXXXXXXXXXXXXXXXXX
- "NEWLINE"
- "Newline" XXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXX
- "TRIM"
- XXXXXXXXXXXXXXXX XXXXXXXXXXXXXXX
- "undef"
- XXXXXX (XXXXXX)X
- XXXXXXXXXX
- "XXXX" XXXX
- HangulAsAL => "YES" | "NO"
- [L] XXXXXXXXXXXXXXXXXconjoining jamoXXXXXXXXXX (AL) XXXX XXXX
"NO"X
- LBClass => "[" ORD "=>" CLASS "]"
- LBClass => "undef"
- [G][L] XXXXXXXXXXX (XX) XXXXXXX ORD XXXX UCS
XXXXXXXXXXXXXXXXXXX CLASS XXXXXXXXXXXX ("XX" XXX)X
XXXXXXXXXXXXXXXXX "undef" XXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXX "XXXXXXXXX" XXXX
- LegacyCM => "YES" | "NO"
- [G][L] XXXXXXXXXXXXXXXXXXXXXX (ID) XXXX Unicode 5.0
XXXXXXXXXXXXXXXXXXXXXXXXXXX XXXX "YES"X
- Newline => STRING
- [L] XXXXXXXXX Unicode XXXX XXXX "\n"X
- Prep => METHOD
- [L] XXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXX METHOD XXXXXXXXXXXXXX
- "NONBREAKURI"
- URI XXXXXXX
- "BREAKURI"
- URI XXXXXXXXXXXXXXXXX XXXX [CMOS] X 6.17 XX 17.11 XXXXX
- "[" REGEX, SUBREF "]"
- XXXX REGEX XXXXXXXXXXXSUBREF XXXXXXXXXXXXXXXXXX XXX
"XXXXXXXXXXX" XXXX
- "undef"
- XXXXXXXXXXXXXXXXXXXX
- Sizing => METHOD
- [L] XXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX
- "UAX11"
- XXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXX
- "undef"
- XXXXXXXXXXXXXXX (Unicode::GCString XX) XXXXXX
- XXXXXXXXXX
- "XXXXXXX" XXXX
"ColMax"X"ColMin"X"EAWidth" XXXXXXXXX
- Urgent => METHOD
- [L] XXXXXXXXXXXXXXXX XXXXXXXXXXXXX
- "CROAK"
- XXXXXXXXXXXXXXXX
- "FORCE"
- XXXXXXXXXXXXXXXXX
- "undef"
- XXXXXX XXXXXXXXXXXXXX
- XXXXXXXXXX
- "XXXXXXXXXXX" XXXX
- ViramaAsJoiner => "YES" | "NO"
- [G] XXXXXXX (XXXXXXXXXXXXXXXXXXXXXXXXX) XXXXXXXXXXXXXXX XXXX
"YES"X X: XXXXXXXXXXXX 2011.001_29 XXXXXXX XXXXXXXXX
"NO" XXXXXXXX XXXX[UAX #29] XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- "EA_Na", "EA_N", "EA_A", "EA_W",
"EA_H", "EA_F"
- [UAX #11] XXXXXX 6 XX East_Asian_Width XXXX X (Na)XXX (N)XXX (A)XX (W)XXX
(H)XXX (F)X
- "EA_Z"
- XXXXXXXXXX East_Asian_Width XXXXX
X: XXXXXXXXXXXXXXXXXXXXXXXXXXXX [UAX #11] XXXXXXXX
- "LB_BK", "LB_CR", "LB_LF",
"LB_NL", "LB_SP", "LB_OP", "LB_CL",
"LB_CP", "LB_QU", "LB_GL", "LB_NS",
"LB_EX", "LB_SY", "LB_IS", "LB_PR",
"LB_PO", "LB_NU", "LB_AL", "LB_HL",
"LB_ID", "LB_IN", "LB_HY", "LB_BA",
"LB_BB", "LB_B2", "LB_CB", "LB_ZW",
"LB_CM", "LB_WJ", "LB_H2", "LB_H3",
"LB_JL", "LB_JV", "LB_JT", "LB_SG",
"LB_AI", "LB_CJ", "LB_SA", "LB_XX",
"LB_RI"
- [UAX #14] XXXXXX 40 XXXXXXX (XX)X
X: XXX CP XUnicode 5.2.0XXXXXXXX XXX HL X CJ XUnicode 6.1.0XXXXXXXX
XXX RI X Unicode 6.2.0XXXXXXXX
- "MANDATORY", "DIRECT", "INDIRECT",
"PROHIBITED"
- XXXXXXXX 4 XXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- "Unicode::LineBreak::SouthEastAsian::supported"
- XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXX XXXXXXX
"undef"X
X: XXXXXXXXXXXXXXXXXXXXXXXXXXX
- "UNICODE_VERSION"
- XXXXXXXXXXXX Unicode XXXXXXXXXXX
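As an illustration of the options and constants in this section, the sketch below builds an object with several string-valued options and prints the Unicode version constant. The option names and values are those listed above; the sample text, the URL in it, and the column width of 30 are assumptions chosen for the example.

```perl
use strict;
use warnings;
use Unicode::LineBreak;

my $lb = Unicode::LineBreak->new(
    ColMax  => 30,              # narrower than the default of 76
    Format  => 'NEWLINE',       # built-in formatter, uses the Newline option
    Newline => "\n",
    Urgent  => 'FORCE',         # built-in handling of over-long fragments
    Prep    => 'NONBREAKURI',   # built-in URI preprocessing
    Context => 'NONEASTASIAN',
);

# Hypothetical input with a URI and one very long word.
my $text = 'See http://example.org/a/rather/long/path/to/a/resource for details. '
         . 'Pneumonoultramicroscopicsilicovolcanoconiosis is a long word.';

my $out = $lb->break($text);
print $out;

# The constant is called fully qualified here, so nothing needs to be imported.
print "Unicode version: ", Unicode::LineBreak::UNICODE_VERSION(), "\n";
```

Swapping 'FORCE' for 'CROAK', or 'NONBREAKURI' for 'BREAKURI', is a quick way to compare the built-in behaviors on the same input.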
CUSTOMIZATION
XXXX
"Format" XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 3 XXXXXXXXXXXXXXXX
$XXX = &XXXXXX(SELF, EVENT, STR);
SELF X Unicode::LineBreak XXXXXXXEVENT XXXXXXXXXXXXXXXXXXXXXSTR XXXXXXXXXXXX
Unicode XXXXXXX
EVENT |XXXXX |STR
-----------------------------------------------------------------
"sot" |XXXXXX |XXXXXXX
"sop" |XXXXXX |XXXXXX
"sol" |XXXXXX |XXXXXXX
"" |XXXXX |XXX (XXXXXXXXXX)
"eol" |XXXX |XXXXXXXXXXX
"eop" |XXXX |XXXXXXXXXXX
"eot" |XXXXXX |XXXXXXXXXXX (XXX)
-----------------------------------------------------------------
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"undef" XXXXXXX
XXX"sot"X"sop"X"sol"
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XX: XXXXXXXXXXXXXXXXXXXXXXX Unicode::GCString~[ja] XXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
sub fmt {
    if ($_[1] =~ /^eo/) {
        return "\n";
    }
    return undef;
}
my $lb = Unicode::LineBreak->new(Format => \&fmt);
$output = $lb->break($text);
XXXXXXXXXXX
XXXXXXXXXXXXX CharMaxXColMaxXColMin XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXX "Urgent" XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 2
XXXXXXXXXXXXXXXX
@XXX = &XXXXXX(SELF, STR);
SELF X Unicode::LineBreak XXXXXXXSTR XXXXXX Unicode XXXX
XXXXXXXXXXX STR XXXXXXXXXXXXXXXXXXXXXX
XX: XXXXXXXXXXXXXXXXXXXXXXX Unicode::GCString~[ja] XXX
XXXXXXXXXXXXXXXXXX (XXXXXXX) XXXXXXXXXXXXXXXXXXXXXXXXX
sub hyphenize {
    return map {$_ =~ s/yl$/yl-/; $_} split /(\w+?yl(?=\w))/, $_[1];
}
my $lb = Unicode::LineBreak->new(Urgent => \&hyphenize);
$output = $lb->break("Methionylthreonylthreonylglutaminylarginyl...");
"Prep" XXXXXX [REGEX, SUBREF] XXXXXXXXXXXXXXXXXXXX 2 XXXXXXXXXXXXXXXX
@XXX = &XXXXXX(SELF, STR);
SELF X Unicode::LineBreak XXXXXXXSTR X REGEX XXXXXXXXXXX Unicode XXXX
XXXXXXXXXXX STR XXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXHTTP URL X [CMOS] XXXXXXXXXXXX
my $url = qr{http://[\x21-\x7E]+}i;
sub breakurl {
    my $self = shift;
    my $str = shift;
    return split m{(?<=[/]) (?=[^/]) |
                   (?<=[^-.]) (?=[-~.,_?\#%=&]) |
                   (?<=[=&]) (?=.)}x, $str;
}
my $lb = Unicode::LineBreak->new(Prep => [$url, \&breakurl]);
$output = $lb->break($string);
XXXXX
Unicode::LineBreak XXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXX
sub paraformat {
    my $self = shift;
    my $action = shift;
    my $str = shift;
    if ($action eq 'sot' or $action eq 'sop') {
        $self->{'line'} = '';
    } elsif ($action eq '') {
        $self->{'line'} = $str;
    } elsif ($action eq 'eol') {
        return "\n";
    } elsif ($action eq 'eop') {
        if (length $self->{'line'}) {
            return "\n\n";
        } else {
            return "\n";
        }
    } elsif ($action eq 'eot') {
        return "\n";
    }
    return undef;
}
my $lb = Unicode::LineBreak->new(Format => \&paraformat);
$output = $lb->break($string);
XXXXXXX
"Sizing" XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 5 XXXXXXXXXXXXXXXX
$XX = &XXXXXX(SELF, LEN, PRE, SPC, STR);
SELF X Unicode::LineBreak XXXXXXXLEN XXXXXXXXXXXXPRE XXXXX Unicode XXXXSPC
XXXXXXXXXXXSTR XXXXX Unicode XXXX
XXXXXXX "PRE.SPC.STR" XXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXX"ColMin" XXXXXXXX "ColMax"
XXXXXXXXXXXXXXXXXXXXXX
XX: XXXXXXXXXXXXXXXXXXXXXXX Unicode::GCString~[ja] XXX
XXXXXXXXXXXXX 8 XXXXXXXXXXXXXXXXXXXXXXX
sub tabbedsizing {
    my ($self, $cols, $pre, $spc, $str) = @_;
    my $spcstr = $spc.$str;
    while ($spcstr->lbc == LB_SP) {
        my $c = $spcstr->item(0);
        if ($c eq "\t") {
            $cols += 8 - $cols % 8;
        } else {
            $cols += $c->columns;
        }
        $spcstr = $spcstr->substr(1);
    }
    $cols += $spcstr->columns;
    return $cols;
}
my $lb = Unicode::LineBreak->new(LBClass => [ord("\t") => LB_SP],
                                 Sizing => \&tabbedsizing);
$output = $lb->break($string);
XXXXXXXXX
"LBClass" XXXXXXXX "EAWidth" XXXXXXXXXXXXXXXXX (XX) X
East_Asian_Width XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXX
XXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXX (NS XXX CJ) XXXX XXXXX LBClass
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (ID) XXXXX
- "KANA_NONSTARTERS() => LB_ID"
- XXXXXXXXX
- "IDEOGRAPHIC_ITERATION_MARKS() => LB_ID"
- XXXXXXXXXXXXX U+3005 XXXXXXU+303B XXXXXU+309D XXXXXXXXXU+309E XXXXXXXX
(XX)XU+30FD XXXXXXXXXU+30FE XXXXXXXX (XX)X
XXXXXXXXXXXXXX
- "KANA_SMALL_LETTERS() => LB_ID"
- "KANA_PROLONGED_SOUND_MARKS() => LB_ID"
- XXXXXX XXXXXX U+3041 X, U+3043 X, U+3045 X, U+3047 X, U+3049 X, U+3063 X,
U+3083 X, U+3085 X, U+3087 X, U+308E X, U+3095 X, U+3096 XX XXXXXX U+30A1
X, U+30A3 X, U+30A5 X, U+30A7 X, U+30A9 X, U+30C3 X, U+30E3 X, U+30E5 X,
U+30E7 X, U+30EE X, U+30F5 X, U+30F6 XX XXXXXXX U+31F0 X - U+31FF XX
XXXXXX (XXXX) U+FF67 X - U+FF6F XX
XXXXX U+30FC XXXXXU+FF70 XXXX (XXXX)X
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[JIS X 4051] 6.1.1X[JLREQ]
3.1.7 X [UAX14] XXXX
XXU+3095 X, U+3096 X, U+30F5 X, U+30F6 X XXXXXXXXXXXX
- "MASU_MARK() => LB_ID"
- U+303C XXXXX
XXXXXXXXXXXXXXXXX "XX" X "XX" XXXXXXXXXXXX
XXXXXXX [UAX #14] XXXXXXXX (NS) XXXXXXXX[JIS X 4051] X [JLREQ] XXXXXXX (13)
X cl-19 (ID XXX) XXXXXXX
XXXXXX
XXXXXXXXXXXXXXXXXX (QU) XXXX
- "BACKWARD_QUOTES() => LB_OP, FORWARD_QUOTES() => LB_CL"
- XXXX (XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX) XXXXXXXX 9
XXXXXXXXXXX (X X) XXXXXXX 9 XXXXXXX (X X) XXXXX
- "FORWARD_QUOTES() => LB_OP, BACKWARD_QUOTES() => LB_CL"
- XXXXX (XXXXXXXXXXXXXXXX) XXX9 XXXXXXX (X X) XXXXXXX9 XXXXXXXXXXX (X X)
XXXXXXXXXX
- "BACKWARD_GUILLEMETS() => LB_OP, FORWARD_GUILLEMETS() => LB_CL"
- XXXXXXXXXXXXXXXXXXXXXXXXXXXX (X X) XXXXXXXXXXXXXX (X X) XXXXXXXXXX
- "FORWARD_GUILLEMETS() => LB_OP, BACKWARD_GUILLEMETS() => LB_CL"
- XXXXXXXXXXXXXXXXXXXXX (X X) XXXXXXXXXXXXXX (X X) XXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXX9 XXXXXXXX XXXXXXX (X X X X) XXXXXXXXXXXXXXXXX
XXXX
- "IDEOGRAPHIC_SPACE() => LB_BA"
- U+3000 XXXXXXXXXXXXXXXXX XXXXXXXXXXXX
- "IDEOGRAPHIC_SPACE() => LB_ID"
- XXXXXXXXXXXXXXXX Unicode 6.2XXXXXXXXXXXXXXXX
- "IDEOGRAPHIC_SPACE() => LB_SP"
- XXXXXXXXXXXXXXXXXXXXXXXXX
East_Asian_Width XX
XXXXXXXXXXXXXXXXXXXXXXXXXXXX (A) X East_Asian_Width XXXXXXXXXXXXXXXXXXXXXX
"EASTASIAN" XXXXXXXXXXXXX "EAWidth => [ AMBIGUOUS_*() => EA_N ]"
XXXXXXXXXXXXXXXXXXXXXXXXXXX
- "AMBIGUOUS_ALPHABETICS() => EA_N"
- XXXXXXXXX East_Asian_Width XX N (XX) XXXXXXX
- "AMBIGUOUS_CYRILLIC() => EA_N"
- "AMBIGUOUS_GREEK() => EA_N"
- "AMBIGUOUS_LATIN() => EA_N"
- XX (A) XXXXXXXXXXXXXXXXXXXXXXXXXX (N) XXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXUnicode XXXXXX (F)
XXXXXXXXXXXXXX (Na) XXXXXXXXXXXXXXXXEAWidth XXXXXXXXXXXXXXXXXXXXXXXXXXX
"EASTASIAN" XXXXXXXXXXXX
- "QUESTIONABLE_NARROW_SIGNS() => EA_A"
- U+00A2 XXXXXXU+00A3 XXXXXXU+00A5 XXX (XXXXXX)XU+00A6 XXXXU+00AC XXXU+00AF
XXXXX
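The tailoring pairs listed in this section are written inside the array references given to "LBClass" and "EAWidth". The following minimal sketch combines one of each. It assumes that `use Unicode::LineBreak qw(:all)` imports the character-set constants together with "LB_ID" and "EA_N"; the Japanese sample sentence and the column width are arbitrary.

```perl
use strict;
use warnings;
use utf8;                          # the sample text below is written in UTF-8
use Unicode::LineBreak qw(:all);   # assumption: exports LB_*, EA_* and set constants

binmode STDOUT, ':encoding(UTF-8)';

my $lb = Unicode::LineBreak->new(
    Context => 'EASTASIAN',
    ColMax  => 20,
    # Line breaking tailoring: treat kana nonstarters as class ID.
    LBClass => [KANA_NONSTARTERS() => LB_ID],
    # Width tailoring: treat ambiguous-width Latin letters as class N.
    EAWidth => [AMBIGUOUS_LATIN() => EA_N],
);

# Hypothetical sample text containing small kana and a prolonged sound mark.
my $text = 'ちょっと待ってください。チェックしてから、ニュースをお知らせします。';
print $lb->break($text), "\n";
```

Running the same text through an object created without the "LBClass" entry is a simple way to see where the tailoring changes the break positions.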
XXXXXX
"new" XXXXXXX "config" XXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXX
Unicode/LineBreak/Defaults.pmX XXX
Unicode/LineBreak/Defaults.pm.sample XXXXXXXX
BUGS
XXXXXXXXXXXXXXXXXXXXXXXXX
CPAN Request Tracker:
<http://rt.cpan.org/Public/Dist/Display.html?Name=Unicode-LineBreak>.
VERSION
$VERSION XXXXXXXXXXX
XXXXXX
- 2012.06
- eawidth() XXXXXXXXXX XXXX "columns" in Unicode::GCString
XXXXXXXXXXX
- lbclass() XXXXXXXXXX "lbc" in Unicode::GCString X
"lbcext" in Unicode::GCString XXXXXXXX
XXXXXXX
XXXXXXXXXXXXXXXXXXXXXUnicode XX 6.3.0XXXXX
XXXXXXXXXXXXXX UAX14-C2 XXXXXXXXXXX
IMPLEMENTATION NOTES
- XXXXXXXXXXXX NS XXXXXX ID XXXXXXXXXXX
- XXXXXXXXXXXXXXXXXXX ID XXXXXX AL XXXXXXXXXXX
- AI XXXXXXXXX AL X ID XXXXXXXXXXXXXXX
- CB XXXXXXXXXXXXXXX
- CJ XXXXXXXXXXX NS XXXXXXXXXXXXXXXXXXXXXXXX
- XXXXXXXXXXXXXXXXXXXXXXXXX SA XXXXXXXXX AL XXXXXX
XXXXGrapheme_Cluster_Break XXXXX Extend X SpacingMark XXXXXX CM
XXXXXX
- SG X XX XXXXXXXXX AL XXXXXX
- XXX UCS XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XX | UAX #14 | UAX #11 | XX
-------------------------------------------------------------
U+20A0..U+20CF | PR [*1] | N [*2] | XXXX
U+3400..U+4DBF | ID | W | CJKXX
U+4E00..U+9FFF | ID | W | CJKXX
U+D800..U+DFFF | AL (SG) | N | XXXXX
U+E000..U+F8FF | AL (XX) | F X N (A) | XXXX
U+F900..U+FAFF | ID | W | CJKXX
U+20000..U+2FFFD | ID | W | CJKXX
U+30000..U+3FFFD | ID | W | XXX
U+F0000..U+FFFFD | AL (XX) | F X N (A) | XXXX
U+100000..U+10FFFD | AL (XX) | F X N (A) | XXXX
XXXXXXXX | AL (XX) | N | XXXXXX
| | | XXXXXX
-------------------------------------------------------------
[*1] U+20A7 XXXXX (PO)XU+20B6 XXXXXXXXXXX
(PO)XU+20BB XXXXXXXXXXXXXXX (PO) XXXX
[*2] U+20A9 XXXXX (H)XU+20AC XXXXX (F X N (A)) X
XXX
- XXXXXXXXX MnXMeXCcXCfXZlXZp XXXXXXXXXXXXXXXXXXXXXXXXXX
REFERENCES
- [CMOS]
- The Chicago Manual of Style, 15th edition. University of Chicago
Press, 2003.
- [JIS X 4051]
- JIS X 4051:2004 XXXXXXXXXX. XXXXXX, 2004.
- [JLREQ]
- XXXXX. XXXXXXXXXX, W3C XXXXX 2012X4X3X.
<http://www.w3.org/TR/2012/NOTE-jlreq-20120403/ja/>.
- [UAX #11]
- A. Freytag (ed.) (2008-2009). Unicode Standard Annex #11: East Asian
Width, Revisions 17-19. <http://unicode.org/reports/tr11/>.
- [UAX #14]
- A. Freytag and A. Heninger (eds.) (2008-2013). Unicode Standard Annex
#14: Unicode Line Breaking Algorithm, Revisions 22-32.
<http://unicode.org/reports/tr14/>.
- [UAX #29]
- Mark Davis (ed.) (2009-2013). Unicode Standard Annex #29: Unicode Text
Segmentation, Revisions 15-23.
<http://www.unicode.org/reports/tr29/>.
SEE ALSO
Text::LineFold~[ja], Text::Wrap, Unicode::GCString~[ja].
AUTHOR
Copyright (C) 2009-2013 Hatuka*nezumi - IKEDA Soji <hatuka(at)nezumi.nu>.
This program is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.