NAME¶
HTML::FormFu::Manual::Unicode - Working with unicode
DESCRIPTION¶
Working with unicode.
For a practical example, see the Catalyst application in the
"examples/unicode" directory in this distribution.
ASSUMPTIONS¶
In this tutorial, we're assuming that all encodings are UTF-8. It's relatively
simple to combine different encodings from different sources, but that's
beyond the scope of this tutorial.
For simplicity, we're also going to assume that you're using Catalyst for your
web-framework, DBIx::Class for your database ORM, TT for your templating
system, and YAML format "HTML::FormFu" configuration files, with
YAML::XS installed. However, the principles we'll cover should translate to
whatever technologies you choose to work with.
BASICS¶
To make it short and sweet: you must decode all data going into your program,
and encode all data coming from your program.
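As a minimal sketch of that rule, the round trip below uses the core Encode module. The variable names and the sample string are only illustrative; in a real application the incoming bytes would come from the browser, a file, or the database.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Bytes as they might arrive from outside the program: "café" in UTF-8.
my $bytes_in = "caf\xC3\xA9";

# Decode on the way in: UTF-8 bytes become a string of characters.
my $text = decode( 'UTF-8', $bytes_in );

# Work on characters, not bytes. uc() now knows that "é" has an uppercase form.
my $shout = uc $text;

# Encode on the way out: characters become UTF-8 bytes again.
my $bytes_out = encode( 'UTF-8', $shout );

printf "%d bytes in, %d characters, %d bytes out\n",
    length($bytes_in), length($text), length($bytes_out);
```

The decoded string is one unit shorter than the byte string because the two-byte sequence for "é" has become a single character.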
Skip to "CHANGES REQUIRED" if you want to see what you need to do
without any other explanation.
INPUT¶
Input parameters from the browser¶
If you're using "Catalyst", Catalyst::Plugin::Unicode will decode all
input parameters sent from the browser to your application - see
"Catalyst Configuration".
If you're using some other framework or, in any case, you need to decode the
input parameters yourself, please take a look at HTML::FormFu::Filter::Encode.
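If you do end up decoding parameters by hand, the idea can be sketched with the core Encode module. This is not how HTML::FormFu::Filter::Encode is implemented; "decode_params" is a hypothetical helper, and the parameter names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical helper: returns a copy of a parameter hash with every value
# decoded from UTF-8 bytes. Multi-valued parameters (array refs) are
# decoded element by element. Parameter names are left untouched.
sub decode_params {
    my ($params) = @_;
    my %decoded;
    for my $name ( keys %$params ) {
        my $value = $params->{$name};
        $decoded{$name} =
            ref($value) eq 'ARRAY'
            ? [ map { decode( 'UTF-8', $_ ) } @$value ]
            : decode( 'UTF-8', $value );
    }
    return \%decoded;
}

# Raw bytes, as a framework without automatic decoding might hand them over.
my %raw = (
    firstname => "Ren\xC3\xA9e",
    tags      => [ "na\xC3\xAFve", "plain" ],
);

my $params = decode_params( \%raw );
```

The original hash is left unchanged, so the raw bytes are still available if something else needs them.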
Data from the database¶
If you're using DBIx::Class, DBIx::Class::UTF8Columns is likely the best
option, as it will decode all input retrieved from the database - see
"DBIx::Class Configuration".
In other cases (e.g. plain DBI), you still need to decode the string data coming
from the database. How you do this varies depending on the database server. For
MySQL, for instance, you can use the "mysql_enable_utf8" attribute: see the
DBD::mysql documentation for details.
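For the plain DBI case, a connection might be set up along these lines. "mysql_utf8_connect_args" is a hypothetical helper, and the database name and credentials are placeholders; only the "mysql_enable_utf8" attribute comes from DBD::mysql itself.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the argument list for DBI->connect with
# UTF-8 handling switched on for DBD::mysql.
sub mysql_utf8_connect_args {
    my ( $database, $user, $password ) = @_;
    return (
        "dbi:mysql:database=$database",
        $user,
        $password,
        {
            RaiseError        => 1,    # die on database errors
            mysql_enable_utf8 => 1,    # DBD::mysql: handle text data as UTF-8
        },
    );
}

# Placeholder database name and credentials.
my @connect_args = mysql_utf8_connect_args( 'myapp', 'myuser', 'secret' );

# With DBI and DBD::mysql installed, the handle would then be created with:
#   my $dbh = DBI->connect(@connect_args);
```

Keeping the attributes in one place makes it harder to open a second connection that forgets the UTF-8 setting.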
Your template files¶
Set TT to decode all template files - see "TT Configuration".
Set "HTML::FormFu" to decode all template files - see
"HTML::FormFu Template Configuration".
Your config files¶
If you're using "YAML" config files, your files will automatically be
decoded by "load_config_file" and "load_config_filestem".
If you have Config::General config files, they will also be decoded
automatically by "load_config_file" and "load_config_filestem", which
set Config::General's "-UTF8" option for you.
Your perl source code¶
Any perl source files which contain Unicode characters must use the utf8 module.
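The effect of the pragma can be seen in a small sketch, assuming the source file itself is saved as UTF-8. The same literal is measured once with the pragma in effect and once without.

```perl
use strict;
use warnings;

my ( $chars, $bytes );

{
    use utf8;    # literals in this block are decoded from UTF-8
    $chars = length 'café';
}

{
    no utf8;     # literals in this block stay as raw bytes
    $bytes = length 'café';
}

print "with utf8: $chars, without utf8: $bytes\n";
```

Without the pragma, Perl sees the two UTF-8 bytes of "é" as two separate characters, which is why the second length is larger.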
OUTPUT¶
Data saved to the database¶
With "DBIx::Class", DBIx::Class::UTF8Columns will encode all data sent
to the database - see "DBIx::Class Configuration".
HTML sent to the browser¶
With "Catalyst", Catalyst::Plugin::Unicode will encode all output sent
from your application to the browser - see "Catalyst Configuration".
In other circumstances you need to be sure to output your Unicode (decoded)
strings in UTF-8. To do this you can encode your output before it's sent to
the browser with something like:
use utf8;
if ( $output && utf8::is_utf8($output) ) {
    utf8::encode($output);    # encodes in place
}
Another option is to set the "binmode" for "STDOUT":
binmode STDOUT, ':utf8';
However, be sure to do this only when sending UTF-8 data: if you're serving
images, PDF files, etc., "binmode" should remain set to ":raw".
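One way to keep the two cases apart is to choose the layer per response. The sketch below is only an illustration: "write_body" is a hypothetical helper, in-memory filehandles stand in for STDOUT so the result can be inspected, and it uses the stricter ":encoding(UTF-8)" layer rather than ":utf8".

```perl
use strict;
use warnings;

# Hypothetical helper: picks the output layer from the kind of payload.
# Text is encoded as UTF-8; binary data is written through untouched.
sub write_body {
    my ( $fh, $body, $is_text ) = @_;
    binmode $fh, $is_text ? ':encoding(UTF-8)' : ':raw';
    print {$fh} $body;
    return;
}

# A decoded (character) string: "café".
open my $text_fh, '>', \my $text_out or die "Cannot open buffer: $!";
write_body( $text_fh, "caf\x{E9}", 1 );
close $text_fh;

# The first four bytes of a PNG file, standing in for binary content.
open my $bin_fh, '>', \my $bin_out or die "Cannot open buffer: $!";
write_body( $bin_fh, "\x89PNG", 0 );
close $bin_fh;
```

The text buffer ends up with the UTF-8 bytes for "café", while the binary buffer is byte-for-byte what was passed in.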
CHANGES REQUIRED¶
Catalyst Configuration¶
Add Catalyst::Plugin::Unicode to the list of Catalyst plugins:
use Catalyst qw( ConfigLoader Static::Simple Unicode );
DBIx::Class Configuration¶
Add DBIx::Class::UTF8Columns to the list of components loaded, for each table
that has columns storing Unicode:
__PACKAGE__->load_components( qw( UTF8Columns HTML::FormFu PK::Auto Core ) );
Pass each column name that will store Unicode to "utf8_columns()":
__PACKAGE__->utf8_columns( qw( lastname firstname ) );
TT Configuration¶
Tell TT to decode all template files, by adding the following to your
application config in MyApp.pm:
package MyApp;
use strict;
use parent 'Catalyst';
use Catalyst qw( ConfigLoader );
MyApp->config({
    'View::TT' => {
        ENCODING => 'UTF-8',
    },
});
1;
Make "HTML::FormFu" tell TT to decode all template files, by adding
the following to your application config in MyApp.pm:
package MyApp;
use strict;
use parent 'Catalyst';
use Catalyst qw( ConfigLoader );
MyApp->config({
    'Controller::HTML::FormFu' => {
        constructor => {
            tt_args => {
                ENCODING => 'UTF-8',
            },
        },
    },
});
1;
The two examples above should be combined, like so:
package MyApp;
use strict;
use parent 'Catalyst';
use Catalyst qw( ConfigLoader );
MyApp->config({
    'Controller::HTML::FormFu' => {
        constructor => {
            tt_args => {
                ENCODING => 'UTF-8',
            },
        },
    },
    'View::TT' => {
        ENCODING => 'UTF-8',
    },
});
1;
AUTHORS¶
Carl Franks "cfranks@cpan.org"
Michele Beltrame "arthas@cpan.org" (contributions)
COPYRIGHT¶
This document is free; you can redistribute it and/or modify it under the same
terms as Perl itself.