UTF-8 manipulation
To use UTF-8 string literals in a script, enable the utf8 pragma in every script that contains them.
use Mojolicious::Lite;
use utf8;
my $name = "おおつか たろう";
This is a basic Perl convention. Remember to save the script itself as UTF-8.
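As a rough illustration of what the pragma changes, the sketch below compares the length of a literal in characters with the length of its UTF-8 encoding. It uses only the core Encode module, and the variable names are made up for this example.

```perl
use strict;
use warnings;
use utf8;                  # string literals in this file are decoded from UTF-8
use Encode qw(encode);

my $name  = "たろう";                  # Perl internal (character) string
my $bytes = encode('UTF-8', $name);    # the same text as a UTF-8 byte string

my $char_count = length $name;         # counts characters
my $byte_count = length $bytes;        # counts bytes

printf "%d characters, %d bytes\n", $char_count, $byte_count;
# prints "3 characters, 9 bytes"
```

Without `use utf8`, the literal would be read as raw bytes and both lengths would be 9.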
In Mojolicious, all strings taken from a request are decoded into Perl internal strings.
# Parameter value of "foo" is a Perl internal string
my $foo = $self->req->param('foo');
Before saving it to a data store such as an RDBMS, you must encode it to a byte string with encode() from the Encode module.
use Encode 'encode';
$foo = encode('UTF-8', $foo);
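The round trip can be sketched with Encode alone: encode on the way out to storage, decode on the way back in. The value below is a hypothetical stand-in for a request parameter, and no real data store is involved.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode decode);

# Stand-in for a decoded request parameter (hypothetical value)
my $foo = "すずき";

# Encode to UTF-8 bytes before handing the value to storage ...
my $stored = encode('UTF-8', $foo);

# ... and decode back to a Perl internal string after reading it
my $loaded = decode('UTF-8', $stored);

printf "stored %d bytes, loaded %d characters\n", length $stored, length $loaded;
# prints "stored 9 bytes, loaded 3 characters"
```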
Alternatively, if your DBD driver can convert Perl internal strings to byte strings itself, you can generally use that feature instead.
# SQLite
my $dbh = DBI->connect($data_source, undef, undef, {sqlite_unicode => 1});
# MySQL
my $dbh = DBI->connect($data_source, $user, $password, {mysql_enable_utf8 => 1});
Note that this setting must be given when connecting to the database, not in the middle of a connection.
When rendering HTML, Perl internal strings are automatically converted to UTF-8 byte strings. The character set should be declared in the HTML head with a meta tag that uses the "http-equiv" attribute.
get '/' => 'index';
app->start;
__DATA__
@@ index.html.ep
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>タイトル</title>
</head>
<body>
コンテンツ
</body>
</html>
When you load a JSON configuration file with the json_config plugin (named JSONConfig in recent Mojolicious releases), the data is converted from a UTF-8 byte string to Perl internal strings, so remember to save the configuration file as UTF-8.
# Load JSON configuration file
plugin 'json_config';
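The decoding step can be approximated outside Mojolicious with the core JSON::PP module. This is a sketch of the idea, not the plugin's actual implementation; the `site_name` key and the in-memory "file" are assumptions made for the example.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode);
use JSON::PP;

# Stand-in for the raw bytes of a UTF-8 encoded config file (hypothetical content)
my $file_bytes = encode('UTF-8', '{"site_name":"タイトル"}');

# ->utf8 tells the decoder that its input is UTF-8 bytes,
# so the resulting values are Perl internal strings
my $config = JSON::PP->new->utf8->decode($file_bytes);

my $title_length = length $config->{site_name};   # counted in characters
printf "site_name has %d characters\n", $title_length;
# prints "site_name has 4 characters"
```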
When you render JSON data, Perl internal strings are converted to UTF-8 byte strings, so every string passed to the renderer must be a Perl internal string (not pre-encoded to UTF-8 or another charset).
# JSON rendering
$self->render(json => $data);
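To see why pre-encoded strings are a problem, the sketch below uses the core JSON::PP module as a stand-in for the Mojolicious JSON renderer (an assumption for illustration; the renderer itself is not used here). Passing an already encoded value results in double encoding.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode);
use JSON::PP;

my $json = JSON::PP->new->utf8;    # encoder that emits UTF-8 bytes
my $name = "すずき";               # Perl internal string

# Correct: hand the encoder a Perl internal string
my $good = $json->encode({name => $name});

# Wrong: the value is already UTF-8 bytes, so it gets encoded a second time
my $bad = $json->encode({name => encode('UTF-8', $name)});

# Decoding shows the damage: only the first result round-trips
my $good_ok = $json->decode($good)->{name} eq $name ? 1 : 0;
my $bad_ok  = $json->decode($bad)->{name}  eq $name ? 1 : 0;

print "good round-trips: $good_ok, bad round-trips: $bad_ok\n";
# prints "good round-trips: 1, bad round-trips: 0"
```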
In test scripts, enable the utf8 pragma and save the script as UTF-8.
use Test::More tests => 3;
use Test::Mojo;
use utf8;
my $t = Test::Mojo->new(...);
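To put a non-ASCII value in a URL query string, encode it as UTF-8 and percent-escape it with url_escape() from Mojo::ByteStream. Escape only the value, because by default url_escape() also escapes "/", "?" and "=". b() is a shortcut for Mojo::ByteStream->new.
# Test get request
use Mojo::ByteStream 'b';
# Encode the value to UTF-8 bytes, then percent-escape it
my $name = b('すずき')->encode('UTF-8')->url_escape->to_string;
$t->get_ok("/foo?name=$name")
  ->status_is(200);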
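For reference, the escaping step can also be written out by hand with core modules only. The helper name `escape_query_value` is hypothetical, and it escapes every byte outside the RFC 3986 unreserved set, so apply it to the value rather than to the whole URL.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode);

# Hypothetical helper: UTF-8 encode a character string, then percent-escape
# every byte that is not an RFC 3986 unreserved character
sub escape_query_value {
    my ($chars) = @_;
    my $bytes = encode('UTF-8', $chars);
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $bytes;
}

my $url = '/foo?name=' . escape_query_value('すずき');
print "$url\n";
# prints "/foo?name=%E3%81%99%E3%81%9A%E3%81%8D"
```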
When you post form data in a test, it is encoded as UTF-8 by default: all parameter names and values are converted from Perl internal strings to byte strings.
# Test post request
$t->post_ok('/foo', form => {name => 'すずき'})
  ->status_is(200);