=head1 NAME

bse-unicode.pod - using Unicode with BSE

=head1 DESCRIPTION

UTF-8 support in BSE is currently experimental.  This latest support
is independent of, and incompatible with, previous implementations.

You will need to perform three steps:

=over

=item 1.

change the database character set to utf-8

=item 2.

change the BSE character set to utf-8

=item 3.

enable the C<utf8> flag

=back

=head2 Changing the database character set

For a new system you can simply run:

 cd util
 perl upgrade_mysql.pl -c utf8

For an existing system the process is more involved.

If the character set the database uses for your tables matches the
character set of the data you already have stored, then the commands
above will work.

To check the character set:

 mysql -uuser -p databasename
 mysql> show full columns from order_item;

If the C<Collation> column shows a collation for your data's character
set then the commands above will work.

Note that MySQL's C<latin1> is equivalent to C<windows-1252>.
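
The distinction only matters for bytes in the 0x80-0x9F range.  As an
illustration only (this is not part of BSE), the sketch below uses
Perl's core C<Encode> module, with its C<cp1252> and C<iso-8859-1>
encodings standing in for the two interpretations, to show the same
byte decoding to different characters:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Byte 0x80 is the euro sign in windows-1252 but a C1 control
# character in ISO-8859-1.
my $byte = "\x80";

# Decode separate copies so the original byte string is left alone.
my $copy_a = $byte;
my $copy_b = $byte;
my $as_cp1252 = decode("cp1252",     $copy_a);
my $as_latin1 = decode("iso-8859-1", $copy_b);

printf "cp1252:     U+%04X\n", ord $as_cp1252;   # U+20AC
printf "iso-8859-1: U+%04X\n", ord $as_latin1;   # U+0080
```

If your stored data contains characters such as the euro sign or curly
quotes in a single-byte encoding, it is most likely C<windows-1252>
rather than strict ISO-8859-1.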

If your database character set doesn't match your stored data, you can
fix the table character sets by converting to binary and then to the
correct character set (C<latin1> in this example):

 perl upgrade_mysql.pl -c binary
 perl upgrade_mysql.pl -c latin1

Only then perform the conversion to C<utf8>.
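
The detour through C<binary> matters because a direct character set
conversion in MySQL normally transcodes the stored bytes according to
the column's current label, while a conversion to or from C<binary>
leaves the bytes as they are.  The sketch below imitates the two paths
with Perl's core C<Encode> module.  The C<cp1251> mislabel and the
sample string are hypothetical, and the code does not show what
C<upgrade_mysql.pl> does internally:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes for "cafe" with an acute accent, as written by a
# windows-1252 client.
my $stored = "caf\xE9";

# Hypothetical mislabel: the column claims to hold cp1251 data.
my $wrong_label = "cp1251";

# A transcoding conversion reads the bytes under the wrong label
# first, so 0xE9 is taken as a Cyrillic letter and the text is damaged.
my $copy_a     = $stored;
my $transcoded = encode("cp1252", decode($wrong_label, $copy_a));

# Going through binary keeps the bytes untouched; they are then
# simply read under the correct label.
my $copy_b     = $stored;
my $relabelled = decode("cp1252", $copy_b);

printf "direct conversion changed the bytes: %s\n",
    $transcoded ne $stored ? "yes" : "no";
printf "relabelled last character: U+%04X\n",
    ord substr($relabelled, -1);
```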

=head2 Changing the BSE character set to UTF-8

As in earlier versions of BSE, set C<charset> in C<[html]>:

 [html]
 charset=utf-8

=head2 Enable the C<utf8> flag

Set C<utf8=1> in C<[basic]>:

 [basic]
 utf8=1

Note that this flag doesn't require that the BSE character set be set
to utf-8, but doing so is recommended.

The flag currently causes the following changes in behaviour:

=over

=item *

template files are converted from the BSE character set to Unicode for
internal processing.

=item *

if the BSE character set is utf-8 then the database handle is
configured to work in Unicode.

=item *

processed template output is converted from Unicode to the BSE
character set on output.

=item *

JSON output is explicitly converted to UTF-8.

=back

I<BSE character set> refers to the value configured in C<[html].charset>.
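
The template handling described above follows the usual decode on
input, process as characters, encode on output pattern.  A minimal
sketch of that pattern with Perl's core C<Encode> module follows.  The
C<$charset> variable is a hypothetical stand-in for the configured
value, and none of this is BSE's actual code:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for the configured [html].charset value.
my $charset = "utf-8";

# Template bytes as read from disk: "Caf" followed by e-acute
# encoded as two UTF-8 bytes.
my $template_bytes = "Caf\xC3\xA9";

# 1. Convert from the BSE character set to Unicode characters.
my $text = decode($charset, $template_bytes);

# 2. Process as characters: length counts characters, not bytes,
#    and uc applies Unicode case rules to the accented letter.
my $char_count = length $text;
my $upper      = uc $text;

# 3. Convert back to the BSE character set on output.
my $out_bytes = encode($charset, $upper);

printf "%d bytes in, %d characters, %d bytes out\n",
    length($template_bytes), $char_count, length($out_bytes);
```

Working on decoded characters is what lets operations such as
C<length> and C<uc> treat an accented letter as one unit rather than
as two separate bytes.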

=cut