Skip to content

Latest commit

 

History

History
94 lines (55 loc) · 6.24 KB

10.1.字符串.md

File metadata and controls

94 lines (55 loc) · 6.24 KB

10.1.字符串

类型: String 一个字符串是一个字符序列。

字符编码

使用 .code 属性到一个常量单字节,可以编译它的ASCII字节码:

"#".code // will compile as 35

查看 String API 了解更多关于它的方法的介绍。

10.1.1.字符串字面值

A string literal is a sequence of characters inside a pair of double quotes or single quotes:

var a = "foo";
var b = 'foo';
trace(a == b); // true

The only difference between the two forms is that single-quoted literals allow string interpolation.

Sequence Meaning Unicode codepoint (decimal) Unicode codepoint (hexadecimal)
\t horizontal tab (TAB) 9 0x09
\n new line (LF) 10 0x0A
\r new line (CR) 13 0x0D
\" double quote 34 0x22
\' single quote 39 0x27
\\\\\\\\ backslash 92 0x5C
\NNN ASCII escape where NNN is 3 octal digits 0 - 127 0x00 - 0x7F
\xNN ASCII escape where NN is a pair of hexadecimal digits 0 - 127 0x00 - 0x7F
\uNNNN Unicode escape where NNNN is 4 hexadecimal digits 0 - 65535 0x0000 - 0xFFFF
\u{N...} Unicode escape where N... is 1-6 hexadecimal digits 0 - 1114111 0x000000 - 0x10FFFF

10.1.2.Unicode

All Haxe targets except Neko support Unicode in strings by default. The compile-time define target.unicode is set on targets where Unicode is supported.

A string in Haxe code represents a valid sequence of Unicode codepoints. Due to differing internal representations of strings across targets, only the basic multilingual plane (BMP) is supported consistently: every BMP Unicode codepoint corresponds to exactly one string character.

It is still possible to work with strings including non-BMP characters on all targets without having to manually decode surrogate pairs by using the Unicode iterators API provided in the standard library.

10.1.3.Encoding

On some targets, the internal representation is UTF-16, which means that non-BMP Unicode codepoints are represented using surrogate pairs. The compile-time define target.utf16 is set when the target uses UTF-16 internally.

Some Haxe targets disallow null-bytes (Unicode codepoint 0) in strings. Additionally, some Haxe core APIs assume a null-byte terminates strings. To consistently deal with binary data, including null-bytes, use the haxe.io.Bytes API.

Target target.unicode target.utf16 Internal encoding Null-byte allowed
Flash yes yes UTF-16 no
JavaScript yes yes UTF-16 yes (except in some old browsers)
ActionScript 3 yes yes UTF-16 no
C++ yes yes ASCII or UTF-16 (if needed) yes
Java yes yes UTF-16 yes
JVM yes yes UTF-16 yes
C# yes yes UTF-16 yes
Python yes no Latin-1, UCS-2, or UCS-4 (see PEP 393) yes
Lua yes no UTF-8 yes
PHP yes no binary yes
Eval yes no UTF-8 yes
Neko no no binary yes
HashLink yes yes UTF-16 no