The Haskell 98 Report defines values of the Char type
as the code points of Unicode (or equivalently ISO/IEC 10646).
However, files and other I/O streams typically consist of bytes,
with characters in text files encoded as one or more bytes.
In many systems, a similar encoding is also required for interactions
with the system.
Therefore, at these points Hugs converts characters to and from sequences
of bytes in a manner determined by the LC_CTYPE
category of the current locale.
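As an illustration of what "encoded as one or more bytes" means, the following is a minimal sketch of one possible encoding, UTF-8. It is not the code Hugs uses: the actual encoding is whatever the locale selects, and UTF-8 is chosen here only as a common example. The names encodeCharUtf8 and encodeUtf8 are hypothetical, and surrogate code points are not given special treatment.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode a single Char (a Unicode code point) as a UTF-8 byte sequence.
-- Assumption: UTF-8 is only an example; a locale may select another encoding.
encodeCharUtf8 :: Char -> [Word8]
encodeCharUtf8 c
  | n < 0x80    = [fromIntegral n]                            -- 1 byte (ASCII)
  | n < 0x800   = [lead 0xC0 6, cont 0]                       -- 2 bytes
  | n < 0x10000 = [lead 0xE0 12, cont 6, cont 0]              -- 3 bytes
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]     -- 4 bytes
  where
    n = ord c
    -- Leading byte: length tag combined with the top bits of the code point.
    lead tag s = fromIntegral (tag .|. (n `shiftR` s))
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))

-- Encode a whole String by concatenating the per-character encodings.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeCharUtf8
```

Under this encoding the character 'A' occupies one byte, '\xE9' (e with acute accent) two bytes, and '\x20AC' (the euro sign) three bytes, so the length of a String and the number of bytes written for it can differ.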
This conversion is not applied to the contents of files opened in binary mode. It is applied to program text, so you can use all the characters representable in your locale within comments and string literals. However, only ISO Latin-1 characters are permitted in identifiers.
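The difference between text mode and binary mode can be observed with a small sketch such as the following. It assumes the standard System.IO functions openFile and openBinaryFile; the helper name lengthVia is hypothetical. The counts it returns depend on the file contents and the current locale, so no particular output is claimed for non-ASCII files.

```haskell
import System.IO

-- Count the characters delivered when a file is read through the given
-- opening function (openFile for text mode, openBinaryFile for binary mode).
lengthVia :: (FilePath -> IOMode -> IO Handle) -> FilePath -> IO Int
lengthVia open path = do
  h <- open path ReadMode
  s <- hGetContents h
  let n = length s
  -- Force the whole (lazily read) contents before closing the handle.
  n `seq` hClose h
  return n
```

For a file containing non-ASCII text, lengthVia openFile would be expected to report the number of decoded characters, while lengthVia openBinaryFile reports the number of bytes, each delivered as a Char in the range 0 to 255.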
The form of the locale string, and how it is set, vary between systems.
On POSIX systems, this value is taken from the first nonempty environment
variable among LC_ALL, LC_CTYPE and LANG.
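The POSIX lookup order can be sketched as a pure function over an environment given as a list of name and value pairs. This is an illustration of the rule stated above, not the code Hugs itself runs; the name ctypeLocale is hypothetical.

```haskell
import Data.Maybe (listToMaybe)

-- Return the value of the first variable among LC_ALL, LC_CTYPE and LANG
-- that is both set and nonempty, or Nothing if none qualifies.
ctypeLocale :: [(String, String)] -> Maybe String
ctypeLocale env =
  listToMaybe
    [ v
    | name <- ["LC_ALL", "LC_CTYPE", "LANG"]  -- checked in this order
    , Just v <- [lookup name env]             -- skip variables that are unset
    , not (null v)                            -- skip variables set to ""
    ]
```

To apply it to the real environment, the list returned by getEnvironment from System.Environment can be passed in, for example fmap ctypeLocale getEnvironment. Note that an empty LC_ALL does not hide a nonempty LC_CTYPE or LANG.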
On Windows, this value is the “user-default ANSI code page” (not the “current OEM code page” or the “ANSI code page”). This may be set using the General tab of the “Regional Options” control panel.