phobos/changelog/std-file-readText.dd
Jonathan M Davis 5d52a81e4d Fix issue 15949: Make readText check BOMs.
This makes it so that readText checks for a BOM. If there is a BOM, it
is for UTF-8, UTF-16, or UTF-32, and it doesn't match the requested
string type, then a UTFException is thrown. Other encodings are let
through in case they happen to work with the requested string type and
pass UTF validation.

Also, this makes it so that readText checks the alignment of the buffer
against the requested string type and throws a UTFException instead of
letting the cast throw an Error.
2018-02-10 19:53:29 -07:00


readText now checks BOMs
$(REF readText, std, file) now checks for a
$(HTTP https://en.wikipedia.org/wiki/Byte_order_mark, BOM). If a BOM is present
and it is for UTF-8, UTF-16, or UTF-32, $(REF readText, std, file) verifies
that it matches the requested string type and the endianness of the machine,
and if there is a mismatch, a $(REF UTFException, std, utf) is thrown without
validating the string.
If there is no BOM, or if the BOM is not for UTF-8, UTF-16, or UTF-32, then the
behavior is unchanged: UTF validation proceeds as before, so if the text isn't
valid for the requested string type, a $(REF UTFException, std, utf) is thrown.
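The BOM check can be illustrated with a minimal sketch. The file name
`bom_example.txt` is a hypothetical scratch file chosen only for this example,
and the sketch assumes the current directory is writable.

```d
import std.algorithm.searching : endsWith;
import std.exception : assertThrown;
import std.file : readText, remove, write;
import std.string : representation;
import std.utf : UTFException;

void main()
{
    // Hypothetical scratch file, removed again when main exits.
    enum path = "bom_example.txt";
    scope (exit) remove(path);

    // UTF-8 BOM (EF BB BF) followed by plain ASCII text: 8 bytes in total.
    immutable ubyte[] utf8Bom = [0xEF, 0xBB, 0xBF];
    write(path, utf8Bom ~ "hello".representation);

    // The BOM matches the requested type, so the read succeeds.
    auto text = readText!string(path);
    assert(text.endsWith("hello"));

    // The BOM says UTF-8 but UTF-16 or UTF-32 is requested, so a
    // UTFException is thrown.
    assertThrown!UTFException(readText!wstring(path));
    assertThrown!UTFException(readText!dstring(path));
}
```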
In addition, before the buffer is cast to the requested string type, its
alignment is checked (e.g. 5 bytes don't fit cleanly in an array of $(D wchar)
or $(D dchar)), and a $(REF UTFException, std, utf) is now thrown if the
number of bytes is not a multiple of the requested string type's code unit
size. Previously, the alignment was not checked before casting, so if there was
an alignment mismatch, the cast would throw an Error, killing the program.
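The alignment check can be illustrated the same way, using the 5-byte case
mentioned above. As before, the file name `alignment_example.txt` is a
hypothetical scratch file, not something the library defines.

```d
import std.exception : assertThrown;
import std.file : readText, remove, write;
import std.utf : UTFException;

void main()
{
    // Hypothetical scratch file, removed again when main exits.
    enum path = "alignment_example.txt";
    scope (exit) remove(path);

    // Five ASCII bytes ("abcde") with no BOM.
    immutable ubyte[] fiveBytes = [0x61, 0x62, 0x63, 0x64, 0x65];
    write(path, fiveBytes);

    // Any byte count is a whole number of char code units.
    assert(readText!string(path) == "abcde");

    // 5 is not a multiple of wchar.sizeof (2) or dchar.sizeof (4), so a
    // catchable UTFException is thrown instead of an Error.
    assertThrown!UTFException(readText!wstring(path));
    assertThrown!UTFException(readText!dstring(path));
}
```

Because a $(REF UTFException, std, utf) is an ordinary exception, callers can
catch it and report a malformed file rather than having the program terminate.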