79 commits

Author SHA1 Message Date
jmdavis
69a221b9ee Changed toUTFz to use an outer template.
It's easier to define aliases this way. I created two templates though
so that you can still pass both template arguments if you want to, and
it'll avoid breaking any code.
2012-03-14 00:28:48 -07:00
jmdavis
a58e19bc27 Change toUTF16z to use toUTFz. 2012-03-13 21:54:16 -07:00
dawg
e782771008 code review changes 2012-03-08 22:51:05 -08:00
dawg
8599d6b4ac optimize std.utf.stride
- templatize functions so that short paths can be inlined
- replace table lookup with bit scan
2012-03-08 22:51:05 -08:00
Daniel Murphy
983349625f This only works because of a bug in dmd, and it was probably supposed to be a char literal. 2012-02-21 15:51:35 +11:00
KennyTM~
bb05d579a2 Add 'pure' attribute to toUTF8, toUTF16 & toUTF32. 2012-02-06 04:56:26 +08:00
dawg
1b0edb728c optimize std.utf.decode
 - Use fast path tests for non-complex unicode sequences that can be
   inlined. These rely on the built-in array bounds check.

 - Factor out complex cases into separate functions that do exception
   based validity checks.  The char[] and wchar[] versions use
   pointers to avoid redundant array bounds checks, thus they can only
   be trusted.

 - Complete rewrite of decode for char[] to use less branching and
   unrolled loops. This requires fewer registers AND fewer instructions.
   The overlong check is done much cheaper on the code point.

 - The decode functions were made templates to short circuit the very
   restricted function inlining possibilities.
2011-11-07 07:16:37 +01:00
jmdavis
256976dddd Removed "scheduled for deprecation" pragmas.
The pragmas have not been as effective as we might have liked, since
they only work with templates and can't tell you where in your code you
need to make changes, and they seem to have been more annoying to
programmers than helpful, so we're going to discontinue them. We'll
leave them in for stuff that's actually been deprecated until deprecated
has been improved enough to take a message, but we'll leave "scheduled
for deprecation" messages to the documentation and changelog.
2011-10-23 23:11:17 -07:00
jmdavis
f0af275440 Unscheduled toUTF16z for deprecation.
It's pretty clear from discussions on the newsgroup that we want to keep
toUTF16z, so it shouldn't be scheduled for deprecation anymore. That
change is part of https://github.com/D-Programming-Language/phobos/pull/279 ,
but for whatever reason that pull request hasn't been reviewed yet, let
alone merged in, and this change shouldn't need review, _and_ it should
be in before the next release, so I'm just making it and checking it in.
It's simply removing the "scheduled for deprecation" note in the
documentation and its associated pragma.
2011-10-22 15:24:39 -07:00
jmdavis
5a3739f92d Made std.utf use enforce and enforceEx where applicable. 2011-09-26 22:40:22 -07:00
jmdavis
ea2da771d0 Applied some of the suggestions from the review comments. 2011-09-26 22:18:18 -07:00
jmdavis
b9b8337ccd Renamed UtfException to UTFException to match other uses of UTF.
I created an alias for UtfException and scheduled it for deprecation
(with a fairly short time to deprecation given that it's a simple
renaming).
2011-09-26 22:18:18 -07:00
jmdavis
37ed7a8c25 Fixed UtfException's sequence so that it's uint[4] again. 2011-09-26 22:18:18 -07:00
jmdavis
2ecabfce51 Cleaned up std.utf.
The main purpose of these changes was to make as much as possible in
std.utf pure (other than toUTFx, which I'll be replacing with toUTF in a
future pull request), but I also ended up doing a fair bit of
documentation cleanup. Almost everything in std.utf is pure now though,
which should help considerably in making it possible to make
string functions pure.
2011-09-26 22:18:18 -07:00
jmdavis
41b9078cbf Fix Windows unit tests. 2011-08-15 20:42:10 -07:00
jmdavis
7eee94d2a8 Fix Windows build. 2011-08-15 03:24:29 -07:00
jmdavis
d41b8d939e Scheduled toUTF16z for deprecation. 2011-08-14 19:24:33 -07:00
jmdavis
25cf1cb1fa Attempt at improving warning on toUTFz.
I also put @system on the two overloads of toUTFz which do pointer
arithmetic. They're obviously @system anyway, but tagging them with it
makes it clearer.
2011-07-12 22:34:58 -07:00
jmdavis
5e79789e01 Merge branch 'master' into utfz 2011-07-10 03:38:54 -07:00
jmdavis
1192616696 Changed toUTFz to use ElementEncodingType instead of typeof(str[0]). 2011-07-10 03:37:55 -07:00
jmdavis
6c6a493def Updates to toUTFz per Andrei's suggestions. 2011-07-03 22:57:19 -07:00
jmdavis
f7d8ca569a Improvement to warning in documentation. 2011-06-29 20:32:16 -07:00
jmdavis
3309c9751a Improved toUTFz so that it does less copying.
toUTFz no longer guarantees that the string will remain zero-terminated.
If the string can be zero-terminated but isn't immutable and doesn't
need to be copied to have the requested character pointer type, then it
no longer copies. This means that it's possible to have a string which
is zero-terminated and then stops being zero-terminated if you alter the
character one past its end, but that's not likely to be an issue in
most cases, and a note in the documentation points it out so that
programmers can know about it and deal with it appropriately.
2011-06-29 20:10:13 -07:00
jmdavis
595bcb86ba Implemented toUTFz.
I haven't made std.conv.to use it yet, and I haven't touched toUTF16z or
toStringz at all, but here's an implementation for toUTFz. After this is
in, we can make std.conv.to use it when converting to character
pointers, and we should probably make it so that we have toStringz,
toWstringz, and toDstringz which use it and return immutable character
pointers and get rid of toUTF16z.
2011-06-26 02:37:50 -07:00
jmdavis
7de549c1fa Merged master into branch with changes to std.string.
Conflicts:
	changelog.dd
	std/array.d
2011-06-22 21:38:17 -07:00
jmdavis
308df18f16 Added an example to std.utf.codeLength. 2011-06-21 19:36:01 -07:00
jmdavis
1684abf962 Revert "Fixed codeLength fatal misprint."
This reverts commit 1fc3a08fcb.
2011-06-21 19:22:33 -07:00
Denis
1fc3a08fcb Fixed codeLength fatal misprint. 2011-06-21 10:35:45 -07:00
jmdavis
0e1afe82cb Improved std.string.indexOf and std.string.lastIndexOf.
indexOf and lastIndexOf should now work properly with unicode for all
string types (unlike before). As part of that, I also ended up fixing a
bug in std.array.back for strings (wstrings in particular were broken).
I also improved various related unit tests.
2011-06-12 16:59:50 -07:00
blackwhale
903738f786 another attempt at stride/strideBack 2011-05-27 18:50:14 +04:00
blackwhale
a4cf6d203b strideBack. make all strides work with mutable strings. style. 2011-05-27 17:21:18 +04:00
Walter Bright
046e1b36db add source links 2011-02-06 15:46:50 -08:00
Don Clugston
84477a5d3e Move Boost copyright declaration from ddoc to normal comment. Fixes ugly ddoc output. 2010-11-24 19:34:47 +00:00
Masahiro Nakagawa
3dbdfbfdf6 Workaround for dmd 2.050. 2010-11-24 13:51:26 +00:00
Masahiro Nakagawa
9ac23bf4d8 issue 5247: std.utf.stride() should not return 0xFF 2010-11-23 11:42:57 +00:00
Masahiro Nakagawa
44f8ab9581 Clean up std.utf. Remove UtfError and toUTF* shortcut functions. Add attributes. count function supports dchar 2010-11-23 10:28:07 +00:00
Andrei Alexandrescu
460c844b4f Fix for bugzilla 2718 2010-09-26 21:19:14 +00:00
Andrei Alexandrescu
1e4fd1db4e Bugzilla 755 2010-09-14 03:31:16 +00:00
Andrei Alexandrescu
432e3fdfc8 Replaced std.contracts with std.exception throughout 2010-07-04 22:09:03 +00:00
Shin Fujishiro
b5a054159c Fixed bugzilla 978: std.utf's toUTF* functions accept some invalid and reject some valid UTF.
* Fixed decode() to accept U+FFFE and U+FFFF.
* Changed some assert contracts (which check input for validity) to if-throw.
* Fixed static array argument to be passed by ref (regression).
2010-06-23 03:03:56 +00:00
Lars T. Kyllingstad
5e624d18c2 2275 - std.utf.toUTF16z() should return const(wchar)* 2010-06-14 13:40:18 +00:00
Andrei Alexandrescu
c3e79d1616 Eliminated decodeFront() and decodeBack() - they aren't needed since strings are bidirectional ranges. 2010-06-08 17:37:58 +00:00
Walter Bright
53a3eec534 invariant => immutable 2010-05-05 22:19:49 +00:00
Andrei Alexandrescu
2a9a6e336c string, wstring are now bidirectional (not random) ranges
std.algorithm: defined move with one argument; levenshtein distance generalized to work with all forward ranges; take now has swapped arguments
std.array: empty for arrays is now a @property; front and back for string and wstring automatically decode the first/last character; popFront, popBack for string and wstring obey the UTF stride
std.conv: changed the default array formatting from "[a, b, c]" to "a b c"
std.range: swapped order of arguments in take
std.stdio: added readln template
std.variant: now works with statically-sized arrays and const data
std.traits: added isNarrowString
2010-02-22 15:52:31 +00:00
Walter Bright
d340dab9f3 inout to ref 2009-12-19 07:46:41 +00:00
Don Clugston
0ecae3a354 Change [length] to [$] throughout Phobos. 2009-11-03 07:55:49 +00:00
Andrei Alexandrescu
904f56e34a Added count function and changed the encode function to take fixed-size array by reference. 2009-10-27 03:28:06 +00:00
Walter Bright
478c52201b update for value static arrays 2009-10-24 02:38:26 +00:00
Andrei Alexandrescu
93cca46ecc fixed decodeFront and decodeBack 2009-10-03 21:47:54 +00:00
Walter Bright
dbf4772242 wrong return type in std.utf 2009-09-30 02:25:14 +00:00