350 commits

Author SHA1 Message Date
Ferdinand Majerech 7360e85a3a More FastCharSearch based on profiling results. 2014-08-04 02:24:26 +02:00
Ferdinand Majerech 5a1e6e994d Fixed a nasty rare bug caused by an assumption that 32 chars take 32 bytes. 2014-08-04 02:23:08 +02:00
Ferdinand Majerech 20048ea995 Using peekByte() in heavily used Scanner methods. 2014-08-04 02:22:09 +02:00
Ferdinand Majerech 8e63f62d7e An optimized version of peek() that reads a byte, without decoding. 2014-08-04 02:20:13 +02:00
Ferdinand Majerech a4befdd866 An optimized version of Reader.peek() with index == 0. 2014-08-04 02:19:34 +02:00
Ferdinand Majerech 063d9754d7 Queue now uses a freelist to minimize allocations. 2014-08-04 02:16:34 +02:00
Ferdinand Majerech 97e717df1b Loader creates Constructor/Resolver lazily to avoid garbage when user-provided 2014-08-04 02:14:01 +02:00
Ferdinand Majerech 6aa50b8898 A benchmark Loader method that scans a file but throws away the tokens. 2014-08-02 23:26:46 +02:00
Ferdinand Majerech c160156346 Fixed the string->char[] Token value move. 2014-08-02 23:25:56 +02:00
Ferdinand Majerech aeee0758a7 Refactored FastCharSearch with more modern string mixin code. 2014-08-02 02:35:03 +02:00
Ferdinand Majerech d32addacda Slices now nonconst in all layers up to Parser, where they get cast to string. 2014-08-02 01:58:20 +02:00
Ferdinand Majerech 7b699c5903 UTF-8 validation now uses UTF-8 decoding code. 2014-08-02 01:37:16 +02:00
Ferdinand Majerech b5da695d6b More @nogc in Scanner. 2014-08-02 01:19:29 +02:00
Ferdinand Majerech e6fdade4a6 Scanner now uses @nogc UTF decoding. 2014-08-02 01:16:29 +02:00
Ferdinand Majerech e1209711af UTF-8 decoding now has versions for validated and unvalidated strings. 2014-08-02 01:15:57 +02:00
Ferdinand Majerech 5932155435 Style. 2014-08-02 01:15:22 +02:00
Ferdinand Majerech fad280060e Better Constructor docs. 2014-08-01 23:01:34 +02:00
Ferdinand Majerech f137db438e Better Constructor funct attribs. 2014-08-01 23:01:24 +02:00
Ferdinand Majerech 66679a601c Moved tinyendian.d out of the dyaml directory. 2014-08-01 02:56:37 +02:00
Ferdinand Majerech a9fb68f340 Removed internals from DDoc. 2014-08-01 02:52:14 +02:00
Ferdinand Majerech 0f017646fc Reverted doc style due to DDoc issues. 2014-08-01 02:51:35 +02:00
Ferdinand Majerech fdf4cecddb Backported a recent std.utf script. 2014-08-01 02:47:52 +02:00
Ferdinand Majerech 830aef8df5 Simpler grepping for 'std.stream' 2014-07-31 14:53:14 +02:00
Ferdinand Majerech 151871e1b3 Better Loader docs. 2014-07-31 14:52:40 +02:00
Ferdinand Majerech 276bed7fb6 Constructor unittests now use the new Loader ctor. 2014-07-31 02:55:38 +02:00
Ferdinand Majerech 68d9124b17 Removed Reader ctor from Stream. 2014-07-31 02:33:24 +02:00
Ferdinand Majerech a74bc8cf3b Reader unittests now construct Reader from a buffer. 2014-07-31 02:28:42 +02:00
Ferdinand Majerech e100047572 Updated Loader examples. 2014-07-31 02:22:42 +02:00
Ferdinand Majerech 626337f6ed More readable error throws in Loader. 2014-07-31 02:16:53 +02:00
Ferdinand Majerech ddd89c22d5 Updated Loader unittest. 2014-07-31 02:09:44 +02:00
Ferdinand Majerech e919f4be51 Added a char[] fromString. 2014-07-31 02:08:08 +02:00
Ferdinand Majerech 24a8e945bd Removed docs of deprecated API. 2014-07-31 02:07:32 +02:00
Ferdinand Majerech d18a41999e Loader ctor from buffer, deprecated old API, constructing Reader from buffer. 2014-07-31 02:01:08 +02:00
Ferdinand Majerech a23c41385a Very minor Parser/Event @nogc/style. 2014-07-31 01:56:36 +02:00
Ferdinand Majerech 03d1183550 Added a Reader ctor from buffer and deprecated the ctor from Stream. 2014-07-31 01:56:06 +02:00
Ferdinand Majerech 4d6634f49b Doc fixes. 2014-07-30 23:30:37 +02:00
Ferdinand Majerech 5da8561df4 Renamed buffer8_ to buffer_, bufferOffset8_ to bufferOffset_. 2014-07-30 23:26:44 +02:00
Ferdinand Majerech 4e2c3e6093 Only validate UTF-8 if we get UTF-8 input (UTF-16/32 validated at conversion) 2014-07-30 22:32:58 +02:00
Ferdinand Majerech 6c15bd95cc Moved unused, but potentially useful Reader code to dyaml.unused. 2014-07-30 18:38:27 +02:00
Ferdinand Majerech e58b092fe1 Removed an unsafe cast. 2014-07-30 18:23:40 +02:00
Ferdinand Majerech 492e36d28a In-place UTF32->UTF-8 conversion in Reader. 2014-07-30 18:21:09 +02:00
Ferdinand Majerech cc1aaf4ac8 dchar->char encoding now also has a version for non-validated dchars. 2014-07-30 18:20:22 +02:00
Ferdinand Majerech f4c57b368b Style/spaces. 2014-07-30 04:46:53 +02:00
Ferdinand Majerech cf3bff517c UTF-8 is now the default input encoding. UTF-16/32 is encoded into UTF-8. 2014-07-30 04:46:28 +02:00
Ferdinand Majerech c1ffa05735 Removed redundant spaces. 2014-07-30 00:37:15 +02:00
Ferdinand Majerech c473ef7dee Removed -8 suffixes from Reader methods. 2014-07-30 00:13:48 +02:00
Ferdinand Majerech eb266b4e27 Removed the -8 suffixes from Scanner methods. 2014-07-29 23:42:50 +02:00
Ferdinand Majerech e5561285c3 Removed UTF-32 buffer offset. 2014-07-29 23:25:22 +02:00
Ferdinand Majerech 33b2a7ef68 Removed the UTF-32 buffer from Reader. 2014-07-29 23:23:45 +02:00
Ferdinand Majerech 736de8beb9 Reader now uses validation to get the number of characters in the UTF8 buffer. 2014-07-29 23:22:16 +02:00
Ferdinand Majerech 74c161c576 validateUTF8NoGC now calculates the number of characters in passed string. 2014-07-29 23:21:07 +02:00
Ferdinand Majerech ffef7bf6fc Removed UTF-32 parts of Reader API. 2014-07-29 23:15:08 +02:00
Ferdinand Majerech d1aaec6a60 Removed the UTF-32 SliceBuilder. 2014-07-29 23:10:46 +02:00
Ferdinand Majerech 207cb249e0 Scanner style. 2014-07-29 23:08:37 +02:00
Ferdinand Majerech 8806cfc1b4 More @nogc in Scanner. 2014-07-29 23:08:03 +02:00
Ferdinand Majerech 18be6b2e5b Removed UTF-32 scanLineBreak. 2014-07-29 23:01:05 +02:00
Ferdinand Majerech 6837156258 Block scalar scanning now works with UTF-8. 2014-07-29 20:58:00 +02:00
Ferdinand Majerech 19ed03cb3e Low hanging fruit for using UTF-8 reader methods 2014-07-29 20:55:24 +02:00
Ferdinand Majerech ecc168dc75 insert() for SliceBuilder8. 2014-07-29 20:52:39 +02:00
Ferdinand Majerech 58e19d75ad Assert message fix. 2014-07-29 20:52:24 +02:00
Ferdinand Majerech 302995354c Fixed a SliceBuilder8.Transaction compilation bug. 2014-07-29 14:43:53 +02:00
Ferdinand Majerech 510357f4c7 insert() instead of insertBack() for SliceBuilder. 2014-07-29 14:41:46 +02:00
Ferdinand Majerech 239152f793 UTF-8 scanPlain and callees. 2014-07-29 04:28:07 +02:00
Ferdinand Majerech d80917419f Removed obsolete UTF-32 methods. 2014-07-29 04:20:14 +02:00
Ferdinand Majerech 4a09338a7a Directive scanning is now fully UTF-8. 2014-07-29 04:19:44 +02:00
Ferdinand Majerech 38143a2c64 Fixed NoGC appender unittest. 2014-07-29 04:18:34 +02:00
Ferdinand Majerech 31acd6aead Removed obsolete comment. 2014-07-29 04:10:42 +02:00
Ferdinand Majerech e565543080 Removed UTF-32 scanAlphaNumeric. 2014-07-29 04:10:30 +02:00
Ferdinand Majerech ef735e280f UTF-8 directive name scanning. 2014-07-29 04:10:16 +02:00
Ferdinand Majerech 4307ccbe82 Fixed a Reader compilation bug. 2014-07-29 03:18:54 +02:00
Ferdinand Majerech 952726aa5e UTF-8 scanFlowScalar. **NOTE:** moved escaping to Parser; can't do it in-place 2014-07-29 03:18:37 +02:00
Ferdinand Majerech 252bf083a7 Fixed a potential Unicode bug. 2014-07-29 03:13:42 +02:00
Ferdinand Majerech b789317df8 UTF-8 scanTag 2014-07-29 03:13:21 +02:00
Ferdinand Majerech de6c1aacdb UTF-8 scanTagHandle. 2014-07-29 03:11:38 +02:00
Ferdinand Majerech 40fe7090d9 UTF-8 scanTagURI. 2014-07-29 03:11:17 +02:00
Ferdinand Majerech 2003a950cb UTF-8 scanURIEscapes. 2014-07-29 03:10:51 +02:00
Ferdinand Majerech 1cc07c263a UTF-8 scanAnchor. 2014-07-29 03:09:59 +02:00
Ferdinand Majerech 2a524bbb5e UTF-8 scanLineBreak. 2014-07-29 03:07:57 +02:00
Ferdinand Majerech 6dd53b55a0 UTF-8 scanAlphaNumeric. 2014-07-29 03:07:31 +02:00
Ferdinand Majerech a9def88eed Docfix. 2014-07-29 03:06:51 +02:00
Ferdinand Majerech 3880adf81d UTF-8 SliceBuilder. 2014-07-29 03:01:16 +02:00
Ferdinand Majerech cb64197bb1 nogcutil import. 2014-07-29 02:59:58 +02:00
Ferdinand Majerech 76cfd7704d forward() invalidates last decoded offsets. 2014-07-29 02:59:33 +02:00
Ferdinand Majerech 2e156a8ece UTF-8 prefix()/get() 2014-07-29 02:59:16 +02:00
Ferdinand Majerech 709ab00e44 A UTF-8 slice(). 2014-07-29 02:58:04 +02:00
Ferdinand Majerech 56057b43ec peek() now uses the UTF-8 buffer. 2014-07-29 02:57:19 +02:00
Ferdinand Majerech ef9053d7f3 Keeping buffer8_ and buffer_ positions in sync. 2014-07-29 02:54:39 +02:00
Ferdinand Majerech 6addaa4cbe Better comment. 2014-07-29 02:52:01 +02:00
Ferdinand Majerech 634418b599 Added UTF-8 version of the Reader buffer (for now, side by side with UTF-32) 2014-07-29 02:51:46 +02:00
Ferdinand Majerech d3846f7970 Removed now unused function. 2014-07-29 02:00:32 +02:00
Ferdinand Majerech 5d78e76f6a Error messages with non-ASCII chars will now show the char, not 'unknown'. 2014-07-29 02:00:13 +02:00
Ferdinand Majerech 7cf9dca57d Function to encode *valid* UTF-32 to UTF-8 2014-07-29 01:59:22 +02:00
Ferdinand Majerech cf15d55da0 Function to decode *valid* UTF-8 2014-07-29 01:58:59 +02:00
Ferdinand Majerech 53b39dc590 Updated copyright and description. 2014-07-29 01:58:22 +02:00
Ferdinand Majerech 6b8ff23859 A function to validate a UTF-8 string. 2014-07-29 01:58:00 +02:00
Ferdinand Majerech 61424b0ac6 A @nogc isValidDchar. 2014-07-29 01:57:07 +02:00
Ferdinand Majerech cac25207f1 parseNoGC can work with code points directly. 2014-07-29 01:55:43 +02:00
Ferdinand Majerech 6e1239fdac Removed unused/untested code from AppenderNoGCFixed. 2014-07-29 01:50:04 +02:00
Ferdinand Majerech 4a4e83112c utf8Stride is now globally visible in reader.d 2014-07-28 23:21:43 +02:00
Ferdinand Majerech 45b15890ca It should be enough to use \x instead of \u for \u0085 2014-07-28 23:19:59 +02:00