Commit graph

438 commits

Author SHA1 Message Date
Colden Cullen df0624fa1f Changed exceptions to take line as a size_t 2014-09-19 12:58:16 -04:00
`Exception` takes `line` as a `size_t`, so this is for consistency.
Ferdinand Majerech 494dcd30d9 tinyendian is now a DUB package. 2014-08-06 16:15:02 +02:00
Ferdinand Majerech 510065b111 Style. 2014-08-06 14:17:32 +02:00
Ferdinand Majerech b254e35762 Unittest build now works with 'dub test' 2014-08-06 14:17:07 +02:00
Ferdinand Majerech 0268a1ea39 Refactored func attribs in Reader. 2014-08-05 23:00:23 +02:00
Ferdinand Majerech ada8335504 Compound pure nothrow @nogc in Scanner. 2014-08-05 22:52:51 +02:00
Ferdinand Majerech cd879c05d3 Spaces. 2014-08-05 22:41:40 +02:00
Ferdinand Majerech 1916b1953a Loader doc fix. 2014-08-05 22:07:35 +02:00
Ferdinand Majerech 893b43edee Style. 2014-08-05 21:31:42 +02:00
Ferdinand Majerech fd93830243 Using the Scanner FastCharSearches wherever they apply. 2014-08-05 21:31:36 +02:00
Ferdinand Majerech 755eb4e468 Moved common FastCharSearch instantiations to Scanner body to minimize bloat. 2014-08-05 21:30:59 +02:00
Ferdinand Majerech d505728824 Moved a branch outside of a loop in scanPlain() to improve performance. 2014-08-05 20:58:05 +02:00
Ferdinand Majerech 57d936ed0f Scanner using prefixBytes() for optimization. 2014-08-05 20:57:30 +02:00
Ferdinand Majerech 3b303f6e82 An ASCII (bytes) version of prefix(). 2014-08-05 20:56:59 +02:00
Ferdinand Majerech 8f94a40730 Doc fixes. 2014-08-05 20:56:30 +02:00
Ferdinand Majerech 568e75d3de Removed decodeCount_, as it's no longer used. 2014-08-05 20:53:04 +02:00
Ferdinand Majerech 2b7ea42199 Removed the old (obsolete) Reader decoding method. 2014-08-05 20:52:43 +02:00
Ferdinand Majerech 92396b4cae An optimized version of forward() with length == 1. 2014-08-05 20:52:05 +02:00
Ferdinand Majerech 34e6f55bd9 forward() now uses upcomingASCII_ 2014-08-05 20:46:00 +02:00
Ferdinand Majerech c828c6b132 peek()/slice() now use upcomingASCII() 2014-08-05 20:44:15 +02:00
Ferdinand Majerech d9079de427 get() now calls slice() directly instead of through prefix(). 2014-08-05 20:42:51 +02:00
Ferdinand Majerech bfa2f1bd5c Using checkASCII in Reader. 2014-08-05 20:42:22 +02:00
Ferdinand Majerech e01c40ede5 Func to count consecutive ASCII chars starting at current Reader position. 2014-08-05 20:36:33 +02:00
Ferdinand Majerech 7409f3bbd9 ASCII optimizations for isPrintableValidUTF8. 2014-08-05 19:34:28 +02:00
Ferdinand Majerech 1c0702f3cd Func to count the num of ASCII chars in string before the first UTF-8 sequence 2014-08-05 19:12:54 +02:00
Ferdinand Majerech 8902ea8806 Minor optimization. 2014-08-05 18:21:42 +02:00
Ferdinand Majerech b2d0c74e56 Minor style fix. 2014-08-05 18:21:29 +02:00
Ferdinand Majerech a89f9e93f7 Removed unnecessary memory allocations in Queue. 2014-08-05 18:20:57 +02:00
Ferdinand Majerech a9333e3dd3 Fixed another 'in' parameter. 2014-08-05 18:15:07 +02:00
Ferdinand Majerech 9ea269de87 Removed obsolete 'final' from Reader methods. 2014-08-05 13:14:42 +02:00
Ferdinand Majerech 3d8de67771 Using peekByte() where possible in Scanner. 2014-08-05 13:14:15 +02:00
Ferdinand Majerech 078269be36 Queue no longer supports types with destructors. 2014-08-05 13:12:07 +02:00
Ferdinand Majerech 7539b40d3d (optimization) Mark ctor now doesn't check file column for overflow. 2014-08-05 13:10:52 +02:00
Ferdinand Majerech 57afd47bb5 Doc fixes. 2014-08-05 01:53:17 +02:00
Ferdinand Majerech fb9525bb00 Obsoleting decodeCount_. 2014-08-05 01:52:21 +02:00
Ferdinand Majerech 75ed314dd6 More FastCharSearch. 2014-08-05 01:51:32 +02:00
Ferdinand Majerech 0424ff5e77 Style. 2014-08-05 01:51:20 +02:00
Ferdinand Majerech 44885cde4e Optimized fetchToken() 2014-08-04 02:26:14 +02:00
Ferdinand Majerech 7360e85a3a More FastCharSearch based on profiling results. 2014-08-04 02:24:26 +02:00
Ferdinand Majerech 5a1e6e994d Fixed a nasty rare bug caused by an assumption that 32 chars take 32 bytes. 2014-08-04 02:23:08 +02:00
Ferdinand Majerech 20048ea995 Using peekByte() in heavily used Scanner methods. 2014-08-04 02:22:09 +02:00
Ferdinand Majerech 8e63f62d7e An optimized version of peek() that reads a byte, without decoding. 2014-08-04 02:20:13 +02:00
Ferdinand Majerech a4befdd866 An optimized version of Reader.peek() with index == 0. 2014-08-04 02:19:34 +02:00
Ferdinand Majerech 063d9754d7 Queue now uses a freelist to minimize allocations. 2014-08-04 02:16:34 +02:00
Ferdinand Majerech 97e717df1b Loader creates Constructor/Resolver lazily to avoid garbage when user-provided 2014-08-04 02:14:01 +02:00
Ferdinand Majerech 6aa50b8898 A benchmark Loader method that scans a file but throws away the tokens. 2014-08-02 23:26:46 +02:00
Ferdinand Majerech c160156346 Fixed the string->char[] Token value move. 2014-08-02 23:25:56 +02:00
Ferdinand Majerech aeee0758a7 Refactored FastCharSearch with more modern string mixin code. 2014-08-02 02:35:03 +02:00
Ferdinand Majerech d32addacda Slices now nonconst in all layers up to Parser, where they get cast to string. 2014-08-02 01:58:20 +02:00
Ferdinand Majerech 7b699c5903 UTF-8 validation now uses UTF-8 decoding code. 2014-08-02 01:37:16 +02:00
Ferdinand Majerech b5da695d6b More @nogc in Scanner. 2014-08-02 01:19:29 +02:00
Ferdinand Majerech e6fdade4a6 Scanner now uses @nogc UTF decoding. 2014-08-02 01:16:29 +02:00
Ferdinand Majerech e1209711af UTF-8 decoding now has versions for validated and unvalidated strings. 2014-08-02 01:15:57 +02:00
Ferdinand Majerech 5932155435 Style. 2014-08-02 01:15:22 +02:00
Ferdinand Majerech fad280060e Better Constructor docs. 2014-08-01 23:01:34 +02:00
Ferdinand Majerech f137db438e Better Constructor func attribs. 2014-08-01 23:01:24 +02:00
Ferdinand Majerech 66679a601c Moved tinyendian.d out of the dyaml directory. 2014-08-01 02:56:37 +02:00
Ferdinand Majerech a9fb68f340 Removed internals from DDoc. 2014-08-01 02:52:14 +02:00
Ferdinand Majerech 0f017646fc Reverted doc style due to DDoc issues. 2014-08-01 02:51:35 +02:00
Ferdinand Majerech fdf4cecddb Backported a recent std.utf script. 2014-08-01 02:47:52 +02:00
Ferdinand Majerech 830aef8df5 Simpler grepping for 'std.stream' 2014-07-31 14:53:14 +02:00
Ferdinand Majerech 151871e1b3 Better Loader docs. 2014-07-31 14:52:40 +02:00
Ferdinand Majerech 276bed7fb6 Constructor unittests now use the new Loader ctor. 2014-07-31 02:55:38 +02:00
Ferdinand Majerech 68d9124b17 Removed Reader ctor from Stream. 2014-07-31 02:33:24 +02:00
Ferdinand Majerech a74bc8cf3b Reader unittests now construct Reader from a buffer. 2014-07-31 02:28:42 +02:00
Ferdinand Majerech e100047572 Updated Loader examples. 2014-07-31 02:22:42 +02:00
Ferdinand Majerech 626337f6ed More readable error throws in Loader. 2014-07-31 02:16:53 +02:00
Ferdinand Majerech ddd89c22d5 Updated Loader unittest. 2014-07-31 02:09:44 +02:00
Ferdinand Majerech e919f4be51 Added a char[] fromString. 2014-07-31 02:08:08 +02:00
Ferdinand Majerech 24a8e945bd Removed docs of deprecated API. 2014-07-31 02:07:32 +02:00
Ferdinand Majerech d18a41999e Loader ctor from buffer, deprecated old API, constructing Reader from buffer. 2014-07-31 02:01:08 +02:00
Ferdinand Majerech a23c41385a Very minor Parser/Event @nogc/style. 2014-07-31 01:56:36 +02:00
Ferdinand Majerech 03d1183550 Added a Reader ctor from buffer and deprecated the ctor from Stream. 2014-07-31 01:56:06 +02:00
Ferdinand Majerech 4d6634f49b Doc fixes. 2014-07-30 23:30:37 +02:00
Ferdinand Majerech 5da8561df4 Renamed buffer8_ to buffer_, bufferOffset8_ to bufferOffset_. 2014-07-30 23:26:44 +02:00
Ferdinand Majerech 4e2c3e6093 Only validate UTF-8 if we get UTF-8 input (UTF-16/32 validated at conversion) 2014-07-30 22:32:58 +02:00
Ferdinand Majerech 6c15bd95cc Moved unused, but potentially useful Reader code to dyaml.unused. 2014-07-30 18:38:27 +02:00
Ferdinand Majerech e58b092fe1 Removed an unsafe cast. 2014-07-30 18:23:40 +02:00
Ferdinand Majerech 492e36d28a In-place UTF32->UTF-8 conversion in Reader. 2014-07-30 18:21:09 +02:00
Ferdinand Majerech cc1aaf4ac8 dchar->char encoding now also has a version for non-validated dchars. 2014-07-30 18:20:22 +02:00
Ferdinand Majerech f4c57b368b Style/spaces. 2014-07-30 04:46:53 +02:00
Ferdinand Majerech cf3bff517c UTF-8 is now the default input encoding. UTF-16/32 is encoded into UTF-8. 2014-07-30 04:46:28 +02:00
Ferdinand Majerech c1ffa05735 Removed redundant spaces. 2014-07-30 00:37:15 +02:00
Ferdinand Majerech c473ef7dee Removed -8 suffixes from Reader methods. 2014-07-30 00:13:48 +02:00
Ferdinand Majerech eb266b4e27 Removed the -8 suffixes from Scanner methods. 2014-07-29 23:42:50 +02:00
Ferdinand Majerech e5561285c3 Removed UTF-32 buffer offset. 2014-07-29 23:25:22 +02:00
Ferdinand Majerech 33b2a7ef68 Removed the UTF-32 buffer from Reader. 2014-07-29 23:23:45 +02:00
Ferdinand Majerech 736de8beb9 Reader now uses validation to get the number of characters in the UTF8 buffer. 2014-07-29 23:22:16 +02:00
Ferdinand Majerech 74c161c576 validateUTF8NoGC now calculates the number of characters in passed string. 2014-07-29 23:21:07 +02:00
Ferdinand Majerech ffef7bf6fc Removed UTF-32 parts of Reader API. 2014-07-29 23:15:08 +02:00
Ferdinand Majerech d1aaec6a60 Removed the UTF-32 SliceBuilder. 2014-07-29 23:10:46 +02:00
Ferdinand Majerech 207cb249e0 Scanner style. 2014-07-29 23:08:37 +02:00
Ferdinand Majerech 8806cfc1b4 More @nogc in Scanner. 2014-07-29 23:08:03 +02:00
Ferdinand Majerech 18be6b2e5b Removed UTF-32 scanLineBreak. 2014-07-29 23:01:05 +02:00
Ferdinand Majerech 6837156258 Block scalar scanning now works with UTF-8. 2014-07-29 20:58:00 +02:00
Ferdinand Majerech 19ed03cb3e Low hanging fruit for using UTF-8 reader methods 2014-07-29 20:55:24 +02:00
Ferdinand Majerech ecc168dc75 insert() for SliceBuilder8. 2014-07-29 20:52:39 +02:00
Ferdinand Majerech 58e19d75ad Assert message fix. 2014-07-29 20:52:24 +02:00
Ferdinand Majerech 302995354c Fixed a SliceBuilder8.Transaction compilation bug. 2014-07-29 14:43:53 +02:00
Ferdinand Majerech 510357f4c7 insert() instead of insertBack() for SliceBuilder. 2014-07-29 14:41:46 +02:00
Ferdinand Majerech 239152f793 UTF-8 scanPlain and callees. 2014-07-29 04:28:07 +02:00
Ferdinand Majerech d80917419f Removed obsolete UTF-32 methods. 2014-07-29 04:20:14 +02:00
Ferdinand Majerech 4a09338a7a Directive scanning is now fully UTF-8. 2014-07-29 04:19:44 +02:00
Ferdinand Majerech 38143a2c64 Fixed NoGC appender unittest. 2014-07-29 04:18:34 +02:00
Ferdinand Majerech 31acd6aead Removed obsolete comment. 2014-07-29 04:10:42 +02:00
Ferdinand Majerech e565543080 Removed UTF-32 scanAlphaNumeric. 2014-07-29 04:10:30 +02:00
Ferdinand Majerech ef735e280f UTF-8 directive name scanning. 2014-07-29 04:10:16 +02:00
Ferdinand Majerech 4307ccbe82 Fixed a Reader compilation bug. 2014-07-29 03:18:54 +02:00
Ferdinand Majerech 952726aa5e UTF-8 scanFlowScalar. **NOTE:** moved escaping to Parser; can't do it in-place 2014-07-29 03:18:37 +02:00
Ferdinand Majerech 252bf083a7 Fixed a potential Unicode bug. 2014-07-29 03:13:42 +02:00
Ferdinand Majerech b789317df8 UTF-8 scanTag 2014-07-29 03:13:21 +02:00
Ferdinand Majerech de6c1aacdb UTF-8 scanTagHandle. 2014-07-29 03:11:38 +02:00
Ferdinand Majerech 40fe7090d9 UTF-8 scanTagURI. 2014-07-29 03:11:17 +02:00
Ferdinand Majerech 2003a950cb UTF-8 scanURIEscapes. 2014-07-29 03:10:51 +02:00
Ferdinand Majerech 1cc07c263a UTF-8 scanAnchor. 2014-07-29 03:09:59 +02:00
Ferdinand Majerech 2a524bbb5e UTF-8 scanLineBreak. 2014-07-29 03:07:57 +02:00
Ferdinand Majerech 6dd53b55a0 UTF-8 scanAlphaNumeric. 2014-07-29 03:07:31 +02:00
Ferdinand Majerech a9def88eed Docfix. 2014-07-29 03:06:51 +02:00
Ferdinand Majerech 3880adf81d UTF-8 SliceBuilder. 2014-07-29 03:01:16 +02:00
Ferdinand Majerech cb64197bb1 nogcutil import. 2014-07-29 02:59:58 +02:00
Ferdinand Majerech 76cfd7704d forward() invalidates last decoded offsets. 2014-07-29 02:59:33 +02:00
Ferdinand Majerech 2e156a8ece UTF-8 prefix()/get() 2014-07-29 02:59:16 +02:00
Ferdinand Majerech 709ab00e44 A UTF-8 slice(). 2014-07-29 02:58:04 +02:00
Ferdinand Majerech 56057b43ec peek() now uses the UTF-8 buffer. 2014-07-29 02:57:19 +02:00
Ferdinand Majerech ef9053d7f3 Keeping buffer8_ and buffer_ positions in sync. 2014-07-29 02:54:39 +02:00
Ferdinand Majerech 6addaa4cbe Better comment. 2014-07-29 02:52:01 +02:00
Ferdinand Majerech 634418b599 Added UTF-8 version of the Reader buffer (for now, side by side with UTF-32) 2014-07-29 02:51:46 +02:00
Ferdinand Majerech d3846f7970 Removed now unused function. 2014-07-29 02:00:32 +02:00
Ferdinand Majerech 5d78e76f6a Error messages with non-ASCII chars will now show the char, not 'unknown'. 2014-07-29 02:00:13 +02:00
Ferdinand Majerech 7cf9dca57d Function to encode *valid* UTF-32 to UTF-8 2014-07-29 01:59:22 +02:00
Ferdinand Majerech cf15d55da0 Function to decode *valid* UTF-8 2014-07-29 01:58:59 +02:00
Ferdinand Majerech 53b39dc590 Updated copyright and description. 2014-07-29 01:58:22 +02:00
Ferdinand Majerech 6b8ff23859 A function to validate a UTF-8 string. 2014-07-29 01:58:00 +02:00
Ferdinand Majerech 61424b0ac6 A @nogc isValidDchar. 2014-07-29 01:57:07 +02:00
Ferdinand Majerech cac25207f1 parseNoGC can work with code points directly. 2014-07-29 01:55:43 +02:00
Ferdinand Majerech 6e1239fdac Removed unused/untested code from AppenderNoGCFixed. 2014-07-29 01:50:04 +02:00
Ferdinand Majerech 4a4e83112c utf8Stride is now globally visible in reader.d 2014-07-28 23:21:43 +02:00
Ferdinand Majerech 45b15890ca It should be enough to use \x instead of \u for \u0085 2014-07-28 23:19:59 +02:00
Ferdinand Majerech 645b191948 Removed todo garbage. 2014-07-26 23:38:59 +02:00
Ferdinand Majerech f07aaeef87 Reader UTF decoding is now private. 2014-07-26 23:37:56 +02:00
Ferdinand Majerech a8c32430ed Minor style. 2014-07-26 23:37:33 +02:00
Ferdinand Majerech ebe10ad8c4 Removed the Error and ErrorData aliases. 2014-07-26 23:31:13 +02:00
Ferdinand Majerech 2e7de5f9ed checkDocumentStart func attribs. 2014-07-26 23:30:13 +02:00
Ferdinand Majerech d5663b1e57 Scanner style. 2014-07-26 23:29:55 +02:00
Ferdinand Majerech f76e4cfd02 Queue copyright. 2014-07-26 23:25:08 +02:00
Ferdinand Majerech 424e6e5f98 Queue whitespaces. 2014-07-26 23:24:41 +02:00
Ferdinand Majerech 2688591c6a Better func attribs in Queue. 2014-07-26 23:23:59 +02:00
Ferdinand Majerech 9d480d1723 scanDirective is now nothrow and mostly @nogc. 2014-07-26 18:26:39 +02:00
Ferdinand Majerech 14a8e31fa5 Minor cleanup. 2014-07-26 18:20:57 +02:00
Ferdinand Majerech f11fbf3b36 scanTagDirectiveValue returns handle length with return value, not ref param 2014-07-26 18:19:26 +02:00