447 commits

Author SHA1 Message Date
Cameron Ross 989e1b3375 avoid use of a static constructor in parser
this makes usage in module constructors more reliable and improves
consistency with the emitter
2015-10-03 21:52:44 -03:00
Cameron Ross 91cdb9a6f4 fix emitting of unicode characters >0xFFFF 2015-09-25 03:18:37 -03:00
Ferdinand Majerech 2f3d782c84 Fixed compilation with DMD 2.068
Had to use a lot of @trusted as std.variant.VariantN is again less safe.
Will need to change that back once it gets safer, or at least isolate
code using it so that half of Node API doesn't need to be @trusted.
2015-08-23 09:09:24 +02:00
Ferdinand Majerech b5e028d239 Commit missing weOwnStream_ 2015-06-05 13:31:54 +02:00
Ferdinand Majerech 385cfd5420 If we create a File instance, be sure to destroy it. 2015-06-03 21:04:15 +02:00
Ferdinand Majerech 96f64eb221 Merge pull request #29 from soarqin/dmd2067_fix
Fixed compilation for dmd 2.067
2015-06-03 20:33:39 +02:00
Ferdinand Majerech 6d706dd3dc nothrow Node constructors where possible (at the moment) 2015-06-03 20:30:19 +02:00
Soar Qin bfd8654816 Fixed compilation for dmd 2.067 2015-03-17 14:08:40 +08:00
Ferdinand Majerech 05270e5f60 Doc fixes 2015-02-21 14:31:55 +01:00
Colden Cullen df0624fa1f Changed exceptions to take line as a size_t
`Exception` takes `line` as a `size_t`, so this is for consistency.
2014-09-19 12:58:16 -04:00
Ferdinand Majerech 494dcd30d9 tinyendian is now a DUB package. 2014-08-06 16:15:02 +02:00
Ferdinand Majerech 510065b111 Style. 2014-08-06 14:17:32 +02:00
Ferdinand Majerech b254e35762 Unittest build now works with 'dub test' 2014-08-06 14:17:07 +02:00
Ferdinand Majerech 0268a1ea39 Refactored func attribs in Reader. 2014-08-05 23:00:23 +02:00
Ferdinand Majerech ada8335504 Compound pure nothrow @nogc in Scanner. 2014-08-05 22:52:51 +02:00
Ferdinand Majerech cd879c05d3 Spaces. 2014-08-05 22:41:40 +02:00
Ferdinand Majerech 1916b1953a Loader doc fix. 2014-08-05 22:07:35 +02:00
Ferdinand Majerech 893b43edee Style. 2014-08-05 21:31:42 +02:00
Ferdinand Majerech fd93830243 Using the Scanner FastCharSearches wherever they apply. 2014-08-05 21:31:36 +02:00
Ferdinand Majerech 755eb4e468 Moved common FastCharSearch instantiations to Scanner body to minimize bloat. 2014-08-05 21:30:59 +02:00
Ferdinand Majerech d505728824 Moved a branch outside of a loop in scanPlain() to improve performance. 2014-08-05 20:58:05 +02:00
Ferdinand Majerech 57d936ed0f Scanner using prefixBytes() for optimization. 2014-08-05 20:57:30 +02:00
Ferdinand Majerech 3b303f6e82 An ASCII (bytes) version of prefix(). 2014-08-05 20:56:59 +02:00
Ferdinand Majerech 8f94a40730 Doc fixes. 2014-08-05 20:56:30 +02:00
Ferdinand Majerech 568e75d3de Removed decodeCount_, as it's no longer used. 2014-08-05 20:53:04 +02:00
Ferdinand Majerech 2b7ea42199 Removed the old (obsolete) Reader decoding method. 2014-08-05 20:52:43 +02:00
Ferdinand Majerech 92396b4cae An optimized version of forward() with length == 1. 2014-08-05 20:52:05 +02:00
Ferdinand Majerech 34e6f55bd9 forward() now uses upcomingASCII_ 2014-08-05 20:46:00 +02:00
Ferdinand Majerech c828c6b132 peek()/slice() now use upcomingASCII() 2014-08-05 20:44:15 +02:00
Ferdinand Majerech d9079de427 get() now calls slice() directly instead of through prefix(). 2014-08-05 20:42:51 +02:00
Ferdinand Majerech bfa2f1bd5c Using checkASCII in Reader. 2014-08-05 20:42:22 +02:00
Ferdinand Majerech e01c40ede5 Func to count consecutive ASCII chars starting at current Reader position. 2014-08-05 20:36:33 +02:00
Ferdinand Majerech 7409f3bbd9 ASCII optimizations for isPrintableValidUTF8. 2014-08-05 19:34:28 +02:00
Ferdinand Majerech 1c0702f3cd Func to count the num of ASCII chars in string before the first UTF-8 sequence 2014-08-05 19:12:54 +02:00
Ferdinand Majerech 8902ea8806 Minor optimization. 2014-08-05 18:21:42 +02:00
Ferdinand Majerech b2d0c74e56 Minor style fix. 2014-08-05 18:21:29 +02:00
Ferdinand Majerech a89f9e93f7 Removed unnecessary memory allocations in Queue. 2014-08-05 18:20:57 +02:00
Ferdinand Majerech a9333e3dd3 Fixed another 'in' parameter. 2014-08-05 18:15:07 +02:00
Ferdinand Majerech 9ea269de87 Removed obsolete 'final' from Reader methods. 2014-08-05 13:14:42 +02:00
Ferdinand Majerech 3d8de67771 Using peekByte() where possible in Scanner. 2014-08-05 13:14:15 +02:00
Ferdinand Majerech 078269be36 Queue no longer supports types with destructors. 2014-08-05 13:12:07 +02:00
Ferdinand Majerech 7539b40d3d (optimization) Mark ctor now doesn't check file column for overflow. 2014-08-05 13:10:52 +02:00
Ferdinand Majerech 57afd47bb5 Doc fixes. 2014-08-05 01:53:17 +02:00
Ferdinand Majerech fb9525bb00 Obsoleting decodeCount_. 2014-08-05 01:52:21 +02:00
Ferdinand Majerech 75ed314dd6 More FastCharSearch. 2014-08-05 01:51:32 +02:00
Ferdinand Majerech 0424ff5e77 Style. 2014-08-05 01:51:20 +02:00
Ferdinand Majerech 44885cde4e Optimized fetchToken() 2014-08-04 02:26:14 +02:00
Ferdinand Majerech 7360e85a3a More FastCharSearch based on profiling results. 2014-08-04 02:24:26 +02:00
Ferdinand Majerech 5a1e6e994d Fixed a nasty rare bug caused by an assumption that 32 chars take 32 bytes. 2014-08-04 02:23:08 +02:00
Ferdinand Majerech 20048ea995 Using peekByte() in heavily used Scanner methods. 2014-08-04 02:22:09 +02:00
Ferdinand Majerech 8e63f62d7e An optimized version of peek() that reads a byte, without decoding. 2014-08-04 02:20:13 +02:00
Ferdinand Majerech a4befdd866 An optimized version of Reader.peek() with index == 0. 2014-08-04 02:19:34 +02:00
Ferdinand Majerech 063d9754d7 Queue now uses a freelist to minimize allocations. 2014-08-04 02:16:34 +02:00
Ferdinand Majerech 97e717df1b Loader creates Constructor/Resolver lazily to avoid garbage when user-provided 2014-08-04 02:14:01 +02:00
Ferdinand Majerech 6aa50b8898 A benchmark Loader method that scans a file but throws away the tokens. 2014-08-02 23:26:46 +02:00
Ferdinand Majerech c160156346 Fixed the string->char[] Token value move. 2014-08-02 23:25:56 +02:00
Ferdinand Majerech aeee0758a7 Refactored FastCharSearch with more modern string mixin code. 2014-08-02 02:35:03 +02:00
Ferdinand Majerech d32addacda Slices now nonconst in all layers up to Parser, where they get cast to string. 2014-08-02 01:58:20 +02:00
Ferdinand Majerech 7b699c5903 UTF-8 validation now uses UTF-8 decoding code. 2014-08-02 01:37:16 +02:00
Ferdinand Majerech b5da695d6b More @nogc in Scanner. 2014-08-02 01:19:29 +02:00
Ferdinand Majerech e6fdade4a6 Scanner now uses @nogc UTF decoding. 2014-08-02 01:16:29 +02:00
Ferdinand Majerech e1209711af UTF-8 decoding now has versions for validated and unvalidated strings. 2014-08-02 01:15:57 +02:00
Ferdinand Majerech 5932155435 Style. 2014-08-02 01:15:22 +02:00
Ferdinand Majerech fad280060e Better Constructor docs. 2014-08-01 23:01:34 +02:00
Ferdinand Majerech f137db438e Better Constructor funct attribs. 2014-08-01 23:01:24 +02:00
Ferdinand Majerech 66679a601c Moved tinyendian.d out of the dyaml directory. 2014-08-01 02:56:37 +02:00
Ferdinand Majerech a9fb68f340 Removed internals from DDoc. 2014-08-01 02:52:14 +02:00
Ferdinand Majerech 0f017646fc Reverted doc style due to DDoc issues. 2014-08-01 02:51:35 +02:00
Ferdinand Majerech fdf4cecddb Backported a recent std.utf script. 2014-08-01 02:47:52 +02:00
Ferdinand Majerech 830aef8df5 Simpler grepping for 'std.stream' 2014-07-31 14:53:14 +02:00
Ferdinand Majerech 151871e1b3 Better Loader docs. 2014-07-31 14:52:40 +02:00
Ferdinand Majerech 276bed7fb6 Constructor unittests now use the new Loader ctor. 2014-07-31 02:55:38 +02:00
Ferdinand Majerech 68d9124b17 Removed Reader ctor from Stream. 2014-07-31 02:33:24 +02:00
Ferdinand Majerech a74bc8cf3b Reader unittests now construct Reader from a buffer. 2014-07-31 02:28:42 +02:00
Ferdinand Majerech e100047572 Updated Loader examples. 2014-07-31 02:22:42 +02:00
Ferdinand Majerech 626337f6ed More readable error throws in Loader. 2014-07-31 02:16:53 +02:00
Ferdinand Majerech ddd89c22d5 Updated Loader unittest. 2014-07-31 02:09:44 +02:00
Ferdinand Majerech e919f4be51 Added a char[] fromString. 2014-07-31 02:08:08 +02:00
Ferdinand Majerech 24a8e945bd Removed docs of deprecated API. 2014-07-31 02:07:32 +02:00
Ferdinand Majerech d18a41999e Loader ctor from buffer, deprecated old API, constructing Reader from buffer. 2014-07-31 02:01:08 +02:00
Ferdinand Majerech a23c41385a Very minor Parser/Event @nogc/style. 2014-07-31 01:56:36 +02:00
Ferdinand Majerech 03d1183550 Added a Reader ctor from buffer and deprecated the ctor from Stream. 2014-07-31 01:56:06 +02:00
Ferdinand Majerech 4d6634f49b Doc fixes. 2014-07-30 23:30:37 +02:00
Ferdinand Majerech 5da8561df4 Renamed buffer8_ to buffer_, bufferOffset8_ to bufferOffset_. 2014-07-30 23:26:44 +02:00
Ferdinand Majerech 4e2c3e6093 Only validate UTF-8 if we get UTF-8 input (UTF-16/32 validated at conversion) 2014-07-30 22:32:58 +02:00
Ferdinand Majerech 6c15bd95cc Moved unused, but potentially useful Reader code to dyaml.unused. 2014-07-30 18:38:27 +02:00
Ferdinand Majerech e58b092fe1 Removed an unsafe cast. 2014-07-30 18:23:40 +02:00
Ferdinand Majerech 492e36d28a In-place UTF32->UTF-8 conversion in Reader. 2014-07-30 18:21:09 +02:00
Ferdinand Majerech cc1aaf4ac8 dchar->char encoding now also has a version for non-validated dchars. 2014-07-30 18:20:22 +02:00
Ferdinand Majerech f4c57b368b Style/spaces. 2014-07-30 04:46:53 +02:00
Ferdinand Majerech cf3bff517c UTF-8 is now the default input encoding. UTF-16/32 is encoded into UTF-8. 2014-07-30 04:46:28 +02:00
Ferdinand Majerech c1ffa05735 Removed redundant spaces. 2014-07-30 00:37:15 +02:00
Ferdinand Majerech c473ef7dee Removed -8 suffixes from Reader methods. 2014-07-30 00:13:48 +02:00
Ferdinand Majerech eb266b4e27 Removed the -8 suffixes from Scanner methods. 2014-07-29 23:42:50 +02:00
Ferdinand Majerech e5561285c3 Removed UTF-32 buffer offset. 2014-07-29 23:25:22 +02:00
Ferdinand Majerech 33b2a7ef68 Removed the UTF-32 buffer from Reader. 2014-07-29 23:23:45 +02:00
Ferdinand Majerech 736de8beb9 Reader now uses validation to get the number of characters in the UTF8 buffer. 2014-07-29 23:22:16 +02:00
Ferdinand Majerech 74c161c576 validateUTF8NoGC now calculates the number of characters in passed string. 2014-07-29 23:21:07 +02:00
Ferdinand Majerech ffef7bf6fc Removed UTF-32 parts of Reader API. 2014-07-29 23:15:08 +02:00
Ferdinand Majerech d1aaec6a60 Removed the UTF-32 SliceBuilder. 2014-07-29 23:10:46 +02:00
Ferdinand Majerech 207cb249e0 Scanner style. 2014-07-29 23:08:37 +02:00
Ferdinand Majerech 8806cfc1b4 More @nogc in Scanner. 2014-07-29 23:08:03 +02:00
Ferdinand Majerech 18be6b2e5b Removed UTF-32 scanLineBreak. 2014-07-29 23:01:05 +02:00
Ferdinand Majerech 6837156258 Block scalar scanning now works with UTF-8. 2014-07-29 20:58:00 +02:00
Ferdinand Majerech 19ed03cb3e Low hanging fruit for using UTF-8 reader methods 2014-07-29 20:55:24 +02:00
Ferdinand Majerech ecc168dc75 insert() for SliceBuilder8. 2014-07-29 20:52:39 +02:00
Ferdinand Majerech 58e19d75ad Assert message fix. 2014-07-29 20:52:24 +02:00
Ferdinand Majerech 302995354c Fixed a SliceBuilder8.Transaction compilation bug. 2014-07-29 14:43:53 +02:00
Ferdinand Majerech 510357f4c7 insert() instead of insertBack() for SliceBuilder. 2014-07-29 14:41:46 +02:00
Ferdinand Majerech 239152f793 UTF-8 scanPlain and callees. 2014-07-29 04:28:07 +02:00
Ferdinand Majerech d80917419f Removed obsolete UTF-32 methods. 2014-07-29 04:20:14 +02:00
Ferdinand Majerech 4a09338a7a Directive scanning is now fully UTF-8. 2014-07-29 04:19:44 +02:00
Ferdinand Majerech 38143a2c64 Fixed NoGC appender unittest. 2014-07-29 04:18:34 +02:00
Ferdinand Majerech 31acd6aead Removed obsolete comment. 2014-07-29 04:10:42 +02:00
Ferdinand Majerech e565543080 Removed UTF-32 scanAlphaNumeric. 2014-07-29 04:10:30 +02:00
Ferdinand Majerech ef735e280f UTF-8 directive name scanning. 2014-07-29 04:10:16 +02:00
Ferdinand Majerech 4307ccbe82 Fixed a Reader compilation bug. 2014-07-29 03:18:54 +02:00
Ferdinand Majerech 952726aa5e UTF-8 scanFlowScalar. **NOTE:** moved escaping to Parser; can't do it in-place 2014-07-29 03:18:37 +02:00
Ferdinand Majerech 252bf083a7 Fixed a potential Unicode bug. 2014-07-29 03:13:42 +02:00
Ferdinand Majerech b789317df8 UTF-8 scanTag 2014-07-29 03:13:21 +02:00
Ferdinand Majerech de6c1aacdb UTF-8 scanTagHandle. 2014-07-29 03:11:38 +02:00
Ferdinand Majerech 40fe7090d9 UTF-8 scanTagURI. 2014-07-29 03:11:17 +02:00
Ferdinand Majerech 2003a950cb UTF-8 scanURIEscapes. 2014-07-29 03:10:51 +02:00
Ferdinand Majerech 1cc07c263a UTF-8 scanAnchor. 2014-07-29 03:09:59 +02:00
Ferdinand Majerech 2a524bbb5e UTF-8 scanLineBreak. 2014-07-29 03:07:57 +02:00
Ferdinand Majerech 6dd53b55a0 UTF-8 scanAlphaNumeric. 2014-07-29 03:07:31 +02:00
Ferdinand Majerech a9def88eed Docfix. 2014-07-29 03:06:51 +02:00
Ferdinand Majerech 3880adf81d UTF-8 SliceBuilder. 2014-07-29 03:01:16 +02:00
Ferdinand Majerech cb64197bb1 nogcutil import. 2014-07-29 02:59:58 +02:00
Ferdinand Majerech 76cfd7704d forward() invalidates last decoded offsets. 2014-07-29 02:59:33 +02:00
Ferdinand Majerech 2e156a8ece UTF-8 prefix()/get() 2014-07-29 02:59:16 +02:00
Ferdinand Majerech 709ab00e44 A UTF-8 slice(). 2014-07-29 02:58:04 +02:00
Ferdinand Majerech 56057b43ec peek() now uses the UTF-8 buffer. 2014-07-29 02:57:19 +02:00
Ferdinand Majerech ef9053d7f3 Keeping buffer8_ and buffer_ positions in sync. 2014-07-29 02:54:39 +02:00
Ferdinand Majerech 6addaa4cbe Better comment. 2014-07-29 02:52:01 +02:00
Ferdinand Majerech 634418b599 Added UTF-8 version of the Reader buffer (for now, side by side with UTF-32) 2014-07-29 02:51:46 +02:00
Ferdinand Majerech d3846f7970 Removed now unused function. 2014-07-29 02:00:32 +02:00
Ferdinand Majerech 5d78e76f6a Error messages with non-ASCII chars will now show the char, not 'unknown'. 2014-07-29 02:00:13 +02:00
Ferdinand Majerech 7cf9dca57d Function to encode *valid* UTF-32 to UTF-8 2014-07-29 01:59:22 +02:00
Ferdinand Majerech cf15d55da0 Function to decode *valid* UTF-8 2014-07-29 01:58:59 +02:00
Ferdinand Majerech 53b39dc590 Updated copyright and description. 2014-07-29 01:58:22 +02:00
Ferdinand Majerech 6b8ff23859 A function to validate a UTF-8 string. 2014-07-29 01:58:00 +02:00
Ferdinand Majerech 61424b0ac6 A @nogc isValidDchar. 2014-07-29 01:57:07 +02:00
Ferdinand Majerech cac25207f1 parseNoGC can work with code points directly. 2014-07-29 01:55:43 +02:00
Ferdinand Majerech 6e1239fdac Removed unused/untested code from AppenderNoGCFixed. 2014-07-29 01:50:04 +02:00
Ferdinand Majerech 4a4e83112c utf8Stride is now globally visible in reader.d 2014-07-28 23:21:43 +02:00
Ferdinand Majerech 45b15890ca It should be enough to use \x instead of \u for \u0085 2014-07-28 23:19:59 +02:00
Ferdinand Majerech 645b191948 Removed todo garbage. 2014-07-26 23:38:59 +02:00
Ferdinand Majerech f07aaeef87 Reader UTF decoding is now private. 2014-07-26 23:37:56 +02:00
Ferdinand Majerech a8c32430ed Minor style. 2014-07-26 23:37:33 +02:00
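
Several commits in this log (1c0702f3cd, e01c40ede5, 34e6f55bd9) describe counting the ASCII characters that precede the first multi-byte UTF-8 sequence, and 5a1e6e994d fixes a bug caused by assuming that 32 characters take 32 bytes. The sketch below illustrates that idea only. It is written in Python rather than D, the function name `count_ascii_prefix` is hypothetical, and it is not the library's actual code.

```python
def count_ascii_prefix(buf: bytes) -> int:
    """Return how many leading bytes of buf are ASCII (below 0x80).

    In UTF-8, every ASCII character is exactly one byte, so for this
    prefix the character count equals the byte count and no decoding
    is needed. Hypothetical helper, for illustration only.
    """
    for i, b in enumerate(buf):
        if b >= 0x80:  # first byte of a multi-byte UTF-8 sequence
            return i
    return len(buf)


if __name__ == "__main__":
    text = "key: valu\u00e9"          # ends in a 2-byte character
    encoded = text.encode("utf-8")

    # Only the ASCII prefix can be advanced byte-for-character.
    print(count_ascii_prefix(encoded))

    # Character count and byte count differ once non-ASCII appears,
    # which is why "N chars take N bytes" is not a safe assumption.
    print(len(text), len(encoded))
```

Past the ASCII prefix, a reader has to decode each sequence to know how many bytes a character occupies, so a fast path like this only applies up to the returned index.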