I optimized the algorithm, got about 150% of starting speed. Moving function from JSON to outside didn't do much, but I managed to apply a few array optimizations to get the speed up.
I've done some tests with two grammars. One is left recursive and the other is right recursive version of the same self definition grammar for parsing.
Right recursive grammar version looks like this:
(
Parser :> (
OptSpaceOrComments :> (
(/(\s+)|(\/\/((.*\n)|(.*$)))|(\/\*[\S\s]*?\*\/)/, OptSpaceOrComments) |
''
) |
String :> /("([^"\\\n]|(\\[\S\s]))*")|('([^'\\\n]|(\\[\S\s]))*')|(`([^`\\]|(\\[\S\s]))*`)/ |
RegExp :> /\/(?!\*)(?!\/)([^\/\\\n]|(\\.))*\/i?/ |
SymbolName :> /[_A-z][_A-z0-9]*/ |
Root :> (
AExp :> (
Alternation :> (Item :> Root.SExp, '|', Next :> Root.AExp) |
SExp :> (
Sequence :> (Item :> Root.STExp, ',', Next :> Root.SExp) |
STExp :> (
Subtype :> (Left :> Root.Exp, ':>', Right :> Root.STExp) |
Exp :> (
Parser.OptSpaceOrComments,
(
Parser.String |
Parser.RegExp |
NewSymbol :> Parser.SymbolName |
Reference :> (
Parent :> ('' | Root.Group | Root.Reference | Parser.SymbolName),
'.',
Spec :> Parser.SymbolName
) |
Group :> ('(', Root.AExp, ')')
),
Parser.OptSpaceOrComments
)
)
)
)
)
)
).Parser.Root
and I got these rough results when parsing texts by this grammar:
Left recursive grammar version looks like this:
(
Parser :> (
OptSpaceOrComments :> (
(/(\s+)|(\/\/((.*\n)|(.*$)))|(\/\*[\S\s]*?\*\/)/, OptSpaceOrComments) |
''
) |
String :> /("([^"\\\n]|(\\[\S\s]))*")|('([^'\\\n]|(\\[\S\s]))*')|(`([^`\\]|(\\[\S\s]))*`)/ |
RegExp :> /\/(?!\*)(?!\/)([^\/\\\n]|(\\.))*\/i?/ |
SymbolName :> /[_A-z][_A-z0-9]*/ |
Root :> (
AExp :> (
Alternation :> (Item :> Root.AExp, '|', Next :> Root.SExp) |
SExp :> (
Sequence :> (Item :> Root.SExp, ',', Next :> Root.STExp) |
STExp :> (
Subtype :> (Left :> Root.STExp, ':>', Right :> Root.Exp) |
Exp :> (
Parser.OptSpaceOrComments,
(
Parser.String |
Parser.RegExp |
NewSymbol :> Parser.SymbolName |
Reference :> (
Parent :> ('' | Root.Group | Root.Reference | Parser.SymbolName),
'.',
Spec :> Parser.SymbolName
) |
Group :> ('(', Root.AExp, ')')
),
Parser.OptSpaceOrComments
)
)
)
)
)
)
).Parser.Root
and I got these results when parsing texts by this grammar:
I think I can be satisfied by speed, considering that this is a Javascript version operating on 4 years old Celeron M1000 on 1.8 GHz. As you can see, Chromium beats up Firefox for running this parser. Left recursive version performs slightly better. Both versions exhibit linear parsing speed, as they are non-ambuguous.
Ambiguously divergent grammars are parsing texts in logarithm scale, and as such, they are not very interesting for this parser. Natural languages are ambiguously periodically convergent, and as such, should be parsed in linear time.
Don't be intimidated by the grammar format, it is something I invested a lot of effort in and I hope you like it. It is self-explanatory if you are familiar with
BNF grammar language. The two above examples are actually the grammar self definitions in left and right recursive versions.
I guess that the next steps include building English (and possibly other languages) grammars (possibly automatically from
Treebank text files). I think I'm going open source on Github with this, but maybe I want to try to do something weird before publishing (I'll stay silent for now, you'd think I'm crazier than I am).