V-Parser: developing an option towards NLP and some other thingies

  • 34 Replies
  • 5179 Views
*

ivan.moony

  • Trusty Member
  • *********
  • Terminator
  • *
  • 949
    • Structured Type System
Really liking your work Ivan…
Thank you, but I should take these preliminary results with reserve. Looks cool for now, but it's not fully tested yet. :-\

Search speed is finite (GHz) storage is plentiful (TB).
For writing each byte, some time is spent. More there are bytes, more time is needed for their writing.
Dream big. The bigger the dream is, the more beautiful place the world becomes.

*

Zero

  • Trusty Member
  • ********
  • Replicant
  • *
  • 567
  • Offline
    • Github page
Ok I see the point, thanks.
I think Korrelan is talking about memoization.

Hey, thinking about it, chatbots are like big memoized things... but let's not drift away from parsers!
Thinkbots are free, as in 'free will'.

*

ivan.moony

  • Trusty Member
  • *********
  • Terminator
  • *
  • 949
    • Structured Type System
Well, this is something that works too (tested in practice), but it is not linear. The speed is about the same as in graphs already published in this thread:
Code: [Select]
PROCEDURE Parse (grammar, text)
    DECLARE chart: [];
    Populate (chart[0], grammar.TOP_RULE, 0, []);
    FOR each nonempty column in chart  // chart can expand during this loop
        FOR each new item in column  // column can expand during this loop
            IF item.Rule is not terminal
                FOR each alternation of item.rule
                    Populate (column, alternation.sequence, 0, [item]);
            ELSE
                IF item.Rule.SCAN_TERMINAL (column.index, text)
                    item.Aheads.ADD_IF_NOT_EXIST (chart[item.RIGHT_EXTENT]);
                    FOR each occurence in item.Occurences
                        Populate (chart[item.RIGHT_EXTENT], occurence.sequence, occurence.index + 1, occurence.Parents);

    RETURN chart;

PROCEDURE Populate (array, sequence, index, parents)
    rule ← sequence[Index];
    item ← array.FIND (rule);
    IF not found item
        item ← NEW {Rule: rule, Occurences: [], Inheritors: [], Aheads:[]};
        array.ADD (item);

    IF Sequence.LENGTH > Index + 1
        AddOccurence (item, sequence, index, parents);

    ELSE
        FOR each parent in parents
            parent.Inheritors.ADD (item);
            FOR each toInherit in parent.Occurences
                AddOccurence (item, toInherit.Sequence, toInherit.Index, toInherit.Parents);

PROCEDURE AddOccurence (item, sequence, index, parents)
    occurence ← item.Occurences.FIND (sequence, index);
    IF not found occurence
        occurence ← NEW {Sequence: sequence, Index: index, Parents: []};
        item.Occurences.ADD (occurence);

    FOR each parent in parents
        occurence.Parents.ADD_IF_NOT_EXIST (parent);

    FOR each inheritor in item.Inheritors
        AddOccurence (inheritor, occurence.Sequence, occurence.Index, occurence.Parents);

    FOR each column in item.Aheads
        Populate (column, occurence.Sequence, occurence.Index + 1, occurence.Parents);

I thought of just adding a few lines to merge parents of the same right extent to make the parsing linear, but I'm onto implementing new (left-align)<->(right-align) philosophy because I don't like combining several different philosophies, using only fragments of them where appropriate, it doesn't feel right. I have at least two different algorithms by now (both cubic, I guess), and I wonder how many parsing algorithms are even possible, and which of them could be linear...
Dream big. The bigger the dream is, the more beautiful place the world becomes.

*

ivan.moony

  • Trusty Member
  • *********
  • Terminator
  • *
  • 949
    • Structured Type System
Re: V-Parser: developing an option towards NLP and some other thingies
« Reply #33 on: November 27, 2017, 08:02:31 pm »
I think I made it, it's almost linear.  :weee:

You can check it out at: http://moony.atspace.cc/v-parser. Check predefined grammars (partial NLP included), or write new ones. Let me know if you can find out any grammar that is non-linear, relative to a length of input string.

I'll try to publish its little brother together with actual algorithm (about 30 lines) on Github soon. Still have to work out some compact parse forest output and other details. Scripting would also be supported, for binary search, or whatever.
« Last Edit: December 02, 2017, 01:39:13 pm by ivan.moony »
Dream big. The bigger the dream is, the more beautiful place the world becomes.

*

ivan.moony

  • Trusty Member
  • *********
  • Terminator
  • *
  • 949
    • Structured Type System
Re: V-Parser: developing an option towards NLP and some other thingies
« Reply #34 on: December 02, 2017, 03:39:43 pm »
I've published the first version of V-Parser. It is free to use, and it is open source. You can download it from my GitHub repository and test it at online grammar development environment.

Don't hate me, I implemented only CPF output. Hopefully, this would be changed in the next version.
Dream big. The bigger the dream is, the more beautiful place the world becomes.