metalanguage: a language for describing other languages

ivan.moony · « **Reply #15 on:** February 13, 2016, 06:50:04 pm »

... I'm going back to distinguishing the <= and <- operators: it is less confusing to think with them that way. I'm also going back to defining new symbols without a prefix and referencing existing symbols with an "@" prefix: it looks cleaner and more readable.

I had an itch when using Regexp for parsing in JavaScript: I found no way to efficiently check whether a regexp matches the value at a specific offset in a string. Slicing the string on every check was not an option for me, so I decided to program my own Regexp library. I followed some documentation and got this implementation:

Code

RegExp <= (
    Union <= (@SimpleRE, '|', @RegExp) |
    SimpleRE <= (
        Concatenation <= (@BasicRE, @SimpleRE) |
        BasicRE <= (
            OneOrMore <= (@ElementaryRE, ('+?' | '+')) |
            ZeroOrMore <= (@ElementaryRE, ('*?' | '*')) |
            ZeroOrOne <= (@ElementaryRE, '?') |
            NumberedTimes <= (
                @ElementaryRE,
                '{',
                In <= (
                    Exactly <= @Integer |
                    AtLeast <= (@Integer, ',') |
                    AtLeastNotMore <= (@Integer, ',', @Integer)
                ),
                ('}?' | '}')
            ) |
            ElementaryRE <= (
                Group <= ('(', @RegExp, ')') |
                Any <= '.' |
                Eos <= '$' |
                Bos <= '^' |
                Char <= (
                    @NonMetaCharacter |
                    '\\', (
                        @MetaCharacter | 
                        't' | 'n' | 'r' | 'f' | 'd' | 'D' | 's' | 'S' | 'w' | 'W' | 
                        @Digit, @Digit, @Digit
                    )
                ) |
                Set <= (
                    PositiveSet <= ('[', @SetItems, ']') |
                    NegativeSet <= ('[^', @SetItems, ']')
                ) <~ (
                    SetItems <= (
                        SetItem <= (
                            Range <= (@Char, '-', @Char) |
                            @Char
                        ) |
                        @SetItem, @SetItems
                    )
                )
            )
        )
    )
)

This definition would work with some JavaScript back-end, but when I compared "union" to "set" in the Regexp definition, I concluded they are about the same thing: a choice among values, detected at parse time. I didn't like this redundancy, so I decided to slightly change the definition of Regexp and develop my own version of it, which looks like this:

Code

    ChExp <= (
        Choice <= (@ConExp, '|', @ChExp) |
        ConExp <= (
            Concatenation <= (@WExp, @ConExp) |
            WExp <= (
                Without <= (@QExp, '!', @WExp) |
                QExp <= (
                    OneOrMore <= (@GExp, '+') |
                    ZeroOrMore <= (@GExp, '*') |
                    ZeroOrOne <= (@GExp, '?') |
                    NumberedTimes <= (@GExp, '{', @Integer, '}') |
                    GExp <= (
                        Group <= ('(', @ChExp, ')') |
                        Exp <= (
                            Any <= '.' |
                            Range <= (@Char, '-', @Char) |
                            Char <= (
                                @NonMetaCharacter |
                                '\\', (
                                    @MetaCharacter |
                                    't' | 'n' | 'r' | 'f' |
                                    '0x', @HEXDigit, @HEXDigit, @HEXDigit, @HEXDigit, @HEXDigit, @HEXDigit
                                )
                            )
                        )
                    )
                )
            )
        )
    )

By adding an extra "without" operator to cover the negative set from the original Regexp, the new version becomes more expressive than the original one. For example, the expression "(.*)!((keyword1)|(keyword2))" matches a string of any length that is different from "keyword1" and "keyword2". In the regular Regexp defined above it is possible only to exclude specific characters from matching, while I've got exclusion of whole strings. The new definition looks cleaner and it does not suffer from choice redundancy.
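To make the "without" idea concrete, here is a minimal JavaScript sketch. It is not the Metafigure implementation: every name in it (lit, any, seq, choice, star, without) is hypothetical, and it assumes "a ! b" means "the match ends that a can reach but b cannot, starting from the same offset". Each matcher works at an explicit offset, so no string slicing is involved.

```javascript
// Sketch only: each matcher takes (text, offset) and returns the end offsets it can reach.
// All names here (lit, any, seq, choice, star, without) are hypothetical.
const lit = (s) => (t, i) => (t.startsWith(s, i) ? [i + s.length] : []);
const any = (t, i) => (i < t.length ? [i + 1] : []);
const uniq = (xs) => [...new Set(xs)];
const seq = (a, b) => (t, i) => uniq(a(t, i).flatMap((j) => b(t, j)));
const choice = (a, b) => (t, i) => uniq([...a(t, i), ...b(t, i)]);

// Zero or more repetitions: collect every offset reachable by repeating `a`.
const star = (a) => (t, i) => {
  const seen = new Set([i]);
  const todo = [i];
  while (todo.length > 0) {
    const j = todo.pop();
    for (const k of a(t, j)) {
      if (!seen.has(k)) {
        seen.add(k);
        todo.push(k);
      }
    }
  }
  return [...seen];
};

// Assumed meaning of "a ! b": ends reachable by `a` that `b` cannot reach from the same offset.
const without = (a, b) => (t, i) => {
  const excluded = new Set(b(t, i));
  return a(t, i).filter((j) => !excluded.has(j));
};

// "(.*)!((keyword1)|(keyword2))" from the post.
const notKeyword = without(star(any), choice(lit("keyword1"), lit("keyword2")));

// Whole-string match: some match starting at offset 0 must end at text.length.
const matchesWhole = (m, t) => m(t, 0).includes(t.length);
```

With these definitions, matchesWhole(notKeyword, "keyword1") is false while matchesWhole(notKeyword, "keyword3") is true. Returning sets of end offsets keeps "without" a plain set difference, at the cost of exploring every possible match length.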

I have to say, the structured way of defining grammar in Metafigure probably saved me from pitfalls that the original authors of Regexp ran into in the seventies, when they probably used unstructured BNF. I'm kind of proud of Metafigure: it has already got me a better version of Regexp, and it is not even finished yet.

Otherwise, I have already started to program Metafigure in JavaScript, and I plan a cut-down version 0.2 soon, which should be sufficient to parse English texts (yes, it is the very NLP, among other stuff, that I'm working on).
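Coming back to the offset check from the start of this post: newer JavaScript engines have the sticky flag "y" (added in ES2015), which makes a regexp match only at lastIndex, so no string slicing is needed. A minimal sketch, assuming an engine with ES2015 support; the helper name matchAt is hypothetical:

```javascript
// Hypothetical helper: return the text matched by `pattern` exactly at `offset`, or null.
// Relies on the ES2015 sticky flag "y" and on RegExp.prototype.flags.
function matchAt(pattern, text, offset) {
  // Rebuild the regexp with "y" added; drop "g" because it is not needed here.
  const flags = pattern.flags.replace(/[gy]/g, "") + "y";
  const sticky = new RegExp(pattern.source, flags);
  sticky.lastIndex = offset; // the match must start exactly here
  const m = sticky.exec(text);
  return m === null ? null : m[0];
}
```

For example, matchAt(/\d+/, "ab123cd", 2) gives "123", while the same call with offset 0 gives null because the digits do not start there.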

ivan.moony · « **Reply #16 on:** April 20, 2016, 07:22:27 pm »

Here is what wiki says about "metatheory":

Quote

A metatheory or meta-theory is a theory whose subject matter is some theory. All fields of research share some meta-theory, regardless whether this is explicit or correct. In a more restricted and specific sense, in mathematics and mathematical logic, metatheory means a mathematical theory about another mathematical theory.

The following is an example of a meta-theoretical statement:

“Any physical theory is always provisional, in the sense that it is only a hypothesis; you can never prove it. No matter how many times the results of experiments agree with some theory, you can never be sure that the next time the result will not contradict the theory. On the other hand, you can disprove a theory by finding even a single observation that disagrees with the predictions of the theory.”

Meta-theoretical investigations are generally part of philosophy of science. Also a metatheory is an object of concern to the area in which the individual theory is conceived.

This is just what I'm after: a general metatheory.

keghn · « **Reply #17 on:** April 20, 2016, 09:40:58 pm »

SGML HTML XML What's the Difference? (Part 1) - Computerphile:

ivan.moony · « **Reply #18 on:** April 20, 2016, 10:44:10 pm »

Though its syntax is strict, XML would be a language in which a lot of other languages (if not all) can be expressed. There is even a programming language named "Fabula" whose programs are entirely expressed in XML (but you have to use predefined tags like <string>, <class>, <applet>...)

ivan.moony · « **Reply #19 on:** April 21, 2016, 06:50:27 pm »

I decided to change the "<-" operator to "=>", in order to match my new insights from some logic investigation.

So: A <- B becomes A => B.

Functions are also defined somewhat differently, with parameters on the left of "<=" / "=>" and the result on the right of "<=" / "=>".

I also changed how the operators are read. "<=" is now read "induces from" and "=>" is read "deduces to". They are not the same as logical implication, although there is some analogy with it, and they can be transformed to equivalent expressions with logical implication.
