Ai Dreams Forum

Member's Experiments & Projects => AI Programming => Topic started by: Zero on January 07, 2016, 07:58:31 am

Title: Structuring natural language
Post by: Zero on January 07, 2016, 07:58:31 am
Hi all!

To help computers understand natural language, we can make the structure of the sentence explicit (see Wikipedia on subordination (https://en.wikipedia.org/wiki/Subordination_%28linguistics%29)).

Code
    A > B             A is subordinate to B
    A < B             B is subordinate to A
    A < [ B | C ]     B and C are subordinate to A
    A {B} C           A and C are conjuncts, B is coordinator
    A {} B            A and B are conjuncts


I < need < ( [ your > clothes | your > boots ] {and} ( your > motorcycle ) )

( if < you < have < ( a > moment ) ) > ( I < would < love < ( your > thoughts ) < on < this )

( are < [ there | ( any > rules ) < [ I < should < know | about ] ] ) ?

computers < are < ( ( very > good ) < at < ( ( following < ( exact > orders ) ) {and} ( handling < ( ( very > specific ) > things ) ) ) {but} ( ( not > good ) < at < ( dealing < with < ( ( new > things ) < they < haven't seen < before ) ) ) )

( for example ) < ( ( [ a | common | computer ] > program < can < turn < [ ( a > report ) < of < ( names {and} ( hours < worked ) ) | into < paychecks < for < ( the > workers < at < ( a > company ) ) ] ) {but} ( [ the | same ] > program < could not < answer < questions < [ from < ( an > employee ) | about < why < ( the > company ) < will not < pay < for < ( nap time ) ] ) )

To make it even less ambiguous, we can add a number after each word to indicate which meaning is intended. For example, door4 (https://en.wiktionary.org/wiki/door) is "a non-physical entry into the next world, a particular feeling, a company, etc."
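
The sense-number convention can be sketched in a few lines of Python. This is only an illustration: the helper name split_sense is hypothetical, and it assumes the sense number is always the trailing run of digits, so a word that legitimately ends in a digit would be misread.

```python
def split_sense(token):
    """Split a sense-tagged word such as "door4" into ("door", 4).

    Assumption: the sense number is the trailing run of digits.
    Tokens with no trailing digits, or made only of digits, are
    returned unchanged with a sense of None.
    """
    word = token.rstrip("0123456789")
    digits = token[len(word):]
    if not word or not digits:
        return token, None
    return word, int(digits)


print(split_sense("door4"))   # word "door", sense 4
print(split_sense("door"))    # no sense number given
```

The number itself would still have to be looked up in a dictionary such as Wiktionary; the sketch only separates the word from its sense index.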

EDIT: typos
EDIT: new examples. Questions seem hard to structure...
Title: Re: Structuring natural language
Post by: ivan.moony on January 07, 2016, 01:52:55 pm
Link grammar (https://en.wikipedia.org/wiki/Link_grammar) is a convenient way of parsing natural language these days. It is part of AbiWord, and maybe of OpenOffice Writer and even Microsoft Word.
Title: Re: Structuring natural language
Post by: Zero on January 07, 2016, 02:37:10 pm
Thanks, I didn't know about link grammar  O0

Here the idea is to bridge the gap between natural language (very rich, but informal and hard to parse) and computer language (very clear and easy to parse, but poor). With this we can express rich things formally; it can be used to write imperative programs or declarative data. It could be a language of thought.
Title: Re: Structuring natural language
Post by: Zero on January 08, 2016, 09:38:38 am


More examples...

Layout:
Code
    ( clear < the page )
    ( draw < a table < [
        center it on the page |
        it has 4 columns < [
            head color < cyan |
            1 < [ title < name | width < 120px ] |
            2 < [ title < age | width < 40px ] |
            3 < [ title < city | width < 180px ] |
            4 < [ title < occupation | width < 240px ]
        ] |
        it has 10 rows < [
            height < 20 |
            head color < orange
        ]
    ] )

Logic:
Code
    (A < is grandfather of < C)
        < means that <
        ( (A < is father of < B) {and} (B < is father of < C) )

    ( A < is ancestor of < C )
        < means that <
        ( (A < is father of < C) {or}
        ( (A < is father of < B) {and} (B < is ancestor of < C) ) )
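
For comparison, here is a minimal Python sketch of the same two definitions, read as rules over a set of "father" facts. The facts and the function names are made up for illustration; the ancestor rule is evaluated by repeating the recursive case until no new pairs appear.

```python
# Hypothetical facts: (father, child) pairs. The names are invented.
FATHER = {("Alex", "Bob"), ("Bob", "Charlie"), ("Charlie", "Dan")}


def grandfathers(father):
    # (A, C) whenever A is father of B and B is father of C.
    return {(a, c) for (a, b) in father for (b2, c) in father if b == b2}


def ancestors(father):
    # Base case: every father is an ancestor.
    result = set(father)
    while True:
        # Recursive case: A is father of B and B is ancestor of C.
        new = {(a, c) for (a, b) in father for (b2, c) in result if b == b2}
        if new <= result:      # nothing new was derived, so we are done
            return result
        result |= new


print(sorted(grandfathers(FATHER)))
print(sorted(ancestors(FATHER)))
```

A Prolog-style engine would answer such queries on demand rather than computing every pair up front; the loop above is just the simplest way to show what the rules mean.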

EDIT: Deleted bad precedence rule
Title: Re: Structuring natural language
Post by: Zero on January 09, 2016, 02:05:11 am
In the end, it's a kind of mix of directed graphs and s-expressions.

Code

    # name:    section definition

    "name"     reference to section or file#section

    *      wildcard

    >      of
    <      whose

    |      parallel
    /      forward
    \      backward

    { }    custom coordinator
    [ ]    relation distribution
    ( )    group as a whole

    A > B             A is sub-element of B
    A < B             B is sub-element of A
    A < [ B | C ]     B and C are sub-elements of A
    A < [ B / C ]     same + C comes after B
    A < [ B \ C ]     same + C comes before B
    A {B} C           A and C are co-elements, B is coordinator
    A {} B            A and B are co-elements
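
One possible reading of relation distribution as directed-graph edges, sketched in Python. The function name distribute and the edge labels "sub" and "before" are hypothetical, and chaining the order over more than two children is an assumption: the table above only defines the two-child case.

```python
def distribute(parent, children, sep="|"):
    """Edges for `parent < [ c1 sep c2 sep ... ]` as (source, label, target)."""
    # Every child becomes a sub-element of the parent.
    edges = [(parent, "sub", child) for child in children]
    if sep == "|":        # parallel: no ordering between children
        ordered = []
    elif sep == "/":      # forward: C comes after B
        ordered = list(children)
    elif sep == "\\":     # backward: C comes before B
        ordered = list(reversed(children))
    else:
        raise ValueError(f"unknown separator {sep!r}")
    # Assumption: with more than two children, the order is chained pairwise.
    edges += [(a, "before", b) for a, b in zip(ordered, ordered[1:])]
    return edges


print(distribute("A", ["B", "C"], "/"))
```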


EDIT: Added sections def and ref. This thing has a name now: DSX syntax. Names with an X are cool  ::)
Title: Re: Structuring natural language
Post by: 8pla.net on January 09, 2016, 03:09:37 am
Quote
    (A < is grandfather of < C)
        < means that <
        ( (A < is father of < B) {and} (B < is father of < C) )

May I ask a very polite question: can you try to say this to me in algebra?
I tried to do it myself after reading it and, to be honest, I think I got stuck  :-[

If Alex is grandfather to Charlie
and if Alex is father to Bob 
then Bob is father to Charlie.

To be fair to myself, I am tired right now.  And, at least it is food for thought.
Title: Re: Structuring natural language
Post by: Zero on January 09, 2016, 03:38:28 am
It's inspired by Prolog. If Alex is father to Bob and Bob is father to Charlie, then Alex is grandfather to Charlie. Here we define the "grandfather" relation type using an already existing "father" relation type.

I'm not sure I can speak in Algebra... Did I answer your question?
Title: Re: Structuring natural language
Post by: Zero on January 09, 2016, 03:21:06 pm
I made a vintage page (http://thinkbots.are.free.fr/) out of it.
Title: Re: Structuring natural language
Post by: 8pla.net on January 09, 2016, 07:30:24 pm
Quote
Did I answer your question?

Yes!

Quote
I made a vintage page (http://thinkbots.are.free.fr/) out of it.

That's very helpful.
Title: Re: Structuring natural language
Post by: Zero on January 09, 2016, 07:38:01 pm
Quote
I made a vintage page (http://thinkbots.are.free.fr/) out of it.

That's very helpful.

 :2funny:

EDIT: Wait, it was a joke, right?  :)
Title: Re: Structuring natural language
Post by: Art on January 10, 2016, 04:59:18 pm
Relatively speaking...of course! ;D
Title: Re: Structuring natural language
Post by: Zero on January 11, 2016, 08:53:08 am
Quote
Relatively speaking...of course! ;D

 ;D Sure, it had to be said!



I'm still working on this. Here are the first two sections of my vintage page:

Quote
The string type is the king of data types. It is both the lowest level, since it can be entered directly on the keyboard, and the highest level, since it's where you would store a poem about God.

In speech, there's an implicit network of links between words and expressions, which organizes them by coordination and subordination. Humans can easily rebuild these invisible links, thanks to their common sense, and understand what is said. But for a computer, it's a tough task. Computers need formal languages, where everything is explicit, because they don't have the background knowledge to fill in the gaps. We can use a special syntax to describe these links explicitly. When parsed, a text written in this syntax won't be stored in a single string, but in a meaningful structure of strings linked to one another.

Understanding natural language is also hard because one word often has several meanings. Humans use logic and context to deduce what is meant. Again, this requires a rich knowledge that computers don't have. In our syntax, we make meanings explicit by adding a number at the end of each word. This number indicates which meaning, in the Wiktionary, we're referring to. For example, door4 is "a non-physical entry into the next world, a particular feeling, a company, etc."

Another complex challenge for computers is coreference resolution, the ability to determine which expressions refer to the same entities. We make these coreferences explicit by adding a simple identity tag right after the expressions, when needed. If two expressions have the same identity tag, they refer to the same entity.

Finally, to help computers with named entity recognition, we also use a proper name tag, which is placed right before an expression to indicate that this expression is the name of an entity.
Here is a table showing the special characters used in DSX syntax:
Code

    # name = filename;   :    shortcut for long filenames
    # name:              :    section definition
                         :
    "name"               :    reference to section or file#section
                         :
    @                    :    identity tag
    ^                    :    proper name tag
                         :
    >                    :    of
    <                    :    whose
                         :
    |                    :    parallel
    /                    :    forward
    \                    :    backward
                         :
    = =                  :    custom coordinator
    [ ]                  :    relation distribution
    ( )                  :    group as a whole
                         :
    A > B                :    A is sub-element of B
    A < B                :    B is sub-element of A
    A < [ B | C ]        :    B and C are sub-elements of A
    A < [ B / C ]        :    same + C comes after B
    A < [ B \ C ]        :    same + C comes before B
    A =B= C              :    A and C are co-elements, B is coordinator
    A == B               :    A and B are co-elements
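
To give an idea of how a text in this syntax could end up as a structure of linked strings rather than one string, here is a rough Python sketch of a parser for a subset of the table. Everything in it is an assumption made for illustration: the names tokenize and parse, the tuple shapes, and above all the choice to keep a chain like `A < B < C` flat instead of deciding how it nests. The @ and ^ tags and the # section markers are not handled.

```python
import re

PUNCT = set("<>|/\\[]()=")


def tokenize(text):
    # Split on punctuation; everything between two marks is one label.
    parts = re.split(r"([<>|/\\\[\]()=])", text)
    return [p.strip() for p in parts if p.strip()]


def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def term():
        tok = take()
        if tok == "(":                      # ( ) group as a whole
            node = ("group", chain())
            take(")")
            return node
        if tok == "[":                      # [ ] relation distribution
            items, seps = [chain()], []
            while peek() in ("|", "/", "\\"):
                seps.append(take())
                items.append(chain())
            take("]")
            return ("dist", items, seps)
        if tok in PUNCT:
            raise ValueError(f"unexpected {tok!r}")
        return tok                          # plain label

    def link():
        tok = take()
        if tok != "=":
            return tok                      # "<" or ">"
        word = ""
        if peek() != "=":                   # "=word=", otherwise bare "=="
            word = take()
            if word in PUNCT:
                raise ValueError(f"unexpected {word!r}")
        take("=")
        return ("coord", word)

    def chain():
        # Flat list: term, link, term, link, term... (no precedence assumed)
        items = [term()]
        while peek() in ("<", ">", "="):
            items.append(link())
            items.append(term())
        return items

    result = chain()
    if pos != len(tokens):
        raise ValueError(f"unexpected {peek()!r}")
    return result


print(parse(tokenize("A =and= ( B < [ C | D ] )")))
```

Keeping chains flat leaves the question of how `<` and `>` associate to a later step, which seems safer than guessing a precedence rule.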

Hope this can give ideas to someone  :)
Title: Re: Structuring natural language
Post by: ivan.moony on January 11, 2016, 01:01:04 pm
Just noticing... You didn't update the examples on the web page with the new co-element and coordinator syntax.

Interesting experiment. Anyway, I'd like to see more serious examples of particular uses of DSX.

Does the name DSX stand for something, or does it just sound good?

Edit: typos
Title: Re: Structuring natural language
Post by: spydaz on January 11, 2016, 06:35:52 pm
interesting topic ...

I'm thinking about a similar logic for building the reasoning behind inferences, where this may be of some use: not so much for sentence tagging as for producing a truth table behind the stored information.
I am currently storing information in patterns based on the OWL ontology / ConceptNet, which collects a lot of information, yet searching it and making the data usable is still a problem... this would help .... interesting...
Title: Re: Structuring natural language
Post by: Zero on January 12, 2016, 12:04:10 pm
Thanks for your interest, it is highly appreciated  :)

The examples have not been updated yet; they'll be recycled very soon. I don't have much time currently, so I work slowly, mostly on my smartphone during a cigarette... Serious examples (of various granularity) are coming!

The name DSX popped up when I said that it's some kind of mix of directed graphs and s-expressions. But I guess the X just looks cool  ;)

The syntax has evolved: I replaced { } with = =. The reason is that I wanted to free { } for another use: embedding other languages, like JS scripts for instance.
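
Converting examples written in the old coordinator syntax is mechanical. A small Python sketch, with a hypothetical function name, assuming the text contains only old-style coordinators and no embedded scripts inside { } yet:

```python
import re

# Matches an old-style coordinator: "{and}", "{ but }", or the empty "{}".
OLD_COORDINATOR = re.compile(r"\{\s*([^{}]*?)\s*\}")


def braces_to_equals(text):
    """Rewrite `A {B} C` as `A =B= C` and `A {} B` as `A == B`."""
    return OLD_COORDINATOR.sub(r"=\1=", text)


print(braces_to_equals("( names {and} ( hours < worked ) )"))
```

Once { } is used for embedded code, running this over a text would mangle the scripts, so it only makes sense as a one-off migration of older examples.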

Title: Re: Structuring natural language
Post by: ivan.moony on January 12, 2016, 03:01:03 pm
Quote
The reason is that I wanted to free { } for another use: embedding other languages, like js scripts for instance.

Yeah, I have a similar idea to embed JS inside metafigure. I'm thinking of extending it with native assembler in the future. The idea is to have fragments of code held inside metafigure compiled to JS or asm and executed on demand. Metafigure would provide a mechanism for converting any AST (abstract syntax tree) into another AST, so you can convert, say, Python or Java to, say, JavaScript or asm on demand.
Title: Re: Structuring natural language
Post by: Zero on January 13, 2016, 04:39:10 pm
I understand, so you can have the raw power of ASM and the flexibility of metafigures...


Here are the updated examples:
Code
    computers < are < 
    (
        (
            (very > good) < at <
            (
                ( following < (exact > orders) )
                =and=
                ( handling < ((very > specific) > things) )
            )
        )
        =but=
        (
            (not > good) < at <
            (
                dealing < with <
                (
                    (new > things) < they < haven't seen < before
                )
            )
        )
    )
Code
    ( for example ) <
    (
        (
            ( [ a | common | computer ] > program ) < can <
            (
                turn <
                [
                    a > report < of < ( names =and= (hours < worked) )
                |
                    into < paychecks < for <
                    (
                        the > workers < at < (a > company)
                    )
                ]
            )
        )
        =but=
        (
            ( [ the | same ] > program ) < could not <
            (
                answer < questions <
                [
                    from < (an > employee)
                |
                    about < why <
                    (
                        (the > company) < will not < pay < for < (nap time)
                    )
                ]
            )
        )
    )
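
With this much nesting, a missing bracket is easy to overlook. A minimal Python sketch of a balance check for ( ) and [ ]; the function name and the choice of return value are hypothetical:

```python
PAIRS = {")": "(", "]": "["}


def check_brackets(text):
    """Return None if ( ) and [ ] are balanced, else the index of a problem.

    The index points at the first closer that does not match, or, if the
    text ends too early, at the most recent opener left unclosed.
    """
    stack = []
    for i, ch in enumerate(text):
        if ch in "([":
            stack.append((ch, i))
        elif ch in PAIRS:
            if not stack or stack[-1][0] != PAIRS[ch]:
                return i
            stack.pop()
    return stack[-1][1] if stack else None


print(check_brackets("( (very > good) < at < things )"))
```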

EDIT: typos
Title: Re: Structuring natural language
Post by: spydaz on February 05, 2016, 12:55:21 am
You have made me go back to "fluent editors" description logic, which can be used to describe these "OWL"-like structures...

For me, too many brackets... Whoa...