Structuring natural language

  • 17 Replies
  • 6276 Views
*

Zero

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1000
  • Ready?
    • Thinkbots are free
Structuring natural language
« on: January 07, 2016, 07:58:31 AM »
Hi all!

To help computers understand natural language, we can make the structure of a sentence explicit (see the Wikipedia article on subordination).

Code
    A > B             A is subordinate to B
    A < B             B is subordinate to A
    A < [ B | C ]     B and C are subordinate to A
    A {B} C           A and C are conjuncts, B is coordinator
    A {} B            A and B are conjuncts


I < need < ( [ your > clothes | your > boots ] {and} ( your > motorcycle ) )

( if < you < have < ( a > moment ) ) > ( I < would < love < ( your > thoughts ) < on < this )

( are < [ there | ( any > rules ) < [ I < should < know | about ] ] ) ?

computers < are < ( ( very > good ) < at < ( ( following < ( exact > orders ) ) {and} ( handling < ( ( very > specific ) > things ) ) ) {but} ( ( not > good ) < at < ( dealing < with < ( ( new > things ) < they < haven't seen < before ) ) ) )

( for example ) < ( ( [ a | common | computer ] > program < can < turn < [ ( a > report ) < of < ( names {and} ( hours < worked ) ) | into < paychecks < for < ( the > workers < at < ( a > company ) ) ] ) {but} ( [ the | same ] > program < could not < answer < questions < [ from < ( an > employee ) | about < why < ( the > company ) < will not < pay < for < ( nap time ) ] ) )
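
A minimal Python sketch of how expressions like the ones above could be read into a nested structure. The grammar it uses is an assumption drawn only from the table: `( )` becomes a `("group", items)` tuple, `[ … | … ]` becomes a `("dist", alternatives)` tuple, and `<`, `>` and `{coordinator}` tokens are kept in place as plain strings. It does not decide how chains such as `A < B < C` associate, and the names `tokenize` and `parse` are hypothetical.

```python
import re

# One token per coordinator such as {and}, per bracket or operator, or per word.
TOKEN = re.compile(r"\{[^}]*\}|[()\[\]<>|]|[^\s()\[\]<>|{}]+")


def tokenize(text):
    """Split an expression into words, operators, brackets and coordinators."""
    return TOKEN.findall(text)


def parse(text):
    """Nest the tokens: ( ) gives ("group", items), [ a | b ] gives ("dist", [a, b])."""
    tokens = tokenize(text)
    pos = 0

    def expect(closer):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != closer:
            raise ValueError(f"missing {closer!r}")
        pos += 1

    def sequence(stops):
        nonlocal pos
        items = []
        while pos < len(tokens) and tokens[pos] not in stops:
            tok = tokens[pos]
            pos += 1
            if tok == "(":
                items.append(("group", sequence({")"})))
                expect(")")
            elif tok == "[":
                parts = [sequence({"|", "]"})]
                while pos < len(tokens) and tokens[pos] == "|":
                    pos += 1
                    parts.append(sequence({"|", "]"}))
                expect("]")
                items.append(("dist", parts))
            elif tok in (")", "]"):
                raise ValueError(f"unexpected {tok!r}")
            else:
                items.append(tok)
        return items

    return sequence(set())


if __name__ == "__main__":
    print(parse("I < need < ( [ your > clothes | your > boots ] "
                "{and} ( your > motorcycle ) )"))
```

Keeping the operators as plain tokens leaves the question of precedence open, so a later pass can turn the nested lists into graph edges once that rule is settled. An unbalanced expression raises `ValueError` instead of being silently accepted.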

To make it even less ambiguous, we can add a number after the word to indicate the meaning. For example, door4 is "a non-physical entry into the next world, a particular feeling, a company, etc."
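
As a small illustration of the numbering idea, here is a hedged Python sketch that splits a token such as door4 into the word and its meaning number. The function name `split_sense` is hypothetical, and the rule that a token made only of digits is a plain number rather than a numbered word is an assumption.

```python
import re

# A word (ending in a non-digit) followed by the digits that select the meaning.
SENSE = re.compile(r"(.*\D)(\d+)")


def split_sense(token):
    """Return (word, meaning_number), or (token, None) when no number is attached."""
    match = SENSE.fullmatch(token)
    if match is None:
        return token, None
    return match.group(1), int(match.group(2))


if __name__ == "__main__":
    for token in ["door4", "door", "4", "120px"]:
        print(token, "->", split_sense(token))
```

Requiring a non-digit character before the number keeps bare values like 4 or 120px from being mistaken for numbered words.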

EDIT: typos
EDIT: new examples. Questions seem hard to structure...
« Last Edit: January 07, 2016, 10:11:56 AM by Zero »

*

ivan.moony

  • Trusty Member
  • ***********
  • Eve
  • *
  • 1410
    • e-teoria
Re: Structuring natural language
« Reply #1 on: January 07, 2016, 01:52:55 PM »
Link grammar is a convenient present-day way of parsing natural language. It is part of AbiWord, OpenOffice Writer, and maybe even Microsoft Word.
There exist some rules interwoven within this world. As much as it is a blessing, so much it is a curse.

*

Zero

Re: Structuring natural language
« Reply #2 on: January 07, 2016, 02:37:10 PM »
Thanks, I didn't know about link grammar  O0

Here the idea is to bridge the gap between natural language (very rich, but informal and hard to parse) and computer languages (very clear and easy to parse, but poor). With this we can express rich things formally. It can be used to write imperative programs or declarative data. It could be a language of thought.

*

Zero

Re: Structuring natural language
« Reply #3 on: January 08, 2016, 09:38:38 AM »


More examples...

Layout:
Code
    ( clear < the page )
    ( draw < a table < [
        center it on the page |
        it has 4 columns < [
            head color < cyan |
            1 < [ title < name | width < 120px ] |
            2 < [ title < age | width < 40px ] |
            3 < [ title < city | width < 180px ] |
            4 < [ title < occupation | width < 240px ]
        ] |
        it has 10 rows < [
            height < 20 |
            head color < orange
        ]
    ] )

Logic:
Code
    (A < is grandfather of < C)
        < means that <
        ( (A < is father of < B) {and} (B < is father of < C) )

    ( A < is ancestor of < C )
        < means that <
        ( (A < is father of < C) {or}
        ( (A < is father of < B) {and} (B < is ancestor of < C) ) )
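
The two definitions above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `FATHER` facts and the names in them are made up for the example, the function names are hypothetical, and the fact set is assumed to contain no cycles, otherwise the recursion in `is_ancestor` would not terminate.

```python
# Known "is father of" facts as (father, child) pairs (illustrative data only).
FATHER = {("Alex", "Bob"), ("Bob", "Charlie"), ("Charlie", "Dave")}


def is_father(a, c):
    """A is father of C if the pair is a stored fact."""
    return (a, c) in FATHER


def children(a):
    """Everyone B such that A is father of B."""
    return {child for (father, child) in FATHER if father == a}


def is_grandfather(a, c):
    """(A is father of B) and (B is father of C), for some B."""
    return any(is_father(b, c) for b in children(a))


def is_ancestor(a, c):
    """(A is father of C) or ((A is father of B) and (B is ancestor of C))."""
    return is_father(a, c) or any(is_ancestor(b, c) for b in children(a))


if __name__ == "__main__":
    print(is_grandfather("Alex", "Charlie"))
    print(is_ancestor("Alex", "Dave"))
```

Note the direction of the definition: the "father" facts are given, and "grandfather" and "ancestor" are derived from them, not the other way around.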

EDIT: Deleted bad precedence rule
« Last Edit: January 09, 2016, 03:55:13 AM by Zero »

*

Zero

Re: Structuring natural language
« Reply #4 on: January 09, 2016, 02:05:11 AM »
In the end, it's a kind of mix of directed graphs and s-expressions.

Code

    # name:    section definition

    "name"     reference to section or file#section

    *      wildcard

    >      of
    <      whose

    |      parallel
    /      forward
    \      backward

    { }    custom coordinator
    [ ]    relation distribution
    ( )    group as a whole

    A > B             A is sub-element of B
    A < B             B is sub-element of A
    A < [ B | C ]     B and C are sub-elements of A
    A < [ B / C ]     same + C comes after B
    A < [ B \ C ]     same + C comes before B
    A {B} C           A and C are co-elements, B is coordinator
    A {} B            A and B are co-elements
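
To make the `|`, `/` and `\` rows of the table concrete, here is a hedged Python sketch that expands a flat expression of the form `A < [ B / C ]` into sub-element edges and ordering constraints. It handles only this one flat pattern (no nesting), the name `expand` is hypothetical, and representing "C comes after B" as the pair `("B", "C")` is an assumption made for the example.

```python
import re

# Matches "HEAD < [ BODY ]" with a single-word head and a flat body.
PATTERN = re.compile(r"\s*(\w+)\s*<\s*\[(.*)\]\s*")
# Splits the body on | / \ while keeping the separators.
SEPARATOR = re.compile(r"\s*([|/\\])\s*")


def expand(expr):
    """Return (edges, order) for a flat 'A < [ B sep C ... ]' expression.

    edges: (parent, sub_element) pairs.
    order: (earlier, later) pairs derived from / and \\ separators.
    """
    match = PATTERN.fullmatch(expr)
    if match is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    head, body = match.group(1), match.group(2).strip()
    parts = SEPARATOR.split(body)  # item, separator, item, separator, ...
    items, seps = parts[0::2], parts[1::2]
    edges = [(head, item) for item in items]
    order = []
    for left, sep, right in zip(items, seps, items[1:]):
        if sep == "/":       # forward: right comes after left
            order.append((left, right))
        elif sep == "\\":    # backward: right comes before left
            order.append((right, left))
    return edges, order


if __name__ == "__main__":
    print(expand("A < [ B | C ]"))
    print(expand("A < [ B / C ]"))
    print(expand("A < [ B \\ C ]"))
```

With this reading, `|` adds no ordering at all, while `/` and `\` produce the same edges plus one ordering pair in opposite directions.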


EDIT: Added sections def and ref. This thing has a name now: DSX syntax. Names with an X are cool  ::)
« Last Edit: January 09, 2016, 03:41:51 AM by Zero »

*

8pla.net

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1194
  • TV News. Pub. UAL (PhD). Robitron Mod. LPC Judge.
    • 8pla.net
Re: Structuring natural language
« Reply #5 on: January 09, 2016, 03:09:37 AM »
    (A < is grandfather of < C)
        < means that <
        ( (A < is father of < B) {and} (B < is father of < C) )

May I ask a very polite question: can you try to say this to me in algebra?
I tried to do it myself after reading it and, to be honest, I think I got stuck  :-[

If Alex is grandfather to Charlie
and if Alex is father to Bob 
then Bob is father to Charlie.

To be fair to myself, I am tired right now.  And, at least it is food for thought.
My Very Enormous Monster Just Stopped Using Nine

*

Zero

Re: Structuring natural language
« Reply #6 on: January 09, 2016, 03:38:28 AM »
It's inspired by Prolog. If Alex is father to Bob and Bob is father to Charlie, then Alex is grandfather to Charlie. Here we define the "grandfather" relation type using an already existing "father" relation type.

I'm not sure I can speak in Algebra... Did I answer your question?

*

Zero

Re: Structuring natural language
« Reply #7 on: January 09, 2016, 03:21:06 PM »
I made a vintage page out of it.

*

8pla.net

Re: Structuring natural language
« Reply #8 on: January 09, 2016, 07:30:24 PM »
My Very Enormous Monster Just Stopped Using Nine

*

Zero

Re: Structuring natural language
« Reply #9 on: January 09, 2016, 07:38:01 PM »
I made a vintage page out of it.

That's very helpful.

 :2funny:

EDIT: Wait, it was a joke, right?  :)
« Last Edit: January 10, 2016, 11:39:01 AM by Zero »

*

Art

  • At the end of the game, the King and Pawn go into the same box.
  • Global Moderator
  • **********************
  • Colossus
  • *
  • 5804
Re: Structuring natural language
« Reply #10 on: January 10, 2016, 04:59:18 PM »
Relatively speaking...of course! ;D
In the world of AI, it's the thought that counts!

*

Zero

Re: Structuring natural language
« Reply #11 on: January 11, 2016, 08:53:08 AM »
Relatively speaking...of course! ;D

 ;D Sure, it had to be said!



So I'm still working on this, here are the first two sections of my vintage page:

Quote
The string type is the king of data types. It is both the absolute lowest level, since it can be entered directly on the keyboard, and the highest level, since it's where you would store a poem about God.

In speech, there's an implicit network of links between words and expressions, which organizes them by coordination and subordination. Humans can easily rebuild these invisible links, thanks to their common sense, and understand what is said. But for a computer, it's a tough task. Computers need formal languages, where everything is explicit, because they don't have the background knowledge to fill in the gaps. We can use a special syntax to describe these links explicitly. When parsed, a text written in this syntax won't be stored in a single string, but in a meaningful structure of strings linked to one another.

Understanding natural language is also hard because one word often has several meanings. Humans use logic and context to deduce what is meant. Again, this requires rich knowledge that computers don't have. In our syntax, we make meanings explicit by adding a number at the end of each word. This number indicates which meaning, in Wiktionary, we're referring to. For example, door4 is "a non-physical entry into the next world, a particular feeling, a company, etc."

Another complex challenge for computers is coreference resolution, the ability to determine which expressions refer to the same entities. We make these coreferences explicit by adding a simple identity tag right after the expressions, when needed. If two expressions have the same identity tag, they refer to the same entity.

Finally, to help computers with named entity recognition, we also use a proper name tag, that is placed right before an expression to indicate that this expression is the name of an entity.
Code
Here is a table showing special characters used in DSX syntax:

    # name = filename;   :    shortcut for long filenames
    # name:              :    section definition
                         :
    "name"               :    reference to section or file#section
                         :
    @                    :    identity tag
    ^                    :    proper name tag
                         :
    >                    :    of
    <                    :    whose
                         :
    |                    :    parallel
    /                    :    forward
    \                    :    backward
                         :
    = =                  :    custom coordinator
    [ ]                  :    relation distribution
    ( )                  :    group as a whole
                         :
    A > B                :    A is sub-element of B
    A < B                :    B is sub-element of A
    A < [ B | C ]        :    B and C are sub-elements of A
    A < [ B / C ]        :    same + C comes after B
    A < [ B \ C ]        :    same + C comes before B
    A =B= C              :    A and C are co-elements, B is coordinator
    A == B               :    A and B are co-elements
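
The identity tag and proper name tag could be read with something like the Python sketch below. The concrete spelling is an assumption: the page only says that `@` marks an identity tag placed after an expression and `^` marks a proper name placed before it, so the forms `word@id` and `^word` used here, like the name `read_tags`, are hypothetical.

```python
import re
from collections import defaultdict

# Optional ^ prefix, the word itself, then an optional @identity suffix.
TAGGED = re.compile(r"(\^?)([^@^\s]+)(?:@(\w+))?")


def read_tags(tokens):
    """Return (groups, names).

    groups: identity tag -> list of words sharing it (the same entity).
    names:  words marked with ^ as proper names.
    """
    groups = defaultdict(list)
    names = []
    for token in tokens:
        match = TAGGED.fullmatch(token)
        if match is None:
            raise ValueError(f"cannot read token: {token!r}")
        caret, word, identity = match.groups()
        if caret:
            names.append(word)
        if identity is not None:
            groups[identity].append(word)
    return dict(groups), names


if __name__ == "__main__":
    text = "^Alex@1 is father of ^Bob@2 and he@1 is tall"
    print(read_tags(text.split()))
```

Grouping by tag makes the coreference explicit without any guessing: every word carrying the same identity tag lands in the same list.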

Hope this can give ideas to someone  :)

*

ivan.moony

Re: Structuring natural language
« Reply #12 on: January 11, 2016, 01:01:04 PM »
Just noticed: you didn't update the examples on the web page with the new co-element and coordinator syntax.

Interesting experiment. Anyway, I'd like to see more serious examples of particular uses of DSX.

Does the name DSX stand for something, or does it just sound good?

Edit: typos
« Last Edit: January 11, 2016, 03:41:28 PM by ivan.moony »

*

spydaz

  • Trusty Member
  • *******
  • Starship Trooper
  • *
  • 319
  • Developing Conversational AI (Natural Language/ML)
    • SpydazWeb
Re: Structuring natural language
« Reply #13 on: January 11, 2016, 06:35:52 PM »
Interesting topic...

I'm thinking about a similar logic for building the logic behind inferences, where this may be of some use; not so much for sentence tagging, more for producing a truth table behind the stored information.
I am currently storing information in patterns based on the OWL ontology / ConceptNet, which collects a lot of information, yet searching it and making the data usable is still a problem... this would help... interesting...

*

Zero

Re: Structuring natural language
« Reply #14 on: January 12, 2016, 12:04:10 PM »
Thanks for your interest, it is highly appreciated  :)

The examples have not been updated yet; they'll be recycled very soon. I don't have much time at the moment, so I work slowly, mostly on my smartphone during a cigarette break... Serious examples (of various granularity) are coming!

The name DSX popped up when I said that it's some kind of mix of directed graphs and s-expressions. But I guess the X just looks cool  ;)

It has evolved since then: I replaced { } with = =. The reason is that I wanted to free up { } for another use: embedding other languages, such as JS scripts.


 

