Ai Dreams Forum

Member's Experiments & Projects => AI Programming => Topic started by: Zero on July 26, 2018, 10:31:22 am

Title: Wild ontology - where am I wrong?
Post by: Zero on July 26, 2018, 10:31:22 am
Here is a piece of JSON extracted from the jsonQ (http://ignitersworld.com/lab/jsonQ.html) website.

Code
var jsonObj = {
    "fathers": [{
        "age": 44,
        "name": "James Martin",
        "daughters": [{
            "age": 24,
            "name": "Michelle",
            "husband": {
                "age": 30,
                "name": "Matthew"
            }
        }, {
            "age": 30,
            "name": "Angela",
            "husband": {
                "age": 23,
                "name": "William"
            }
        }]
    }, {
        "age": 47,
        "name": "David Thompson",
        "daughters": [{
            "age": 20,
            "name": "Amy",
            "husband": {
                "age": 26,
                "name": "Edward"
            }
        }, {
            "age": 20,
            "name": "Dorothy",
            "husband": {
                "age": 23,
                "name": "Timothy"
            }
        }]
    }, {
        "age": 56,
        "name": "Thomas Young",
        "daughters": [{
            "age": 22,
            "name": "Sharon",
            "husband": {
                "age": 23,
                "name": "Jason"
            }
        }, {
            "age": 22,
            "name": "Carol",
            "husband": {
                "age": 23,
                "name": "William"
            }
        }, {
            "age": 20,
            "name": "Brenda",
            "husband": {
                "age": 30,
                "name": "Timothy"
            }
        }]
    }, {
        "age": 53,
        "name": "Jason Martinez",
        "daughters": [{
            "age": 19,
            "name": "Jessica",
            "husband": {
                "age": 24,
                "name": "Daniel"
            }
        }]
    }, {
        "age": 51,
        "name": "Thomas Gonzalez",
        "daughters": [{
            "age": 23,
            "name": "Brenda",
            "husband": {
                "age": 30,
                "name": "George"
            }
        }, {
            "age": 30,
            "name": "Dorothy",
            "husband": {
                "age": 23,
                "name": "Brian"
            }
        }]
    }, {
        "age": 41,
        "name": "James Lee",
        "daughters": [{
            "age": 20,
            "name": "Sarah",
            "husband": {
                "age": 24,
                "name": "Frank"
            }
        }, {
            "age": 21,
            "name": "Carol",
            "husband": {
                "age": 28,
                "name": "Larry"
            }
        }]
    }, {
        "age": 58,
        "name": "Kenneth Brown",
        "daughters": [{
            "age": 23,
            "name": "Ruth",
            "husband": {
                "age": 24,
                "name": "Brian"
            }
        }, {
            "age": 18,
            "name": "Lisa",
            "husband": {
                "age": 24,
                "name": "Scott"
            }
        }, {
            "age": 27,
            "name": "Sandra",
            "husband": {
                "age": 31,
                "name": "Charles"
            }
        }]
    }, {
        "age": 50,
        "name": "Thomas Lee",
        "daughters": [{
            "age": 27,
            "name": "Patricia",
            "husband": {
                "age": 30,
                "name": "Scott"
            }
        }, {
            "age": 21,
            "name": "Jennifer",
            "husband": {
                "age": 23,
                "name": "George"
            }
        }]
    }, {
        "age": 50,
        "name": "Robert Anderson",
        "daughters": [{
            "age": 24,
            "name": "Angela",
            "husband": {
                "age": 23,
                "name": "James"
            }
        }]
    }]
};

What strikes me is how obvious the information represented here is! This is an absolutely wild piece of data, and yet it makes perfect sense to a human developer. For a long time, whenever I planned to manipulate knowledge, I used to think "Cyc-style ontology". Today, I'm wondering: after all, what's wrong with messy data? Messy data's OK!!!

I'm working on a framework that would allow a hobbyist developer to shape a mind as easily as a chatbot. Well, I'm not saying that building a good chatbot is easy (Mitsuku, for example, is probably a very complex project). Still, it's easier to make a chatbot than to make a Schwarzenegger-style metal-bones AGI. I'd like a thinkbot developer to feel like they're working on a chatbot: simple tech, no pressure, hobby, relax, enjoy.

The only real problem is name collision if you're dealing with a huge knowledge base. The answer to this is simply not to use named entities. Knowledge is just a list of objects. Objects have keys which lead to values. But they have no IDs, hence no name collision.

Inspired from the entity-component-system paradigm, here is some knowledge:
Code
[
    {
        componentA: "value1",
        componentB: "value2",
        componentC: "value3"
    },
    {
        componentA: "value4",
        componentD: "value5"
    }
]
Here we have two entities. The various "aspects" of these entities are represented by the components they contain. Let's take a more concrete example.

The following represents "something":
Code
[
    {
    }
]

It turns out this "something" is a human being.
Code
[
    {
        race: "human"
    }
]

This human is a woman.
Code
[
    {
        race: "human",
        gender: "female"
    }
]

She's an entrepreneur.
Code
[
    {
        race: "human",
        gender: "female",
        occupation: "entrepreneur"
    }
]

She's also a mother.
Code
[
    {
        race: "human",
        gender: "female",
        occupation: "entrepreneur"
        motherOf: ["Cathy", "Jack"]
    }
]

Being an entrepreneur and being a mother are two facets/aspects of her life, which are represented by two components.

Since it's easy to create a pattern-template couple for a chatbot, it should also be easy to create knowledge like this for a thinkbot. I can already hear people screaming "Scalability!" and "Consistency!"... Well, my point is that if you create your knowledge representation as simply as you can, you'll probably end up with something that's self-consistent, and consistent with how most people would represent things. Moreover, when expanding an already existing knowledge base, you'll instinctively follow the "coding style" of the kb, as any dev would.
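To make this concrete, here is a minimal sketch (plain JavaScript, with invented component names) of how such a list of entities could be queried by facet rather than by ID:
Code
// The same entity carries several facets, each one a component.
var knowledge = [
    {
        race: "human",
        gender: "female",
        occupation: "entrepreneur",
        motherOf: ["Cathy", "Jack"]
    },
    {
        race: "human",
        gender: "male",
        occupation: "teacher"
    }
];

// "Who is an entrepreneur?" and "who is a mother?" both reach the same
// object through different components, without any entity ID.
var entrepreneurs = knowledge.filter(function (e) {
    return e.occupation === "entrepreneur";
});
var mothers = knowledge.filter(function (e) {
    return e.motherOf !== undefined;
});
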
So... screw OWL. :)

Where am I wrong?
Title: Re: Wild ontology - where am I wrong?
Post by: 8pla.net on July 26, 2018, 05:40:19 pm
Relaxing and enjoying this new interesting discussion thread.
No worries about being wrong.  Plenty of chances for all of us
here to say,  "Oops!" but that's the fun of it.

Zero said, "Knowledge is just a list of objects. Objects have
keys which lead to values. But they have no IDs, hence
no name collision.
"

This reply is intended to demonstrate a name collision for
"Alex" and "Diane" in JSON, and implied, unique keys when
converted to PHP.  By "implied" I mean invisible but still present.
By "unique" I mean automatic numbering.

JSON:
Code
{
    "fathers": [
        "Alex",
        "Brian",
        "Carl",
        "Alex"
    ],
    "mothers": [
        "Diane",
        "Elizabeth",
        "Florence",
        "Diane"
    ],
    "married": [
        2,
        1,
        0,
        3
    ]
}


Converted to PHP:
Code
 $DB = array (
  'fathers' =>
  array (
    0 => 'Alex',
    1 => 'Brian',
    2 => 'Carl',
    3 => 'Alex',
  ),
  'mothers' =>
  array (
    0 => 'Diane',
    1 => 'Elizabeth',
    2 => 'Florence',
    3 => 'Diane',
  ),
  'married' =>
  array (
    0 => 2,
    1 => 1,
    2 => 0,
    3 => 3,
  ),
);

I will stop here and save something for my next reply.
Wow, that JSON looks pretty nice, and quite efficient in my opinion.

Title: Re: Wild ontology - where am I wrong?
Post by: squarebear on July 26, 2018, 09:37:31 pm
Funnily enough, that's how I code Mitsuku to be able to answer all those silly "Is a train faster than a snail?" or "Can you lift a church?" type questions. I have a database of around 3000 common objects with attributes for each, and have written AIML so Mitsuku can manipulate them to find the correct answers.

Here is part of my entry for the word "tree".

(https://aidreams.co.uk/forum/proxy.php?request=http%3A%2F%2Fwww.square-bear.co.uk%2Fobject.png&hash=ef7b5ac97c8fa34a383401e1f9fb7911dbb9ac5a)

The beauty of this is that the chatbot can work things out without having a direct attribute to answer with. So, I don't have an "is edible" attribute, but Mitsuku can work out that a tree is "madefrom" wood and you can't eat wood, so you can't eat a tree. It saves hours of coding answers to random nonsense questions.
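A minimal sketch of this kind of chaining (in JavaScript rather than the actual AIML, with made-up object and attribute names) might look like:
Code
// Hypothetical object database: note there is no "edible" flag for tree.
var objects = {
    tree:  { madefrom: "wood" },
    wood:  { edible: false },
    bread: { edible: true }
};

// If there is no direct "edible" attribute, fall back to the material.
function isEdible(name) {
    var obj = objects[name];
    if (obj === undefined) return undefined;        // unknown object
    if (obj.edible !== undefined) return obj.edible;
    if (obj.madefrom) return isEdible(obj.madefrom);
    return undefined;                               // can't tell
}

isEdible("tree"); // false: tree is made from wood, and wood isn't edible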
Title: Re: Wild ontology - where am I wrong?
Post by: ranch vermin on July 26, 2018, 11:46:27 pm
Looks like you're going good.

So if the robot has picked up some knowledge, even if it was a bit rubbish, how is it going to use that rubbish?
Title: Re: Wild ontology - where am I wrong?
Post by: Zero on July 27, 2018, 10:25:18 am
Quote
This reply is intended to demonstrate a name collision for "Alex" and "Diane" in JSON, and implied, unique keys when converted to PHP.  By "implied" I mean invisible but still present. By "unique" I mean automatic numbering.
Yes, there's an index anyway, you're right. But you can choose to ignore it, and pretend there's no index.
Actually I structure my data the other way around: it's an array of objects, not an object with arrays as values. :)

Quote
Funnily enough, that's how I code Mitsuku to be able to answer all those silly "Is a train faster than a snail?" or "Can you lift a church?" type questions. I have a database of around 3000 common objects with attributes for each, and have written AIML so Mitsuku can manipulate them to find the correct answers.
It seems that what makes Mitsuku efficient is the enormous volume of data that's been (manually?) produced, and clever architectural patterns like the one you described. Did you have any consistency-related issues, or trouble handling such a big brain (shadows, etc.)?

Quote
Looks like you're going good.

So if the robot has picked up some knowledge, even if it was a bit rubbish, how is it going to use that rubbish?
My theory is that the dev who coded how the bot picks up knowledge should probably also code how it uses that knowledge. Learning is an important part of any AI project, but I believe there are other ways than low-level, "pixel-style" learning.

I believe in good old-fashioned, hand-crafted AI. And documenting the brain is as important as creating it, because if it's well documented, you'll have consistency on a good time-scale.
Title: Re: Wild ontology - where am I wrong?
Post by: 8pla.net on July 29, 2018, 03:17:24 pm
"But you can choose to ignore it, and pretend there's no index.", said Zero.

Yes, but you can't eliminate it, except in very small cases; most programmers would agree.
No index makes it non-relational. An index makes it relational (demonstrated below).

Here is the second part of my PHP code sample:
Code
foreach($DB['married'] as $father=>$mother)
{
   echo $DB['fathers'][$father]. " married ".$DB['mothers'][$mother]."\n";
}

And, here is the program output:
Quote
Alex married Florence
Brian married Elizabeth
Carl married Diane
Alex married Diane
Title: Re: Wild ontology - where am I wrong?
Post by: 8pla.net on July 29, 2018, 04:00:36 pm
Quote
Funnily enough, that's how I code Mitsuku to be able to answer all those silly "Is a train faster than a snail?" or "Can you lift a church?" type questions. I have a database of around 3000 common objects with attributes for each, and have written AIML so Mitsuku can manipulate them to find the correct answers.

Here is part of my entry for the word "tree".

(https://aidreams.co.uk/forum/proxy.php?request=http%3A%2F%2Fwww.square-bear.co.uk%2Fobject.png&hash=ef7b5ac97c8fa34a383401e1f9fb7911dbb9ac5a)

The beauty of this is that the chatbot can work things out without having a direct attribute to answer with. So, I don't have an "is edible" attribute, but Mitsuku can work out that a tree is "madefrom" wood and you can't eat wood, so you can't eat a tree. It saves hours of coding answers to random nonsense questions.

Steve,

For discussion purposes, I created some pseudo code (incompatible with your AIML)
to ask three questions about how a chatbot uses objects.

The pattern element creates an object, which I can express in PHP pseudo code:
Code
$tree = new StdClass();

The set elements do two things: they name the attribute and give it a value, which I can express in PHP pseudo code:
Code
$tree->{"named"} = 'tree';
$tree->{"made of"} = 'wood';
$tree->{"has"} = 'leaves';

Three questions:
How does a chatbot work things out when it has a direct attribute to answer with?
How does a chatbot work things out without having a direct attribute to answer with?
In simple terms, how does a chatbot know not to eat a tree because it is made of wood?

I have a few theories and more pseudo code for our discussion, but I would like to
hear from you (or anyone else) first.

Title: Re: Wild ontology - where am I wrong?
Post by: squarebear on July 29, 2018, 10:32:24 pm
It works because I've coded some basic rules into it. So I've told it you can't eat things like metal or wood, and from these two rules, it now knows you can't eat any of the hundreds of objects made from wood or metal. Here is another example. If someone asks, "Is a (object) alive?", I have no attribute for whether something is alive, but I do have a category like this, which can work it out by looking at the other attributes:

(https://cdn-images-1.medium.com/max/800/1*RD3osq8yfFyBKFDavPyiAQ.png)

Basically, when anyone asks about an object, it loads all of the information it knows about that object into memory and can then use the attributes to work out the answer. It's extremely rare that I have to add a new attribute, as the ones I have can handle most common-sense queries.

"Can you lift x?" can be answered by checking the size of something and so on.

To answer Zero's questions: consistency isn't an issue, as I'm the only one working on Mitsuku's AI, but yes, her performance as a chatbot is directly related to the amount of information I've put into her over the past 13 years.

I wrote a blog post about my techniques here: https://medium.com/pandorabots-blog/can-you-eat-a-chair-aaa80d251f6b
Title: Re: Wild ontology - where am I wrong?
Post by: Art on July 29, 2018, 10:51:54 pm
What about a defining category or sub-category, such as:

Is it connected by way of a root or vine?
Does it take in/consume nutrients?

If either of these conditions is met for a plant (tree, vegetables, fruit, grains, etc.) then the answer would be it is alive. Once the vegetable/fruit, etc. is plucked or picked then it is no longer taking in nutrients and is no longer alive in the simple sense. Notwithstanding the "Attack of the Killer Tomatoes!" (B-movie). ;)


===========
Final remark/question:
Steve, how much lag/delay is introduced by the addition of all the various conditions Mitsuku has to process? I know you can't include everything or every possibility, but is there a lag, and if so, how have you been able to deal with it (assuming you have)?
Title: Re: Wild ontology - where am I wrong?
Post by: squarebear on July 30, 2018, 10:39:47 am
No lag at all. Mitsuku usually processes and responds to any input in under half a second, even when dealing with hundreds of users at once. She has to search and process over 300,000 categories, and the responses are extremely quick.
Title: Re: Wild ontology - where am I wrong?
Post by: 8pla.net on July 31, 2018, 04:47:09 am
Thanks Steve,

As your article suggests, I wrote some pseudo code in PHP,
and it works, but not nearly as well as your article explains.
Though I think I understand a bit better now.

If you are interested, I proofread your article and found
"chocoate" (misspelled) in Mitsuku's log. That doesn't
matter much, of course. I enjoyed reading your article.


Title: Re: Wild ontology - where am I wrong?
Post by: squarebear on July 31, 2018, 11:04:52 am
Well spotted and thanks for the kind words. The sample logs I posted as examples came straight from Mitsuku's logs, and it was the user who had spelled it incorrectly.

She has to deal with all kinds of spelling errors but I guess that's one for another topic.
Title: Re: Wild ontology - where am I wrong?
Post by: Zero on August 04, 2018, 09:23:43 am
While we're at it, squarebear, are there capabilities you'd like to see in AIML? Does AIML lack some features, in your opinion? I mean, it's a powerful tool, but is there still room for enhancement and evolution of AIML?
Title: Re: Wild ontology - where am I wrong?
Post by: squarebear on August 04, 2018, 01:34:21 pm
I'm more than happy with the features of AIML and have yet to find anything I need that I can't do in the language. The addition of sets and maps saves an awful lot of time writing categories, and now that there are rich media elements like buttons and quick replies, I see it as an ideal platform for building a chatbot.
Title: Re: Wild ontology - where am I wrong?
Post by: Zero on August 07, 2018, 01:51:55 pm
"But you can choose to ignore it, and pretend there's no index.", said Zero.

Yes, but not eliminate it, except in cases very small in size, most programmers agree.
No index makes it non-relational.  An index makes it relational (demonstrated below). 

Here is the second part of my PHP code sample:
Code
foreach($DB['married'] as $father=>$mother)
{
   echo $DB['fathers'][$father]. " married ".$DB['mothers'][$mother]."\n";
}

And, here is the program output:
Quote
Alex married Florence
Brian married Elizabeth
Carl married Diane
Alex married Diane

I was re-reading this answer, and thought it would need a little more attention. You've got a point: things should be relational. But I don't agree when you say "no index makes it non-relational". It can be (has to be?) relational without any index involved.

Actually, this is not how the brain works. We don't have indexes for the things we know. Instead, we identify and relate things not by their ID, but by their features. For example, one day I was busy thinking about something and wanted to reach for my Zippo lighter in my pocket, to light a cigarette. My hand almost automatically went to my pocket and happily came back with my car keys... well, they were small and made of metal, just like my Zippo, and that was enough to identify what I was looking for. Except it wasn't enough this time, but you get the idea.

To explain it in a developer's words, it would be like working with HTML/CSS using only class names, never IDs, to identify DOM nodes. You're allowed to use ".small.metal" (the small and metal classes), but not "#zippo" (the zippo ID). That doesn't mean you can't identify unique things. For example, the ".myMom" class is enough to identify the only person who's your mother, because she's the only node with this class. But it's definitely not an ID.

Another example. You could be talking about "John Smith", and your interlocutor would interrupt you, saying "wait, which 'John Smith' are you talking about?", and you would answer, "the one you met at our wedding". OK, let's play back the situation. You're referencing a node with classes ".John.Smith". But the interlocutor finds two nodes with these classes, so he asks for more precision. You have to add another class, hoping it will be enough: ".metAtWedding". Only one node has these three classes, ".John.Smith.metAtWedding". Then your interlocutor knows exactly who you're talking about.

As you can see, this is relational data manipulation, but without ever using indexes. It's rather bad news, because computers are so good at working with IDs, and so bad at working with selectors. But that's the price we have to pay, I think.

Now, I was proposing to use simple JSON to store a wild ontology. The good news is that objects are nestable: if we want to reference an entity from a component (like a component "brother" pointing to someone's brother), the value of the component doesn't have to be the ID of the entity we want to reference; instead, it can be a set of component/value pairs that forms a pattern to match against.

Oversimplified example: John Smith had a car accident.
Code
{
    whatHappened: "had a car accident",
    who: {
        firstname: "John",
        lastname: "Smith",
        howDoIKnowHim: "met at wedding"
    },
    when: "last week"
}

Notice how we don't need John Smith's ID. Instead, we just give a pattern (firstname John, lastname Smith) to select the entity we're talking about, just like you'd select a bunch of DOM nodes to apply some CSS.
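A minimal sketch of such a selector (JavaScript, assuming the knowledge base is just an array of plain objects):
Code
// Match entities by their components, the way a CSS selector matches
// DOM nodes by class, instead of looking anything up by ID.
function select(knowledge, pattern) {
    return knowledge.filter(function (entity) {
        return Object.keys(pattern).every(function (key) {
            return entity[key] === pattern[key];
        });
    });
}

var people = [
    { firstname: "John", lastname: "Smith", howDoIKnowHim: "met at wedding" },
    { firstname: "John", lastname: "Smith", howDoIKnowHim: "works with Mary" }
];

// Ambiguous: two entities match, so more components are needed...
select(people, { firstname: "John", lastname: "Smith" }).length; // 2

// ...and one extra component resolves it, still without any ID.
select(people, { firstname: "John", lastname: "Smith",
                 howDoIKnowHim: "met at wedding" }).length;      // 1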
Title: Re: Wild ontology - where am I wrong?
Post by: 8pla.net on August 08, 2018, 06:04:00 pm
In a large non-relational system, without unique indices, duplicates may creep in:

Code
[
    {
        "what": "Salary",
        "who": [
            {
                "first": "John",
                "last": "Doe",
                "paid": "1500.00"
            }
        ],
        "when": "Friday"
    },
    {
        "what": "Salary",
        "who": [
            {
                "first": "John",
                "last": "Doe",
                "paid": "1500.00"
            }
        ],
        "when": "Friday"
    }
]

Title: Re: Wild ontology - where am I wrong?
Post by: 8pla.net on August 08, 2018, 06:10:09 pm
Yet unique indices are machine-generated when the JSON gets decoded:
Code
array (
  0 =>
  array(
     'what' => 'Salary',
     'who' =>
    array (
      0 =>
      array(
         'first' => 'John',
         'last' => 'Doe',
         'paid' => '1500.00',
      ),
    ),
     'when' => 'Friday',
  ),
  1 =>
  array(
     'what' => 'Salary',
     'who' =>
    array (
      0 =>
      array(
         'first' => 'John',
         'last' => 'Doe',
         'paid' => '1500.00',
      ),
    ),
     'when' => 'Friday',
  ),
)

Now unique indices tell them apart.
Title: Re: Wild ontology - where am I wrong?
Post by: Zero on August 08, 2018, 09:43:27 pm
Well, it seems fair: duplicates happen in real life. It's all about the resolution, the precision of the data. In a very low resolution, say an 8x8 pic, one human really looks like another. They're duplicates. But if you get a better resolution, say 64x64, you start seeing little details that help you distinguish humans. The same goes here. "John Doe" and "John Doe" look like duplicates just because we don't have enough detail. If you add their ages to the dataset, for example, maybe you can distinguish them and tell who's who.
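For instance (the "born" field is invented, just to illustrate how one extra detail separates the two records):
Code
[
    { "first": "John", "last": "Doe", "born": 1961 },
    { "first": "John", "last": "Doe", "born": 1984 }
]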
Title: Re: Wild ontology - where am I wrong?
Post by: 8pla.net on August 09, 2018, 07:26:29 am
"duplicates happen in real life", Zero said.
 Yes they do, but they can cause problems
 such as the Y2K bug which could not tell
 the difference between years 2000 & 1900.
Title: Re: Wild ontology - where am I wrong?
Post by: spydaz on August 09, 2018, 10:40:49 am
Quote
No lag at all. Mitsuku usually processes and responds to any input in under half a second, even when dealing with hundreds of users at once. She has to search and process over 300,000 categories, and the responses are extremely quick.

This has also been a major problem: response time!

Is it a programming issue (refactor/simplify) or a data-searching issue?

How do you reduce the response time?

I constantly debate whether to move the processing and data to the cloud to get some grid computing power.