Will the real Suzette please stand up....

  • 79 Replies
  • 32664 Views
*

Art

  • At the end of the game, the King and Pawn go into the same box.
  • Trusty Member
  • **********************
  • Colossus
  • *
  • 5865
Will the real Suzette please stand up....
« on: May 02, 2009, 12:54:17 am »
OK...After checking some IP addresses and doing a little digging I came up with the following info (you can thank me later...):
Be careful what you wish for...you just might get it! ;)
***********************************************************************

http://www.chatterboxchallenge.com/bot_info.php?CBot=23  -  Suzette was there the whole time...DOH!!

http://www.techhui.com/profile/BruceWilcox - the creator of Suzette

http://www.gamasutra.com/view/feature/3761/beyond_aiml_chatbots_102.php - Interesting article about Bruce and why he's writing his own code instead of using AIML.

*********************************************************
Re: Add your chatbot in Bots Directory section.

Postby bruce wilcox » Thu Mar 12, 2009 4:41 pm
My chatbot name: Suzette

Chatbot type (Desktop/Web-based): Web-based

My name (Bot master(s)): Bruce Wilcox
http://jobs.gamasutra.com/resumes/?resu ... qQe0pvXawH

Country: USA

My chatbot homepage: http://66.150.245.139/chat/

Email contact: gowilcox@gmail.com

My chatbot description: Suzette is a Blue Mars Replicant (http://www.avatar-reality.com) . The personality transfer software had been recently upgraded, but, as it turned out, the upgrade had a few flaws. The result was an incomplete transfer of memories from the model host. This has led to emotional instability and at times a schism between the original host persona and the underlying basic personality matrix. The replicant had been originally intended for use as a terraform engineer, but to conceal its instability the Corporation repurposed her as a student working as a waitress in their Polynesian history museum. They figured no one would notice any issues.

My chatbot photo (96x96 pixel): attached

Attachments

    Suzette.jpg
        My chatbot photo
        Suzette.jpg (2.2 KiB) Viewed 75 times

bruce wilcox
     
    Posts: 3
    Joined: Thu Mar 12, 2009 4:34 pm

Re: Add your chatbot in Bots Directory section.

Postby Ehab » Thu Mar 12, 2009 5:19 pm
Thanks Bruce for your submission,

Your bot is located here:
http://www.chatterboxchallenge.com/bot_info.php?CBot=23

If you'd like to enter the CBC 2009 contest, please check the link below:
viewtopic.php?f=2&t=56
In the world of AI, it's the thought that counts!

*

Art

  • At the end of the game, the King and Pawn go into the same box.
  • Trusty Member
  • **********************
  • Colossus
  • *
  • 5865
Re: Will the real Suzette please stand up....
« Reply #1 on: May 02, 2009, 01:17:38 am »
Within one of the above referenced articles Bruce said:

 Reply by Bruce Wilcox on April 27, 2009 at 11:18am
    Suzette is a mashup of technologies, much more powerful than AIML. (which I critiqued in http://www.gamasutra.com/view/feature/3761/beyond_aiml_chatbots_102...). Suzette uses the Wordnet dictionary and ontology, combined with CMU linkparser, a strong pattern language, a runtime system that supports chat (which AIML does not), and a knowledge representation system for storing facts and inferencing.
******** END ***********

So it appears that a variety of methods are employed in Suzette's development, methods which several of us old-timers said would most likely be required in order to have a more effective bot.

It's nice to see someone took the lead and did so.

I always thought AIML was vastly overrated anyhow!
In the world of AI, it's the thought that counts!

*

bruce wilcox

  • Trusty Member
  • ***
  • Nomad
  • *
  • 72
Re: Will the real Suzette please stand up....
« Reply #2 on: May 02, 2009, 06:28:39 am »
Sorry that I appear to have been so mysterious. How did you ever run into Suzette if you didn't go through Chatterbox Challenge to get to her?

Thank you for all your kind words, and for your chat runs, which help me debug her further.

Anyway, she is not an AIML bot, though a year ago I data mined a bunch of "what is ..." AIML data to initially populate my bot, before I had linked in WordNet's definitions and written a bunch of original material. So there are AIML dregs of data left, converted into CHAT-L notation.

bruce wilcox

*

bruce wilcox

  • Trusty Member
  • ***
  • Nomad
  • *
  • 72
Re: Will the real Suzette please stand up....
« Reply #3 on: May 02, 2009, 08:22:06 am »
My article in Gamasutra expresses some of my reservations about AIML. PersonalityForge is much better than AIML, but I have a bunch of issues with it as well. To better understand CHAT-L's abilities, here is my critique of PersonalityForge....


The Personality Forge Engine has a better defined system than ALICE. It has much more extensive pattern matching, including matching phrases and sets of words. And it can test for the existence of a saved user variable (memory), the current time/date info, and current emotional state (-5 to 5).

They can match regular expressions of the input (though I feel the regular expression matcher is a waste of time). They provide a rank feature to help you order responses to be chosen, but mostly tell you not to use it because the engine will figure out what to do (so I think the rank feature is also a waste of time).

A big weakness is the absence of a NOT condition, a way to say something is not in the input at the same time as saying what should be matched in the input. Given these inputs: "I like you", "Do I like you", and "why do I like you", writing a pattern that reacts to "I like you" but not to the other two sentences is hard.
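
To illustrate the kind of exclusion test I mean, here is a rough Python sketch (my own illustration of the idea, not the Forge engine or CHAT-L):

# Match "I like you" while explicitly rejecting inputs that also contain "do" or "why".
def matches_i_like_you(sentence: str) -> bool:
    words = sentence.lower().strip("?!. ").split()
    required = ["i", "like", "you"]
    excluded = {"do", "why"}                  # the missing NOT condition
    return all(w in words for w in required) and not any(w in words for w in excluded)

assert matches_i_like_you("I like you")
assert not matches_i_like_you("Do I like you")
assert not matches_i_like_you("why do I like you")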

They also lack specification of question vs statement form, so telling if something occurred in a question or a statement may be non-trivial.

They have no boundary match conditions to prove a fragment is at the start or end of a sentence.

They cannot specify how many intervening words to stuff into a wildcard match. Combined with boundary match, one can reasonably write a pattern like:
s: ("except for" NEXT _2 NEXT >)  What's unusual about * ?
and expect the output to make sense. But without the ability to bound the wildcard to a small number, you can end up in trouble. E.g., "I liked everything about her, except for the fact that she smokes like a pig and I don't like smokers." An unbounded pattern would create a mess, whereas a match bounded to two words is very likely to stay fine.

They can test if a user-defined variable (memory) exists, but cannot test its value to see if it matches something (particularly for numeric values).

They do not do discontiguous matches, like s: (I AND YOU) which proves that both are in the sentence, but does not require any ordering or specific separation. Similarly they do not do partial ordering like THEN or SOON. They only have the equivalent of our NEXT.

While they have wildcards for wordnet parts of speech and ontological categorization, they do not have wildcards for proper names (e.g., malename, femalename, humanname, thingname) or for numbers.

They only have the interjections: yes, no, haha. They have an alternate mechanism for the interjections hello and goodbye. But they lack surprise, disgust, etc.

Their emotion is limited to a single variable. They do not discriminate emotions. For example, we would have a variable for current feeling toward you, another for our present feeling toward the topic we are discussing, and yet another for our general mood.

They support remapping of input by using GOTO and then you can reformulate the input.
This is not as elegant as a function with arguments which executes in place to do the same job. If you define a function that detects all ways of asking “what is your opinion about”, you can have them all gathered in one place then use the function scattered where needed. If you try to do a goto remap, you have to decide on the basic form you will use everywhere else (the one all remap to) and when you see that form in some area of your text you may not remember the things that remap to it. Or if you want to add a new remap, you have to go figure out where the remaps are.

Organizationally, they have one awful mass of stuff you process. They cannot subset it into topics of conversation readily. This makes adding new data a disaster. There is no support for topics of conversation, except story-teller mode, which makes it run through the default no-match event with the next corresponding statement. That is, they have a form of support for a single topic, which is the story. We can go linearly through a topic’s statements, yet bounce out of order as needed to reply to user queries in the topic.


*

Art

  • At the end of the game, the King and Pawn go into the same box.
  • Trusty Member
  • **********************
  • Colossus
  • *
  • 5865
Re: Will the real Suzette please stand up....
« Reply #4 on: May 02, 2009, 12:18:49 pm »
Thanks for the critique and explanation!
In the world of AI, it's the thought that counts!

*

Freddy

  • Administrator
  • **********************
  • Colossus
  • *
  • 6860
  • Mostly Harmless
Re: Will the real Suzette please stand up....
« Reply #5 on: May 02, 2009, 01:47:47 pm »
Wow, that's some good detective work, Art. This is an interesting development; I can see why she is ahead in the voting. Thanks for the extra detail, Bruce. Art found you, but how on earth did you find us?

But as for AIML, yes, I agree it was a nice start, but nothing is ever perfect.

*

bruce wilcox

  • Trusty Member
  • ***
  • Nomad
  • *
  • 72
Re: Will the real Suzette please stand up....
« Reply #6 on: May 02, 2009, 03:21:52 pm »
Same way Art found me, really. Google. When Art said he had posted on a UK AI site, I simply hunted for "suzette chatbot" and "new chatbot" and looked at the site names.

*

bruce wilcox

  • Trusty Member
  • ***
  • Nomad
  • *
  • 72
Re: Will the real Suzette please stand up....
« Reply #7 on: May 02, 2009, 03:30:16 pm »
Part of the reason she seems better, I figure, is that she is written with a theory of chat in mind. And the engine directly supports a theory of chat. From more of my early notes:

Chat is generally an exchange of words between two participants in which facts and opinions are traded. The exchange is important. One does not like to give out much information if one is not receiving some back, particularly personal information. Hence a common flow is for one side to volunteer something and, after perhaps optional follow-up or questions about it, the other side then returns with a similar bit of information. An equivalent flow is for one side to ask a question; the other answers it and either flips that question back or expects the originator to automatically provide the equivalent information.

Chat is predictable in that topics tend to last for a bit, going into deeper and deeper material after skimming off the common initial information. “Where do you live” can eventually lead to “What tourist attractions are there?”

Chat is interesting in that it is unpredictable and goes off on unexpected tangents. I say “I like onions” and you react with a rant about “The Onion”, some liberal writer’s magazine.
------
The following is a statistical analysis of Jabberwacky chat against humans. Nearly 30,000 sentences were analyzed, and I assume a word is used once per sentence instead of possibly showing up multiple times in a sentence. The input chat line itself may consist of one or more sentences at a time. If a sentence begins with an interjection, I consider that a separate sentence. So I actually fed in 26,378 user input lines, which became roughly 30,000 sentences.

Chat is heavily biased toward the pronouns you (9935) and I (8315), as those are the primary areas of interest. In fact, you was the most frequent of all real words. Sentences containing you, your, I, or my make up 2/3 of all chat.

The what (2114) question dominates the w-words, with intermediate use of how (863) and why (646) questions and low use of where (232) or when (166). Do (3832) and can (961) are common as questions or statements, as is speaking about what is liked or favorite (1107). Some flavor of yes (1977) or no (1218) accounts for 10% of all sentences.

The most common human-uttered single or composite word sentences, in order: some form of yes (937), some form of no (584), some form of goodbye (140), some form of happy expression (106), a standalone why question (101), some form of thanks (99), some form of funny (63), some form of apology (51), a standalone what question (48).
 
The most common words are: period (22496), the verb be (10440), I/me (10308), you (9935), question mark (7227), not (4617), do (3832), a (3487), comma (3396), to (3137), that (2880), the (2423), it (2134), what (2114), some form of yes (1977), exclamation mark (1873), the verb have (1604), know (1356), and (1276), some form of no (1218), of (1181), your (1108), like/favorite (1107), can (924), so (904), my (898). The period shows up so heavily because when I split interjections into a separate sentence, I add a period after it.

Negation (not, never, xxxn't) appears in 1/6 of all sentences. Of the nots, about 15% negate an adjective that could just be flipped to a positive form.
--------
Issues in Pattern Matching

Our chatbot works as an expert system, matching rules against the input information to find a response. The system has various collections of rules (called topics) that are executed to see if they match. Some topics execute all rules to find all matches. Most execute only until a match is found.  When considering pattern matching there are some fundamental issues that coexist in tension.

Randomness/Variety

First is the issue of randomness and variety. We try not to have the user see the same output or be able to predict the output for certain. To ensure little repetition, the system tracks the last 20 replies given by the program, and if a new reply completely matches one of those, it is blocked from being used for now. This becomes equivalent to the pattern failing, and the system will go find a new pattern to use.

To ensure randomness within common replies, one can use a feature that randomly picks a phrase from a collection of phrases. To ensure randomness within a topic, the system can be told to order the responders randomly before testing them. To ensure randomness across topics, the SQL queries that find topics matching keywords return the list of topics in a random order (subject to best match, however).
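
The bookkeeping behind this is small. A minimal Python sketch of the idea (my own illustration, not the actual engine):

import random
from collections import deque

class ReplySelector:
    def __init__(self, history_size=20):
        self.recent = deque(maxlen=history_size)   # the last 20 replies actually given

    def choose(self, candidates, randomize=False):
        if randomize:                              # a topic may ask to test its responders in random order
            candidates = random.sample(candidates, len(candidates))
        for reply in candidates:
            if reply in self.recent:               # exact repeat of a recent reply: treat the rule as failed
                continue
            self.recent.append(reply)
            return reply
        return None                                # nothing usable; caller falls through to another rule or topic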

Variety extends beyond randomness. You want to vary your sentence structures and lengths as well, maybe sometimes answering in full sentences and sometimes in fragments. The system classifies responses based on number of words, and you can explicitly request short, medium or long responses. This affects the quibbling area only, because that generally has multiple ways of quibbling within a single rule.

Priority

Second is the issue of priority. You want user-known information to have priority over generally known information, which in turn should have priority over mere stalling or quibbling chat. And specific replies should have priority over general ones. The current topic should have priority over other topics, so if a question or statement has a response within the current topic, there is no reason to change to some other topic that might also answer it. On the other hand, if the current topic does not have a specific answer and some other topic does, it makes sense to change topics rather than stay in the current one and make a gambit response or quibble.

To control priority, some other chat languages allow you to assign a priority number to a response. This is overkill and hard to manage. Instead, the standard topic processes responders in order, so you just place them in the priority order you want (usually most specific matching first and more general matching last). Similarly, gambits often have a flow that tells a story, and should not normally be scrambled. They can execute out of order only if needed to answer a statement or question of the user. And, unless the topic is marked otherwise, once the system jumps to a later gambit it will continue dishing out gambits from there, continuing that part of the story. Only when it reaches the end of the gambits will it return to the earlier ones to use those up as well.

We prioritize when choosing which words of a sentence to match against topics to find a match. Sequences of words have priority over single words (so “racial discrimination” has priority over any individual word in the sentence). Fundamental nouns (subject and direct object) have priority over verbs, which have priority over all other words. We also prioritize topics that have more keywords in the input than others.

Priority is also managed by the collections being processed in a priority order. These include the current topic, a special system topic, other user topics, the general quibble topics, etc. Within a topic, one can request that a subtopic take control if it can match. So one can group collections of patterns within a topic by a common matching characteristic and then execute the subtopic. This is used, for example, in the quibble system topic. If you use the word not or never, this tends to drastically alter the intent of a sentence, so a special negative subtopic is matched FIRST. Only if it finds no match will the quibble topic move on to other quibble choices.
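
As a rough sketch of that processing order (plain Python, purely illustrative; the classes and names are mine, not the engine's):

class SimpleTopic:
    def __init__(self, name, responders):
        self.name = name
        self.responders = responders               # (predicate, reply) pairs, already in priority order

    def try_match(self, words):
        for predicate, reply in self.responders:
            if predicate(words):
                return reply
        return None

def respond(sentence, current_topic, system_topic, other_topics, quibble_topics):
    words = set(sentence.lower().split())
    # collections are consulted in priority order: current topic, system topic, other topics, quibbles
    for topic in [current_topic, system_topic, *other_topics, *quibble_topics]:
        reply = topic.try_match(words)
        if reply is not None:
            return reply
    return "Tell me more."                         # last-resort stall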

Reuse

Third is the issue of reuse. Replies to specific statements or questions are information that might also be spontaneously volunteered as gambits when within a topic. To support this, patterns can have labels attached to them, and other patterns can direct that their output reuse a label, meaning do the output of whatever the labelled pattern would do. Therefore most statement or question responders execute a reuse on some gambit within the topic, making it possible to tell the user as much as possible, while replying as focused as possible if the user asks a question or makes a statement.

But in a normal conversation, once I have told you something, I am not expecting to tell it to you again. If I tell you I am a writer, I expect you to remember and I shouldn't volunteer that again. (Correspondingly, you shouldn't ask me what I do again.) This is handled in normal user topics by actually erasing the pattern after it gets used and saving that topic's state within the specific user's datafile (so each user has a different state of the system based on their chat so far). If the label target of a reuse no longer exists, then the question or statement cannot respond and a different pattern gets matched instead. So if the system has answered what do you do, it cannot accidentally volunteer the same answer later.
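
In sketch form (Python again, a toy illustration of reuse-by-label and erase-after-use, not the real code):

class TopicState:
    def __init__(self, gambits):
        self.gambits = dict(gambits)          # label -> output text; saved per user between sessions

    def reuse(self, label):
        # output what the labelled gambit would say, then erase it so it is never volunteered again
        return self.gambits.pop(label, None)  # None means the target was already used; match something else

job = TopicState({"MY_JOB": "I am a writer."})
print(job.reuse("MY_JOB"))   # answers "what do you do?" the first time
print(job.reuse("MY_JOB"))   # None -> already told you, so a different pattern must handle it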

In effect, chatting with someone is a self-extinguishing process. Without new data coming into the system, eventually you will have talked about everything you know and have nothing new to say to someone. This happens a lot in real life, though there is often new data coming in. In the system, however, there is rarely new data coming in. Should you use up all of its data, it will reset itself and start back at the beginning. The overall system does allow you to add or revise topics. It will continue to work fine with the existing state of all users after you restart the server (a restart is required to allow new topics, but not if you are merely modifying existing topics).

Specificity

Fourth is the issue of pattern specificity. If you write your pattern to exactly match an input, it will be triggered correctly, but all sorts of closely related input will fail. E.g., if your pattern is "What is my name", then it would fail on What is my given name? or What in heck is my name? At the other extreme, if your pattern is "name", then it would match all forms of questions involving name, but also match things like "Name an animal you like." The system does not manage this issue directly, but instead allows you a range of choices in how you express the pattern you want. The keywords AND, THEN, SOON, NEXT control the ordering of words being looked for. If you literally want a sequence of words, you would use NEXT (a quoted expression like "what is my name" actually decodes automatically to that sequence of words separated by NEXTs). Except for special idioms which have this behavior, you are better off using THEN or SOON, which allow other words to intrude without impact. THEN allows any number of intervening words and SOON allows up to two. If your pattern is WHAT THEN IS THEN MY THEN NAME, you have named the essence of a larger set of sentences. In fact, you could probably go WHAT THEN MY THEN NAME safely. It's like trying to guess the minimum number of words you need to understand, over a distorting phone, to know what is being asked.
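
To pin down the gap rules, here is a small Python sketch of just the matching semantics (NEXT = adjacent, SOON = up to two intervening words, THEN = any number); this is my own restatement, not CHAT-L itself:

# Sketch of the gap semantics for a keyword sequence against a tokenized sentence.
def match_sequence(words, keywords, max_gap):
    """True if keywords occur in order with at most max_gap words between consecutive ones (None = unlimited)."""
    pos = -1
    for kw in keywords:
        window_end = len(words) if max_gap is None else min(len(words), pos + max_gap + 2)
        for i in range(pos + 1, window_end):
            if words[i] == kw:
                pos = i
                break
        else:
            return False
    return True

sentence = "what in heck is my name".split()
print(match_sequence(sentence, ["what", "is", "my", "name"], max_gap=0))   # NEXT: fails ("in heck" intervenes)
print(match_sequence(sentence, ["what", "is", "my", "name"], max_gap=2))   # SOON: succeeds
print(match_sequence(sentence, ["what", "my", "name"], max_gap=None))      # THEN: succeeds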

The minimum number of words you need often depends on context. If you are talking about favorite things, getting the input Movie? is enough to know you are being asked "what is your favorite movie". There are pattern features that allow you to make the pattern contingent upon being within the current topic or also matching keywords of the topic. So sentences that reference the topic can get you into the topic, and once the topic becomes current, you don't need to use those keywords to keep matching other patterns.

Patterns in continuations have similar issues but can usually be less specific. If you have made a statement like "I like this book a lot", the pattern looking at the user's response might well just be why, because it's the obvious question and it doesn't matter whether it's why do you like it or why is it so interesting. The odds the user will say why do you go swimming on Tuesdays are extremely low in a continuation context. And even if they did, they wouldn't find it amiss if you ignored their input and answered as though they had asked why you liked it. However, you have to be careful not to be as general as using a single wildcard "*" that matches anything. The problem is that if a user wants to change topics by asking some clearly different question, the system will ignore him and pretend he asked for the continuation. Using the pattern "what" is better (if the question would logically be what), except that what is a common question in a lot of topics. You may not be able to win here, and again it isn't fatal if the system just plows ahead in place.

Orthogonality

It is important to be able to add new data without worrying about how it affects existing data. The system strongly supports separation of data into topics such that interactions are minimal. If two topics have overlapping responders and either could handle the input, it doesn’t matter. If you are not in those topics, one of them will get chosen and handle the input and steer the conversation that way. If you are already in one of the topics, inertia will keep you there and the answer will come from that topic.

It could be that the less congruent topic gets chosen to start with. If the sentence clearly belongs more to one topic, you should decide how you know that. Topic picking comes from finding which topic has more keywords or keyword sequences covering the sentence. Adding keyword phrases to a topic will help focus which gets chosen (phrases are worth more than the sum of their individual words). For example, the topic baseball may have (baseball player umpire coach ball bat) as keywords and the topic baseball_job may have (baseball player coach umpire). This will not route I am a professional baseball player to the more significant topic of the job. But adding to that topic's keywords the word professional or the phrase "professional baseball" would route this toward baseball_job without distorting the keyword field with irrelevant words.
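
A toy Python sketch of that scoring idea (the weights are made up for illustration; the real engine uses its own SQL-backed weighting):

# Illustrative topic scoring: phrases count for more than the sum of their individual words.
TOPIC_KEYWORDS = {
    "baseball": ["baseball", "player", "umpire", "coach", "ball", "bat"],
    "baseball_job": ["baseball", "player", "coach", "umpire", "professional", "professional baseball"],
}

def score_topic(sentence, keywords):
    text = " " + sentence.lower() + " "
    score = 0
    for kw in keywords:
        if " " + kw + " " in text:
            score += 3 if " " in kw else 1     # arbitrary bonus for a multi-word phrase
    return score

sentence = "I am a professional baseball player"
best = max(TOPIC_KEYWORDS, key=lambda t: score_topic(sentence, TOPIC_KEYWORDS[t]))
print(best)   # baseball_job wins because "professional" and "professional baseball" also match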

Within a topic, if you want to ensure two overlapping responders go to different places, make sure one includes some important word and the other says explicitly that it does not include it. For example, what is your favorite show could be coded as (WHAT AND FAVORITE AND SHOW) and what is your favorite kind of show as (WHAT AND FAVORITE AND KIND AND SHOW). But they collide, because both can match the latter question. You could reformulate the former as (WHAT AND FAVORITE AND SHOW AND !kind) to ensure they don't collide.
-----------------------------------

A topic begins with all of its t-lines (topic or gambit lines) first (as well as any continuation lines attached to them). They tell the story of the topic. The story should usually alternate between asking questions of the user and volunteering corresponding data, probably in that order. Only asking questions gets boring quickly and the user feels cheated. Only giving information deprives the user of the sense that you are interested in them.

The order of asking then volunteering is important. If you volunteer information first (e.g., I own a dog), the user may ask a follow-up question (which a responder or continuation line could manage) or may volunteer some related information (e.g., I've heard there are over 300 breeds of dogs). Asking a question helps force the user to a narrower range of responses. Then, when you volunteer your corresponding data, it seems like a fair exchange has happened and that continuity is being maintained.

When you write a topic sentence you should generally make it a stand-alone complete sentence. Sentence fragments and brief word answers are fine in continuations and responders because the user has the context of his triggering sentence. A topic sentence does not. It may or may not actually come immediately after its preceding topic sentence. The user may drag the conversation off to a completely different topic for ages, and then when that is done, the system may pop back to this topic and issue the current topic sentence. Any prior context of this topic has long since been forgotten.

*

wgb14

  • Trusty Member
  • ****
  • Electric Dreamer
  • *
  • 143
Re: Will the real Suzette please stand up....
« Reply #8 on: May 02, 2009, 06:24:48 pm »
Yes, the bot is impressive, but:

1) Do you plan to release its code as open source?

2) Or at least, do you plan to give developers some kind of API and the ability to input their own knowledge into the bot?
I will never understand why people who come up with good ideas refuse to share their work with others. The only reason AIML became king of chatterbots is that it is open source. This is the third good bot (along with Jeeny and Jabberwacky) that I have seen, but as long as the technology remains hidden there is no way to see real applications. Chatting is a "silly" domain that will never become of any practical use. In chatting, a bot is impressive if, even with non-relevant output, it is able to keep the topic (e.g., we talk about the internet and the bot asks "are you a nerd?"). However, in real applications non-relevant discourse is not acceptable to any user. Hence, I will remain sceptical about the effectiveness of these NLP technologies until I am able to test them in a real context.

Virtual human factory is an excellent example of a real application. Their system has been tested in a realistic context (i.e., training doctor-patient communication skills) with very good accuracy. I think that their technology is currently the best NLP technology out there for constrained domains.


           

*

wgb14

  • Trusty Member
  • ****
  • Electric Dreamer
  • *
  • 143
Re: Will the real Suzette please stand up....
« Reply #9 on: May 02, 2009, 10:01:42 pm »
Just read the article Beyond AIML... it's an amazing piece of work. You seem to have solved many of the problems that pattern-matching technology has, but may I ask you some questions?

1) I understand the use of a grammatical parser to distinguish questions from statements, but I am not sure about its other uses. Why would you want to know the syntax of a sentence to match it with a pattern? Unless you employ an automatic grammar corrector (e.g., http://www.wintertree-software.com/dev/wgrammar/faq.html) to automatically correct the user's input, a grammatical parser is not of much use (very few people use proper grammar when they speak or write).

2) I read somewhere above that you are using an inferencing system. The bot doesn't respond at all to any "is a" and "what/who is" sentences. It appears that it doesn't have any ability to reason at all. I think the most significant contribution of your bot is its strong pattern-matching mechanism, but as I said above, that remains to be tested (for its effectiveness) in a real application.

Overall it is impressive work, and I would like to examine it more closely. Is there any way you can share some of your work with us?

*

bruce wilcox

  • Trusty Member
  • ***
  • Nomad
  • *
  • 72
Re: Will the real Suzette please stand up....
« Reply #10 on: May 03, 2009, 01:28:33 am »
Let's see. A number of questions were asked.

1) first and foremost, Suzette is a work-for-hire of Avatar Reality. So I have no control over releasing her, or the full technical specs or whatever.

2) She has the underpinnings of inferencing, but I'm just starting to put in the appropriate data now. The engine provides capabilities. The CHAT-L script controls the engine and is executed by the engine, including controlling inferencing. If I wanted to have Suzette pretend to take drugs, get reallllly spaced out, and then slowly recover, that would all be done from script.

3) The link parser is part of what determines whether something is a question or not. Script also does, in part because I treat some commands as questions (like tell me about yourself). The link parser allows patterns which can grab the subject, verb, object, etc. Partly for fact acquisition, partly for other things. For example, if the user says "I often ingest fine Danish chocolate", the parser can manage that. The script can use a pattern to grab the verb ingest, parlay that into a meta-verb which covers all manner of eating and drinking and inhaling, and then match a pattern about you metaingesting something and react to that. The system cannot depend on the parser, but it is often available with valid data, so it can make use of it.
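
To make the ingest example concrete, here is a toy sketch in Python (the meta-verb class and its members are my guesses, purely illustrative, not the shipped data):

# Toy mapping of a parsed verb to a meta-verb class.
METAVERBS = {
    "ingest": {"eat", "drink", "ingest", "inhale", "consume", "swallow"},
}

def metaverb_of(verb):
    for meta, members in METAVERBS.items():
        if verb in members:
            return "meta" + meta
    return None

# Pretend the link parser already told us the verb of "I often ingest fine Danish chocolate".
parsed = {"subject": "I", "verb": "ingest", "object": "chocolate"}
if metaverb_of(parsed["verb"]) == "metaingest":
    print("So, do you eat chocolate often?")   # a rule keyed on 'you metaingest something' fires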

4) I can't say about OUTSIDE of Blue Mars, but Avatar Reality intends that the chatbot can substitute for the user when the user is offline, having learned a lot about the user by analyzing his chat. And that the user can enhance that by writing their own topics of data in areas of particular interest to them.

Since this technology is just coming online (so to speak), Avatar Reality does not have a clear grasp of everything they might do with it and how they will handle it. Yet.

*

wgb14

  • Trusty Member
  • ****
  • Electric Dreamer
  • *
  • 143
Re: Will the real Suzette please stand up....
« Reply #11 on: May 03, 2009, 02:30:23 am »
Hi Bruce

Thanks for answering my questions. Here are a couple more (sorry, I am just too curious about your work):

1) first and foremost, Suzette is a work-for-hire of Avatar Reality. So I have no control over releasing her, or the full technical specs or whatever.

I understand... but I guess we can ask some questions about specific parts of your work, right? I am mostly interested in the use of the parser to improve pattern matching.

2) <<The script can use a pattern to grab the verb ingest, parlay that into a meta verb which covers all manner of eating and drinking and inhaling, and then match to a pattern about you metaingesting something and react to that. The system cannot depend on the parser, but it is often available with valid data so it can make use of it.>>

I get the general idea but not the exact details... you are using the parser as an extra component for more accurate pattern matching, right? If yes, can you be more specific about the details? The CMU parser is quite effective in parsing almost every sentence given to it. What do you do with the syntactic information afterwards? For example, how will the system handle the sentence "I walk almost every day"? The parser will output something like this: "I.p walk.v almost every.d day.t." How do you use this syntactic information afterwards? What kind of pattern will be required to produce the output "Great, me too", and how does knowing the verb (or the other parts of speech) help the pattern-matching process?

3) In your paper you say: "there are %words that have special parse or dictionary meanings (like %subject, %noun, or %tense=past)". Does this have something to do with the syntactic output from the parser? Can you provide a specific example of the use of these words, like you did with synonyms in your paper?
« Last Edit: May 03, 2009, 02:42:02 am by wgb14 »

*

bruce wilcox

  • Trusty Member
  • ***
  • Nomad
  • *
  • 72
Re: Will the real Suzette please stand up....
« Reply #12 on: May 03, 2009, 05:00:11 am »
Regarding use of the link parser, let's start with this. One of the first things the system does is taxonomize your input. It may never understand what you are saying, but it tries to know ROUGHLY what you are trying to express. For example, the user types in "I am the most studious kind of lonely you can imagine." This is you saying some fact about yourself. Or in pattern terms:
     u: YOU_FACT (!? < * |subject=you < * |verb ) ^set(%intent '$youfact ) ^endtopic()
So if the system can't find ANYTHING else to do with your input, it can respond generically. For example, it could say "I didn't know that about you.", which is appropriate for just about any YOU_FACT.
Or take the topic of the game of Go. Go is a common word, very hard to pattern match around. So to know if the user is talking about Go, we want it to be the subject or object, not just some keyword in the sentence.
    u: ( [|subject=go |directobject=go]) ^gambit()
Or take when you insult Suzette with a sentence like "I despise you". We react to insults by altering the relationship between you and Suzette, so we don't really want to do that mistakenly. So that must be checked by parse, not by keyword.
s: (|subject=you *~ |verb=~~badness *~ |directobject=I )

As for your sentence "I walk almost every day", the system can use the link parser output to swallow that as a fact about you - the fact being "you walk" and a time expression "almost every day". So if you asked "when do I walk", the system could find that answer. (Not saying it does at present, just that the technology is there.)
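
A rough Python sketch of the "swallow it as a fact" step (the parse is hand-fed here; the real system would get subject/verb from the link parser):

# Store "you walk" plus a time modifier, then answer "when do I walk".
facts = []   # list of (subject, verb, modifier) triples

def learn(subject, verb, modifier=None):
    facts.append((subject, verb, modifier))

def answer_when(subject, verb):
    for s, v, when in facts:
        if s == subject and v == verb and when:
            return when
    return "I don't know."

learn("you", "walk", "almost every day")   # from "I walk almost every day"
print(answer_when("you", "walk"))          # -> "almost every day"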


*

bruce wilcox

  • Trusty Member
  • ***
  • Nomad
  • *
  • 72
Re: Will the real Suzette please stand up....
« Reply #13 on: May 03, 2009, 01:40:41 pm »
In the prior Suzette topic, One wrote:

I do believe this is the first time I have had a machine actually come out and tell me directly that "I love you". I wonder how commonly this occurs; there must be some kind of condition met, but I did not indicate first, with the exception of some 'small talk' I think I saw.

I respond:

On the other hand, another of your crowd got her to say she hated him. He was badgering her (trying to find out what inferencing she could do).
Neither is all that common; yes, there are conditions, and no, I'm not saying what they are.

Also, re the attempts to see what Suzette could do, the following was input:

1. giannis is a person from greece
2. who is giannis?

A reasonable thing to want to do. However, the system didn't have the ability to accept new words. Giannis is not in its dictionary (nor does Google find much of this as a name). I have modified the system to try to allow you to create new proper names, as this would have required. But...

the system runs an aggressive spell check. Giannis does not survive intact. It gets changed to Glennis, which the system does recognize.

Also, I hadn't added patterns to acquire facts about proper names. I had plural nouns and determined nouns, but not proper names. Fixed.

3. monemvasia is a city in Greece
4. where is Monemvasia?

Same comments apply. Though really, wouldn't it be NICE if you capitalized your proper names consistently? Not that a chatbot gets to have that given to it. It still won't handle this properly without a bit more scripting, but I can't get to it right now.

5. maria is giannis wife
6. who is maria

The parser does not accept 5 (since you left off the possessive apostrophe). So Suzette cannot build a fact, and thus cannot answer 6. If one does put an apostrophe, then the system now handles this, since it now builds facts about proper names.



*

Art

  • At the end of the game, the King and Pawn go into the same box.
  • Trusty Member
  • **********************
  • Colossus
  • *
  • 5865
Re: Will the real Suzette please stand up....
« Reply #14 on: May 03, 2009, 02:47:49 pm »
Bruce,

When I've chatted with Suzette, "she" indicated that she was from Calais, France, yet when I visited the Hui site, several members mentioned that Suzette informed them she was from Hawaii.

Did a mind merge occur, or were there different versions of her (or possibly more random places for her to choose from)?

I can also relate to what wgb14 is alluding to. It seems that every time we hear of a new chatbot, then locate it for testing, chatting or only see a demo of it in action, we feel somewhat deprived. We are the aficionados, mavens, fans of chatbots, yet we're always being told, "Here it is and here's what it does. You just can't have it." Kind of frustrating.

There's Jabberwacky and wocky, Jeeney, Silvia (seems like a major leap forward) and now Suzette.

If Suzette is entered in the chatterbox competition, who entered it? You or Avatar Reality?

If Avatar Reality OWNS Suzette, what's to keep you from creating another chatbot and offering it for sale, complete with SDK, tutorial, brain files, etc.?

How about the ability to control outside devices like X-10, Insteon, Z-Wave, etc.? Voice command?

There are a lot of users (chatbot and AI enthusiasts) who would love to have a more intelligent bot, whether connected or stand-alone (I prefer the latter but realize that knowledge-gathering abilities would be limited). Some users would like to connect the chatbot to their home as a "vocal control" entity, or perhaps as a conversationalist, companion or friend for the elderly or shut-in. How about a learning companion for young children... a virtual tutor?

I'm barely touching the tip of the iceberg with possibilities, but as with most, I'm sure it's all about money. I don't mean to be presumptuous in your case, and please don't take offense, but in the majority of cases it seems that way. Most people wouldn't mind paying a reasonable price for a decent chatbot, and some do. When you consider the standard video game is in excess of $50.00, it would be a potential money maker as well. Something for you to digest. Unless of course you are bound by some "My hands are tied while I'm a programmer for XYZ company and they own everything I create" clause or agreement.

Your thoughts please....
In the world of AI, it's the thought that counts!

 

