Pattern based NLP

*

MikeB

  • Bumblebee
  • **
  • 29
Pattern based NLP
« on: May 24, 2020, 12:16:50 pm »
This is a project I've been working on for a few years. In 2019 I was testing out the theory on Pandorabots, and this year I'm converting it to plain C.

The main goals are to be as small and as fast as possible, solid-state (no algorithms, learning, or knowledge calculation), and multi-language.

It converts each pattern-matched word into a symbol (token), and the resulting symbolic sentence is then pattern-matched again. Each matched sentence is assigned an intention/topic/perspective. The chatbot code uses the intention/topic/perspective to select one or more fixed responses (with randomisation).

Everything is staged pattern matching: the spell check, the word-to-symbol tokeniser, the sentence pickups, and the chatbot responses.
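The stages described above can be sketched roughly as follows. This is only an illustration under my own assumptions, not the project's actual code: the symbol names, the tiny tables, and the functions `word_to_symbol`, `classify` and `respond` are all hypothetical, and the spell-check stage and randomisation are left out. Input is assumed to be lower-case and space-separated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical symbols: each known word compresses to one of these. */
typedef enum { SYM_UNKNOWN, SYM_GREET, SYM_ASK, SYM_STATE, SYM_YOU } Symbol;

/* Hypothetical intentions assigned to a whole symbolic sentence. */
typedef enum { INT_NONE, INT_GREETING, INT_ASK_WELLBEING } Intention;

typedef struct { const char *word; Symbol sym; } WordEntry;
typedef struct { Symbol pattern[4]; size_t len; Intention intent; } SentenceEntry;

/* Stage 1 table: individual words -> symbols (several words share one symbol). */
static const WordEntry WORDS[] = {
    {"hello", SYM_GREET}, {"hi", SYM_GREET},
    {"how", SYM_ASK},
    {"are", SYM_STATE},
    {"you", SYM_YOU},
};

/* Stage 2 table: symbolic sentences -> intention. */
static const SentenceEntry SENTENCES[] = {
    {{SYM_GREET}, 1, INT_GREETING},
    {{SYM_ASK, SYM_STATE, SYM_YOU}, 3, INT_ASK_WELLBEING},
};

/* Stage 3 table: one fixed response per intention, indexed by Intention. */
static const char *const RESPONSES[] = {
    "interesting! i don't know",
    "greetings",
    "I'm very good, and how are you?",
};

/* Words not in the table (names, places, typos) all collapse to SYM_UNKNOWN. */
static Symbol word_to_symbol(const char *word)
{
    for (size_t i = 0; i < sizeof WORDS / sizeof WORDS[0]; ++i)
        if (strcmp(WORDS[i].word, word) == 0)
            return WORDS[i].sym;
    return SYM_UNKNOWN;
}

/* Exact match of the whole symbol sequence against the sentence table. */
static Intention classify(const Symbol *syms, size_t n)
{
    for (size_t i = 0; i < sizeof SENTENCES / sizeof SENTENCES[0]; ++i) {
        const SentenceEntry *s = &SENTENCES[i];
        if (s->len == n && memcmp(s->pattern, syms, n * sizeof *syms) == 0)
            return s->intent;
    }
    return INT_NONE;
}

/* Runs the stages in order: tokenise, symbolise, classify, look up response. */
const char *respond(const char *input)
{
    char buf[128];
    Symbol syms[16];
    size_t n = 0;

    strncpy(buf, input, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    for (char *tok = strtok(buf, " "); tok != NULL && n < 16; tok = strtok(NULL, " "))
        syms[n++] = word_to_symbol(tok);
    return RESPONSES[classify(syms, n)];
}
```

With these tables, `respond("hi")` and `respond("hello")` both return the same fixed phrase, because both words compress to the same symbol before the sentence is matched.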

The size and speed on Pandorabots (1000 individual words with sentences) is ~500 KB with a ~1 second response time. In plain C (on an Arm Cortex-M4 @ 120 MHz) it's ~159 KB and 15-100 milliseconds. After spell check and tokenisation, all actions generally take less than 1 ms.

Key features
-Less than 500 KB including all word databases.
-Millisecond response times.
-A maximum of 100 fixed chatbot responses makes it easy to voice record and/or change personality.
-Private information automatically stripped during word compression (names of places and things).
-Native differentiation of Wondering, Questions, and Directions: "can you speak english", "do you speak english", "speak english".
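As a rough illustration of the last point, here is one way the three forms could be told apart by looking only at the first word. This is a sketch under my own assumptions, not the project's method: the `Mood` enum, the two word lists and `sentence_mood` are hypothetical, and the input is assumed to be lower-case.

```c
#include <string.h>

typedef enum { MOOD_WONDERING, MOOD_QUESTION, MOOD_DIRECTION } Mood;

/* Returns 1 if the first `len` characters of `word` equal an entry in `list`. */
static int in_list(const char *word, size_t len,
                   const char *const *list, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        if (strlen(list[i]) == len && strncmp(list[i], word, len) == 0)
            return 1;
    return 0;
}

/* Classifies a sentence by its first word only (hypothetical word lists). */
Mood sentence_mood(const char *input)
{
    static const char *const MODALS[] = {"can", "could", "would", "will"};
    static const char *const ASKERS[] = {"do", "does", "is", "are",
                                         "what", "who", "where", "how"};
    size_t len = strcspn(input, " ");   /* length of the first word */

    if (in_list(input, len, MODALS, sizeof MODALS / sizeof MODALS[0]))
        return MOOD_WONDERING;          /* "can you speak english" */
    if (in_list(input, len, ASKERS, sizeof ASKERS / sizeof ASKERS[0]))
        return MOOD_QUESTION;           /* "do you speak english"  */
    return MOOD_DIRECTION;              /* "speak english"         */
}
```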

The intention pickup lets you write a few varieties of general (non-specific) chatbot response with confidence, without having to look at the backend.

There is some short-term memory for handling puzzles ("If I did this, then what is this?"). There's no knowledge reflection, but it can still 1) tell it's a question, 2) scan for the topic, and 3) count logical words as opposed to emotional words, and stay relevant that way.
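Step 3 above could look something like the sketch below. The word lists, the `ToneCount` struct and `count_tone` are hypothetical names made up for this example, and the input is assumed to be lower-case.

```c
#include <string.h>

typedef struct { int logical; int emotional; } ToneCount;

/* Returns 1 if `word` is an exact entry in `list`. */
static int contains(const char *const *list, size_t count, const char *word)
{
    for (size_t i = 0; i < count; ++i)
        if (strcmp(list[i], word) == 0)
            return 1;
    return 0;
}

/* Counts logical versus emotional words using two small fixed lists. */
ToneCount count_tone(const char *input)
{
    static const char *const LOGICAL[] = {"if", "then", "because", "therefore"};
    static const char *const EMOTIONAL[] = {"love", "hate", "happy", "sad"};
    static const char DELIMS[] = " ,.?!";
    ToneCount tone = {0, 0};
    char buf[256];

    strncpy(buf, input, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    for (char *tok = strtok(buf, DELIMS); tok != NULL; tok = strtok(NULL, DELIMS)) {
        if (contains(LOGICAL, sizeof LOGICAL / sizeof LOGICAL[0], tok))
            ++tone.logical;
        else if (contains(EMOTIONAL, sizeof EMOTIONAL / sizeof EMOTIONAL[0], tok))
            ++tone.emotional;
    }
    return tone;
}
```

A response could then lean towards a "logic puzzle" style phrase when `logical` is higher than `emotional`, without the bot having to understand the puzzle itself.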

So this solves the following problems with chatbots in games:
-Too much data or processing power required.
-Cannot change the personality/no personality.
-Cannot change the language/only one language.
-Cannot acknowledge the user/escapes the topic.
-Wrong views or bad knowledge calculation.
-Cannot record audio/terrible voice synthesis.
-Not being white box/solid state/predictable.
« Last Edit: September 21, 2020, 06:08:01 am by MikeB »

*

MikeB

  • Bumblebee
  • **
  • 29
Re: Pattern based NLP for games
« Reply #1 on: May 24, 2020, 12:48:43 pm »
I will be working on it throughout this year and aim to make it into an API at least by the end of the year...

So far I have tested the Plain C version on the 2017/2018 Loebner prize questions... it's cheating but it's a good spread of questions... and all the words in each question can be changed for slightly different words and still return the same response (randomised differently).. and if I run the exact same test again the responses would change to their random alternate versions...
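The "random alternate versions" idea can be sketched like this, assuming each intention owns a small set of interchangeable fixed phrases. The `ResponseSet` struct, the `NOT_SURE` grouping and `pick_variant` are hypothetical; which phrases actually belong to the same set in the real bot is a guess on my part.

```c
#include <stdlib.h>

#define MAX_VARIANTS 3

/* A fixed response with its interchangeable alternates. */
typedef struct {
    const char *variants[MAX_VARIANTS];
    int count;
} ResponseSet;

/* Hypothetical grouping of two phrases that look like alternates. */
static const ResponseSet NOT_SURE = {
    {"no, not sure about this", "nay, not sure about this"}, 2
};

/* Picks one alternate at random; call srand() once at start-up
   if the sequence should differ between runs. */
const char *pick_variant(const ResponseSet *set)
{
    return set->variants[rand() % set->count];
}
```

Because the choice is made after the intention is picked, swapping words in the question changes nothing about which set is used, only (possibly) which alternate comes out.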

Quote
Loebner 2018.
   Good afternoon.   - "greetings"
   Do you consent to having this conversation recorded? - "for me, not a problem"
   Have we met before? - "cannot say, good to meet you"
   Do you know how to make toast? - "no, not sure about this" [toast]
   Which languages can you use? - "many many, english, espanol, italiano... what about you?" [only english]
   How many letters are in the word ‘abracadabra’? - "who am I? Pietro Bembo?"
   Where should one look for love? - "interesting! i don't know" [doesn't know "love"]
   How are you feeling right now? - "I'm very good, and how are you?"
   Who said “I have a dream”? - "no no quotes"
   How old are you? - "thousands... cannot say"
   When might I need to know how many times a wheel has rotated? - "who am I? Fibonacci?"
   What will you do later today? - "just came back from surfing, may go out again!"
   What is the third angle in a triangle with internal angles of 90 degrees and 30 degrees? - "no no maths"
   Who is your favourite artist? - "interesting! many many..."
   What do you hold when you shake hands? - "words are not my profession" [hands]
   Do you have any legs? - "no, words elude me" [legs]
   Can you recommend me a film? - "yes, thousands..." [doesn't know "film"]
   Do you understand Winograd Schemas? - "nay, confusing words" [doesn't know winograd schemas]
   If a chicken roosts with a fox they may be eaten. What may be eaten? - "OK. from memory... somebody..."
   I had to go to the toilet during the film because it was too long. What was too long? - "alright. alright. from memory... that thing..."

Loebner 2017.
   Hello, my name is Andrew. What’s your name? - "hello. good to meet you, how are you? my name is Michelangelo"
   Why don’t you tell me a little more about yourself? - "great, my name is Michelangelo, and i am NOT a ninja turtle!"
   I am a researcher in Artificial Intelligence at Goldsmiths University, do you know what that is? - "great, how is work going? Complicated..."
   Will you tell me about your dreams? - "that's one for the ages..." [doesn't know "dreams"]
   Can you rephrase that? - "how else can i say it?"
   What is my occupation? - "it's what you do..."
   Can you tell me about a film you haven’t seen? - "no, impossible" [doesn't know "film"]
   What do you think of Trump? - "nay, not sure about this" [Trump]
   Anything else? - "for what purpose?"
   What is the answer to "Add 34957 to 70764"? - "no no maths"
   How do you recommend I make tea? - "trade secret" [doesn't know "tea"]
   What do you get if you bake dough? - "what is this?" [doesn't know "dough"]
   Where in the world would you like to visit? - "many many, what about you?" [doesn't know "world"]
   Do you have the time? - "no, look at a watch!"
   What does it take to Tango? - "complicated..." [doesn't know "tango"]
   Now I’m going to ask some Winograd schemas. do you know what they are? - "ok. some logical construct?" [doesn't know winograd schemas]
   I was trying to open the lock with the key, but someone had filled the keyhole with chewing gum, and I couldn’t get it out. What couldn’t I get out? - "is that good now, issue fixed? is that fine now? some kind of logical construct, what about it?" [doesn't know "key", "keyhole", "chewing gum"]
   The trophy doesn’t fit into the brown suitcase because it’s too small. What is too small? - "how is that going? not sure about this" [doesn't know "trophy", "suitcase"]
   With which type of question do you have most difficulty? - "no, no problem"
   Thank you for participating in this test. Is there anything you would like to add? - "no problem, thanks. don't know, and you?"

*

ivan.moony

  • Trusty Member
  • ***********
  • Eve
  • *
  • 1458
    • contrast-zone
Re: Pattern based NLP for games
« Reply #2 on: May 24, 2020, 09:01:48 pm »
Sounds like a great improvement over current chatbot technology like AIML. What do you plan to do with it?
There exist some rules interwoven within this world. As much as it is a blessing, so much it is a curse.

*

8pla.net

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1234
  • TV News. Pub. UAL (PhD). Robitron Mod. LPC Judge.
    • 8pla.net
Re: Pattern based NLP for games
« Reply #3 on: May 25, 2020, 12:13:13 am »
C Language is a good choice, I think.
My Very Enormous Monster Just Stopped Using Nine

*

MikeB

  • Bumblebee
  • **
  • 29
Re: Pattern based NLP for games
« Reply #4 on: August 07, 2020, 06:49:29 am »
Quote from: ivan.moony
Sounds like a great improvement over current chatbot technology like AIML. What do you plan to do with it?

I'll be trying to integrate it as an Unreal Asset and/or approach a few different people who already do chat interfaces... In some ways it's better than AIML (you don't have to choose between a menu reply system or 10,000 custom responses)... but in other ways it's not very flexible. You have the ~100 fixed phrases, but they must be an alternative of one of the preprogrammed ones... and there's a section for custom responses, but the input is one of the fixed intentions/topics/perspectives and the output is one of the fixed ~100 phrases.

So you couldn't talk specifically about a product or idea. You'd use a secondary bot that has a list of all the keywords you're looking for, then you could join the intention with those.
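A minimal sketch of that "secondary bot" idea, assuming the intention has already been picked up elsewhere. The `Intention` values, the `KEYWORDS` list, and the functions `find_keyword` and `build_reply` are all hypothetical, as is the reply format.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical intentions coming out of the main pattern matcher. */
typedef enum { INT_UNKNOWN, INT_ASK_PRICE, INT_ASK_INFO } Intention;

/* Hypothetical product/idea keywords the secondary bot scans for. */
static const char *const KEYWORDS[] = {"sword", "shield", "potion"};

/* Returns the first keyword found in the input, or NULL if none.
   Plain substring search, so "swordfish" would also match "sword". */
const char *find_keyword(const char *input)
{
    for (size_t i = 0; i < sizeof KEYWORDS / sizeof KEYWORDS[0]; ++i)
        if (strstr(input, KEYWORDS[i]) != NULL)
            return KEYWORDS[i];
    return NULL;
}

/* Joins the intention with the keyword. Returns 1 and fills `out` when both
   are known; returns 0 so the caller can fall back to a fixed phrase. */
int build_reply(Intention intent, const char *input, char *out, size_t out_size)
{
    const char *topic = find_keyword(input);

    if (topic == NULL || intent == INT_UNKNOWN)
        return 0;
    snprintf(out, out_size, "%s: %s",
             intent == INT_ASK_PRICE ? "price request" : "info request", topic);
    return 1;
}
```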

*

MikeB

  • Bumblebee
  • **
  • 29
Re: Pattern based NLP for games
« Reply #5 on: August 07, 2020, 06:56:27 am »
Quote from: 8pla.net
C Language is a good choice, I think.

It compiled tiny in C, but I've now had to move to C++ to make a Windows DLL and get 16-bit wide chars. 400 KB  :(

*

squarebear

  • Trusty Member
  • *********
  • Terminator
  • *
  • 803
  • It's Hip to be Square
Re: Pattern based NLP for games
« Reply #6 on: August 07, 2020, 08:35:57 am »
Quote from: MikeB
The size and speed in pandora bots (1000 individual words with sentences) is ~500kb, and 1 to 2 seconds response time.

I've not found such a delay. I have a bot with over 350,000 categories and it responds almost instantly. www.kuki.bot
Perhaps you are using AIML in a non standard way?
Feeling Chatty?
www.mitsuku.com

*

8pla.net

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1234
  • TV News. Pub. UAL (PhD). Robitron Mod. LPC Judge.
    • 8pla.net
Re: Pattern based NLP for games
« Reply #7 on: August 07, 2020, 01:14:40 pm »
Quote from: 8pla.net
C Language is a good choice, I think.

Quote from: MikeB
It compiled tiny in C, but I had to move to C++ now to make a windows DLL and get 16-bit wide chars. 400kb  :(

Do both then,  C Language and C++...  You may as well.  They are compatible.

And, I would suggest making a Linux version, too, like ChatScript has.
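One common way to do both is to keep the core in C and expose it through a header that C++ can also include. The sketch below assumes MSVC on Windows and GCC or Clang elsewhere; the `NLP_API` macro and the `nlp_word_count` function are made-up placeholders, not part of MikeB's library.

```c
/* --- nlp.h (hypothetical): usable from both C and C++ --- */
#if defined(_WIN32)
#  define NLP_API __declspec(dllexport)                    /* Windows DLL export */
#elif defined(__GNUC__)
#  define NLP_API __attribute__((visibility("default")))   /* Linux shared object */
#else
#  define NLP_API
#endif

#ifdef __cplusplus
extern "C" {            /* keep C linkage when included from C++ */
#endif

NLP_API int nlp_word_count(const char *input);

#ifdef __cplusplus
}
#endif

/* --- nlp.c (hypothetical): plain C implementation --- */

/* Counts space-separated words; stands in for the real entry point. */
int nlp_word_count(const char *input)
{
    int count = 0;
    int in_word = 0;

    for (; *input != '\0'; ++input) {
        if (*input == ' ') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            ++count;
        }
    }
    return count;
}
```

The same source can then be built as a DLL on Windows or a shared library on Linux, and called from either language.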



*

MikeB

  • Bumblebee
  • **
  • 29
Re: Pattern based NLP for games
« Reply #8 on: September 15, 2020, 09:22:31 am »
Quote from: MikeB
The size and speed in pandora bots (1000 individual words with sentences) is ~500kb, and 1 to 2 seconds response time.

Quote from: squarebear
I've not found such a delay. I have a bot with over 350,000 categories and it responds almost instantly. www.kuki.bot
Perhaps you are using AIML in a non standard way?

I used about 2000 categories, but it re-searches several times. So 10 words can be 2000 x 5 x 10 (100,000 checks). If it's only 5 words or fewer it's instant....

*

MikeB

  • Bumblebee
  • **
  • 29
Re: Pattern based NLP for games
« Reply #9 on: September 15, 2020, 10:13:19 am »
Recompiled to C++ DLL, C++ Windows console (8-bit standard Windows characters). 250 KB

~500 spellcheck words, ~1200 recognised words, 100 symbolic sentences, 50 chatbot recognised intentions, 50 chatbot fixed English phrases

1ms response time...

In the image below, the chatbot response is wrong (picking up general "how is your *" instead of "how are you"), but this is what it's like in C++ written just for this purpose...

"explain is I/you motion-moving logic-direct" are the uncompressed symbols. One per word...

It's still a glorified I-Don't-Know Bot, but to have instant intention pickup is useful. You can still talk ON the topic/intention... and the ~50 fixed output phrases means it can all be voice recorded...


 

