Pattern based NLP


MikeB

  • Nomad
  • ***
  • 75
Re: Pattern based NLP
« Reply #15 on: January 06, 2021, 08:02:43 am »
Recently added both:
-Tone (9 levels - 3 negative, 3 positive, 3 grooming behaviour/patronising)
-WiC Challenge test (Words in Context - https://pilehvar.github.io/wic/)

The WiC test is one of the few NLP tests that can actually be done on this pattern based NLP, as it's not specifically prediction or knowledge based.

The WiC test (training data & results) is ~5500 lines. It completes in only 2 seconds (1980ms-2000ms). However, many of the lines rely on deep knowledge or some other non-literal meaning to trick everyone, including people, so it'll also trick this NLP... The human score is only 80%. Most NLP systems get 60-75%.

Most of the NLP set up is complete, so this year I'll be adding words & sentences in order to get through this test... 



*

ivan.moony

  • Trusty Member
  • ***********
  • Eve
  • *
  • 1480
    • contrast-zone
Re: Pattern based NLP
« Reply #16 on: January 06, 2021, 09:42:45 am »
Hi MikeB :)

May I ask, how do you derive answers to the tests?
There exist some rules interwoven within this world. As much as it is a blessing, so much it is a curse.

*

MikeB

Re: Pattern based NLP
« Reply #17 on: January 07, 2021, 03:35:29 pm »
Quote from: ivan.moony on January 06, 2021, 09:42:45 am
Hi MikeB :)

May I ask, how do you derive answers to the tests?

Hi Ivan, I ignore the selected word that the test says to match altogether, and just check whether the underlying intention of the two sentences is the same.

In the line "He wore a jock strap with a metal cup. Bees filled the waxen cups with honey."... the word "cup" means the same. A traditional NLP would check whether "metal cup" and "waxen cup" mean the same based on knowledge linking, but in the pattern-matching NLP I just look to see if the basic underlying intention is the same. So both of these sentences would come under "Person describing" with sub tags "clothing, material,..." and some others. If one sentence was a catchphrase or greatly different, it would return "not a match".

Another example... "I try to avoid the company of gamblers. We avoided the ball."...the word "avoid" means the same. Both have the intention "Person explain", so this would return true.

It should get at least 60% doing it this way. There is a way to add catchphrases to get a few more, and some other things I can do with tags. Trying to keep real knowledge linking and deducing as far away as possible...
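A minimal sketch of that intention check in Python. The trigger words, the catchphrase list, and the function names are all invented for illustration; the real pattern matcher is much larger and isn't shown in this thread.

```python
from typing import NamedTuple


class Intention(NamedTuple):
    label: str
    is_catchphrase: bool = False


# Toy stand-in for the pattern matcher: trigger word -> intention label.
# These entries are hypothetical, chosen to cover the two examples above.
TOY_TRIGGERS = {
    "wore": "Person describing",
    "filled": "Person describing",
    "try": "Person explain",
    "avoided": "Person explain",
}

# Hypothetical catchphrase list; a catchphrase never counts as a match.
CATCHPHRASES = {"break a leg"}


def classify(sentence: str) -> Intention:
    """Assign an intention to a sentence using the toy tables above."""
    text = sentence.lower().rstrip(".!?")
    if text in CATCHPHRASES:
        return Intention("Catchphrase", True)
    for word in text.split():
        if word in TOY_TRIGGERS:
            return Intention(TOY_TRIGGERS[word])
    return Intention("Unknown")


def same_intention(sentence_a: str, sentence_b: str) -> bool:
    """Ignore the target word; compare only the sentence-level intentions."""
    a, b = classify(sentence_a), classify(sentence_b)
    if a.is_catchphrase or b.is_catchphrase:
        return False
    return a.label != "Unknown" and a.label == b.label
```

Note that the target word ("cup", "avoid") is never passed in, which is the point of the approach: only the sentence-level intention is compared.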

*

MikeB

Re: Pattern based NLP
« Reply #18 on: January 14, 2021, 02:53:06 pm »
Just comparing the two Intentions isn't working out too well. Going to start a specialised way of doing it (still without knowledge) by looking at the words before & after the selected word.

*

MikeB

Re: Pattern based NLP
« Reply #19 on: February 01, 2021, 07:27:24 am »
Restructured the WiC / Word in Context test to look at the words before & after the indicated word, similar to how people do it.

A brief overview...

1) Both sentences are formatted (look for odd symbols, double spaces, spelling, words spaced out like "h e l l o", extended laughing "hehehehe...").
2) Pattern-match each word to a predefined symbol from a list (only ~20 different symbols total, out of ~2300 English words. No stemming.).
3) Analyse WiC:
 a) Input: Both sentences, the 'lookup word', and both locations of the word.
 b) WiC function: Check the 'lookup word' (now a symbol shared with ~100 similar words) exists in the WiC / sentence compatibility table (~50-100 entries).
 c) WiC function: If at least one match, check all other words. Highest word count (3-5 words) is selected as a match. Remember compatibility ID. Now check second sentence for a match. Return match true/false.
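Steps 2 and 3 could be sketched roughly like this in Python. This is only one reading of the description above: the symbol names, the word list, the compatibility table entries, and the contiguous-window matching are all assumptions made for the example, not the actual tables or code.

```python
# Step 2: word -> symbol. Hypothetical entries; the real list has ~20 symbols.
SYMBOLS = {
    "come": "MOVE", "came": "MOVE",
    "he": "PERSON", "she": "PERSON",
    "out": "DIRECTION", "down": "DIRECTION",
    "of": "OF", "the": "THE",
    "closet": "OBJECT", "road": "OBJECT",
    "singing": "ACTION",
}

# Step 3: compatibility table of (compatibility ID, symbol pattern).
# Patterns sharing an ID are treated as the same context. Invented entries.
COMPAT_TABLE = [
    (1, ("MOVE", "DIRECTION", "OF", "THE", "OBJECT")),
    (1, ("PERSON", "MOVE", "ACTION", "DIRECTION")),
    (2, ("MOVE", "THE", "OBJECT")),
]


def to_symbols(sentence):
    """Map each word to its symbol; unknown words become 'OTHER'."""
    return [SYMBOLS.get(w.strip(".,!?").lower(), "OTHER") for w in sentence.split()]


def best_match(symbols, pos, allowed_ids=None):
    """Return (compat_id, length) of the longest pattern covering pos, or None."""
    best = None
    for compat_id, pattern in COMPAT_TABLE:
        if allowed_ids is not None and compat_id not in allowed_ids:
            continue
        if symbols[pos] not in pattern:  # 3b: lookup symbol must be in the entry
            continue
        n = len(pattern)
        # 3c: try every window of length n that includes the lookup word.
        for start in range(max(0, pos - n + 1), min(pos, len(symbols) - n) + 1):
            if tuple(symbols[start:start + n]) == pattern:
                if best is None or n > best[1]:
                    best = (compat_id, n)
    return best


def wic_match(sent1, pos1, sent2, pos2):
    """True if both sentences match patterns with the same compatibility ID."""
    first = best_match(to_symbols(sent1), pos1)
    if first is None:
        return False
    return best_match(to_symbols(sent2), pos2, allowed_ids={first[0]}) is not None
```

With these toy tables, `wic_match("Come out of the closet", 0, "He came singing down the road", 1)` returns `True`, because both sentences hit a pattern with compatibility ID 1 even though one is an instruction and the other a description.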

This is much more detailed than just checking the intention, as it can pick up the same context even if one sentence is an "instruction" and the other is a "person describing". E.g. "come/came": (1) "Come out of the closet" (2) "He came singing down the road".

I got the time down from 2000-2600ms, to ~1400ms by removing most of the pre-formatting and only keeping 'Double Space' check as the test is already formatted...
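For reference, a rough Python sketch of the kind of pre-formatting in step 1, with a fast path that keeps only the double-space check. The regular expressions and function names are assumptions for illustration; spell-checking and odd-symbol handling are left out.

```python
import re


def collapse_double_spaces(text):
    """Replace runs of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", text)


def join_spaced_letters(text):
    """Join words typed as separate letters, e.g. 'h e l l o' -> 'hello'."""
    return re.sub(
        r"\b(?:[A-Za-z] ){2,}[A-Za-z]\b",
        lambda m: m.group(0).replace(" ", ""),
        text,
    )


def squash_laughing(text):
    """Shorten extended laughing, e.g. 'hehehehe' -> 'hehe'."""
    return re.sub(r"\b(ha|he)\1{2,}\b", r"\1\1", text)


def full_format(text):
    """All three checks, for free-form chat input."""
    return squash_laughing(join_spaced_letters(collapse_double_spaces(text)))


def fast_format(text):
    """Only the double-space check, for input that is already formatted."""
    return collapse_double_spaces(text)
```

Skipping the two extra regex passes on every one of the ~5500 lines is the kind of saving described above, though the actual timings depend on the real implementation.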

Score is not worthy of publishing because I've only checked about 100 of the ~5500 records! A lot of sentences are reused though so shouldn't have to check all of them.


*

MikeB

Re: Pattern based NLP
« Reply #20 on: March 01, 2021, 09:49:25 am »
Still working on the WiC test.

Making progress of about 0.1% per day. (20-30% to go)

There are now 3700 words (+1400) and 900 WiC pattern sentences (+800). Re-added spell-checking, so the full WiC test takes about 2.5 seconds to complete.

The scale and pickup is actually immense. Each of the 900 WiC pattern sentences has 3-6 "Symbolic Words". Each Symbolic Word represents 10-500 words. So each of the 900 WiC sentences actually picks up 500,000 - 20,000,000 variations.
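The count for a single pattern is just the product of the sizes of its Symbolic Word groups. A small Python sketch, where the group sizes are made-up examples inside the 10-500 range rather than real figures from the word list:

```python
from math import prod


def variations(group_sizes):
    """Number of literal word sequences covered by one pattern sentence."""
    return prod(group_sizes)


# Hypothetical patterns: each number is how many words one symbol stands for.
three_symbols = [100, 100, 50]     # 3 Symbolic Words
four_symbols = [50, 50, 50, 50]    # 4 Symbolic Words

print(variations(three_symbols))   # 500000
print(variations(four_symbols))    # 6250000
```

The total grows multiplicatively with each extra symbol, so a few patterns built from large groups can cover far more sequences than many patterns built from small ones.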

Many times I add 10-20 WiC patterns (~100,000,000 word-sentence variations) and it only picks up one solitary record in the 5428-record WiC test... So the test is basic... but the word formatting is still broad enough that you can't just cheese the test.

Another problem is lack of words... I'm estimating I'll need at least 5000-7000 total to get a good result, and all these are hand-entered in specific categories, so it's going to take some months...

One side effect is that I'm probably going to drop the old "Intention" categories I used to use for the chatbot and use these new WiC categories instead, as they pick up an interesting variety. There are about 50 different groups (will be merging some) along the lines of:
"person or thing started to move / person or thing has him..."
"the object/concept of a had-thing"
"had the concept when..."
"a motion was taken / apply a rule / have-take the concept-chance to..."
"i play/avoid the / objects moved/ordered/fell to the"
"logic-action an object"
"moving-action the object"
"an object of objects / vivid objects/objectives of"

So these will be better in chatbot programming.

 

