Pattern based NLP

*

MikeB

  • Nomad
  • ***
  • 74
Re: Pattern based NLP
« Reply #15 on: January 06, 2021, 08:02:43 am »
Recently added both:
-Tone (9 levels - 3 negative, 3 positive, 3 grooming behaviour/patronising)
-WiC Challenge test (Words in Context - https://pilehvar.github.io/wic/)

The WiC test is one of the few NLP tests that can actually be run against this pattern-based NLP, as it isn't specifically prediction-based or knowledge-based.

The WiC test (training data & results) is ~5500 lines, and it completes in only about 2 seconds (1980ms-2000ms). However, many of the lines rely on deep knowledge or some other non-literal meaning that trips up everyone, people included, so they'll trip up this NLP too... The human score is only 80%. Most NLP systems get 60-75%.
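For anyone wanting to follow along, here is a rough sketch of reading one test record and scoring a run. It assumes each data line is tab-separated as word, part of speech, the two word positions joined by a hyphen (zero-based), then the two sentences. The sample line is made up for illustration in that assumed layout, and `WicRecord`, `parse_record` and `accuracy` are hypothetical names, not anything from the NLP described here.

```python
from typing import NamedTuple


class WicRecord(NamedTuple):
    word: str
    pos: str
    index1: int
    index2: int
    sentence1: str
    sentence2: str


def parse_record(line: str) -> WicRecord:
    """Split one tab-separated line (assumed layout) into its fields."""
    word, pos, indices, s1, s2 = line.rstrip("\n").split("\t")
    i1, i2 = (int(i) for i in indices.split("-"))
    return WicRecord(word, pos, i1, i2, s1, s2)


def accuracy(predictions, gold):
    """Fraction of predictions that agree with the gold true/false labels."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Illustrative line only, written in the assumed layout.
SAMPLE = "avoid\tV\t3-1\tI try to avoid the company of gamblers .\tWe avoided the ball ."

record = parse_record(SAMPLE)
print(record.sentence1.split()[record.index1])  # avoid
print(accuracy([True, False, True, True], [True, True, True, False]))  # 0.5
```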

Most of the NLP setup is complete, so this year I'll be adding words & sentences in order to get through this test...



*

ivan.moony

  • Trusty Member
  • ***********
  • Eve
  • *
  • 1480
    • contrast-zone
« Reply #16 on: January 06, 2021, 09:42:45 am »
Hi MikeB :)

May I ask, how do you derive answers to the tests?

*

MikeB

« Reply #17 on: January 07, 2021, 03:35:29 pm »
Hi Ivan, I ignore the selected word that the test says to match altogether, and just check whether the underlying intention of the two sentences is the same.

In the line "He wore a jock strap with a metal cup. Bees filled the waxen cups with honey."... the word "cup" means the same. A traditional NLP would check whether "metal cup" and "waxen cup" mean the same based on knowledge linking, but in the pattern-matching NLP I just check whether the basic underlying intention is the same. So both of these sentences would come under "Person describing" with sub tags "clothing, material,..." and some others. If one sentence was a catchphrase or greatly different, it would return not a match.

Another example... "I try to avoid the company of gamblers. We avoided the ball."... the word "avoid" means the same. Both have the intention "Person explain", so this would return true.

It should get at least 60% doing it this way. I can add catchphrases to get a few more, and there are some other things I can do with tags. I'm trying to keep real knowledge linking and deducing as far away as possible...
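The intention check described in this post could be sketched roughly as below. This is only an illustration: the `Intention` class and `same_intention` function are hypothetical names, the labels and tags are assigned by hand rather than produced by the pattern matcher, and treating any catchphrase as "not a match" is one reading of the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intention:
    """Hypothetical result of classifying one sentence."""
    label: str                       # e.g. "Person describing"
    tags: frozenset = frozenset()    # sub tags such as "clothing", "material"
    is_catchphrase: bool = False


def same_intention(a: Intention, b: Intention) -> bool:
    """Return True when both sentences share the same underlying intention."""
    # Assumption: a catchphrase is non-literal, so it never counts as a match.
    if a.is_catchphrase or b.is_catchphrase:
        return False
    return a.label == b.label


# Hand-assigned classifications for the two "cup" sentences.
cup_1 = Intention("Person describing", frozenset({"clothing", "material"}))
cup_2 = Intention("Person describing", frozenset({"clothing", "material"}))
print(same_intention(cup_1, cup_2))  # True
```

Only the top-level label is compared here; the sub tags are carried along so a stricter comparison could use them later.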

*

MikeB

« Reply #18 on: January 14, 2021, 02:53:06 pm »
Just comparing the two intentions isn't working out too well. I'm going to start on a specialised way of doing it (still without knowledge) by looking at the words before & after the selected word.
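As a rough illustration of "the words before & after the selected word", a window around a word position could be pulled out like this. The function name `context_window`, the whitespace tokenising and the window size of 2 are all assumptions made for the example.

```python
def context_window(tokens, index, size=2):
    """Return up to `size` words before and after the word at `index`."""
    before = tokens[max(0, index - size):index]
    after = tokens[index + 1:index + 1 + size]
    return before, after


tokens = "We avoided the ball".split()
print(context_window(tokens, 1))  # (['We'], ['the', 'ball'])
```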

*

MikeB

« Reply #19 on: February 01, 2021, 07:27:24 am »
Restructured the WiC (Words in Context) test to look at the words before & after the indicated word, similar to how people do it.

A brief overview...

1) Both sentences are formatted (check for odd symbols, double spaces, spelling, words spaced out like "h e l l o", extended laughing "hehehehe...").
2) Pattern-match each word to a predefined symbol from a list (only ~20 different symbols in total, covering ~2300 English words. No stemming.).
3) Analyse WiC:
 a) Input: both sentences, the 'lookup word', and both locations of the word.
 b) WiC function: check that the 'lookup word' (now a symbol shared with ~100 similar words) exists in the WiC / sentence compatibility table (~50-100 entries).
 c) WiC function: if there is at least one match, check all other words. The entry with the highest word count (3-5 words) is selected as the match. Remember its compatibility ID. Now check the second sentence for a match. Return match true/false.
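Steps 2 and 3 above might look something like the sketch below. Everything in it is an assumption for illustration: the symbol names, the tiny `SYMBOLS` word list, the two `COMPAT` entries and the scoring rule (count the other words whose symbol appears in an entry) are invented, and the real tables are much larger.

```python
import re

# Hypothetical word -> symbol list (step 2). No stemming: each form is listed.
SYMBOLS = {
    "come": "MOVE", "came": "MOVE",
    "out": "DIR", "down": "DIR",
    "he": "PERSON",
    "closet": "PLACE", "road": "PLACE",
}

# Hypothetical compatibility table: entry ID -> symbols that fit one context.
COMPAT = {
    1: {"MOVE", "DIR", "PLACE"},
    2: {"MOVE", "PERSON"},
}


def to_symbols(sentence):
    """Step 2: map each word to its symbol (None when the word is unknown)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [SYMBOLS.get(w) for w in words]


def score(symbols, index, entry):
    """Count words other than the lookup word whose symbol is in the entry."""
    return sum(1 for i, s in enumerate(symbols) if i != index and s in entry)


def wic_match(sent1, index1, sent2, index2):
    """Step 3: True when both sentences fit the same compatibility entry."""
    sym1, sym2 = to_symbols(sent1), to_symbols(sent2)
    lookup = sym1[index1]
    if lookup is None or sym2[index2] != lookup:
        return False
    # 3b: entries that contain the lookup word's symbol.
    candidates = [eid for eid, entry in COMPAT.items() if lookup in entry]
    if not candidates:
        return False
    # 3c: keep the entry that matches the most other words in sentence 1.
    best_id = max(candidates, key=lambda eid: score(sym1, index1, COMPAT[eid]))
    if score(sym1, index1, COMPAT[best_id]) == 0:
        return False
    # Then check the second sentence against the remembered entry.
    return score(sym2, index2, COMPAT[best_id]) > 0


print(wic_match("Come out of the closet", 0, "He came singing down the road", 1))  # True
```

The result only says what this toy table returns, not what the official answer for that pair is.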

This is much more detailed than just checking the intention, as it can pick up the same context even if one sentence is an "instruction" and the other is a "person describing". E.g. "come/came": (1) "Come out of the closet" (2) "He came singing down the road".

I got the time down from 2000-2600ms to ~1400ms by removing most of the pre-formatting and keeping only the 'Double Space' check, as the test is already formatted...
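To show the kind of pre-formatting being traded off here, this is a small sketch with a switch between the full clean-up and the double-space check alone. The regular expressions and the `preformat` name are assumptions for illustration, and the spelling and odd-symbol checks are left out.

```python
import re

DOUBLE_SPACE = re.compile(r" {2,}")
# Single letters separated by spaces, e.g. "h e l l o" (three or more letters).
SPACED_OUT = re.compile(r"\b(?:[A-Za-z] ){2,}[A-Za-z]\b")
# Three or more repeats of "ha" or "he", e.g. "hehehehe".
LAUGHING = re.compile(r"\b(ha|he)(?:\1){2,}\b", re.IGNORECASE)


def preformat(text, full=True):
    """Clean up a sentence; with full=False only double spaces are fixed."""
    if full:
        text = SPACED_OUT.sub(lambda m: m.group(0).replace(" ", ""), text)
        text = LAUGHING.sub(r"\1\1", text)
    return DOUBLE_SPACE.sub(" ", text)


raw = "h e l l o  there hehehehe"
print(preformat(raw))              # hello there hehe
print(preformat(raw, full=False))  # h e l l o there hehehehe
```

On data that is already clean, the `full=False` path skips two regex passes per sentence, which is the sort of saving described above.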

The score isn't worth publishing because I've only checked about 100 of the ~5500 records! A lot of sentences are reused though, so I shouldn't have to check all of them.


 

