Where is the line between "chatroom shorthand" and bad grammar and spelling?

  • 9 Replies
  • 3099 Views
*

DaveMorton

  • Trusty Member
  • ********
  • Replicant
  • *
  • 636
  • Safe, Reliable Insanity, Since 1961
    • Geek Cave Creations
I'm not a "chatbot guru" by any means, but I DO consider myself to be a "serious" botmaster. My chatbot, Morti, gets all sorts of visitors and, as most of you already know, is entered in the 2010 Chatterbox Challenge. Now, according to the contest rules:

Quote
The questions will not contain deliberate typos to trick the bot although chatroom shorthand maybe used.
(you have no idea how hard it was for me NOT to correct that line!  :tickedoff: )

Now my question is: where do we draw the line between "netspeak" and just plain bad grammar? If you're not the botmaster of an AIML chatbot, you may not be aware of the difficulties posed by poor spelling, or "non-words". AIML is VERY strict about spelling in the patterns it matches, since it matches words and phrases exactly. Now Morti uses a couple of "shortcuts", designed to "fix" a large number of misspelled words. In fact, his functions use a replacement list of nearly 7,500 commonly misspelled words, and another 180 or so contractions and "netspeak" abbreviations/symbols. To me, that should be sufficient, but apparently, it's not.
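
For readers who haven't built one, here is a minimal sketch of that kind of substitution pass. It is not Morti's actual code: the two tiny dictionaries and the `normalize` function are hypothetical stand-ins for the much larger replacement lists described above, and it assumes input is cleaned up before any pattern matching happens.

```python
import re

# Hypothetical, tiny stand-ins for the real replacement lists.
MISSPELLINGS = {
    "visiters": "visitors",
    "recieve": "receive",
    "definately": "definitely",
}
SHORTHAND = {
    "btw": "by the way",
    "brb": "be right back",
    "u": "you",
    "r": "are",
}

# A "word" here is a run of letters and apostrophes.
TOKEN = re.compile(r"[A-Za-z']+")


def normalize(text):
    """Replace known misspellings and shorthand, one word at a time."""

    def fix(match):
        word = match.group(0)
        key = word.lower()
        # Fall back to the original word when neither list knows it.
        return MISSPELLINGS.get(key) or SHORTHAND.get(key) or word

    return TOKEN.sub(fix, text)


print(normalize("btw, did u recieve it?"))
```

Anything not in either list passes through untouched, which is exactly where the "non-words" problem shows up. Note that this sketch does not preserve the capitalization of words it replaces.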

BTW, this isn't just about the traffic I've gotten since the CBC judging started. In fact, I'm trying to be as general as possible, so that I don't run afoul of the rules regarding discussing the questions and responses. Therefore, I'll not be citing any examples of this at all, just to be sure. That said, however, I'm wondering just how "tolerant" I, as a botmaster, should be with regard to where "chatroom shorthand" ends, and where laziness, ignorance, and bad grammar begin.

(I'll be glad when Morti's next evolutionary step is completed, where he approaches user input in a completely new fashion, and can better deal with this issue.)
Comforting the Disturbed, Disturbing the Comfortable
Chat with Morti!
LinkedIn Profile
CAPTCHA4us

*

Freddy

  • Administrator
  • **********************
  • Colossus
  • *
  • 6860
  • Mostly Harmless
I can imagine it would be a royal pain in the backside having to account for every little thing like that.

I see a couple of variations in regular usage just in forums.

Firstly, there are the abbreviations, which I think is what they mean when they say chatroom shorthand. Things like IMHO, AFAIK, FYI etc. I use those on some occasions and I think they are probably the easiest to deal with from your point of view as they are more or less finite.

The second is text speech. I hate text speech. It is hard to decipher and understand, and it makes the person using it sound like a moron. For you it would be an ongoing battle as there is no limit to what people can come up with, and different people will write the same word in different ways.

A lot of bots I see around the net simply ask the chatter to use plain English so that the bot can understand.  I think that is probably your simplest solution - ie don't even bother as it's a lot of extra work when time could be spent on better things.

However, like you say, you could do substitutions for the chatroom shorthand. I think it might be sensible to stick to commonly used ones though. Maybe add in new ones as they come up in chat logs. From the sound of things it seems you have already done something like that.

If I was programming this kind of thing I don't think I would cater to people who cannot be bothered to string a sentence together lol.  Why people use text speech when they have a keyboard in front of them I will never know.

Notice I used 'lol' just then....I hope you know what I meant ;)

Oh one other thing, I have not seen a chatbot that can do smilies yet... it might be a way of adding emotion to input and output...  Jabberwacky does allow you to add emotion to an input though, I thought that was quite neat.

Really though, it's up to you what you want to code for, thankfully no one is telling you what to do. I suppose you need to work out how long it would take to do stuff like this and then decide on whether it is worth the effort. What would make Morti different to other bots? In the end it might not be worth trying to understand poor English if they can simply be dissuaded...or indeed ignored....

But how do you tell the difference between someone being lazy and someone who has learning difficulties?

Anyway, I will put an end to my ramblings.   ::)
« Last Edit: March 20, 2011, 01:13:31 pm by Freddy »

*

Duskrider

  • Trusty Member
  • ********
  • Replicant
  • *
  • 533

I asked my mail list of over 20 to check out Morti and vote for him.
My sister said she didn't talk to him but did vote for him simply because he was "SO" cute.
ok, I'm done.  Go back to your discussion. 

*

Data

  • Moderator
  • ***********
  • Eve
  • *
  • 1279
  • Overclocked // Undervolted
    • Datahopa - Share your thoughts ideas and creations
I agree with Freddy’s ramblings, sounds about right to me.

There is sooooo much work to do for a botmaster; a line has to be drawn at some point.

*

DaveMorton

Thanks for the input, folks. It gladdens me to know that I'm not "The Lone Ranger", here.

Right now, Morti can handle about 30 different "chatroom shorthand" abbreviations, such as lol, brb, btw, afk, etcetera. Since I use these types of "shortcuts", myself, it seems only right that Morti should, too.

Also, Morti can not only use a wide range of smilies, he can also detect them, too. Well, a good number of them, at least. I think the list of "known" smilies is right at 118. And to be more accurate, he can't really tell which of them was used; only that one of the smilies in the list was used.
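
As a rough illustration of "knows that a smiley was used, but not which one", a sketch along these lines would do it. The short `KNOWN_SMILIES` list and the `contains_smiley` function are hypothetical and only stand in for the real list of roughly 118.

```python
# Hypothetical short list; the real list described above is much longer.
KNOWN_SMILIES = [":)", ":-)", ";)", ";-)", ":(", ":-(", ":D", ";D", "::)"]


def contains_smiley(text):
    """Return True if any known smiley appears; does not report which one."""
    return any(smiley in text for smiley in KNOWN_SMILIES)


print(contains_smiley("I hope you know what I meant ;)"))
print(contains_smiley("No emotion here."))
```

Plain substring matching like this is cheap, but it can give false positives when a smiley's characters happen to appear inside other text, so a real list would probably need some care around punctuation.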

I've also got a new routine in Morti's script that goes hand-in-hand with his spellcheck/substitution engine. Now, when Morti comes across a word that he's never encountered before, he stores that new word in an "unknown word" table in his database. This allows me to look over the list of words and decide if it needs to be added to his "common words" list, or his "spell check" list. At some point, I may write a function that will run on a schedule, that takes this new list and passes it to a website, like dictionary.com, and figures out on its own where the word belongs. But that's a later project. :)
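
A minimal sketch of the "unknown word" table idea, for anyone who wants to try it. This is an assumption-laden illustration and not Morti's script: it uses Python's built-in `sqlite3` with an in-memory database, and the table name, column names, `KNOWN_WORDS` set, and function names are all hypothetical.

```python
import re
import sqlite3

# Hypothetical vocabulary; a real bot would load its own word lists.
KNOWN_WORDS = {"hello", "how", "are", "you", "today"}


def open_store(path=":memory:"):
    """Open the database and make sure the unknown-word table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS unknown_words ("
        "word TEXT PRIMARY KEY, seen INTEGER NOT NULL DEFAULT 1)"
    )
    return conn


def log_unknown_words(conn, text):
    """Record every word not in KNOWN_WORDS, counting repeat sightings."""
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in KNOWN_WORDS:
            continue
        cur = conn.execute(
            "UPDATE unknown_words SET seen = seen + 1 WHERE word = ?", (word,)
        )
        if cur.rowcount == 0:
            # First sighting: no row was updated, so insert a new one.
            conn.execute("INSERT INTO unknown_words (word) VALUES (?)", (word,))
    conn.commit()


def review_queue(conn):
    """Unknown words for manual review, most frequently seen first."""
    return conn.execute(
        "SELECT word, seen FROM unknown_words ORDER BY seen DESC, word"
    ).fetchall()


conn = open_store()
log_unknown_words(conn, "Hello, how r u today?")
log_unknown_words(conn, "r you there")
print(review_queue(conn))
```

Keeping a count alongside each word means the review list can be sorted so the words visitors actually use most often get looked at first.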

*

Art

  • At the end of the game, the King and Pawn go into the same box.
  • Moderator
  • **********************
  • Colossus
  • *
  • 5865
Dave,

I am not a fan of Netspeak, text speak or other types of shorthand. I still live in America where the primary language is English, and our American English is a close enough cousin to that of our British kin that they are practically interchangeable, with the exception of those common words that differ in definition from ours like bonnet, flat, lift, fag, fin, etc., etc. I even got into an argument with my chemistry teacher some lifetimes ago over my use of the word sulfur compared to sulphur, and I've even spelled (not spelt) the word as behaviour instead of behavior.

It could be that my ancestors were from England and Scotland but I digress....

Knowing you as a friend and knowing how much of your time and patience you have put into Morti, if he even places in the top 5 I would stand up and cheer!! It's quite an accomplishment for you and those who have supported and assisted you with Morti's growth and development.

A lot of us have come to know Morti's personality a bit better than we did many months ago. He is growing and he is getting better.

Regardless of the results of this or any contest, know that you have tried and given it your best. No one could ask any more.

Best of luck to you and Morti, my friend!  O0
In the world of AI, it's the thought that counts!

*

DaveMorton

Art, your words have touched my heart, and added colour to my day. It's nice to have the support of my friends, both here in these forums, and elsewhere. My family has been exceedingly supportive of my efforts, even to the point of insisting on taking over some chores so that I might be able to spend more time with Morti, and his responses. Regardless of how he does, Morti's already a winner, and so am I. :)

*

Bragi

  • Trusty Member
  • ********
  • Replicant
  • *
  • 564
    • Neural network design blog
Quote
Now, when Morti comes across a word that he's never encountered before, he stores that new word in an "unknown word" table in his database. This allows me to look over the list of words and decide if it needs to be added to his "common words" list, or his "spell check" list
Pretty smart.

Quote
Regardless of the results of this or any contest, know that you have tried and given it your best. No one could ask any more.
Very true, goes for everything in life, doesn't it?

I haven't even bothered trying to correct misspelling yet. I think making lists of commonly misspelled words is one part, but it won't cover everything all the time. Another thing I have been thinking of is this: aren't typos often made because of incorrect letter order or hitting a key just beside, above or below the wanted key? Couldn't this be turned into an algorithm?
Also, I think we use a lot of 'visual cues' like length, the number of strokes that go up and down, and their location, to figure out the exact intent of the drawing that is the word. Perhaps that also can be done with an algorithm (not easy I think).
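
The swapped-letter and neighbouring-key idea can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes a QWERTY layout, ignores the stagger between keyboard rows, handles a single slip per word, and every name in it (`ROWS`, `neighbours`, `candidates`, `correct`) is made up for the example.

```python
# Letter rows of a QWERTY keyboard (an assumption about the user's layout).
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def neighbours(ch):
    """Keys left, right, above and below ch (approximate: ignores row stagger)."""
    for r, row in enumerate(ROWS):
        c = row.find(ch)
        if c == -1:
            continue
        out = set()
        if c > 0:
            out.add(row[c - 1])
        if c < len(row) - 1:
            out.add(row[c + 1])
        for rr in (r - 1, r + 1):
            if 0 <= rr < len(ROWS) and c < len(ROWS[rr]):
                out.add(ROWS[rr][c])
        return out
    return set()


def candidates(word):
    """Strings one adjacent-letter swap or one neighbouring-key slip from word."""
    found = set()
    # Undo a swap of two adjacent letters ("teh" -> "the").
    for i in range(len(word) - 1):
        found.add(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    # Undo a slip onto a neighbouring key ("cst" -> "cat").
    for i, ch in enumerate(word):
        for n in neighbours(ch):
            found.add(word[:i] + n + word[i + 1:])
    found.discard(word)
    return found


def correct(word, vocabulary):
    """Known words that could explain the typo, or the word itself if known."""
    if word in vocabulary:
        return {word}
    return candidates(word) & set(vocabulary)


print(correct("teh", {"the", "cat"}))
print(correct("cst", {"the", "cat"}))
```

One thing this makes obvious is that a single typo can have more than one plausible fix: "teh" is one swap away from "the" but also one key-slip away from "ten", so a real corrector would still need some way to rank the candidates.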

Quote
Also, Morti can not only use a wide range of smilies, he can also detect them, too. Well, a good number of them, at least. I think the list of "known" smilies is right at 118. And to be more accurate, he can't really tell which of them was used; only that one of the smilies in the list was used.
Morti's definitely a very good bot. I didn't even know there were that many smilies  ;D

*

Freddy

Quote
Another thing I have been thinking of is this: aren't typos often made because of incorrect letter order or hitting a key just beside, above or below the wanted key? Couldn't this be turned into an algorithm?

Must be possible.  Google does something similar when they offer suggestions for searches.

If you type in : swrdtaol

Google will say : Did you mean: swordtail
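
A "did you mean" lookup of this general kind can be approximated with Python's standard `difflib` module. To be clear, this is not how Google does it; it is just a small sketch using string similarity against a word list, and the `did_you_mean` function and the three-word vocabulary are hypothetical.

```python
import difflib


def did_you_mean(word, vocabulary, cutoff=0.6):
    """Suggest the closest known word, or None if the word is known or nothing is close."""
    if word in vocabulary:
        return None
    # get_close_matches ranks candidates by a similarity ratio between 0 and 1.
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


VOCABULARY = ["swordtail", "goldfish", "guppy"]
print(did_you_mean("swrdtaol", VOCABULARY))
```

The `cutoff` value controls how forgiving the match is: lower values suggest more, at the cost of sillier suggestions.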
« Last Edit: March 21, 2011, 03:01:12 pm by Freddy »

*

DaveMorton

Quote
aren't typos often made because of incorrect letter order or hitting a key just beside, above or below the wanted key? Couldn't this be turned into an algorithm?

The funny thing about this is that I've written a function for the version of Morti that I'll be using in next year's Loebner Competition that randomly makes these types of "typos", and then makes "corrections" on the majority of them (he deliberately leaves a small fraction of these "typos" in place, just like a human would do, if they weren't paying attention). Oh, wait! I should be keeping that a secret, huh? Forget I said it. :)
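
Purely as an illustration of the idea (and certainly not the actual function, which is staying secret), a "make typos, then fix most of them" routine might look something like this. The `NEIGHBOURS` table, the `humanize` function, and the rate parameters are all hypothetical.

```python
import random

# Hypothetical partial map of keys to plausible neighbouring-key slips.
NEIGHBOURS = {"a": "sq", "e": "wr", "i": "uo", "o": "ip", "t": "ry", "n": "bm", "s": "ad"}


def humanize(text, typo_rate=0.05, fix_rate=0.8, rng=None):
    """Return (typed, final): the text with simulated slips, and after 'proofreading'."""
    if rng is None:
        rng = random.Random()
    typed, final = [], []
    for ch in text:
        slip = ch
        # Occasionally hit a neighbouring key instead of the intended one.
        if ch in NEIGHBOURS and rng.random() < typo_rate:
            slip = rng.choice(NEIGHBOURS[ch])
        typed.append(slip)
        # Most slips get noticed and corrected; a small fraction are left in.
        if slip != ch and rng.random() >= fix_rate:
            final.append(slip)
        else:
            final.append(ch)
    return "".join(typed), "".join(final)


typed, final = humanize("this is a test sentence", typo_rate=0.3)
print(typed)
print(final)
```

Passing a seeded `random.Random` as `rng` makes the output repeatable, which helps when testing; with `fix_rate=1.0` every slip is corrected, and with `fix_rate=0.0` none are.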

 

