The End of the Chatterbox Challenge

  • 6 Replies
  • 2645 Views
DaveMorton
The End of the Chatterbox Challenge
« on: March 23, 2012, 02:30:33 am »
With Wendell Cowart's closing of the Chatterbox Challenge, we've lost one of the biggest and best chatbot contests: a place where botmasters could test their chatbots, find ways to improve, gain "bragging rights", or simply have a bit of friendly fun. This is a very sad thing, but not the end of the world. I intend to organize a new chatbot contest, and I want your aid, so that this new competition will be every bit as good, and every bit as fun, as the CBC has been.

So, first off, I’m putting out a “call to arms”, so to speak. I want to create an informal committee, made up of AI and chatbot enthusiasts and experts, to codify a set of goals and guidelines, and to discuss what the shape of a new chatbot contest should be. I’m looking for people to volunteer to spend at least a couple of hours per month, mostly in the form of participating in a “Forum Round Table” discussion, with some possible email correspondence, as well. If this is something you’re willing to help with (and you haven’t already emailed me), then please let me know, either here, or by email.

I want to find out from everyone what they want to see in a chatbot contest. Tell me what you think worked about the CBC, and what didn’t. I also want to hear suggestions, no matter how off the wall or wacky, for what you would like to see implemented. Let’s all work toward making a great chatbot competition that’s challenging, interesting, and most of all, fun! :)
Comforting the Disturbed, Disturbing the Comfortable
Chat with Morti!
LinkedIn Profile
CAPTCHA4us

Data
Re: The End of the Chatterbox Challenge
« Reply #1 on: March 23, 2012, 03:44:51 pm »
I would like to see a tougher approach to judging the bots, something like 1, 2, or 3 strikes and you’re out.

If a bot is asked a question and gives a wrong, misleading, or inaccurate answer, or just completely changes the subject, it should get a strike.

The winner could be the bot that goes on the longest without making a mistake or deviating from the subject. 
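The strike rule above is simple enough to sketch in code. This is just an illustration of the idea, not an official scoring tool: the bot names, the judge verdicts, and the strike limit of three are all made up, and a real contest would still have to define what counts as a "bad" answer.

```python
# A minimal sketch of the "three strikes and you're out" rule.
STRIKE_LIMIT = 3

def surviving_turns(verdicts, limit=STRIKE_LIMIT):
    """Return how many answers a bot gives before striking out.

    `verdicts` is one boolean per answer: True means the answer was
    acceptable, False means wrong, misleading, or off-topic.
    """
    strikes = 0
    for turn, ok in enumerate(verdicts, start=1):
        if not ok:
            strikes += 1
        if strikes >= limit:
            return turn
    return len(verdicts)  # never struck out

# Hypothetical judging of two bots over seven questions each;
# the winner is the bot that lasts longest before striking out.
results = {
    "BotA": surviving_turns([True, True, False, True, False, False, True]),
    "BotB": surviving_turns([True, False, True, True, True, True, True]),
}
winner = max(results, key=results.get)
```

With these made-up verdicts, BotA strikes out on its sixth answer while BotB finishes all seven with a single strike, so BotB wins.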


EDIT:
I would also add:

Trying to fool the judges into thinking they are talking to a human should be completely dropped. The judges know they are talking to bots; let's treat them as what they are and see where that leads us.


Art
Re: The End of the Chatterbox Challenge
« Reply #2 on: March 23, 2012, 08:50:54 pm »
@ Data,

I like the idea about the bots NOT pretending to be human. Judge them on the premise of the most demonstrably pseudo-intelligent bot. Witty, clever and smart-a$$ed bot answers don't count!! While humor might be appreciated if and when appropriate, a smart retort is usually a sign of the bot simply not knowing the answer or not being able to formulate a correct response.

I sort of like the current "weighted" system of judging the bots (3 = most correct answer, 2 = good answer but not the best, 1 = OK but maybe not completely topical, as if barely sliding by, and 0 = answer not correct, acceptable, or even close to topic).

Open judging by a quorum of 6 - all together asking / entering the SAME QUESTIONS in exactly the SAME ORDER and recording the answers. Bots' scores are recorded and, in round robin fashion, some bots are eliminated, leaving only the best 4, then 2, then 1. Thus, many different sets of questions need to be on hand for the various rounds of testing.
While the judges all pose questions to the bots in unison, their scoring is on an individual basis, as most judging is subjective. At the end of each round, scores are totaled and compared. The best contestants of the round are selected to move on to the next round of (new) questions.

Ultimately, the "last bot standing" will be the one who most accurately and completely answered the questions.
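The weighted judging and round-robin elimination described above could be sketched roughly like this. Everything here is illustrative: the judge and bot names, the per-answer scores, and the cut sizes are invented, and a real contest would decide how ties and question sets are handled.

```python
# Rough sketch: six judges each score every answer 0-3, per-bot totals
# are compared at the end of a round, and the field is cut down
# (e.g. 8 -> 4 -> 2 -> 1) for the next round of new questions.

def round_totals(scores_by_judge):
    """Sum each bot's scores across all judges for one round.

    `scores_by_judge` maps judge -> {bot: [0-3 score per question]}.
    """
    totals = {}
    for judge_scores in scores_by_judge.values():
        for bot, answers in judge_scores.items():
            totals[bot] = totals.get(bot, 0) + sum(answers)
    return totals

def advance(totals, keep):
    """Keep the `keep` highest-scoring bots for the next round."""
    ranked = sorted(totals, key=totals.get, reverse=True)
    return ranked[:keep]
```

For example, with two (hypothetical) judges scoring two questions each, `round_totals` adds up every judge's marks per bot, and `advance(totals, 2)` keeps the top two bots for the next round.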

Just some food for thought. One incorrect answer shouldn't disqualify a contestant, just as we don't fire office people when they make incorrect suggestions (though we'd probably like to). This way seems fair for all and certainly offers a bot the chance to redeem itself for missing a question or two (after all, it's only not human)!  ^-^
In the world of AI, it's the thought that counts!

squarebear
Re: The End of the Chatterbox Challenge
« Reply #3 on: March 23, 2012, 10:35:04 pm »
The Chatterbox Challenge never wanted the bots to appear human and judged them on their abilities rather than any pretence. For the last two years, a dragon has won it!
Feeling Chatty?
www.mitsuku.com

DaveMorton
Re: The End of the Chatterbox Challenge
« Reply #4 on: March 23, 2012, 11:11:28 pm »
I don't plan on any method of judging bots based on their ability to pass as human, or bot, or dragon, or Tarutaru. As far as I'm concerned, the major criterion for judging should primarily be the bot's ability to handle the smooth flow of conversation, while accurately being able to answer certain simple questions of general knowledge or memory. Anything more is just an overcomplication, and will make the contest less attractive to potential entrants. :)

Art
Re: The End of the Chatterbox Challenge
« Reply #5 on: March 24, 2012, 12:24:02 am »
I think a lot of folks have a similar take on the Chatterbox Challenge as on the Loebner Contest. (In 1990, Hugh Loebner agreed with the Cambridge Center for Behavioral Studies to underwrite a contest designed to implement the Turing Test. Dr. Loebner pledged a Grand Prize of $100,000 and a Gold Medal for the first computer whose responses were indistinguishable from a human's; such a computer can be said "to think." Each year, an annual prize of $2,000 and a bronze medal is awarded to the most human-like computer. The winner of the annual contest is the best entry relative to the other entries that year, irrespective of how good it is in an absolute sense.)

Even though most realize the difference, a lot of people lump the chatbot contests under the same umbrella, if you will.

@Dave, I also thought that several general statements could be made to the bot being tested, as a matter of trivia or even small talk, in order to see how well (or not) said bot handles conversational flow, both in grasp of concept and in topical flow (returning a response in line with the same or a similar idea). There doesn't have to be a Question / Answer format for all queries, and some natural sentence interaction might prove just as interesting as direct Q & A.

I've chatted with LOTS of bots that could carry on a great conversation but would fail if asked criteria-based questions about time, history, geography, math, etc. (of course, the same could be said of some people). Maybe we're not that far removed from the intelligence thing!!

 O0 ;)
Interesting developments.

Data
Re: The End of the Chatterbox Challenge
« Reply #6 on: March 24, 2012, 12:05:45 pm »
My 3 strikes idea and my “not fooling the judges into thinking they are talking to a human” idea were just that: ideas I thought I would fling into the arena.

Of course they could be thought through more and polished, but I think it would be good if any future contest adopted them.

I also think that a bot should be able to do math, know history and geography, and be a general knowledge guru, or else, yes, we might as well talk to a human.

Bots can do maths very fast and can access huge amounts of data and knowledge. Let's take advantage of that and focus on it, not just on trying to make a bot that can hold a conversation.

Just some ideas  :)


 

