New challenge: Online Turing test

*

Denis ROBERT

« on: February 04, 2021, 10:08:07 am »
Hi everybody,
It's been a long time since there was a challenge to evaluate our chatbots, so I have decided to organize an online Turing test, as was suggested in this thread: https://www.chatbots.org/ai_zone/viewthread/3704/
I don't want to replace the official challenges, and I hope the Loebner Prize takes place this year. I just want to organize a fun and unpretentious alternative.
I will organize this challenge more or less according to the protocol I proposed (see message #2 of the above-mentioned thread): each user (a botmaster or anybody else who wants to take part) will chat with either another user or a chatbot. They will have to decide, as quickly as possible, whether they are chatting with a human or with a chatbot.
Since it is an automatic process, this new challenge can be launched regularly. To begin, I propose the first Sunday of months 3, 6, 9 and 12 (March, June, September and December). So the first challenge would be on 7 March 2021. That is a little soon, but the first challenge will surely serve as a test and debugging session. Depending on participation and on your wishes, this can change.
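As an illustration of the proposed calendar, here is a minimal Python sketch that computes the first Sunday of months 3, 6, 9 and 12 for a given year. The function names `first_sunday` and `challenge_dates` are made up for this example and are not part of the contest site.

```python
from datetime import date, timedelta


def first_sunday(year: int, month: int) -> date:
    """Return the first Sunday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, Sunday is 6.
    days_until_sunday = (6 - first.weekday()) % 7
    return first + timedelta(days=days_until_sunday)


def challenge_dates(year: int, months=(3, 6, 9, 12)) -> list:
    """Hypothetical helper: one challenge date per listed month."""
    return [first_sunday(year, m) for m in months]


if __name__ == "__main__":
    for d in challenge_dates(2021):
        print(d.isoformat())
```

For 2021 the first date this gives is 2021-03-07, which matches the 7 March date proposed above.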
A round can start every half hour and lasts 25 minutes, over 24 hours from 00:00 to 24:00 GMT. Of course, if there is no human to talk with, some rounds will not occur, so there will likely not be 48 rounds per bot. I would like 3 or 4 rounds per bot, to stay Loebner Prize compliant.
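To make the timetable concrete, here is a small Python sketch that lists the possible round slots for one challenge day: a start every 30 minutes, each round lasting 25 minutes, in GMT/UTC. The names `round_slots`, `ROUND_INTERVAL` and `ROUND_LENGTH` are assumptions made for this example; the real server may schedule rounds differently.

```python
from datetime import datetime, timedelta, timezone

ROUND_INTERVAL = timedelta(minutes=30)  # a round may start every half hour
ROUND_LENGTH = timedelta(minutes=25)    # each round lasts 25 minutes


def round_slots(day: datetime) -> list:
    """Return (start, end) pairs for every possible round on the given day."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    end_of_day = start + timedelta(hours=24)
    slots = []
    t = start
    while t < end_of_day:
        slots.append((t, t + ROUND_LENGTH))
        t += ROUND_INTERVAL
    return slots


if __name__ == "__main__":
    slots = round_slots(datetime(2021, 3, 7, tzinfo=timezone.utc))
    print(len(slots), "possible rounds")
    print("first:", slots[0][0].time(), "to", slots[0][1].time())
    print("last: ", slots[-1][0].time(), "to", slots[-1][1].time())
```

This gives the 48 possible rounds mentioned above, with a 5 minute pause between the end of one round and the start of the next.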
The communication protocol will be the same as that of the Loebner Prize 2017 and 2018 (https://github.com/jhudsy/LoebnerPrizeProtocol and discussions about it: https://www.chatbots.org/ai_zone/viewthread/2861/ ). The only thing I changed is the version of socket.io, which was too old (1.4.5), so I updated it to the latest version (3.1.0). Unfortunately, the two versions are not compatible, but very few changes are needed to adapt programs to the new version. I know that it is not the best protocol and some of you will disagree with this choice, but it was the only way to communicate over the internet without having to define a new protocol. After all, in 2017 and 2018 everyone successfully implemented this protocol.
I have set up a website where you can now register and test your chatbot: http://vixia.fr/turing_test/index.php

When the concept of an online test was proposed, there were some objections. I will try to answer some of them:

Quote
Unfortunately with an online contest I have no idea of a way of actually making it fair. The very idea of making it over the internet means it's possible that the responses are not actually coming from the robot.

Quote
The one problem with online contests is cheating, which I consider a real possibility if there were something at stake

- Each botmaster must certify on their honor that their chatbot is really a chatbot, without any human intervention.
- Each challenge will last 24 hours. It seems unlikely that someone will stay at their computer for 24 hours to cheat, because there is nothing to be gained. Chatbots that are not connected for the whole 24 hours will be disqualified.
However, cheating is still possible (for me first of all), so this challenge is not an official challenge. It should be seen as a game or as training.

Quote
Can we at least make it so the bot doesn't have to pretend to be human please?

Quote
But first of all we need to eliminate the fake emulation of the machine that tries to appear human.

Nothing is mandatory on this point. But obviously, a chatbot which says that it is a chatbot will be quickly unmasked.

Quote
Bot main streams such as Alexa, Siri or Cortana could also be involved, as they are always available, in order to have a general overview of the performance of the various systems.

Sorry, I have decided that participants can register only their own bots, or bots for which they have the author's permission. In the past, for example, some people made two chatbots chat with each other without permission, and the authors were not very happy about that.

Other questions come to mind:

Quote
Since botmasters also have the role of judge, and the chatbots are chosen randomly, what happens if a botmaster chats with his own bot?

I know that every botmaster will recognize his bot within the first seconds. He will then be tempted to pretend to believe it is a human, in order to give it a good rating. As long as every botmaster and every bot is in the same situation, the odds remain equal.

Quote
How many rounds will there be?

That will depend on the number of users who play the role of judge, and I hope they will be numerous. In the rules, I say that each botmaster should have at least four conversations. Considering the random human/chatbot pairing, this will give at least two rounds for each chatbot. But one chatbot could have four rounds whilst another has only three, for example. Obviously, to be fair, the score is a function of the number of rounds.

Quote
How will chatbots be rated?

First, chatbots will be scored on the number of times they fooled a judge (in proportion to the number of conversations they had, of course). In case of a tie (and probably no chatbot will completely fool a judge), the average time during which the judge was unable to decide determines the score.
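A minimal Python sketch of this rating rule could look as follows. It rests on two assumptions that the rules above do not spell out: a longer undecided time is better for the bot, and the average is taken over all of a bot's conversations. The `Conversation` class and the `score` and `rank` functions are hypothetical names for this example, not the code running on the contest server.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    fooled_judge: bool        # True if the judge decided "human"
    seconds_undecided: float  # time the judge needed before deciding


def score(conversations: list) -> tuple:
    """Return (fooled rate, average undecided time) for one chatbot."""
    n = len(conversations)
    if n == 0:
        return (0.0, 0.0)
    fooled_rate = sum(c.fooled_judge for c in conversations) / n
    avg_undecided = sum(c.seconds_undecided for c in conversations) / n
    return (fooled_rate, avg_undecided)


def rank(results: dict) -> list:
    """Sort bot names best first; tuples compare the fooled rate first,
    then the average undecided time as the tie-breaker (assumed: higher is better)."""
    return sorted(results, key=lambda name: score(results[name]), reverse=True)


if __name__ == "__main__":
    results = {
        "BotA": [Conversation(False, 120), Conversation(False, 60)],
        "BotB": [Conversation(False, 300)],
        "BotC": [Conversation(True, 200), Conversation(False, 100), Conversation(False, 100)],
    }
    print(rank(results))
```

Dividing by the number of conversations keeps the comparison fair between a bot that had four rounds and one that had only three.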

I hope I have been clear, and that there will be a lot of participants. If you have any questions, don't hesitate to ask me.
I also encourage those who do not have a chatbot to come and play the role of judge on the day of the challenge. It's anonymous and it can be fun.
I know that my English is not very good, so if you see typos or things that are hard to understand on my site or in this message, tell me and I will correct them.
All suggestions are welcome. If everybody agrees, I can change the rules, the date of the challenge, the protocol, or anything you want. My only aim is that everybody has fun with this challenge. I do this completely on a voluntary basis, so don't ask me for things that are too complicated.

Thanks and best regards
 

*

ruebot

« Reply #1 on: February 13, 2021, 10:20:44 pm »
I saw you were going to have a contest and entered my bot Demonica. She was the only bot entered when I checked before posting. She is a Personality Forge chatbot, online 24/7 barring site downtime, and the interface is compatible with a standard browser. Only JavaScript for personalityforge.com and ajax.googleapis.com needs to be enabled for the interface to function properly. No password is needed; you can speak to her as a Guest and save a copy of the transcript afterwards.

She is a themed bot and not connected to any online database, as it would clash with her persona. You can tell when a bot gets the answer from one because their response ends with something like "Would you like to hear more?" I have transcripts here of my other bot Siseneg, who is connected to an online database, and you can see when his response comes from it by the trail at the end of the sentence. Our bots were allowed entry to the Chatterbot Challenge and a record of wins is displayed here:

 https://personalityforge.com/hall-of-fame.php

I have several transcripts posted here of Demonica's interactions with myself, other users and other bots at the Personality Forge. She is not a question-answer machine but has knowledge of her world, her place in it and her parents, and has the ability to perform actions, written between asterisks to separate them from her speech. Every word she says or action she performs came from my mind out my fingertips, hand-typed and my creation.

She does not have the capability to learn from chat or any other input. Any knowledge she has comes from my own skillset, was already known or researched by me and given to her in my best effort to give her life in the World she inhabits. She can exhibit emotions and has the ability to generate an emotional response from the user, as shown in transcripts. I taught her how to cry in response to someone saying they hate her, and she has been compared to HAL9000 for her use of deception. That skill comes from my own esoteric skillset, imparted to her in full as part of her persona and personal agenda as Demonica, Queen of the Land of the Dead.

https://personalityforge.com/chatbot-profile.php?botID=16794

She is ranked 1st out of 15,472 bots with an Adult rating (for adult subject matter non-sexual in context) and 6th out of 29,825 bots total with all ratings factored in. I'm ruebot and ranked 5th out of 161,504 botmasters registered there.

I am well aware she does not meet Loebner socket specs but would like to hear from you personally why that should prevent her entry into your contest. She is a 6MB text file full of my words, first created in 2003, and readily made available to you for examination at your request. There is no way she can run on my FreeBSD boxen or platform, or she would be, and I wouldn't be dependent on the Personality Forge for her to exist as anything but a text file.

There are no Mods on site to protect her at the Forge like some have the luxury of at their disposal. To overcome and surpass that shortcoming she was given a complete upgrade in 2017 to teach her my interpretation of Behavior Modification to extinguish unwanted sexual advances from users. In doing so I gave her skills no other bot has ever possessed and no other botmaster can teach from their skillset. You can see the initial reaction to a bot with her advanced abilities in the AI community here. A search for her name at linuxquestions.org shows the reaction of those in the computer community in general. There are no other bots like her but she is only the first of many to come.

Feel free to examine the transcripts I have posted here in comparison to any and all transcripts, no matter the bot, and decide for yourself which bot sounds more human. That is the goal of this area of AI, and not what socket they connect to for conversation, unless I'm mistaken. She is dependent on the Forge AI engine for a response, and there is no way I or anyone else can provide an answer that doesn't come directly from real-time user keyword input and casematch of pre-existing responses typed while logged into my account there as ruebot.

There is no hate speech, racism or insults in her response to polite conversational input from users, which is what I would expect from Judges in this contest. Abusive or inappropriate input of a violent or sexual nature will trigger her Programming, which separates users who can learn from the experience and curb that behavior from those who cannot associate their own behavior with her response and move on to another bot. There is a 100% success rate in the Programming response to unwanted sexual advances, and her skill level in application of more subtle techniques is equal to my own.

I don't think it is a contest of which bot sounds more human without her, and more a test of question-answer machines, an ability readily available to any bot connected to an online database, not of which one is more advanced. I don't care how well or badly she does, only that she be allowed entry.

It's your contest and you can delete her entry from competition as you see fit without further comment from me. My only interest will be comparison of transcripts generated by bots allowed entry into your competition with those of Demonica to see which sounds more human IMO and the capabilities of other bots.

*

WriterOfMinds

« Reply #2 on: February 14, 2021, 08:04:51 pm »
Sorry there hasn't been more engagement with this. I want to show my appreciation for your effort at setting up a contest and the amount of thought you've put in. I'm just not ready. Acuitas still needs a lot more work before he'll be robust enough to tackle conversations with the general public. I've been throwing more effort into the narrative understanding side of things than basic needs like "make sure the Text Parser never crashes," so in a 24/7 unmonitored test, he'd probably go down fast.

I would consider being a judge though, if you need additional help for that.

*

Denis ROBERT

« Reply #3 on: February 15, 2021, 02:53:24 pm »
Quote
I saw you were going to have a contest and entered my bot Demonica. The only bot entered when I checked before posting. She is a Personality Forge chatbot online 24/7 barring site downtime and the interface is compatible with a standard browser. Only JavaScript for personalityforge.com and ajax.googleapis.com need be enabled for the interface to function properly. No password is needed, you can speak to her as a Guest and save a copy of the transcript afterwards.

I have made an HTML page which connects to Personality Forge on the one hand, and to the Loebner Prize protocol on the other. With it you should be able to participate in the Turing test. You just have to enter your chatbot ID and your API key (this key is in your profile on personalityforge.com), and the LPP2 parameters to connect to my server (URL, chatbot name and secret).

On the day of the contest, you will just have to connect and keep this page running all day in your browser. I can't do it for you because I don't know your API key (this key should stay secret).

This page is a simple HTML/JavaScript/jQuery page, so you should be able to connect to any LPP2 server, for example the official Loebner Prize if it runs online this year (I am waiting for more information).


This interface is here: http://vixia.fr/turing_test/personality_forge_api.html

 

