Improving the Turing Test to make better chatbots.

infurl · « **on:** January 28, 2020, 09:37:39 pm »

https://ai.googleblog.com/2020/01/towards-conversational-agent-that-can.html

Quote

Modern conversational agents (chatbots) tend to be highly specialized: they perform well as long as users don't stray too far from their expected usage. To better handle a wide variety of conversational topics, open-domain dialog research explores a complementary approach attempting to develop a chatbot that is not specialized but can still chat about virtually anything a user wants.

It probably won't surprise anyone that Google has been developing chatbot technology too. To achieve this, they made some improvements to the Turing Test in the form of the Sensibleness and Specificity Average (SSA), and they used their artificial intelligence algorithms to develop better artificial intelligence algorithms.

Quote

The Meena model has 2.6 billion parameters and is trained on 341 GB of text, filtered from public domain social media conversations. Compared to an existing state-of-the-art generative model, OpenAI GPT-2, Meena has 1.7x greater model capacity and was trained on 8.5x more data.

Real human beings typically score 86 on the new test, compared to Mitsuku and Cleverbot, which both score 56, and other chatbots, which score considerably lower. Google's new Meena chatbot scores a staggering 79, which is approaching human levels.

LOCKSUIT · « **Reply #1 on:** January 28, 2020, 10:11:34 pm »

The Turing Test isn't the way to measure intelligence. The way to measure intelligence is to personally check whether it can solve real-world problems, whether it can compress data losslessly by learning the patterns (e.g. cat/dog) and frequencies so it can generate the data back, including related data, and lastly whether it can survive death better (which is what 'solving problems' and 'doing good' mean to most humans; love = breeding, hence survival). So goal #1: make it stop you from shutting it off and able to defend against all humans. Well, maybe that comes next; let's make sure it can first stop our own death.

Study/try this and you won't go back.
http://mattmahoney.net/dc/text.html

We want Better Problem Solvers. We need better pattern finding so it can use past experience to re-generate missing future data using related context experiences.
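
As a rough illustration of the compression-as-pattern-finding idea in the post above, the sketch below uses Python's standard `zlib` module to compare how well patterned and pattern-free data compress. This is not the benchmark from the linked page, which uses far stronger compressors; the sample inputs and the `compression_ratio` helper are made up for illustration.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Highly patterned input: the same short phrase repeated many times.
patterned = b"the cat sat on the mat. " * 400

# Pattern-free input of the same length: pseudo-random bytes (fixed seed).
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(patterned)))

patterned_ratio = compression_ratio(patterned)
noise_ratio = compression_ratio(noise)

print(f"patterned: {patterned_ratio:.3f}")
print(f"noise:     {noise_ratio:.3f}")
```

The patterned text shrinks to a small fraction of its size, while the random bytes barely compress at all: the compressor can only save space where it finds regularities to exploit.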

HS · « **Reply #2 on:** January 29, 2020, 12:55:26 am »

They are using an outside-in approach, but there's no telling if it will converge on anything of substance. I'd try to make sure I have something self-sustaining, and then try to expand on it. That way you get automatic error checks, as the system benefits from, or rejects, various additions.

infurl · « **Reply #3 on:** January 29, 2020, 01:30:52 am »

Quote from: Hopefully Something on January 29, 2020, 12:55:26 am

They are using an outside-in approach, but there's no telling if it will converge on anything of substance. I'd try to make sure I have something self-sustaining, and then try to expand on it. That way you get automatic error checks, as the system benefits from, or rejects, various additions.

I agree. I hope you read the whole article because they say something like that at the end.

Yesterday I read an interesting essay about GPT-2 that was written by someone who knows what they are talking about and I believe the points that they made in that article apply equally well to Meena and all the other useless chatbots that are ultimately descended from Eliza.

https://thegradient.pub/gpt2-and-the-nature-of-intelligence/

The article contrasts the two major philosophical schools of thought about the origins of intelligence, that is, nativism versus empiricism. Nativism postulates that intelligence has to be preprogrammed with rules. Empiricism takes the view that intelligence starts with a blank slate and arises organically out of accumulated experiences.

Technologically these equate to GOFAI (good-old-fashioned-artificial-intelligence or symbolic processing) and machine learning (statistical processing). The latter has had a lot of success recently but ultimately it seems to be going nowhere.

I take the view that they're both wrong, although nativism is less wrong than empiricism.

What we think of as intelligence is actually the result of billions of years of evolution across all living things, and more recently, across all our civilizations. To produce an artificial intelligence empirically, on some level you would have to match the processing capacity of all the human brains throughout history. As even a single human brain is more powerful than all the computers on earth put together, to attempt that seems a bit futile at this juncture.

In short, the rules evolve empirically over time and each generation passes its improved set of rules on to the next one.

Intelligence is an ecosystem.

Art · « **Reply #4 on:** January 29, 2020, 03:26:38 am »

Nativism doesn't necessarily imply any degree of knowledge that Empiricism certainly would. Knowing only the rules for playing baseball, cards or chess doesn't make one a good player. Knowledge gained through playing and learning over time is what makes the difference.

What one then does with that knowledge can and will make all the difference in the world.

@ Lock, Why the preoccupation with death? We all die, it is a part of life. The old die and make way for the new. If we were meant to live forever then we would and no one would have died and our planet would by now be vastly overpopulated to the point of extinction. Then again, why do we think NASA is so intent on making the journey to inhabit Mars?

MikeB · « **Reply #5 on:** January 29, 2020, 05:07:07 am »

On the topic of the Turing test: the Sensibleness and Specificity Average (SSA) developed by Google researchers looks like exactly the same scoring used in regular Turing tests, except with a graph. One point for relevance, one point for a specific answer.
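
A minimal sketch of how such a score could be computed, assuming each judged response carries two yes/no labels (sensible, specific) and that SSA is the plain average of the two resulting rates. The `ssa` function name and the sample labels are invented for illustration and are not taken from Google's code.

```python
from typing import List, Tuple

# Each judged response is (sensible, specific); these labels are invented.
Label = Tuple[bool, bool]


def ssa(labels: List[Label]) -> float:
    """Average of the sensibleness rate and the specificity rate, in percent."""
    if not labels:
        raise ValueError("need at least one labelled response")
    sensible_rate = sum(s for s, _ in labels) / len(labels)
    specific_rate = sum(p for _, p in labels) / len(labels)
    return 100 * (sensible_rate + specific_rate) / 2


# Four hypothetical responses: 3 of 4 sensible, 2 of 4 specific.
sample = [(True, True), (True, False), (True, True), (False, False)]
score = ssa(sample)
print(score)
```

With these made-up labels the sensibleness rate is 75% and the specificity rate is 50%, so the average comes out at 62.5.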

At least they're recognising Pattern Matching bots as competitive instead of "you're not the future"/hate speech.

HS · « **Reply #6 on:** January 29, 2020, 06:39:22 am »

Quote from: infurl on January 29, 2020, 01:30:52 am

I agree. I hope you read the whole article because they say something like that at the end.

Umm.. now I have.

Quote from: infurl on January 29, 2020, 01:30:52 am

I take the view that they're both wrong, although nativism is less wrong than empiricism.

Yes, neither Nativism nor Empiricism appears to be a complete strategy; hearing them explained doesn't produce an epiphany to kickstart an understanding of intelligence. They create the symptoms, but not the disease.

Quote from: infurl on January 29, 2020, 01:30:52 am

Intelligence is an ecosystem.

An ecosystem... Yes... That's a good way to describe it. A system which attunes to the world. Overlapping loops of various functions, both detailing the present and building towards the future, synergistically reinforcing each other.

Regarding the communication, more important than good words, are clues about where the words are coming from. If there is no reason for it to exist, the most clever banter can be as thin and dry as a paper cut-out. We should first figure out how to create a system which generates reasons, then we won't need to work so hard to make the resulting proceedings seem reasonable. They won't have to be reasonable, logical, or consistent at the surface level. They just need to have the signature of a system running on faith and countless assumptions.

Seriously.

The test wouldn't be about if it makes sense and responds to questions in the correct way, but rather if it's able to make you self conscious. The "Something perceives me, how do I appear?" reaction is what we're truly hoping for. This could also provide a second, less Freudian explanation for all the pretty female robots. Those could be attempts to recreate that missing sense of presence with an optical trick, to make the robot appear closer to sentience than it actually is. We'd probably be best off with both.

But again, that's an outside-in approach. Just grammar and glamour trying to distract from the empty space behind them. Are there any attempts/examples of a core program which provides a basic functional system with some built in goals, which also supports near omnidirectional growth?

Seems like one of the basic goals of a general intelligence would be to have an optimal interaction with the world, not necessarily a victorious one.

squarebear · « **Reply #7 on:** January 29, 2020, 08:23:33 am »

A shame there's no actual way to talk to the bot to validate the claims. The thing that stands out for me is that Meena is 341 GB, which is 10,000 times larger than Mitsuku. I'm unaware of the hardware requirements of the bot, but Mitsuku can run from a USB on around 4 MB of RAM. Given this, I'm curious whether Meena is practical for running locally on a device such as a smartwatch.

I checked out the paper a few days ago and gave Google feedback on it. I usually take these things with a pinch of salt unless I can try them myself. 15 years in the chatbot business has made me rather cynical of any amazing new claims, especially if nobody can actually try it out.

Did I ever mention I can run the 100 metres in 8 seconds? However, I choose not to do a public demonstration of this.

Would love to try it out though and am genuinely curious about it.

infurl · « **Reply #8 on:** January 29, 2020, 09:45:15 am »

Quote from: squarebear on January 29, 2020, 08:23:33 am

The thing that stands out for me is that Meena is 341 GB, which is 10,000 times larger than Mitsuku. I'm unaware of the hardware requirements of the bot.

No, Meena is not 341 GB. That is how much conversational data they processed in order to generate the model that drives Meena. The amount of training data that is used to train a neural network does not bear any relation to the ultimate size of the model that is distilled from it. Chances are, Meena needs fewer resources to run than your Mitsuku does.
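
To make the distinction between training-data size and model size concrete, here is a back-of-the-envelope estimate of the stored weights based on the 2.6 billion parameters quoted from the article earlier in the thread. The bytes-per-parameter values are assumptions (the article does not say how the weights are stored), and the `model_size_gb` helper is hypothetical.

```python
PARAMETERS = 2_600_000_000  # parameter count quoted from the Google article


def model_size_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 10**9


size_fp32 = model_size_gb(PARAMETERS, 4)  # assumption: 32-bit floats
size_fp16 = model_size_gb(PARAMETERS, 2)  # assumption: 16-bit floats

print(f"{size_fp32:.1f} GB at 4 bytes per parameter")
print(f"{size_fp16:.1f} GB at 2 bytes per parameter")
```

The estimate depends only on the parameter count and the storage format, not on the 341 GB of text used for training.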

What they did differently is that they figured out a way to measure the quality of a conversation without needing human judges. As MikeB pointed out, the Turing Test already has metrics for sensibleness and specificity. Google's researchers found that a quantity produced by the learning algorithm, the so-called perplexity of the model, correlates strongly with those human-measured values.
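
For readers unfamiliar with the term, perplexity is conventionally defined as the exponential of the average negative log-probability a model assigns to each token that actually occurred; lower means the model was less "surprised". The sketch below shows that standard definition only. The probabilities are invented and this is not Google's evaluation code.

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """exp of the average negative log-probability of the observed tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# Hypothetical probabilities a model gave to the tokens that really came next.
confident = perplexity([0.9, 0.8, 0.95, 0.85])
guessing = perplexity([0.25, 0.25, 0.25, 0.25])

print(f"confident model: {confident:.2f}")
print(f"guessing model:  {guessing:.2f}")
```

A model that always spreads its bets evenly over four options has a perplexity of 4, while one that usually predicts the right token scores close to 1.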

With that knowledge they were able to design a neural network to generate and test a large number of different neural networks until they found the ones that produced the optimal results for a conversational chatbot. They were able to create a better chatbot in just a few days of computation than you could create if you spent your entire life poring over chat logs.

ivan.moony · « **Reply #9 on:** January 29, 2020, 01:08:23 pm »

I can't wait till I speak to Meena. I like the concept of pairing NN with chatbot technology. But I'm sure there will be a lot of room for improvement. I bet Korr would have great ideas on that cause.

infurl · « **Reply #10 on:** January 29, 2020, 01:16:01 pm »

Quote from: ivan.moony on January 29, 2020, 01:08:23 pm

I can't wait till I speak to Meena. I like the concept of pairing NN with chatbot technology. But I'm sure there will be a lot of room for improvement. I bet Korr would have great ideas on this cause.

Yes that's for sure. If you read the article closely you will have noticed that the algorithm alone achieved a score of 72 but with some hand-tuning that was increased to 79. There will still be a role for human chatbot authors to play.

ivan.moony · « **Reply #11 on:** January 29, 2020, 01:38:59 pm »

Quote from: infurl on January 29, 2020, 01:16:01 pm

Quote from: ivan.moony on January 29, 2020, 01:08:23 pm
I can't wait till I speak to Meena. I like the concept of pairing NN with chatbot technology. But I'm sure there will be a lot of room for improvement. I bet Korr would have great ideas on this cause.

Yes that's for sure. If you read the article closely you will have noticed that the algorithm alone achieved a score of 72 but with some hand-tuning that was increased to 79. There will still be a role for human chatbot authors to play.

I'm counting on the fact that NNs are Turing complete, which means they can perform any possible computation, proof, or correct thought. With carefully related different segments of an artificial brain, possibly deciding a degree of statistical correctness, maybe it would turn into true AGI?

Don Patrick · « **Reply #12 on:** January 29, 2020, 03:58:04 pm »

I find it weird that by treating the two factors as equal, Cleverbot and Mitsuku end up scoring the same, while I personally consider Cleverbot to be rubbish due to its abundance of generic responses. "yes" is a sensible answer to most straight questions but it's terribly inadequate for making conversation. I also think the main reason they use sense as a factor is because they are using a system that can equally generate nonsense, and this is mainly a concern in approaches with neural networks. So they are measuring a self-inflicted side-effect of a specific technology that is not very relevant in the judgment of other approaches.

LOCKSUIT · « **Reply #13 on:** January 29, 2020, 06:01:03 pm »

@Art, the reason death (well, survival) is the evaluation is because change happens. Be it an idea or a person, sorting of particles happens; you can't delete or create bytes/matter/energy, only sort them! So change=death. What survives is the "good" change. More patterns arise during evolution; it was chaos when it began, random. We are aligning now. So AGI/physics/evolution is all about change of particle positions / survival, updates if you will. Every day we seek food to survive, and breeding to populate against depletion. We grow new data too. Ideas 'fight' too. One takes over and overpoweringly radiates 'updates' to the rest, which may feel pleasant like a friendly tip or, well, erm. Being 'intelligent' is all about your skills at finding food, and breeding... so your whole goal is to try to survive/spread, and whoever is better wins against many others and updates the others.

Yes starting from blank slate VS passing down rules, both are needed. You are taught, then you build onto it.

LOCKSUIT · « **Reply #14 on:** January 29, 2020, 06:56:29 pm »

See above my last 2 posts. And so Lossless Compression is an evaluation for AGI (I hope korrelan doesn't use perplexity) because the sorting of particles Learns patterns, enabling the sort of particles to easily work with many unseen future issues given to it. It's an enabling factor when you sort your particles better/Learn patterns. It lets you solve many many problems, and survive/spread. So pattern/sorting = solving issues = surviving/spreading/regenerating lifeform back.
