New model beats GPT3

infurl · « **on:** September 23, 2020, 12:51:13 am »

https://thenextweb.com/neural/2020/09/21/ai-devs-created-a-lean-mean-gpt-3-beating-machine-that-uses-99-9-fewer-parameters/

Quote

When a system using 99.9% less model parameters is able to best the best at a benchmark task, it’s a pretty big deal. This isn’t to say that the LMU system is better than GPT-3, nor that it’s capable of beating it in tests other than the SuperGLUE benchmark – which isn’t indicative of GPT-3’s overall capabilities.

This is on only one of the benchmarks so far, but it's a general-purpose algorithm, and you would expect that three orders of magnitude greater efficiency will lead to massive gains across the board. The secret sauce is something called PET (Pattern-Exploiting Training). GPT-3 and its ilk are not AGI or even AI, but for sure they are an important component along with everything else that has been developed in past years.

LOCKSUIT · « **Reply #1 on:** September 24, 2020, 04:47:55 am »

Can someone explain the new idea? It mentions 'clozes'.

"One factor that has gained little attention in previous work is the way tasks are reformulated as cloze questions. These reformulations can be arbitrarily complex"

From what I've gathered, it seems they found a pattern that asks the model a yes/no type of question, etc., to prime or hint it towards the correct answer?

Like instead of asking:
"I hit my head. It must have been a"
You ask:
"I hit my head. It must have been a //perhaps that// ?brick? or ?truck?"

Which would remove some candidates essentially. But I already know this idea.

infurl · « **Reply #2 on:** September 24, 2020, 10:42:15 pm »

https://en.wikipedia.org/wiki/Cloze_test

Quote

A cloze test (also cloze deletion test or occlusion test) is an exercise, test, or assessment consisting of a portion of language with certain items, words, or signs removed (cloze text), where the participant is asked to replace the missing language item. Cloze tests require the ability to understand context and vocabulary in order to identify the correct language or part of speech that belongs in the deleted passages. This exercise is commonly administered for the assessment of native and second language learning and instruction.
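To make that concrete for classification, here is a minimal sketch of how a task can be rewritten as a cloze question, as I understand the PET idea: a "pattern" turns the input into a sentence with a blank, and a "verbalizer" maps each label to a word that could fill the blank. The pattern wording, the function names and `toy_scorer` are all made up for illustration. A real setup would get the scores from a pretrained masked language model, not from a hard-coded stand-in.

```python
MASK = "[MASK]"

# Hypothetical verbalizer: each task label is tied to one word for the blank.
VERBALIZER = {"entailment": "Yes", "contradiction": "No"}


def to_cloze(premise, hypothesis):
    """Hypothetical pattern: rewrite a sentence pair as a cloze question."""
    return f'"{hypothesis}"? {MASK}. "{premise}"'


def classify(premise, hypothesis, score_mask_words):
    """Pick the label whose verbalizer word scores highest in the blank."""
    cloze = to_cloze(premise, hypothesis)
    scores = score_mask_words(cloze, list(VERBALIZER.values()))
    return max(VERBALIZER, key=lambda label: scores[VERBALIZER[label]])


def toy_scorer(cloze, candidates):
    """Stand-in for a masked language model; always prefers "Yes"."""
    return {word: (1.0 if word == "Yes" else 0.0) for word in candidates}


cloze = to_cloze("I hit my head on a brick.", "My head hurts")
label = classify("I hit my head on a brick.", "My head hurts", toy_scorer)
print(cloze)
print(label)
```

The point is that the model is never asked "which class is this?" directly. It only fills in a blank, which is what it was pretrained to do, and the label is read off from the word it prefers.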

LOCKSUIT · « **Reply #3 on:** September 25, 2020, 01:19:53 am »

But that is the same thing all predictor nets are about: predict the next word in "I was walking down the " or the missing word in "I was _ down the road". Can you give me a better example? NNs already learn syntax a>b...

infurl · « **Reply #4 on:** October 01, 2020, 12:48:00 am »

https://syncedreview.com/2020/09/23/google-approaches-bert-level-performance-using-300x-fewer-parameters-with-extension-of-its-new-nlp-model-prado/

Here is another example of a much smaller model beating out the big ones. The goal is to be able to run AI software locally on devices such as smart phones and watches instead of having to offload the work to the cloud.

Quote

Google AI recently released the new trimmed-down pQRNN, an extension to the projection attention neural network PRADO that Google AI created last year and which has achieved SOTA performance on many text classification tasks with less than 200K parameters. PRADO’s example of using extremely few parameters to learn the most relevant or useful tokens for a task inspired Google AI researchers to further exploit its potential.

I can see where this is going. Eventually all these monstrous, almost random collections of parameters are going to be distilled down to relatively small sets of rules, and symbolic processing will be back in fashion.

LOCKSUIT · « **Reply #5 on:** October 01, 2020, 11:54:25 am »

So that algorithm just above is basically doing word2vec, but for phrases, and with holes, like Random Forests?

For example, let's say 'the cat' and 'here today' are both very common phrases, and so are the two together, BUT they never have the same word in the middle:

the cat was here today
the cat came here today
the cat went here today
the cat sat here today

So the AI above learns this then??? >>> 'the cat _ here today' and uses that for semantic discovery?
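Here is that guess as a minimal sketch, just to pin down what I mean. It is NOT taken from the pQRNN/PRADO work; the function name `gapped_patterns` and the whole approach are my own assumption. It punches one hole at a time into each sentence and collects which words were seen in each hole.

```python
from collections import defaultdict


def gapped_patterns(sentences, gap="_"):
    """Map every one-hole pattern to the set of words seen in that hole."""
    fillers = defaultdict(set)
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            pattern = " ".join(words[:i] + [gap] + words[i + 1:])
            fillers[pattern].add(word)
    return fillers


sentences = [
    "the cat was here today",
    "the cat came here today",
    "the cat went here today",
    "the cat sat here today",
]

fillers = gapped_patterns(sentences)

# Keep only the patterns whose hole was filled by more than one word.
shared = {p: w for p, w in fillers.items() if len(w) > 1}
print(shared)
```

On these four sentences the only pattern that survives is 'the cat _ here today', and the words grouped in its hole (was, came, went, sat) are the ones you would treat as semantically related.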
