Ai Dreams Forum

Member's Experiments & Projects => General Project Discussion => Topic started by: LOCKSUIT on December 25, 2019, 09:53:06 PM

Title: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 25, 2019, 09:53:06 PM
Here are most of my notes, images, algorithms, etc., summarized/unified in their most recent forms. I am 24 and started all this work at 18. I make discoveries in my brain using mostly vision (visual language: shapes, cats, etc., which have context and explain each other like words in a dictionary, a small-world network of friendly connections).

https://www.youtube.com/watch?v=Us6gqYOMHuU

I have 2 more videos to share that are not yet uploaded, and 2 more notes. The last note will have some more very recent good data, but the ones not yet given are of less immediate importance. The long file does still have a lot of recent knowledge in it, though. It's better you know all of it, or at least the movie.

notes
https://paste.ee/p/mcnEk
https://paste.ee/p/kQLCx
https://paste.ee/p/CvSsB
https://paste.ee/p/lJYMP
https://paste.ee/p/EmCZt

Code-only of my advanced-ngram 'gpt2':
https://paste.ee/p/7DG3M
https://paste.ee/p/XvVp5
result:
The software was made on a
The software was made on a wide variety of devices, and operating apps and applications that users can easily read as an app for android. It is a bit of a difference, but i was able to get it. The developers are not going to make it through a web applications, and devices i have seen in the running for the mobile apps. Applications allows users to access applications development tools, and allow applications of the app store. A multimedia entertainment entertainment device, and allows platforms enabled access to hardware interfaces. Using a bit of html application app developers can enable users to access applications to investors, and provide a more thorough and use of development. The other a little entertainment media, and user development systems integration technology. Applications allows users to automatically provide access to modify, optimize capability allows users to easily enable. Both users and software systems, solutions allowing owners software solutions solutions to integrate widgets customers a day. And if you are accessing services product, and mobile applications remotely access to the software companies can easily automate application access to hardware devices hardware systems creators and technologies. Builders and developers are able to access the desktop applications, allowing users access allows users to
((I checked the 400MB; the copy-pastes are not too long, only 3-5 words at most))
https://www.youtube.com/watch?v=Mah0Bxyu-UI&t=2s

I almost got GPT-2 understood, as shown at the end of the movie, but I need help. Does anyone understand its inner workings? Looking to collaborate.

I recommend you do this as well and mentor each other.

more data by my top-of-mind pick:
AGI is an intelligent Turing Tape. It has an internal memory tape and an external memory tape - the notepad, the desktop, the internet. Like a Turing Tape it decides where to look/pay attention to, what state to be in now, and what to write, based on what it reads and what state it is in. The what/where brain paths. It will internally translate and change state by staying still or moving forwards/backwards in spacetime. It'll decide whether to look at the external desktop, and where to look - notepad? where on notepad? internet? where on internet?
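
The read / decide state / write / move loop described above can be sketched as a toy Turing machine. This is only a minimal illustration: the `run` helper, the state names and the `FLIP` rule table are made up here, and it is a plain Turing machine, not an AGI.

```python
from collections import defaultdict

def run(rules, tape, state="start", head=0, max_steps=1000):
    """Run a tiny Turing machine until it reaches the 'halt' state."""
    cells = defaultdict(lambda: "_", enumerate(tape))  # "_" is the blank symbol
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells[head]                         # what it reads
        state, write, move = rules[(state, symbol)]  # next state, what to write, where to look
        cells[head] = write
        head += {"L": -1, "R": 1, "S": 0}[move]      # move backwards, forwards, or stay still
    tape_out = "".join(cells[i] for i in range(min(cells), max(cells) + 1))
    return tape_out.strip("_")

# Hypothetical rule table: flip every bit, halt at the first blank.
FLIP = {
    ("start", "0"): ("start", "1", "R"),
    ("start", "1"): ("start", "0", "R"),
    ("start", "_"): ("halt", "_", "S"),
}

print(run(FLIP, "1011"))  # prints 0100
```

Everything the machine does is a lookup keyed by (current state, symbol read), which is the "based on what it reads and what state it is in" part.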

It's given Big Diverse Data and is trying to remove Big Diverse Data (Dropout/Death) so it can compress the network to lower Cost/Error and hence learn the general facets/patterns of the universe exponentially better, while still being able to re-generate missing data despite having a small world network (all quantized dictionary words explain each other). It uses Backpropagation to adjust the weights so the input will activate the correct node at the end. That node can be activated by a few different sorts of images - side view of cat, front view of paw, cat ear, mountain - it's a multi-dimensional representation space. Since the network Learns patterns (look up word2vec/GloVe, it's same as seq2seq) by lowering error cost by Self-Attention evolution/self-recursion of data augmentation (self-imitation in your brain using quantized visual features/nodes), it therefore doesn't modify its structure by adjusting existing connections (using weights/strengths) to remove nodes; rather, it adjusts its structure by adjusting connection weights to remove error, and ignores node count Cost.

Intelligence is defined as being flexible/general using little data, but walker bots are only able to solve what is in front of themselves, we need a thinker like GPT-2, and you can see the mind has evolved to simulate/forecast/predict the future using evolutionary mental RL self-imitation self-recursion of data. And intelligence is for survival, immortality, trying to find food and breed to sustain life systems, it's just an infection/evolution of matter/energy / data evolution.
Title: Re: Releasing full AGI/evolution research
Post by: Hopefully Something on December 26, 2019, 06:28:43 AM
Oh my gosh, you're so logical, I did not expect that. Use your power for good.  O0
Title: Re: Releasing full AGI/evolution research
Post by: Hopefully Something on December 26, 2019, 08:11:00 AM
You showed the optical illusions I posted! And talked about redundunduncy :)
Crazy text you is real life me, and real life you is chilled out Richard Feynman... This was too much I need to sleep...
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 26, 2019, 02:17:24 PM
I think you can skip the heterarchy maybe....simply the hierarchy nodes get activated ex. nodes cat, etc, which parallelly leaks energy to their nearby context ex. 'the cat ate' 'the cat ran' 'our cat went' and these handles leak energy to nearby context 'the dog ate' 'the dog ran' 'some dog went' and so on for proving 'some'='our' as well, including if cat=zebra/horse and dog=zebra/horse then cat=dog! Hence no w2V, just on the fly activated by leaking connections. Solves typos, rearranged phrases, unknown words ex. superphobiascience, alternative words, related words, names, references ex. it/he, and blanks. Then for the candidate words, the winner is the one that is most frequent in knowledge (has energy) and in Working Memory Activation Context (which fade energy/leak), most related to story word (activation leak), most favorite (has energy). This is how to recognize/understand a window where you look/how wide, then which candidate Next Word to choose, then you may also adapt it by translating it too.

It can be run faster to do it other ways but to understand it can be easier by other ways like this.
Title: Re: Releasing full AGI/evolution research
Post by: Art on December 26, 2019, 02:54:09 PM
Finally we get to hear the real Lock!! Good one! O0
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 26, 2019, 03:58:54 PM
The hierarchy can self-organize to lower node count Cost (error) by re-arranging the connections that exist, to learn Byte Pair Encoding (segmentation) on the fly too, not just translation or sequence-building on the fly. You don't have to look at all areas/layers of the hierarchy.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 26, 2019, 04:56:20 PM
Attention types: it decides which sense to pay attention to (ex. a buzzing noise or pop-up goals), where/how wide to window on that sense, and which Next Word to predict using more attention.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 26, 2019, 05:38:12 PM
Suppose you had a GPT-2 prompt only 1 word long, ex. scratching. It could be seen as scratch ing, so let's say we had only scratch. The word to predict next is one seen after it before in the data, and the more frequent one is more likely chosen. The word scratch can also match related words (itch, scrap, etc.), and then we see what frequent word comes next. So the closest related word and the most frequent word is chosen.

scratch the x64
scratch it x58
scratch back x38
itch my x76
itch shoulder x50

So my is predicted because itch=scratch a lot and my is seen after itch x76 times. But we aren't done yet. The candidate words/tokens are now given an extra score: a relational score to the story words in your input prompt, here 'scratch' (no 'itch' exists in it). So we end up picking shoulder, say. We may also then translate shoulder to adapt it, ex. into stomach if your story was about a stomach itch.
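
A rough sketch of that scoring, using the counts from the table above. The relatedness numbers in `RELATED` and the story boost in `STORY_BOOST` are assumptions picked only for illustration (the post gives no values), and the function names are hypothetical.

```python
from collections import Counter

# Counts copied from the table above: (previous word, next word) -> times seen.
BIGRAMS = Counter({
    ("scratch", "the"): 64, ("scratch", "it"): 58, ("scratch", "back"): 38,
    ("itch", "my"): 76, ("itch", "shoulder"): 50,
})

# ASSUMED numbers, not from the post: how related each stored context word is
# to the prompt word, and how related each candidate is to the story word.
RELATED = {"scratch": {"scratch": 1.0, "itch": 0.9}}
STORY_BOOST = {"shoulder": 1.6}

def score_candidates(prompt_word, use_story_boost=False):
    """Score every next-word candidate by frequency * relatedness (* story boost)."""
    scores = Counter()
    for (prev, nxt), count in BIGRAMS.items():
        similarity = RELATED.get(prompt_word, {}).get(prev, 0.0)
        if similarity == 0.0:
            continue  # this stored context is unrelated to the prompt
        boost = STORY_BOOST.get(nxt, 1.0) if use_story_boost else 1.0
        # Keep the best evidence found for each candidate next word.
        scores[nxt] = max(scores[nxt], count * similarity * boost)
    return scores

def predict(prompt_word, use_story_boost=False):
    return score_candidates(prompt_word, use_story_boost).most_common(1)[0][0]

print(predict("scratch"))                        # prints my
print(predict("scratch", use_story_boost=True))  # prints shoulder
```

With these assumed numbers, 'my' wins on frequency and relatedness alone (76 x 0.9 beats 64), and 'shoulder' wins once the story score is applied. Different assumed numbers would give different winners.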

If we have 2 words in our generated story so far: 'Scratch stomach', we can predict the next word using either word but we want to use both, so we see what nodes/related nodes including positional rearranging exists in the data ex. scratch stomach, itch bladder, bladder itch, etc, and the more distant they are positioned the lesser of course a match it is.

So say we got a 'first I go into the lake and stay there, second I cold freeze in the winter', the 'first' votes on 'second' because it could be 1st or etc translated, frequency, relation, but it is far back and there's other words, so it definitely has less vote being 1/10th the weight and fading in energy by this time, yet it does have a strong vote too.
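
A small sketch of that distance-weighted voting. The 1/distance falloff and the vote table are assumptions for illustration; the post only says that far-back words vote with less weight.

```python
def vote_weights(context):
    """Weight each context position by 1/distance from the slot being predicted.

    The 1/distance falloff is an assumption, not something the post specifies.
    """
    n = len(context)
    return {i: 1.0 / (n - i) for i in range(n)}  # keyed by position, since words may repeat

def predict_next(context, votes_for):
    """Sum distance-weighted votes; votes_for maps a word to {candidate: strength}."""
    totals = {}
    for position, weight in vote_weights(context).items():
        for candidate, strength in votes_for.get(context[position], {}).items():
            totals[candidate] = totals.get(candidate, 0.0) + weight * strength
    return max(totals, key=totals.get) if totals else None

context = "first I go into the lake and stay there".split()
# Hypothetical vote table: 'first' strongly suggests 'second',
# while the nearby word 'there' weakly suggests 'is'.
votes_for = {"first": {"second": 5.0}, "there": {"is": 0.5}}
print(predict_next(context, votes_for))  # prints second
```

Here 'first' is 9 positions back, so its strong vote is scaled down to 5/9, which still beats the weak vote of 0.5 from the adjacent word.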

Say we find 'stomach itch' but it has a word in the middle or one missing, ex. 'stomach really itch' or 'stomach'; this also has less score but still has some.

What if I say 'second, this is that, third, i am this, but i won't say the next because i refuse to repeat myself'. Here I pay attention to ignore the word 'fourth' being said.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 27, 2019, 01:43:14 PM
Let me know if you want more, if no one says so then I assume no one is interested.
Title: Re: Releasing full AGI/evolution research
Post by: goaty on December 28, 2019, 10:49:11 AM
If you're sure this is the future of A.I., you should just pursue it further yourself. If you were to release it all early, completely free, what's new about it theoretically that's exclusive to your system?
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 28, 2019, 01:08:57 PM
It is free. I am sharing my knowledge/current stage. There's a lot of new knowledge there. My AGI description talks on its own to itself and asks us/tells us knowledge. It unsupervisedingly collects internet data and researches, and learns patterns/building blocks, then it Answers Questions using RL using 4 steps. It is transforming the data, old to new, then analyzing it again, old to new, this is bootstrapped Self-DataRecursion, data evolution, this is how evolution happens. And bigger brains are smarter by more context data. The tools/skills are another part for that and can be described by text/visual language of the universe. I also tell you what will happen in the end etc.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 28, 2019, 04:47:21 PM
Storing, Forgetting, and Recall. Attention does all. Which sense or where to pay attention to. How much attention. To recognize it. To choose the Next Feature by looking at story words and which to ignore (forget). To Adjust the Next Feature. The nodes that store/recall are attentive / pop up as agenda questions.
Title: Re: Releasing full AGI/evolution research
Post by: Korrelan on December 28, 2019, 05:25:57 PM
Quote
Korrelan, I'm really sorry you won't be able to fully process the green/red colors being color blind but, it doesn't matter much lol.

Thanks for the individual consideration Lock, it's much appreciated, but it's ok because I can read.

You did leave that comment in though, before you posted it on every single forum known to man.  ;D

 :)

ED: OH! I see you have removed it from one non Email based, editable forum... cool.

 :)
Title: Re: Releasing full AGI/evolution research
Post by: goaty on December 28, 2019, 05:54:04 PM
Looks like you've got a good system (I didn't think of it myself, sounds like a good technique), but is it intelligent yet? If not, your work isn't finished, just somewhere along.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 28, 2019, 06:28:15 PM
I spammed OpenAi with it too hehe, more 2 come

I call this ignition phase, spread it like fire
Title: Re: Releasing full AGI/evolution research
Post by: Korrelan on December 28, 2019, 06:30:59 PM
You did remove my name though? Yes?

Quote
I spammed OpenAi with it too hehe, more 2 come

Lock answer me... you have taken my name off this spam posting you're doing?


https://agi.topicbox.com/groups/agi/T7cbcba9a1ae63532/releasing-full-agi-evolution-research

https://groups.google.com/forum/#!topic/artificial-general-intelligence/U8-wPGwON_0

https://www.mail-archive.com/agi@agi.topicbox.com/msg03960.html

So not funny lock...

Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 28, 2019, 07:37:55 PM
you're not on the openAI's

and wont be in future posts/redirects

my notes pastees have you as K lol



That's where the doctors hang out, now that you infected everyone onto it.
Title: Re: Releasing full AGI/evolution research
Post by: Korrelan on December 28, 2019, 07:55:41 PM
Remove my name from all your spams.

 :idiot2:
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 28, 2019, 08:00:50 PM
only 2 allow but i'll try

All other names have already been made into code names.
Title: Re: Releasing full AGI/evolution research
Post by: Hopefully Something on December 30, 2019, 01:11:05 AM
So the meaning of a word is not contained within it but is instead described by the shape of the web of related words as observed from the vantage point of the word in question? Context is the actual word? 
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 30, 2019, 03:12:33 PM
Understanding Compression

To learn how to break the law, of physics, we must understand it better.

https://paste.ee/p/kQLCx

"So the meaning of a word is not contained within it but is instead described by the shape of the web of related words as observed from the vantage point of the word in question? Context is the actual word?"
Yes, a given particle of Earth is defined by all of Earth context (and then it re-checks the all to each again, a self-attentional-SelfRecursion of Data-Improvement like editing a Paper), an exponential explosion of heat is given to the core of Earth and self-extracts free energy from burning fuel. Brains do this, atoms do it, galaxies do it. That's why magnetic domains align and propagate brain waves in brain, team of brains, magnets, etc. AGI will be a collaborative project and already is too, we share data. Let's hug each other (real tight hug).

The big bang was unstable and decompressed. Planets re-compress. Atoms do it. Galaxies do it. A brain compresses to learn the facets of the universe by using data compression, so that it can burn fuel and extract free energy/data from old data (just like batteries, gasoline, and our stomachs). Data evolution, data-Self-Recursion. Lossy/Lossless compression both transform data from one form to another. When you compress a file losslessly, it actually is destroyed and gone because it isn't the same file/data. Compressing/firing employees does this too. Luckily, being lossless, you can re-generate it back at a click of a button (or if you destroy a drawing on your desk and re-draw it from memory), however it takes time to evolve it back, sometimes a VERY long time. Brute force to find the smallest compression of the Hutter Prize file would take extremely long. Intelligence is all about speed, evolving domains of nodes (cells, neurons, brains, cities) to find which out-pace each other. This aligns the domains of the brain/group to propagate brain waves faster through the cluster and have a bigger electro-magnetic potential. If we use lossy compression, you can actually get the exact file back but it takes much longer. A system in space will collect data to grow, then decompress, a self-extracting drive. This decompression is exponentially explosive and results in smaller agents that evolve to compress-extract so they can resist change. Energy (photons) propagates forward but can be pulled in by gravity and will loop around like in a motionless, cold battery. Change=energy release. Unstable. Equilibrium is the opposite. We have seen an algorithm can be run perfectly many times, compress, decompress, compress, repeat. To do this requires a form of equilibrium. Wear and tear affects it though. Yet our sperm/eggs have seen many generations. If the universe contracts back, Earth can emerge again by this self-organizing/attention physics.
Different systems and their sizes evolve differently but are all based on electromagnetic compression/decompression; Earth, if it became nanobots, would simply grow in size and resist change/death approx. better. Lossless compression is so fast because it's all contained in such a small place like a core rod and is very hot/related, lossy requires ex. the whole Earth, a form of brute force and exponential hints/data evolve it back faster. Lossless, locally, without brains to discover the data, requires only little data. The bigger a system is the bigger file you can re-create from nothing - a human brain can re-generate back almost anything. Lossless, based on how many particles are in the defined system (uncompressed file size which needs a computer to store/run it), has a limit of how small it can become and so does lossy because Earth is finite in size during a given period quantized and a file can be re-generated back quite fast if some of it is still around - the lossy file, even if incinerated, can be re-generated back based on how many particles make up Earth. Here we see a file can be compressed deeper the bigger the file is or the bigger the Earth is. With such little of the file left (even just the remaining physics if incinerated) it can come back based on large context but has a limit/need (size of Earth/fileData, time, and compute).

We see the communication/data tech builds on itself exponentially faster, bigger data = better intelligence and extracts exponentially more/better data (per a given system size). Earth is growing and heating up by collecting more mass and extracting/utilizing exponentially more energy like nanobots will when they come. We will harvest Dyson Spheres. Our goal to resist change by finding/eating food and breeding (Darwinian survival) could Paperclip Effect us and explode ourselves! A cycle of compress, decompress. Our goal is to compress data in our files, brains, teams, but also to expand our colony of data. Why? To resist change, to come to equilibrium (end of evolution for a given system exponentially faster). These colony mutants/tribes have longer stable lives being so large and using their size to extract so much. The bigger a system is the less it changes. Imagine destroying all instantly-repairing nanobots superOrganism? Can't. And, the bigger a system the more weight/vote/context interaction (heat) is transmitted/infected, not just to extract free knowledge/heat (motion/energy) but also to fix issues/damage. My body/knowledge stay almost the same yet my cells/blood all change their spots for new ones, the air stays the same yet it blows around Earth, the heat in my walls stays the same yet the heat moves around, Earth is a fractal of pipes, veins, roads, and internet connections to propagate energy, ideas, blood, waste, traps, and negative electricity, simply to loop it around and re-use it. Distribution of data allows global, not just local, flow/alignments. It moves around and the system can resist change, repair, or emerge. Our goal is to resist change by using large context/collaboration by aligning random domains to get free energy/knowledge. We have to collect/grow big and digest/extract it so we can resist change better.
We are doing both compression and decompression of data/energy and possibly are trying to equal them out so we can come to equilibrium jussst right in the middle of the 2 opposites/attractors. The system we become will be exponentially repairing/immune to change - compression and decompression, however we may be growing larger but less dense as it does so to become approx. more immortal. We will likely need an exhaust/feed though, we will need a fine tuned food source and radiation exit for our global utopia sphere/galactic disc loop string.

So we should be very interested in compression, and decompression, i.e. Bigish Diverse Dropout - which data to destroy and remove/ignore/forget, and Big Diverse Data collection/creation by extracting free data using old data context vote/weight in. In the brain, we do compression and can basically still re-generate the ex. Hutter Prize file despite having a small decompression brain. The need to do both ignore/attend are the same process in Dropout or data collecting/harvesting, and the decompression process when ignore/attend which to extract/collect new data from old data is also the same process, and the compress/decompress processes are the same process too - which to remove and which to attend however to attend fast we need to remove fast, hence these 2 steps are not really the same process. However when you do compress data and create a brain/team, it is easy to attend to the remaining keys. During extraction, you use what you Learned (patterns) to decide what to Generate. So they are both 2 different processes I guess. Btw, when you build a heterarchy you need the hierarchy first, and may not even need the heterarchy! The connections of context handles are already laid. I was going to say, making relational connections doesn't compress data on its own yet in effect does, though.

Some concepts above were compression, decompression, equilibrium (no change/death), exponentiality. We saw how we grow mutants that resist change better by using both compression/decompression (destruction of neurons/ideas/employees/lives/Earth, and creation of such) so we can come to equilibrium exponentially faster by large context weight (which exponentially helps compression, and extraction during Generating (ex. GPT-2's 40GB and 1024 token view)). I'm still unsure if we are just growing and exploding. If the universe only expands then we will likely radiate.

Compression looks for patterns and leads to faster domain alignment/propagation and exponentially faster large brain waves/free energy extraction/re-generation from nothing. If we want to compress the Hutter Prize the most, we will need to stop it from generating multiple choices from a given context (it still uses the context). We could sort all phrases in the file like 'and the' 'but the', 'so I' 'then I', and force it to discover the concept that leads to the re-used code 'the' or 'I'.
Title: Re: Releasing full AGI/evolution research
Post by: Hopefully Something on December 30, 2019, 09:19:07 PM
Resisting change is still change though :P. I'd say the goal is to resist entropy.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on December 30, 2019, 09:30:43 PM
Taking the right path is a lot less change than bumping into the burglar with a shotgun 0O. They simply breed/rejuvenate more than they die. The agent stays most similar when from statue it bends down to grab an apple.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on January 04, 2020, 04:55:32 AM
I've got 10 YTbe subscribers now lol.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on January 08, 2020, 01:37:01 AM
I have my man from India on the compressor algorithm for 25USD. I am learning how they work currently and will shortly post my formal formula for AGI. In the meantime see my entries here: https://agi.topicbox.com/groups/agi

Layer Norm......see it now is just >
https://knowledge.insead.edu/operations/warning-do-not-just-average-predictions-6641

GANs compress data...they generate realistic data...so does lossless prediction...the data fed to it, allows it to work on unseen data...because its so similar

https://royvanrijn.com/blog/2010/02/compression-by-prediction/
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on January 09, 2020, 06:58:38 AM
Me and my employee got the compression working. It is 5 bits per character; normally each char is 8bpc. So the 100MB enwik8 would be about 63MB. Good for a start.
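
The size estimate is just a ratio of bits per character. As a quick check, assuming the input is exactly 100,000,000 bytes (the helper name is made up for illustration):

```python
def compressed_size_bytes(original_bytes, bits_per_char, raw_bits_per_char=8):
    """Estimated output size when every input character costs bits_per_char bits."""
    return original_bytes * bits_per_char / raw_bits_per_char

print(compressed_size_bytes(100_000_000, 5) / 1e6)  # prints 62.5 (MB)
```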
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on January 18, 2020, 05:35:34 PM
My order-2 compressed the 100MB enwik8 file into exactly 40,572,450 bytes. It took exactly 12 hours lol in Python. The dictionary (I included it in the 40MB) was 2,069,481 bytes. The decompressor was 4,910 bytes (also included in the 40MB). Code is attached for the non-believers. It's in Python so you know it was me, cus they are usually in C++ for speed. You can try it on the small input I uploaded. https://paste.ee/p/Cd7Va

The world record is 15MB. 25MB away lol!!!
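
For scale, 40,572,450 bytes out of 100,000,000 works out to about 3.25 bits per character. The sketch below shows the general idea of an order-2 model (predict each byte from the two bytes before it, and charge -log2 of the predicted probability). It is not the attached code: the add-one smoothing and the function name are assumptions, and it only measures the ideal code length rather than writing a compressed file.

```python
import math
from collections import defaultdict

def order2_code_length_bits(data: bytes) -> float:
    """Ideal code length of `data` under an adaptive order-2 byte model.

    Each byte is predicted from the two bytes before it, using the counts
    seen so far with add-one smoothing over 256 symbols (an assumed choice).
    A good arithmetic coder could come close to this total.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> byte -> count
    totals = defaultdict(int)                       # context -> total count
    bits = 0.0
    for i, byte in enumerate(data):
        context = data[max(0, i - 2):i]             # up to two previous bytes
        p = (counts[context][byte] + 1) / (totals[context] + 256)
        bits += -math.log2(p)                       # cost of coding this byte
        counts[context][byte] += 1                  # learn after predicting
        totals[context] += 1
    return bits

sample = b"the cat ate. the cat ran. the cat sat. " * 50
bits = order2_code_length_bits(sample)
print(f"{bits / len(sample):.2f} bits per character, "
      f"about {bits / 8:.0f} bytes instead of {len(sample)}")
```

Because the model updates its counts only after each prediction, a decompressor can rebuild the same counts byte by byte, which is what makes prediction usable for lossless compression.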
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on February 24, 2020, 04:18:48 AM
It will move incredibly fast. AGIs can think/move a ton faster and replay skills perfectly and erase bad parts. Can nest deep into thoughts "the girl who saw the guy who wanted the man who said to him to go was here". Recall perfectly, have more memory, longer attention, don't sleep eat poop nag etc. AIs live longer than humans, can clone/download skills etc. Many sensors/motors, many types of them, 3D vision using MRI and sims, wireless communication of visual thoughts, full cooperation, fully times updates shared, can store facts when read them instantly and fast, can see/feel nanobots to control them - we can't, and a lot lot more I won't list here. Advanced nanobots will eat Earth in a day. It's really cheap to gather microscopic data and make small replicators to up your computer fabrication and data intake and manipulation accuracy. The more data/ processors/ arms/ eyes they get and better ones they get, the more such will they get!

Inventing 1 AGI and cloning it on a mass fabrication scale is all we need. The most powerful thing will not be inventing 1 AGI per se, it will be cloning workers on cheap replicating computer hardware, data, arms and eyes. I.e., scaling AGI and inventing AGI is all we need.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 22, 2020, 01:46:48 PM
Lol

https://rule-reasoning.apps.allenai.org/?p=The%20squirrel%20is%20young.%20%0AThe%20tiger%20is%20rough.%20%0AThe%20tiger%20eats%20the%20bear.%20%0AIf%20something%20eats%20the%20bear%20then%20it%20is%20red.%20%0AIf%20something%20is%20red%20and%20rough%20then%20the%20squirrel%20likes%20the%20tiger.&q=The%20squirrel%20likes%20the%20tiger.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 26, 2020, 12:57:39 PM
Huge breakthroughs I made.

See the link below if you're new to neural realistic_future generators for text....aka AGI attention:
https://aidreams.co.uk/forum/index.php?topic=14561.75

Distributed networks are the most powerful Systems. Brain and city government for decision forming. They are most robust. And larger Systems are the most powerful Systems. Big brains (big diverse data) and big teams. They are most robust. Both allow you to go deep, fast, building a large concept/prediction based on many parts. With these de-centralized networks, you have duplicate data so that no human node or brain memory node has to be accessed/used by billions of tasks nor take a long time to complete/reach from all nodes globally. The sum of nodes recreates a node. Prediction makes the future based on the past data/world state, and the human brain keeps an energized dialog state in its local agenda focus while a global sub-conscious attention votes on more-so un-desired nodes as well. Prediction is the decision process that is based on surrounding context in an "environment", be it a womb or a neuron. There are many factors/conditions that trigger actions / thoughts (same thing). To make a prediction of the future, you use the past context. Text generators do this. An exact match is the most basic way to see what occurs next. Word/letter frequency is used to choose more likely predictions. The brain is a physics simulator, with its image and sentence "thoughts". Just the act of a word or image/object appearing next results in truth. In big data, you can get exponentially more out of it using intense/deep "translation" instead of exact matches only. So even if the truth appears to be said many times, it can be overridden by invisible truth deep in the data that the data barely says it wants in life. It's all based on the frequency of what comes next in text. Deep translation lets it gather all the truth it needs. It's a simulation based on real data. This "deep translation" is the very evolution/"AGI" we seek. Data self recursively evolves itself and we do this in our own brain as well until we come to a settled down colder equilibrium.
In the world before brains that simulate the world, the instinctive short term direct response primitive brain and especially the environment itself like ponds and wombs, use context to evolve itself by making decisions. But the first doesn't remember the past, and the second only remembers the past. The third compares the past to previous states.

So, all based on direct frequency (truth), Deep Translation (for human brains that simulate, not primitive, not raw physics) can extract new data from old data (hidden truth) and decide the future prediction (new truth), evolving the mass of data you're using to do this. Desired reward guides this to desired outcomes.

Deep Translation improves prediction for the Hutter Prize in all ways. And notice that attention for deciding which question to ask yourself/others, or to act it out in motors for real, is based on past context - the current state of the system/world.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 26, 2020, 01:24:33 PM
See last post.

Oh and see, told yous, regenerating/repairing is used in all AI, here it comes up again:
https://www.youtube.com/watch?v=bXzauli1TyU
try it:
https://distill.pub/2020/growing-ca/
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 26, 2020, 01:40:17 PM
they even mention "Embryogenetic Modeling"
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 26, 2020, 02:09:40 PM
i found that link btw AFTER i wrote all meh text. See I'm spot on in every way lol....translation, context, etc
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 26, 2020, 03:10:04 PM
Similarly we inhibit questions when we see the outcome we wanted, which is like cells building/branching during organ creation/regeneration.



notice the link says

"The biggest puzzle in this field is the question of how the cell collective knows what to build and when to stop."

Based on context and hardcoded desires, it grows its way forward.

"While we know of many genes that are required for the process of regeneration, we still do not know the algorithm that is sufficient for cells to know how to build or remodel complex organs to a very specific anatomical end-goal"

“build an eye here”

"Imagine if we could design systems of the same plasticity and robustness as biological life: structures and machines that could grow and repair themselves."

"We will focus on Cellular Automata models as a roadmap for the effort of identifying cell-level rules which give rise to complex, regenerative behavior of the collective. CAs typically consist of a grid of cells being iteratively updated, with the same set of rules being applied to each cell at every step. The new state of a cell depends only on the states of the few cells in its immediate neighborhood. Despite their apparent simplicity, CAs often demonstrate rich, interesting behaviours, and have a long history of being applied to modeling biological phenomena."

"Typical cellular automata update all cells simultaneously. This implies the existence of a global clock, synchronizing all cells. Relying on global synchronisation is not something one expects from a self-organising system. We relax this requirement by assuming that each cell performs an update independently, waiting for a random time interval between updates"

Both local, and global shape of the context (what, and where (position)) affect the prediction.
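
A minimal sketch of the asynchronous update described in the quote above, in Python. The growth rule here (a cell turns on if anything in its 3x3 neighbourhood is on) is a toy assumption for illustration, not the learned rule from the article, and the names `step`, `grow` and `fire_rate` are made up for this example. Each cell applies the same local rule, but only "fires" with some probability on each tick, so there is no global clock.

```python
import random

def step(grid, fire_rate, rng):
    """One CA tick: same local rule everywhere, but each cell only
    updates with probability fire_rate (no global synchronisation)."""
    h, w = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for y in range(h):
        for x in range(w):
            if rng.random() >= fire_rate:
                continue  # this cell waits and keeps its old state
            # Toy rule (assumption): alive if any cell in the 3x3
            # neighbourhood, including itself, is alive. Edges wrap.
            alive = any(
                grid[(y + dy) % h][(x + dx) % w]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
            new[y][x] = 1 if alive else 0
    return new

def grow(size=7, steps=3, fire_rate=0.5, seed=0):
    """Grow from a single seed cell in the centre of an empty grid."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    grid[size // 2][size // 2] = 1
    for _ in range(steps):
        grid = step(grid, fire_rate, rng)
    return grid

if __name__ == "__main__":
    for row in grow():
        print("".join("#" if c else "." for c in row))
```

With `fire_rate=1.0` this falls back to the usual synchronous update; lower values give a ragged, uneven growth front because neighbouring cells update at different times.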

"We can see that different training runs can lead to models with drastically different long term behaviours. Some tend to die out, some don’t seem to know how to stop growing, but some happen to be almost stable! How can we steer the training towards producing persistent patterns all the time?"

Sounds like GPT-2. When to finish a discovery sentence. Keep on topic until reach goal.

"we wanted the system to evolve from the seed pattern to the target pattern - a trajectory which we achieved in Experiment 1. Now, we want to avoid the instability we observed - which in our dynamical system metaphor consists of making the target pattern an attractor."

"Intuitively we claim that with longer time intervals and several applications of loss, the model is more likely to create an attractor for the target shape, as we iteratively mold the dynamics to return to the target pattern from wherever the system has decided to venture. However, longer time periods substantially increase the training time and more importantly, the memory requirements, given that the entire episode’s intermediate activations must be stored in memory for a backwards-pass to occur."

That sounds like Hutter Prize compressor improvement. Takes more RAM, takes longer, for better regeneration to target from Nothing (seed, compressed state).

"it’s been found that the target morphology is not hard coded by the DNA, but is maintained by a physiological circuit that stores a setpoint for this anatomical homeostasis"

We want to regenerate shape (sort the words/articles), and grow the organism/sentence as well. But avoid non-stop growth past the matured rest state goal and stop de-generation.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 27, 2020, 08:49:13 PM
A good AGI predictor can answer this. Can you? Try!

"Those witches who were spotted on the house left in a hurry to see the monk in the cave near the canyon and there was the pot of gold they left and when they returned back they knew where to go if they wanted it back. They knew the keeper now owned it and if they waited too long then he would forever own it for now on."
Who owns what?
Possible Answers: Witches own monk/witches own canyon/monk owns house/monk owns cave/monk owns gold/cave owns pot/there was pot/he owns it
Title: Re: Releasing full AGI/evolution research
Post by: krayvonk on March 27, 2020, 09:05:41 PM
I think that sentence is confusing and if the a.i. couldnt answer it probably wouldnt be so bad a thing.

The keeper of what?  u should dictate it.

Uve got something like word2vec in yours dont you - did u see my post?
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 27, 2020, 09:16:51 PM
omg! someone answer it lol....it's EASY.........easy!!!!.........plz?

no w2v yet, but i'll get my own in there and i'm coding it! :)))))))
Title: Re: Releasing full AGI/evolution research
Post by: krayvonk on March 27, 2020, 09:31:52 PM
If ur understanding the text so literally,  the robot would be really easy to trick.
Yeh, its fine if your catering the sentences for it, and its still pretty cool i guess if it could parse its way through that mumbo jumbo you wrote.

Giving the robot a bullshit detector is very important,   its the big "assimilation of lies" problem.   There is a solution tho... thats what more important to me right now.

But I guess all NLP has that problem just about,  so no biggie.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 27, 2020, 09:40:19 PM
it DOES flex to the text.....to get the most probable answer it can get.....u just cant answer my test above :)
Title: Re: Releasing full AGI/evolution research
Post by: krayvonk on March 27, 2020, 10:01:39 PM
Monk owned the gold?
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 27, 2020, 10:04:56 PM
Huuuuuuu! U got it!

GOOD JOB! AGI pet deserves a kibble treat. Pats heads lightly. Perr. Open mouth now.

Now we need the AGI TO DO THAT!!! Come on you guys! This is our last stand.
Title: Re: Releasing full AGI/evolution research
Post by: krayvonk on March 27, 2020, 10:08:17 PM
What I want to know,  if its this easy to make A.G.I  how come it wasnt done years ago.  and what about this a.i. winter - lies?  To me it seems.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on March 27, 2020, 10:11:38 PM
Cus you gotta grab it by the balls and say "I'm gonna do it!".

I don't hear anyone hear saying it with enough courage!

That was a real typo btw!
Title: Re: Releasing full AGI/evolution research
Post by: krayvonk on March 27, 2020, 10:42:49 PM
Yeh well what were ppl doing in the 80's  *scratching* their balls?!?!?

The whole vietnam war went thru killed everyone, and why didnt they have drones back then - makes NO sense.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on April 08, 2020, 12:48:03 AM
Way I see motivation/ life is this. Young humans and animals are hyped to enjoy life, not bored of anything yet, and we should stay that way. Try to imagine a really incredible new gift just appeared in your back bedroom, and you feel really excited and want to go rush to go see it. That's true motivation. Dead motivation is giving up or feeling like you've finished the video game. Just start a new game! A reason to live is to have motivation to do so, because staying alive / motivation to do so comes 1st, not 2nd. If we were this motivated, maybe more people would understand we can get things we can't even get right now. And a longer life. Maybe then we could work hard on AGI like we really mean it.
Title: Re: Releasing full AGI/evolution research
Post by: Art on April 08, 2020, 03:16:34 AM
Because they didn't have the technology back then. The Vietnam war wasn't officially declared over until 1975 and there was no such thing as a PC or home computer, let alone small enough technology to understand or implement a small enough form factor to allow such things as Drones to be produced or even conceived!!

We have come a long way in 40 years, from small to large and still can't decide which is better! Small TVs, Small radios, Small ICs, transistors, modules, etc. then larger TVs, Big Screens...Big cars then smaller more efficient cars and the ping pong match continues. But drones in the mid '70s? Afraid not.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on April 28, 2020, 12:41:28 AM
Linking my last big threads of progress:
code - https://aidreams.co.uk/forum/index.php?topic=14561.90
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on April 28, 2020, 10:27:44 PM
Reddit removed my AGI book thread, perhaps it was the spiritual kid I was arguing with or an ignorant "leader". Mods' fault either way.

anyway new post
https://www.reddit.com/r/agi/comments/g9vzeo/how_the_human_brain_does_math_with_examples/
Title: Re: Releasing full AGI/evolution research
Post by: Korrelan on April 28, 2020, 10:55:46 PM
The reason they removed your post is because you had an odd number of Elf Heads, most people confuse Elf and Elve, the latter being an upward electrical cloud discharge.  An Elf Head however is awarded for each logical reference in a thought, after reading your post you only managed an odd number (-3)… so they removed it.

 :)
Title: Re: Releasing full AGI/evolution research
Post by: infurl on April 29, 2020, 01:46:18 AM
Don't give up genai. You are improving, albeit very slowly. It is a long road and you are starting late but it will be enough to have traveled a little way along it.

You need to eat better food or you won't be alive to enjoy it much longer. I'd also recommend researching the concept of empathy. Even if you don't have any, you can always fake it. When people actually like you, they are more inclined to listen to what you have to say.

PS I'm not going to unblock you here yet, but you may take some comfort in the fact that I haven't blocked you anywhere else yet. Given your track record, consider that as a win.
Title: Re: Releasing full AGI/evolution research
Post by: WriterOfMinds on April 29, 2020, 04:18:52 AM
I've been watching /r/agi for a while.  It's not uncommon (seems like every week or two lately) for someone to show up saying, "I'm starting a new AGI project, join me!" or "AGI isn't that complicated, I figured it all out this morning in the shower, bask in the glory of my theories!"  None of these people have demonstrable results, just half-baked ideas -- and after annoying the subreddit residents for a bit, they soon disappear again.  So I think everyone over there is a little jaded.  Can you really blame them?
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on April 29, 2020, 05:14:21 AM
Ya I noticed the kids there, it looked like me in my early days, they are/ were me, yet distasteful as well isn't it - it isn't good they have less experience. I noticed kids wait on every question and are precise, adults will pick only the cherries, kids are much more free timed... good while bad, no free lunch.

It was actually, utterly hilarious, that one kid was talking just like me when I began, like oh ya this is simple, it's do all this by doing this, [overlooking things], etc
Title: Re: Releasing full AGI/evolution research
Post by: infurl on April 29, 2020, 05:33:53 AM
Quote
I've been watching /r/agi for a while.  It's not uncommon (seems like every week or two lately) for someone to show up saying, "I'm starting a new AGI project, join me!" or "AGI isn't that complicated, I figured it all out this morning in the shower, bask in the glory of my theories!"  None of these people have demonstrable results, just half-baked ideas -- and after annoying the subreddit residents for a bit, they soon disappear again.  So I think everyone over there is a little jaded.  Can you really blame them?

I just assumed they were all the OP of this thread, creating a new account every time he got banned.

We used to get a lot of those types here, but it's been a while. I don't miss them at all because they would get abusive when they didn't win any converts to their cause. The Reddit moderators typically ban them before they even get to that stage.

Due credit to our moderators, being crazy here won't get you banned, but being abusive will.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on April 29, 2020, 08:08:46 AM
Consider video. Humans recognize object features, like apple, stem, curve, even a dot. Some apples are green, some are distorted, some may look like a piano (almost never lol). It is the shape of a piano, and the color and texture zig zags of an apple, so depending on which you focus on you will notice one in isolation. Of course to recognize 'apple' fully will require not just texture and color but also shape, meaning 'apple' doesn't activate fully because it has the shape of a piano in our case here. But it can be sure it is most likely an apple for various reasons, and edible. We can recognize an apple being thrown as 'motion' and 'motion of apple'. Humans seem to write text as they see vision, like I lift my arm and throw a ball to Tom. So humans will see a sequence of these features like apple...moving...hits wall....apple disappears....wall remains. And humans will remember important features like food and will talk about them on their own to create related data. Humans use text as their gateway to say what they are thinking. Apples, motion, and shapes appear in multiple places on Earth, there's re-occurrence. Vision can store a sequence in an image or a video of images. Position matters less in a single frame unless you pay attention to, let's say, a pirate map painting a trail to an X like a>b>c>d>X...in that case you see a sequence in a single image! However how long did you look at the image? Ah so it was a video, of what you saw in order. You always see 1 feature at a time, it may be 'apple', or 'multiple fruit', and when it is 'apple' it may slightly activate 'multiple fruit' if the surroundings seep into your attention by accident.
Title: Re: Releasing full AGI/evolution research
Post by: Hopefully Something on April 29, 2020, 03:07:21 PM
It's cool to realize that there are more things than things, because motions and arrangements of visible things are useful too... So, you memorize those motions and arrangements, then recognize them one at a time! But how do you realize what’s useful to recognize? We do most of it with reward systems and the guidance of symbols, (which can indicate indirectly rewarding things). We have no inherent reward system for a lamp, but lamps can lead to things with direct rewards (limbic system), which is what we’re after. We can adapt a decreasing limbic response to indirectly rewarding things to help us remember them, and their order.

An arrangement of symbols can immortalize, and transfer, a useful (leading to rewards) separation of features from the background, * (which I guess would be the very first noticed thing) *. But then how do our reward systems know what’s useful and warranting a reward? Evolution causes everything to have reward systems aligned with the survival of its species. Evolution for AGI would = changes which cause survival. The exact nature of the reward system should match the capabilities of the AGI. The reward system should morph to maximize survival, which should prompt alterations to the AGI’s capabilities to maximally satisfy the new reward system. By then the environment will have changed the reward system even further, which would indicate new useful changes to the AGI’s capabilities. Does this go on forever? Things can't improve forever...
Title: Re: Releasing full AGI/evolution research
Post by: Yervelcome on April 29, 2020, 05:26:50 PM
The "lamp" example is instructive. If you want to classify it based on rewards and sub-rewards, you can split it into two types of utility:

1. How to use it - "I'm in the dark, and I want light to see".
2. What to call it - "I want to communicate this clearly to others, to tell them to turn it on, or to tell them to buy one, etc".

These two are two different types of utility.

In the case of (1), I actually don't care if it's a lamp, a chandelier, a standing lamp, or a wall light. I want to take action that helps me see.
In the case of (2), the specific name and type is important, because I have to communicate it to others using words.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on April 29, 2020, 11:00:46 PM
NOTE: just read my book lol.... latest version >>> https://workupload.com/file/ZK2zt4PJDev

HS says "so, reward and...?"

The 4 golden, supremely important factors/determiners in recognition, and in prediction after the recognized context, are reward, frequency, energy remaining in nodes, and relation to story words. The prediction is affected by story words, the words in the story are clarified from 'it' to what they refer to or from 'bank' to 'river', and words in the story are recognized as other similar words if needed while finding matches. Once you gather matches, you take the end word to get the entail prediction candidates.

HS your thoughts are all over the place.... :) Not that bad this time I guess though.

The 2 golden tricks in the brain are how to talk truthfully in the maze of sentence planning, and where to go in the maze of thought paths (reward). Frequency and reward and relatedWords work all this....it'll talk like GPT-2 but about stuff like dirty waifus and nanobots. Truth=good prediction. Trust = context....same thing, all decisions are based on context, be it leaf falling or brain worrying about a tree leaf.

HS asks "how long does evolution and goal updating last for"
The God sphere is the final technology. The final iphone. The ultimate machine. The instant regenerator and best predictor of the future. A hive of cooperative nanobots. It can morph and create anything at high speeds, like TV screens but in 3D. The whole thing is a wireless distributed network that acts as neuron nodes, hands, and eyeballs. The brain is the body sensors and motors simultaneously. Goal is always Survival. Units have sub goals like find food or attack enemy ship or find where John went in the fridge basement and need his iphone and find the oldest msg on it.... goals update yes, by related words, food=money, money=programming, etc....which allows reward to transfer node to node and leak just like energy. Root goals are harder to change. Your higher layer goals do change until they get to the final form that survives most probabilistically.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 02, 2020, 01:29:28 AM
Note: GPT-2 trained on 40GB of text, but I show below what also makes it tick besides dataset frequencies.

In this video I run tests on GPT-2 and show proof that the word to predict is heavily influenced by story words no matter where they are in the story. Watch as I force GPT-2 to change the %s as I add words to various places. Notice also, as I show at the end using "bab7" (y4Fa2b works too), if it has seen it follow in the story, it makes it even more likely, even though it was never seen in the dataset. Also, when you talk about trees etc it will also boost up not only trees candidates but leaves, grass etc etc. It uses something like Word2vec not only to recognize the sentence (it also focuses on important words so as to summarize it), but also for voting on the prediction candidates, see!

https://www.youtube.com/watch?v=RWd-LBgC9UM&feature=youtu.be

It allows it to consider/ recognize very very long unseen context by using something like word2vec to find context matches that aren't exact and by focusing on important words that are more rare, and then to stretch even longer and become more accurate as well it boosts candidate predictions using remaining energy from prior activations. More recent words boost the predictions the most, they have not lost energy as much.
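
A rough sketch of the "remaining energy" idea above, in Python. This is only an illustration of the description in this post, not how GPT-2 is actually implemented. The function `boost_candidates`, the `related` similarity table, and the `decay` value are all hypothetical: each story word adds a boost to identical or related candidates, and the boost fades the further back the word appeared.

```python
def boost_candidates(base_probs, story_words, related, decay=0.8):
    """Re-weight next-word candidates using leftover 'energy' from story words.

    base_probs : dict candidate -> probability from context matches (assumed given)
    story_words: words seen so far, oldest first
    related    : dict (story_word, candidate) -> similarity in [0, 1] (assumed given)
    decay      : how much energy a word keeps per step back in time (assumption)
    """
    scores = dict(base_probs)
    n = len(story_words)
    for i, word in enumerate(story_words):
        energy = decay ** (n - 1 - i)  # most recent word keeps energy 1.0
        for cand in scores:
            # Identical words vote fully, related words vote partially.
            sim = 1.0 if cand == word else related.get((word, cand), 0.0)
            scores[cand] += energy * sim
    total = sum(scores.values())
    return {cand: s / total for cand, s in scores.items()}

if __name__ == "__main__":
    base = {"leaf": 0.2, "car": 0.8}
    related = {("tree", "leaf"): 0.5}
    print(boost_candidates(base, ["the", "old", "tree"], related))
```

Mentioning "tree" recently lifts "leaf" more than mentioning it several words ago, which matches the claim that more recent words boost the predictions the most.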
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 02, 2020, 12:55:38 PM
Hehe they are upvoting meh!!
https://www.reddit.com/r/agi/comments/gbv8bg/video_my_tests_show_what_makes_gpt2_tick/

it's at 5 upvotes
Title: Re: Releasing full AGI/evolution research
Post by: krayvonk on May 02, 2020, 01:36:31 PM
You know what would be a good form for a robot - in a text based system?
Try plugging in AD&D second edition missions,  and u have 2 ppl in the "dialogue" ,  The dungeon master, and the player.

It would only have to be 2 components,  because the dungeon master counts for all the other players maybe.
This way, it can occupy the position of both the dungeon master, and the player, in the "dialogue"

Then if you wanted it to be a real robot,  you need to do a computer vision thing, where it turns into the dungeon master text,
and then the robot acts as the player, so you have to convert the player text, into the robot actions.   At all the moments where the player succeeds, add a doggy treat,  and itll always search for these moments.


Then you just have to non-stop type in fast forwards for a couple of years to build up the 40 gigabytes of text,   and then it should be ready to go!  =)
Title: Re: Releasing full AGI/evolution research
Post by: Yervelcome on May 02, 2020, 10:52:50 PM
I'd like to give that robot DM the knowledge of what it feels like to walk through a forest killing goblins.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 03, 2020, 04:47:56 AM
I've been coding my own text predictors and researching them hard and I notice the key is frequencies of what usually comes next after your context [th] ex. you see [e] has high count observations. But you want the past window to be very long and grab your entail frequencies from those recognized features ex. [my dog ate the][k]. Which allows good letter prediction. GPT-2 predicts BPE words. You actually mix multiple windows because longer context matches are rarer but you can at least get some frequencies of what's seen to follow them. To help find longer matches you'd want to "translate" words cat=dog, accept different position appearance, and focus on rare words to summarize the context, so that when you look at the past 30 words you can find multiple matches in memory even though there is of course no experience exactly matching it - the alternative words, position, and filler content is in there but is similar or doesn't matter. So in the end, frequencies are running it, and even the recognition cat=dog is based on discovered shared contexts, based on frequencies. Probabilities run it and if a match is not exact then its predictions will all get less weight.
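
A minimal sketch of the "mix multiple windows" idea, in Python. It counts which letter follows each context of length 0 to `max_order`, then blends the predictions from every context length that has been seen. The weighting (longer matches count more, `weight = k + 1`) is an assumption picked for illustration, not the exact scheme from my code, and the names `train` and `predict` are just for this example.

```python
from collections import Counter, defaultdict

def train(text, max_order=3):
    """Count next-letter frequencies after every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i in range(len(text)):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            counts[text[i - k:i]][text[i]] += 1
    return counts

def predict(counts, context, max_order=3):
    """Blend the predictions of all matching context lengths."""
    mixed = Counter()
    for k in range(min(max_order, len(context)) + 1):
        ctx = context[len(context) - k:]
        seen = counts.get(ctx)
        if not seen:
            continue  # longer contexts are rarer, so they often have no match
        total = sum(seen.values())
        weight = k + 1  # assumption: trust longer matches more
        for ch, c in seen.items():
            mixed[ch] += weight * c / total
    norm = sum(mixed.values())
    return {ch: v / norm for ch, v in mixed.items()} if norm else {}

if __name__ == "__main__":
    model = train("the cat sat on the mat. the dog ate the food.")
    probs = predict(model, "th")
    print(max(probs, key=probs.get))
```

Short windows always have some statistics, long windows are sharper when they match, so blending them gives a usable prediction even for contexts never seen in full.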

Yes, what I show in the video appears to help prediction by making the predictions more "similar" to the *rare story words (especially more recent words), it can look at ALL the past context. The main prediction in these algorithms however is from looking at the past ex. 3 or 20 words to get multiple "similar matches" to see what usually follows the matched contexts. You can look farther back if you 1) attend to only rare words and ignore ex. 'the', 2) can use similar words 'cat/dog', and 3) use similar position.

When you know "Paris is the capital of France" and see a new prompt "The capital of France is " you predict Paris with o-k accuracy because the context mostly matches this (and a few other things in your brain), and the 2 words that are switched around exist but with similar positions.

A good question is, do story words actually take their own vote on the prediction candidates? Or do we only use context matches to see what does come next? Well, if I keep adding the word 'cat' to the start of my prompt, it makes 'cat' more probable, inch by inch the probability rises, which would be unlikely that matches are finding this is what commonly follows. Below is a new video testing it out to see if the prediction is influenced from context matches solely or if it does actually use as well all story words to mindlessly vote on the next word (if the input is all cat, it's likely it will continue saying 'cat' or the similar).

https://www.youtube.com/watch?v=kF8U2FD9JXc&feature=youtu.be

I could try in Allen these inputs: 'bat' or 'bat bat' or 'bat bat bat', or, 'wind' or 'wind wind' or 'wind wind wind'....and no matter the word used, it will predict the same word, with more probability the more times it occurs. In the dataset it trained on there are only briefly similar phrases, and I don't think they predict the same word that occurs in them. Yes my input matches them more and more because of similar words and hence the prediction will be similar, but, I don't feel out of 40GB there are enough "matches" to achieve that.

Keep in mind it predicts the *same* word, you'd think 'bat bat bat bat bat bat' would match things like 'my bat saw a bird on a bat but bats fly in bat caves' etc and would often predict only similar words like cave or bird....how many matches could you get that incrementally improve the prediction of 'bat'!? Impossible.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 03, 2020, 07:03:16 AM
Thought Experiment - What does a body do for a brain?


In this thought experiment, I take a look into how the brain uses input and output from its body.

Let's imagine we have a brain in a body, and it is stationed in a large laboratory that enhances its research and exploration. The reason it is not stationed in a normal home is because technological advances come from experiments and data processing, not sleeping, eating and sitting on home furniture.

The lab and body won't intelligently move on its own, so let's direct focus to the brain now.

The brain cannot make any non-random output decisions yet, as it has had no input experiences yet. To do better output it needs input first. So the first thing we can do is either feed it lots of diverse image/text data, or let it try to walk forward and receive data about which motor actions got motion. So far so good, we just made it store its first experiences.

Up to now the brain has only received random input data (note the real world data isn't random but the source is) from all sorts of sources. It didn't decide where to collect data from, as it couldn't output any non-random decisions.

Now our brain can decide to tweak its walking skills to further improve its speed, or can decide to collect text/image data from certain sources such as particular websites or particular real life experiments. For example it may be trying to invent better storage devices and wants to see if its predictions are correct or may want to collect data from there simply. Testing its predictions is also data collection because it boosts its already existing beliefs' probabilities.

The trend here seems to show that as it collects data, it is getting more precise where to collect it from. Without output from the brain, the brain could never collect data from specific areas. The brain is learning where to collect data from.

The 2 uses the brain has for output is 1) specific data collection, and 2) implementing solutions ex. a new product to market and seeing their mission completed (this is also data collection).

Our brain, if it had a solution to a big problem on Earth with all road-bumps covered, could just tell us it of course. It wouldn't absolutely require a body to implement a "plan".

The "coming up with a *new* plan" is done in the brain, it needs a lot of on topic data also. The output of the brain to the body is just to collect more desired data.

What is most interesting is when you have a lot of data, you can make a lot of connections in it and generate new data by using a method called Induction.

So what do yous think about the idea that we could make a powerful AGI without a body? I mean it still would have output to talk to us and input from various websites and experiments it asks us to do, but it wouldn't need a LOT of real life experiments if it has a lot of data because it can mostly generate its own data at that point and fill in gaps.

So most of its output in that case would be either a solution to a problem or a few requests of where to collect new data from if its solution isn't ready yet. Other than that it would be mostly doing induction internally using huge amounts of data. After all, experiments are only to collect data, we can give it lots even if not from precise tests.

My point here is AGI creates new data and is an induction engine, and works better with huge amounts of diverse data and on-topic data as well. That's all its input does - is provide data. The output is to collect certain data. But AGI *needs to generate *new data using all this data and/or find part of its solutions in data. For example finding a device in nature or an advanced civilization would be a solution that eliminates many sub goals. It could read about it in data too, if it trusts that data.

In that sense, AGI is about re-sorting existing features/data into new data/ devices or skills. To do that it needs a lot of data. AGI generates the future using a Lot of context.

What do yous think? Can we sort of get away without a body and just make AGI in the computer and have it talk to us using its text/vision thoughts? And can we get away without doing lots of specific experiments from the right locations and times and just use the slightly-more random big data? To me it appears to be yes we can. The AGI could still extract/form new answers from the large data. Like you know how Prediction works right? You can answer unseen questions or image hole fill in? So AGI can just 'know' answers to things using related data. And, what if it can just watch microscopic data and do all sorts of random experiments to see what "happens" and build a better model!? It is true though brute force is not computable in our case, but it's an idea.
Title: Re: Releasing full AGI/evolution research
Post by: Hopefully Something on May 03, 2020, 08:18:50 AM
To the degree and distance that things can be predicted, we should try to build a predictor AGI to do that. It's a visionary project with potentially great value to humans and sundry.
You are so devoted, and have invested so much time into this project that it's probably difficult to talk to others about it in a productive way, because you are the only expert on this specific hypothetical technology. It's mostly still in your brain, as far as I can tell. I'll be curious to see how you will be able to manifest it for real, and how good such a large scale predicting machine will be able to get. It's definitely possible.
Title: Re: Releasing full AGI/evolution research
Post by: WriterOfMinds on May 03, 2020, 04:41:24 PM
@LOCKSUIT: The latest seems like one of your more coherent and easily understandable posts.  More like this, please?

It's my opinion that AGI does not need to be embodied, though I know that there are many who would not share that opinion.  However, I suspect that there's more to making an effective bodiless AGI than just "feed it a ton of data." 

The prediction machine you're describing could easily be a Chinese Room.  If you don't know what that is, I recommend reading up on it. For any input it is given, the Chinese Room produces reasonable output, but it doesn't really "understand" anything because it has no symbol grounding ... i.e. it can effectively manipulate data, but it has no idea what the data "means" to itself or anybody else.  So it speaks coherently, but does not communicate; it has activity, but is not really an agent.  A Chinese Room could be useful, but whether it qualifies as an intelligent entity is debatable.

Then again, given what your particular goals are, so long as it produces "discoveries" in its output you may not really care?  However, providing your AI with some symbol grounding could still be a faster and more accurate way to achieve results than relying solely on pattern-finding in otherwise meaningless data.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 04, 2020, 06:39:00 AM
Quote
@LOCKSUIT: The latest seems like one of your more coherent and easily understandable posts.  More like this, please?
😃 Well... They kept deleting my posts on Reddit.... So I had to make it really really really simple to grasp so that 0 energy was needed to realize it... But sure.

But in text, there is meaningfulness/ grounding. Words share context, you'll find in big text data that cats and dogs both run, eat, sleep, have tails, etc. And you can describe what a word means by using Prediction or Translation ex. 'loop' is a cyclic iteration of some process repeated many times, or 'loop' = 'iteration' etc.
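
A small Python sketch of the "words share context" point, under simple assumptions: each word is described by a count of the other words it appears with in the same sentence, and two words are compared with cosine similarity. The tiny corpus and the names `context_vectors` and `cosine` are made up for illustration; a real system would use far more text and something closer to word2vec.

```python
from collections import Counter, defaultdict
from math import sqrt

def context_vectors(sentences):
    """Map each word to a count of the other words sharing its sentences."""
    vecs = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, w in enumerate(words):
            for j, other in enumerate(words):
                if i != j:
                    vecs[w][other] += 1
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

if __name__ == "__main__":
    corpus = [
        "the cat eats food",
        "the dog eats food",
        "the cat sleeps",
        "the dog sleeps",
        "the car needs fuel",
    ]
    vecs = context_vectors(corpus)
    print(cosine(vecs["cat"], vecs["dog"]), cosine(vecs["cat"], vecs["car"]))
```

In this toy corpus 'cat' and 'dog' come out as more similar to each other than 'cat' and 'car', purely because of the words they appear next to.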

All sensory data can associate, all a body/output does for a brain is collect yet more data but from non-random sources. AGI is all about collecting/ creating new desired data from old states of Earth/ old data...self-modifying data...Earth evolves/ generates itself. Intelligence at its root is how long a machine can maintain its form for, hence it seeks to understand the universe and grow in size to create an army.
Title: Re: Releasing full AGI/evolution research
Post by: Art on May 04, 2020, 04:23:33 PM
"...So I had to make it really really really simple to grasp..."

To borrow a quote from Einstein: "Everything should be as simple as it can be but not simpler!"

Title: Re: Releasing full AGI/evolution research
Post by: ivan.moony on May 12, 2020, 07:57:21 PM
Hi Lock :)

Any new discoveries lately?
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 13, 2020, 07:44:03 AM
No more biggies yet, but will share more my current work soon...

I mostly have more to implement than need to discover...

The remaining puzzle pieces to "my AGI" won't exactly come from coding my current plan though, but rather more back to the drawing board.

It would be best if I had a real-time chat team that can "construct AGI now" and come to some sort of agreeance of an architecture, Person A would be like, no, because this makes more sense, person B would be like, oh, then you need this, etc, then we could make AGI already finally instead of our 1man missions.

If we don't hurry, we're all gonna die.... We have the chance to become nearly immortal young-again kings, let's do it...
Title: Re: Releasing full AGI/evolution research
Post by: ivan.moony on May 13, 2020, 08:17:55 AM
No more biggies yet, but will share more my current work soon...

May I propose to try to compose a bit shorter video this time (if that's your favorite form of expression). It would be nice if you structure your form of expression into chapters, sections, paragraphs... It helps understanding. Also, there are templates of expressing that researchers usually exhibit, like:


It doesn't have to be this exact template, but some form of structure would be desirable.

I mostly have more to implement than need to discover...

The remaining puzzle pieces to "my AGI" won't exactly come from coding my current plan though, but rather more back to the drawing board.

Yeah, those are cycles I found in my work too. It exchanges between thinking out, implementing, then thinking more thorough, then reimplementing more again, ...

It would be best if I had a real-time chat team that can "construct AGI now" and come to some sort of agreeance of an architecture, Person A would be like, no, because this makes more sense, person B would be like, oh, then you need this, etc, then we could make AGI already finally instead of our 1man missions.

It is very hard to gather a crew, and there are issues about being a leader, and being a follower, all the way down the command chain. I generally don't like being a part of that kind of structure. It's more like everyone has their own plan, and others have to abandon their plan to follow the leader. I don't want anyone to abandon their plan for me, and I don't want to abandon my plan either. So I choose to work alone.

But it might be a good idea to create a collaborative thinking platform software where people would have a structured chalk board and choose on their own whether they want to bring up a new issue-subissue, or help solving existing ones.

If we don't hurry, we're all gonna die.... We have the chance to become nearly immortal young-again kings, let's do it...

I believe it is not that bad. Dying could be a bit of an itch, but we will be neither the first nor the last ones to die. Anyway, somewhere in the future, someone might find a way to bring us back from death, but it may be the case that it is better to be what we call dead than to be alive. But it is worth checking and being aware of the difference anyway. Personally, I would want to know what it's like to be dead. My presumptions say that it is the only ethical form of existing, and that being alive is a gift we got from someone, but we would want to return it one sunny day to that someone. This is just a feeling, I might be wrong.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 13, 2020, 08:53:00 AM
It is well known in Evolution that smarter machines emerge, but what makes them smart is that they can survive longer by modeling the world with inborn reflexes, schooling, and discovery. We evolved vocal cords and hands to communicate ideas and build on them. Everyone's root goal is survival, shelter, food, and breeding. Sure, you can like race car boats and skydiving, but a regenerative, tool-building army is better at cooperating/ competing over limited resources. The ultimate technology will be very smart, it can instantly make you anything at whim, but it isn't just anything that goes on in physics, but things that maintain the form; duplication of information, mutations, food finding, radiation (waste). So, because humans can't live forever and are perplexed at how to achieve it, they criticize the ability to achieve it, even though they'd take it right up if they never had to age! It probably wouldn't even be a thought, no one would know ageing once existed, they would just carry on living. Machines can be programmed to erase memories and be happy always. Most humans say they want to live at the moment, some do say they don't but most won't. It's because of physics.
Title: Re: Releasing full AGI/evolution research
Post by: Art on May 13, 2020, 02:30:24 PM
...and to dust, you shall return...
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 13, 2020, 05:28:05 PM
Had to re-watch my new video below to make sure it has everything most important in it as intended. Looks good. Do watch the full video, some things are further ahead/ is never a dull moment. First read the text below until you find the video, it's important...

I guess I'll give a little text speech below to go with it as well to solidify the presentation, just in case. Do note that there are also many more important paragraphs below; this one is just a presentation for the video though:



As shown in the video, the brain is a collection of mini hierarchies, re-using shared feature nodes linked as contexts.

My working code uses a trie/tree and each letter and phrase seen up to now has frequencies stored based on how many times it (a node) was accessed. Larger phrases have fewer appearances. These are strengthened connections/ weights in the hierarchies!! They decide how much energy goes to a parent (how open the channel is). Multiple nodes in my code get activated and each has multiple parents that get activated too, and some are shared by other recognized nodes ex. 'the a' and 'he a' and 'e a' and ' a'.
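A minimal sketch of the kind of counting trie described above, not the actual code from the post. The names `Node`, `add_window`, `build_trie`, and `next_letter_counts` are hypothetical, and the window length is shortened to 5 letters to keep the example readable.

```python
class Node:
    """One trie node: a letter with an access count and its children."""
    def __init__(self):
        self.count = 0
        self.children = {}


def add_window(root, window):
    """Walk (or create) the branch for `window`, incrementing every node passed."""
    node = root
    for ch in window:
        node = node.children.setdefault(ch, Node())
        node.count += 1


def build_trie(text, order=5):
    """Slide a window of `order` letters over `text`, one letter at a time."""
    root = Node()
    for i in range(len(text) - order + 1):
        add_window(root, text[i:i + order])
    return root


def next_letter_counts(root, context):
    """Return {letter: count} for the letters seen right after `context`."""
    node = root
    for ch in context:
        node = node.children.get(ch)
        if node is None:
            return {}  # this context was never seen
    return {ch: child.count for ch, child in node.children.items()}


root = build_trie("the cat the car the cab", order=5)
print(next_letter_counts(root, "the "))  # {'c': 3}
print(next_letter_counts(root, "e ca"))  # {'t': 1, 'r': 1, 'b': 1}
```

The longer context "e ca" has fewer counts per candidate than the shorter, more frequent "the ", which is the "larger phrases have fewer appearances" effect in miniature.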

You can see as well that if it did recognize multiple similar nodes too (translation), they would also share similar prediction candidates. All this works on its own. No backprop! > We increment node frequencies based on node accesses, and related nodes cat/dog are discovered/activated based on shared context leaking energy from cat to dog nodes on its own.

The prediction candidates that entail the ends also retain energy from prior activation, more if more recently, large proof is below BTW, and lesser if was a similar node that activated it. That's Temporary Energy, it can look at all the last 1,000 words!

Permanent Energy is always active memory, Rewards, look at Facebook's Blender chatbot that uses a Dialog Persona to make it talk about its agenda! It can have multiple goal nodes.

My design allows goal node updating by leaking reward to similar nodes ex. food=money or food=dinner, and now I will start predicting/ talking the next words about money now. Root goals are not as changeable, the artificial rewards are sub-goals.

The energy in the net defines itself over time, energy from multiple activated nodes leaks to a single candidate prediction (top k predictions, usually the most probable one, especially if it has a much higher probability than the other candidates). My code stores Online the predicted Next Letter right now and that is what generates new discoveries that are somewhat true, depending on how confident the prediction probability is.

A good brain exploits and uses the likeliest nodes as prediction, not random data collection. But in dreams you can see we do not generate/ talk about our desired/probable nodes, but random ones, especially activated ones from the last day. It wants you to explore and generate more randomly by not using Permanent Rewarded goal nodes, and look around the last day's experiences to search for a while for discoveries.

The goal is to see how to predict/get to the desired outcome... It needs a lot of data to be sure it "made it" and isn't simply being told "aliens arrived, u can stop working on artificial organs now". It has to search sometimes for a while, and go through many sub goals, until it fits with the data... This part massively confuses me, how it actually knows how it reached the answer/ implemented in real life or has a solid discovery..... Or I mean how it knows which sub goals to make and which get met.... Pretty much we want it to make many desired discoveries, and listen to us if it needs more data (either time to implement its idea or IOW feed it new data....to get new sub goal question rewards)

--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------

(video) How can I improve my AGI architecture?

In the video below, I walk you through a fair amount of the AGI architecture I've been working on for 5 years. I'm looking to find out if I am missing something or if I am on to something. The design is meant to be very very simple and explain a lot of how thinking occurs. Below is how my text predictor code works (100MB compresses to approx. 21.8MB), please read it twice before jumping into the video, you will learn some fundamental things all good text predictors are doing using frequency. Frequency is also used for discovering that the word cat=dog. Note that this compression is for evaluation and is different from compressing a network to learn a better model. I should have also included in the video that Summarization, Translation, and Elaboration would be controlled by how much energy is allowed - you only say important features when you Summarize, not frequent or unrelated or unloved words.

How my text predictor/ compressor works (100MB>21.8MB):
My algorithm has a 17-letter-long window step along the input file 1 letter (byte) at a time, updating a tree as it sees new data. The tree's branches are 17 nodes long because it adds the window to the tree (after it finishes the search process described next), and it updates node counts when it passes any node. For each step the window takes, the algorithm runs 17 different searches of the tree, each a letter longer. The child leaves (the final letter of a searched branch) are the predictions, with counts seen so far in the file. Layer 1 nodes are children too and need no match. The tree stores the frequency of all 1/2/3.../17-letter strings seen so far. The children are what allow you to predict/compress the next letter accurately. These 17 sets of predictions must be mixed because while the longest set is more accurate, we have fewer statistics for it, sometimes only 2 counts. We start with the longest match found, ex. a 14-letter match in the tree. The 14th set of predictions may say it has seen come next a=44, b=33, f=25, w=7. I sum the set's counts to get a total of (in this case) 109, then I divide each count by the total to get probabilities that all add up to 1, ex. 0.404, 0.303.... Now for all these predicted probabilities, we still have 13 sets to mix and must take some share away from each. So I check the total counts of the set against a Wanted Roof, ex. 109<>300 (maybe we don't even need to mix lower sets if we have enough stats), and so in this case the set gets about 1/3rd of the weight: I scale each of its probabilities down to about 1/3rd. We then still desire about 66% more stats. For the next set, if say we have 200<>300, I take away 2/3rds of the 66%, meaning we still desire 22%; it is not 66% - 2/3rds = 0%! I take away the % got OF the % still desired. A little bit of the lower sets therefore always leaks in, which is better because we can never be sure even if we surpass the Roof by a lot. Besides, it gave better results.
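The "take the % got OF the % still desired" rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the post: `mix_sets` is a hypothetical name, the Roof is fixed at 300 for every set (the post varies it per set), and a set that reaches the Roof simply takes all of the remaining share.

```python
def mix_sets(sets, roof=300):
    """Mix prediction sets, longest context first.

    sets: list of {symbol: count} dicts, ordered from longest match to shortest.
    roof: assumed fixed "Wanted Roof" (total counts at which a set is fully trusted).
    Returns (mixed probabilities, share still desired after the last set).
    """
    mixed = {}
    desired = 1.0  # share of the final prediction not yet handed out
    for counts in sets:
        total = sum(counts.values())
        if total == 0:
            continue
        # This set takes a fraction OF what is still desired.
        share = desired * min(total / roof, 1.0)
        for symbol, count in counts.items():
            mixed[symbol] = mixed.get(symbol, 0.0) + share * count / total
        desired -= share
    norm = sum(mixed.values())
    if norm == 0:
        return {}, desired
    # Renormalize so the mixed predictions sum to 1.
    return {s: p / norm for s, p in mixed.items()}, desired


longest = {"a": 44, "b": 33, "f": 25, "w": 7}  # 109 counts, as in the example above
shorter = {"a": 120, "b": 50, "e": 30}         # 200 counts (made-up numbers)
probs, still_desired = mix_sets([longest, shorter])
print(round(still_desired, 3))                 # about 0.212, i.e. the "22%" above
print(sorted(probs, key=probs.get, reverse=True))
```

With exact fractions the leftover is (1 - 109/300) * (1 - 200/300), which is where the rounded "66%" and "22%" figures come from.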
But the Roof is decided by how many predicted symbols are in the set (total unique symbols being predicted), so if I have 2 then the Roof may be 8 counts wanted. Also, while the Roof is based on how many different symbols are seen in the set, we get a slightly different Roof depending on which set we are on, i.e. if we have 4 letters in set #14 then the Roof is ex. 33, but if it is set #5 then the Roof is ex. 26. Also, based on the Roof's size, a curve's bend is modified. This Activation Function curve/threshold gives small/large total counts in a set an even smaller/larger total (but it isn't used in the Arithmetic Coding, it's only used for deciding how much % this set gets in our mixer). This is meant to be an exponential activation. Finally a global weight is given to each set, ex. the 14th set is always given 0.7 of the weight it was going to get lol. I hardcoded the numbers for now but the code isn't grossly large of course. If they were adaptive and based on the data then the compression would be even better. I just noticed I do exit the mixing before reaching lower sets if the Roof is ever surpassed; I'll have to test if this is useful. The Arithmetic Coder takes the combined sets, i.e. the prediction probabilities are combined a, b, c + a, b, c + a, b, c ..... = a, b, c (softmaxed so all the predictions add up to 1, i.e. a + b + c = 1), and the AC then takes a high and low bound of 1 and 0, takes the range between the high and low, and starts subtracting each probability of the set from it until it reaches the final letter in the window (same process whether compressing or decompressing). So say we stop once we reach b in our set, ex. a, *b*, c; we are now in the float range of ex. 0.45-0.22. We take the range again (0.45 - 0.22 = 0.23) and start subtracting again once the window on the file takes another step. The encoding decimal keeps getting more precise, storing the whole file.
To work in 16 byte float we need to carry away locked digits, meaning if the high and low are both now 0.457594-0.458988, we store '45' and now get 0.7594-0.8988, and we are going to keep narrowing between these 2 to make the decimal more precise. This long decimal is then stored as a binary bin number ex. 6456453634636=10100011100111010011. I didn't implement having the window store the last few letters as branches, i.e. the 17-letter window adds itself to the tree, but before predicting next it could add the 16, 15, 14, etc as shorter branches, which would help just a 'bit' more. I didn't implement removing from lower sets the counts that are just the same observations as in the higher set, because it hurt compression, i.e. if there are 9 counts total in set 3 and 99 total in set 2, 9 of the counts in set 2 are the same observations and 'should' not help us reach the Roof. I'll look into it more. Lastly, escape letters: the first set we mix is a dummy set that has a super small weight and has every possible letter, in case we need to encode/decode one that hasn't yet been seen in the file, hence it requires a small room in the AC high/low bounds. I also hardcoded each probability in this dummy set, common letters get more weight. Compression/decompression takes 2 hours and 16 minutes for 10MB, but Python is slower. RAM use is fairly big because I didn't implement the pruning. My algorithm handles incomplete/noisy information (uncertainty) unsupervised Online, hence the mixing of window models. Better net or net compression and/or file compression and insight extraction (not decompression of FILE !), faster code and less RAM Working Memory used, all lead us closer to AGI, and smaller code does (a bit).
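For readers new to arithmetic coding, here is a small sketch of the interval narrowing described above. It is an illustration only and differs from the post in two labeled ways: the probabilities are fixed rather than re-predicted at every step, and it uses Python's exact `Fraction` type, so no digits have to be carried away. The function names are hypothetical.

```python
from fractions import Fraction


def cumulative(probs):
    """Map each symbol to its [start, end) slice of the unit interval."""
    slices, start = {}, Fraction(0)
    for symbol, p in probs.items():
        slices[symbol] = (start, start + p)
        start += p
    return slices


def encode(message, probs):
    """Narrow [low, high) once per symbol; any number inside encodes the message."""
    low, high = Fraction(0), Fraction(1)
    slices = cumulative(probs)
    for symbol in message:
        width = high - low
        s_lo, s_hi = slices[symbol]
        low, high = low + width * s_lo, low + width * s_hi
    return low, high


def decode(code, length, probs):
    """Replay the same narrowing, each time picking the slice that contains `code`."""
    low, high = Fraction(0), Fraction(1)
    slices = cumulative(probs)
    out = []
    for _ in range(length):
        width = high - low
        for symbol, (s_lo, s_hi) in slices.items():
            if low + width * s_lo <= code < low + width * s_hi:
                out.append(symbol)
                low, high = low + width * s_lo, low + width * s_hi
                break
    return "".join(out)


probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, high = encode("abca", probs)
code = (low + high) / 2        # any value in [low, high) would do
print(low, high)               # 11/32 23/64
print(decode(code, 4, probs))  # abca
```

The final interval width (1/64) equals the product of the probabilities of the coded symbols, which is why better predictions give a shorter code.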

My code is in Python but for now I'm linking Shelwien's Green in C++, it's very similar. https://encode.su/threads/541-Simple-bytewise-context-mixing-demo

Video:
https://www.youtube.com/watch?v=-9mGm6175BQ

I think one key difference in ANNs for text is the network doesn't store nodes that can be displayed as solid letters and phrases as mine can, for example the lowest layer nodes a b and c may all point to the parent node 'abc', which has ex. a count of 5 times seen so far, but the 'a' that builds it has only 3 accesses seen so far. So instead of blended words or phrases, like 'cotg' made from cat/dog you might even get 'cOtG' where some children affect the node less. I'm unsure yet if that's useful.

From testing GPT-2 and making my own algorithm last year, I have strong evidence that nodes retain energy and the frequency predictions are helped out by already existing energy sitting in related nodes. As you know, when you hear something, it remains on your mind for quite some time. The last 80 words read are all energized, stored in order but as chunks, and are not on paper anymore but in your brain! They *need* to remain active in your brain. The more activated a similar phrase node - the more activated its prediction parents will be. But word nodes may also leak energy to other similar word nodes as well. The energy sitting around definitely will add to the prediction energies therefore, see? If 'leaf' was activated 40 words ago, and our predictions predict letters from word nodes, the leaf and grass etc nodes will also be pre-activated a bit. These energies eventually fade off your mind exponentially.

We can see Facebook's Blender also uses Permanent energies through a "Dialog" as they call it, making it *always talk/ask as if it has an agenda for being a communist. These nodes are hard reward coded from birth and *should update other related nodes to create new sub goals for the food node goal, which it will never change since it is more hardcoded with reward; you know you can't change the food node as it's critical for survival.
https://www.youtube.com/watch?v=wTIPGoHLw_8

My main points here are that frequency in predictions runs my code, and recognizing similar phrases will increase counts (found using frequency, closest affect it most in delay time), using energy to boost related predictions helps a ton, and permanent reward does too. See how all that and more work in the hierarchies? What more can we do!? Can you add anything!?

I'm really excited if even just one of yous can advance the AGI design I'm at. I've seen a lot of ANN variants like variants of GANs, LSTMs, Autoencoders, etc etc, they seem to have things like residual connections, layer norm, convolution windows, many feedforward networks stacked, etc, while my design just sticks to a single collection of hierarchies. Of course you can get the same result by similar ways or by breaking it down into multiple tools with math tricks to get same result. But I'm looking for a more explainable architecture that unifies everything into the same general network, and can worry about the math tricks later. That's why I say in my work that to predict the next word, we ex. look at the last context (like GPT-2 does) and activate multiple similar phrase nodes in the hierarchy and see what entails them all, they are all little judges/hierarchies. I don't hear many people saying this, just RNN this, GAN that, no actual straightforward theory.

Transformers have been proven tangibly better than LSTMs in all areas (check out OpenAI and BERT etc), and the Attention Is All You Need paper says, well, it in the title, and was written by Google researchers. Transformers are parallel and can process much faster than RNNs. And you don't need the recurrentness or LSTM schema, which is confusing.

I've read many articles on Transformers, they have a long process and many things used, and after reading them all there is no explanation of how it actually works, anywhere. I'm the only one on Earth saying how GPT-2 works. There is some explanation if you look at Word2Vec or the Hutter Prize algorithms like PPM, but no one "knows" how GPT-2 works.

Energy remains in nodes and fades....see:
Improved edit:
Our brain is always dreaming, even when not dreaming or daydreaming. We actually recall stored features (especially energized or loved ones ex. when asleep from the last day, but we can do that in day too it just makes sure you explore, though you rarely exploit in dreams ex. work on AGI only) and recreate/create an experience, in the brain - we can't feel the real world.

Even more proof:

The proof that Temporarily Energized nodes do affect prediction is not just the fact that recently heard nodes must stay active in memory, but also this: if you had only a small dataset ex. 0KBs and were shown a prompt "the cat and dog cat saw a cat and the " - the next word is not going to be predicted well, but out of the 10 words in that prompt, 3 are "cat", so our probabilities can slap on 0.3 probability to predict "cat" next! This is much more powerful if we discover cat=dog by shared context: we can see cat/dog appears 4 times in the prompt. Often the past paragraph will talk about grass, leaves, trees etc if it is about trees. Because all paragraphs will always contain "the" more than any other word, we ignore common words.
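The arithmetic in the prompt example above can be checked with a few lines. This is only a sketch of the counting, with hypothetical names: `prompt_probabilities`, the `merge` mapping that stands in for a discovered cat=dog link, and the `ignore` stop list are all assumptions made for illustration.

```python
from collections import Counter


def prompt_probabilities(prompt, merge=None, ignore=frozenset()):
    """Turn word counts in the recent prompt into next-word probabilities.

    merge:  optional {word: canonical_word} map, e.g. treating dog as cat.
    ignore: common words to drop before counting.
    """
    merge = merge or {}
    words = [merge.get(w, w) for w in prompt.split() if w not in ignore]
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()}


prompt = "the cat and dog cat saw a cat and the"

plain = prompt_probabilities(prompt)
print(plain["cat"])   # 0.3 (3 of the 10 words)

merged = prompt_probabilities(prompt, merge={"dog": "cat"})
print(merged["cat"])  # 0.4 (cat/dog appears 4 times)

filtered = prompt_probabilities(prompt, ignore={"the", "a", "and"})
print(filtered["cat"])  # 0.6 once the common words are ignored
```

In a full predictor this boost would presumably be mixed with the frequency predictions rather than used on its own.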

Also, Permanently Active nodes and Semi-Permanent Active nodes have reward on them which makes you talk about question goals you love/desire. So our "GPT-2" would talk about likely ways to get what it wants. Mental RL. If nodes have Permanent activity which affects predictions, it's more likely that Temporarily Active nodes also affect prediction.

With my viz in the video (the image of the hierarchy), input goes up and activates nodes, and as well the predicted next word (parent nodes), but only the winner node (usually top candidate probability). The energy chaining the text in my design does not need to flow back down the net (generate output) to do this, because energy just leaks and keeps leaking, as you talk to yourself in your brain you hear the next word predicted and loops back into your net from bottom but maybe not as I just explained why...it would only loop back and activate the same node anyway. Another thing I said was you could duplicate the net and flip it so input goes up, output goes up out, not back down, but again, unneeded duplication and doesn't make it faster in this case.

Also, humans usually read text word by word, not in parallel, hence far back nodes are losing energy; you must implement that in a parallel approach. Also as it talks to itself and humans it can only generate 1 word of the future and doesn't have it all yet. So for training on big data, you could do a parallel approach, but not for new data. Also the brain learns a bi-directional context around a word feature. When it predicts the next word it only uses the left hand side past, but its memory lets it see into the future before it writes the next words, so in this sense learning a whole sentence fed in in parallel doesn't make the hierarchies any different: it increments frequencies (strength), adds nodes/connections, etc, the same way as the non-parallel approach, and the brain is predicting by looking ahead and is also using bi-directional network storage to recognize the feature it is looking at too.

So, learning data in parallel seems to work (storage-wise, all data/ relationships are/ can be captured), and prediction of new words/data is done in the net by leaking activity, bi-directional translation and future look ahead still work for prediction too.
Title: Re: Releasing full AGI/evolution research
Post by: ivan.moony on May 14, 2020, 07:32:47 AM
Why only one static image through the entire video? I'm trying to suggest some form of presentation turned into slides. Also, you might want to turn your algorithm into pseudocode (https://en.wikipedia.org/wiki/Pseudocode) and show it in one of the slides. If it's too long, try to replace blocks of instructions with single descriptive lines that you can develop on further slides. That would make an awesome improvement to your presentation. I think the goal is to describe what you want in the least number of words necessary. Simplicity and conciseness should be an advantage, and it is easier to achieve it through sound/vision than only through sound.

Also, planning of what is to be said/written could be put on paper as the first working version that you polish/shorten/optimize over days/weeks. Sometimes making a ten minute video could take weeks if you are after perfection. As you approach perfection (you can come near, but you can't reach it), your final video should make a better overall impression, you would be taken more seriously, you would have more viewers, and mods should finally have a true insight into your work instead of turning their heads the other way after the first minute of seeing the video.

Let me show you something:
I hold that (1) gives a much better impression than (2), and I got much more positive reactions on Reddit, comparing those two. Some of us who are less educated (including me) are not blessed with the eloquence that people usually reach at university, so we have to spend more time to achieve a visible result.

Someone told me that people on average spend 5 seconds overall on a web page once they visit it. So, when you are making a page, those five seconds are the most important for getting visitors to stay longer. Likewise, it could be the case that the first 10-20 seconds of a video decide whether a viewer will click stop and go away, or continue viewing. So it might be worth spending some extra time on those first 10-20 seconds.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 14, 2020, 08:45:08 AM
Everyone is suggesting to be clearer. True... Still I hope yous tried reading/ watching....it took me a full 29 days of work to make sure a lot of things were in my AGI/universe book intuitively - that thing is honestly not something I'm going to re-write for a while.

The new AGI vid did take 36 mins and is more summarized key points, that is possible. But still, 5 years worth of research and a partially complete AGI blueprint, in 36 minutes and a text to go with it, and you want it in 10 minutes :) !

Can anyone say from my new video above what confuses them? Maybe I can see your reactions... Where (and what) do you say "ohp, I'm lost, why is this useful or how does it do x" ?

The below is a format for Papers but my work still presents the right things at the right times I feel....I summarize, elaborate, share related work, show code and results, and the whole thing is very intuitively explained. I really don't want to explain my work in a way that I'm not comfortable with, my way is best :D When I bring up OpenAI's GPT-2, I bring it up.... All u have to do is collectively store it all, the order is there, it's not like I say the cat bed food ate slept in bed bowl then went to -_-

Title
Abstract.
Related work.
Your contribution (theory).
Experimental setup.
Results.
Conclusion.
References.
Optional appendix with supplemental details.
Title: Re: Releasing full AGI/evolution research
Post by: ivan.moony on May 14, 2020, 09:33:06 AM
The new AGI vid did take 36 mins and is more summarized key points, that is possible. But still, 5 years worth of research and a partially complete AGI blueprint, in 36 minutes and a text to go with it, and you want it in 10 minutes :) !

There is a technique of gradual summarizing I like very much. First explain everything generally in 20 seconds (1). Then explain it again, but in 5 minutes (2). Then explain it thoroughly in whatever time you need, but by repeating (1) and (2) on each sub-section of the thorough explanation.

Five years is a lot of time that deserves a careful thought about presentation. I'm not saying this or that way is the best, but I think freezing one still image over 30 minutes does not instill confidence about something that might turn into a serious advance. My advice is at least put some thematic textually structured slides behind the speech, and please, make an effort of creating the algorithm pseudocode. That way you'll see more clearly strengths and weaknesses of your approach comparing to others.

I'm just trying to help, but it's your project after all, and you probably know the best way of how to behave responsible about it. I may only wish you well.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 14, 2020, 09:51:18 AM
Quote
"There is a technique of gradual summarizing I like very much. First explain everything generally in 20 seconds (1). Then explain it again, but in 5 minutes (2). Then explain it thoroughly in whatever time you need,"

I literally was just thinking about that on my own, cool eh? True.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 15, 2020, 09:54:31 AM
Reminder: my big last posts on previous page(s), don't miss!

Something I posted on OpenAI's group chat on Slack with the Clarity team (ROFL):

I'm still trying to find a good angle to share/ intake, as I'm sure I have insights and sure yous know things I don't too. Circuits's goal is Clarity in understanding existing vision networks, and ultimately AGI in the long run, because they want to improve the design and therefore need more answers to how the brain works. Let's try this question for now: Do the existing vision nets yous are focusing on use Backpropagation? I'm sure it results in a similar weighting that a no-backprop approach would create! So why not a more natural/efficient way of Learning? In my design for AGI I stick to wiring together features recently activated ex. 'h' and 'i' to get the new node 'hi' and I use node accesses to update the connection weights so that if 'hi' was seen 8 times - the connections would each have 8 strength. This very idea runs my simple trie-based Predictor. Further, when 2 features like dog and cat both have snow around them, this would cause dog to light up when cat lights up, hence wiring cat to dog and updating the connection as well. Nodes with few frequencies get pruned/ forgotten or blended with other nodes ex. 'well hi there' and 'hello my friend' becomes 'hi friend my', or 'hi' and 'hello' become 'hielo'. Sometimes (I think) weights are rightfully unevenly stronger and some letters trigger it more than others ex. 'hiElO'.

BTW did everyone know this Deep Learning trick to prune weights?
http://news.mit.edu/2020/foolproof-way-shrink-deep-learning-models-0430 (edited)

Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 17, 2020, 10:26:37 AM
Intelligence just rose again;

Convolution, Pooling, and RELU can be done in my design by time delay of individual nodes firing - if "the cat ate food" is heard it may activate features "the cat" and "ate food" which may activate the node "the cat I saw ate food". Little parts get matched, sometimes not fully as shown because the time delay/ order is different, and in turn these partially activated nodes activate yet higher nodes partially too! Basically it's ok if it's not an exact match, the order of parts of parts of parts is resultingly similar. For Pooling, we can reuse an already existing mechanism - when prediction candidates are activated in my hierarchy design, only 1 (usually the top prediction) is heard/spoken, meaning even though others do have energy, they get a lot less weight. And RELU is exactly that.

&

1) All neural networks compress themselves to learn the latent salient features, they prune low frequency nodes/ connections, blend nodes, store features only once and increment frequencies (strengthen axon weight), trigger related nodes (translation), etc. Once a network is compressed/ learns a good model of the data fed to it, it can predict/ generate True data from the same distribution by using top k predictions softmaxed.
2) Lossless Data Compression is another thing, which is the best Evaluation method for testing Predictors, a neural Predictor is really good at guessing the next letter/ word and can store a separate file that stores error correction (steering the top k predictions to the correct one) and that compresses a file.
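To make the "compression as evaluation" point concrete: an ideal arithmetic coder spends about -log2(p) bits on a symbol the predictor gave probability p, so a better predictor means a smaller file. The sketch below only illustrates that relationship; `ideal_bits` and `bits_per_byte` are hypothetical helper names, and the 100MB and 21.8MB figures quoted earlier in the thread are assumed to mean 100,000,000 and 21,800,000 bytes.

```python
import math


def ideal_bits(probabilities_of_actual):
    """Total bits an ideal coder spends, given the probability the
    predictor assigned to each symbol that actually occurred."""
    return sum(-math.log2(p) for p in probabilities_of_actual)


def bits_per_byte(original_bytes, compressed_bytes):
    """Average number of compressed bits spent per input byte."""
    return 8 * compressed_bytes / original_bytes


# Four symbols predicted with probability 1/2, 1/4, 1/4, 1/2:
print(ideal_bits([0.5, 0.25, 0.25, 0.5]))      # 6.0 bits

# The 100MB -> 21.8MB result quoted earlier in the thread:
print(bits_per_byte(100_000_000, 21_800_000))  # 1.744 bits per byte
```

A predictor that always gave the correct symbol probability 1 would cost 0 bits, and one that spreads probability evenly over 256 byte values would cost the full 8 bits per byte.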

Object recognition is used in vision, and text neural networks. The goal is to work with strings of objects; sentences in time. AGI is all about updating where to collect or generate data from, it uses large context in a model to make decisions/ the future state of Earth:
https://www.reddit.com/r/agi/comments/gcln3p/thought_experiment_what_does_a_body_do_for_a_brain/

Elaborate? Maybe summarize haha. In my movies you see my network has a lot of things for Prediction; frequencies, translation, robustness to typos etc, activation functions, energy remaining, etc. Prediction is truth. It models the data distributions. Trust is based on context; truth. The 2nd major thing in my network is Reward. Deciding what website or what mental thought to look into or which motor action tweaks to try is a recursively updating process of where to collect/generate data from, and it evolves its own goals to achieve the root goal Survival. This goal finding steers the prediction to a path that it wants to meet so that it knows "How" to get what it "Wants". The brain modifies long term memory nodes that exist etc, short term working memory (temporary energy), and permanent energy (reward). It updates all 3, to reach the reward answers.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 18, 2020, 06:08:25 PM
May 18 2020
Locksuite posts on OpenAI's Slack channel:

The article mentions Nick looking for curves in images and couldn't just choose 1 of 4 classifications for a given image, and says "We were surprised when we saw that activations fell naturally into different levels of activation.". Really? I knew that. I actually know something deeper. Check out this image: https://ibb.co/x5J7s9k

If my hierarchy schema reads "cat", it activates the "cat" node including other nodes ex. "cattle" and "tac" and of course parts of itself ex. "at" and "t" to variable degrees. Each node has predictions of what comes next. They combine predictions by shared parent nodes. I have a real algorithm that does this. Energy flows rightwards only, you can't repeat the alphabet backwards naturally. Only how it was stored in order.

As shown in my image, the "cat" node activated will activate neighboring context nodes, and through these local channels and only through these channels will trigger/discover nodes like "dog" that share the same contexts - cats and dogs both eat, sleep, run, lick, etc. The cat, cattle, dog, etc nodes are activated by variable amounts and all mix predictions. It can recognize unseen sentences plus use many matches/judges for prediction.

My hierarchy schema can also discover/trigger my=your if it stores "my dog" and "your cat", because the shared contexts are, while not exact, similar, hence leaking energy still! Further, if both rabbits and horses are dogs, animals, 4 legged, cute, and have 2 eyes, then rabbits triggers horses. This is getting deep now, very parallel, each node leaks energy and each of these then does too...
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 19, 2020, 12:40:19 PM
A natural and explainable brain? Another visit to my design.

New pictures are provided too.

I have been working on AGI for 5 years now full-time (I don't get paid), on mostly text sensory but it turned out to be very very insightful. I have a very large design with many of the "bells and whistles", while the architecture itself that can run it all is very simple and has made some of my friends cringe. Below I lay down a good portion of my work. I really hope you can help my direction, or that I help you.

Nearly every part of my design/ architecture can be found in the AI field. Hierarchies, yup. Weights, yup. Rewards, yup. Reward Update, yup. Activation Function, yup. Word2Vec, yup. Seq2Seq, yup. Energy, yup. Pruning, yup. Online Learning, yup. Pooling, yup. Mixing Predictions, yup. Etc. It's when I unify them together I start getting a new view that no one else shares. I'm able to look into my net and understand everything and how they all work together, that's why I was able to learn so much about the architecture.

I've coded my own Letter Predictor and it compresses 100MB to 21.8MB, world record is 14.8MB. Mine updates frequencies Online and mixes predictions, and more. I still have tons to add to it, I will likely come close to the world record easily. How it works is in the supplementary attached file.

So I'm going to present below a lot of my design, showing how I unify things together in a single net. And you can tell me if there's a more natural way or not. I've tested other people's algorithms like GPT-2 and they can accomplish what I present, but the natural way to do it is not shown in an image or ever explained like I explain it, they just stack black boxes on each other.

See this image to get a basic view of my architecture. It's a toy example. https://ibb.co/p22LNrN It's a hierarchy of features that get re-used/ shared to build larger memories. The brain only stores a word or phrase once and links all sentences ever heard to it. That makes for an extremely powerful breeding ground. Note the brain doesn't store a complete pyramid like I show in my image, just bits n parts; a collection of small hierarchies. So think of my image as a razor tooth saw, not a single very tall pyramid triangle. https://ibb.co/d4JVm55

Notice all nodes are too "perfectly" clear? Well nodes can be merged "whalkinge" and have variable weights "wALkinG" and be pruned "my are cute" to get a "compressed" fuzzy-like network but we can for now keep a clean hierarchy so we can easily see what is going on!

I have a working algorithm (trie/tree-based) that updates the connection weights in the tree when it accesses a feature (in the same time order ex. a>b>c, cba is a different feature), so it knows how many times it has seen 'z' or 'hello' or 'hi there' in its life so far! Frequencies! This is my Online Training for weights. Adding more data always improves my predictor/model, guaranteed. I tested using not Perplexity but Lossless Compression to Evaluate my model's predictions. So now you can imagine my razor tooth hierarchies with counts (weights) placed on connections. Good so far. Starting to look like a real network and can function like one too! https://ibb.co/hC8gkFC
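
A minimal sketch of a trie with counts on its connections, to show what such an Online update could look like. It is not the actual code from the file; the names `Node`, `update` and `count`, and the limit of 3 symbols per feature, are assumptions made for the example.

```python
class Node:
    def __init__(self):
        self.count = 0        # how many times this exact sequence was seen
        self.children = {}    # next symbol -> Node

def update(root, text, max_order=3):
    """Online update: count every substring of up to max_order symbols."""
    for start in range(len(text)):
        node = root
        for ch in text[start:start + max_order]:
            node = node.children.setdefault(ch, Node())
            node.count += 1

def count(root, seq):
    """Return how often seq was seen, walking the trie in stored order."""
    node = root
    for ch in seq:
        node = node.children.get(ch)
        if node is None:
            return 0
    return node.count

root = Node()
update(root, "abcabc")
print(count(root, "abc"))  # seen twice
print(count(root, "cba"))  # never seen: order matters
```

Each feature is stored once and only its count grows, and "abc" and "cba" land on different paths because the trie is walked in time order.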

Now for the cool part I want to slap on here. I hope you know Word2Vec or Seq2Seq. It translates by discovering cat=dog based on shared contexts. The key question we need/ will focus on here now is how does the brain find out cat=dog using the same network hierarchy? Here's my answer below and I want to know if you knew this or if you have a more natural way.

https://ibb.co/F4BL1Ys Notice I highlighted the cats and dogs nodes? The brain may see "my cats eat food" 5 times and then, tomorrow, may see "my dogs eat food" 6 times. Only through their shared contexts will energy leak and trigger cats from dogs. There's no other realistic way this would occur other than this. The brain is finding out cats is similar to dogs on its own by shared strengthened paths leaking energy. So next time it sees "dogs" in an unseen sentence like "dogs play", it will activate both dogs and cats nodes by some amount.

We ignore common words like "the" or "I" because they will be around most words, it doesn't mean cats=boat. High frequency nodes are ignored.
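
Here is a toy sketch of the shared-context idea, including the skipping of very common words. It is an assumption-heavy illustration rather than my network: contexts are just the immediate left/right neighbours, "too common" means appearing in more than `max_share` of the sentences, and similarity is plain set overlap (Jaccard). All names are hypothetical.

```python
from collections import Counter, defaultdict

def contexts(sentences, max_share=0.5):
    """Collect, for each word, the neighbouring words it was seen next to.

    Words occurring in more than max_share of the sentences are treated
    as too common to carry meaning and are left out of every context set.
    """
    tokenised = [s.split() for s in sentences]
    doc_freq = Counter(w for words in tokenised for w in set(words))
    common = {w for w, n in doc_freq.items() if n / len(tokenised) > max_share}
    ctx = defaultdict(set)
    for words in tokenised:
        for i, w in enumerate(words):
            for j in (i - 1, i + 1):
                if 0 <= j < len(words) and words[j] not in common:
                    ctx[w].add(words[j])
    return ctx

def similarity(ctx, a, b):
    """Jaccard overlap of two words' context sets (0.0 to 1.0)."""
    union = ctx[a] | ctx[b]
    return len(ctx[a] & ctx[b]) / len(union) if union else 0.0

sentences = ["my cats eat food", "my dogs eat food",
             "my cats sleep", "my dogs sleep", "my boat floats"]
filtered = contexts(sentences)                   # "my" is ignored
unfiltered = contexts(sentences, max_share=1.0)  # nothing is ignored
print(similarity(filtered, "cats", "dogs"))
print(similarity(filtered, "cats", "boat"))
print(similarity(unfiltered, "cats", "boat"))
```

With "my" ignored, cats and dogs share all their contexts and cats and boat share none. Without the filter, cats and boat look partly similar only because both follow "my".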

Word2Vec or the similar can look at both sides around a word to translate, use long windows, skip-gram windows, closer words in time have more impact, and especially the more times seen (frequencies). My hierarchy can naturally do all that. Word2Vec also uses Negative Sampling, and my design can also use inhibition for both next word and translation Prediction.

Word2Vec uses vectors to store words in many dimensions and then compare which are closer in the space. Whereas my design just triggers related nodes by how many local contexts are shared. No vectors are stored in the brain... Nor do we need Backprop to update connections. We increment and prune low frequency nodes or merge them etc, we don't need Backprop to "find" this out, we just need to know how/ why we update weights!

There's a such thing as contextual word vectors. Say we see "a beaver was near the bank", here we disambiguate "bank". In my design, it triggers river or wood more than TD Trust or Financial building. Because although "near the bank and the building" and "near the bank with wood" both share bank, the beaver in my sentence input triggers the latter sentence more than the financial one.

Word2Vec can do the "king is to queen as man is to what?" by misusing dimensions from king that man doesn't have to find where queen is dimensionally without the king dimensions in man to land up at woman. Or USA is to Canada as China is to India, because instead of them lacking a context they both share it here but the location is slightly off in number. But the brain doesn't do this naturally, just try cakes are to toast as chicken is to what? Naturally the brain picks a word with all 3 properties.

To do the king woman thing we need to see the only difference is man isn't royal, so queen is related to woman most but not royal, hence woman. This involves a NOT operation, somehow.

Ok so, when my architecture is presented with "walking down the" it activates multiple nodes like "alking down the" and "lking....." and "king...." ..... and "down the" and "the" and also skip-gram nodes ex. "walking the", as well as related nodes ex. "running up that" and "walking along the". My code BTW does this but not related or skip-gram nodes yet! What occurs now is all activated nodes have shared parent predictions on the right-hand side to predict the next letter or word. So "down the" and "the" and "up this" all leak energy forward to "street". This Mixing (see the Hutter Prize or PPM) improves Prediction. You can only repeat the alphabet forward because it was stored that way. Our nodes have now mixed their predictions to decide a better set of predictions. https://ibb.co/Zz91jQQ
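
The mixing step can be sketched like this at letter level. It only covers the exact-suffix matches ("the ", "he ", "e ", " "), not related or skip-gram nodes, and the weighting (a longer match counts proportionally more) is an assumption chosen for the example, not the weighting my code or PPM uses.

```python
from collections import Counter, defaultdict

def train(text, max_order=4):
    """Count which symbol follows every context of 1..max_order symbols."""
    table = defaultdict(Counter)
    for i in range(1, len(text)):
        for order in range(1, max_order + 1):
            if i - order < 0:
                break
            table[text[i - order:i]][text[i]] += 1
    return table

def predict(table, prompt, max_order=4):
    """Mix the next-symbol distributions of every matching suffix of prompt."""
    mixed = Counter()
    for order in range(1, max_order + 1):
        if order > len(prompt):
            break
        followers = table.get(prompt[-order:])
        if not followers:
            continue
        total = sum(followers.values())
        for symbol, n in followers.items():
            # Assumed weighting: longer matched contexts get more say.
            mixed[symbol] += order * n / total
    norm = sum(mixed.values())
    return {s: w / norm for s, w in mixed.items()} if norm else {}

text = "walking down the street. running down the street. walking along the road."
table = train(text)
probs = predict(table, "up the ")   # "up the " itself was never seen
print(max(probs, key=probs.get))
```

Even though the full prompt is unseen, its shorter suffixes still match stored contexts, and their predictions combine into one distribution.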

My design is therefore recognizing nodes despite typos or related words. It can also handle rearranged words like "down walking the" by time delay from children nodes. Our "matches" in the hierarchy are many, and we have many forward predictions now, we can take the top 10 predicted words now. We usually pick the top prediction, mutation makes it not perfect on purpose, it's important.

You may wonder, why does thinking in the brain only hear 1 of the top 10 predictions? All 10 nodes are activated, and so are recently heard nodes kept Active! If they were heard, you'd hear them in your mind, surely? If you imagine video in your brain, it'd be very odd to predict the next frame as a dog, cat, horse, and sheep, it would be all blended like a monster. The brain needs precision. So Pooling, as done in CNNs, is used in picking from top 10 predictions! Other nodes and predictions still are activated, just not as much.

Also, Pooling in my architecture can be done for every node's outputs! Not just the final high layer. Pooling helps recognition by focusing. Pooling can be controlled indirectly to make the network Summarize or Elaborate or keep Stable. It simply says or doesn't say more or less important nodes, based on the probability of being said. Like you may ignore all the "the" or you may say a lot of filler content that isn't even rewarding like talking about food (see below).

When given a prompt ex. "What do you want to eat? What?" you may first parrot exactly the start, and some may be said in your own loved words I, fries, etc. Or you may just say the entail. You might just say what they said and stop energy forward flow. And you might just say fries in place of "What?". Why!? Because their words, and your loved words fries, I, etc are pre-active.

One more thing I'll go through is Temporary Energy and Permanent Energy in my architecture. You can see Facebook's new chatbot Blender is like GPT-2 but it has a Dialog Persona that makes it always say certain words/ nodes. So if it likes food or communism, it will bring it up somehow in everything. Just look at what I'm writing, it's all AI related! Check out the later half of this guy's video: https://www.youtube.com/watch?v=wTIPGoHLw_8

In my design, positive and inhibitory reward is installed on just a few nodes at birth time, and it can transfer reward to related nodes to update its goals. It may see contextually food=money, so now it starts talking about money. Artificial rewards are changeable, root goal is not modifiable as much.

For Temporarily Active nodes, you can remember a password is car and forget it, but of course you retain car node. This is a different forgetting than pruning weak weights forever. GPT-2 is probably using the last 1,000 words for prediction by this very mechanism. The brain already has to keep in memory the last 10 words, so any predicted nodes that are pre-active from being held in memory get a boost. If you read "the cat and cat saw cats cat then a cute" you predict cat, and the cat node is already activated 4 times just recently. You're holding the words in your hierarchy nodes, not on paper anymore. So yes energy is retained for a while and affects the Probabilities predicted!
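
A small sketch of how temporarily active nodes could boost predictions. The decay and strength values, the multiplicative boost, and the function name are all assumptions for illustration. It only counts exact word matches, so "cats" does not add energy to "cat" here.

```python
def boost_with_recency(base_probs, recent_words, strength=0.5, decay=0.8):
    """Raise the probability of candidates that were heard recently.

    Each recent occurrence adds energy that fades with distance from
    the end of the context; the result is renormalised to sum to 1.
    """
    energy = {}
    for distance, word in enumerate(reversed(recent_words)):
        energy[word] = energy.get(word, 0.0) + decay ** distance
    boosted = {w: p * (1.0 + strength * energy.get(w, 0.0))
               for w, p in base_probs.items()}
    total = sum(boosted.values())
    return {w: p / total for w, p in boosted.items()}

# Hypothetical base predictions with no preference between candidates.
base = {"cat": 0.25, "dog": 0.25, "car": 0.25, "idea": 0.25}
recent = "the cat and cat saw cats cat then a cute".split()
probs = boost_with_recency(base, recent)
print(probs)
```

The "cat" candidate ends up ahead of the others purely because it is still holding energy from the last few words.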

I once played Pikmin for half the day, and when I went in the kitchen things looked like Pikmin faces or I saw them faintly but still somewhat vividly running around things. It causes dreams too: more random predictions from the top 10 or 100 predictions. They're not really good predictions in dreams.

You can see how this helps. Say you only read 100,000 bytes of data so far, and you now read "the tree leaves fell on the root of the tree and the", you have little data trained on so far, but you can predict well the next word is Probably a related word to tree, leaves, etc, so leaf, tree, branch, twig all get boosted by related words from recently read words. And it's really powerful, I've done tests in this area as well. The Hutter Prize has a slew of variants I presented. Like looking at the last 1,000 letters to boost the likeliest next letter. That's good but not as commonly accurate or flexible as word prediction using related words, instead of Exact letters! Big difference.

I look forward to your thoughts, I hope I provided some insight into my design and tests. I hope you can help me if there is something I'm missing, as my design does do a lot in a single architecture. I don't see why it's a good idea to study it as a stack of black boxes without fully understanding how it makes decisions that improve Evaluation (prediction). While my design may be inefficient it may be the natural way it all fits together using the same nodes.

To learn more, I have a video and a different but similar run through my design in this file (and how my code works exactly): https://workupload.com/file/Y4XhZPYHzqy
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 22, 2020, 10:14:05 AM
Earth formed about 4,500,000,000 years ago. The first duplicating cell on Earth emerged about 3,500,000,000 years ago. Humans, capable of improving their own tools and own design, emerged on Earth just 6,000,000 years ago. Since the last couple of thousands of years on Earth we have been radically improving our vocal/ phone/ computer communications, data storage/ computation, and transportation technologies to name a few big ones. We now have huge skyscrapers and every year or so a better iPhone or AI. The data exchange/ combination and mutation and specialization lead to the ability to more quickly do it again but faster. It's all going to go down now (for Earth) in the next ~100 years. We will invent an artificially duplicating hardware/computer that is efficient and programmable, even adaptive. Simple nanobots. It will allow us to improve the nanobots further and make bigger faster computers and collect more diverse data. All of Earth will become a grey goo nanobot swarm that can predict well (especially being a formatted terraformed planet being a fractal now, knowing where, when, and who all is using least data to know so is most efficient), regenerate super fast, create or become anything at whim, and will continue to grow in size by eating planets (although will have to be not so dense or else becomes a star/uranium and explodes radiation). We already can see Earth becoming a fractal, stores are built near stores, homes lined up...

The trend? Evolution is evolving longer living structures. Longer Lifespans, is the way Evolution works and prefers. Humans already seek longer lives.

I'm therefore not working on the wrong technology. AGI is the next species in Evolution. I have found hundreds of capabilities they will have that put them way ahead of us easily. Humans were smarter than apes by far, and AIs will be exponentially way more than us.

And my approach to AGI is not wrong. AGI needs not just more existing data/compute but a smarter discovery Extractor/Generator to create NEW desired data to its held questions (duplicating old data with mutations). The output of AGI is only to either implement plans or update where to collect data from, those silly RL walker robots do this and GPT-2 should if we improve it to do so. It is specializing in where to collect new data from, which question, which source. I don't really need a body for my AGI therefore. Output is just for implementation or data collection specialization updates.

For example, my algorithm I made from scratch, compresses the dataset enwik8 (100MB) to 21.8MB, which means it predicts pretty ok, and my net predicts better the more data it sees, for example if I used the dataset enwik2 (100 bytes lol) it'd compress it to only ex. 70 bytes. Get it?
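
To put numbers on "predicts pretty ok", compressed size can be converted to bits per letter. A quick sketch, assuming enwik8 is exactly 100,000,000 bytes and reading the rounded figures in this post as 21.8 and 14.8 million bytes:

```python
def bits_per_char(original_bytes, compressed_bytes):
    """Average number of bits the compressor spends on each input byte."""
    return 8 * compressed_bytes / original_bytes

ENWIK8_BYTES = 100_000_000

mine = bits_per_char(ENWIK8_BYTES, 21_800_000)     # rounded figure from this post
record = bits_per_char(ENWIK8_BYTES, 14_800_000)   # rounded world-record figure
print(mine, record)
```

An uncompressed byte costs 8 bits, so anything below that is the predictor's guessing paying off.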

SO, with the same dataset enwik8 of 100MB, how can I predict better if I don't have more data? Add more data. WHAT!? Yeah. Let me show you. When you find discoveries in the enwik8 dataset ex. cat=dog by shared contexts, you can recognize longer unseen sentences more robustly, and more! The world's best compressor can get enwik8 to 14.8MB. See?

To give a clearer example: If I window the last word of my context, to predict the next letter or word, ex. "the [cat] ?_?" > "the cat ate", I know, from up to 100MB, with experience, what follows it. BUT, if I look at it like "the [horse] ?_?" and "the [dog] ?_?" etc these words share the same en-tailing words usually so they will surely be helpful. And THAT, gives me more data/insight. Patterns are in the enwik8 dataset, some words are inter-exchangeable!

Oh, so here we see now: Basically, evolution moves faster near the end because of more data mutation. Hence, more storage, communication, and compute improve immortality of data lifespans. And not just more compute/data, but virtual/extracted insights, is where you get the most data. Hence, AGI and a faster/bigger computer both advance evolution! But AGI is much more potent at doing so.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 22, 2020, 08:49:46 PM
https://www.youtube.com/watch?v=TF5cJqXBwhc
Title: Re: Releasing full AGI/evolution research
Post by: ivan.moony on May 22, 2020, 10:26:06 PM
If you ask me, it's better than before, but try to pick a real example of compressing, like a short meaningful sentence, showing what happens to which variables/arrays on each loop step. Like a kind of showing what the algorithm does, step by step, accompanied by clearly shown input position/variable states/generated output. Just a suggestion.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 23, 2020, 08:08:51 PM
From another forum:

Quote
Intelligence requires that there be multiple possible futures, otherwise we would simply be mechanically unfolding a pre-determined destiny.

First I'll start with the obvious side of the coin. We do know our universe is at least somewhat predictable, that's why we can repeat the same lab experiments around Earth, and build neural models of the world to learn patterns. The laws of physics make our world at least partially unfold along a deterministic path. A computer simulation or calculator is also replay-able - it's predictable. We are able to understand things because they are not random.

On the other hand, the word "random", by definition, means an outcome/result that varies maximally. So instead of a computer algorithm spitting out the number 5 predictably, you may get 1, then 8 next time, then 3, 0, 8, 6.... If it wasn't [maximally] random you'd get outputs like 6, 4, 5, 5, 6, 4, 4, 5. So random just means a wider variation. It doesn't mean it disobeys the laws of physics. It just means the view we look at something is ignoring fine details. For example, a woman can write down which color dresses she'll show you, so she knows the order of colors, but you don't, so to you it appears unpredictable, but to her, maybe the dress order is down pat and she remembers it like the alphabet. Another example, you fill a glass with milk until it pours over the rim, but the side that leaks first is different each time. Why? The glass is perfectly flat at the top let's say. It's because of the direction the human poured the milk in that caused it to pour over different sides. The human didn't stand in the same spot each time! The definition of random, that I gave here, basically equates to: you don't know something, so you output the wrong answer. But someone else can know what will happen! Lol. In other words, in the physics/laws we have, you can get an algorithm that outputs 5 each time, or outputs 4, 6, 5, 5, 6, 4, 4, or outputs 7, 1, 3, 0, 8, 4, 2, 8. And there's an actual reason behind it. Not magic. The definition of "random" I gave here, therefore, is when you lack information. You don't know what will occur. But once you learn what will occur, you know in the future what would result. This is assuming you can look in a computer or brain to see the stored algorithm. If you can't know what's inside, then you don't know if it will output 5 every time.
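
A tiny sketch of that point using Python's standard pseudo-random generator: the digits look unpredictable to anyone who doesn't know the seed, but whoever knows the seed (the stored algorithm) gets the same sequence every time. The helper name `draw` is just for the example.

```python
import random

def draw(seed, n=8):
    """Return n 'random' digits produced from a known seed."""
    rng = random.Random(seed)
    return [rng.randrange(10) for _ in range(n)]

first = draw(seed=1)
second = draw(seed=1)   # same seed, so the same digits come out again
other = draw(seed=2)    # a different seed gives its own fixed sequence
print(first)
print(second)
print(other)
```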

So, do we have another definition for the word "random"? Yes. I call it True Random. It would need to break the laws of physics. For example, an atom or particle would be shot into space, travelling, and after 45 minutes, decides to change its direction! There's no reason it should have, though. Nothing touched the system and nothing left the system. Now, we already know our world is at least 50% not True Random, but predictable. And in computers there's a thing called redundancy that stops errors from popping up. You can run a car simulation perfectly each time, the same way each time! You could run a human simulation, with no True Randomness! Unless it makes us act the way we do. So, True Randomness may exist, and it may be helpful in making more robust predictors that handle uncertainty. You could just make your world/borg garage larger. Larger systems can avoid errors and damage more than brittle delicate small systems. It takes longer for the errors to show up. So the borgs could more easily predict where things are at the high level. Now, one could argue that if particles acted truly random 50% of the time, it would show up in computer car simulations! But it doesn't. So the real reason we get errors is because there are faults at low levels we don't know about. That's all. Not True Randomness. Now, can we solve this? Yes. We already are. Humans produce babies without the DNA information disappearing. We can repair cars indefinitely. But we can't know where every particle in our system is, for to do so would require knowing where the particles (that make up our knowing) are, which is impossible. You could make everything into solid cubes, but you still can't model your world perfectly, only approximately.

The 4th side of the coin is magic orbs from God herself. Unfortunately, if you were hoping for this to be a valid thing, you are mistaken. Magic has no place, magic has to be either True Randomness, Randomness, or Laws of physics. There's no, such, thing, as magic. Either a particle moves as expected based on its and/or other surrounding context/conditions -OR- it pops into/out of existence some "move" or "particle" or "law" that truly is random. Say we had a genie ghost waving its hand with Free Will, granting wishes. The way it works is not by an existing predictive mechanism, but by popping into existence stuff, and must be non-random stuff. But why non-random? Because the genie would not exist, it'd be illogical soup. But what sort of "dimensional ether" is remembering or directing non-random creation in real time? We need something already existing to do this. A designer who creates a designer who... So it's impossible.

Quote
Your compression thingy will basically produce something that spews language, gibberish actually because there is no world model or understanding behind it. Much more importantly,there is no path to general problem solving, or even generalized language gibberish spewing, just a specific language.

"Your compression thingy"

This shows you lack understanding. Gosh. Lossless Compression is just an Evaluation for my neural net predictor I made. I could use Perplexity. Same algorithm, just different test of how good my algorithm is at predicting data in the distribution.

"will basically produce something that spews language"
"gibberish actually because there is no world model or understanding behind it."

Again, you're lacking here. Neural networks learn a model of DATA. Be it text or vision. - Both are language. Which means they CAN learn PATTERNS. Patterns mean frequency, because in a dataset you may see the letter 'z' or word 'grommet' appears not too often! Maybe nothing re-occurs! Maybe the whole dataset is tttttttttt. So you can predict/generate the likely future, being the letter 'e' or word 'the'. Now, because of these re-occurring letters or words, words like cat & dog can be found to share the same contexts. Dogs eat, dogs jump, cats eat, cats jump. Thank god the word "jump" appears at least twice lol. Else no semantics! SO: A neural model can learn the letter 'e' appears very frequently, 'z' appears infrequently, 'cat' is very contextually similar to 'dog', and 'cat' is very different than 'jog'. Neural Models help organisms to survive longer in Evolution. Even if you don't believe text data mirrors human vision_thoughts data, you can still trust the algorithm can work on ANY dataset by "finding" patterns. In FACT, the Transformer architecture used in GPT-2, works on vision and music datasets.

"or even generalized language gibberish spewing, just a specific language."

First of all, the algorithm I already coded from scratch can predict the next letter of any language/ generate other languages too, like Hindi, French, etc. You just feed it such a dataset and it learns the patterns. Currently I use enwik8. Now, my future algorithm, and the already existing GPT-2 made by OpenAI, can already learn cat=dog semantically by shared contexts, cat/dog are interchangeable and it can recognize unseen sentences. It helps it know what entails a given word or phrase by looking at many many similar situations from past experience. As well, it can learn hello=bonjour, if it is fed diverse data that has enough French words! This works for vision too. And if you use text + vision you will need to associate them in the same time they were shown.

"Much more importantly,there is no path to general problem solving"

You've literally just asked me how to create AGI. AGI needs to solve many different types, of Hard Problems. To do so, it needs a large/diverse model, not just so it can solve various domains, but so it can use all sorts of domains when solving a problem in a given domain. It needs to know frequencies or IOW Cause > Effect probabilities of our physics (dogs usually breathe, not eat) to logically think about paths it COULD take. And must take a path it desires too, to reach the desired outcome. It must wait at steps, until they are completed. It must update goals through induction/semantics. Food = money = jobs = truck = wrenches. It will ask new questions and seek new data from specialized sources or questions. It may need to search/mutate answers before it mentally generates a good well-backed/aligned answer. It needs to be told when you look at 2+2=, it must be a precise answer, not 8, even though it kind of answers the question. It needs to be told when you are unsure of the prediction for 2+2=, you must look at it a different way or collect more specific data, if it is unsure about 573+481= it can look at it a different way (assuming you are sure of 2+2=4 etc etc). You are told to resort to look at [5]73
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 26, 2020, 02:14:59 PM
Little update. I'm using all chars now in my predictive model, just a few letters in my input were all ???? no matter the special char that *should* have been there. My code is approx. 114 lines of Python. Could make it a tad smaller. Also a Python pro could make it smaller yet. It's made of 4 parts; tree storage, tree searching, mixing/weighting predictions, arithmetic coding (evaluation of the algorithm).

Reference: world's best compressor gets 14.8MB for 100MB (enwik8 dataset)

1,000,000 bytes in
Shelwien's Green --- 256,602 compressed
My algorithm: --- 252,591 compressed

Green: 100MB > 21,819,822 bytes
I should at least reach then: 21,478,751 bytes

update:
NEW: 251,699
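
The "I should at least reach" number above is a simple proportion. The sketch below reproduces it, assuming my size relative to Green on the first 1,000,000 bytes carries over unchanged to the full 100MB (an assumption, since the ratio could drift on more data). The function name is made up for the example.

```python
GREEN_1MB = 256_602          # Green's size on the first 1,000,000 bytes
GREEN_100MB = 21_819_822     # Green's size on all of enwik8

def extrapolate(my_1mb_bytes):
    """Scale Green's 100MB result by my ratio to Green on the 1MB sample."""
    return GREEN_100MB * my_1mb_bytes // GREEN_1MB

first_estimate = extrapolate(252_591)
updated_estimate = extrapolate(251_699)   # using the NEW 1MB result
print(first_estimate, updated_estimate)
```

The first call gives the 21,478,751 bytes quoted above, and the updated 1MB result works out to roughly 21.4MB.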
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 26, 2020, 09:37:21 PM
Managed to remove 6 lines of code.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 26, 2020, 10:18:08 PM
How to use my code:
https://www.youtube.com/watch?v=3wTOSLOA9GM

My code:
https://workupload.com/file/Sbx7a5q77r3
The hardcode can be minimized, it has patterns.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 28, 2020, 10:21:13 PM
Confused? Someone was.

Now you feel the frontier of hard work not presented properly. Don't repeat this. Most do this.
You can adjust the number I'm on at the start of the video to set how many letters to compress, mine's set to 10,000.
I change the yes/no at top to compress/decompress.
I change the bottom thing to print the input2 to get either the encoded file compressed or the output decompressed. You plug in the encoding to top where shown.
My dataset is in folder shown. Enwik8. Well, part of the start of it.
I was showing at that part of the video how the out.txt was same as the input file.
BTW to decompress you need to modify as shown the input file so only 16 - 3 letters are in it....up to the x letter shown lol.
At the end I do a little calculation to get the actual bytes being compressed.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 29, 2020, 12:37:38 PM
mostly same but clearer rewrite:

Here is my [latest] python code I made from scratch.
And a new 1min video of me using it.

https://workupload.com/file/9x5Ft5EfBfn

https://www.youtube.com/watch?v=q0m-v9192o4

The hardcode in it can be minimized, it has patterns, I just left it for now. Most of the prediction accuracy/ compression is not done by the 3 long middle columns, actually.

Mine can compress 100MB to approx. 21.4MB. World record is 14.8MB.

My code is explained in my book. Basically the more text my code sees the better it can predict which letter follows the last ex. 8 letters, and I mix up to 17 such ex. "The cat ran_" "he cat ran_" "e cat ran_"....

If my code found matches robustly with similar/rearranged position words, I could mix 8000 matched experiences instead of 17. It would help compression a ton. Maybe I'd get 17MB.

And if I held recently activated words, they would be pre-active and require less data to predict what word follows "cat cat cat cat", by just using the already activated nodes themselves! Similar words also become pre-active. Candidate nodes are already pre-active in the brain because the brain holds onto words it heard, so it's a natural thing, they are pre-selected/triggered for softmax output. I examined how GPT-2 does this and I once made a code that worked by doing this, it helped a lot.

And I also know yet more ways to get the compression down / prediction more accurate.

In my book I also link Shelwien's version in C++; I basically followed how his worked. His compression is 21.8MB.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on May 31, 2020, 01:43:16 PM
To do zero-shot or one-shot learning means you are getting [information] from somewhere. Same for multi-shot learning.

Learning (finding) patterns in big data improves intelligence.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on June 02, 2020, 05:36:32 PM
Did anyone here ever realize 1) more data makes AI make smarter decisions, and 2) you can get more data from the same-size dataset? Here's 2 ways how (I know many more ways):

What do you think?
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on June 02, 2020, 09:34:54 PM
I suspected this for years, but here's a bit more proof:

Say you only see the prompt "cats" and can't generalize to dogs. With the predictions eat, run, etc, you can actually translate these. This helps you predict better. While you've only seen 'eat' and 'run' follow 'cats', you know 'eat' is very similar to 'dine', let's say, so you could say 'dine' has a higher chance of appearing than 'bike', all by looking at just the predictions!
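A rough sketch of that translation step in Python: each predicted word hands part of its probability to words that are similar to it. The similarity table, the `share` fraction, and the name `spread_to_similar` are made up for the example; where the similarities come from is left open.

```python
def spread_to_similar(probs, similarity, share=0.3):
    """Give a fraction of each prediction's probability to its similar words."""
    out = {w: (1 - share) * p for w, p in probs.items()}
    for w, p in probs.items():
        neighbours = similarity.get(w, {})
        total = sum(neighbours.values())
        if total == 0:
            out[w] += share * p  # no similar words known, keep the mass
            continue
        for other, s in neighbours.items():
            out[other] = out.get(other, 0.0) + share * p * s / total
    return out

seen_after_cats = {"eat": 0.5, "run": 0.5}
similar = {"eat": {"dine": 0.9}}  # hypothetical similarity scores
new_probs = spread_to_similar(seen_after_cats, similar)
print(new_probs.get("dine", 0.0) > new_probs.get("bike", 0.0))
```

'dine' never followed 'cats' in the data, yet it ends up more likely than 'bike' just from looking at the predictions.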
Title: Re: Releasing full AGI/evolution research
Post by: Hopefully Something on June 02, 2020, 10:17:07 PM
Maybe it could benefit from some larger higher-level generalizations and heuristics to then predict lower level ones. Think big sine waves made of smaller sine waves etc. Fractal sine waves. You can get good top down predictions, better than trying to get a correct big picture by studying the nature of details. Like having a general prediction for what a book would consist of, then dividing that idea, and elaborating on each part, etc.

The most efficient process for putting rocks into a bucket is biggest to smallest, that way all the pieces will fit together. The biggest pieces of data make a rough approximation of the idea, then progressively smaller pieces make increasingly accurate and precise approximations. If you put small ideas or rocks in first, some parts of your model may be precise, but the whole thing won’t end up being accurate to life.
Title: Re: Releasing full AGI/evolution research
Post by: LOCKSUIT on June 02, 2020, 10:47:05 PM
Ah. I see what I've been doing wrong.

I should have started with a blow up doll. I knew it.

Oh wait, already been there.

That's a good discovery, HS. Big rocks go in the bucket first. Then you shake it as well to use up all the space. Wahahahaha. You literally do nothing to make the bucket heavy. I heard a company, I forget which, is letting AI improve circuit boards like that: big components go on first.

Well, HS, that's exactly what my current code does: [long] context matches in the tree shown in the image above get the most weight during Mixing. You can also do Byte Pair Encoding top-down, but going too high is not needed and costs way too many resources. Hmm, perhaps reward or temporary energy etc could take weight first during Mixing. I never thought about that, even though I should have lol. For example, you may consider your friend's opinion on the future word to predict and pay no attention to other information; you just don't look at it.