Made the code 124 lines, nearly as fast as before, re-coded one area more neatly, slightly improved the score, and added a generation ability. Had it ready about 2 days ago, but got delayed by life....
--------------------------------------------------------
https://encode.su/threads/3595-Star-Engine-AI-data-compressor
--------------------------------------------------------
Star Engine - AI data compressor
I named my AI after unstable stars and atoms, which pull matter in to "compress it" and then, once too large, throw it back out as radiation to "generate new insights". It's currently in Python (~10x slower than Green, hence ~12 hours for 100MB training), uses lots of RAM, and only outputs a binary string like '01010101' instead of the final packed characters like 'Y', but I just started the implementation and know how to fix all of that.
EVALUATION RESULTS (compare to Hutter Prize and Large Text Compression Benchmark champions):
10,000 bytes in
3,328 bytes out
Shelwien's Green: 3,453
50,000 bytes in
15,174 bytes out
Shelwien's Green: ?
100,000 bytes in
28,028 bytes out
Shelwien's Green: 29,390
1,000,000 bytes in
244,494 bytes out
Shelwien's Green: 256,602
10,000,000 bytes in
[old] 2,288,646 bytes out
Shelwien's Green: 2,349,214
100,000,000 bytes in
I estimate I "can" get ~20,400,000 bytes out
Shelwien's Green: 21,819,822
NEXT LETTER PREDICTION RESULTS (compare to the size of data that would be needed to cheatingly reproduce the subjectively correct following 500 letters for a given prompt):
FOR 10,000 BYTES TRAINED ON:
The girl was sitting on Wikiquot;[http://www.lewrockwell|Ramp>
<contributor>
<text xml:space="preserve">#REDIRECT [[AlECT [[AREDIRECT [[Acce key="8">MediaWiki talk</namespace>
<namespace>
<namespace key="-1"-1">Template talk</namespace>
<namespace key="15">C51ist society might wom and prediawiki.org/xml/export-0.3/" '' moChmlers<potkin|Kropotkin]], PeternSChmler, cht w[s0��xpace>
<namespace key="12"1:/timestamp>2002-02002-02-25T15T15">Wikts. [[Bertrand chietikte Wtrand conal[http://uk.end
</page>
<page>
</revision>
</page>
<namespace key="geri.3<c<page>
FOR 100,000 BYTES TRAINED ON:
The girl was sitting on they can confunce (non-->, with this surelCatd, mak.]
The characteristics set maki.org/ Poccurs in the [[M.
It act Lam, ''unism==
{{main|150px|[[hu:Anarchism]]
[[sl:space="preserve">#REDIRECT [[Fory/fEDIRECT [[Afrom the [[Max Stirner]], but be givities}}
==The [[scienti. The authoritarian ar impain when he overl legration that if regoing (189898952</id>
</contributor>
</contributor>
<username>Ams</username>
<id>15898948</username>Ams</username>Josed. of nexchange example, the first manifests t893>A�xinitially preferentify the many ecles|[[Chich ce 19999|Wizely understand me>
<id>7543</id>
</contributor>
<minor />
<contributor>
<ip>Conversion script</ip>
<namespace key="1">Talk</namespace>
FOR 1,000,000 BYTES TRAINED ON:
The girl was sitting on [[copper]] or [[Zeno "repudiated the omnipotence of 0
| align="right" assumedia.org: The [[bar (lawk=��q.f|Melillage of 14, Andre plays. Par-TV Jaskirport<Plts for its variants from by Shrugged imperiod of Atlas Shrugged|section]] 152.
==San Sebastian Minese: 陳��M.�ju.jpg|thumb|left|Statue of Ayn Rand]]
[[gl:Astrongly replicated by one.
E5*t)#REdoct, rather pervasive death in tre">{|20010
|90 MHz took him for deity asks for in the South Pacific]]'' (glor accumulated "The Book)]], [[Alfreducation system is afa)
* [[PurgBifferency_code=197,�на]]
[[an:Austria]]
[[als:Archeologie]]
[[ru:Арія (крия]]
[[zh-min-nan:Oscar Chióng]]
[[da:Austria (geography of reconstruction:Oscar Christians appeared somethings said to have travel taken from 1
|Colorado]]. The lowere click, On said to have been effective values." | 60 Metallurgy]]) [[twe_oxaxU.S. state]] Science]]s while that Redge talleged|sections]] 121 and 161.
==BC]]]]
{{main|Anarchow has energy university of Povertyle [[Tih
[[Hollywood]] was interesting
Code Use: Place "code".py, input1.txt, input2.txt, and compressed.txt in one folder. Run the code at the desired length of data to eat; at the bottom it tells you how many bytes it compressed to. Then switch the input to input2.txt, lower the run length, ex. from 10000 to 9980, run again, and check decompressed.txt. For generation mode, toggle the word generate in the decode section: with it on, you simply run the code for ex. 100,000 steps, and if the file is 10,000 letters long with your prompt at the bottom, it will see the end of the file and start extending decompressed.txt by 90,000 letters, as sketched below.
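To make generation mode concrete, here is a minimal self-contained sketch of that "run past the end of the file" loop. It is not the author's code.py: the file names and lengths come from the description above, and the predictor is stubbed with a simple order-3 letter-frequency model purely so the sketch runs on its own; the real program drives this loop with the full mixed model described in the next section.

import random
from collections import Counter, defaultdict

def train_stub(text, order=3):
    # stand-in predictor: count which letter follows each 3-letter context
    table = defaultdict(Counter)
    for i in range(order, len(text)):
        table[text[i - order:i]][text[i]] += 1
    return table

def next_letter(table, history, order=3):
    # sample the next letter in proportion to how often it followed this context
    counts = table.get(history[-order:])
    if not counts:
        return " "
    letters, weights = zip(*counts.items())
    return random.choices(letters, weights)[0]

def generate(prompt_path="input1.txt", out_path="decompressed.txt", total_len=100_000):
    # read the prompt file, then keep appending predicted letters until total_len
    text = open(prompt_path, encoding="utf-8", errors="replace").read()
    table = train_stub(text)
    while len(text) < total_len:          # past the end of the file -> keep extending
        text += next_letter(table, text)
    open(out_path, "w", encoding="utf-8").write(text)

if __name__ == "__main__":
    generate()   # a 10,000-letter prompt file gets extended by 90,000 letters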
How it works: A tree stores all 16-, 15-, etc. letter-long strings as the program runs over the dataset; exact matches are stored as counts. Before coding a letter, I search the tree for the last 16, 15, etc. letters seen, take the longest match found, and look at which child letters were seen to follow it and their counts. Each letter's count is divided by the total counts for that layer to get normalized ("softmax") percentages, ex. 0.4 a, 0.6 b, so they add up to 1.0.

Long matches are more accurate but have fewer counts, so a layer only gets part of the weight. If I have only 30 counts in total but only 3 possible next letters were seen to follow, then I am more sure I know the distribution, so that layer gets more weight: I compute lengthOfPredictionSet * 7 as the roof of counts needed to be confident (ex. 3 possible next letters means a roof of 21 counts), then divide the counts seen by that roof to get the percentage of weight this layer gets. If it gets 30%, there is 70% left to hand to the shorter-context matches. I also give some hardcoded static weight, since I must not have cracked the formula yet. The lowest layer is the no-context set of predictions, simply how common each letter is overall.

I apply an exponential function to the layer predictions and to the blended layers, so it pools its thinking: if a prediction is 0.6% it is probably really 7.2%, and if it is 9.9% it is probably 9.4%, and the same toward the other end around 0.1. Energy is used for recency: if I am mixing layer 8 at the moment, I check the last 300 letters for the latest 8 letters and build a temporary set of the predictions that follow them, giving more recent occurrences more count; and if I just saw 'p' 1 or 2 letters ago I predict 'p' less, but a lot more after ~3 letters.

For compression evaluation, I take my final set of predictions and subtract them from a high of 1.0 until I reach the prediction I would need to remake the file; the better my AI predicts the next letter, the less it costs to steer it to the correct one. Once I have subtracted, I also have a low of 0.0, and the space covered by the last subtraction, ex. 0.7 to 0.65, becomes my new high and low. Repeating this gives a very long number, ex. 0.8456346856.... As I build the number I carry away and store the locked digits, ex. high 0.[763]73 and low 0.[763]112, and at the end I store just one number that lies in between. This long number is converted to binary, and is then supposed to be packed into letters, ex. 5Ge8$9&(gf@3Nfy. An extra entry is kept in every prediction set in case an unseen letter needs to be steered to. Decompression uses nearly the same code as compression.
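To make the mixing and the coding cost concrete, here is a minimal self-contained sketch in Python. It is not the author's 124-line program: the depth-16 tree, the lengthOfPredictionSet * 7 confidence roof, the no-context fallback and the extra/escape entry follow the description above, while the static weights, the exponential pooling and the recency "energy" model are left out, and instead of writing an actual bitstream it just sums -log2(p) over the file, which is the size an exact coder would reach.

import math
from collections import Counter, defaultdict

MAX_ORDER = 16   # longest context kept in the tree (16, 15, ... down to no context)
CONF_ROOF = 7    # "lengthOfPredictionSet * 7" counts needed for full layer confidence
ALPHABET  = 256  # leftover weight is spread over all byte values as the extra/escape entry

class StarSketch:
    def __init__(self, max_order=MAX_ORDER):
        self.max_order = max_order
        # one table per context length: context string -> Counter of next letters
        self.tables = [defaultdict(Counter) for _ in range(max_order + 1)]

    def update(self, history, ch):
        # store ch as a seen continuation of every context length 0..max_order
        for order in range(min(self.max_order, len(history)) + 1):
            self.tables[order][history[len(history) - order:]][ch] += 1

    def prob(self, history, ch):
        # blend the per-layer predictions, longest matching context first
        mixed, weight_left = 0.0, 1.0
        for order in range(min(self.max_order, len(history)), -1, -1):
            counts = self.tables[order].get(history[len(history) - order:])
            if not counts:
                continue
            total = sum(counts.values())
            # layer confidence: counts actually seen vs. the "roof" of counts wanted
            confidence = min(1.0, total / (len(counts) * CONF_ROOF))
            layer_weight = weight_left * confidence
            mixed += layer_weight * counts.get(ch, 0) / total
            weight_left -= layer_weight
        # whatever weight is left over becomes the escape probability for unseen letters
        return mixed + max(weight_left, 1e-6) / ALPHABET

def estimated_bytes(text, max_order=MAX_ORDER):
    # code the file online: predict each letter from what came before, then learn it
    model, bits = StarSketch(max_order), 0.0
    for i, ch in enumerate(text):
        bits += -math.log2(model.prob(text[:i], ch))   # cost of steering to the right letter
        model.update(text[:i], ch)
    return int(bits / 8) + 1

if __name__ == "__main__":
    data = open("input1.txt", encoding="utf-8", errors="replace").read()[:10000]
    print("estimated compressed size:", estimated_bytes(data), "bytes")

The high/low narrowing itself looks roughly like this in toy form (plain floats only survive a handful of symbols, which is exactly why the real coder locks in and carries out the leading digits as described above):

def interval_narrow(steps):
    # steps: for each position, (ordered (letter, probability) pairs, actual letter)
    low, high = 0.0, 1.0
    for predictions, actual in steps:
        span, cum = high - low, 0.0
        for letter, p in predictions:
            if letter == actual:                       # shrink to this letter's slice
                low, high = low + span * cum, low + span * (cum + p)
                break
            cum += p
    return (low + high) / 2                            # any number inside the final interval

dist = [("a", 0.6), ("b", 0.4)]
print(interval_narrow([(dist, "a"), (dist, "b")]))     # 0.48, somewhere inside [0.36, 0.6)

The decoder reruns the identical model and at each step picks whichever letter's slice contains the stored number, which is why compression and decompression can share nearly all of their code.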