I'm going to put in the effort to explain my code, and what AGI is, in this post. If members of my group don't connect sooner or later, they must leave for now, in hopes of narrowing down to more similar clones of myself.
Our questions for AGI are things like solving cancer; these are big problems that need lots of data and pattern finding. You could feed GPT-2 800 trillion different questions; you can't just program the correct response to each one to solve AI. GPT-2 is only 400 lines of code and can almost complete the rest of the sentence, image, or song correctly for any input fed in. It is general purpose, like the brain. Check out openAI.com.
The best algorithm on the Hutter Prize compresses 100MB down to 14.8MB. This is an evaluation you must use every time you run your AI: it tells you whether you implemented the next part of your algorithm correctly, or better. The better it predicts the next letter, the better it can losslessly compress the data; in that sense it understands that data.
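To make the prediction-equals-compression link concrete, here is a minimal sketch (standard information theory, not the Hutter Prize scorer itself) of how the probability a model assigns to each actual next letter turns into a compressed size in bits; the probabilities are made up for illustration:

```python
import math

# Probabilities the model assigned to each letter that actually occurred next
# (made-up numbers). Better predictions -> higher probabilities -> fewer bits.
predicted_probs = [0.61, 0.05, 0.90, 0.33, 0.75]

# An ideal entropy coder spends about -log2(p) bits per letter.
bits = sum(-math.log2(p) for p in predicted_probs)
print(f"{bits:.2f} bits to encode {len(predicted_probs)} letters")

# A model that assigned 1/256 to every letter would need 8 bits per letter,
# so anything below that is compression.
```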
My code above can compress 100MB to ~20.5MB, and I know fully how it works. It's only 100 lines of code too. I take a dataset of 100MB of text and start scanning it with a 16-letter-long window, storing every 16 letters of it in a trie. I don't store the same root of a branch twice: 'we love food' and 'we love beds' can share the same root branch. Brains don't store the same word twice either; they instead strengthen connections to represent the frequency of times seen. This strength fades and is eventually forgotten (for the permanent version, keep reading). As my 16-letter window scans and builds a tree, or hierarchy, I also have the tree/brain predict the next letter, to send to evaluation. If my input prompt is 'walking down the stree_?_', I search for an exact match in the tree and get the letters that came after it in the dataset. So after those 15 letters I may have seen the next letter be t 44 times, a 5 times, z 1 time, m 9 times, $ 1 time, ! 1 time, etc. This probability distribution is beginning to be learnt. Now, if there are only 2 possible next letters I saw come next and I have 77 observations, then I am sure I know the distribution.
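The post doesn't include the code itself, so here is a minimal sketch of that paragraph as I read it: slide a 16-letter window over the text, store each window in a trie (shared prefixes share one branch, counts stand in for connection strength), and look up an exact context to get the next-letter counts. Names like `build_trie` and the demo text are mine, not the author's:

```python
from collections import Counter

WINDOW = 16  # the 16-letter scanning window described above

def build_trie(text):
    """Store every 16-letter context in a trie; count the letter that follows."""
    root = {}
    for i in range(len(text) - WINDOW):
        context, next_letter = text[i:i + WINDOW], text[i + WINDOW]
        node = root
        for ch in context:                   # shared prefixes share one branch
            node = node.setdefault(ch, {})
        # 'counts' never collides with the single-letter child keys
        node.setdefault('counts', Counter())[next_letter] += 1
    return root

def predictions(root, context):
    """Exact-match lookup: counts of letters seen right after this context."""
    node = root
    for ch in context[-WINDOW:]:
        if ch not in node:
            return Counter()                 # no exact match in the tree
        node = node[ch]
    return node.get('counts', Counter())

trie = build_trie("we love food and we love beds and " * 40)
print(predictions(trie, "ood and we love "))   # -> Counter({'b': 40})
```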
Longer matches are better but rarer in a dataset. If you have 33 different letters that can come next and each was seen about 1 time, you still need many more observations, so my code resorts to shorter matches: I search the tree for 15-letter matches, then 14, 13... I get up to 16 sets of predictions, and I basically stop if, by e.g. the 4-letter match, I already have enough observations. So each set, especially the shorter matches, gets some weight, and I mix all of the up-to-16 sets of predictions.
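A sketch of the backoff just described: collect prediction sets from the longest match down to no context, stopping once enough observations have accumulated. For clarity this keeps one table per context length instead of the single trie the post describes, and the 'enough' threshold of 75 is a placeholder:

```python
from collections import Counter, defaultdict

MAX_CTX = 16
ENOUGH = 75   # placeholder for 'enough observations to trust the distribution'

def build_models(text, max_ctx=MAX_CTX):
    """One table per context length: context string -> counts of next letters."""
    models = [defaultdict(Counter) for _ in range(max_ctx + 1)]
    for i in range(1, len(text)):
        for n in range(0, min(max_ctx, i) + 1):
            models[n][text[i - n:i]][text[i]] += 1
    return models

def backoff_predictions(models, prompt, enough=ENOUGH):
    """Collect prediction sets from the longest match down, stopping early
    once enough observations have been gathered."""
    sets, total = [], 0
    for n in range(min(MAX_CTX, len(prompt)), -1, -1):
        counts = models[n].get(prompt[len(prompt) - n:], Counter())
        if counts:
            sets.append((n, dict(counts)))
            total += sum(counts.values())
        if total >= enough:          # sure enough about the distribution
            break
    return sets

models = build_models("walking down the street " * 20)
for n, counts in backoff_predictions(models, "down the stree"):
    print(n, counts)
```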
For the no-context set, say 'a' appeared 47455 times, 'b' 5644 times, 'z' 352, .... I divide each by the sum of all counts here to get a softmax-like score, e.g. 'a' 0.24 (24% of counts), 'b' 0.07, 'z' 0.03. Same for the contextual prediction sets. The sets from long matches get less weight, so e.g. if 'a' has 0.37 and I give that set 20% weight, then 0.37 * 0.2 = 0.074; if the set got 100% weight it'd be 0.37 * 1 = 0.37. So shorter matches' sets of predictions get more attention.
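A sketch of the normalize-and-mix step: each set of counts becomes a probability distribution, then the sets are blended with per-set weights. The counts and weights below are illustrative only, echoing the 0.37 * 0.2 = 0.074 example:

```python
from collections import Counter, defaultdict

def normalize(counts):
    """Counts -> probabilities, e.g. {'a': 47455, 'b': 5644} -> {'a': 0.89, ...}."""
    total = sum(counts.values())
    return {letter: c / total for letter, c in counts.items()}

def mix(sets_with_weights):
    """Blend several prediction sets; the weights should sum to 1."""
    mixed = defaultdict(float)
    for counts, weight in sets_with_weights:
        for letter, p in normalize(counts).items():
            mixed[letter] += weight * p
    return dict(mixed)

no_context = Counter({'a': 47455, 'b': 5644, 'z': 352})
short_match = Counter({'t': 44, 'a': 5, 'm': 9, 'z': 1})
long_match = Counter({'t': 3, 'o': 1})

# Illustrative weights only: the long, rare match gets 20% here.
final = mix([(no_context, 0.3), (short_match, 0.5), (long_match, 0.2)])
print(sorted(final.items(), key=lambda kv: -kv[1])[:3])
```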
So to sum it up: from many past experiences, given the context, and viewing the context in multiple ways, we know which next letter is probably going to appear most of the time, and when it would be z (once every 1,000 times we see the context hell_?_, it'll be z).
The more data it trains on, the more accurate this network is; it's so fun! It improves as expected if you plot it per every 100 bytes or so you feed it. 10MB is better than 1MB.
For a set of predictions, letters that have many counts get even more, but never reach perfect either; this is the exponential neuron threshold function. So with a 43646, b 45, d 76, e 3, z 2... a gets even more: it thinks a is 0.98 (98%) likely to be the next letter, but it won't go to 0.999999. The S curve shoots up fast, as if it thinks the answer is yes or no, but then levels flat before reaching the top of the box.
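The post doesn't give the exact threshold formula, so this is only a guess at the shape it describes: a logistic S-curve with a floor and a ceiling, applied to the normalized counts and renormalized, so a dominant letter stays near-certain but never hits 1.0 and rare letters never hit 0.0:

```python
import math

def squash(p, ceiling=0.995, floor=0.001, steepness=10.0):
    """S-curve from roughly `floor` to `ceiling`: shoots up quickly as evidence
    mounts, but flattens before ever reaching 1.0 (or dropping to 0.0)."""
    s = 1.0 / (1.0 + math.exp(-steepness * (p - 0.5)))
    return floor + (ceiling - floor) * s

counts = {'a': 43646, 'b': 45, 'd': 76, 'e': 3, 'z': 2}
total = sum(counts.values())
probs = {k: v / total for k, v in counts.items()}

squashed = {k: squash(p) for k, p in probs.items()}
norm = sum(squashed.values())
final = {k: round(v / norm, 4) for k, v in squashed.items()}
print(final)
# 'a' stays near-certain but never hits 1.0, and the rare letters keep a small
# nonzero share, which matters for lossless coding.
```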
I do this for layers too: if the 8-letter match has enough observations and need not include the 7-, 6-, 5-letter matches to get more predictions, and the 8-letter match is really sure, then I give more weight to its set of predictions.
I also set, manually for now, a global layer weight, so e.g. 5 letters gets 30% weight; I cut it in half, so to speak, to allow it to decide on its own whether the 5-letter set is sure enough or not.
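A sketch of the two weighting ideas above: each context length ('layer') has a hand-set global weight, which is then scaled by how sure that layer is. The weights, the 'cut it in half' factor, and the observation-count confidence measure are placeholders of mine, not values from the post:

```python
def layer_weight(context_len, observations, global_weights, half=0.5, enough=75):
    """Hand-set global weight per layer, scaled by that layer's own confidence."""
    base = global_weights.get(context_len, 0.0)
    confidence = min(1.0, observations / enough)      # crude 'sure enough' measure
    return base * (half + (1.0 - half) * confidence)  # never less than half its base

# Illustrative global weights per context length (e.g. the 5-letter layer = 30%);
# in practice the resulting weights would be renormalized to sum to 1.
GLOBAL = {8: 0.10, 5: 0.30, 3: 0.35, 0: 0.25}

for n, obs in [(8, 120), (5, 12), (3, 400), (0, 50000)]:
    print(n, round(layer_weight(n, obs, GLOBAL), 3))
```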
I use letter recency to improve next-letter prediction. I may have seen z appear 55 times in my life in this dataset, and a 46457 times, but if z was seen just 100 letters back, it feels like I saw z 5000 times. That boost fades fast back to 5 times, but it makes me expect the letter after zzzzz_?_ to be a or z. This is yet another pattern: it merges energy on neurons to boost them for now, like we merge counts on a connection and branches in a network and threshold-pool to yes or no. I do it for layers too: I take the last e.g. 4 letters, search the last 300 letters for those 4 letters, collect all the next letters, and boost those predictions; I include this in that layer's set. It helped lots.
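A sketch of the recency boost: re-find the last few letters inside the most recent ~300 letters, collect what followed them there, and return those as temporarily inflated counts to merge into that layer's set. The boost size is a placeholder:

```python
from collections import Counter

def recency_boost(history, ctx_len=4, window=300, boost=50):
    """Re-find the last `ctx_len` letters inside the recent window and count
    what followed them; these counts fade naturally as the window slides on."""
    recent = history[-window:]
    pattern = history[-ctx_len:]
    boosted = Counter()
    start = 0
    while True:
        hit = recent.find(pattern, start)
        if hit == -1 or hit + ctx_len >= len(recent):
            break
        boosted[recent[hit + ctx_len]] += boost   # inflated, short-lived count
        start = hit + 1
    return boosted

history = "the cat sat on the mat, the cat sat on the "
print(recency_boost(history, ctx_len=4))   # boosts letters recently seen after 'the '
```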
The evaluation makes a shorter 'long number' if it predicts better, e.g. 0.568654564745.... This corrects the predictions to the desired letter for lossless extraction of the dataset back again; you run the same code to decompress it letter by letter. It may predict e.g. p a lot, but the next letter is o, so you store the correction, and it's more costly the more its predictions are wrong. This long number, e.g. 0.7856856856, can be compressed further by turning it into binary: 8 bits like 01110100 can store 0-255, and 3 bytes up to 16,777,215, because 8 bits can hold 256 combinations. Then this binary number is stored as bytes and you get Ty$3!74hd54sHJ8$0)3df in the final compressed file.
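A toy sketch of the 'long number' idea, which is arithmetic coding: each letter narrows an interval [low, high) according to the predicted probabilities, and any number inside the final interval, written in binary, is the compressed file. Real coders use integer arithmetic and renormalization rather than Python floats, and a static three-letter distribution stands in for the predictor here:

```python
import math

# Toy static distribution over a 3-letter alphabet, laid out cumulatively;
# a real compressor would get these probabilities from the predictor, per letter.
CUMULATIVE = {'a': (0.0, 0.7), 'b': (0.7, 0.9), 'c': (0.9, 1.0)}

def encode(text):
    """Narrow [low, high) once per letter: a well-predicted letter barely
    shrinks the interval, a surprise shrinks it a lot (and costs more bits)."""
    low, high = 0.0, 1.0
    for letter in text:
        span = high - low
        lo_p, hi_p = CUMULATIVE[letter]
        low, high = low + span * lo_p, low + span * hi_p
    return low, high

low, high = encode("aabac")
bits = math.ceil(-math.log2(high - low)) + 1   # enough bits to pin a number inside
print(round(low, 6), round(high, 6), "-> about", bits, "bits, packed into bytes")
```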
As I said in my work but have not yet implemented, translation is also used, like in other AIs, to recognize the prompt and get more predictions. Cat and dog share the same predictions, so if you normalize this data you can see dog and cat are very similar: of all their predictions, they share 80% of the things they each predict. This allows you to recognize long questions and know what letter comes next, because you get lots of matches, e.g. 'I was eating ?' matches memories 'we were swallowing P', 'I was devouring P', 'we then ate P'.
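A sketch of the translation idea: normalize two words' next-word prediction sets and measure how much probability mass they share; heavy overlap (the 80% figure above) marks them as interchangeable. The counts are invented for illustration:

```python
def normalize(counts):
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def overlap(counts_a, counts_b):
    """Shared probability mass between two normalized prediction sets (0..1)."""
    a, b = normalize(counts_a), normalize(counts_b)
    return sum(min(a.get(w, 0.0), b.get(w, 0.0)) for w in set(a) | set(b))

# Invented next-word counts following 'cat' and 'dog' in some corpus.
after_cat = {'food': 40, 'sat': 25, 'ran': 20, 'meowed': 15}
after_dog = {'food': 35, 'sat': 20, 'ran': 30, 'barked': 15}

print(round(overlap(after_cat, after_dog), 2))   # high overlap -> near-synonyms
```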
Like Blender and PPLM etc., though I haven't yet implemented it, reward steers prediction: you can influence the prediction by some % to be the letter or word love, food, sex, kissing, AI, or immortality. Through translation it can leak reward chemical to related nodes to learn new, better predictions to achieve/see/predict the desired result. It's all based on cause > effect, statistics. Matching/merging is counting/math. The brain efficiently stores patterns and runs fast because of that.
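A sketch of reward steering in that spirit (not Blender's or PPLM's actual method): blend the model's prediction some percentage toward a hand-picked reward distribution over favoured words; the numbers are placeholders:

```python
def steer(model_probs, reward_probs, influence=0.2):
    """Blend the model's distribution `influence` of the way toward the reward."""
    words = set(model_probs) | set(reward_probs)
    return {w: (1 - influence) * model_probs.get(w, 0.0)
               + influence * reward_probs.get(w, 0.0)
            for w in words}

model = {'food': 0.5, 'work': 0.3, 'sleep': 0.2}
reward = {'food': 0.6, 'immortality': 0.4}     # hand-set favoured topics

print(steer(model, reward, influence=0.2))
```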
BTW, you recognize stretched/bigger/rotated objects because if each part of their lines is stretched or rotated by the same amount, there is no error. E.g. h--e--l--l--o is 2 letters off for each letter, totalling 8 error, but because each is off by the same 2, there is only a base error of 2; it's a pattern. If we had h------e-l--l-----------------------------------------o, this will not "look" like hello; there is no clear pattern that it is hello, it is random.
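A sketch of that 'same offset everywhere' test: find the template letters in the stretched string and measure the spread of the gaps between them; equal gaps mean zero spread (it still looks like hello), uneven gaps mean a large spread (it looks random). The spread measure is my own stand-in for the post's error score:

```python
from statistics import pstdev

def gaps(stretched, template="hello"):
    """Positions of the template's letters in the stretched string, as gaps."""
    positions, search_from = [], 0
    for ch in template:
        pos = stretched.index(ch, search_from)
        positions.append(pos)
        search_from = pos + 1
    return [b - a for a, b in zip(positions, positions[1:])]

uniform = "h--e--l--l--o"                    # each letter shifted equally
messy = "h------e-l--l----------------o"     # uneven stretching
for s in (uniform, messy):
    g = gaps(s)
    print(s, g, "spread:", round(pstdev(g), 2))
# Equal gaps -> spread 0.0, so it still 'looks like' hello; uneven gaps don't.
```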
And multi-sensory: a brain tries to efficiently capture the world's data. It can't brute-force simulate everything and every atom; that would work if it could (to find all passwords, one of which is the answer, or to make all possible dead bodies by trying every arrangement of particles in a simulation), but it is costly. So brains use vision, sound, etc. to capture snapshots of the world, to get a spectrum; more eyes and more diverse sensors capture the distribution faster. Same for more AGI agents.
There are many more patterns, but they are rare, e.g. "Tom Ron Clark has a mom named Jane Bane ? (CLARK!)" and "rat wind mice, home scarf building, loop frog tunnel, pizza sun food, ant gum ? (BUG!)". This is actually a triple-match translation and an energization of it, and it predicts a translation, haha. A brain can learn new patterns just BY reading data/patterns. Using my few big golden patterns above you can get IF-THEN rules, and it can therefore build other rules from context>result prediction, see? It's an if-then machine. It models reality.
We also make our homeworld into a fractal pattern so we know where, when, and what everything is. We organize things into merged patterns: everything is square or circular, no odd errors, homes lined up and square, aka stacking count occurrences, see? We group similar buildings together, like food stores and medical centers, and do the same for timing things. It allows us to predict more accurately, and therefore survive longer. All we do is seek life extension (a structure that repeats in pattern or lifetime; a statue, a metal block, or a cloned pattern) by means of food, sex, home, AI, cryonics, etc.; we clone ourselves and force our schooling/beliefs upon kids. AGIs will quickly clone their brain directly like cells, unlike atoms that emerge on their own. It's beneficial to clone your brain so you can help yourself do many things you wanted to do in parallel. We use patterns in the brain and the world to BE a pattern. Nothing else exists but patterns; a rock/toaster/human are just evolved machines. We seek immortality and lie that we are special SO as to extend our lifetime; we fight to live longer, rocks simply don't.