Releasing full AGI/evolution research

  • 203 Replies
  • 30729 Views

MikeB

  • Nomad
  • 78 posts
Re: Releasing full AGI/evolution research
« Reply #195 on: March 03, 2021, 07:11:30 am »
One way to speed up word search by roughly 10-26x is to keep a tight loop between checking the first letter and moving on to the next data item. Assume most first letters in a word lookup are not a match...
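A minimal sketch of that idea in Python, assuming the 10-26x factor comes from bucketing the vocabulary by first letter so non-matching first letters are never touched at all (names and structure here are illustrative, not MikeB's actual code):

from collections import defaultdict

def build_index(words):
    # One bucket per first letter; a lookup only ever scans its own bucket.
    index = defaultdict(list)
    for w in words:
        index[w[0]].append(w)
    return index

def contains(index, target):
    # Words whose first letter differs are skipped wholesale (~26 buckets
    # for plain English text, hence the rough 10-26x figure).
    return target in index.get(target[0], [])

words = ["apple", "banana", "cherry", "avocado"]
index = build_index(words)
print(contains(index, "banana"))  # True
print(contains(index, "grape"))   # False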


infurl

  • Administrator
  • Millennium Man
  • 1102 posts
  • Humans will disappoint you.
    • Home Page
Re: Releasing full AGI/evolution research
« Reply #196 on: March 03, 2021, 07:39:13 am »
Quote from: MikeB on March 03, 2021, 07:11:30 am
One way to speed up word search by roughly 10-26x is to keep a tight loop between checking the first letter and moving on to the next data item. Assume most first letters in a word lookup are not a match...

MikeB, reading your posts it sounds like you are using a linear search, which is never going to be fast beyond trivial cases. At the very least you should be sorting your data and using a binary search on it. A linear search takes on average about n/2 comparisons, where n is the number of items; a binary search takes about log2(n). For a million records that's 500,000 tests versus 20. You should be able to do what you are doing in microseconds, not seconds.
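For instance, a minimal sketch with Python's standard bisect module (the word list is illustrative):

import bisect

def binary_contains(sorted_words, target):
    # Half-interval search: ~log2(n) comparisons on sorted data.
    i = bisect.bisect_left(sorted_words, target)
    return i < len(sorted_words) and sorted_words[i] == target

words = sorted(["zebra", "apple", "mango", "banana"])
print(binary_contains(words, "mango"))  # True
print(binary_contains(words, "grape"))  # False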


MikeB

  • Nomad
  • 78 posts
Re: Releasing full AGI/evolution research
« Reply #197 on: March 04, 2021, 07:49:54 am »
Quote from: MikeB on March 03, 2021, 07:11:30 am
One way to speed up word search by roughly 10-26x is to keep a tight loop between checking the first letter and moving on to the next data item. Assume most first letters in a word lookup are not a match...

Quote from: infurl on March 03, 2021, 07:39:13 am
MikeB, reading your posts it sounds like you are using a linear search, which is never going to be fast beyond trivial cases. At the very least you should be sorting your data and using a binary search on it. A linear search takes on average about n/2 comparisons, where n is the number of items; a binary search takes about log2(n). For a million records that's 500,000 tests versus 20. You should be able to do what you are doing in microseconds, not seconds.

I use a highly optimised linear search because I wasn't really able to sort the data in the past... It's a quick "one char/wchar binary check and move on" with no other CPU cycles spent, but I will look into half-interval (binary) searching.


LOCKSUIT

  • Emerged from nothing
  • Trusty Member
  • Hal 4000
  • 4423 posts
  • First it wiggles, then it is rewarded.
    • Main Project Thread
Re: Releasing full AGI/evolution research
« Reply #198 on: March 06, 2021, 03:21:21 am »
New code! I just started ~15 days ago, back at my ol' 800 lines of hardcode.

You can just take a look if you don't want to run it. There is no gradient descent etc.; it is fully understood AI.

My new code is ready. It is 101 lines of code and can compress 100MB (enwik8) to 21.4MB, though I only tested this version up to 1MB and got 251,148 bytes (which is really good; Shelwien's, by comparison, which achieves 21.4MB on the full enwik8, gets 256,602 on 1MB). It is in Python. To use it: place both files I give you in the same folder, run it, and it tells you at the bottom how many bytes the compressed result is. Currently I don't have the binary output or the evaluation fixed up, so bear with me. To decompress, paste the long number you get at the top of the code after the "0.", lower the 50,000 to e.g. 49,000, change the input2 file dataset to only "<mediawiki ", and run it; it will regenerate it all back in out.txt.

There's still room to improve it: my exponential function is for now just a chain of else-ifs, and I only did layers. My "length of a set of predictions = roof" rule is still a bit wobbly; some lengths of candidates get different changes to the roof. The global weights also feel a bit static, but they seem fine for now.
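As a generic sketch only - assuming those else-ifs map a match length to a weight, which may not be what the actual code does - a single exponential can replace the whole chain:

def weight(match_len, base=2.0):
    # One formula instead of: if n == 1: w = 2 elif n == 2: w = 4 ...
    # The base is a tunable assumption, not a value from the real code.
    return base ** match_len

for n in range(1, 6):
    print(n, weight(n))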

Change the extension of the 2nd file from .txt to .py:
Emergent


LOCKSUIT

  • Emerged from nothing
  • Trusty Member
  • Hal 4000
  • 4423 posts
  • First it wiggles, then it is rewarded.
    • Main Project Thread
Re: Releasing full AGI/evolution research
« Reply #199 on: March 06, 2021, 07:50:48 am »
Just hours after I released my code I got it down from 251,148 bytes to 249,260 for 1,000,000 bytes of input. Still working on it.
Emergent


infurl

  • Administrator
  • Millennium Man
  • 1102 posts
  • Humans will disappoint you.
    • Home Page
Re: Releasing full AGI/evolution research
« Reply #200 on: March 06, 2021, 08:08:18 am »
I took a look at your code to make sure it was harmless and tried to run it. It requires Python 3 because of the embedded Unicode characters, but it also didn't like the 'ansi' parameter that you were passing to the file reader. I expect it needs some environment variables set or some libraries installed that I'm not using, and I wasn't going to waste time tracking them down. I may try running it again when it is a bit more polished.
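For what it's worth, 'ansi' is a Windows-only codec alias in Python, so open() raises LookupError with it on Linux or macOS. A sketch of a portable fallback (treating cp1252 as a guess at what the data really is):

def read_text(path):
    for enc in ("ansi", "cp1252", "latin-1"):  # latin-1 never fails to decode
        try:
            with open(path, encoding=enc) as f:
                return f.read()
        except (LookupError, UnicodeDecodeError):
            continue  # codec unknown on this platform, or wrong guess

print(len(read_text("input2.txt")))  # file name taken from the post above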

For comparison, the usual utilities for compressing text, such as bzip2, xz, and 7zip, compressed that data file down to around 290 kilobytes, so the numbers that you are claiming could be impressive, assuming that you really can restore the file. Also note that they only take a few hundred milliseconds to do it, and that they often achieve compression ratios of ten percent, so it's quite possible that your program is getting such good results because it is overfitting and will die horribly on different input data. Try it with lots of different test inputs, and also on some much larger files, before you start feeling too triumphant.
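That baseline is easy to reproduce with Python's standard bz2 and lzma modules (lzma is the algorithm behind xz; the file name below is a stand-in for the 1MB test input):

import bz2, lzma

with open("enwik_1mb", "rb") as f:
    data = f.read()

for name, compress in (("bzip2", bz2.compress), ("xz/lzma", lzma.compress)):
    print(f"{name}: {len(compress(data)):,} bytes from {len(data):,}")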

So, when can I expect your program to be able to file my taxes for me or hold an entertaining conversation? I don't imagine that it will be able to restore your fast-withering telomeres until some time after that.
« Last Edit: March 06, 2021, 11:37:53 am by infurl »


LOCKSUIT

  • Emerged from nothing
  • Trusty Member
  • Hal 4000
  • 4423 posts
  • First it wiggles, then it is rewarded.
    • Main Project Thread
Re: Releasing full AGI/evolution research
« Reply #201 on: March 06, 2021, 09:24:14 pm »
I'm not sure any libraries are needed; my code should run as-is in Python 3. I use the PyCharm IDE.

Yes, it works on other files, like BOOK at least. Wikipedia data is quite diverse and is natural human/world data. It's not overfitting: enwik8 is not a solid pattern like "aaaaaaa", but it's also not totally random like "abcd1234", so the more I can losslessly compress enwik8 (or 1MB of it), the more it shows I am finding patterns in it. As you can see my code is small; if my code were 1MB, I could perhaps be storing the full 1MB in the code and cheating. Yes, my program is "AI" - it isn't something like the Burrows-Wheeler Transform or run-length encoding; a neural network is a predictor, and all predictors use patterns to predict. Yes, Python is about 10x slower; my program would take 10 hours for 100MB training. Shelwien's (in C++), which I mostly followed, is ~10x faster.
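The predictor-compression link here can be made concrete: an arithmetic coder driven by a model spends about -log2 p(byte) bits per byte, so summing that over a file gives the size an ideal coder paired with that model would reach. A sketch with a toy adaptive order-1 model (the model is illustrative, not LOCKSUIT's):

import math
from collections import defaultdict

def ideal_size_bytes(data):
    counts = defaultdict(lambda: defaultdict(int))  # counts[prev_byte][byte]
    totals = defaultdict(int)                       # observations per context
    bits = 0.0
    prev = 0
    for b in data:
        # Laplace smoothing: every byte value starts with pseudo-count 1.
        p = (counts[prev][b] + 1) / (totals[prev] + 256)
        bits += -math.log2(p)  # bits an ideal arithmetic coder would spend
        counts[prev][b] += 1   # adapt only after "coding" the byte
        totals[prev] += 1
        prev = b
    return bits / 8

with open("enwik_1mb", "rb") as f:  # hypothetical test file
    print(f"{ideal_size_bytes(f.read()):,.0f} bytes")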

Not there yet but hopefully I get farther towards AGI.
Emergent


LOCKSUIT

  • Emerged from nothing
  • Trusty Member
  • Hal 4000
  • 4423 posts
  • First it wiggles, then it is rewarded.
    • Main Project Thread
Re: Releasing full AGI/evolution research
« Reply #202 on: March 07, 2021, 11:16:02 pm »
Somehow I did not ship the code right: one line was wrong. Line 46 should be "k = j[0][g]". I swear I tested it before uploading.

My latest record is 1MB > 249,064 bytes. Not released this yet.
Emergent


MagnusWootton

  • Roomba
  • 8 posts
Re: Releasing full AGI/evolution research
« Reply #203 on: Today at 08:34:50 am »
Quote from: LOCKSUIT on March 07, 2021, 11:16:02 pm
Somehow I did not ship the code right: one line was wrong. Line 46 should be "k = j[0][g]". I swear I tested it before uploading.

That's the ghost in the simulation, stuffing up everything we do.

 

