Ai Dreams Forum

Member's Experiments & Projects => General Project Discussion => Topic started by: Zero on March 25, 2020, 12:36:39 am

Title: The usual sequence
Post by: Zero on March 25, 2020, 12:36:39 am
Hi,

Here's a little algorithm I came up with tonight, well nothing fancy but I quite like it. You give it sequences (arrays), it gets used to it, and when you give it incomplete sequences, it returns the most probable complete one. As I said, nothing fancy. It will end up in BirdyVM. Here is it.

Code: Javascript

sys.getUsual = (function () {

    var corpus = [];

    var getExtensions = function (base, corpus) {
        var result = [];
        for (let c of corpus) {
            var discard = false;
            for (let b of base)
                if (!c.includes(b)) {
                    discard = true;
                    break;
                }
            if (!discard) result.push(c);
        }
        return result;
    }

    // from https://stackoverflow.com/questions/11919065/sort-an-array-by-the-levenshtein-distance-with-best-performance-in-javascript
    var distance = function (s, t) {

        var d = [];
        var n = s.length;
        var m = t.length;
        if (n == 0) return m;
        if (m == 0) return n;
        for (var i = n; i >= 0; i--) d[i] = [];
        for (var i = n; i >= 0; i--) d[i][0] = i;
        for (var j = m; j >= 0; j--) d[0][j] = j;
        for (var i = 1; i <= n; i++) {
            var s_i = s[i - 1];
            for (var j = 1; j <= m; j++) {
                if (i == j && d[i][j] > 4) return n;
                var t_j = t[j - 1];
                var cost = (s_i == t_j) ? 0 : 1;
                var mi = d[i - 1][j] + 1;
                var b = d[i][j - 1] + 1;
                var c = d[i - 1][j - 1] + cost;
                if (b < mi) mi = b;
                if (c < mi) mi = c;
                d[i][j] = mi;
                if (i > 1 && j > 1 && s_i == t[j - 2] && s[i - 2] == t_j) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + cost);
                }
            }
        }
        return d[n][m];
    }

    var getScores = function (base, line) {

        var score = [];
        for (let l = 0; l < line.length; l++) score[l] = 0;
        for (let l = 0; l < line.length; l++) {
            for (let word of line[l]) {
                for (let l2 = 0; l2 < line.length; l2++) {
                    if (line[l2].includes(word)) score[l2] += 1;
                }
            }
        }
        var result = [];
        for (let r = 0; r < score.length; r++) {
            var ls = score[r];
            var ld = distance(base, line[r]);
            var ll = line[r].length;
            result.push({
                line: line[r],
                score: [
                    ls / ld / ll,
                    ls / ld / 1,
                    ls / 1 / 1,
                    1 / ld / ll,
                    1 / ld / 1,
                    1 / 1 / ll
                ]
            });
        }
        return result;
    }

    var getBest = function (base, result) {

        for (let method = 0; method < 6; method++) {
            var best = [];
            for (let r of result) {
                var highest = 0;
                if (r.score[method] > highest) {
                    best = [r.line];
                    highest = r.score[method];
                }
                else if (r.score[method] === highest) best.push(r.line);
            }
            if (best.length === 1) return best[0];
        }
        return base;
    }

    return function (sequence, maxCorpusLength) {

        var extensions = getExtensions(sequence, corpus);
        var scores = getScores(sequence, extensions);
        var best = getBest(sequence, scores);

        while (corpus.length >= maxCorpusLength)
            corpus.splice(Math.floor(Math.random()*corpus.length), 1);

        corpus.push(sequence);

        return best;
    }
})();


I'm a bit tired tonight, but I'll try to roughly describe it tomorrow.
Title: Re: The usual sequence
Post by: Zero on March 25, 2020, 01:46:01 pm
So, what it does is first get all the stored sequences that contain every item you gave as input. That's the getExtensions() function.

Then, getScores() calculates a score for each of them, like this:
- for each sequence
- for each word
- give 1 point to every sequence which contains this word
This results in a base score. Then we add the Levenshtein distance to the input and the length of the sequences in the mix, and try to get a rank with one of these formulas: (try first formula, if ex-aequo, try second, ...etc):
- score / distance / length
- score / distance
- score
- distance / length
- distance
- length
If it's still ex-aequo, return the input.

The use case is to give either sequences or incomplete sequences as input, and get as return value a possible "more complete" version of it.
Title: Re: The usual sequence
Post by: krayvonk on March 25, 2020, 03:14:58 pm
What the hell is the Levenschtein distance? sounds fancy, but is it any good?
What does "ex-aequo" mean, above water level?

And give us a demo of the output gibberish!
Title: Re: The usual sequence
Post by: krayvonk on March 25, 2020, 04:03:56 pm
I just had an idea!  :D

I dont know if youve already thought of it,  but now you know,  what if you werent just bringing back the same text but bringing back a reply,  that would be cool.
Title: Re: The usual sequence
Post by: Zero on March 25, 2020, 04:50:10 pm
Levenshtein distance (https://en.m.wikipedia.org/wiki/Levenshtein_distance) is cool. It tells you how much identical two sequences are.

Ex-aequo means "they have the same score".

It's not supposed to output gibberish, on the contrary :)

If you say:
"blue sky is nice"
"blue sky is nice"
"blue sky is nice"

Then you say:
"blue sky"

What will it say? You got it.

Now, the getExtensions() part is not good, there are better ways to do it.
Title: Re: The usual sequence
Post by: krayvonk on March 25, 2020, 06:14:41 pm
Levenschtein distance taken literally comes up with a few truths about things, but fails miserably on others (not being negative just harshly truthful).  Definitely interest building tho, I remember it makes it in the puzzle sections in glossy magazines sometimes.   If you get the meaning with the near spell together, thats when im off going nuts in my house about things.
Sounds like a winner,   I think this a.i. business is going to be simple, and im predicting my embarressment.    Just run the robot timestep to timestep on a stop watch calculus derivative style is probably all you need to maybe even make consciousness given enough power,  but im still not sure about it.  But if thats the case, I wouldnt bother feeling like a high*er* genius.  :-[