Help needed: glob pattern generator

  • 19 Replies
  • 690 Views
*

Zero

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1163
Help needed: glob pattern generator
« on: September 19, 2021, 11:49:45 am »
Hi,

I need help! I'm trying to implement a glob pattern generator, that would take a list of strings as input, and that would output the longest glob pattern that matches those strings.

Code: text
input:
    so I have a car
    so I have a bicycle
    so you have a car

output:
    so * have a *

Is there such a tool already somewhere?
Google is a plague, a disease. It is the metastatic cancer of the human species.

*

infurl

  • Administrator
  • ***********
  • Eve
  • *
  • 1255
  • Humans will disappoint you.
    • Home Page
Re: Help needed: glob pattern generator
« Reply #1 on: September 19, 2021, 12:55:35 pm »
https://en.wikipedia.org/wiki/Longest_common_subsequence_problem

Here's a detailed description of the algorithm that you need to use. It contains lots of pseudocode which you should be able to convert into the language that you are using. I searched for ready-made libraries too but didn't find any. Try searching directly on github, you might have better luck than I did.

*

infurl

  • Administrator
  • ***********
  • Eve
  • *
  • 1255
  • Humans will disappoint you.
    • Home Page
Re: Help needed: glob pattern generator
« Reply #2 on: September 19, 2021, 12:59:47 pm »
http://www.rosettacode.org/wiki/Longest_common_substring

I just thought of looking on rosettacode.org and sure enough, they have the algorithm implemented there, in just about every language you could think of no less. Good luck!

*

Zero

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1163
Re: Help needed: glob pattern generator
« Reply #3 on: September 19, 2021, 01:03:15 pm »
Thanks, I already know the LCS algo, but it's not what I want to achieve. I actually used it as part of my glob generator and successfully got globs from 2 strings, but not from N strings... yet!
:)
Thank you!
Google is a plague, a disease. It is the metastatic cancer of the human species.

*

infurl

  • Administrator
  • ***********
  • Eve
  • *
  • 1255
  • Humans will disappoint you.
    • Home Page
Re: Help needed: glob pattern generator
« Reply #4 on: September 19, 2021, 01:06:37 pm »
https://en.wikipedia.org/wiki/Longest_common_subsequence_problem#Complexity

Quote
For the general case of an arbitrary number of input sequences, the problem is NP-hard.[1] When the number of sequences is constant, the problem is solvable in polynomial time by dynamic programming.

Oh dear, it does like a tricky one doesn't it.

*

ivan.moony

  • Trusty Member
  • ************
  • Bishop
  • *
  • 1583
    • contrast-zone
Re: Help needed: glob pattern generator
« Reply #5 on: September 19, 2021, 01:24:36 pm »
I'm not completely sure, but shouldn't it work like this: let's say we have to compare strings A, B, C, and D. First we compare A and B, then A and C, then A and D. Along the comparing, we accumulate differences into an array.
There exist some rules interwoven within this world. As much as it is a blessing, so much it is a curse.

*

Zero

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1163
Re: Help needed: glob pattern generator
« Reply #6 on: September 19, 2021, 01:43:50 pm »
Quote
Oh dear, it does like a tricky one doesn't it.

NP-hard? ouch... I need NP-tough help  ;D

I'm not completely sure, but shouldn't it work like this: let's say we have to compare strings A, B, C, and D. First we compare A and B, then A and C, then A and D. Along the comparing, we accumulate differences into an array.

Yes, or even cross-comparing them all, but it doesn't yield an obvious answer yet.
Google is a plague, a disease. It is the metastatic cancer of the human species.

*

Zero

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1163
Re: Help needed: glob pattern generator
« Reply #7 on: September 19, 2021, 05:21:44 pm »
Code: javascript


function lcs(S, T) {
   
var D = [];
var LCS_len = 0;
var LCS = [];

for (var i = 0; i < S.length; i++) {
D[i] = [];
for (var j = 0; j < T.length; j++) {
if (S[i] == T[j]) {
if (i == 0 || j == 0) {
D[i][j] = 1;
} else {
D[i][j] = D[i-1][j-1] + 1;
}

if (D[i][j] > LCS_len) {
LCS_len = D[i][j];
LCS = [S.substring(i-LCS_len+1, i+1)];
} else if (D[i][j] == LCS_len) {
LCS.push(S.substring(i-LCS_len+1, i+1));
}
} else {
D[i][j] = 0;
}
}
}
return LCS[0];
}



function multiLcs(strings) {
   
    let candidate = strings[0];
   
    for (let str1 of strings)
        for (let str2 of strings) {
            let subs = lcs(str1, str2);
            if (!subs) return '';
            if (subs.length < candidate.length)
                candidate = subs;
        }
    return candidate;
}



function recurGenGlob(strings) {

    let center = multiLcs(strings);
   
    if (center.length == '') return '';
   
    let pre = strings.map(str => str.slice(0, str.indexOf(center)));
    let post = strings.map(str => str.slice(str.indexOf(center) + center.length));
   
    center = recurGenGlob(pre) + '*' + center + '*' + recurGenGlob(post);
   
    return center;
}



function genGlob(strings) {
   
    let center = recurGenGlob(strings.map(s => '  '+s+'  '));
    let result = '';
    let star = false;
    for (let i = 0; i < center.length; i++) {
        if (center[i] != '*' || star)
            result += center[i];
        star = center[i] == '*';
    }
    return result.trim();
}



Google is a plague, a disease. It is the metastatic cancer of the human species.

*

ivan.moony

  • Trusty Member
  • ************
  • Bishop
  • *
  • 1583
    • contrast-zone
Re: Help needed: glob pattern generator
« Reply #8 on: September 19, 2021, 05:41:38 pm »
Does that work?
There exist some rules interwoven within this world. As much as it is a blessing, so much it is a curse.

*

Zero

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1163
Re: Help needed: glob pattern generator
« Reply #9 on: September 19, 2021, 07:10:50 pm »
I don't know. It does spit something out, but I believe it's not optimal.

Edit: I mean, I believe the returned value of genGlob() is not always the longest possible pattern. It needs testing. But since it's Js, you can directly copy-paste it in your browser's console ang give it a shot: at least for the example I gave in the original post, it works (better than me, since I forgot the common 'c' in 'car' and 'bicycle', my own hand-made output was wrong). The trim() trick is a bit hacky though.
« Last Edit: September 19, 2021, 07:40:40 pm by Zero »
Google is a plague, a disease. It is the metastatic cancer of the human species.

*

Zero

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1163
Re: Help needed: glob pattern generator
« Reply #10 on: September 19, 2021, 11:23:09 pm »
doesn't work with
Code
    "and so I have a car",
    "so I have a bicycle",
    "so you have a car",
Google is a plague, a disease. It is the metastatic cancer of the human species.

*

Zero

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1163
Re: Help needed: glob pattern generator
« Reply #11 on: September 19, 2021, 11:56:01 pm »
Here is the new corrected genGlob().

Code: javascript
genGlob = (strings) => {
   
    let center = recurGenGlob(strings);

    while (center.indexOf('**') > -1)
        center = center.replace(/\*\*/g, '*');

    let startStar = false,
        endStar = false,
        startLetter = strings[0][0],
        endLetter = strings[0][strings[0].length-1];
       
    for (let s of strings) {
        if (s[0] != startLetter) startStar = true;
        if (s[s.length-1] != endLetter) endStar = true;
    }
   
    if (!startStar) center = center.slice(1);
    if (!endStar) center = center.slice(0, -1);

    if (center.length == 0) center = '*';

    return center;
}

« Last Edit: September 20, 2021, 09:39:23 am by Zero »
Google is a plague, a disease. It is the metastatic cancer of the human species.

*

8pla.net

  • Trusty Member
  • ***********
  • Eve
  • *
  • 1262
  • TV News. Pub. UAL (PhD). Robitron Mod. LPC Judge.
    • 8pla.net
Re: Help needed: glob pattern generator
« Reply #12 on: September 29, 2021, 11:20:33 pm »
I may I know how. Are you OK with PHP?

My Very Enormous Monster Just Stopped Using Nine

*

Zero

  • Trusty Member
  • **********
  • Millennium Man
  • *
  • 1163
Re: Help needed: glob pattern generator
« Reply #13 on: October 03, 2021, 11:25:33 am »
I believe my JS is correct, but yes I'm OK with PHP.
Google is a plague, a disease. It is the metastatic cancer of the human species.

*

frankinstien

  • Starship Trooper
  • *******
  • 445
    • Knowledgeable Machines
Re: Help needed: glob pattern generator
« Reply #14 on: October 03, 2021, 07:49:26 pm »
This will work with any N number of words and sentences:

Code
using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
public static void Main()
{
List<string> alreadyCounted = new List<string>();
var counts = new Dictionary<string, int>();
string[] sets = new string[] { "so I have a car, but this car is used", "so I have a bicycle", "so you have a car" };
foreach (String aSet in sets)
{
var sentenceSplit = aSet.Split(new char[] { ' ', ',' });
alreadyCounted.Clear();
foreach (String aWord in sentenceSplit)
{
var trimmed = aWord.Trim();
if (counts.ContainsKey(trimmed) && !alreadyCounted.Contains(trimmed))
counts[trimmed] = counts[trimmed] + 1;
else if(!alreadyCounted.Contains(trimmed))
counts.Add(trimmed, 1);
alreadyCounted.Add(trimmed);
}
}

var results = counts.Where(x => x.Value == sets.Length);
if (results != null)
{
foreach (KeyValuePair<string, int> entry in results)
Console.WriteLine(entry.Key);
}
}
}

Try it on Fiddle: https://dotnetfiddle.net/

Its a small mod to get it to work with a file where you just load the file into the string array.

« Last Edit: October 03, 2021, 08:13:54 pm by frankinstien »

 


New challenge: Online Turing test
by chattable (AI News )
October 09, 2021, 11:47:42 am
Spooky?
by MagnusWootton (AI News )
September 24, 2021, 11:19:09 am
OpenAI Shuts Down Chatbot Project To Prevent 'Possible Misuse'
by LOCKSUIT (AI News )
September 23, 2021, 04:07:25 pm
Beautiful new work robot.
by MagnusWootton (Robotics News)
September 15, 2021, 01:00:31 pm

Users Online

130 Guests, 0 Users

Most Online Today: 145. Most Online Ever: 2369 (November 21, 2020, 04:08:13 pm)

Articles