Hi all,
I was looking for a file format that would sit somewhere between INI files (too simple) and XML (too... XMLish). Here is what I came up with. I call it TNT, short for Typed and Named Trees. It's heavily inspired by BBCode.
[a type = a label]
types and labels can contain spaces
labels can be identifiers or glob patterns, or anything
[type only]
labels are optional
nesting is allowed and [inline] indentation doesn't matter [/inline]
[/type only]
and we also have [empty tags/]
[/a type]
It is less expressive than XML since we get rid of attributes. I know, TNT should not exist: we're supposed to follow standards, but I can't help it. :idiot2: :)
EDIT:
These .tnt files could be the building blocks of a simple thinkbot system.
- From AIML I want to keep the trigger/template idea with glob patterns and star-tags.
- Unlike chatbots, a thinkbot won't generate speech directly; it will generate thoughts instead. Thoughts call other thoughts. Eventually, thoughts call actions, like speech.
- For scripting, I'll use my-basic (https://github.com/paladin-t/my_basic), which has no square brackets in its syntax (and which makes me feel like I'm 10 again), so users can insert star-tags inside code.
- The whole thing should be mounted on libuv (https://nikhilm.github.io/uvbook/introduction.html) for async and good IO.
- There should be a focus system, but I still don't know how. It has to be straightforward.
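To make the trigger/template idea a bit more concrete, here is a minimal JavaScript sketch of glob matching with star capture. Everything in it is an assumption for illustration: the helper names (globToRegExp, matchTrigger, fillTemplate) are made up, only "*" is treated as a wildcard, and the [star/] and [star = N/] tag spellings are just one possible way to write star-tags in TNT.

```javascript
// Hypothetical helper: turn a glob pattern such as "my name is *" into a RegExp.
// Assumption: "*" is the only wildcard and must match at least one character.
function globToRegExp(pattern) {
  const body = pattern
    .split("*")
    // Escape regex metacharacters in the literal parts of the pattern.
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join("(.+?)");
  return new RegExp("^" + body + "$", "i");
}

// Returns the list of captured star values, or null when the input does not match.
function matchTrigger(pattern, input) {
  const m = globToRegExp(pattern).exec(input.trim());
  return m ? m.slice(1).map((s) => s.trim()) : null;
}

// Replaces [star/] (first capture) and [star = N/] (N-th capture, 1-based).
// Assumption: this tag spelling is illustrative, not a fixed part of TNT.
function fillTemplate(template, stars) {
  return template.replace(/\[star(?:\s*=\s*(\d+))?\s*\/\]/g, (_, n) => {
    const index = n ? Number(n) - 1 : 0;
    return stars[index] !== undefined ? stars[index] : "";
  });
}

const stars = matchTrigger("* likes *", "Alice likes tea");
console.log(fillTemplate("So [star/] is fond of [star = 2/]?", stars));
```

A real thinkbot would probably need priorities between overlapping patterns (AIML has rules for that); this sketch simply matches one pattern at a time.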
The previous tnt sample would translate to the following JSON:
[
  {
    "istag" : true,
    "type" : "a type",
    "label" : "a label",
    "content" :
    [
      {
        "istag" : false,
        "text" : "types and labels can contain spaces \n labels can be identifiers or glob patterns, or anything"
      },
      {
        "istag" : true,
        "type" : "type only",
        "label" : false,
        "content" :
        [
          {
            "istag" : false,
            "text" : "labels are optional \n nesting is allowed and"
          },
          {
            "istag" : true,
            "type" : "inline",
            "label" : false,
            "content" :
            [
              {
                "istag" : false,
                "text" : "indentation doesn't matter"
              }
            ]
          }
        ]
      },
      {
        "istag" : false,
        "text" : "and we also have"
      },
      {
        "istag" : true,
        "type" : "empty tags",
        "label" : false,
        "content" : false
      }
    ]
  }
]
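As a sanity check on that mapping, here is a rough JavaScript sketch that goes the other way, from the JSON shape above back to TNT text. The function name toTnt and the two-space indentation are my own choices for illustration. It puts every tag on its own line, so inline tags do not keep their original layout, which should be harmless given that indentation doesn't matter.

```javascript
// Serializes a list of nodes (the JSON shape shown above) back into TNT text.
// Assumption: a falsy "label" means "no label" and a falsy "content" means an empty tag.
function toTnt(nodes, depth = 0) {
  const pad = "  ".repeat(depth);
  return nodes
    .map((node) => {
      if (!node.istag) {
        // Text node: re-indent each line at the current depth.
        return node.text
          .split("\n")
          .map((line) => pad + line.trim())
          .join("\n");
      }
      const head = node.label ? node.type + " = " + node.label : node.type;
      if (!node.content) {
        // Empty tag, written in the self-closing form.
        return pad + "[" + head + "/]";
      }
      return [
        pad + "[" + head + "]",
        toTnt(node.content, depth + 1),
        pad + "[/" + node.type + "]",
      ].join("\n");
    })
    .join("\n");
}

console.log(
  toTnt([
    {
      istag: true,
      type: "a type",
      label: "a label",
      content: [
        { istag: false, text: "hello" },
        { istag: true, type: "empty tags", label: false, content: false },
      ],
    },
  ])
);
```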
EDIT: There should be support for a few useful specialized syntaxes inside certain tags, such as string lists, n-ary relations, an RDF syntax, etc.
Hi!
In PEGjs (http://pegjs.org/online), the following grammar...
Content
  = (Element / Text)*

Element
  = startTag:sTag content:Content endTag:eTag {
      // A closing tag must name the same type as its opening tag.
      if (startTag.type != endTag.type) {
        throw new Error(
          "Expected [/" + startTag.type + "] but [/" + endTag.type + "] found."
        );
      }
      return {
        istag: startTag.istag,
        type: startTag.type,
        label: startTag.label,
        content: content
      };
    }
  / startTag:selfTag {
      // Empty tags carry no "content" key.
      return {
        istag: startTag.istag,
        type: startTag.type,
        label: startTag.label
      };
    }

sTag
  = [ \n\t]* "[" type:TagType "]" [ \n\t]* { return {istag:true,type:type,label:false}; }
  / [ \n\t]* "[" type:TagType "=" lab:label "]" [ \n\t]* { return {istag:true,type:type,label:lab}; }

label
  = chars:[^/\[\]=]* { return chars.join("").trim(); }

selfTag
  = [ \n\t]* "[" type:TagType "/]" [ \n\t]* { return {istag:true,type:type,label:false}; }
  / [ \n\t]* "[" type:TagType "=" lab:label "/]" [ \n\t]* { return {istag:true,type:type,label:lab}; }

eTag
  = [ \n\t]* "[/" type:TagType "]" [ \n\t]* { return {type:type}; }

TagType
  = chars:[a-zA-Z0-9 ]+ { return chars.join("").trim(); }

Text
  = chars:([^\[\]])+ { return {istag:false,text:chars.join("").replace(/[ \t]+/g, ' ').replace(/\n+/g,'\n').trim()}; }
...will translate this...
[a type = a label]
types and labels can contain spaces
labels can be identifiers or glob patterns, or anything
[type only]
labels are optional
nesting is allowed and [inline] indentation doesn't matter [/inline]
[/type only]
and we also have [empty tags/]
[/a type]
...to this...
[
  {
    "istag": true,
    "type": "a type",
    "label": "a label",
    "content": [
      {
        "istag": false,
        "text": "types and labels can contain spaces\nlabels can be identifiers or glob patterns, or anything"
      },
      {
        "istag": true,
        "type": "type only",
        "label": false,
        "content": [
          {
            "istag": false,
            "text": "labels are optional\nnesting is allowed and"
          },
          {
            "istag": true,
            "type": "inline",
            "label": false,
            "content": [
              {
                "istag": false,
                "text": "indentation doesn't matter"
              }
            ]
          }
        ]
      },
      {
        "istag": false,
        "text": "and we also have"
      },
      {
        "istag": true,
        "type": "empty tags",
        "label": false
      }
    ]
  }
]
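Once you have that tree, walking it is straightforward. Here is a small JavaScript sketch, assuming the node shape produced above (istag, type, label, and an optional content array). The names findByType and textOf are made up for the example.

```javascript
// Depth-first search for every tag node of a given type.
function findByType(nodes, type, found = []) {
  for (const node of nodes) {
    if (!node.istag) continue;
    if (node.type === type) found.push(node);
    // Empty tags have no content array, so only recurse when there is one.
    if (Array.isArray(node.content)) findByType(node.content, type, found);
  }
  return found;
}

// Concatenates the text of a node and all of its descendants, separated by spaces.
function textOf(node) {
  if (!node.istag) return node.text;
  if (!Array.isArray(node.content)) return "";
  return node.content.map(textOf).filter(Boolean).join(" ");
}

const tree = [
  {
    istag: true,
    type: "type only",
    label: false,
    content: [
      { istag: false, text: "nesting is allowed and" },
      {
        istag: true,
        type: "inline",
        label: false,
        content: [{ istag: false, text: "indentation doesn't matter" }],
      },
    ],
  },
];

console.log(findByType(tree, "inline").length);
console.log(textOf(tree[0]));
```

Something along these lines could be where a thinkbot looks up its triggers by type, but that part is only a guess at how the tree would be used.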