I guess it depends on the level at which the xml needs to be interpreted: is it the bot's internal file processing mechanism that needs to process the xml and extract the text-only part, which is then sent to the pattern matcher for further processing, or does the pattern matcher itself need to process the xml?
-In the first case, you need a well-defined xml file format so that it can be properly queried (hardcoded stuff; perhaps you could use XSLT to query the xml and keep those XSLT definitions dynamic?).
-For the second case: in the end, an xml file is also just a string, so if you have a pattern matcher that's powerful enough to handle the xml specification, you should be able to define a set of patterns that are able to read any type of xml and extract the text values from them, and use these for further processing.
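To make the first case concrete, here is a minimal sketch in Python of a pre-processing step that strips the markup and hands only the text to the pattern matcher. It uses the standard library's xml.etree.ElementTree; the function name extract_text and the sample input are made up for illustration, and a real bot would query specific elements of its own file format rather than flatten everything.

```python
import xml.etree.ElementTree as ET


def extract_text(xml_string):
    """Return only the text content of an XML string, with whitespace tidied up."""
    root = ET.fromstring(xml_string)
    # itertext() yields the text of the element and all its descendants
    # in document order, skipping the tags themselves.
    pieces = (chunk.strip() for chunk in root.itertext())
    return " ".join(piece for piece in pieces if piece)


# Hypothetical input; the result is what would be sent on to the pattern matcher.
sample = "<msg><greeting>hello</greeting> <body>how are <b>you</b> today</body></msg>"
print(extract_text(sample))
```

The pattern matcher never sees a tag in this setup, so none of its patterns need to know about xml at all.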
A long time ago, I implemented a general-purpose xml parser (it was part of the library for a programming language that I developed), so I think I have a fairly good idea of what it takes to make one. The first thing that came to mind was 'recursion' (pretty obvious): an xml element can have other xml elements as children, so the pattern matcher needs a way to declare a pattern that references itself. All the file formats can also be a problem, but that should be handled by the bot's internal file loading.
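To show what that self-reference amounts to, here is a rough Python sketch of a recursive element parser. It is not the parser I wrote back then and it is nowhere near the full xml specification (no comments, CDATA, entities, namespaces or prolog); the names parse_element and text_of are invented for this example. The point is only that parse_element calls itself for every child element.

```python
import re

# Opening tag: name, optional quoted attributes, optional "/" for <name/>.
OPEN_TAG = re.compile(
    r"<([A-Za-z_][\w.-]*)((?:\s+[\w.-]+\s*=\s*(?:'[^']*'|\"[^\"]*\"))*)\s*(/?)>"
)
ATTRIB = re.compile(r"([\w.-]+)\s*=\s*(?:'([^']*)'|\"([^\"]*)\")")
CLOSE_TAG = re.compile(r"</([A-Za-z_][\w.-]*)\s*>")


def parse_element(text, pos=0):
    """Parse one element starting at pos; return (node, position after it)."""
    m = OPEN_TAG.match(text, pos)
    if m is None:
        raise ValueError(f"expected an opening tag at offset {pos}")
    name, raw_attribs, self_closing = m.groups()
    attribs = {key: single or double
               for key, single, double in ATTRIB.findall(raw_attribs)}
    node = {"name": name, "attribs": attribs, "children": []}
    pos = m.end()
    if self_closing:
        return node, pos
    while True:
        close = CLOSE_TAG.match(text, pos)
        if close:
            if close.group(1) != name:
                raise ValueError(
                    f"mismatched tags: <{name}> closed by </{close.group(1)}>")
            return node, close.end()
        if pos >= len(text):
            raise ValueError(f"<{name}> is never closed")
        if text[pos] == "<":
            # The recursion: a child element is parsed by the same function.
            child, pos = parse_element(text, pos)
            node["children"].append(child)
        else:
            # Plain text runs up to the next tag (or the end of the input).
            end = text.find("<", pos)
            end = len(text) if end == -1 else end
            node["children"].append(text[pos:end])
            pos = end


def text_of(node):
    """Concatenate all text inside a node, descending into child elements."""
    return "".join(c if isinstance(c, str) else text_of(c)
                   for c in node["children"])


doc = "<a x='1'><b>hi</b> there<c/></a>"
tree, end = parse_element(doc)
print(tree["name"], tree["attribs"], repr(text_of(tree)))
```

A pattern language without some equivalent of that inner call cannot follow arbitrarily deep nesting, which is why the recursion question matters for each of the engines discussed here.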
I'm not certain this can be done with AIML since I don't know if it can handle recursion (can an AIML pattern reference another pattern or itself?). Furthermore, defining xml tags in xml files is tedious, at best.
From what I remember about the sourceforge, it can't have patterns that reference other patterns, so that's out.
I'm not certain about chatscript, though I would be surprised if it couldn't do recursion, so I think it can parse xml. (Does anyone know this for certain? I'd really like to know.)
The pattern matching language that I am using is based on compiler generator techniques, and should be able to handle the full xml specification, though it would definitely be slower compared to a C# xml parser for instance (I think). Code size is probably about the same as a compiler generator like coco/r.
Off the top of my head, it would look something like:
TOPIC name : XMLElement
Rule name: Element
you say: <$FrontName {~XMLElement.attribs} >{~XMLElement.Element | $content} </$BackName>
<$FrontName/>
When: $BackName && ($FrontName != $BackName)
bot says: there was an error in the xml formatting
else
bot says: $content:Evaluate
Rule name: attrib
Inputs: $AttribName = \' $attribValue \'
This is just a rough sketch, untested, with big holes and errors, just a basic start for a simple xml element. The key here is the recursion: ~XMLElement.Element, which allows for nested xml elements.
also:
':Evaluate' is actually called something else, but I forgot the exact name; it is for the next release anyway. It sends the text part back to the pattern matcher for further evaluation.
This scheme probably also only works if the bot has the ability to turn certain patterns on and off. For instance, if you first need to extract the text from the xml and process it separately, you need to make certain that the patterns which handle the content don't overrule the xml patterns. In my system, this can be done by turning an entire topic on or off.
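As an illustration of the on/off idea only, here is a small Python sketch. It is not how my system is implemented; the Matcher class, its methods, and the two example rules are all hypothetical, and plain regular expressions stand in for real patterns.

```python
import re


class Matcher:
    """Toy pattern matcher whose rules are grouped into switchable topics."""

    def __init__(self):
        self.topics = {}      # topic name -> list of (compiled regex, handler)
        self.enabled = set()  # names of the topics that are currently active

    def add_rule(self, topic, pattern, handler):
        self.topics.setdefault(topic, []).append((re.compile(pattern), handler))
        self.enabled.add(topic)

    def set_enabled(self, topic, on):
        if on:
            self.enabled.add(topic)
        else:
            self.enabled.discard(topic)

    def respond(self, text):
        # Topics are tried in the order they were added; first match wins.
        for topic, rules in self.topics.items():
            if topic not in self.enabled:
                continue
            for regex, handler in rules:
                m = regex.fullmatch(text)
                if m:
                    return handler(m)
        return None


bot = Matcher()
# A greedy content rule, added first, so it would win if both topics were on.
bot.add_rule("content", r".*hello.*", lambda m: "you said: " + m.group(0))
# A deliberately simplistic xml rule that returns the text between the tags.
bot.add_rule("xml", r"<(\w+)>(.*)</\1>", lambda m: m.group(2))


def handle(raw):
    # Pass 1: only the xml topic is on, so content rules cannot overrule it.
    bot.set_enabled("content", False)
    bot.set_enabled("xml", True)
    inner = bot.respond(raw)
    # Pass 2: only the content topic is on; it sees the extracted text.
    bot.set_enabled("xml", False)
    bot.set_enabled("content", True)
    return bot.respond(inner if inner is not None else raw)


print(bot.respond("<p>hello</p>"))  # both topics on: the content rule grabs the raw xml
print(handle("<p>hello</p>"))       # two passes: the content rule sees only the text
```

With both topics enabled the greedy content rule swallows the tags along with the text, which is exactly the overruling problem; switching topics between the two passes avoids it.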
Finally, if you use the neural network, anything is possible, you just need to code it out.