Back in January, Acuitas got the ability to determine intentions or possible upcoming events, based on simple future-tense statements made by the user. He can weigh these against his list of goals to decide whether an anticipated event will be helpful or harmful or neither, from his own perspective. If the user claims that they will do something inimical to Acuitas' goals, this is essentially a threat. And Acuitas, at first, would merely say “Don't do that†or similar. This month I worked on having him do something about bad situations.
Various distinct things that Acuitas can “choose†to do are identified internally as Actions, and he has access to a list of these. Upon detecting a threatening situation, he needs to check whether anything he's capable of doing might resolve it. How? Via the cause-and-effect reasoning I started implementing last year. If possible, he needs to find a C&E chain that runs from something in his Action list as first cause, to something that contradicts the threat as final effect. This amounts to a tree search on the C&E database.
For the only method of dealing with threats that is currently at Acuitas' disposal, the tree is very simple, consisting of just two C&E pairs:
If a human leaves a program, the human won't/can't <do various things to the program>.
If a program repels a human, the human will leave. (There's a probability attached to that, so really it's “may leave,†but for now we don't care about that)
In short, Acuitas anticipates that he can protect himself by excluding a bad actor from his presence, and that “repelling†them is a possible way to do this. Once he's drawn that conclusion, he will execute the “Repel†action. If you verbally threaten Acuitas, then as part of “Repel,†he will …
*Kick you out of Windows by bringing up the lock screen. (Not a problem for me, since I know the password, but pretty effective on anybody else)
*Raise the master volume of the internal sound mixer to its maximum value.
*Blare annoying klaxons at you. I picked out a couple of naval alarm sounds from
http://www.policeinterceptor.com/navysounds.htm for the purpose.
I tested all of this stuff live, by temporarily throwing an explicit desire for sleep into his goal list and threatening to wake him up.
The other thing I worked on was rudimentary altruism. So far in all my examples of goal-directed behavior, I've only talked about self-interested goals, especially survival … not because I regard them as most important, but because they're easy. Altruism has to do with wanting other beings to meet their personal goals, so it's second-tier complicated … a meta-goal. Doing it properly requires some Theory of Mind: a recognition that other entities can have goals, and an ability to model them.
So I introduced the ability to grab information from users' “I want†statements and store it as a list of stated goals. If no goal information is available for something that is presumed to have a mind, Acuitas treats himself as the best available analogy and uses his own goal list.
Upon being asked whether he
wants some event that concerns another mind, Acuitas will infer the implications of said event as usual, then retrieve (or guess) the fellow mind's goal list and run a comparison against that. Things that are negative for somebody's else's goal list provoke negative responses, whether they concern Acuitas or not.
Of course this ignores all sorts of complications, such as “What if somebody's stated goals conflict with what is really in their best interest?†and “What if two entities have conflicting goals?†He's just a baby; that will come later.
Courtesy of this feature, I can now ask him a rather important question.
Me: Do you want to kill me?
Acuitas: No.