While I will agree that everything can be broken down to its simplest terms or pieces, not everything can be made simple.
Again, you're talking about seeing your mom's face, yet your initial premise was about robots. The robot doesn't just "look" at an image and immediately have a match for it. It goes through the necessary steps of its programming: first looking at the image, adjusting the lighting as needed via an outside source or the lens aperture, passing the image to the necessary graphical processing units and then to its storage system, sorting through the possibly vast assortment of other images, and giving weighted values to the ones that seem "similar" in nature to the one being viewed. When it has determined its "best candidate" from the possibilities, it decides on one that hopefully will be correct. 9.7 out of 10 times, it will be.
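To make the "weighted values" and "best candidate" steps concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the three-number feature vectors, the labels, and the choice of cosine similarity as the weighting are all made up, and a real vision system would be far more involved.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no information to compare
    return dot / (norm_a * norm_b)

def best_candidate(query, stored):
    """Weight every stored image against the query; return (label, score) of the highest."""
    scored = [(label, cosine_similarity(query, features))
              for label, features in stored.items()]
    return max(scored, key=lambda pair: pair[1])

# Hypothetical stored images, reduced to made-up feature vectors.
stored_images = {
    "mom": [0.9, 0.1, 0.3],
    "burglar": [0.1, 0.8, 0.5],
    "girlfriend": [0.7, 0.2, 0.6],
}

# The image currently being viewed (also made up).
label, score = best_candidate([0.85, 0.15, 0.35], stored_images)
print(label, round(score, 3))
```

Note that the best candidate is only the highest-scoring one, not a guaranteed match, which is the "hopefully will be correct" part.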
Murderer in my room - kills my mom - and he turns into my girlfriend/mom - and I go to kill HIM because even though I see a pretty girl I STILL initiate my "kill actions", either on cue or self-initiated (ex. eyes closed and no cue). TA DA
No 10 seconds ever - the 10 seconds IS matching, but it is attempts at it, and each time the bucket gets filled a little more until pop - it's that burglar from yesterday. As I said, input immediately fires right to the match and probably doesn't even back up!
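The "bucket" idea can be sketched as a running total of evidence per label that fires once it crosses a threshold. This is only a rough sketch: the glimpse values, the labels, and the threshold of 1.0 are made-up assumptions, not anything measured.

```python
from collections import defaultdict

def accumulate_until_match(glimpses, threshold=1.0):
    """Add each glimpse's evidence to its label's bucket.

    Returns (label, glimpses_used) for the first bucket that reaches the
    threshold, or (None, glimpses_used) if no bucket ever fills.
    """
    buckets = defaultdict(float)
    used = 0
    for label, evidence in glimpses:
        used += 1
        buckets[label] += evidence
        if buckets[label] >= threshold:
            return label, used  # "pop": this bucket is full
    return None, used

# Hypothetical sequence of (label, evidence) pairs from repeated attempts.
glimpses = [("burglar", 0.3), ("stranger", 0.2), ("burglar", 0.4), ("burglar", 0.4)]
match, used = accumulate_until_match(glimpses)
print(match, used)
```

Each attempt on its own is too weak to decide anything; the match only fires when the accumulated evidence for one label is enough.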
That is the fault of the simplicity of your algorithm. Take a look -- if it were not simple, one would restrain the murderer instead of killing him.
If one were more advanced, with such a thing as "emotion" -- then one would rather seek the truth. One would attempt to be forgiving and take the patience and time to find answers. One would attempt to think outside the box...
For instance, one wouldn't guess that it is a cube at first glance of my example picture. When you have an answer from a match, it does not stop there. One has to gather every single match before determining which is the right one -- the process known as elimination.
I am sure you've done that in school during examinations. Objective questions provide four possible choices: A, B, C, and D. If one is unable to answer the question because one can't remember correctly, one uses the process of elimination, perhaps without even noticing, to find the last possible choice, and that final choice may be the answer -- once again I say "may be", because even though one has found an answer, it still doesn't mean that it is correct.
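The process of elimination described above can be sketched in a few lines. The four choices and the three clues below are invented for the example; the point is only that every candidate is gathered first and then ruled out one check at a time.

```python
def eliminate(candidates, checks):
    """Return the candidates that survive every check, in their original order."""
    remaining = list(candidates)
    for check in checks:
        remaining = [c for c in remaining if check(c)]
    return remaining

# Hypothetical exam question with four choices.
choices = {"A": 12, "B": 15, "C": 20, "D": 27}

# Hypothetical clues one half-remembers about the right answer.
checks = [
    lambda key: choices[key] % 3 == 0,  # divisible by 3: rules out C
    lambda key: choices[key] > 12,      # greater than 12: rules out A
    lambda key: choices[key] < 20,      # less than 20: rules out D
]

survivors = eliminate(choices, checks)
print(survivors)
```

A lone survivor only means nothing ruled it out. If one of the clues was remembered wrongly, the surviving choice is still wrong, which is why it only "may be" the answer.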
Just as Art said, it goes through the necessary steps before jumping to conclusions.
One would ask the murderer how the hell the burglar got in, how he found his way into the house, and who the murderer is, before jumping to the conclusion that the murderer is the burglar and killing him at once.
Killing people is, by itself, already wrong no matter what happened. If you decide to kill the murderer, you become a murderer yourself, and in turn you'll be arrested as well, unless the murderer resisted and died as a result of your self-defense.
Going by your example, if it is "simple", then your AI might as well be a broken killing machine that goes off at the slightest poke.