I am working on a C# project to create simulated evolving organisms. The general setup is a 2D cell grid environment where each organism occupies a tile, can sense some things about the surrounding tiles, and takes actions accordingly. Each organism has a network of cells in its "brain". Selection picks 2 organisms at random, weighted by fitness, which is currently lifespan measured in time cycles (so organisms that lived longer count as more fit for now, but I'd like to add other criteria in the future).
The network structure is what I am working on currently. I previously used LSTM cells with the logistic function, but I had trouble encoding categories into a -1 to +1 range without resorting to adding a cell for each member of the category. Similarly, the output didn't give me the discrete value I was looking for. I am now working on a new approach where each cell has a current state A-Z, an output A-Z, and an input string of arbitrary length built from its connections or inputs.
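As a minimal sketch of the fitness-weighted selection described above (written in Python for brevity, although the project itself is C#): the function name `select_parents` is hypothetical, and requiring the two picks to be distinct organisms is an assumption, not something the setup demands.

```python
import random

def select_parents(lifespans, rng=random):
    """Pick two distinct organism indices, weighted by lifespan in cycles.

    Assumption: lifespan is the only fitness term, and the same organism
    may not be picked twice. At least two lifespans must be non-zero.
    """
    indices = list(range(len(lifespans)))
    first = rng.choices(indices, weights=lifespans, k=1)[0]
    # Remove the first pick so the second draw selects a different organism.
    rest = [i for i in indices if i != first]
    rest_weights = [lifespans[i] for i in rest]
    second = rng.choices(rest, weights=rest_weights, k=1)[0]
    return first, second

# Organisms that lived 0 cycles can never be selected.
print(select_parents([0, 5, 0, 3]))
```

Additional fitness criteria could later be folded into the weights without changing the selection step itself.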
Example: The organism sits at (5, 4) on the grid and gets information about an object at (4, 4) (to the left) into an input cell. It might appear like this:
Cell previous state = "B" (last cell state affects next state; it just happens to be this at that moment)
Object basic type = "F" (it is a flower)
Object color (extended info) = "R" (red flower)
Object extra data = "D" (it has 4 leaves)
Linked cell 01 output = "E" (somewhere in the middle layers)
Linked cell 02 output = "A" (somewhere in the middle layers)
The input string is thus "BFRDEA". I need to map this to fuzzy (probabilistic) truth table data like the following. I currently use a dictionary with the string as the key, but as you can imagine there are many possible input strings, so it gets large:
Next cell state weights (random selection weighted): "A" = 0.2, "B" = 0.7, null (no change) = 0.1
Cell output weights: "E" = 0.5, "M" = 0.5
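To make the lookup concrete, here is a rough sketch of one cell update using the example values above (Python for brevity; the real project is C#). The names `TRUTH_TABLE`, `weighted_pick`, and `step_cell` are hypothetical, and the fallback for an unseen input string (keep the state, emit nothing) is purely an assumption for illustration.

```python
import random

# Input string -> (next-state weights, output weights).
# The key None stands for "no change" to the cell state.
TRUTH_TABLE = {
    "BFRDEA": (
        {"A": 0.2, "B": 0.7, None: 0.1},
        {"E": 0.5, "M": 0.5},
    ),
}

def weighted_pick(weights, rng):
    """Draw one key at random, with probability proportional to its weight."""
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]

def step_cell(prev_state, inputs, table, rng=random):
    """Return (next_state, output) for one cell and one time cycle."""
    entry = table.get(prev_state + inputs)
    if entry is None:
        # Assumed fallback: unknown input string, keep state and emit no output.
        return prev_state, None
    state_weights, output_weights = entry
    next_state = weighted_pick(state_weights, rng)
    if next_state is None:
        next_state = prev_state  # "no change" was drawn
    return next_state, weighted_pick(output_weights, rng)

print(step_cell("B", "FRDEA", TRUTH_TABLE))
```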
And so it goes, through the input, middle, and output layers. An output node could signal to move, pick up, eat, etc. to a certain adjacent cell on the map. I want to enable something I think is vital to evolution, which is information persistence. For example, human life changed with the invention of writing. I want to enable three things: "talking" via string transmission between nearby organisms; a parent organism sharing data with its child when spawning (initial network state?); and "artifacts", meaning writing a string to an object which may be discovered later by another organism.
Problems:
If I want to mutate the network cell layout or connections, it is likely to break the string => table mapping (mainly because the string length or the order of the input letters is now different). I am trying to decide whether I should switch to some kind of rule-based system that is less brittle, but I am not sure how to do that (any ideas?). I am also trying to relate the energy "cost" per cycle to the processing time, meaning a "smarter" net that can do more logic with fewer units, or with less time-consuming look-ups, is going to have an advantage.
Only certain input strings are likely to be encountered, given the environment and the internal logic. But to get the organism to "do" anything useful, there needs to be some random mutation, or an initial simple network that at least explores a bit. Usually, my first problem is that nets don't want to move or replicate. There is no easy fix: simply rewarding movement doesn't result in useful behavior either, so I am not sure how to "seed" it other than randomizing until one of them gets lucky and does something useful.
My initial goal is to have organisms learn a few basic things: how to respond to certain objects, such as not trying to move through a solid block like a wall; and how to identify food objects, like a "green flower", while avoiding a "red flower", which is poison and reduces energy. I am not sure how to evolve this based on reward or punishment, or how the organism can relate its actions to their consequences.
I am looking for suggestions on ways to have it learn other than by randomizing the net values.
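A small sketch of the brittleness described above (Python for brevity; `build_key` and the extra linked cell with output "C" are hypothetical). It only illustrates the problem, not a fix: once a mutation adds one connection, the positional key no longer matches anything already in the table.

```python
def build_key(prev_state, sensor_letters, linked_outputs):
    """Concatenate letters in fixed positional order, as in "BFRDEA"."""
    return prev_state + "".join(sensor_letters) + "".join(linked_outputs)

# One entry that existed before the mutation.
table = {"BFRDEA": "entry from before the mutation"}

before = build_key("B", ["F", "R", "D"], ["E", "A"])
# Hypothetical mutation: a third linked cell (output "C") is wired in.
after = build_key("B", ["F", "R", "D"], ["E", "A", "C"])

print(before in table)  # True
print(after in table)   # False: the longer key misses every stored entry
```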