What's the reason for choosing Lua?
It's easy to embed inside other programs, or even inside itself, and its syntax and semantics are extensible. Lua is clean, simple, readable, and maintainable, and you don't have to muck with significant whitespace or boatloads of brackets. There are built-in concurrency mechanisms (coroutines), and if I want to make it faster, all I have to do is switch the underlying interpreter over to LuaJIT, which runs nearly as fast as compiled C - easily one of the fastest dynamic language implementations around. LuaJIT also has a beautiful FFI system that makes pulling in third-party libraries extremely easy.
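Both of those points are cheap to demonstrate. Here's a minimal sketch - the ffi half assumes you're running LuaJIT, while the coroutine half works in stock Lua too:
local ffi = require("ffi") -- LuaJIT's built-in FFI module
ffi.cdef[[
int printf(const char *fmt, ...);
]]
ffi.C.printf("Hello from %s!\n", "C") -- call straight into the C runtime

-- built-in cooperative concurrency via coroutines
local gen = coroutine.wrap(function()
    for i = 1, 3 do coroutine.yield(i * i) end
end)
for _ = 1, 3 do print(gen()) end -- prints 1, 4, 9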
It's also portable across platforms and host languages - embedding the interpreter is trivial. That said, it lacks a modern IDE, and some people don't like the syntax. Anyway, it's easily one of my favorite languages.
Here's an example of an LSTM neural network (http://en.wikipedia.org/wiki/Long_short_term_memory) setup:
local function _sigmoid(input, steepness)
    steepness = steepness or 1
    -- logistic squashing function; steepness flattens or sharpens the curve
    return 1.0 / (1.0 + math.exp(-input / steepness))
    --return math.tanh(input / steepness)
end
local function _createWeightsArray(numWeights)
    -- initialize weights to small random values in (-0.1, 0.1]
    local weightsArray = {}
    for i = 1, numWeights do
        weightsArray[i] = .1 - (.2 * math.random())
    end
    return weightsArray
end
local function _createNeuron(numWeights)
    --Add 1 to numWeights for bias
    numWeights = numWeights + 1
    local neuron = {
        _createWeightsArray(numWeights), -- input weights
        _createWeightsArray(numWeights), -- input gate weights
        _createWeightsArray(numWeights), -- retention weights
        _createWeightsArray(numWeights)  -- output weights
    }
    neuron.memory = 1
    return neuron
end
local function _createNetwork(numInputs, hiddenLayerTable, numOutputs)
    local network = {}
    local numHiddenLayers = #hiddenLayerTable
    local inputLayer = {}
    local outputLayer = {}
    -- input layer: one single-input neuron per network input
    for i = 1, numInputs do
        inputLayer[i] = _createNeuron(1)
    end
    network[1] = inputLayer
    -- hidden layers: each neuron is fed by every neuron in the previous layer
    for i = 1, numHiddenLayers do
        network[i+1] = {}
        for j = 1, hiddenLayerTable[i] do
            network[i+1][j] = _createNeuron(#network[i])
        end
    end
    -- output layer: fed by the last hidden layer
    for i = 1, numOutputs do
        outputLayer[i] = _createNeuron(hiddenLayerTable[numHiddenLayers])
    end
    network[numHiddenLayers+2] = outputLayer
    return network
end
local function _activateNeuron(neuron, inputs, freeze)
    --initialize variables
    local inputWeights = neuron[1]
    local inputGateWeights = neuron[2]
    local retentionWeights = neuron[3]
    local outputWeights = neuron[4]
    local sigmoidReceptorInput = 0
    local sigmoidReceptorInputGate = 0
    local sigmoidReceptorRetention = 0
    local sigmoidReceptorOutput = 0
    --sum incoming inputs
    for i, v in ipairs(inputs) do
        sigmoidReceptorInput = sigmoidReceptorInput + (inputWeights[i] * v)
        sigmoidReceptorInputGate = sigmoidReceptorInputGate + (inputGateWeights[i] * v)
        sigmoidReceptorRetention = sigmoidReceptorRetention + (retentionWeights[i] * v)
        sigmoidReceptorOutput = sigmoidReceptorOutput + (outputWeights[i] * v)
    end
    -- calculate output: gated input plus retained memory, gated again on the way out
    local input = _sigmoid(sigmoidReceptorInput)
    local inputgate = _sigmoid(sigmoidReceptorInputGate)
    local retention = _sigmoid(sigmoidReceptorRetention)
    local outputgate = _sigmoid(sigmoidReceptorOutput)
    local workingMemory = input * inputgate
    local dynamicMemory = neuron.memory * retention
    local memory = workingMemory + dynamicMemory
    local output = memory * outputgate
    -- pass freeze = 1 to read the neuron without updating its memory cell
    if freeze ~= 1 then neuron.memory = memory end
    return output
end
local function _forwardPropagate(network, inputs)
    local inputLayerOutput = {}
    local currentLayerOutput = {}
    local bias = -1
    if #inputs ~= #network[1] then
        print("Unexpected difference in inputs!!!")
        return 0
    end
    for i, v in ipairs(inputs) do
        if network[1][i] ~= nil then
            inputLayerOutput[i] = _activateNeuron(network[1][i], {v})
        end
    end
    --add bias input to layer output
    inputLayerOutput[#inputLayerOutput+1] = bias
    currentLayerOutput[1] = inputLayerOutput
    for i = 2, #network do
        currentLayerOutput[i] = {}
        for j, neuron in ipairs(network[i]) do
            currentLayerOutput[i][j] = _activateNeuron(neuron, currentLayerOutput[i-1])
        end
        --apply bias to each successive layer
        if i < #network then currentLayerOutput[i][#currentLayerOutput[i]+1] = bias end
    end
    return currentLayerOutput
end
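To see it run, here's a quick usage sketch - the dimensions and input values are made up for illustration:
-- two inputs, one hidden layer of three neurons, one output
math.randomseed(os.time())
local net = _createNetwork(2, {3}, 1)
local activations = _forwardPropagate(net, {0.5, -0.25})
local outputs = activations[#activations] -- the output layer's activations
print("network output:", outputs[1])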
It's not optimized, but I was able to put it together after about a week of reading papers on the theory behind LSTM. LSTM is a method of adding persistence and "input agnosticism" to neural networks. Memory states can be retained, impressed, and removed to alter the functionality of a network, which gives these networks awesome flexibility.
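That persistence is visible at the single-neuron level - a small sketch of what I mean:
-- a one-input neuron; the second input slot here acts as the bias
local n = _createNeuron(1)
print(_activateNeuron(n, {0.5, -1}))    -- updates n.memory
print(_activateNeuron(n, {0.5, -1}))    -- same input, different output: memory carried over
print(_activateNeuron(n, {0.5, -1}, 1)) -- freeze = 1 reads without impressing a new state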
You'll notice that it lacks a training method. I've been working out a quickprop/cascade correlation setup, but that approach seems kludgy. I've been waiting for inspiration to strike - there's gotta be a way of building networks dynamically without running into a combinatorial explosion. I have an inkling that there will be lots of solutions, but that they'll depend on the implementation - I'll have to have the network in situ to create a meaningful training method that includes dynamic self-update.
Anyhow - this is part of my project. My ultimate goal is to create a mind/knowledge representation database and plug in a combination of dynamic LSTM ANNs and system functions that grow and learn and communicate intelligently. Database and interface/console coming up sometime later, after I get some blog stuff organized.