Minimal Disturbance Neural Network

In a minimal disturbance system, every input into the system drives the learning process. If there is no input signal, the system is considered to be in a stable state. Unlike credit assignment learning, the system does not seek rewards or maximal return. Instead, any disturbance-free state is satisfactory.

The neural networks are biologically plausible feed-forward networks made up of adaptive leaky integrate-and-fire neurons. Each network has three distinct layers: an input layer, a middle layer and an output layer. The networks are evolved and evaluated on the task of maximising two resources, labelled here as 'energy' and 'water'. The network is provided with a set of actions, each of which increases or decreases the energy or water level in a virtual body by one or two resource points, or leaves it unchanged.
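As a rough illustration of the neuron model named above, the sketch below implements one common form of adaptive leaky integrate-and-fire neuron in discrete time, where the firing threshold rises after each spike and then relaxes back. The adaptation rule and parameter values are not specified here, so the class name, the threshold-adaptation scheme and all constants are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveLIFNeuron:
    """Hypothetical adaptive leaky integrate-and-fire neuron (discrete time)."""

    leak: float = 0.9            # fraction of potential kept each step (assumed)
    base_threshold: float = 1.0  # resting firing threshold (assumed)
    adapt_step: float = 0.5      # threshold increase after a spike (assumed)
    adapt_decay: float = 0.8     # relaxation factor of the extra threshold (assumed)
    potential: float = 0.0
    extra_threshold: float = 0.0

    def step(self, input_current: float) -> bool:
        """Advance one time step and return True if the neuron fires."""
        # Leak the stored potential, then integrate the new input.
        self.potential = self.potential * self.leak + input_current
        # Let the adaptive part of the threshold relax towards zero.
        self.extra_threshold *= self.adapt_decay
        if self.potential >= self.base_threshold + self.extra_threshold:
            self.potential = 0.0                     # reset after a spike
            self.extra_threshold += self.adapt_step  # harder to fire again soon
            return True
        return False


if __name__ == "__main__":
    neuron = AdaptiveLIFNeuron()
    spikes = [neuron.step(0.4) for _ in range(20)]
    # Print a simple spike raster: "|" for a spike, "." for silence.
    print("".join("|" if fired else "." for fired in spikes))
```

With a constant input the gaps between spikes lengthen after the first spike, which is the adaptation the name refers to. With no input the neuron never fires.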

There is one output neuron per action. The action performed by the network must have a direct and immediate effect on the target environment. If the action has a desirable effect, the corresponding input signals are reduced on the next turn. In this way the network acts as a minimal disturbance system, settling upon actions that reduce its total input activation.
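The feedback loop described above can be sketched with a toy stand-in for the network. This is not the evolved spiking network: it assumes that each input signal is the shortfall of a resource below a target level, and it uses a simple rule (keep the current action while total input falls, otherwise pick another at random) in place of the network's own adaptation. The action names, the target level and the switching rule are all hypothetical.

```python
import random

# Hypothetical action set: (energy change, water change) per action.
ACTIONS = {
    "eat": (2, 0),
    "drink": (0, 2),
    "rest": (0, 0),
    "waste_energy": (-1, 0),
    "waste_water": (0, -1),
}
TARGET = 10  # assumed resource level at which an input signal vanishes


def input_signals(levels):
    """Input strength per resource: the shortfall below TARGET, never negative."""
    return {name: max(0, TARGET - value) for name, value in levels.items()}


def run(steps=200, seed=0):
    """Run the loop and return the final resource levels and total input."""
    rng = random.Random(seed)
    levels = {"energy": 3, "water": 3}
    action = rng.choice(list(ACTIONS))
    previous_total = sum(input_signals(levels).values())
    total = previous_total
    for _ in range(steps):
        d_energy, d_water = ACTIONS[action]
        levels["energy"] = max(0, levels["energy"] + d_energy)
        levels["water"] = max(0, levels["water"] + d_water)
        total = sum(input_signals(levels).values())
        if total == 0:
            break  # no input signal: the system is in a stable state
        if total >= previous_total:
            # The action did not reduce the disturbance, so try another one.
            action = rng.choice(list(ACTIONS))
        previous_total = total
    return levels, total


if __name__ == "__main__":
    print(run())
```

Note that nothing in the loop rewards a higher resource level as such. The loop only stops changing its behaviour once the total input has dropped to zero.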

For example, if the neural network were used in a robot with solar panels, actions that moved it into strong light would increase the charge sent to its batteries. An external module could sense this and reduce the appropriate input signal to the neural network by an amount corresponding to how much the batteries needed charging. Once the batteries are charged, the input signals are no longer reduced and the robot is pushed out of its stable state.
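One possible reading of such an external module is sketched below. It assumes a fixed baseline input signal, measures need as the uncharged fraction of the battery, and applies the reduction only while charge is actually arriving. The function name, its parameters and the linear scaling are hypothetical choices, not details of the system described here.

```python
def input_signal(baseline: float, charge: float, capacity: float,
                 charge_gain: float) -> float:
    """Return the input signal after the external module's reduction.

    baseline:    signal strength when nothing is reduced (assumed constant)
    charge:      current battery charge
    capacity:    battery capacity, must be positive
    charge_gain: charge received this turn; positive when in strong light
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # Need is the uncharged fraction of the battery, clamped to [0, 1].
    need = min(1.0, max(0.0, 1.0 - charge / capacity))
    # Reduce the signal only while the action is actually charging the battery.
    reduction = baseline * need if charge_gain > 0 else 0.0
    return baseline - reduction


if __name__ == "__main__":
    print(input_signal(1.0, 0.0, 100.0, 5.0))    # empty battery, charging
    print(input_signal(1.0, 100.0, 100.0, 5.0))  # full battery, still in light
```

Under these assumptions a fully charged battery yields no reduction even in strong light, so the signal returns to its baseline and the robot is disturbed again.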

Neuromodulators

As used here, a modulator is a global signal that can influence the behaviour of a neuron if that neuron has receptors for it. The signal decays over time at a pace set by a re-uptake rate, and its level is increased when neurons that have secretors for it fire.
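The decay and secretion of a modulator can be sketched as follows. The exact decay law is not given here, so this example assumes that a fixed fraction of the level is removed each time step. The class name, method names and the proportional decay are assumptions.

```python
class Modulator:
    """Hypothetical global modulator with proportional re-uptake."""

    def __init__(self, reuptake_rate: float, level: float = 0.0):
        if not 0.0 <= reuptake_rate <= 1.0:
            raise ValueError("reuptake_rate must be between 0 and 1")
        self.reuptake_rate = reuptake_rate  # fraction removed per step (assumed)
        self.level = level

    def secrete(self, amount: float) -> None:
        """Called when a neuron with a secretor for this modulator fires."""
        self.level += amount

    def step(self) -> None:
        """Apply one time step of re-uptake."""
        self.level *= 1.0 - self.reuptake_rate


if __name__ == "__main__":
    modulator = Modulator(reuptake_rate=0.5)
    modulator.secrete(1.0)
    for _ in range(3):
        modulator.step()
        print(modulator.level)
```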

Neurons that are to be modulated are given a random number of receptors, which respond to modulators secreted by neurons in other layers. Each receptor modulates either the neuron's sensitivity to input or its probability of firing. The effect of this modulation is determined by the level of the associated modulator and by whether the receptor is inhibitory or excitatory. Neurons can also have secretors, which increase the level of an associated modulator.
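A minimal sketch of how receptors might translate modulator levels into changes in a neuron's parameters is given below. It assumes a multiplicative effect that grows linearly with the modulator level. The Receptor fields, the strength parameter and the scaling formula are hypothetical and chosen only to make the description concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receptor:
    modulator: str         # name of the modulator this receptor responds to
    target: str            # "sensitivity" or "firing_probability"
    excitatory: bool       # True for excitatory, False for inhibitory
    strength: float = 1.0  # assumed scaling of the receptor's effect


def modulated_parameters(receptors, levels, sensitivity=1.0,
                         firing_probability=1.0):
    """Return (sensitivity, firing_probability) after applying all receptors."""
    for receptor in receptors:
        level = levels.get(receptor.modulator, 0.0)
        sign = 1.0 if receptor.excitatory else -1.0
        # Linear, multiplicative effect that cannot go below zero (assumed).
        factor = max(0.0, 1.0 + sign * receptor.strength * level)
        if receptor.target == "sensitivity":
            sensitivity *= factor
        elif receptor.target == "firing_probability":
            firing_probability *= factor
        else:
            raise ValueError(f"unknown receptor target: {receptor.target}")
    # A probability must stay within [0, 1].
    return sensitivity, min(1.0, max(0.0, firing_probability))


if __name__ == "__main__":
    receptors = [
        Receptor("m1", "sensitivity", excitatory=True, strength=0.5),
        Receptor("m2", "firing_probability", excitatory=False, strength=0.5),
    ]
    print(modulated_parameters(receptors, {"m1": 1.0, "m2": 1.0}))
```

A neuron without receptors for a modulator is unaffected by it, whatever its level.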

A minimal disturbance system can be biased towards either exploration or exploitation depending upon which of its layers are modulated. A network biased towards exploration, as in A), is more likely to try other actions even if they have not produced the most desirable effect in the past. If the effect of another action then changes for the better, the network is more likely to start using it. A network biased towards exploitation, as in B), is more likely to settle upon actions that have a desirable effect and result in reduced input strength. If the effect of another action changes for the better, the network is less likely to start using it.

Current work involves using minimal disturbance systems to arbitrate between other neural networks. Pictured below is a parent minimal disturbance system feeding inputs into two child minimal disturbance systems. Each child system has actions dedicated to increasing a different resource. The parent adapts the strengths of its outputs depending on which resource has been signalled as needing replenishment. This is signalled via modulators from an external source.