Machine Learning on an 8-bit Micro

8bit Image

8 bits, but 16 MHz!

I recently built a robot around an Atmel ATmega328P 8-bit processor. Sensory input comes from an ultrasonic range-finder mounted on a servo, so it can take readings at various angles. Other inputs include a 3-axis ADXL335 accelerometer as well as wheel rotation sensors.
Tank Robot

I designed a small Arduino clone board and etched the PCB. After assembling everything, I realized I'd made a mistake and wired the servo to a non-PWM pin. Oh well, I'm going to have to fix that in software, as I don't want to etch another PCB.
Arduino clone

So, normally, it wouldn't be much of a challenge to take the data from these input sources and develop some code to avoid obstacles while navigating the environment. Been there, done that many times. I've got quite the collection of home-built robots that walk, crawl and drive around my house. What they mostly have in common is that the control system is hard-coded. Something along the lines of:

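void mainLoop() {
    // Sweep from 0 to 180 degrees (PI radians) in SWEEP_ANGLE steps.
    // The loop bound is deliberately written with floating point.
    for (int i = 0; i < (int)(PI*(180/PI)); i += SWEEP_ANGLE)
    {
        // store one range reading per sweep position
        range_cm[i/SWEEP_ANGLE] = sonar.ping_cm();
    }
}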

The ATmega328P is an 8-bit MCU with no hardware support for floating point. I deliberately used floating point in the code above to show that the compiler can emulate an FPU in software. This emulation is terribly slow. Furthermore, the ATmega328P has just 2 kB of RAM and only 32 kB of flash storage. The ATmega328P in my robot is running at approximately 16 MHz. This is not a fast CPU, but for $3 it will do.
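For contrast, the same sweep can be written with integer arithmetic only, so no software floating point is involved. This is a sketch rather than the robot's actual code: the value of `SWEEP_ANGLE`, the `SWEEP_END` and `NUM_READINGS` constants, and the `read_cm` callback (standing in for `sonar.ping_cm()`) are assumptions made so the example compiles off the robot.

```cpp
// Hypothetical values: the post does not state the real step size.
const int SWEEP_ANGLE = 30;                           // degrees between readings
const int SWEEP_END = 180;                            // sweep covers 0..180 degrees
const int NUM_READINGS = SWEEP_END / SWEEP_ANGLE;     // one slot per position

unsigned int range_cm[NUM_READINGS];

// read_cm is a stand-in for the range-finder call; it receives the
// current sweep angle in degrees and returns a distance in centimetres.
void sweep(unsigned int (*read_cm)(int angle_deg)) {
    for (int i = 0; i < SWEEP_END; i += SWEEP_ANGLE) {
        // Integer division maps the angle to an array index.
        range_cm[i / SWEEP_ANGLE] = read_cm(i);
    }
}
```

Every operation here is an integer add, compare or divide by a constant, which an 8-bit core handles far more cheaply than emulated floating point.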

It's all about the Sigmoid!

I decided to see if I could implement a feedforward neural network and train it with gradient descent, using backpropagation of error to compute the gradients. A typical artificial neural network (ANN) consists of multiple layers of neurons. Each neuron sums its inputs multiplied by their weights, passes the result through an activation function and generates an output. The inputs are typically floating point numbers. The activation function most often used in a simple feedforward network is the sigmoid function:
$$g(z) = \frac{1}{1+e^{-z}}$$
sigmoid_plot

Every forward pass through the neural network will calculate g(z) multiple times. Remember, this is slow due to the ATmega328P lacking hardware floating point. It is especially slow because just calculating the exponential involves close to a dozen divisions and multiplications. We can do better than this. I came up with a function that approximates the sigmoid but only requires two divides and no multiplications.
$$g(z) = \frac{z/2}{1+|z|} + 0.5$$
approximation_plot
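A minimal sketch of the two functions side by side, assuming the 0.5 offset is added after the division so that g(0) = 0.5, as with the true sigmoid. The function names are made up for illustration, and the derivative helper is an addition that backpropagation would need; it is not taken from the original code.

```cpp
#include <cmath>

// Exact logistic sigmoid, kept only as a reference for comparison.
float sigmoid_exact(float z) {
    return 1.0f / (1.0f + std::exp(-z));
}

// Approximation: (z/2) / (1 + |z|) + 0.5.
// Two divides, no multiplies and no call to exp().
float sigmoid_fast(float z) {
    return (z / 2.0f) / (1.0f + std::fabs(z)) + 0.5f;
}

// Derivative of the approximation with respect to z: 0.5 / (1 + |z|)^2.
float sigmoid_fast_deriv(float z) {
    float d = 1.0f + std::fabs(z);
    return 0.5f / (d * d);
}
```

Both functions pass through 0.5 at z = 0 and saturate towards 0 and 1, but the approximation approaches its limits more slowly than the exact sigmoid.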

The functions are close enough that training through gradient descent works just fine, even with floats. The end result is a robot that utilizes a neural network to learn to automatically navigate the room in the following manner:

  1. Initially the neural net is untrained and the robot will randomly roam its environment and randomly respond to inputs.
  2. When a collision is detected, the robot will know what the sensor reading was prior to the collision and will back up and try a random behavior (i.e., turn left, move forward, etc.). If the randomly selected behavior results in no collision, the neural network is trained on that input and the selected output behavior.
  3. This goes on multiple times. As the neural network begins to learn about its environment, it begins making fewer mistakes. At some point, the network will be fully trained and should no longer make mistakes.
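The trial-and-error steps above can be sketched as a small piece of bookkeeping. Everything named here is hypothetical: the `Behavior` values beyond "turn left" and "move forward", the `Experience` struct and the `evaluate_trial` helper are illustrations of the procedure, not the robot's actual code.

```cpp
#include <cstdlib>

// Hypothetical behavior set; the post only names turning left and moving forward.
enum Behavior { TURN_LEFT, TURN_RIGHT, MOVE_FORWARD, BEHAVIOR_COUNT };

// One training example: what the sensors read before the collision,
// and which randomly chosen behavior got the robot out of trouble.
struct Experience {
    float sensors[3];
    Behavior action;
};

// Pick a random behavior to try after backing away from an obstacle.
Behavior random_behavior() {
    return static_cast<Behavior>(std::rand() % BEHAVIOR_COUNT);
}

// Judge one trial. Returns true and fills `out` only when the tried
// behavior did not lead to another collision, which is the case where
// the network should be trained on the (sensors, action) pair.
bool evaluate_trial(const float sensors_before[3], Behavior tried,
                    bool collided_again, Experience &out) {
    if (collided_again) {
        return false;  // bad guess: discard it and try another behavior
    }
    for (int i = 0; i < 3; ++i) {
        out.sensors[i] = sensors_before[i];
    }
    out.action = tried;
    return true;
}
```

Keeping only the successful trials means the network is never trained on a behavior that led straight into another collision.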

I ended up using a three-layer neural network in a (3-3-2) configuration, with three inputs and two outputs directly driving the differential steering of the robot. The feedforward phase of the neural network ran quite well (as seen in the video) and performed well enough for real-time control of the robot. The un-optimized learning phase was slower, but it was not my goal to make the learning occur in real time, nor was it important in this application.
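A rough sketch of what a 3-3-2 network with one gradient-descent step might look like. It assumes a squared-error loss, a bias per neuron, and the approximation (z/2)/(1+|z|) + 0.5 as the activation. The `Net` layout, the function names and the learning rate parameter are all illustrative choices, not the downloadable code.

```cpp
#include <cmath>

const int N_IN = 3, N_HID = 3, N_OUT = 2;

struct Net {
    float w1[N_HID][N_IN + 1];   // hidden weights, last column is the bias
    float w2[N_OUT][N_HID + 1];  // output weights, last column is the bias
    float zh[N_HID], hid[N_HID]; // hidden sums and activations
    float zo[N_OUT], out[N_OUT]; // output sums and activations
};

// Sigmoid approximation and its derivative with respect to z.
static float act(float z)  { return (z / 2.0f) / (1.0f + std::fabs(z)) + 0.5f; }
static float dact(float z) { float d = 1.0f + std::fabs(z); return 0.5f / (d * d); }

// Feedforward phase: results are left in n.hid and n.out.
void forward(Net &n, const float in[N_IN]) {
    for (int j = 0; j < N_HID; ++j) {
        float z = n.w1[j][N_IN];                       // start from the bias
        for (int i = 0; i < N_IN; ++i) z += n.w1[j][i] * in[i];
        n.zh[j] = z;
        n.hid[j] = act(z);
    }
    for (int k = 0; k < N_OUT; ++k) {
        float z = n.w2[k][N_HID];
        for (int j = 0; j < N_HID; ++j) z += n.w2[k][j] * n.hid[j];
        n.zo[k] = z;
        n.out[k] = act(z);
    }
}

// Learning phase: one backpropagation step on a single example.
// Returns the squared error measured before the weights were updated.
float train_step(Net &n, const float in[N_IN], const float target[N_OUT], float rate) {
    forward(n, in);
    float delta_o[N_OUT], delta_h[N_HID], err = 0.0f;
    for (int k = 0; k < N_OUT; ++k) {
        float e = n.out[k] - target[k];
        err += 0.5f * e * e;
        delta_o[k] = e * dact(n.zo[k]);
    }
    // Hidden deltas use the output weights from before this update.
    for (int j = 0; j < N_HID; ++j) {
        float s = 0.0f;
        for (int k = 0; k < N_OUT; ++k) s += delta_o[k] * n.w2[k][j];
        delta_h[j] = s * dact(n.zh[j]);
    }
    for (int k = 0; k < N_OUT; ++k) {
        for (int j = 0; j < N_HID; ++j) n.w2[k][j] -= rate * delta_o[k] * n.hid[j];
        n.w2[k][N_HID] -= rate * delta_o[k];
    }
    for (int j = 0; j < N_HID; ++j) {
        for (int i = 0; i < N_IN; ++i) n.w1[j][i] -= rate * delta_h[j] * in[i];
        n.w1[j][N_IN] -= rate * delta_h[j];
    }
    return err;
}
```

With 20 weights and 10 cached values stored as 4-byte floats, this layout takes 120 bytes, which fits comfortably in 2 kB of RAM. The forward pass needs only 15 multiply-adds and 5 activations, while a training step does roughly three times that work.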

There are many improvements that could be made. The first would be to forgo floating point altogether and build the neural network on fixed-point routines. That's probably the optimal way to run a neural network on an embedded system. However, if you don't want to spend a lot of time implementing fixed-point routines and a neural network that uses them, the approach described here is a viable alternative.
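To give a feel for the fixed-point route, here is a minimal sketch assuming a Q8.8 format (16-bit values with 8 fractional bits). The format, the `fix_t` type and all function names are assumptions chosen for illustration, and the activation is the approximation (z/2)/(1+|z|) + 0.5 carried over to integers.

```cpp
#include <cstdint>

typedef int16_t fix_t;                   // Q8.8: 8 integer bits, 8 fractional bits
const int FIX_SHIFT = 8;
const fix_t FIX_ONE = 1 << FIX_SHIFT;    // 1.0 is stored as 256

// Conversions for constants and debugging; not meant for the hot path.
fix_t fix_from_float(float x) {
    return (fix_t)(x * FIX_ONE + (x >= 0 ? 0.5f : -0.5f));
}
float fix_to_float(fix_t x) {
    return (float)x / FIX_ONE;
}

// Multiply two Q8.8 values through a 32-bit intermediate.
// No overflow or saturation handling in this sketch.
fix_t fix_mul(fix_t a, fix_t b) {
    return (fix_t)(((int32_t)a * b) / FIX_ONE);
}

// Fixed-point version of (z/2) / (1 + |z|) + 0.5.
fix_t fix_sigmoid(fix_t z) {
    int32_t absz = z < 0 ? -(int32_t)z : (int32_t)z;
    // Pre-scale the numerator so the quotient is still in Q8.8.
    int32_t num = ((int32_t)z / 2) * FIX_ONE;
    return (fix_t)(num / (FIX_ONE + absz) + FIX_ONE / 2);
}
```

The scaling by `FIX_ONE` is a power of two, so a compiler can turn it into shifts. The one true division left is the 32-bit integer divide in `fix_sigmoid`.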

Embedded video of the learning phase below:

The code can be downloaded here.