Recent Tutorials

7th May 2015 at 16:25 · By Lee Jacobson

Ant Colony Optimization For Hackers

Originally proposed in 1992 by Marco Dorigo, ant colony optimization (ACO) is an optimization technique inspired by the path finding behaviour of ants searching for food. ACO is also a subset of swarm intelligence - a problem solving technique using decentralized, collective behaviour, to derive artificial intelligence. Typical applications of ant colony optimization are combinatorial optimization problems such as the traveling salesman problem, however it can also be used to solve various scheduling and routing problems.

One advantage ant colony optimization algorithms have over other optimization algorithms is their ability to adapt to dynamic environments - a feature that makes it great for applications such as network routing, where there are likely to be frequent changes to accommodate to.

Ants in Nature

To fully understand ant colony optimization, it's important to first understand the natural behavior of ants which first inspired the algorithm.

When searching for food ants follow a relatively basic set of rules. Although simple, it's these rules which allow ants to communicate and cooperatively optimize their paths to food sources. One of the key characteristics behind all these rules is the use of pheromone trails. Pheromone trails are essentially what ants use to communicate to other ants that a food source has been found, and how to get to it. When other ants come across pheromone trails they can typically expect to find food if they decide to follow it. However, ants don't follow every pheromone trail they find. Depending on the strength of the pheromone trail, an ant may decide to take a different path, or perhaps a completely random path which has no pheromone on. However, on average the stronger a pheromone trail is, the more chance there is of an ant taking it.

Over time, unless reinforced by other ants, pheromone trails will gradually evaporate. This means that pheromone trails which no longer lead to a food source will eventually stop being used, promoting ants to find new paths and new food sources.

To understand how this process over time also enables the colony to optimize their paths, consider the following example:



Here we can see the path an ant has taken to reach a food source. From our view of the environment we can clearly see that although the ant may have found a valid path to the food source, it's not the optimal one. But, here's where it get's interesting... Next, ants from the colony stumble upon the previous ant's pheromone trail. Most take the original path of the pheromone trail, but a few others decide to take the other, more direct, path:



Although the original pheromone trail may be stronger at this point, a second pheromone trail has started to develop from the few ants randomly taking it. What's important is that in this scenario the second path has an advantage because the ants taking it will reach, and retrieve, the food quicker than ants following the other path. For example, it may take 3 minutes for one ant to lay a single pheromone trail from the food back to the colony taking the original path, but in that time an ant may be able to lay two pheromone trails when taking the shorter path. Due to this characteristic the shorter path will usually begin to acquire more pheromone than the original path. This then leads to the original pheromone trail being used less and eventually evaporating in favor of the new, shorter, path.



It's this characteristic that typically leads to shorter paths possessing the densest pheromone trails, and as a consequence, they are generally the preferred path by the ant colony.

It's worth briefly noting that this is only a generalization of ant behavior in nature, and that different ant species often vary in their exact use of pheromones.

The Algorithm

This tutorial will look at applying the ACO algorithm to the traveling salesman problem (TSP). If you are not already familiar with the traveling salesman problem, you may want to familiarise yourself with it here, Applying a genetic algorithm to the traveling salesman problem

The bulk of the ant colony optimization algorithm is made up of only a few steps. First, each ant in the colony constructs a solution based on previously deposited pheromone trails. Next ants will lay pheromone trails on the components of their chosen solution, depending on the solution's quality. In the example of the traveling salesman problem this would be the edges (or the paths between the cities). Finally, after all ants have finished constructing a solution and laying their pheromone trails, pheromone is evaporated from each component depending on the pheromone evaporation rate. These steps are then ran as many times as are needed to generate an adequate solution.

Constructing a Solution

As mentioned previously, an ant will often follow the strongest pheromone trail when constructing a solution. However, for the ant to consider solutions other than the current best, a small amount of randomness is required in its decision process. In addition to this, a heuristic value is also computed and considered helping to guide the search process towards the best solutions. In the example of the traveling salesman problem this heuristic will typically be the length of the edge between the city being considered - the shorter the edge, the more likely an ant will pick it.

Let's take a look at how this works mathematically:

$$p^k_{ij} = \frac{[t_{ij}]^\alpha \cdot [n_{ij}]^\beta}{\sum_{l\in N^{k}_{l}} [t_{il}]^\alpha \cdot [n_{il}]^\beta}$$
This equation calculates the probability of selecting a single component of the solution. Here, $t_{ij}$ denotes the amount of pheromone on a component between states $i$ and $j$, and $n_{ij}$ denotes it's heuristic value. $\alpha$ and $\beta$ are both parameters used to control the importance of the pheromone trail and heuristic information during component selection.

In code this would typically be written as a roulette-wheel-style selection process:

rouletteWheel = 0
states = ant.getUnvisitedStates()

for newState in states do
    rouletteWheel += Math.pow(getPheromone(state, newState), getParam('alpha'))
            * Math.pow(calcHeuristicValue(state, newState), getParam('beta'))
end for

randomValue = random()
wheelPosition = 0

for newState in states do
    wheelPosition += Math.pow(getPheromone(state, newState), getParam('alpha'))
            * Math.pow(calcHeuristicValue(state, newState), getParam('beta'))
    if wheelPosition >= randomValue do
        return newState
end for

For the traveling salesman problem a state would represent a single city on the graph. Here, an ant would be selecting the next city depending on the distance to the next city, and the amount of pheromone on the path between the two cities.

Local Pheromone Update

The local pheromone update process is applied every time an ant successfully constructs a solution. This step mimics the way ants lay pheromone trails after finding food in nature. As you may remember from earlier, in nature better paths acquire more pheromone due to ants being able to traverse them quicker. In ACO, this characteristic is replicated by varying the amount of pheromone deposited on a component by considering how well the completed solution scores. If we use the traveling salesman problem as an example, pheromone will be deposited on the paths an ant took between cities depending the total tour distance.


This image shows pheromone trails laid on the paths between cities in a typical traveling salesman problem.

Before looking at how it could be written in code, first let's look at it mathematically:

$$\Delta \tau^k_{ij} = \begin{cases}Q/C^k& \text{if component(i, j) was used by ant}\\0& \text{Otherwise}\\ \end{cases}$$ $$\tau_{ij} \leftarrow \tau_{ij} + \sum_{k=1}^m \Delta \tau^k_{ij}$$

Here, $C^k$ is defined as the total cost of the solution, in the example of the traveling salesman problem this would be the tour length. $Q$ is simply a parameter to adjust the amount of pheromone deposited, typically it would be set to 1. We sum $Q/C^k$ for every solution which used $\text{component(i, j)}$, then that value becomes the amount of pheromone to be deposited on $\text{component(i, j)}$.

For clarity, let's go over a quick example of this process when applied to the traveling salesman problem...
  

Here, we have a typical traveling salesman graph and we are interested in calculating how much pheromone should be deposited on component A. In this example, let's assume three ants used this edge, or "component A", when constructing their tour. Let's also assume the three tours constructed had the following total distances:

Tour 1: 450
Tour 2: 380
Tour 1: 460

To work out how much pheromone should be deposited we simply sum the following:

componentNewPheromone = (Q / 450) + (Q / 380) + (Q / 460)

Now add that value to the pheromone on the component:

componentPheromone = componentPheromone + componentNewPheromone

Now, for a quick code example:

for ant in colony do
    tour = ant.getTour();
    pheromoneToAdd = getParam('Q') / tour.distance();
    for cityIndex in tour do
        if lastCity(cityIndex) do
            edge = getEdge(cityIndex, 0)
        else do
            edge = getEdge(cityIndex, cityIndex+1)
        end if
        currentPheromone = edge.getPheromone();
        edge.setPheromone(currentPheromone + pheromoneToAdd)
    end for
end for

In this code example we are simply looping over the ant colony, fetching each of the tours and finally applying pheromone to each tour edge.

Global Pheromone Update
The global pheromone update is the stage in which pheromone is evaporated from components. It is applied after each iteration of the algorithm, when each ant has successfully constructed a tour and applied the local update rule.

The global pheromone update is defined mathematically as follows:

$$\tau_{ij} \leftarrow (1 - \rho) \cdot \tau_{ij}$$

Where $\tau_{ij}$ is the pheromone on the component between state $i$ and $j$ and $\rho$, is a parameter used to vary the level of evaporation.

In code this might look as follows:

for edge in edges
    updatedPheromone = (1 - getParam('rho')) * edge.getPheromone()
    edge.setPheromone(updatedPheromone)
end for


Optimization

There are many variations of ant colony optimization, two of the main ones being elitist and MaxMin. Depending on the problem it's likely these variations will preform better than the standard ant colony system we have looked at here. Luckily for us however, it's quite simple to extend our algorithm to include elitist and MaxMin.

Elitist

In elitist ACO systems, either the best current, or global best ant, deposits extra pheromone during it's local pheromone update procedure. This encourages the colony to refine it's search around solutions which have a track record of being high quality. If all goes well, this should result in better search performance.

The math for the elitist local pheromone update procedure is almost exactly the same as the the local update procedure which we previously studied, with one small update:

$$\Delta \tau^{bs}_{ij} = \begin{cases}Q/C^{bs}& \text{if component(i, j) belongs to an elite ant}\\0& \text{Otherwise}\\ \end{cases}$$ $$\tau_{ij} \leftarrow \tau_{ij} + \sum_{k=1}^m \Delta \tau^k_{ij} + e \Delta \tau^{bs}_{ij}$$
Where, $e$ is a parameter used to adjust the amount of extra pheromone given to the best (or "elite") solution.

MaxMin

The MaxMin algorithm is similar to the elitist ACO algorithm in that it gives preference to high ranking solutions. However, in MaxMin instead of simply giving extra weight to elite solutions, only the best current, or best global solution, is allowed to deposit a pheromone trail. Additionally, MaxMin requires pheromone trails are keep between a maximum and minimum value. The idea being that having a limited range between the amount of pheromone found on trails, premature convergence around sub-optimal solutions can be avoided.

In many MaxMin implementations the best current ant is initially the ant which lays pheromone trails, then later, the algorithm switches so that the global best ant is the only ant which can lay a pheromone trail. This process helps encourage search across the entire search space initially, before eventually focusing in on the all time best, hopefully making a few final amendments.

Finally, the maximum and minimum value can either be passed in as parameters or adaptively set with code.

Example

To finish off here is an example implementation of the ACO algorithm applied to the traveling salesman problem in JavaScript. Simply click the graph to add new cities, then click the start button when you're ready to run the algorithm. Feel free to play around with the parameters so you get a feel for how the search process works. I've made the code available below - but please beware that I haven't finished refactoring or commenting it just yet.

Debug Info

Configuration
ACO Mode:
Max Pheromone: (1 * Pheromone Deposit Weight) / Best distance
Min Pheromone: Max Pheromone * Scaling Factor
Min Scaling Factor:

Colony Size:
Max Iterations:
Alpha:
Beta:
Rho:
Initial Pheromone:
Pheromone Deposit Weight:

Download Source

Comments


24th March 2015 at 22:07 · By Lee Jacobson

Solving the Traveling Salesman Problem Using Google Maps and Genetic Algorithms

An ideal way to explore the potentials and pitfalls of genetic algorithms is by applying them to real world data. Perhaps one of the easiest ways to do this is by using the Google Maps API to implement a solution to the traveling salesman problem.

If you're unfamiliar with genetic algorithms, or the traveling salesman problem, then you might find the following links useful:
Creating a genetic algorithm for beginners
Applying a genetic algorithm to the traveling salesman problem

Below I've created a very simple route optimizer which uses distance and duration data from the Google Maps API to find the quickest route.

One interesting thing you might observe is how in real life the quickest route is often neither the most direct, or the most obvious route you might think of taking. In the real world there are many restrictions such as road speed and traffic conditions, which are usually equal in importance to the direction when searching for the quickest route. This becomes clear when optimizing the route for driving, to walking or cycling. If you spend a lot of your day behind the wheel, or on foot, you may even be pleasantly surprised to find yourself a quicker daily route. =)

This demo has been built completely in HTML and JavaScript, using just the Google Maps API and the JQuery library. You're welcome to explore the code using the download link at the bottom of the post.

Loading...
Configuration
Travel Mode:
Avoid Highways:
Population Size:
Mutation Rate:
Crossover Rate:
Elitism:
Max Generations:
Debug Info
Destinations Count: 0

Download from GitHub: https://github.com/leejacobson/googlemaps-tsp-ga

Comments


26th March 2014 at 21:16 · By Lee Jacobson

Introduction to Artificial Neural Networks Part 2 - Learning

Welcome to part 2 of the introduction to my artificial neural networks series, if you haven't yet read part 1 you should probably go back and read that first!

Introduction

In part 1 we were introduced to what artificial neural networks are and we learnt the basics on how they can be used to solve problems. In this tutorial we will begin to find out how artificial neural networks can learn, why learning is so useful and what the different types of learning are. We will specifically be looking at training single-layer perceptrons with the perceptron learning rule.

Before we begin, we should probably first define what we mean by the word learning in the context of this tutorial. It is still unclear whether machines will ever be able to learn in the sense that they will have some kind of metacognition about what they are learning like humans. However, they can learn how to perform tasks better with experience. So here, we define learning simply as being able to perform better at a given task, or a range of tasks with experience.

Learning in Artificial Neural Networks

One of the most impressive features of artificial neural networks is their ability to learn. You may recall from the previous tutorial that artificial neural networks are inspired by the biological nervous system, in particular, the human brain. One of the most interesting characteristics of the human brain is it's ability to learn. We should note that our understanding of how exactly the brain does this is still very primitive, although we do still have a basic understanding of the process. It is believed that during the learning process the brain's neural structure is altered, increasing or decreasing the strength of it's synaptic connections depending on their activity. This is why more relevant information is easier to recall than information that hasn't been recalled for a long time. More relevant information will have stronger synaptic connections and less relevant information will gradually have it's synaptic connections weaken, making it harder to recall.

Although simplified, artificial neural networks can model this learning process by adjusting the weighted connections found between neurons in the network. This effectively emulates the strengthening and weakening of the synaptic connections found in our brains. This strengthening and weakening of the connections is what enables the network to learn.

Learning algorithms are extremely useful when it comes to certain problems that either can't be practically written by a programmer or can be done more efficiently by a learning algorithm. Facial recognition would be an example of a problem extremely hard for a human to accurately convert into code. A problem that could be solved better by a learning algorithm, would be a loan granting application which could use past loan data to classify future loan applications. Although a human could write rules to do this, a learning algorithm can better pick up on subtleties in the data which may be hard to code for.

Learning Types

There are many different algorithms that can be used when training artificial neural networks, each with their own separate advantages and disadvantages. The learning process within artificial neural networks is a result of altering the network's weights, with some kind of learning algorithm. The objective is to find a set of weight matrices which when applied to the network should - hopefully - map any input to a correct output. In this tutorial, the learning type we will be focusing on is supervised learning. But before we begin, lets take a quick look at the three major learning paradigms.

  • Supervised Learning
    The learning algorithm would fall under this category if the desired output for the network is also provided with the input while training the network. By providing the neural network with both an input and output pair it is possible to calculate an error based on it's target output and actual output. It can then use that error to make corrections to the network by updating it's weights.
  • Unsupervised Learning
    In this paradigm the neural network is only given a set of inputs and it's the neural network's responsibility to find some kind of pattern within the inputs provided without any external aid. This type of learning paradigm is often used in data mining and is also used by many recommendation algorithms due to their ability to predict a user's preferences based on the preferences of other similar users it has grouped together.
  • Reinforcement Learning
    Reinforcement learning is similar to supervised learning in that some feedback is given, however instead of providing a target output a reward is given based on how well the system performed. The aim of reinforcement learning is to maximize the reward the system receives through trial-and-error. This paradigm relates strongly with how learning works in nature, for example an animal might remember the actions it's previously taken which helped it to find food (the reward).

Implementing Supervised Learning

As mentioned earlier, supervised learning is a technique that uses a set of input-output pairs to train the network. The idea to provide the network with examples of inputs and outputs then to let it find a function that can correctly map the data we provided to a correct output. If the network has been trained with a good range of training data when the network has finished learning we should even be able to give it a new, unseen input and the network should be able to map it correctly to an output.

There are many different supervised learning algorithms we could use but the most popular, and the one we will be looking at in more detail is backpropagation. Before we look at why backpropagation is needed to train multi-layered networks, let's first look at how we can train single-layer networks, or as they're otherwise known, perceptrons.

The Perceptron Learning rule

The perceptron learning ruleworks by finding out what went wrong in the network and making slight corrections to hopefully prevent the same errors happening again. Here's how it works... First we take the network's actual output and compare it to the target output in our training set. If the network's actual output and target output don't match we know something went wrong and we can update the weights based on the amount of error. Lets run through the algorithm step by step to understand how exactly it works.

First, we need to calculate the perceptron's output for each output node. As you should remember from the previous tutorial we can do this by:

output = f( input1 × weight1 + input2 × weight2 + ... )
- or -
$o = f(\sum\limits_{i=1}^n x_iw_i)$

Now we have the actual output we can compare it to the target output to find the error:

error = target output - output
- or -
E = t - o

Now we want to use the perceptron's error to adjust the weights.

weight change = learning rate × error × input
- or -
Δwi = r E x

We want to ensure only small changes are made to the weights on each iteration, so to do this we apply a small learning rate (r). If the learning rate is too high the perceptron can jump too far and miss the solution, if it's too low, it can take an unreasonably long time to train.

This gives us a final weight update equation of:

weight change = learning rate × (target output - actual output) × input
- or -
Δwi = r ( t - o ) xi


Here's an example of how this would work with the AND function...

Learning rate = 0.1
Expected output = 1
Actual output =  0
Error = 1

Weight Update:
wi = r E x + wi
w1 = 0.1 x 1 x 1 + w1
w2 = 0.1 x 1 x 1 + w2

New Weights:
w1 = 0.4
w2 = 0.4


Learning rate = 0.1
Expected output = 1
Actual output =  0
Error = 1

Weight Update:
wi = r E x + wi
w1 = 0.1 x 1 x 1 + w1
w2 = 0.1 x 1 x 1 + w2

New Weights:
w1 = 0.5
w2 = 0.5


Learning rate = 0.1
Expected output = 1
Actual output =  1
Error = 0

No error,
training complete.

Implementing The Perceptron Learning Rule

To help fully understand what's happening let's implement a basic example in Java.

First, we initiate our network's threshold, learning rate and weights. We could initiate the weights with a small random starting weight, however for simplicity here we'll just set them to 0.

double threshold = 1;
double learningRate = 0.1;
double[] weights = {0.0, 0.0};

Next, we need to create our training data to train our perceptron. In this example our perceptron will be learning the AND function.

// AND function Training data
int[][][] trainingData = {
    {{0, 0}, {0}},
    {{0, 1}, {0}},
    {{1, 0}, {0}},
    {{1, 1}, {1}},
};

Now, we need to create a loop that we can break from later if our network completes a cycle of the training data without any errors. Then, we need a second loop that will iterate over each input in the training data.

// Start training loop
while(true){
    int errorCount = 0;
    // Loop over training data
    for(int i=0; i < trainingData.length; i++){
        System.out.println("Starting weights: " + Arrays.toString(weights));
    }

From here we can calculate the weighted sum of the inputs and get the output.

// Calculate weighted sum of inputs
double weightedSum = 0;
for(int ii=0; ii < trainingData[i][0].length; ii++) {
    weightedSum += trainingData[i][0][ii] * weights[ii];
}

// Calculate output
int output = 0;
if(threshold <= weightedSum){
    output = 1;
}

System.out.println("Target output: " + trainingData[i][1][0] + ", "
    + "Actual Output: " + output);

The next step is to calculate the error and adjust the weights...

// Calculate error
int error = trainingData[i][1][0] - output;

// Increase error count for incorrect output
if(error != 0){
    errorCount++;
}

// Update weights
for(int ii=0; ii < trainingData[i][0].length; ii++) {
    weights[ii] += learningRate * error * trainingData[i][0][ii];
}

System.out.println("New weights: " + Arrays.toString(weights));
System.out.println();

Finally, break if a solution is found and close loop

    // If there are no errors, stop
    if(errorCount == 0){
        System.out.println("Final weights: " + Arrays.toString(weights));
        System.exit(0);
    }
}

And if we put it all together...

package perceptron;
import java.util.Arrays;

public class PerceptronLearningRule {
    public static void main(String args[]){
        double threshold = 1;
        double learningRate = 0.1;
        // Init weights
        double[] weights = {0.0, 0.0};
        
        // AND function Training data
        int[][][] trainingData = {
            {{0, 0}, {0}},
            {{0, 1}, {0}},
            {{1, 0}, {0}},
            {{1, 1}, {1}},
        };
        
        // Start training loop
        while(true){
            int errorCount = 0;
            // Loop over training data
            for(int i=0; i < trainingData.length; i++){
                System.out.println("Starting weights: " + Arrays.toString(weights));
                // Calculate weighted input
                double weightedSum = 0;
                for(int ii=0; ii < trainingData[i][0].length; ii++) {
                    weightedSum += trainingData[i][0][ii] * weights[ii];
                }

                // Calculate output
                int output = 0;
                if(threshold <= weightedSum){
                    output = 1;
                }

                System.out.println("Target output: " + trainingData[i][1][0] + ", "
                        + "Actual Output: " + output);
                                
                // Calculate error
                int error = trainingData[i][1][0] - output;
                
                // Increase error count for incorrect output
                if(error != 0){
                    errorCount++;
                }
                
                // Update weights
                for(int ii=0; ii < trainingData[i][0].length; ii++) {
                    weights[ii] += learningRate * error * trainingData[i][0][ii];
                }

                System.out.println("New weights: " + Arrays.toString(weights));
                System.out.println();
            }

            // If there are no errors, stop
            if(errorCount == 0){
                System.out.println("Final weights: " + Arrays.toString(weights));
                System.exit(0);
            }
        }
    }
}
Perceptron learning rule source

Bias units

In our last example we set our threshold to 1, this means our weighted input needs to equal or exceed 1 to give us an output of 1. This is okay when learning the AND function because we know we only need an output when both inputs will be set, allowing (with the correct weights) for the threshold to be reached or exceeded. In the case of the NOR function however, the network should only output 1 if both inputs are off. This means if we have a threshold of 1 there isn't a combination of weights that will ever make the following true,
x1 = 0
x2 = 0
1 <= x1w1 + x2w2


There's a simple fix to this though, a bias unit. A bias unit is simply a neuron with a constant output, typically of 1. Bias units are weighted just like other units in the network, the only difference is that they will always output 1 regardless of the input from the previous layer, this is where they get their name! So why are they important? Bias inputs effectively allow the neuron to learn a threshold value. Consider our previous equation, with a bias input (x0) added we can change it to,
x0 = 1
x1 = 0
x2 = 0
1 <= x0w0 + x1w1 + x2w2


Now we can satisfy that equation. You can try this yourself by updating your perceptron training set to train for the NOR function. Just add a bias input to the training data and also an additional weight for the new bias input. Here is the updated code:

// Init weights
double[] weights = {0.0, 0.0, 0.0};

// NOR function training data
int[][][] trainingData = {
    {{1, 0, 0}, {1}},
    {{1, 0, 1}, {0}},
    {{1, 1, 0}, {0}},
    {{1, 1, 1}, {0}},
};

To be continued...

Hopefully you should now have a clearer understanding about the types of learning we can apply to neural networks and the process in which a simple, single layer perceptrons can be trained. In the next tutorial we will be learning how to implement the back propagation algorithm and why it's needed when working with multi-layer networks.

Comments


Read More

1 2 3 4 Next »