Monday, March 9

A*

A* finding the shortest path between two points in a grid.
A* (A star) is an algorithm that has applications for finding the shortest path between two points in Euclidean space. It is a special case of Dijkstra's algorithm for finding the minimal traversal between nodes in a weighted graph. The primary difference between Dijkstra's algorithm and A* is the application of a heuristic which favors expansion of certain nodes based on their minimum (absolute) distance to the destination node. Those Wikipedia links provide good summaries of both algorithms. If this sounds like a foreign language, I can promise you that it is interesting and worthy of exploration. In this post, I will attempt to give some background, describe some motivations for using A*, and demonstrate the details of the algorithm in my own words.


Graphs

Example graph
A graph is a structure that represents the connectivity of a set of objects. There are two primary types of entities in a graph: edges and nodes. Nodes represent the intersection of edges, they usually indicate points of interest but their meaning can be more abstract. Edges represent a link between two nodes; by definition, they span exactly two nodes. Graphs come in a variety of flavors, but the most basic representation is a list of edges identifying the pairs of nodes that they span. For instance, imagine that an edge is defined by a pair of numbers, with each number being unique to a particular node. The graph depicted to the right would be defined by the following set: [6,4], [4,5], [4,3], [3,2], [2,5], [1,2], [5,1]. The edge [1,2] would indicate an edge spanning nodes 1 and 2. Typically, an edge is defined by an unordered pair, this means that the order [1,2] or [2,1] does not matter. On a directed graph, there is a difference between [1,2] and [2,1]; there is a particular meaning to the order or direction hence the qualifier directed. If not specified, a graph is assumed to be undirected*.

*While undirected graphs are the "default version" they are actually instantiated by directed graphs. That is, an undirected graph can always be represented by a directed graph while the converse is not necessarily true.

Rather than storing a list of edges, a graph can also be represented by a matrix. There are two standard matrix representations: the adjacency matrix and the incidence matrix. The adjacency matrix is always square with size equal to the number of nodes in the graph. Each element in the matrix is associated with a particular edge whose indices indicate the nodes that it spans. The adjacency matrix of an undirected graph will always be symmetric. The image below depicts an adjacency matrix for both directed and undirected graphs. Notice that the undirected graph's adjacency matrix is symmetric.

(left) A directed graph and corresponding adjacency matrix.
(right) An undirected graph and corresponding adjacency matrix.

Depending on the number of edges in a graph, the adjacency matrix may turn out to be relatively sparse (most of the matrix containing zeros). For a large graph with a sparse adjacency matrix, it may be more efficient to use the incidence matrix representation. With this representation, the columns correspond to edges and the rows correspond to nodes in the graph. By definition, because each column represents an edge, it must contain exactly two non-zero elements indicating the nodes that the edge spans. Take a look at the adjacency and incidence matrices below. Both matrices describe the "example matrix" at the beginning of this section.

(left) Incidence matrix representation of the "example graph" above.
(right) Adjacency matrix representation of the "example graph" above.


It is also possible to have weights associated with edges. A typical piece of information that is represented by a weighted edge is the distance between the nodes that it spans. A graph traversal is a sequence of connected nodes that span two particular nodes. In the example graph, the sequence [6-4-5-1] traverses the graph between nodes 6 and 1. Note that [6-4-3-2-1] is also a traversal. Both of these sequences can be scored by summing the weights of the edges. In the current example, assume that each edge had an equal weight w; the first traversal [6-4-5-1] would have a weight of 3w while [6-4-3-2-1] would have a weight of 4w. Both traversals have the same start and endpoint but one of them has a higher cost. What if we wanted to find the minimum-cost traversal between two nodes in a graph?

If each edge had an equal weighting, the minimum-cost traversal will be the first result of a breadth-first search. A breadth-first search starts at the original "root" node and expands it by adding each of its adjacent nodes to a frontier list. On each iteration, it steps through the list and expands them by adding their unexplored adjacent nodes to the list. It also removes each node from the list as it is expanded. In this way, the algorithm only has to store those nodes that are on the frontier of the search. The process terminates when the destination node is discovered. Breadth-first search is considered complete because it is guaranteed to find a traversal if it exists (even if the graph is infinitely large).

A* on a grid with more bulbulous obstacles.
For weighted graphs with unequal edge weights, Dijkstra's algorithm is the optimal method of determining the minimal-cost traversal. Similarly, for graphs whose nodes have associated euclidean coordinates, A* is the optimal search method that is also complete. By "associated euclidean coordinates" I mean that the nodes of the graph represent real points in an environment. If a path exists between the origin and destination in a graph, not only is A* guaranteed to find it, it will do so in the shortest possible number of steps. There may be other path finding algorithms that perform better than A* but they will not be complete, or they will only perform better under restricted conditions.

There is a whole lot more to learn about graphs and their relationship to the rest of mathematics. Practically, they are a useful tool for modeling systems. They are particularly prevalent in the fields of robotics, computer science, and optimization.


Why graphs?

You might be thinking, "That is all well and good, but I don't live on a graph, what does this have to do with the real world?" It turns out, most navigational structures that we are familiar with can be represented as graphs without losing too much information. In the context of navigating a road, place a node on any intersection, destination, entrance, or exit and draw edges between them. You now have a graph representing a system of roadways. Most public transit systems have some reliable set of routes and transfer points that can be represented in the form of a graph including subway systems, trains, buses, and even airport networks. For the purpose of navigating the interior of a building, put a node at the center of each room and draw an edge between those with doorways connecting them. You now have a graph summarizing the navigationally important features of the floorplan.

3D Voxel grid from the
popular Minecraft PC Game.
2D occupancy grid generated
from an indoor environment.
https://www.cs.cmu.edu/~minerva/maps.html

If you want to represent an environment in greater detail. You can represent a 2-D layout with a grid or even a 3-D environment with a voxel grid. A grid is easily represented with a special type of graph, a lattice graph. A lattice graph has a familiar diagonal structure and is usually sparse. Any obstacles in the grid or regions that cannot be traversed can be marked as such or simply omitted from the graph altogether. One particularly nice feature of a lattice graph is efficiency. Because elements in a grid (with the exception of edges) have the same connective structure, they don't need to be explicitly defined. This can save a lot of working memory when dealing with lattice graphs in software.

An A* implementation with GNU Octave

It works "a pretty-prooty good".
I decided to implement A* in GNU Octave. It's been a few months between implementing this algorithm and writing this post, so I may get a few things wrong when describing things. In the above .gif, you can see the algorithm in action. The image depicts a grid with white elements representing traversable space and black elements representing obstacles. On initialization, the user selects an origin and destination node. The starting element is indicated with blue and the destination with green*. While the algorithm runs, the elements that are stored on the frontier stack are colored with green and the elements that have already been evaluated are colored with a magenta/yellow gradient depending on their distance from the origin node. When a traversal is found, it is traced in blue.

*There were some problems during recording that prevented these elements from being displayed properly in the gifs.

Observe how the frontier begins to propagate around the
first obstacle.
The distance between elements is 1 for horizontally and vertically adjacent elements and sqrt(2) for diagonally adjacent elements; this is just the distance between their centers. The absolute distance between each node and the destination is also computed as the distance between their centers. The main feature of this euclidean distance computation is that it does not depend on the adjacency of the nodes and the destination; it is part of the heuristic which makes A* optimal for this class of search problems.

For some of these sample runs, the coloration is more
apparent. I intended for the color to change from magenta
to yellow with distance.
The algorithm works like breadth-first search. Nodes are expanded if they represent the shortest distance from the origin/root. For each iteration, the nodes on the frontier stack are scored according to their accumulated distance from the root. In addition to this accumulated distance score, a heuristic is applied that augments their total score. The heuristic assumes that a "shortcut" exists between each node and the destination. The distance of this shortcut is the absolute distance from the destination; because the nodes in this graph are embedded in euclidean space, the total score after being augmented by the heuristic can be thought of as the minimum possible traversal if there were no obstacles between the destination and the current node. Because of this heuristic, nodes that make progress towards the destination are favored over those that move away from the destination.

This was almost a straight shot with the exception
of the small circle in front of the destination.
To emphasize the importance of this heuristic, consider Dijkstra's algorithm. If you were to depict Dijkstra's algorithm on a similar lattice graph, the frontier would resemble a circle expanding away from the origin/root. This is because a circle happens to describe the set of points that are all equally distant from the root in euclidean space. Because of A*'s absolute euclidean heuristic, the frontier moves rapidly towards the destination and appears to flow around obstacles, seeking the shortest path.

It is worth mentioning that, due to the grid's connective structure, the shortest path between two points is restricted to combinations of horizontal, vertical, or diagonal edges. In reality, a straight line is the shortest distance between two points but on a grid with this connective structure, the shortest distance is restricted to something more akin to the manhattan distance.

This animation really depicts the power of the heuristic
for influencing the shape of the frontier boundary.
Although A* is optimal, this code certainly is not. If I were to re-write this implementation, I would separate the graphics from the logic and allow for more flexibility in the graph representation. The current implementation requires a lattice graph generated from a regular grid which limits its utility outside of a demonstration tool. I would also prefer to use a language like C/C++ or Python that is more suited to "looping" than Octave/Matlab. Loops are relatively slow in Octave/Matlab; unless you can vectorize this algorithm, they aren't particularly suited for running it.

Notice how the solution speeds up after the unexplored
space outside of the final corridor has been filled.
Other ways of improving performance could include segmenting the grid according to its structure;redefining the graph in a way that summarizes important information while doing away with redundant information. Possible approaches to segmenting the grid include creating a Voronoi Diagram of the boundaries and using it as the graph representation or compressing the grid using a Quadtree structure before generating an augmented lattice graph. Both of these approaches have potentially adverse affects on the resulting path as they reduce the resolution of the space that the graph structure describes.

If you are interested in checking out the algorithm, you can download it from my Github Repo.

In order to run the algorithm, you need to download GNU Octave by following the instructions here. The implementation utilizes some Octave-specific syntax so it will not run with Matlab.

I you have any questions or criticisms, please comment below. Thank you.