This project is inspired by The Martian. It's not the first project of mine that is, and it won't be the last, but it is the first one I finished. In both the movie and the book, there is a scene in which Mark Watney plots a course from his habitat to the resting site of the defunct Mars Pathfinder lander on a satellite image of Mars. What I found interesting is that there are plenty of really neat algorithms for finding extremely efficient paths between two points when the obstacles are known (like the craters on a satellite map), but a person can draw a pretty good linear spline path very quickly.

Also, path tracking for the really curvy paths produced by relaxation and potential field methods tends to be a little messy and inconvenient anyway. So I decided I wanted to try creating an algorithm that was fairly quick and aimed at generating a linear spline path, so that each individual trajectory would run toward the next intersection of two linear segments. I also wanted to approach finding this kind of path like a human being would. People don't usually employ high-level, field-based methods when planning routes; they use visual logic, so that's what I wanted to implement.

The basic idea, then, was to take the satellite image, find all the craters and obstacle-like features through image segmentation, note some paths around the obstacles blocking the direct line between the starting point and the destination, and convert those paths to straight line segments. It naturally turned out to be more difficult than it sounds.

Here's the image I chose to work from:

The first part was to find the craters and things. This aspect was pretty open-loop, as judging the placement of obstacles on Mars is, from my perspective, mostly guesswork. It is naturally easy to just look at the image and note where the obstacles are, but it's a little difficult to say for sure what should and shouldn't be considered an obstruction from a satellite image. Especially considering that the tactic I used was visual segmentation, there is a chance that things that ought to be considered obstacles are not included, and things that shouldn't be are. But visual inspection suggests to my intuition that it does a fair job. In any case, in a real implementation the reconnaissance would probably be better, and for all intents and purposes we just need a test case for the pathfinding algorithm.

The segmenting I did was pretty rudimentary; like I said, the main point was to have a sample set of obstacles. The image processing started with a conversion of the satellite image to black and white, with the aim of using the general intensity within a region to determine whether it represented an obstruction. The tactic I employed was to generate from the intensity image a histogram of the light levels within the image. The reasoning here was that the 'background' of the image would make up the largest histogram region by far, and the obstacles a significantly lower proportion of all pixels.

Using the histogram, it was possible to select out the background by including only pixels outside the most popular histogram bin. Naturally, the number of bins chosen has a distinct impact on how the groupings of different intensities form up, but having tried a couple of different numbers, I found that a default of 10 bins worked fine, so it was left at that. To create the baseline comparison, I elected to look at the integral over the histogram, computing the weighted average pixel value by summing the center value of each bin multiplied by the count of pixels within that bin, and dividing by the total number of pixels. This gives a center point, S, to compare to, and we select out a range around it. I went with a 1/3 margin, so that the pixels included in the final obstacle mask are those with intensities less than 66% of S or greater than 133% of it. We also blur the image selected by these conditions in order to smooth the noise left by hard thresholding, and apply binary closing (a standard morphological operation) to eliminate the remaining porosity of the selected regions.
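The segmentation described above can be sketched in a few lines of NumPy and SciPy. This is a minimal reconstruction from the description, not my actual code; the blur sigma and closing structure are illustrative guesses.

```python
import numpy as np
from scipy import ndimage

def obstacle_mask(gray, bins=10, margin=1/3, blur_sigma=2):
    # Histogram of intensities; the dominant bin is assumed to be background.
    counts, edges = np.histogram(gray, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    # Weighted-average intensity S over the whole image.
    S = np.sum(centers * counts) / counts.sum()
    # Keep only pixels well outside the 1/3 band around S.
    raw = (gray < (1 - margin) * S) | (gray > (1 + margin) * S)
    # Blur to smooth the hard-threshold noise, re-threshold, then
    # binary-close to fill the remaining porosity of the regions.
    smooth = ndimage.gaussian_filter(raw.astype(float), sigma=blur_sigma) > 0.5
    return ndimage.binary_closing(smooth, structure=np.ones((5, 5)))
```

Given a grayscale image as a 2D array, this returns a boolean mask with True where the intensity differs significantly from S.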

The end result is a mask with 1s in the locations where the image is significantly different from S, and 0s elsewhere:

In the interest of verifying that this is a sensible map of obstacles, let's look at it overlaid on the original image:

On the left, we have obstacles stripped out, leaving only background, and on the right, only obstacles shown. I wouldn't propose this alone as an obstacle identifying algorithm for analysis of satellite imagery, but it seems, qualitatively speaking, to do a pretty good job. The final result is, in any case, exactly what we want: an example case of an obstacle label on a map which we can build a path around.

That naturally leads to the question of how we're going to construct the path. Because I wanted to build a path the way a person would draw one on the map, the natural starting point seemed to be the shortest possible path: a straight line between the two points. So we start with that and then work our way around the obstacles we run into. Here's the line we'll be looking at:

Note that we've colored the image a little to help with identifying different features. The obstacles, naturally, are in green, and the straight-line path is in yellow, with terminal points in red and blue. I chose to display this pair of points because it covers a long range and, in doing so, demonstrates all the techniques I applied in one setting. That is to say, there's nothing special about this path, except that it encounters enough problems to be a good example. Also, it's not the only pair I tried, just my favorite among many.

So, the most immediate problem is that this path runs right into several obstacles. We'd like to adjust it so it doesn't. The first step is identifying which obstacles are in the way. In order to distinguish the regions that represent different obstacles, we need to label them individually. This is easy to do with a connected-component labeling algorithm; the result is an array corresponding to the image, with each connected blob marked with an index. This allows us to walk along the line shown above and note the index of each blob the line intersects, building a list of all problematic obstacles. Once we have this list, we can use it to build a new obstacle mask which includes specifically those obstacles the crow's-path line intersects:
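The labeling-and-intersection step can be sketched as follows, using SciPy's `ndimage.label`. The function name and the dense line-sampling scheme are mine, assumed for illustration; coordinates are (row, col).

```python
import numpy as np
from scipy import ndimage

def blocking_obstacles(mask, start, end):
    """Label connected blobs in the obstacle mask, then return a new
    mask containing only the blobs intersected by the straight line
    from start to end."""
    labels, _ = ndimage.label(mask)
    # Sample the line densely enough to visit every pixel it crosses.
    n = int(np.hypot(end[0] - start[0], end[1] - start[1])) * 2 + 1
    rows = np.linspace(start[0], end[0], n).round().astype(int)
    cols = np.linspace(start[1], end[1], n).round().astype(int)
    hit = set(labels[rows, cols]) - {0}  # label 0 is background
    return np.isin(labels, list(hit))
```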

These are the obstructions around which we will have to navigate. In order to generate a path, we're going to find a way to maneuver around these craters. The simplest approach would be to track around the outside edge of each obstacle until reaching the pre-existing path. That sort of tactic is appropriate when the size and shape of the offending impediment is unknown, as in the line-following-with-obstacle-avoidance behavior employed in my robot, Mordax. In this case, we know the lay of the land in advance, so it seems a little odd not to make use of that information. We will begin, however, by extracting this edge-following path to use as an initial template for our later, more efficient path.

To get the outer boundary of the obstacles, which we will use in formulating the initial path to refine, we use a gradient-based edge detection approach, which looks at the rates of change in the X and Y directions across pixels within the image. The result of mapping the derivatives over the image is a new image in which the only areas with high values are those on a border between occupied and unoccupied pixels, i.e., the edges of the blobs. These edges are illustrated in magenta on the image below:
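A minimal sketch of this gradient-based boundary extraction, assuming a boolean obstacle mask (the helper name is mine, not from the original code):

```python
import numpy as np

def blob_boundary(mask):
    # Differentiate the binary image in X and Y; only pixels sitting
    # on a border between occupied and unoccupied regions get a
    # nonzero gradient. Intersecting with the mask keeps the edge on
    # the occupied side of the border.
    m = mask.astype(float)
    gy, gx = np.gradient(m)
    return (np.hypot(gx, gy) > 0) & mask
```

Note that `np.gradient` uses central differences, so interior pixels of a solid blob come out exactly zero while its rim does not.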

One might immediately notice an issue: in each of these blobs, the crow's-flight line bisects the boundary into two segments. Obviously, we want to include the shorter segment in our path, but how will we select which to include? And, to make matters worse, what if the line cuts through a wavy stretch of boundary, resulting in many segments?

In practice, it makes more sense to solve the second problem first, as the segment inclusion problem is made more complicated when there are many segments to consider. It would be tempting to say that multiple intersections could be resolved by removing the longest segment, but one can imagine an oddly-shaped edge which defies this, such as below:

For this weirdly shaped thing, we can see that pitching out the inner middle section, which is longer due to the zigzagging shape, would result in a strange path like the one on the left. Even if we were planning on just supplying an edge-following path, this would be difficult to interpret, due to the truncated portions on the bottom half. One possible approach, were we aiming for the edge-following path, would be to label all segments individually, search through all combinations of them for sets which complete the path, and take the minimum-distance case. However, this is a combinatorial approach, which we really ought to avoid in general.

Rather than trying multiple implementations to search efficiently among these segment paths, which would in any case complicate the linear reduction later, it's worth noticing what we actually want. In the above image, for instance, the two small south-side lobes would be preferable, since a straight line can be drawn between their peaks that is much more efficient than going around the outer edge. In light of this observation, I chose to take a different approach to path simplification.

What we really want here is a tight circumscription of the obstacle, hopping across the highly curved regions rather than following them. To acquire this kind of path, I used some image processing to successively blur and level-threshold each obstacle blob through which the line passes. This process is similar to the image processing technique of dilation, but without a structuring kernel. In fact, my first approach was to apply a dilation operation to the obstacles, but I found that any given aperture size, kernel shape, and number of successive operations tended to perform very well on certain shapes and poorly on others, whereas successive blurring operations tended to work relatively well at an approximated window size for almost all obstacles encountered.

The specific implementation is this: we start with our plot of intersected obstacles and walk along the line from the starting point, checking a smallish (about 20x20 pixel) region around the current location for intersections with a labeled obstacle blob. When we run into one, we select out precisely this blob within the image and compute its boundary. We then remove the intersection between the crow's-path line and this boundary, chopping it into separate regions, and count the regions. If there is exactly 1 region, the line is tangential to the blob and we can ignore it. If there are exactly 2 segments, we compute the length of each (by summing the total number of pixels in each; we assume the width is constant on average, which is very close to true in the approximate case) and include the shorter of the two.

If, however, there are 3 or more regions, we apply a sequential series of blurring operations to the original obstacle, recalculating the number of boundary cut regions each time, until there are only 2, at which point we include the shorter of the two boundary segments. In this way, we eliminate the multiple intersection zones in a quick and fairly simple way, creating a smooth outer boundary. Because we choose the lesser outer boundary, we also tend to get the most convex relaxation of the original outer curve. This is good, because later we'll be using a visibility graph to build the spline path, and visibility along a convex region will tend to lead to a linear cut that bypasses all the interior spaces we'd rather not traverse.
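The blur-and-recount loop might look something like this. It's a sketch under assumptions: the sigma, re-threshold level, and iteration cap are hypothetical parameters (the real window size is tied to the path width, as discussed below), and I count boundary segments with 8-connected labeling.

```python
import numpy as np
from scipy import ndimage

def relax_blob(blob, line_mask, sigma=3, max_iters=20):
    """Blur-and-threshold a single obstacle blob until the straight
    line cuts its boundary into at most two segments."""
    current = blob
    for _ in range(max_iters):
        # Boundary = occupied pixels minus the eroded interior.
        boundary = current & ~ndimage.binary_erosion(current)
        # Remove the pixels where the line crosses the boundary, then
        # count the connected boundary segments that remain.
        cut = boundary & ~line_mask
        _, n_segments = ndimage.label(cut, structure=np.ones((3, 3)))
        if n_segments <= 2:
            return current
        # Relax: blur the blob and re-threshold below 0.5 so the
        # outline is pushed outward, smoothing away concavities.
        current = ndimage.gaussian_filter(current.astype(float), sigma) > 0.3
    return current
```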

On the left, we have the original obstacle, and on the right, an overlay of the original and a relaxed version, illustrating the elimination of the concave region, which would reduce the number of boundary segments associated with a line cutting through that area.

Using this minimal convex curve, combined with the visibility graph, means that the only case in which we can accidentally overlap another obstacle via this process is one in which the gap between obstacle boundaries is less than the width of the blurring window. This is because the relaxation pushes the boundary out towards the line, and we stop as soon as the boundary overtakes the line. At that point the line is fully within the new boundary (one entry and one exit means one line segment within the boundary region), so we will not expand past the line except by an error margin due to discretization of the blur window. Any obstacle intersecting the new boundary must therefore be within the diagonal window width of it.

We can make this problem evaporate by making the blur window's diagonal length equal to the initial path width. We use the diagonal because it's the maximum distance the blur can extend off the line in any direction. Doing so means that any obstacle that comes to intersect the new, relaxed boundary will be at most tangential to the path, and therefore inconsequential. We then add this new boundary to the net path by stitching it in at the two points where it intersects the original line, and we move the line follower (the window that checks for obstacle intersections) to the latter intersection, as though it had followed the path to that point.
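The relationship between window size and path width is just the diagonal of a square; a one-line helper (my naming, for illustration) makes the constraint concrete:

```python
import math

def blur_window_side(path_width):
    # The relaxed boundary can extend off the line by at most the
    # window's diagonal, so choose the square window whose diagonal
    # equals the path width: side * sqrt(2) = path_width.
    return max(1, round(path_width / math.sqrt(2)))
```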

By doing this, all obstacles that are bypassed by taking the relaxed path are avoided, as well as any that entered the set of intersection obstacles that were artifacts of the obstacle identification process. That is to say, any 'islands' within larger obstacles are ignored as a result, which is helpful for getting good paths. The end result of this process is a path within the image which looks something like this:

This is very close to what we set out for as a basis from which to construct our later linear spline.

The next thing we want is to chop this path up into a series of points that an agent could follow. These waypoints will allow us to specify the course as a path, since the current path is totally visual: we selected it by adding in unbounded (no pun intended) boundaries computed with image processing techniques. Basically, at this point we have the starting and ending points, but no coordinates for anything in between except the intersections between the curved and crow's-line sections, which are naturally not terribly useful on their own.

A simple way to get a coordinate string for the path is to set an agent at an endpoint and have it move along the line by detecting where the path is and moving in that direction, marking its absolute location every so often. In practice, this is done using a small window. Again the window width is set by the line width (in this case the side length, because we want the minimum possible width to be the size of the path), and we increase this width a little to account for variation in the width of the path, since it's curved. An intuitive value of 2x the line width seems to work well.

At each advancement step, we look at a window shifted onto each of the 8 pixels surrounding the current one and calculate the number of path pixels intersected by the new window. Whichever shift yields the greatest intersection, we set as the next pixel. We then erase all pixels within the window, to force the agent to progress positively along the path. For the first step this means wiping a large number of pixels; at subsequent steps, the wipe removes only pixels along the new edge of the window in the direction traveled. If we did not perform this erasing, there could be cases in which stepping back to the previous pixel would add as many or more cells to the intersection.

We also set a parameter so that we record the center occasionally, after a certain number of steps, thus breaking the path into a number of waypoints. We should pick this interval to be more than 1 step, because a waypoint at every pixel would yield an absurdly large number of points, but if we select too long a distance, we may experience aliasing in the path. In truth, the proper number of points depends on the nature of the path, but I elected, as we keep doing, to set the interval to the width of the original line. In essence, the original line width has become our fundamental resolution.
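The window-walking agent described above can be sketched roughly as follows. This is a reconstruction from the description, not the original code; the window size and waypoint spacing are illustrative defaults.

```python
import numpy as np

def trace_path(path_img, start, win=5, mark_every=5):
    """Walk along a binary path image from `start`, erasing the window
    behind the agent as it goes, recording a waypoint every few steps."""
    img = path_img.copy()
    r, c = start
    h = win // 2
    waypoints = [start]
    steps = 0
    while True:
        # Erase everything in the current window so the agent cannot
        # step backwards onto pixels it has already visited.
        img[max(r - h, 0):r + h + 1, max(c - h, 0):c + h + 1] = 0
        best, best_count = None, 0
        # Try the window shifted onto each of the 8 neighboring pixels.
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                window = img[max(nr - h, 0):nr + h + 1,
                             max(nc - h, 0):nc + h + 1]
                count = int(window.sum())
                if count > best_count:
                    best, best_count = (nr, nc), count
        if best is None:
            break  # no path pixels remain: we've reached the end
        r, c = best
        steps += 1
        if steps % mark_every == 0:
            waypoints.append((r, c))
    waypoints.append((r, c))
    return waypoints
```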

This path breakdown gives us a set of sequential points to follow, like so (waypoints marked with purple dots):

Note that this path demonstrates the tangential intersection with another blob as described above.

Now we have all these points, and we can also see that we ought to be able to move straight from certain ones to others, bypassing the rest, and in doing so radically decrease the number of sub-points we have to visit as well as the net length of our path. In order to find what jumps we can make, we construct a visibility graph for these points, progressing through each point and calculating, for every succeeding point, the line between the two. It is only necessary to consider succeeding points because the edges are symmetric in either direction. We then check whether that line crosses any obstacle (including those not intersected by the direct line, of course). If it does not, we add the calculated length of that segment to a matrix. This matrix represents the lengths of edges in a graph, navigation of which is navigation among all mutually visible points on the path.
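The visibility-graph construction can be sketched like so, reusing the dense line-sampling trick; the function name and the use of infinity for non-visible pairs are my choices for illustration.

```python
import numpy as np

def visibility_matrix(points, obstacle_mask):
    """Pairwise edge lengths for waypoints that can 'see' each other,
    i.e. the straight segment between them crosses no obstacle pixel.
    Blocked (non-visible) pairs are marked with np.inf."""
    n = len(points)
    dist = np.full((n, n), np.inf)
    for i in range(n):
        for j in range(i + 1, n):  # edges are symmetric, so j > i only
            (r0, c0), (r1, c1) = points[i], points[j]
            steps = int(max(abs(r1 - r0), abs(c1 - c0))) * 2 + 1
            rows = np.linspace(r0, r1, steps).round().astype(int)
            cols = np.linspace(c0, c1, steps).round().astype(int)
            if not obstacle_mask[rows, cols].any():
                d = np.hypot(r1 - r0, c1 - c0)
                dist[i, j] = dist[j, i] = d
    return dist
```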

We can draw this graph on the map above by drawing lines between every pair of visibly adjacent points:

From this point, finding the optimal path is a simple graph navigation problem, which I resolved using Dijkstra's algorithm (you can see, looking at a few of my pages, that it's my favorite algorithm). This results in a fairly straightforward short list of waypoints: the collection of points on the original path which share visibility and connect the starting point to the destination:

This seems to be a qualitatively very satisfying solution. Because it is formed by Dijkstra's algorithm operating on a path which is very close to a straight line, we can also conclude that it is highly efficient, though likely not a truly optimal path. However, it is formed of linear splines, which makes it very easy to specify as a short list of points and very easy to apply a simple tracking algorithm to. It is also quick to calculate, and formed by a suite of suitably intuitive visual operations, as originally intended. As a final check, I present the determined path on the original image, in which context it is rather nice looking.
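For completeness, the graph-search step can be sketched as a standard Dijkstra over the visibility matrix (a textbook implementation, not my original code; non-visible pairs are assumed to be marked with infinity):

```python
import heapq
import math

def dijkstra(dist, start, goal):
    """Shortest waypoint sequence through an edge-length matrix,
    where infinite entries mean the two points are not mutually
    visible. Returns the list of waypoint indices, start to goal."""
    n = len(dist)
    best = [math.inf] * n
    prev = [None] * n
    best[start] = 0.0
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > best[u]:
            continue  # stale heap entry
        for v in range(n):
            if math.isfinite(dist[u][v]):
                nd = d + dist[u][v]
                if nd < best[v]:
                    best[v] = nd
                    prev[v] = u
                    heapq.heappush(heap, (nd, v))
    # Walk the predecessor chain back from the goal.
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]
```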