Victoria Pruim, Zachary Parks, Simon Liu

iObstacle

April 24th, 2018

University of North Carolina, Wilmington

Abstract

There are many applications, from video game design to the planning of city infrastructure, where being able to find a path (i.e., “pathfinding”) between two points in space, virtual or real, would be a valuable tool. This process of pathfinding, in one of its simplest forms, can be further investigated by attempting to find a solution to a relatively simple and familiar problem, solving a maze.

I. INTRODUCTION

This paper describes an experiment performed to compare the computational complexity of three algorithms used to find the shortest series of moves to create a path from a set start position to a set end position, for an object to travel on a square grid which contains obstacles (a maze).

Fig. 1 Example of maze unsolved and solved

II. FORMAL PROBLEM STATEMENT

In order to state our problem formally, we must define the maze, start/end points, squares, obstacles, moves, and paths. A maze can be visually represented as a orthogonal square grid of two sets of evenly-spaced parallel lines at perpendicular angles to one another. In a grid of size n2 squares, all of the squares(s) on the grid are represented by an ordered pair. Those ordered pairs are in the set S where:

S={ (0,0), (0,1), .... (0,n-1), (1,0), (1,1), …. (n-1, 0), (n-1,1), … (n-1,n-1)}

s ∈ S, s=0 or s=1. If s=0, then s represents an obstacle, if s=1, then s represents an open path. Our grid will be represented by a 2D array populated with 0’s and 1’s. The start position is a square on the grid and is the object’s first current position. The end position is the goal square, where object is attempting to find. The object’s size is one square. A move is a transition from one square with a value of 1 to anadjacent square with a value of 1.

Two squares are adjacent if for one square with coordinates (i,j) where both i and j are in the range (0,n) , the other square’s coordinates are (i,j+1) , (i, j-1), (i+1, j) , or (i-1,j).

m = { (si, sj) | si∈S, sj ∈ S, si ≠ sj, and si , sj = 1}.

A path is a series of moves, such that the second square in one move is the first square of the following move.

P = { (s0 , s1) ,(s1 , s2) , …., (sr-2 , sr-1) } = {m0, …., mr-1}

∃ a path, P, such that the first square in the first move is the start square and the last square in the last move is the end square.
Q is a list of the lengths of all tested paths that go from Start to End. Length of a path is defined by the number of moves in the path. A shortest path is a minimum on Q.
Q = { len(P0), len(P1), ….len(Pq-1)} q = number of paths tested
Pshortest = min(Q)

III. ALGORITHMS

In testing our algorithms for this problem, there must be a possible path from start to finish and the object can only move up, down, left, or right, NO diagonal movement. The algorithms we are analyzing are A* Search, Dijkstra’s Algorithm, and Breadth First Search.

IV. DIJKSTRA’S ALGORITHM

Dijkstra’s algorithm is an uninformed graph search algorithm used to find the shortest path from a starting node to a goal node in a weighted graph. In a graph-based representation of a maze, where the weights of the edges of the graph-based representation of a maze are all equal to one, Dijkstra’s algorithm can be used to find the shortest solution in the maze.

Dijkstra’s algorithm works by creating a shortest path tree from the source node to all other nodes in the graph. Starting from the source node, this shortest path tree is generated by examining each of the current node’s adjacent nodes, labeling them with the total weight of the edges between themselves and the source node. These adjacent nodes are then put into a priority queue, based on their distances from the source node. In the next step, a new current node is chosen from the queue, recording the previous node and repeating the process until a goal is reached or until every node has been examined.

V. DIJKSTRA’S PSEUDOCODE AND COMPLEXITY

The running time bounds for Dijkstra’s algorithm is a function of the number vertices, |V|, and the number of edges, |E|, of the graph. The tightness of these bounds largely depends on the way the priority queue is implemented - the time complexity of Dijkstra’s can be improved by using more efficient data structures. The running time complexity of Dijkstra’s, regardless of the queue implementation, can be represented as O(|E|* Tpc + |V|* Tem) where Tpc and Tem are complexities of the prioritization change and extract minimum operations of the queue. The pictures below are snippets of the relevant pseudo-code alongside the time complexity analysis for the implementation of Dijkstra’s algorithm used in this study. In this implementation, the priority was queue implemented as a red-black balancing tree, with the Tpc and Tem operations both having a running time bounds of O(log|V|).

(Larger image available in Appendix A)

Fig. 2 Breakdown od Dijkstra’s pseudo code

It can be seen from the above image that the time complexity of the implementation of Dijkstra’s used in this study was O(|E| + |V|lov|V|).

VI. A* HISTORY

In 1964 Nils Nilsson invented a heuristic based approach to increase the speed of Dijkstra’s algorithm (A1). In 1967 Bertram Raphael made dramatic improvements upon this algorithm, but failed to show optimality (A2). In 1968 Peter E.Hart slightly proved that A2 was optimal when using a heuristic with only minor changes (A*).

VII. A*

The cost function is defined as f(n) = g(n) + h(n), where:
n = square on path (ordered pair)
g = cost of path from start to n
h = heuristic/estimated cost of path

from n to end

For two squares (x,y) and (i,j), the h cost is calculated by ( |i-x| + |j-y| )

“As A* traverses the graph, it follows a path of the lowest known cost, Keeping a sorted priority queue of alternate path segments along the way. If, at any point, a segment of the path being traversed has a higher cost than another encountered path segment, it abandons the higher-cost path segment and traverses the lower-cost path segment instead. This process continues until the goal is reached.”

IIX. A* PSEUDOCODE AND COMPLEXITY

The time complexity of the A* algorithm depends on how accurate the heuristic function is at estimating the cost from any node to the end node. The worst case complexity for a maze with no obstacles is exponential with respect to the number of squares total and the number of nodes expanded for each square. The time complexity is polynomial when the maze has a single end state and “the heuristic function meets |h(x)-h*(x)| = O(log h*(x)), where h* is the optimal heuristic” ( the exact cost to get from square x to the end goal). The difference between the worst case and best case time complexities is substantial. Accurately estimating a heuristic cost, or at least not overestimating the heuristic is a crucial element of A* search.

Below is the pseudo code with a breakdown of the worst case Big O analysis.

(Larger image available in Appendix A)

Fig. 3 Breakdown of A*’s Pseudo code

IX. DIJKSTRA VERSUS A*

It is apparent that the A* and Dijkstra’s algorithms are very similar in the way that they operate. The major difference between the two algorithms is that Dijkstra’s does not use a heuristic in its search method – it is what is known as an uninformed search algorithm. Dijkstra’s algorithm will subsequently search all of the available nodes, unless the goal is found, or the process is terminated prematurely, for a path. The A* algorithm, on the other hand, does use a cost heuristic to make an informed decision on what path should be taken to find the shortest path to the goal. The use of the heuristic in the A* algorithm allows for the discovery of a shortest path to the goal to be made without having to unnecessarily explore the paths that are not solutions.

X. BFS HISTORY

In 1945 BFS was invented by Konrad Zuse and Michael Burke, but was not published until 1972. In 1959 BFS was reinvented by Edward F. Moore to find the shortest path out of a maze. In 1961 BFS was developed into a wire routing algorithm by C. Y. Lee.

XII. BFS

Breadth First Search (BFS) is an algorithm for traversing or search tree or graph data structures.

BFS is complete and is guaranteed to find the best solution (if one exists) within a finite amount of time.

BFS starts at the tree root and explodes the same level parent nodes first, before moving to the next children level. The algorithm will go over all the paths until it reaches the ending point, which will cause the process to stop since the ending point does not have any children.

To keep track of the references for all child nodes, BFS uses a queue (Q). First, BFS sets the starting node of the maze to the root of the tree and adds it to Q. Then, it removes the root from Q and repeats the process with every child (checks if the node is the end if it is, terminates the process and returns the path).

XII. BFS PSEUDOCODE AND COMPLEXITY

BFS’s complexity is also based on the number of edges and vertices in the graph. The algorithm will expand every level of the tree before moving to the next level.

(Larger image available in Appendix A)

Fig.4 Breakdown of BFS’s pseudo code

XIII. EXPERIMENTAL PROCEDURE

Each algorithm will be timed on mazes of varying sizes and densities in order to evaluate their efficiency. For each density in set d = {10%, 20%, 30%, 40%} we will:

-Create mazes with size n2 for n in range 10 to 100 incrementing by 10

-Time how long it takes Dijkstra, A*, and BFS to solve each maze

10%

20%

30%

40%

Fig. 5 Examples of mazes with different densities

XIV. RESULTS

The first figure below shows the data obtained from measuring the running times of each individual algorithm on 10 mazes of different sizes and varying densities. The second figure shows a zoomed in picture to compare the running times of the A* and BFS algorithms. The running time for A* and BFS are very close to each other. BFS starts faster for smaller mazes but then A* becomes faster for bigger mazes, although this trend was true for all sets of mazes of all densities

Fig. 6 A line graph showing run times for all three algorithms for different maze densities

Fig. 7 A line graph showing run times for A* and BFS algorithms for different maze densities

The two figures below show the running time results of Dijkstra’s algorithm as well as a plot of these results vs. maze density. Similar to the results obtained from the other algorithms, the time required to find a path in a maze was inversely related to the maze’s density.

Fig. 8 A table and line graph showing run times Dijkstra’s algorithms for different maze densities

The results obtained from the A* algorithm are shown below. A* run times across different densities show the same general pattern as Dijkstra. Low density mazes has the slowest run times, and high density mazes had the shortest run times. Slower run times for les dense graphs aligns with the time complexity mentioned above for A* search, graphs with less obstacles and therefore more searchable squares will have exponential runtimes. The unevenness of the graph can be attributed to the small sample size of the data. For each size and density we only created and tested one maze. An average of more mazes for each size/density pair would show a more consistent graph.

Fig. 9 A table and line graph showing run times A* algorithm for different maze densities

The results from BFS matched the trends shown in Dijkstra and A*. Overall the less dense mazes had slower run times while the more dense mazes had faster run times. This is consistence with the big O analysis for the pseudo code because if there are overall less vertices and edges to be added to the tree in the search for the final path.

Fig. 10 A table and line graph showing run times for BFS algorithm for different maze densities

XV. CONCLUSIONS

The time complexities of these three algorithms are all functions of the number of edges and the number of vertices that represent a given maze

Two conclusions drawn from our data are that:

-As maze size increases, running time increases.

-As maze density increases, running time decreases.

These conclusions are logical. The larger the maze, the more vertices and edges there are per graph, thus increasing run time. The more dense a maze is, the less vertices and edges there are, as many of the squares in the grid are obstacles and will not be examined in search for the final path.

XVI. HOLES

Dijkstra and A* are both greedy algorithms. They perform more efficiently than BFS when the cost per move is not constant. In this case with each step having a cost of 1, Dijkstra’s and A*’s performances were both decreased.

Fig. 11 Example of a weighted graph

Graphs as seen above illustrate one type of maze where Dijkstra and A* would perform more efficiently than BFS. Each green circle represents what one square in our graph does, and the goal would be to find a path from circle A to circle E. The lines from one circle to another show the costs of each move. Greedy algorithms would explore the path form A-B before A-C because the cost of 10<15.

XVII. FUTURE WORKS

Path finding algorithms are already in practice today and have many uses.

-Robotics

-Self learning navigation

-Gaming

-Routing protocols

-Hardwire wiring

-City/ transportation planning

In order to continue the studies presented in this paper one possible test would be to build 3D mazes in order to see how adding a dimension to the graph would change the results. One improvement on Dijkstra would be to use different data structures to represent the priority queue used to expand squares. Using different heaps may decrease run time and decrease memory complexity as well.

XVII. QUESTIONS

1. What is the name of the estimated cost from any square to the end square?
- Heuristic cost

2. What is an error in A* which will lead to a worse time complexity ?
- Overestimating the heuristic cost

3. Why are A* and Dijkstra’s considered “greedy” algorithms?
- They both search for the shortest

traversal path through a graph based

on making the locally optimum

choice with the goal of finding a

global optimum

4. The complexity of graph search algorithms such as BFS, Dijkstra’s, and A* is a based on a function of what?
- The complexity is a function of the

graph’s number of edges and vertices

5. Solve for |V| and |E| of the map below, where |V| is the number of vertices and |E| is the number of edges, using BFS. (V0: start point, V6: end point)

REFERENCES

[1]

[2]

[3]

[4]

[5]

[6]

[7]

Appendix A

Fig. 2

Fig. 3

Fig. 4