Slow A* Pathfinding Routine

You are an experienced programmer and have a problem with the engine, shaders, or advanced effects? Here you'll get answers.
No questions about C++ programming or topics which are answered in the tutorials!
frost_bite
Posts: 28
Joined: Wed Dec 03, 2008 11:42 pm

Slow A* Pathfinding Routine

Post by frost_bite »

Hello! I spent the last hour implementing a basic A* pathfinding routine for my code. However, it is very slow (Going from (0, 0) to (64, 64) in a straight line takes 5 seconds to calculate). I am posting it here in hopes that it can be improved to the point it is usable in my project, and other various projects.

So, how can I improve this pathfinding routine?

[Pathfinding.h]

Code: Select all

class Node {
private:
    u32 x, y;
    bool Walkable;
    bool Occupied;
    Node * Parent;
    f32 Cost;
public:
    Node(
        u32 X = 0,
        u32 Y = 0,
        bool walkable = true,
        bool occupied = false
    ) : x(X), y(Y), Walkable(walkable), Occupied(occupied) {}

    inline void setParent(Node * parent) {
        Parent = parent;
    }

    inline void setCost(f32 cost) {
        Cost = cost;
    }

    inline void setLocation(u32 X, u32 Y) {
        x = X;
        y = Y;
    }

    inline void setWalkable(bool walkable) {
        Walkable = walkable;
    }

    inline void setOccupied(bool occupied) {
        Occupied = occupied;
    }

    inline u32 &getLocationX() {
        return x;
    }

    inline u32 &getLocationY() {
        return y;
    }

    inline bool &isWalkable() {
        return Walkable;
    }

    inline bool &isOccupied() {
        return Occupied;
    }

    inline Node * getParent() {
        return Parent;
    }

    inline f32 &getCost() {
        return Cost;
    }
};

class NodePool {
private:
    Node * Pool;
    u32 w, h;
    void getSurroundingNodes(u32 x, u32 y, array<Node *> &nodes);
    Node * getNodeWithLeastCost(array<Node *> &possible);
public:
    NodePool(u32 x, u32 y, bool * WalkableFlags);
    bool findPath(u32 sx, u32 sy, u32 ex, u32 ey, array<Node *> &path);
};
[Pathfinding.cpp]

Code: Select all


#include <irrlicht.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdexcept>
#include <assert.h>

using namespace std;

using namespace irr;
using namespace core;
using namespace gui;
using namespace io;
using namespace scene;
using namespace video;

#include <Pathfinding.h>

#define POOL_ELEMENT(x, y) (&Pool[(x) * w + (y)])

NodePool::NodePool(u32 x, u32 y, bool * WalkableFlags) {
    w = x;
    h = y;

    Pool = new Node[x * y];

    for(u32 i = 0; i < w; i++) {
        for(u32 d = 0; d < h; d++) {
            Node *node = POOL_ELEMENT(i, d);

            node->setLocation(i, d);
            node->setWalkable(WalkableFlags[(i) * w + (d)]);
            node->setOccupied(false);
        }
    }
}

void NodePool::getSurroundingNodes(u32 x, u32 y, array<Node *> &nodes) {
    nodes.clear();

    if(x > 0) nodes.push_back(POOL_ELEMENT(x - 1, y - 0));
    if(y > 0) nodes.push_back(POOL_ELEMENT(x - 0, y - 1));
    if(x < w) nodes.push_back(POOL_ELEMENT(x + 1, y + 0));
    if(y < h) nodes.push_back(POOL_ELEMENT(x + 0, y + 1));
    if(x < w && y < h) nodes.push_back(POOL_ELEMENT(x + 1, y + 1));
    if(x > 0 && y < h) nodes.push_back(POOL_ELEMENT(x - 1, y + 1));
    if(x > 0 && y > 0) nodes.push_back(POOL_ELEMENT(x - 1, y - 1));
    if(x < w && y > 0) nodes.push_back(POOL_ELEMENT(x + 1, y - 1));
}

Node * NodePool::getNodeWithLeastCost(array<Node *> &possible) {
    Node * ret = 0;
    f32 cost = possible[0]->getCost();
    ret = possible[0];
    for(u32 i = 1; i < possible.size(); i++) {
        if(possible[i]->getCost() < cost) {
            ret = possible[i];
            cost = ret->getCost();
        }
    }
    return ret;
}

bool NodePool::findPath(u32 sx, u32 sy, u32 ex, u32 ey, array<Node *> &path) {
    array<Node *> OpenList;
    array<Node *> ClosedList;
    OpenList.push_back(POOL_ELEMENT(sx, sy));
    POOL_ELEMENT(sx, sy)->setCost(0);

    while(OpenList.size()) {
        Node * currentNode = getNodeWithLeastCost(OpenList);
        ClosedList.push_back(currentNode);
        OpenList.erase(OpenList.binary_search(currentNode));

        if(currentNode->getLocationX() == ex && currentNode->getLocationY() == ey) {
            array<Node *> tmpPath;
            Node * node = POOL_ELEMENT(ex, ey);
            while(node != POOL_ELEMENT(sx, sy)) {
                tmpPath.push_back(node);
                node = node->getParent();
                assert(node);
            }

            tmpPath.push_back(POOL_ELEMENT(sx, sy));

            printf("Path found from %ux%u to %ux%u with %u steps!\n", sx, sy, ex, ey, tmpPath.size());

            for(u32 i = 0; i < tmpPath.size(); i++) {
                path.push_back(tmpPath[tmpPath.size() - (1 + i)]);
            }

            return true;
        }

        array<Node *> adjecentNodes;
        getSurroundingNodes(currentNode->getLocationX(), currentNode->getLocationY(), adjecentNodes);

        for(u32 i = 0; i < adjecentNodes.size(); i++) {
            if(ClosedList.binary_search(adjecentNodes[i]) != -1 || !adjecentNodes[i]->isWalkable()) {
                // ignore it
            } else {
                if(OpenList.binary_search(adjecentNodes[i]) != -1) {
                    // its already on the open list
                } else {
                    adjecentNodes[i]->setParent(currentNode);
                    adjecentNodes[i]->setCost(vector2df(sx, sy).getDistanceFrom(vector2df(adjecentNodes[i]->getLocationX(), adjecentNodes[i]->getLocationY())));
                    OpenList.push_back(adjecentNodes[i]);
                }
            }
        }
    }

    return false;
}

[Test.cpp]

Code: Select all

bool WalkableMap[512 * 512];
memset(WalkableMap, 1, sizeof(bool) * 512 * 512);

NodePool * tmpPathfinding = new NodePool(512, 512, WalkableMap);

array<Node *> path;
if(tmpPathfinding->findPath(0, 0, 50, 50, path)) {
    for(u32 i = 0; i < path.size(); i++) {
        printf("--> Step %u: %ux%u\n", i, path[i]->getLocationX(), path[i]->getLocationY());
    }
} else {
    printf("Path not found!\n");
}
Thanks beforehand to any contributors.
Acki
Posts: 3496
Joined: Tue Jun 29, 2004 12:04 am
Location: Nobody's Place (Venlo NL)
Contact:

Post by Acki »

I tested the code and I get a calculation time ~2.6 secs...

well, I didn't try, but this could cause calculation overheads:

Code: Select all

        Node * currentNode = getNodeWithLeastCost(OpenList);
        ClosedList.push_back(currentNode);
        OpenList.erase(OpenList.binary_search(currentNode));
when I use A* I always inserted elements to the open list so it is sorted for cost !!!
I mean element 0 is the one with the least cost and the last element of the array is the one with the most cost...
so I don't need to search for the element with the least cost, I just can use the 1st element (it's the one with the least cost)...
this way all the searches (getNodeWithLeastCost and OpenList.binary_search) are obsolete !!! ;)
you know one quick loop to insert the element at a desired position is much more faster than doing multiple searches on the whole array... :lol:
I guess to sort the open list so the 1st element is the one with the most cost and the last is the one with the least cost can also improve the calculation a bit more...
while(!asleep) sheep++;
IrrExtensions:Image
http://abusoft.g0dsoft.com
try Stendhal a MORPG written in Java
polylux
Posts: 267
Joined: Thu Aug 27, 2009 12:39 pm
Location: EU

Post by polylux »

First off, I didn't go through your code in detail but the A* algorithm can be pretty efficiently implemented.
From what I've seen, you do alot of node-finding and pushing their pointers into separate arrays (like "getNodeWithLeastCost(...)", "getSurroundingNodes(...)",...). Just work on the "global" arrays (I used one to store which fields are obstructed (as you do) or were already visited) you already have and then set the correct backtrace based on your calculations.
Moreover, the A* algorithm can be perfectly implemented as a recursive function.

If it helps, I'd be glad to help you out with my implementation.

€dit: Acki was faster... ;)
beer->setMotivationCallback(this);
Halifax
Posts: 1424
Joined: Sun Apr 29, 2007 10:40 pm
Location: $9D95

Post by Halifax »

I don't know what the general speed of an A* algorithm is supposed to be, but you can do one of two things. You can either break out the profiler and start testing for hot-spots, or you can run your code against JP's implementation of A* in IrrAI and see if it matches up on your system. If the fact of the matter is that your code is generally optimal, then you're just going to have to reduce the complexity of the path you are using to achieve faster search times. That, in it self, can be done a number of ways using different path culling techniques.
TheQuestion = 2B || !2B
aanderse
Posts: 155
Joined: Sun Aug 10, 2008 2:02 pm
Location: Canada

Post by aanderse »

How many A* path nodes are there? You never mentioned what you use as your path nodes.
frost_bite
Posts: 28
Joined: Wed Dec 03, 2008 11:42 pm

Post by frost_bite »

Thank you to everyone who has replied!

Looks like I need to re-write it with pre-allocated arrays rather than constantly resizing the various arrays.

I will post the new version latter today.
Acki
Posts: 3496
Joined: Tue Jun 29, 2004 12:04 am
Location: Nobody's Place (Venlo NL)
Contact:

Post by Acki »

aanderse wrote:How many A* path nodes are there? You never mentioned what you use as your path nodes.
frost_bite wrote:

Code: Select all

bool WalkableMap[512 * 512];
memset(WalkableMap, 1, sizeof(bool) * 512 * 512);

NodePool * tmpPathfinding = new NodePool(512, 512, WalkableMap);
while(!asleep) sheep++;
IrrExtensions:Image
http://abusoft.g0dsoft.com
try Stendhal a MORPG written in Java
randomMesh
Posts: 1186
Joined: Fri Dec 29, 2006 12:04 am

Post by randomMesh »

Code: Select all

Pool = new Node[x * y];
You got a memory leak there since Pool is never deleted.

Code: Select all

for(u32 i = 0; i < adjecentNodes.size(); i++)
This will be faster:
1. You call the method size() every loop iteration. This is expensive. Just store the size once and access it.
2. Use the pre-increment operator rather than post-increment.

Code: Select all

const u32 numNodes = adjecentNodes.size();
for(u32 i = 0; i < numNodes; ++i)
Same with

Code: Select all

adjecentNodes[i]
Calling the [] operator multiple times to get the same Node is unnecessarily slow.

The code itself (not the alogrithm) could be faster if you would mind basic C++ optimization rules.

You could browse the source of MicroPather to get some ideas.
"Whoops..."
frost_bite
Posts: 28
Joined: Wed Dec 03, 2008 11:42 pm

Post by frost_bite »

Thank you for your input.

randomMesh: The optimisations you suggested have now been implemented. However, the "while(OpenList.size())" had to stay as the size changes every loop, and the code needs to know when the size equal to 0.

I have also moved "getSurroundingNodes()" and "getNodeWithLeastCost()" to inline functions to save the function call expense.

EDIT: added new version of code. Slightly faster.

Here is a new (slightly faster) version:

Pathfinding.h

Code: Select all

#ifndef PATHFINDING_H_INCLUDED
#define PATHFINDING_H_INCLUDED

#define POOL_ELEMENT(x, y) (&Pool[(x) * w + (y)])

class Node {
private:
    u32 x, y;
    bool Walkable;
    bool Occupied;
    Node * Parent;
    f32 Cost;
public:
    Node(
        u32 X = 0,
        u32 Y = 0,
        bool walkable = true,
        bool occupied = false
    ) : x(X), y(Y), Walkable(walkable), Occupied(occupied) {}

    inline void setParent(Node * parent) {
        Parent = parent;
    }

    inline void setCost(f32 cost) {
        Cost = cost;
    }

    inline void setLocation(u32 X, u32 Y) {
        x = X;
        y = Y;
    }

    inline void setWalkable(bool walkable) {
        Walkable = walkable;
    }

    inline void setOccupied(bool occupied) {
        Occupied = occupied;
    }

    inline u32 &getLocationX() {
        return x;
    }

    inline u32 &getLocationY() {
        return y;
    }

    inline bool &isWalkable() {
        return Walkable;
    }

    inline bool &isOccupied() {
        return Occupied;
    }

    inline Node * getParent() {
        return Parent;
    }

    inline f32 &getCost() {
        return Cost;
    }
};

class NodePool {
private:
    Node * Pool;
    u32 w, h;

    inline void getSurroundingNodes(u32 x, u32 y, Node * nodes[8]) {
        memset(nodes, NULL, sizeof(Node*) * 8);
        if(x > 0) nodes[0] = POOL_ELEMENT(x - 1, y - 0);
        if(y > 0) nodes[1] = POOL_ELEMENT(x - 0, y - 1);
        if(x < w) nodes[2] = POOL_ELEMENT(x + 1, y + 0);
        if(y < h) nodes[3] = POOL_ELEMENT(x + 0, y + 1);
        if(x < w && y < h) nodes[4] = POOL_ELEMENT(x + 1, y + 1);
        if(x > 0 && y < h) nodes[5] = POOL_ELEMENT(x - 1, y + 1);
        if(x > 0 && y > 0) nodes[6] = POOL_ELEMENT(x - 1, y - 1);
        if(x < w && y > 0) nodes[7] = POOL_ELEMENT(x + 1, y - 1);
    }

    inline u32 getNodeWithLeastCost(vector<Node *> &possible) {
        assert(possible.size());
        f32 cost = possible[0]->getCost();
        u32 ret = 0;
        for(u32 i = 1; i < possible.size(); i++) {
            if(possible[i]->getCost() < cost) {
                ret = i;
                cost = possible[i]->getCost();
            }
        }
        return ret;
    }

    inline u32 countTreeSize(Node * start, Node * end) {
        Node * current = start;
        u32 Count = 0;
        while(current != end) {
            Count++;
            current = current->getParent();
        }
        return Count;
    }
public:
    NodePool(u32 x, u32 y, bool * WalkableFlags);
    ~NodePool();
    bool findPath(u32 sx, u32 sy, u32 ex, u32 ey, Node **&path, u32 &pathSize);
};

#endif // PATHFINDING_H_INCLUDED

[Pathfinding.cpp]

Code: Select all


#include <irrlicht.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdexcept>
#include <assert.h>
#include <vector>
#include <algorithm>

using namespace std;

using namespace irr;
using namespace core;
using namespace gui;
using namespace io;
using namespace scene;
using namespace video;

#include <Pathfinding.h>

NodePool::NodePool(u32 x, u32 y, bool * WalkableFlags) {
    w = x;
    h = y;

    Pool = new Node[x * y];

    for(u32 i = 0; i < w; i++) {
        for(u32 d = 0; d < h; d++) {
            Node *node = POOL_ELEMENT(i, d);

            node->setLocation(i, d);
            node->setWalkable(WalkableFlags[(i) * w + (d)]);
            node->setOccupied(false);
        }
    }
}

NodePool::~NodePool() {
    delete[] Pool;
}

bool NodePool::findPath(u32 sx, u32 sy, u32 ex, u32 ey, Node ** &path, u32 &pathSize) {
    vector<Node *> OpenList;
    vector<Node *> ClosedList;
    OpenList.push_back(POOL_ELEMENT(sx, sy));
    POOL_ELEMENT(sx, sy)->setCost(0);
    Node * adjecentNodes[8];

    while(OpenList.size()) {
        u32 cNode = getNodeWithLeastCost(OpenList);
        Node * currentNode = OpenList[cNode];
        OpenList.erase(OpenList.begin() + cNode);
        ClosedList.push_back(currentNode);

        if(currentNode->getLocationX() == ex && currentNode->getLocationY() == ey) {
            pathSize = countTreeSize(POOL_ELEMENT(ex, ey), POOL_ELEMENT(sx, sy));

            path = new Node * [pathSize];

            Node * activeNode = POOL_ELEMENT(ex, ey);
            u32 count = pathSize - 1;
            while(activeNode != POOL_ELEMENT(sx, sy)) {
                path[count--] = activeNode;
                activeNode = activeNode->getParent();
            }

            printf("Path found from %ux%u to %ux%u with %u steps!\n", sx, sy, ex, ey, pathSize);

            return true;
        }

        getSurroundingNodes(currentNode->getLocationX(), currentNode->getLocationY(), adjecentNodes);

        for(u32 i = 0; i < 8; ++i) {
            Node * currentAdjecentNode = adjecentNodes[i];
            if(currentAdjecentNode != NULL) {
                if(find(ClosedList.begin(), ClosedList.end(), currentAdjecentNode) != ClosedList.end() || !currentAdjecentNode->isWalkable()) {
                    // ignore it
                } else {
                    if(find(OpenList.begin(), OpenList.end(), currentAdjecentNode) != OpenList.end()) {
                        // its already on the open list
                    } else {
                        currentAdjecentNode->setParent(currentNode);
                        currentAdjecentNode->setCost(vector2df(sx, sy).getDistanceFrom(vector2df(currentAdjecentNode->getLocationX(), currentAdjecentNode->getLocationY())));
                        OpenList.push_back(currentAdjecentNode);
                    }
                }
            }
        }
    }

    return false;
}

Last edited by frost_bite on Wed Feb 24, 2010 8:44 pm, edited 1 time in total.
drewbacca
Posts: 38
Joined: Tue Jan 30, 2007 6:49 pm

Post by drewbacca »

Glancing at your code, your getNodeWithLeastCost() function will be a source of slow down since you are iterating through all your available nodes every time. Making this more efficient will have a dramatic effect on the speed of your search.

A great container for such a thing is a heap. It will always make the cheapest node on top w/o the overhead of sorting the entire collection of nodes. This is already implemented for you if you use the stl priority_queue as your container.
randomMesh
Posts: 1186
Joined: Fri Dec 29, 2006 12:04 am

Post by randomMesh »

frost_bite wrote:

Code: Select all

inline u32 &getLocationX() {
    return x;
}
You don't need references to POD (plain old data) types like int but you should make the method const.

Code: Select all

inline u32 getLocationX() const { return x; }
"Whoops..."
Acki
Posts: 3496
Joined: Tue Jun 29, 2004 12:04 am
Location: Nobody's Place (Venlo NL)
Contact:

Post by Acki »

drewbacca wrote:Glancing at your code, your getNodeWithLeastCost() function will be a source of slow down since you are iterating through all your available nodes every time. Making this more efficient will have a dramatic effect on the speed of your search.
Acki wrote:I tested the code and I get a calculation time ~2.6 secs...

well, I didn't try, but this could cause calculation overheads:

Code: Select all

        Node * currentNode = getNodeWithLeastCost(OpenList);
        ClosedList.push_back(currentNode);
        OpenList.erase(OpenList.binary_search(currentNode));
when I use A* I always inserted elements to the open list so it is sorted for cost !!!
I mean element 0 is the one with the least cost and the last element of the array is the one with the most cost...
so I don't need to search for the element with the least cost, I just can use the 1st element (it's the one with the least cost)...
this way all the searches (getNodeWithLeastCost and OpenList.binary_search) are obsolete !!! ;)
you know one quick loop to insert the element at a desired position is much more faster than doing multiple searches on the whole array... :lol:
I guess to sort the open list so the 1st element is the one with the most cost and the last is the one with the least cost can also improve the calculation a bit more...
:roll:
while(!asleep) sheep++;
IrrExtensions:Image
http://abusoft.g0dsoft.com
try Stendhal a MORPG written in Java
frost_bite
Posts: 28
Joined: Wed Dec 03, 2008 11:42 pm

Post by frost_bite »

huh, found why it is being slow:

Code: Select all

currentAdjecentNode->setCost(vector2df(sx, sy).getDistanceFrom(vector2df(currentAdjecentNode->getLocationX(), currentAdjecentNode->getLocationY()))); 
should be:

Code: Select all

currentAdjecentNode->setCost(vector2df(ex, ey).getDistanceFrom(vector2df(currentAdjecentNode->getLocationX(), currentAdjecentNode->getLocationY()))); 
It is now working properly. Thanks to everyone for their input!
Ulf
Posts: 281
Joined: Mon Jun 15, 2009 8:53 am
Location: Australia

Post by Ulf »

It helps to have at least a few comments in the code so people can work out what the hell the variables are for.
e.g.

Code: Select all

bool findPath(u32 sx, u32 sy, u32 ex, u32 ey, array<Node *> &path);
what is sx, sy, ex, and ey?
How would anyone know?

I assume it's startX and endX for example.

Without looking at all the code, I can't imagine why that would slow it down rather than make it incorrect.
I can hear birds chirping
:twisted:

I live in the Eye of Insanity.
trivtn
Posts: 132
Joined: Tue Jan 17, 2006 12:30 pm
Location: Viet Nam
Contact:

Post by trivtn »

I've test your code, but It's still very slow ! In my AStar (64x64) , It takes average < 40ms with Optimized code (VC++2008).
Here's my Demo : (with begin source)
http://www.4shared.com/file/ozAq8EbO/MyAStar.html
There's something is fantastic, there's nothing is absolute.
Post Reply