Sort Array

If you are a new Irrlicht Engine user, and have a newbie-question, this is the forum for you. You may also post general programming questions here.
Ion Dune
Posts: 453
Joined: Mon Nov 12, 2007 8:29 pm
Location: California, USA
Contact:

Sort Array

Post by Ion Dune »

I have an array of objects in my game that I need to sort by x value for collision, pathfinding, etc. The array contains only pointers to the objects, so I can't directly use the array.sort() function. I saw the workaround in some topic or another (making a struct containing a pointer to the object and a < operation), but to implement that I would have to change so much code that it made me not want to do it.

I got to thinking about the best way to sort by X value, and I realized that the fasted way would probably be to start out with a sorted array (do a bubble or something during the scene start up) and then to only sort elements that change, ie. I have 500 units, and the 250th moves past the 251st, so I just switch those two elements instead of resorting the whole thing.

I remembered a dynamic array type thing I did in a C class, where each element contains a pointer to the next element. I thought this would be good in this situation because to add or remove an element (which I would be doing a lot with sorting), you only have to change the previous element's pointer.

The problem is I'm not really certain if this would actually be any faster than the stuff built in to irrlicht, so I need advice in picking the best way to sort elements by X value, preferably geared towards huge arrays (upwards of 1000 elements).

Thanks for your time.
arras
Posts: 1622
Joined: Mon Apr 05, 2004 8:35 am
Location: Slovakia
Contact:

Post by arras »

I remembered a dynamic array type thing I did in a C class, where each element contains a pointer to the next element.
This is called list.

Here is sorted list class I use for A* pathfinding:

Code: Select all

//*-----------------------------------------------------------------------------*
| headerfile sortedlist.h                                                      |
|                                                                              |
| version 1.30                                                                 |
| date: (25.03.2008)                                                           |
|                                                                              |
| author:  Michal Švantner                                                     |
| Sorted list.                                                                 |
| It sorts data at the point of insertion based on comparison method provided. |
|                                                                              |
| Thanks vitek from Irrlicht forum for useful help.                            |
*-----------------------------------------------------------------------------*/

#ifndef SORTEDLIST_H
#define SORTEDLIST_H

#ifndef COMPARE_H
#define COMPARE_H

// Structure for comparing two values.
template <class T> struct Less
{
   // Returns true if firsth value is smaler than second;
   bool operator() (const T& v1, const T& v2) {return v1 < v2;}
};

// Structure for comparing two values.
template <class T> struct More
{
   // Returns true if firsth value is larger than second;
   bool operator() (const T& v1, const T& v2) {return v1 > v2;}
};

// Structure for comparing two values.
template <class T> struct Equal
{
   // Returns true if firsth equals second;
   bool operator() (const T& v1, const T& v2) {return v1 == v2;}
};

// Structure for comparing two values.
template <class T> struct Unequal
{
   // Returns true if firsth is unequal to second;
   bool operator() (const T& v1, const T& v2) {return v1 != v2;}
};

// Structure for comparing two values which are known by pointers.
template <class T> struct Deref_Less
{
   // Returns true if firsth value is smaler than second;
   bool operator() (const T& v1, const T& v2) {return *v1 < *v2;}
};

// Structure for comparing two values which are known by pointers.
template <class T> struct Deref_More
{
   // Returns true if firsth value is larger than second;
   bool operator() (const T& v1, const T& v2) {return *v1 > *v2;}
};

// Structure for comparing two values which are known by pointers.
template <class T> struct Deref_Equal
{
   // Returns true if firsth equals second;
   bool operator() (const T& v1, const T& v2) {return *v1 == *v2;}
};

// Structure for comparing two values which are known by pointers.
template <class T> struct Deref_Unequal
{
   // Returns true if firsth is unequal to second;
   bool operator() (const T& v1, const T& v2) {return *v1 != *v2;}
};

#endif

template <class T, class Compare = Less<T> > class SortedList
{
   struct Node
   {
      T Data;
      Node *Inferior, *Superior;
   };

   Node Top, Bottom;

   Compare compare;

public:

   // Constructor.
   SortedList()
   {
      Top.Inferior = &Bottom;
      Bottom.Superior = &Top;
   }

   // Destructor.
   ~SortedList()
   {
      clear();
   }

   // Insert data in to the list.
   void add(T data)
   {
      Node *newNode = new Node;
      newNode->Data = data;

      Node *node = Top.Inferior;

      while(compare(node->Data, data) && node != &Bottom) node = node->Inferior;

      newNode->Superior = node->Superior;
      newNode->Inferior = node;

      node->Superior->Inferior = newNode;
      node->Superior = newNode;
   }

   // Return best data based on comparison method.
   // List should not be empty.
   T getBest()
   {
      return Top.Inferior->Data;
   }

   // Return best data based on comparison method and remove it from list.
   // List should not be empty.
   T extractBest()
   {
      Node *node = Top.Inferior;

      T data = node->Data;

      Top.Inferior = node->Inferior;
      node->Inferior->Superior = &Top;

      delete node;

      return data;
   }

   // If data is on list remove it.
   void remove(T data)
   {
      Node *node = Top.Inferior;

      while(node != &Bottom &&  data != node->Data) node = node->Inferior;

      if(node != &Bottom)
      {
         node->Inferior->Superior = node->Superior;
         node->Superior->Inferior = node->Inferior;

         delete node;
      }
   }

   // Clear list.
   void clear()
   {
      Node *node = Top.Inferior;

      while(node != &Bottom)
      {
         node = node->Inferior;
         delete node->Superior;
      }

      Top.Inferior = &Bottom;
      Bottom.Superior = &Top;
   }

   // Return true if list is empthy.
   bool isEmpty()
   {
      return Top.Inferior == &Bottom;
   }
};

#endif
You need to provide your own method to compare your objects. It should be something like this:

Code: Select all

template <class T> struct MyMethod
{
   bool operator() (const T& v1, const T& v2) {return v1->X < v2->X;}
};
Then you initialize list like this:

Code: Select all

SortedList<YourElementClass,  MyMethod<YourElementClass> > list;
list.add(yourElement1);
list.add(yourElement2);
...
I did spend some time optimizing it so it should be fast. It however might not be bug free so if you happened to have some problems please contact me.

Of course this list do not know if one of your element changed its X value. In such case you need to remove it from list and add it again (with new X value).
vitek
Bug Slayer
Posts: 3919
Joined: Mon Jan 16, 2006 10:52 am
Location: Corvallis, OR

Post by vitek »

You could just use the Standard C++ Library algorithm sort...

Code: Select all

struct Deref_NameCompare 
{ 
   bool operator() (const MyObject* lhs, const MyObject* rhs) const
   {
      return strcmp (rhs->getName(), lhs->getName()) < 0;
   } 
};

// assuming you have something like this
core::array<MyObject*> objects;

// use the C++ standard library algorithm for sorting. note that you
// must include <algorithm>
std::sort (objects.begin(), objects.end(), Deref_NameCompare());
rogerborg
Admin
Posts: 3590
Joined: Mon Oct 09, 2006 9:36 am
Location: Scotland - gonnae no slag aff mah Engleesh
Contact:

Re: Sort Array

Post by rogerborg »

Indeed, but depending on how you actually implement your object logic, you may want to go with your own bubble sort.

If you first move all objects, and then (in a separate pass) do collision / AI, then a single sort (after all the moves) will likely be more efficient.

However, if you want to do movement + AI/collision once per object, then you'll have to keep your list sorted all the time. As you noted, it's just a case of swapping objects up or down until they're back in order again. You could use a list, but an array would be just fine as well.
Please upload candidate patches to the tracker.
Need help now? IRC to #irrlicht on irc.freenode.net
How To Ask Questions The Smart Way
arras
Posts: 1622
Joined: Mon Apr 05, 2004 8:35 am
Location: Slovakia
Contact:

Post by arras »

Array is faster when iterating. List is faster when inserting extracting elements. If you work with fixed number of elements and you don't have to sort them often, array is better. If number of elements is dynamic, its better to use list, especially if you deal with 1000 objects. List is probably also going to be faster when sorting.

You can reserve memory for array if you are sure that number of elements will not execs that reserve. But that still don't have to make array more efficient.
ebo
Posts: 38
Joined: Sun Feb 19, 2006 5:39 pm

Post by ebo »

arras wrote:Array is faster when iterating.
From a technical PoV: yes.
arras wrote:List is faster when inserting extracting elements.
Not if you insert elements at the end of an array. Even with reallocation.
arras wrote:If number of elements is dynamic, its better to use list, especially if you deal with 1000 objects.
Lists generally kill memory locality. This is normally bad. And 1000 isnt large ...
arras wrote:List is probably also going to be faster when sorting.
No! From a algorithmic PoV there is no difference. From a technical view: Arrays are fast, because they have cheap random access.

For Dijkstra/A* the optimal data structure would be a fibonacci heap. Unfortunately they are rather difficult to code. A good alternative is a binary heap. A sorted list is only a bit better than just using an unsorted one.
arras
Posts: 1622
Joined: Mon Apr 05, 2004 8:35 am
Location: Slovakia
Contact:

Post by arras »

Not if you insert elements at the end of an array. Even with reallocation.
If you need sorting, you don't expect element to be inserted at the end.
Lists generally kill memory locality. This is normally bad.
Normally yes, but you have to weight pros and cons.
No! From a algorithmic PoV there is no difference. From a technical view: Arrays are fast, because they have cheap random access.
From algorithmic not, what makes difference is that you need to reallocate all following elements for every element sorted.
A good alternative is a binary heap. A sorted list is only a bit better than just using an unsorted one.
Sorted list is several times faster in this situation than unsorted one. I did spend few days testing this so I am certain about it. At last on my system.
I also did comparison tests between sorted list and binary heap with 10000 elements and sorted list was roughly as fast as binary heap.
Then I did comparison tests with A* pathfinding on map 250x175 large with obstacles arranged in such way which ensured nearly all tiles to be tested by pathfinding. On average, binary heap was bit slower than sorted list (yes I know most of the A* tutorials say that binary heap is faster ...it wasn't).
Saturn
Posts: 418
Joined: Mon Sep 25, 2006 5:58 pm

Post by Saturn »

Why should sorting a list be any faster than sorting an array? If anything it is the other way around.
You don't need to reallocate on sorting, as the number of elements doesn't change. Arras, why do you believe there is any reallocation required?

You shouldn't be inserting elements during sorting unless you use a bad sorting algorithm in the first place. Bubble-, Heap- and Quicksort don't insert, they shuffle.
In case of bubble sort and to a degree also for heap- and quicksort, cache coherence is a factor. And array is way better than list in this regard.
ebo
Posts: 38
Joined: Sun Feb 19, 2006 5:39 pm

Post by ebo »

You should use in-place algorithms for sorting. They dont require any reallocation.

I also think that inserting at the end is so uncommon (I use std:vector regularly in high performance computing applications).

The bad thing about benchmarks is reproducibility. I did my own test (in java though) and a bin heap was way faster.
arras
Posts: 1622
Joined: Mon Apr 05, 2004 8:35 am
Location: Slovakia
Contact:

Post by arras »

Ok, here is what I think is going to happen when you sort single element in array and list. Please correct me if I am wrong:

Array:
take one element and iterate through whole array comparing element with others. Once comparison return false (or true depending on your condition) you have to reposition(shift) all elements between your element original and new position.

List:
take one element and iterate through list comparing as in array. Once new position was found reconnect nodes at original and new position by changing pointers (3 or 6 pointers in total depending on list type).

Clearly you do much less operations on list than on array (although it depend to some extend on position of element and number of elements in list/array).

Now if you need to insert/extract element not just resort it, situation is even worst:
In case of list you just need to allocate new (delete) memory for single node. Iterate through list and insert (changing 2 or 4 pointers).
Yet in case of array you end up allocating new memory for whole array, copying all elements to new location and remove old memory.

Binary heaps:
In theory those should clearly be superior to sorted list. Problem is that implementation of binary heap is slower than that of sorted of list.

Inserting:
In case of list, you test just against 2 conditions: your sorting test and end of list.
In case of binary heap you test 4 conditions: your sorting test, left or right node, and end of heap.

Get best element:
In case of list you just take first element (or last) from list.
In case of binary heap you have to iterate through it to find best element.

Removing:
In case of list you just delete one node and reconnect two others.
In case of binary heap you delete node and reinsert both its left and right branches in to the heap which includes iterating through part of heap.

Clearing:
In case of list you iterate and delete elements in process.
In case of binary heap you use nesting which is slower than linear iterating.

Binary heap can still be faster on inserting element but that is depending on conditions and on order in which elements are inserted as well as number of elements inserted. In worst case you may end up having only one branch in heap.

However in all other operations sorted list is always going to be faster.

In case of huge number of elements, binary heap can still have chance to be faster. As I sad, I did tests with 10000 elements which already is way too much for A* to handle in real time. And binary heap sill was not fast enough.

At the side note:
With sorted list you can always expect its effectivity to be linearly dependent on its size. Yet in case of binary heap, you newer known what to expect.

You also have to note, that in case of A* pathfinding your open list is always going to be relative limited in size. All tested nodes have to enter it at some point but most of them are going to end in closed list. So its efectivity based on size is not that important ...and thats why I think sorted list was better in my tests.
ebo
Posts: 38
Joined: Sun Feb 19, 2006 5:39 pm

Post by ebo »

I'm sorry to say, but you got several things wrong.

Sorting: You described something that could best be compared to Insertion sort. Thats a O(n^2) sorting algorithms. Try heap- or quicksort. They run in O(n*log(n)), which is much faster (Break-even point is around 5-9 elements).

True is: If you are inserting an element in the middle of an array it costs you O(n) operations. For array allocations: you should always double the amount of memory in each allocation step. If you do this the whole cost evens out to O(1), which is constant time.

Comparing Sorted Lists and Bin heaps:
Major Perfomance Points in Wayfinding algorithms

Inserting:
List -> Insertion O(1), Sorting O(nlog(n))
Sorted Insertion: List -> You have to search the right place O(n) + insert O(1)
Array -> binary search O(logn) + insert O(n)
Bin Heap -> O(log(n))

Extract best Element:
List -> O(1)
Bin Heap -> O(log(n)) Just take the first element. Replace it with the last and restore heap order

Decrease Key:
List -> O(n) to find key, O(nlogn) for sorting
Bin heap -> O(n) to find key, O(logn) for restoring Heap Order

The nice feature of Bin Heap is, that you can use a array to store everything you need.

If you really need performance you can still use the more complicated Fibonacci heap.
http://en.wikipedia.org/wiki/Fibonacci_heap Look at the table at the bottom of the page.
A Bin heap can be faster than a fib heap (due to implementation) but it should always be faster than a list.
arras
Posts: 1622
Joined: Mon Apr 05, 2004 8:35 am
Location: Slovakia
Contact:

Post by arras »

Sorting: You described something that could best be compared to Insertion sort. Thats a O(n^2) sorting algorithms. Try heap- or quicksort.
These can be more effective only if you reorder unsorted list/array. If you are inserting element in to list/array which is sorted already I do not see any advantage of those methods. But I may try to test it since I was newer using them before. Particularly I do not see how can heap sort be any better to what I described. In what I described you first find new position then shift elements. When doing heap sort you compare then swap each element individually ending with same total sum of elements shifted. But I may be wrong.

As for binary heap I must apologize you, I was describing binary tree not binary heap and confused them, partly because I am not native English speaker.

Thanks for that link I'll look at that ;)
ebo
Posts: 38
Joined: Sun Feb 19, 2006 5:39 pm

Post by ebo »

A binary heap uses a binary tree. It's just not ordered totally. It's only guaranteed to have to min or max element in front.

Why is sorting with binary heaps faster than bubblesort?
Bubblesort is obviously O(n^2).

You can insert a element into a binary heap in O(logn). You have to do this n times, so the overall complexity is O(n*logn). And a binary sort is a special case. Its guaranteed to have this complexity (unlike quicksort).
rogerborg
Admin
Posts: 3590
Joined: Mon Oct 09, 2006 9:36 am
Location: Scotland - gonnae no slag aff mah Engleesh
Contact:

Post by rogerborg »

This is all very interesting, but the appropriate collection and algorithm will be dictated by the application's requirements. Since we don't know those, you're just comparing the sizes of your inhalers.
Please upload candidate patches to the tracker.
Need help now? IRC to #irrlicht on irc.freenode.net
How To Ask Questions The Smart Way
Saturn
Posts: 418
Joined: Mon Sep 25, 2006 5:58 pm

Post by Saturn »

I guess the point is that you already have an almost sorted list(array) and want to resort it. This is actually rather common in computer graphics as there usually is some kind of frame-to-frame coherence. In the case of almost sorted lists insertion sort is quite good. Same goes for bubblesort and derivatives on arrays.

Adding a new element to an already sorted container is O(n) for both sorted array and sorted vector. Removing a random element is O(n) too. So I still fail to see how a sorted list is superior to a sorted array in this context.

rogerborg, glad you figured out the rationale of this discussion. Of course this is about apps' requirements, but in order to assess whether one or other container/algo is best in a certain situation you need to know the complexity in said situation. That's what is done here.
Post Reply