[Coco] [Color Computer] the final stage of optimization

Sat Nov 21 08:19:04 EST 2009

Wayne

If you are using strings you'll need a table of pointers to the actual data so you don't have to move strings around, which is much slower than pointers.

First, sorts.

The concept behind all good sorts, is to move out of place data as close to it's destination as possible with each pass.  That usually means as far as possible with early passes and smaller and smaller distances with each additional pass. 

The easiest sort I think you could use is the comb sort.  It's a modified bubble sort where you compare the first item against the middle item to start and step through until you get to the end with the 2nd pointer.

Each time through you are moving data a large distance instead of bubbling it up one at a time.  This eliminates what's known as "turtles", an out of place data item at the end of the list that slowly bubble up to their destination.

Depending on how out of order the data is, combsort can actually beat the quick sort.  One is better at best case and the other is better at worst and on average they are pretty close.
You'll also find combsort easy to implement iteratively (simple loops) where most quicksort examples use recursion and require more stack space.  For a small memory environment and with unpredictable data size that's pretty important as you could overflow the stack.

The Wiki talks about dividing the gap by 1.3 but to simplify things you can divide by 2 (just a bit shift in assembly).  It will cause more passes as the gap shrinks but for a data set like this you won't really notice.

http://en.wikipedia.org/wiki/Comb_sort
http://en.wikipedia.org/wiki/Quicksort

For searches you want a similar concept.  Get as close to the data you are searching for with each pass as you can.

In other words...
sort the data or you have to search.
Then do what is called a binary search.  
It requires three pointers.
Set the first to the first data element, the 2nd to the last data element and set the 3rd to the midpoint of your sorted data.  
pointer3 = (pointer2 - pointer1 / 2) + pointer1.

Check to see if your target value is greater, less than or equal to pointer3 (data block midpoint).  
If it's equal, quit.
If it's less than, set pointer2 to pointer3 - 1, greater than set pointer1 = pointer3 + 1.
Recalculate the midpoint and go again till you find your data.

You split the data in half each pass of the search and check to see which half your target value is in.  Then you split that in half etc... until you find what you are looking for.

As is typical for computer science the author of the Wiki made the explanation more complex than it needs to be so I included another link.

http://en.wikipedia.org/wiki/Binary_search_algorithm
http://www.ensta.fr/~diam/c++/online/notes-cpp/algorithms/searching/binarysearch.html

-quoted text---------------------------------
I have been aware that the one thing that is slowing decode down more than anything else is having to use disk files to store, search and sort records. This email concerns searches and sorts.

I have never really gotten a good understanding of how search and sort algorithms are designed, and how they operate, beyond the least complex. When I wrote DCom, I wrote sort procedures that were recursive. They worked, but I never really understood if they were working efficiently or not. I do know that the sorting operations were faster after they were modified.

In decode, I am using a simple search algorithm. Basically, it starts with the first record, and continues reading each succeeding record until either a match is found or the last record is read. The number of records that get added to the file depends on the I-Code being decoded. The more records in the file, the longer the search takes.

Three files are created, variables reference, line reference, and literal reference. Decoding createmsg (one of Ron's modules, about 2K in size), the decode takes about 20 minutes. Decoding RConfig (another of Ron's modules, about 16K in size) takes about 40 minutes. Decoding decode (20K) takes 2 hours.

I have the same problem with my sort. I'm using a sifter-style (for lack of a better term) of sort. It starts with the first record, proceeds to the last record, then performs the sort in reverse, starting with the last record and proceeding to the first. Each pass of the routine performs both of these operations. It works, and it was much faster than using the top-to-bottom method I first used that was based on the search routine. However, it takes as long to sort the literals records as it does to decode the instruction section of the I-Code module.

There has to be a better way. Can someone help me understand how to make my search and sort algorithms faster?

Wayne