[Coco] the final stage of optimization

Fri Nov 20 02:33:08 EST 2009

I have been aware that the one thing that is slowing decode down more than anything else is having to use disk files to store, search and sort records. This email concerns searches and sorts.

I have never really gotten a good understanding of how search and sort algorithms are designed, and how they operate, beyond the least complex. When I wrote DCom, I wrote sort procedures that were recursive. They worked, but I never really understood if they were working efficiently or not. I do know that the sorting operations were faster after they were modified.

In decode, I am using a simple search algorithm. Basically, it starts with the first record, and continues reading each succeeding record until either a match is found or the last record is read. The number of records that get added to the file depends on the I-Code being decoded. The more records in the file, the longer the search takes.

Three files are created, variables reference, line reference, and literal reference. Decoding createmsg (one of Ron's modules, about 2K in size), the decode takes about 20 minutes. Decoding RConfig (another of Ron's modules, about 16K in size) takes about 40 minutes. Decoding decode (20K) takes 2 hours.

I have the same problem with my sort. I'm using a sifter-style (for lack of a better term) of sort. It starts with the first record, proceeds to the last record, then performs the sort in reverse, starting with the last record and proceeding to the first. Each pass of the routine performs both of these operations. It works, and it was much faster than using the top-to-bottom method I first used that was based on the search routine. However, it takes as long to sort the literals records as it does to decode the instruction section of the I-Code module.

There has to be a better way. Can someone help me understand how to make my search and sort algorithms faster?

Wayne