[Coco] [Color Computer] Re: the final stage of optimization
asa.rand at yahoo.com
Tue Nov 24 15:54:22 EST 2009
Thanks, Steve. Your words are verification. I went to wikipedia last night, after posting that reply, and looked up hash, checksum, etc. I'm afraid my higher math knowledge is lacking, so alot of the mathematical explanation is lost on me. I understand how to "read" algebra, but I have difficulty with knowing exactly what it means.
I learned enough to see that collisions were very likely, and that I would have to do something like what you suggest to make it work. I need to play around with all of this for awhile before I can truly grasp what is happening, and what to do about the problem areas.
At least I feel capable of figuring it out, which is more than I could say when I developed DCom.
From: Steven Hirsch <snhirsch at gmail.com>
To: CoCoList for Color Computer Enthusiasts <coco at maltedmedia.com>
Sent: Tue, November 24, 2009 4:45:14 AM
Subject: Re: [Coco] [Color Computer] Re: the final stage of optimization
On Mon, 23 Nov 2009, Wayne Campbell wrote:
> James (it's nice to know your name),
> The program is a decompiler, unpacker, decoder, however you want to say it. Basic09 compiles into intermediate code, and I am using that to reconstruct the source code.
> I am still working at understanding exactly what a hash is, but I think I get it. I was thinking about an idea to create an array for the literals that included a field that would contain the numeric value of a string. That value would be the total of the numeric values of each character in the string. I'm not sure if that's what you mean by "packed ascii" or not, but it sounds like it might be the same thing. Would it be feasible, though? What are the chances that 2 different strings, even of differing lengths, could have the same value?
The chances of this happening are very high. It's called a "collision" in the terminology of a hash table. I'd suggest researching the theory and implementation of hash tables on Wikipedia or a CS-related site.
If the key-calculation algorithm is working well (and a simple summation probably is not a good choice), the hashed items will be spread evenly over a limited number of "buckets". Each bucket has a short list of data items (or references to those items if, e.g. they actually live on disk) that hash to that key. At that point, you typically do a linear search of the list to directly find the target item. Note that this is only one possible implementation of a hash table!
At best, a hash table can approach constant-time lookup. At worst, it degenerates to an O(n) linear search. Typically, you'll be more towards the constant-time end of things.
Coco mailing list
Coco at maltedmedia.com
More information about the Coco