[Coco] [Color Computer] Re: the final stage of optimization
snhirsch at gmail.com
Tue Nov 24 07:45:14 EST 2009
On Mon, 23 Nov 2009, Wayne Campbell wrote:
> James (it's nice to know your name),
> The program is a decompiler, unpacker, decoder, however you want to say
> it. Basic09 compiles into intermediate code, and I am using that to
> reconstruct the source code.
> I am still working at understanding exactly what a hash is, but I think
> I get it. I was thinking about an idea to create an array for the
> literals that included a field that would contain the numeric value of a
> string. That value would be the total of the numeric values of each
> character in the string. I'm not sure if that's what you mean by "packed
> ascii" or not, but it sounds like it might be the same thing. Would it
> be feasible, though? What are the chances that 2 different strings, even
> of differing lengths, could have the same value?
The chances of this happening are very high. It's called a "collision" in
the terminology of a hash table. I'd suggest researching the theory and
implementation of hash tables on Wikipedia or a CS-related site.
If the key-calculation algorithm is working well (and a simple summation
probably is not a good choice), the hashed items will be spread evenly
over a limited number of "buckets". Each bucket has a short list of data
items (or references to those items if, e.g. they actually live on disk)
that hash to that key. At that point, you typically do a linear search of
the list to directly find the target item. Note that this is only one
possible implementation of a hash table!
At best, a hash table can approach constant-time lookup. At worst, it
degenerates to an O(n) linear search. Typically, you'll be more towards
the constant-time end of things.
More information about the Coco