[Coco] [Color Computer] Re: DEC Alpha CPU bug. 6809 Test and Set instruction?
James Diffendaffer
jdiffendaffer at yahoo.com
Wed Jul 12 15:01:58 EDT 2006
>> >> The rest I've seen were the result of compiler bugs/limitations, a CPU
>> >> bug (stupid DEC Alpha), OS bugs and the quest for "job security".
>> >>
>> >What dec alpha cpu bug?
>> >
>> >kevin
>> Not even gonna go there.
>>
>Aw cmon. I have one of those 600au personal workstation beasties.
>
>kevin
When I worked at MCI we ran into a bug in the Alphas that we were
using for processing call records. It's been over 7 years, so I may
not remember the details correctly.
We processed over 120 million records through the system each day, and
in a given week 2 to 5 records could not be processed correctly by the
part of the system I was in charge of (I was the team lead).
It took 2 weeks just to reproduce the error in unit testing and
another several days to isolate the exact data causing the problem.
We traced it to a piece of code that was supposed to clear an array
used in processing certain call records. The array was only used
under certain circumstances, so we had a test for the condition and the
clear only ran when that condition was met. The array had a dimension of
at least 100, so skipping the clear when it wasn't needed offered a
decent speedup.
It looked something like this:
    if (data->piece == something) {
        // clear array
        for (i = 0; i < 100; i++) {
            array[i] = 0;
        }
    }
Very simple stuff.
Anyway... we determined the logic was correct, the C++ code was
correct, and the code output from the compiler, which we examined, was
correct. Single stepping showed the code did what we intended, but
when we ran a specific sequence of data through it the if test
failed even though it was written and compiled correctly. The odds of
it taking place even with the right code sequence were about 1 in a
billion with our data, so I doubt you'd see it even if your CPU had the
bug.
The actual bug appeared to be some sort of state that could be created
in the processor when the right instructions & data were used in a
very specific sequence. I can't remember if the condition code was
correct and the branch failed or if the condition code failed and the
branch was correct. Either way I wouldn't have thought such a thing
were possible if I hadn't seen it myself.
I'm guessing an extra instruction in the compiler output would have
fixed it.
Anyway... we removed the test so the array would always be cleared and
placed a comment in the code explaining why it wasn't optimal.
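
For anyone who wants to picture the change, here is a rough sketch of the
before and after. It is not the original MCI code. The names CallRecord,
kScratchSize, clear_if_needed and clear_always are made up for illustration,
and the element type and the size of 100 are assumptions based on the
snippet above.

```cpp
#include <array>
#include <cstddef>

// Hypothetical size; the post only says the array had "at least 100" entries.
constexpr std::size_t kScratchSize = 100;

// Hypothetical stand-in for the real call record type.
struct CallRecord {
    int piece;  // whatever field the original test examined
};

// Original shape: clear the scratch array only when the record needs it.
// This saves the work of the clear on records that never touch the array.
void clear_if_needed(const CallRecord& rec, int something,
                     std::array<int, kScratchSize>& scratch) {
    if (rec.piece == something) {
        scratch.fill(0);
    }
}

// Workaround shape: the test is gone, so the array is cleared for every
// record. Slower, but no longer depends on the compare-and-branch that
// misbehaved.
void clear_always(std::array<int, kScratchSize>& scratch) {
    scratch.fill(0);
}
```

The second version does strictly more work per record, which is why the
comment in the real code had to explain why it wasn't optimal.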
After creating a data sample and extracting some code that could
reproduce the error, I tried to get it forwarded to Digital, but our
internal points of contact with them refused. They said it was my bug,
wouldn't even look at the code, and said it was fixed so who cares.