[Coco] Mod10 Suggestions

William Astle lost at l-w.ca
Sat Feb 18 12:35:11 EST 2017


Take a closer look. It only does the LSLA on every other digit. It does 
*two* digits per loop, just like Brett's version.
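
For anyone who wants to check that on a modern machine, here is a rough C model of the loop in the quoted listing below. It is only a sketch: the names bcd_add and mod10_model are made up for illustration, bcd_add is a simplified stand-in for ADDA followed by DAA (it assumes one operand is a single digit 0-9 and it ignores the CPU flags), and it assumes the buffer holds one binary digit value (0-9) per byte. It follows the listing's order as posted, so the last byte of the buffer is the first one doubled.

```c
/* Simplified model of ADDA + DAA: add a single digit (0-9) to a packed
   BCD byte. The carry out of the tens digit is discarded, which is
   harmless here because only the low digit is used at the end. */
unsigned char bcd_add(unsigned char acc, unsigned char digit)
{
    unsigned lo = (acc & 0x0Fu) + digit;   /* 0..18 */
    unsigned hi = acc >> 4;                /* 0..9  */
    if (lo > 9) { lo -= 10; hi += 1; }     /* decimal carry into tens */
    if (hi > 9) hi -= 10;                  /* drop the hundreds carry */
    return (unsigned char)((hi << 4) | lo);
}

/* Hypothetical C rendering of the quoted routine: 8 passes, two digits
   per pass, only the first digit of each pass is doubled. */
unsigned char mod10_model(const unsigned char ccd[16])
{
    const unsigned char *x = ccd + 16;     /* LEAX CCD+16,PCR */
    unsigned char a = 0;                   /* CLRA */
    for (int b = 8; b != 0; b--) {         /* LDB #8 ... DECB / BNE LOOP */
        unsigned char d = *--x;            /* LDA ,-X */
        d = (unsigned char)(d << 1);       /* LSLA */
        if (d >= 10) d -= 9;               /* CMPA #10 / BLO / SUBA #9 */
        a = bcd_add(a, d);                 /* ADDA ,S+ / DAA */
        a = bcd_add(a, *--x);              /* ADDA ,-X / DAA (not doubled) */
    }
    return a & 0x0F;                       /* ANDA #$0F */
}
```

Each pass through the for loop consumes two bytes of the buffer and shifts only the first one, which is the point being made above. The final mask works because the low nibble of a packed BCD sum is the sum's last decimal digit.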

You can easily pretend all numbers are 16 digits by right justifying the 
numbers in your buffer and padding with zeros.
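
The padding works because a leading zero adds nothing to the sum whether it gets doubled or not. A minimal sketch of the idea in C, assuming the input arrives as an ASCII digit string and the buffer wants one binary digit value per byte (pad_digits and CCD_LEN are hypothetical names, not anything from the thread):

```c
#include <string.h>

#define CCD_LEN 16  /* same size as the 16-byte CCD buffer in the listing */

/* Copy an ASCII digit string into a fixed 16-byte buffer of digit values
   (0-9), right justified and zero padded on the left.
   Returns 0 on success, -1 if the input is too long or has a non-digit. */
int pad_digits(const char *s, unsigned char ccd[CCD_LEN])
{
    size_t n = strlen(s);
    if (n > CCD_LEN)
        return -1;
    memset(ccd, 0, CCD_LEN);               /* leading zeros */
    for (size_t i = 0; i < n; i++) {
        if (s[i] < '0' || s[i] > '9')
            return -1;
        ccd[CCD_LEN - n + i] = (unsigned char)(s[i] - '0');
    }
    return 0;
}
```

With a 13 digit number the first three bytes of the buffer end up zero and the last digit still lands in the last byte, so a fixed 8 pass loop sees the digits in the same positions relative to the right-hand end.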

On 2017-02-18 10:06 AM, William Mikrut wrote:
> I like how this works from right to left.
> The only issue is the LSLA on every number.
>
> The algo is to double every other number, starting with the right most
> digit, and sub 9 if the result is 10 or more.
>
> Now if the number is always 16 digits, Brett's 16 bit word seems the
> easiest way to go.
> If the number is 13 digits long the 16 bit word method won't work, but I am
> happy to pretend all numbers are 16 digits!
>
> I am going to try to include a couple things you showed me into Brett's 16
> bit chunk method and try a slightly different routine!
>
>
> On Sat, Feb 18, 2017 at 10:22 AM, William Astle <lost at l-w.ca> wrote:
>
>> On 2017-02-18 12:43 AM, msmcdoug wrote:
>>
>>> Actually I'm surprised no one has suggested BCD arithmetic on the result
>>> to eliminate the divide-by-10 loop
>>>
>>
>> BCD would certainly give a predictable overall cycle count. It would
>> require a significantly different approach, though. The only register you
>> can use for BCD arithmetic is A and DAA is only useful after ADDA or ADCA.
>>
>> I had thought about using BCD but had initially dismissed it due to
>> possible complexity. However, upon reflection, the extra cycles to use BCD
>> would probably be fewer than the average total cycle cost of the modulus
>> loop, or of checking for digit overflow during the loop.
>>
>> I think you could use code that looks something like the following which
>> is based off Mr. Mikrut's most recent posted code. (warning: mailer code™
>> follows so it may have errors)
>>
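>>         ORG $1200
>> CCD     RMB 16            ; 16 digits, one per byte, values 0-9
>> RESULT  RMB 1             ; sum mod 10
>> START   LEAX CCD+16,PCR   ; X -> one past the last digit
>>         CLRA              ; A = running BCD sum
>>         LDB #8            ; 8 passes, two digits per pass
>> LOOP    PSHS A            ; park the running sum on the stack
>>         LDA ,-X           ; step left, fetch digit to be doubled
>>         LSLA              ; double it
>>         CMPA #10
>>         BLO LOOP2         ; below 10: use as is
>>         SUBA #9           ; 10 or more: subtract 9
>> LOOP2   ADDA ,S+          ; add the running sum back, pop the stack
>>         DAA               ; keep the sum in BCD
>>         ADDA ,-X          ; step left, add next digit undoubled
>>         DAA
>>         DECB
>>         BNE LOOP
>>         ANDA #$0F         ; low BCD digit = sum mod 10
>>         STA RESULT,PCR
>> ENDPGM  RTS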
>>
>> I'm using the stack for a temporary storage location instead of something
>> PCR relative for code size reasons. You could use the RESULT variable for
>> the temporary to eliminate stack usage. That would probably be slightly
>> faster at the expense of two more code bytes. This is one of those
>> size/speed trade-offs.
>>
>> DAA has to be used after every addition and only applies to A. Using BCD
>> means we can eliminate the mod 10 loop and just mask off the upper digit
>> (BCD stores two decimal digits in a byte). That gives a constant time for
>> the "mod 10" result and also only takes 2 bytes (and 2 cycles).
>>
>> I have also eliminated the STATUS variable and just store the result. You
>> can test RESULT for non-zero trivially so there's no need for a separate
>> STATUS value.
>>
>> By my calculation, this version is 32 bytes, requires 1 byte of stack
>> space, 17 bytes of data space, and runs in a maximum of 351 cycles (and a
>> minimum of 336 cycles if none of the doubled digits goes above 9). For this
>> analysis, I've assumed 8 bit offsets for the PCR references. 16 bit offsets
>> in PCR mode are quite a bit more expensive (4 extra cycles and 1 extra
>> byte).
>>
>>
>> --
>> Coco mailing list
>> Coco at maltedmedia.com
>> https://pairlist5.pair.net/mailman/listinfo/coco
>>
>


