[Coco] Mod10 Suggestions

Ron Klein ron at kdomain.org
Sat Feb 18 20:14:02 EST 2017


I do not know much of anything in assembly, but this exchange of
information between all involved was fascinating.  What a learning
experience!

Thank you all for that!

-Ron


On Sat, Feb 18, 2017 at 5:06 PM, William Mikrut <wmikrut72 at gmail.com> wrote:

> Some slight reordering of the code and it works perfectly!
> 48 Bytes total, Less 17 for storage -- 31 program bytes to get the job
> done.
>
> My original code was 61 program bytes... down to half the size and does the
> exact same thing.
> Absolutely amazing!
>
>
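>         ORG $1200
> CCD     RMB 16            ; 16 digits, one per byte, rightmost digit last
> RESULT  RMB 1             ; sum mod 10; zero means the number checks out
>
> START   LEAX CCD+16,PCR   ; X points just past the last digit
>         CLRA              ; A holds the running BCD sum
>         LDB #8            ; 8 passes, two digits per pass
>
> LOOP    ADDA ,-X          ; add the undoubled digit
>         DAA
>         PSHS A            ; save the running sum
>         LDA ,-X           ; fetch the digit to be doubled
>         LSLA
>         CMPA #10
>         BLO LOOP2
>         SUBA #9           ; 10..18 becomes 1..9
> LOOP2   ADDA ,S+          ; add the saved sum back in
>         DAA
>
>         DECB
>         BNE LOOP
>
>         ANDA #$0F         ; keep only the ones digit
>         STA RESULT,PCR
> ENDPGM  RTS
>         END START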
>
> On Sat, Feb 18, 2017 at 1:03 PM, William Mikrut <wmikrut72 at gmail.com>
> wrote:
>
> > You are right -- I looked at it more closely.
> > One thing I need to do is reverse the order of operations.
> >
> > As posted, the LSLA is performed on the first byte.
> > Instead I need to store the first byte as-is and LSLA the next byte.
> >
> > Otherwise if I flip it from left to right:
> > (LEAX CCD,PCR
> > ...
> > LDA ,X+
> > ...
> > ADDA ,X+)
> >
> >  it works perfectly.
> >
> >
> > On Sat, Feb 18, 2017 at 11:35 AM, William Astle <lost at l-w.ca> wrote:
> >
> >> Take a closer look. It only does the LSLA on every other digit. It does
> >> *two* digits  per loop, just like Brett's version.
> >>
> >> You can easily pretend all numbers are 16 digits by right justifying the
> >> numbers in your buffer and padding with zeros.
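The padding idea can be checked numerically: a leading zero adds 0 to the sum whether or not it is doubled, so right-justifying a shorter number in a 16-digit buffer should leave the checksum unchanged. A minimal Python sketch, assuming one digit per buffer entry and the check digit in the rightmost position; `pad_right_justified` and `luhn_sum` are illustrative names, not anything from the thread:

```python
def pad_right_justified(number, width=16):
    """Right-justify the digits in a fixed-width buffer, padding with zeros."""
    if not number.isdecimal() or len(number) > width:
        raise ValueError("expected at most %d decimal digits" % width)
    return [int(ch) for ch in number.rjust(width, "0")]


def luhn_sum(digits):
    """Walk right to left, doubling every second digit (minus 9 if >= 10)."""
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:          # second, fourth, ... digit from the right
            digit *= 2
            if digit >= 10:
                digit -= 9
        total += digit
    return total


short = "4222222222222"             # a 13-digit number
buffer16 = pad_right_justified(short)
print(len(buffer16), luhn_sum(buffer16) % 10)   # prints: 16 0
```

With the buffer always 16 entries long, a fixed count of 8 two-digit passes covers both the 13-digit and the 16-digit case.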
> >>
> >>
> >> On 2017-02-18 10:06 AM, William Mikrut wrote:
> >>
> >>> I like how this works from right to left.
> >>> The only issue is the LSLA on every number.
> >>>
> >>> The algo is to double every other number, starting with the right most
> >>> digit, and sub 9 if the result is 10 or more.
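The "sub 9 if the result is 10 or more" step gives the same value as adding the two decimal digits of the doubled number, which is why a single compare and subtract is enough. A small Python sketch of just that step, assuming the input is a single digit 0..9; `double_and_fold` is a name made up for this illustration:

```python
def double_and_fold(digit):
    """Double a digit and subtract 9 when the result is 10 or more."""
    doubled = digit * 2
    return doubled - 9 if doubled >= 10 else doubled


for digit in range(10):
    doubled = digit * 2
    digit_sum = doubled // 10 + doubled % 10    # e.g. 14 -> 1 + 4
    print(digit, doubled, double_and_fold(digit), digit_sum)
```

The last two columns match for every digit, so the folded value always lands back in the range 0..9.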
> >>>
> >>> Now if the number is always 16 digits, Brett's 16 bit word seems the
> >>> easiest way to go.
> >>> If the number is 13 digits long the 16 bit word method won't work, but
> >>> I am happy to pretend all numbers are 16 digits!
> >>>
> >>> I am going to try to include a couple things you showed me into Brett's
> >>> 16 bit chunk method and try a slightly different routine!
> >>>
> >>>
> >>> On Sat, Feb 18, 2017 at 10:22 AM, William Astle <lost at l-w.ca> wrote:
> >>>
> >>> On 2017-02-18 12:43 AM, msmcdoug wrote:
> >>>>
> >>>>> Actually I'm surprised no one has suggested BCD arithmetic on the
> >>>>> result to eliminate the divide by 10 loop
> >>>>>
> >>>>>
> >>>> BCD would certainly give a predictable overall cycle count. It would
> >>>> require a significantly different approach, though. The only register
> >>>> you can use for BCD arithmetic is A and DAA is only useful after ADDA
> >>>> or ADCA.
> >>>>
> >>>> I had thought about using BCD but had initially dismissed it due to
> >>>> possible complexity. However, upon reflection, the extra cycles to use
> >>>> BCD would probably be less than the average cycle time of the modulus
> >>>> loop combined or checking for digit overflow during the loop.
> >>>>
> >>>> I think you could use code that looks something like the following,
> >>>> which is based off Mr. Mikrut's most recent posted code. (warning:
> >>>> mailer code™ follows so it may have errors)
> >>>>
> >>>>         ORG $1200
> >>>> CCD     RMB 16
> >>>> RESULT  RMB 1
> >>>> START   LEAX CCD+16,PCR
> >>>>         CLRA
> >>>>         LDB #8
> >>>> LOOP    PSHS A
> >>>>         LDA ,-X
> >>>>         LSLA
> >>>>         CMPA #10
> >>>>         BLO LOOP2
> >>>>         SUBA #9
> >>>> LOOP2   ADDA ,S+
> >>>>         DAA
> >>>>         ADDA ,-X
> >>>>         DAA
> >>>>         DECB
> >>>>         BNE LOOP
> >>>>         ANDA #$0F
> >>>>         STA RESULT,PCR
> >>>> ENDPGM  RTS
> >>>>
> >>>> I'm using the stack for a temporary storage location instead of
> >>>> something PCR relative for code size reasons. You could use the RESULT
> >>>> variable for the temporary to eliminate stack usage. That would
> >>>> probably be slightly faster at the expense of two more code bytes.
> >>>> This is one of those size/speed trade-offs.
> >>>>
> >>>> DAA has to be used after every addition and only applies to A. Using
> >>>> BCD means we can eliminate the mod 10 loop and just mask off the upper
> >>>> digit (BCD stores two decimal digits in a byte). That gives a constant
> >>>> time for the "mod 10" result and also only takes 2 bytes (and 2 cycles).
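The ADDA / DAA / ANDA #$0F sequence described above can be modelled off the machine to see why the mask gives the "mod 10" digit. The following is a rough Python sketch, not a cycle-accurate emulation: it assumes every value added is already a single digit 0..9 (after any doubling and subtract-9 step), and the helper names `adda`, `daa` and `bcd_checksum_digit` are invented for this illustration. The correction rules in `daa` follow the usual description of the 6809 instruction.

```python
def adda(a, operand):
    """8-bit binary add; returns (result, half_carry, carry) like ADDA."""
    total = a + operand
    half_carry = ((a & 0x0F) + (operand & 0x0F)) > 0x0F
    return total & 0xFF, half_carry, total > 0xFF


def daa(a, half_carry, carry):
    """Decimal-adjust A after an add (the carry out is ignored here)."""
    low, high = a & 0x0F, a >> 4
    correction = 0
    if half_carry or low > 9:
        correction |= 0x06
    if carry or high > 9 or (high > 8 and low > 9):
        correction |= 0x60
    return (a + correction) & 0xFF


def bcd_checksum_digit(contributions):
    """Accumulate digits in packed BCD, then keep only the ones digit."""
    a = 0                                   # CLRA
    for value in contributions:
        a, h, c = adda(a, value)            # ADDA
        a = daa(a, h, c)                    # DAA
    return a & 0x0F                         # ANDA #$0F


contributions = [9, 8, 7, 6, 5, 4, 3, 2, 1, 9, 9, 9, 9, 9, 9, 9]
print(bcd_checksum_digit(contributions), sum(contributions) % 10)   # prints: 8 8
```

The example sum is 108, so the packed BCD value wraps past 99 and the hundreds digit is lost, but the low nibble still holds the ones digit, which is all the check needs.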
> >>>>
> >>>> I have also eliminated the STATUS variable and just store the result.
> >>>> You can test RESULT for non-zero trivially so there's no need for a
> >>>> separate STATUS value.
> >>>>
> >>>> By my calculation, this version is 32 bytes, requires 1 byte of stack
> >>>> space, 17 bytes of data space, and runs in a maximum of 351 cycles
> >>>> (and a minimum of 336 cycles if none of the doubled digits goes above
> >>>> 9). For this analysis, I've assumed 8 bit offsets for the PCR
> >>>> references. 16 bit offsets in PCR mode are quite a bit more expensive
> >>>> (4 extra cycles and 1 extra byte).
> >>>>
> >>>>
> >>>> --
> >>>> Coco mailing list
> >>>> Coco at maltedmedia.com
> >>>> https://pairlist5.pair.net/mailman/listinfo/coco

