[Coco] Mod10 Suggestions
Ron Klein
ron at kdomain.org
Sat Feb 18 20:14:02 EST 2017
I do not know much of anything in assembly, but this exchange of
information between all involved was fascinating. What a learning
experience!
Thank you all for that!
-Ron
On Sat, Feb 18, 2017 at 5:06 PM, William Mikrut <wmikrut72 at gmail.com> wrote:
> A slight reordering of the code and it works perfectly!
> 48 bytes total, less 17 for storage -- 31 program bytes to get the job
> done.
>
> My original code was 61 program bytes... down to half the size and does the
> exact same thing.
> Absolutely amazing!
>
>
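>         ORG    $1200
> CCD     RMB    16              ; 16 digits, one per byte
> RESULT  RMB    1               ; low BCD digit of the sum (0 = check passes)
>
> START   LEAX   CCD+16,PCR      ; X points one past the last digit
>         CLRA                   ; running BCD sum starts at 0
>         LDB    #8              ; 8 pairs of digits
>
> LOOP    ADDA   ,-X             ; add the digit that is not doubled
>         DAA                    ; keep the running sum in BCD
>         PSHS   A               ; save the running sum
>         LDA    ,-X             ; fetch the digit to be doubled
>         LSLA                   ; double it
>         CMPA   #10
>         BLO    LOOP2           ; below 10: use it as is
>         SUBA   #9              ; 10 or more: subtract 9
> LOOP2   ADDA   ,S+             ; add the saved sum back in
>         DAA
>
>         DECB
>         BNE    LOOP
>
>         ANDA   #$0F            ; low BCD digit is the sum mod 10
>         STA    RESULT,PCR
> ENDPGM  RTS
>         END    START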
>
> On Sat, Feb 18, 2017 at 1:03 PM, William Mikrut <wmikrut72 at gmail.com>
> wrote:
>
> > You are right -- I looked at it more closely.
> > One thing I need to do is reverse the order of operations.
> >
> > The LSLA is performed first.
> > First I need to store the byte and LSLA the next byte.
> >
> > Otherwise if I flip it from left to right:
> > (LEAX CCD,PCR
> > ...
> > LDA ,X+
> > ...
> > ADDA ,X+)
> >
> > it works perfectly.
> >
> >
> > On Sat, Feb 18, 2017 at 11:35 AM, William Astle <lost at l-w.ca> wrote:
> >
> >> Take a closer look. It only does the LSLA on every other digit. It does
> >> *two* digits per loop, just like Brett's version.
> >>
> >> You can easily pretend all numbers are 16 digits by right justifying the
> >> numbers in your buffer and padding with zeros.
> >>
> >>
> >> On 2017-02-18 10:06 AM, William Mikrut wrote:
> >>
> >>> I like how this works from right to left.
> >>> The only issue is the LSLA on every number.
> >>>
> >>> The algo is to double every other number, starting with the right most
> >>> digit, and sub 9 if the result is 10 or more.
> >>>
> >>> Now if the number is always 16 digits, Brett's 16 bit word seems the
> >>> easiest way to go.
> >>> If the number is 13 digits long the 16 bit word method won't work, but I
> >>> am happy to pretend all numbers are 16 digits!
> >>>
> >>> I am going to try to include a couple things you showed me into Brett's
> >>> 16 bit chunk method and try a slightly different routine!
> >>>
> >>>
> >>> On Sat, Feb 18, 2017 at 10:22 AM, William Astle <lost at l-w.ca> wrote:
> >>>
> >>>> On 2017-02-18 12:43 AM, msmcdoug wrote:
> >>>>
> >>>>> Actually I'm surprised no one has suggested BCD arithmetic on the
> >>>>> result to eliminate the divide-by-10 loop.
> >>>>>
> >>>>>
> >>>> BCD would certainly give a predictable overall cycle count. It would
> >>>> require a significantly different approach, though. The only register
> >>>> you can use for BCD arithmetic is A, and DAA is only useful after ADDA
> >>>> or ADCA.
> >>>>
> >>>> I had thought about using BCD but had initially dismissed it due to
> >>>> possible complexity. However, upon reflection, the extra cycles to use
> >>>> BCD would probably be less than the average cycle time of the modulus
> >>>> loop, or of checking for digit overflow during the loop.
> >>>>
> >>>> I think you could use code that looks something like the following,
> >>>> which is based off Mr. Mikrut's most recent posted code. (Warning: mailer
> >>>> code™ follows, so it may have errors.)
> >>>>
> >>>>         ORG    $1200
> >>>> CCD     RMB    16
> >>>> RESULT  RMB    1
> >>>> START   LEAX   CCD+16,PCR
> >>>>         CLRA
> >>>>         LDB    #8
> >>>> LOOP    PSHS   A
> >>>>         LDA    ,-X
> >>>>         LSLA
> >>>>         CMPA   #10
> >>>>         BLO    LOOP2
> >>>>         SUBA   #9
> >>>> LOOP2   ADDA   ,S+
> >>>>         DAA
> >>>>         ADDA   ,-X
> >>>>         DAA
> >>>>         DECB
> >>>>         BNE    LOOP
> >>>>         ANDA   #$0F
> >>>>         STA    RESULT,PCR
> >>>> ENDPGM  RTS
> >>>>
> >>>> I'm using the stack for a temporary storage location instead of
> >>>> something PCR relative for code size reasons. You could use the "RESULT"
> >>>> variable for the temporary to eliminate stack usage. That would probably
> >>>> be slightly faster at the expense of two more code bytes. This is one of
> >>>> those size/speed trade-offs.
> >>>>
> >>>> DAA has to be used after every addition and only applies to A. Using
> >>>> BCD means we can eliminate the mod 10 loop and just mask off the upper
> >>>> digit (BCD stores two decimal digits in a byte). That gives a constant
> >>>> time for the "mod 10" result and also only takes 2 bytes (and 2 cycles).
> >>>>
> >>>> I have also eliminated the STATUS variable and just store the result.
> >>>> You can test RESULT for non-zero trivially so there's no need for a
> >>>> separate STATUS value.
> >>>>
> >>>> By my calculation, this version is 32 bytes, requires 1 byte of stack
> >>>> space, 17 bytes of data space, and runs in a maximum of 351 cycles (and
> >>>> a minimum of 336 cycles if none of the doubled digits goes above 9). For
> >>>> this analysis, I've assumed 8 bit offsets for the PCR references. 16 bit
> >>>> offsets in PCR mode are quite a bit more expensive (4 extra cycles and 1
> >>>> extra byte).
> >>>>
> >>>>
> >>>> --
> >>>> Coco mailing list
> >>>> Coco at maltedmedia.com
> >>>> https://pairlist5.pair.net/mailman/listinfo/coco
> >>>>
> >>>>
> >>>
> >>
> >>
> >
> >
>
>
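For readers who, like me, do not read 6809 assembly easily, here is a rough Python model of the final routine quoted above. It is only a sketch under stated assumptions, not anyone's actual code from the thread: it assumes each byte of CCD holds one digit as a plain value 0-9, and that shorter numbers are right-justified and zero-padded to 16 digits as suggested in the thread. The function names (daa_after_add, mod10_bcd, luhn_reference) are made up for illustration. daa_after_add imitates ADDA followed by DAA, and luhn_reference is an ordinary non-BCD version included only for comparison.

```python
def daa_after_add(a, addend):
    """Model ADDA followed by DAA on an 8-bit accumulator holding packed BCD.

    Any carry out of the upper digit is dropped, as it is in the routine.
    """
    raw = a + addend
    half_carry = (a & 0x0F) + (addend & 0x0F) > 0x0F
    carry = raw > 0xFF
    result = raw & 0xFF
    low, high = result & 0x0F, result >> 4
    correction = 0
    if half_carry or low > 9:
        correction |= 0x06
    if carry or high > 9 or (high > 8 and low > 9):
        correction |= 0x60
    return (result + correction) & 0xFF


def mod10_bcd(digits):
    """Walk the buffer right to left, two digits per pass, like the routine."""
    digits = list(digits)
    if len(digits) > 16 or any(not 0 <= d <= 9 for d in digits):
        raise ValueError("expected at most 16 digits, each 0-9")
    buf = [0] * (16 - len(digits)) + digits  # right-justify, pad with zeros
    a = 0                                    # CLRA
    x = 16                                   # LEAX CCD+16,PCR
    for _ in range(8):                       # LDB #8 ... DECB / BNE LOOP
        x -= 1
        a = daa_after_add(a, buf[x])         # ADDA ,-X / DAA
        saved = a                            # PSHS A
        x -= 1
        a = buf[x] << 1                      # LDA ,-X / LSLA
        if a >= 10:                          # CMPA #10 / BLO LOOP2
            a -= 9                           # SUBA #9
        a = daa_after_add(a, saved)          # ADDA ,S+ / DAA
    return a & 0x0F                          # ANDA #$0F


def luhn_reference(digits):
    """Plain mod 10 sum without BCD, used only to cross-check the model."""
    total = 0
    for i, d in enumerate(reversed(list(digits))):
        if i % 2 == 1:       # every second digit, counting from the right
            d *= 2
            if d >= 10:
                d -= 9
        total += d
    return total % 10


if __name__ == "__main__":
    number = "4111111111111111"   # a commonly used test number
    print(mod10_bcd(int(c) for c in number))
```

A result of 0 means the number passes the check. Masking with $0F is enough because BCD keeps the ones digit of the running sum in the low nibble, even once the sum has gone past 99 and the hundreds carry has been lost.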