[Coco] lbsr and rts

Wayne Campbell asa.rand at yahoo.com
Thu Sep 17 17:49:00 EDT 2009


You wrote:
> Quite frankly, I don't see the point of doing that unless the algorithm
> is using some sort of self-modifying code. Since OS9 doesn't like
> self-modifying code, the only way to do that is to copy the code you
> want to modify into the process's data space and modify it there. Even
> then, it seems like the whole point of that sequence is to obfuscate
> the code.

If obfuscating the code was the intent, they did a good job.

I can actually follow what you're saying here, so I guess I understand things a little better than I thought I did. I have noticed that there are quite a few places in the code that seem to do something similar. Many labels are used just once in the code, and they point to some series of bytes in the fcb's. Many are not lbsr's, though. Using the header file as a guide, I've been able to identify some labels with more sensible names. From that, I can tell where the code deals with things like identifying a token for a keyword, or identifying a prompt (Basic09 has 3: B:, E:, and D:). I've been comparing instructions in the disasm with the actual code in the bin, and so far every instruction code is right on. There are still things I don't get, like why, in some cases, LDD is followed by a single byte, but in other cases it is followed by a byte and an integer. The syntax of the instruction doesn't give me any clues yet. (Lack of understanding on my part.)
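On the LDD question, one likely explanation (an assumption, since the lines in question are not shown here) is that the 6809 has a separate LDD opcode for each addressing mode, and each one takes a different number of bytes after the opcode. A minimal Python sketch of the instruction lengths follows; the names LDD_FORMS, INDEXED_EXTRA, and ldd_length are made up for illustration and do not come from disasm:

```python
# LDD opcodes on the 6809: opcode -> (addressing mode, bytes that always follow).
LDD_FORMS = {
    0xCC: ("immediate", 2),  # LDD #$1234   : 16-bit value
    0xDC: ("direct", 1),     # LDD <$12     : 8-bit direct-page address
    0xEC: ("indexed", 1),    # LDD ,X etc.  : postbyte, maybe more (see below)
    0xFC: ("extended", 2),   # LDD $1234    : 16-bit address
}

# Indexed mode: when bit 7 of the postbyte is set, the low four bits can
# ask for an extra 8-bit or 16-bit offset after the postbyte.
INDEXED_EXTRA = {
    0x8: 1,  # 8-bit offset from a register
    0x9: 2,  # 16-bit offset from a register
    0xC: 1,  # 8-bit offset from PC
    0xD: 2,  # 16-bit offset from PC
    0xF: 2,  # extended indirect
}


def ldd_length(opcode, postbyte=None):
    """Total length in bytes of an LDD instruction (opcode included)."""
    mode, operand_bytes = LDD_FORMS[opcode]
    length = 1 + operand_bytes
    if mode == "indexed" and postbyte & 0x80:
        length += INDEXED_EXTRA.get(postbyte & 0x0F, 0)
    return length
```

Under this reading, an LDD followed by a single byte would be the direct form or an indexed form with only a postbyte, while "a byte and an integer" would match an indexed form whose postbyte calls for a 16-bit offset, for example ldd_length(0xEC, 0x89).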

Thanks for helping me understand this better, William. :)

Wayne




________________________________
From: William Astle <lost at l-w.ca>
To: CoCoList for Color Computer Enthusiasts <coco at maltedmedia.com>
Sent: Wednesday, September 16, 2009 5:06:29 PM
Subject: Re: [Coco] lbsr and rts

Since L010A and L010D both turn into JSR <$1E with a single byte after it, and L013A has JSR <$2A followed by something that's probably data, and assuming that L08F8 is not the disassembler out of sync (seems unlikely given the sequence of code), here's what I think is happening.
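To show how the fcb bytes read as instructions, here is a minimal Python sketch. It is not a real disassembler: it only knows the direct and extended JSR opcodes, and decode_jsr is a made-up helper name.

```python
# The two 6809 JSR opcodes used here: opcode -> (operand prefix, operand size in bytes).
JSR_OPCODES = {
    0x9D: ("<", 1),  # JSR direct page: one address byte
    0xBD: ("", 2),   # JSR extended: two address bytes
}


def decode_jsr(data):
    """Return (instruction text, list of bytes left over after the JSR)."""
    prefix, size = JSR_OPCODES[data[0]]
    operand = int.from_bytes(bytes(data[1:1 + size]), "big")
    width = size * 2
    text = f"JSR {prefix}${operand:0{width}X}"
    return text, list(data[1 + size:])


# Bytes taken from the quoted listing.
l010a = decode_jsr([0x9D, 0x1E, 0x04])
l013a = decode_jsr([0x9D, 0x2A, 0x00, 0x00, 0x72, 0x02])
```

Here l010a comes out as the text "JSR <$1E" with one leftover byte ($04), and l013a as "JSR <$2A" with four leftover bytes, which is the "something that's probably data" mentioned above.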

I suspect that at process startup, the direct page is being loaded with a couple of subroutines, one starting at $1E and one at $2A. I further suspect that those subroutines examine the byte following the JSR call and probably remove the return address from the stack. Since I don't have the code, I'll contrive an example without all the direct page stuff (the mechanics are the same whether it's a JSR to the direct page or a BSR, JSR, or LBSR somewhere else):

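* SUBR pulls its own return address, which points at the inline data byte.
SUBR    PULS X
        LDA  ,X
* do stuff based on value in A
* With that return address gone, RTS returns to the caller's caller.
        RTS

DO1     JSR  SUBR
        FCB  1
DO2     JSR  SUBR
        FCB  2

MAIN    BSR  DO1
        BSR  DO2
* ...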


Let's trace it. MAIN calls DO1. This puts the address of the second BSR on the stack. DO1 calls SUBR, which puts the address of the FCB 1 on the stack. SUBR then pulls that most recent return address off the stack into X, reads the byte it points to, does some stuff, and returns... to the previous return address. So even though you have a JSR at DO1, the RTS in SUBR returns to the "BSR DO2" instruction in MAIN.
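The same trace can be played out in a few lines of Python. This is only a toy model of the stack behaviour in the contrived example: the addresses are invented, the JSR is assumed to be the 3-byte extended form, and call_with_inline_byte is a hypothetical name.

```python
def call_with_inline_byte(return_to_main, jsr_address, memory):
    """Model MAIN -> DO1 -> SUBR and report where SUBR's RTS lands."""
    stack = []
    # MAIN: BSR DO1 pushes the address of the instruction after the BSR.
    stack.append(return_to_main)
    # DO1: JSR SUBR pushes the address just past the JSR, i.e. the FCB byte.
    stack.append(jsr_address + 3)
    # SUBR: PULS X takes that return address off the stack.
    x = stack.pop()
    # SUBR: LDA ,X reads the inline byte it points at.
    a = memory[x]
    # SUBR: RTS pops what is left on top, which is MAIN's return address.
    pc = stack.pop()
    return a, pc, stack


# Invented layout: DO1 sits at $1000, so its FCB 1 is at $1003;
# the "BSR DO2" in MAIN is at $2002.
memory = {0x1003: 1}
a, pc, stack = call_with_inline_byte(0x2002, 0x1000, memory)
```

With that layout, a holds the inline byte 1, pc is $2002 (the "BSR DO2" in MAIN), and the stack is empty again, so nothing ever returns into DO1.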

Quite frankly, I don't see the point of doing that unless the algorithm is using some sort of self-modifying code. Since OS9 doesn't like self-modifying code, the only way to do that is to copy the code you want to modify into the process's data space and modify it there. Even then, it seems like the whole point of that sequence is to obfuscate the code.

I've seen similar tricks used with SWI, SWI2, and SWI3. I've even written some really ugly code using even stranger tricks.

Wayne Campbell wrote:
> I may be missing something, but I thought calls to subroutines required a rts instruction. In Basic09, there has to be a RETURN statement for any given GOSUB, but not necessarily one-for-one. I can, for example, have a series of subroutine calls, where there are multiple entry points for the subroutine, and one RETURN statement, like:
> 
> code
> GOSUB 20
> more code
> GOSUB 30
> more code
> END
> 
> 20 code
> more code
> 30 even more code
> still more code
> RETURN
> 
> It will not generate any errors, and as long as the code is written correctly, it will work.
> 
> I have found, in the assembly code generated by disasm, that there are labels in the declarations (all those fcb's) that are being addressed in the code using lbsr op codes. Yet, there is no rts instruction anywhere in those fcb's. Can someone explain how this works? An example is:
> 
> * The L010A label is addressed twice in the program code.
> L010A    fcb   $9D
>          fcb   $1E
>          fcb   $04
> * The L010D label is addressed ten times in the program code.
> L010D    fcb   $9D
>          fcb   $1E
>          fcb   $02
> 
> this series continues until:
> 
> L013A    fcb   $9D
>          fcb   $2A *
>          fcb   $00
>          fcb   $00
>          fcb   $72 r
>          fcb   $02
> 
> In the code, the following statements address these labels:
> 
> Both occurrences of the L010A label, and one of the L010D labels, occur in this routine:
> L08F8    ldx   $02,s
>          lda   #$80
>          lbsr  L010A
>          bne   L090F
>          lbsr  L010D
>          beq   L0915
>          leax  $03,x
>          lda   #$20
>          lbsr  L010A
>          beq   L0915
> 
> I just don't understand.
> 
> Wayne
> 
> 
> 
>      
> --
> Coco mailing list
> Coco at maltedmedia.com
> http://five.pairlist.net/mailman/listinfo/coco
> 


-- William Astle
lost at l-w.ca

