[Coco] questions about constants

Wayne Campbell asa.rand at yahoo.com
Mon Sep 14 16:27:53 EDT 2009


I think I need to better establish what I already do understand. Having written DCom, and currently writing its replacement (unpack), I already understand that the compiled code contains no direct reference to declarations.

In Basic09 I-Code, there are no TYPE, DIM or PARAM statements. These statements have to be determined by examining the variable references and creating a map of the memory allocated to all of the data. In I-Code, there are two tables that help with this. They are called the Symbol Table (I call it the Variable Declaration Table, or VDT), and the description area (I call it the Data Storage Allocation Table or DSAT).

With these tables, I can reconstruct all of the TYPE, DIM and PARAM statements. Since Basic09 does not store a reference to an unused variable (or field, in a record), there are usually "gaps" in the memory map. From these gaps, I can determine something of what *might have* occurred there. For example, if it is a 2-byte gap, I can assume an INTEGER, even though it could also be 2 BYTEs, 2 BOOLEANs, or 1 of each.
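To make the gap reasoning concrete, here is a minimal Python sketch (not code from DCom or unpack). It assumes the referenced variables are already known as (offset, size) pairs; the function names find_gaps and candidate_fillings are made up for illustration. Only the fixed-size atomic types are considered (BYTE and BOOLEAN at 1 byte, INTEGER at 2, REAL at 5); strings and records are left out because their sizes vary.

```python
# Sizes in bytes of the fixed-size Basic09 atomic types.
# STRING and record types are omitted because their size varies.
TYPE_SIZES = {"BYTE": 1, "BOOLEAN": 1, "INTEGER": 2, "REAL": 5}


def find_gaps(allocations, total_size):
    """Return (offset, length) for each unreferenced run in the data area.

    allocations: iterable of (offset, size) pairs for referenced variables.
    total_size:  size of the whole data area in bytes.
    """
    gaps = []
    cursor = 0
    for offset, size in sorted(allocations):
        if offset > cursor:
            gaps.append((cursor, offset - cursor))
        cursor = max(cursor, offset + size)
    if cursor < total_size:
        gaps.append((cursor, total_size - cursor))
    return gaps


def candidate_fillings(length):
    """List every combination of atomic types whose sizes sum to length."""
    names = sorted(TYPE_SIZES)
    results = []

    def extend(start, remaining, chosen):
        if remaining == 0:
            results.append(tuple(chosen))
            return
        # Only look at names from 'start' onward so each combination
        # is produced once, regardless of order.
        for i in range(start, len(names)):
            size = TYPE_SIZES[names[i]]
            if size <= remaining:
                extend(i, remaining - size, chosen + [names[i]])

    extend(0, length, [])
    return results
```

For a 2-byte gap, candidate_fillings(2) yields the same four possibilities listed above: one INTEGER, two BYTEs, two BOOLEANs, or one BYTE plus one BOOLEAN.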

Likewise, I understand that all of the rmb statements at the top of the disasm output are reconstructed variables, based on the memory location allocated, and its size. In order to understand what all those generic uxxxx labels are, I am trying to match them up to the header file.

From my understanding of data memory allocation, the header rmb entries should match the disasm rmb entries, if the object code being decompiled is the same version and build as the header file I am looking at. The problem comes in when the uxxxx label is being allocated 2 bytes, where the header shows 2 separate entries of 1 byte each.

The entries in the header, from the beginning, define the tokens used in Basic09 to represent the various keywords. Each location corresponds to the token value of that keyword. Example:


org 0

T.GLOB rmb 1 Global (reserved) - set to $00, at location $00
T.PRAM rmb 1 Param - set to $01, at location $01
T.TYPE rmb 1 Type - set to $02, at location $02
T.DIM  rmb 1 Dim - set to $03, at location $03

The disasm output starts out correctly, but begins to be different at u001C:

u001C    rmb   2

The header, in that same position, shows:

T.EEXT rmb 1 Endexit
T.ON   rmb 1 On

There are other sections of the header that start with org 0, showing that the following section begins allocation at a relative offset of 0. None of the ones I have compared comes anywhere near matching the disasm output.
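The kind of comparison described above can be done mechanically. The following Python sketch is only an illustration, assuming each listing has already been reduced to an ordered list of (label, size) rmb entries; layout and first_divergence are hypothetical helper names. The sample data covers only the two header entries and the one disasm entry quoted above, starting at offset $1C.

```python
def layout(entries, origin=0):
    """Turn ordered (label, size) rmb entries into (offset, size, label) rows."""
    rows = []
    counter = origin
    for label, size in entries:
        rows.append((counter, size, label))
        counter += size
    return rows


def first_divergence(header, disasm, origin=0):
    """Return the first pair of rows whose offset or size differ, else None."""
    for h_row, d_row in zip(layout(header, origin), layout(disasm, origin)):
        if h_row[:2] != d_row[:2]:
            return h_row, d_row
    return None


# Entries taken from the two listings quoted above, both starting at $1C.
header = [("T.EEXT", 1), ("T.ON", 1)]
disasm = [("u001C", 2)]
mismatch = first_divergence(header, disasm, origin=0x1C)
```

Here mismatch pairs T.EEXT (1 byte at $1C) with u001C (2 bytes at $1C). The sketch only reports where the sizes stop agreeing; it says nothing about whether the two listings describe the same memory in the first place.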

In order to understand what Basic09 is doing with the data, I have to know what the data is. I can only keep playing with it until I get that figured out.

As far as understanding the assembly instructions is concerned, I can tell you exactly (well, almost) what each instruction is doing. What I can't tell you is what any specific grouping of instructions is doing. I can't tell the difference between code that is performing a calculation and acting on the result, and code that is writing output to a data buffer.

The Motorola manual is a big help, as are the OS-9 development system manuals (L1 & L2). By being able to refer to all 3, I am better able to sort out what the instructions are doing. What I sorely lack are examples of commented code that show me what different kinds of code look like, like "this is a loop in assembly", or "this is a subroutine in assembly, and how you call it". The samples don't go very far, and the Assist09 ROM listing in the Motorola manual is so long I could study it for 10 years and come no closer to understanding it. The tutorials online are either so generic it's difficult to translate, or so specific to the processor being discussed that it's impossible to translate.

I'm guessing that, before it's all over, I will have to try to write some assembly code and go through the debugging process to get what's going on.

Wayne




________________________________
From: William Astle <lost at l-w.ca>
To: CoCoList for Color Computer Enthusiasts <coco at maltedmedia.com>
Sent: Sunday, September 13, 2009 4:47:57 PM
Subject: Re: [Coco] questions about constants

Wayne Campbell wrote:
> If it is read sequentially at the beginning, then I need to understand why the order of the rmb's at the beginning of the decompile are in a different order from the header file.

I think you're a bit confused about what a header file actually is. A header file is merely a sequence of definitions which DOES NOT generate any output whatsoever. Usually, a header file simply defines constants and possibly memory locations for use by the program including the file.

All files are read sequentially from start to finish by the assembler. The contents of a header file should not ever appear in the assembled binary. If a header causes anything to appear in the output, it is not, technically, a header file.

For instance, a header file might look something like so:

    ORG 0
TOKEN1    RMB 1
TOKEN2    RMB 2
TOKEN3    RMB 3

and so on. All that does is define TOKEN1 as 0, TOKEN2 as 1, and TOKEN3 as 3 (each label takes the current value of the location counter, which then advances by the RMB size). None of that will appear in the output file, however.
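As a rough illustration of how an assembler treats such a file, here is a small Python sketch of a location counter. It is not how any particular assembler is implemented; assign_symbols is a made-up name, and the sketch assumes only ORG and RMB lines with decimal operands. With the sizes exactly as written in the example (1, 2, 3), TOKEN3 ends up at 3, because TOKEN2 reserves two bytes.

```python
def assign_symbols(source):
    """Evaluate ORG and RMB lines and return {label: value}.

    Nothing is emitted: RMB only advances the location counter.
    Operands are assumed to be plain decimal numbers.
    """
    symbols = {}
    counter = 0
    for line in source.splitlines():
        fields = line.split()
        if not fields:
            continue
        # A line that starts with whitespace has no label field.
        label = None if line[0].isspace() else fields.pop(0)
        if len(fields) < 2:
            continue
        op, operand = fields[0].upper(), fields[1]
        if op == "ORG":
            counter = int(operand)
        elif op == "RMB":
            if label:
                symbols[label] = counter
            counter += int(operand)
    return symbols


HEADER = """\
    ORG 0
TOKEN1    RMB 1
TOKEN2    RMB 2
TOKEN3    RMB 3
"""

symbols = assign_symbols(HEADER)
```

The result is just a table of names and numbers (TOKEN1 = 0, TOKEN2 = 1, TOKEN3 = 3), which is why none of it shows up in the assembled binary.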

-- William Astle
lost at l-w.ca

