[Coco] BASIC irony
John R. Hogerhuis
jhoger at pobox.com
Thu Aug 12 20:45:08 EDT 2004
On Thu, 2004-08-12 at 17:11, tim lindner wrote:
> John R. Hogerhuis <jhoger at pobox.com> wrote:
>
> > Have you started on the tokenizer?
>
> Yeah, and it isn't pretty nor is it working yet! :)
>
> > I haven't written that portion yet. I
> > guess an inefficient algorithm is simple, for each character, loop
> > through the token table and compare characters; if you get a match then put
> > the token instead of just text. Otherwise just put the character as-is.
> >
> > The exceptions are of course for :else and ' and ignore-to-EOLN handling
> > for REM.
>
> Heh, I forgot about the REM ignore to end of line. But I did remember
> the ignore within quotes and the ignore after DATA until a colon.
>
Yep! I didn't think of DATA and quoted strings.
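
For reference, the simple loop-through-the-table approach, with the quote, DATA, and REM exceptions mentioned above, could be sketched roughly as follows. This is only an illustration: the token byte values are made-up placeholders rather than the real token table, the input is assumed to be upper-case ASCII with the line number already stripped, and the ' and :ELSE special cases are left out.

```python
# Hypothetical token table: keyword -> token byte. The byte values are
# placeholders for illustration, not the real BASIC token assignments.
TOKENS = {"PRINT": 0x80, "POS": 0x81, "PSET": 0x82, "REM": 0x83,
          "DATA": 0x84, "GOTO": 0x85}


def tokenize_naive(line: str) -> bytes:
    """Crunch one line of BASIC text (line number already removed)."""
    out = bytearray()
    i = 0
    in_quotes = False   # inside "...": copy text as-is
    in_data = False     # after DATA: copy text as-is until a colon
    while i < len(line):
        ch = line[i]
        if ch == '"':
            in_quotes = not in_quotes
        elif in_data and not in_quotes and ch == ":":
            in_data = False
        if not in_quotes and not in_data:
            # Inefficient but simple: try every keyword at this position.
            for word, tok in TOKENS.items():
                if line.startswith(word, i):
                    out.append(tok)
                    i += len(word)
                    if word == "REM":
                        # Ignore-to-end-of-line: copy the rest untouched.
                        out += line[i:].encode("ascii")
                        return bytes(out)
                    if word == "DATA":
                        in_data = True
                    break
            else:
                # No keyword starts here: keep the character as-is.
                out.append(ord(ch))
                i += 1
            continue
        # Inside quotes or DATA: keep the character as-is.
        out.append(ord(ch))
        i += 1
    return bytes(out)
```

With this placeholder table, a keyword inside a quoted string or after REM is copied through as plain text, while a keyword after the colon that ends a DATA statement is tokenized again.
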
Any algorithm will probably be good enough on a PC. If you wanted
efficiency, you would go for a state machine that can recognize BASIC
keywords, and possibly handle stuff like quotes, REM, and DATA. The
basic idea is that as you process character by character, you should be
able to rule out a lot of keywords. If you hit a P, you know you're down
to PRINT, POS, PSET, etc. Then you get another character 'R' and you're
down to PRINT (plus PRESET, if the dialect has it). At this point you
could keep processing character by character until you get a T, or you
could short circuit the search and do a memcmp to see if it is PRINT or
not, write out the token, and skip to the character after T.
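
One way to picture the narrowing is a prefix tree (trie) built from the keyword list, where each character read moves one level down and shrinks the set of keywords still possible. The sketch below is an assumption about how that could look, not a finished tokenizer: the keyword table is a small made-up subset, the token bytes are placeholders, and the function names are invented for illustration.

```python
# Hypothetical keyword table; token bytes are placeholders, not real values.
KEYWORDS = {"PRINT": 0x80, "POS": 0x81, "PSET": 0x82, "GOTO": 0x83}


def build_trie(keywords):
    """Nested dicts: one level per character; key None marks a full keyword."""
    root = {}
    for word, tok in keywords.items():
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[None] = tok
    return root


def candidates(node, prefix=""):
    """List every keyword still reachable from this trie node."""
    found = []
    for ch, child in node.items():
        if ch is None:
            found.append(prefix)
        else:
            found.extend(candidates(child, prefix + ch))
    return sorted(found)


def match_keyword(trie, text, start):
    """Return (token, end) for the longest keyword at text[start:], else None."""
    node, best = trie, None
    for i in range(start, len(text)):
        node = node.get(text[i])
        if node is None:
            break               # no keyword continues with this character
        if None in node:
            best = (node[None], i + 1)
    return best
```

Here candidates(trie["P"], "P") lists the keywords still possible after a P, and match_keyword returns the token plus the index of the character after the keyword, which is where scanning would resume.
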
The hard part is coming up with the state machine. Once you have that,
the coding should be very straightforward. The good thing about a state
machine is that you can make a state table out of it that forces you to
consider all special cases. That way you don't miss some oddball case
that would be hard to detect with the average BASIC program.
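
As a rough illustration of the state-table idea, the sketch below covers only the mode switching (code, quoted string, REM, DATA), not keyword recognition itself. The state names, the event classes, and the extra state for a quoted string inside DATA are all assumptions made for the example. The point is that every state has an entry for every event, so no combination is left unconsidered.

```python
# Hypothetical states. Keywords would only be tokenized in CODE; every
# other state copies characters through unchanged.
CODE, QUOTE, REM, DATA, DATA_QUOTE = "CODE", "QUOTE", "REM", "DATA", "DATA_QUOTE"

# Events that can change the state; everything else is "other".
EVENTS = ('"', ":", "REM", "DATA", "other")

# STATE_TABLE[state][event] -> next state. Every cell is filled in.
STATE_TABLE = {
    CODE:       {'"': QUOTE,      ":": CODE,       "REM": REM,
                 "DATA": DATA,       "other": CODE},
    QUOTE:      {'"': CODE,       ":": QUOTE,      "REM": QUOTE,
                 "DATA": QUOTE,      "other": QUOTE},
    REM:        {'"': REM,        ":": REM,        "REM": REM,
                 "DATA": REM,        "other": REM},
    DATA:       {'"': DATA_QUOTE, ":": CODE,       "REM": DATA,
                 "DATA": DATA,       "other": DATA},
    DATA_QUOTE: {'"': DATA,       ":": DATA_QUOTE, "REM": DATA_QUOTE,
                 "DATA": DATA_QUOTE, "other": DATA_QUOTE},
}


def next_event(line, i):
    """Classify the text at position i; return (event, length consumed)."""
    for word in ("REM", "DATA"):
        if line.startswith(word, i):
            return word, len(word)
    ch = line[i]
    return (ch if ch in '":' else "other"), 1


def final_state(line):
    """Run one program line through the table and return the ending state."""
    state, i = CODE, 0
    while i < len(line):
        event, length = next_event(line, i)
        state = STATE_TABLE[state][event]
        i += length
    return state
```

A real tokenizer would attach an action (emit token, copy character) to each cell as well as a next state; the table here only tracks which mode the scanner is in.
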
I don't know which I am going to do yet.
> > Other strangeness in LIST comes with hidden high-numbered lines.
>
> What are you talking about?
Here's what I was told by Ron Wiesen:
John R. Hogerhuis asks:
> What is this about 'reserved line numbers'?
Here's the timeline. "Let there be light" said He. And lo, the Big Bang
happened. Fast forward several eons to the middle part of the 20th century
and we hear the words: "Let there be bit". And lo, the Primordial Bit,
mother of all bits, is born of the Big Bit Bang. Slow forward to the late
part of the 20th century and at Dartmouth College we hear the words: "Let
there be BASIC". And lo, Dartmouth BASIC 1.0, mother of all BASICs, is born
of the Big BASIC Bang. Apparently, you became acquainted with BASIC during
the very latest part of the 20th century, John. By that time the Big BASIC
Bang was well under way.
Reserved line number range is 6553x, per Dartmouth BASIC 1.0. The practical
"16-bit" range is 65530 through 65535. There's a bunch of "Thou shalt not"
commandments that apply to reserved line numbers. I've heard that these
commandments were written in a stone tablet by the Primordial Bit herself,
and that this tablet is kept somewhere on the campus of Dartmouth College.
I might be wrong about where it's kept. Anyhow, below are two of the
commandments.
#1 Thou shalt not place reserved line numbers in textual equivalent files.
The penalty for breaking this commandment is paid during tokenization to the
BASIC program construct. Textual line number 65530, for example, suffers
"separation" where it becomes line 6553 0. Try it for yourself if you dare
to violate a commandment given by the Primordial Bit -- I opt to believe and
merely obey.
#2 Thou shalt not receive any textual result, listing result, or other
visible result from the presence of reserved line numbers in BASIC file
construct. Reserved line numbers in BASIC file construct are sacred; they
are not for mortal eyes to ever see. By practicing "black art" you can
"transmute" an ordinary line number to become a sacred reserved line
number. Then try to LIST, SAVE,A and so on and you'll find no visible
evidence of its sacred existence.
Keeper of the Primordial Bit -= Ron =-
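
The "separation" described in commandment #1 can be reproduced with a simple digit-accumulation rule. The sketch below is purely a guess at one mechanism that gives the same result: nothing above says how the interpreter actually parses line numbers, and both the rule (refuse another digit once the running value exceeds a limit) and the limit of 6552 are assumptions chosen so that 65530 splits into 6553 followed by 0.

```python
def split_line_number(text, limit=6552):
    """Accumulate leading digits; refuse another digit once value > limit.

    Hypothetical rule, chosen only because it reproduces the reported
    split of 65530 into line 6553 followed by the text "0".
    """
    value, i = 0, 0
    while i < len(text) and text[i].isdigit() and value <= limit:
        value = value * 10 + int(text[i])
        i += 1
    return value, text[i:]
```

Under this assumed rule, "65530 PRINT" comes back as line 6553 with "0 PRINT" left over as program text, while "65529 PRINT" is still read as a single line number.
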