---------------------------------------------------------------------------
                          Inform: An Apology

                    pertaining to the second release
---------------------------------------------------------------------------


            "Look on my works, ye mighty, and despair..."


  Hello, Informer!

  Inform is an assembler for Infocom version-3 format story files.  It has
some of the trappings of a compiler, though its code is still haphazard
in some places.  It reports errors strangely at times, and in particular its
expression evaluator has a few eccentric mannerisms.  Some features one
might expect from a compiler are flagrantly missing.  Worse yet, much of
the source code is still written in a naive and unsystematic fashion.

  On the bright side, it works most of the time, and runs in only two
passes.  (This may sound easy but is not, because the story file format
requires all manner of tricky operations to be done: for example, the
dictionary must be alphabetically sorted, and the code must know absolute
addresses of its entries... and the address of the start of the dictionary
depends on many other things not known during pass 1... and so on.)  It
produces "Curses", the author's game, correctly.  This is a fairly
strenuous test since the game is about 123K long and pushes most of the
version-3 format to the limits.

  In Appendix A is a complete specification of the version-3 "Z-machine",
and some details of how to use Inform as an assembler instead of a compiler. 
Some of this information is already circulating in other files, but
uncollated.  The rest seems only to be available in as much as it is
implicit in the interpreter sources.

  Implementation bears about the same relation to designing a game as typing
does to writing poetry.  Appendix B contains some of the author's opinions
on game design, and may safely be ignored by those of a nervous disposition.
(In any case he has not absolutely always followed his own advice.)

  Appendix C discusses example programs.  One of these ("Deja Vu") is a toy
game which, although small and not very interesting in itself, contains the
source of a fairly good parser and implements most of the standard kernel
of adventure games; this may freely be stolen and adapted.  The source of
the other ("Hello Cruel World") is included within this file.  The source of
"Curses", on the other hand, is not on public show.

  Inform is not, as was mistakenly stated at earlier times, public domain
in the proper legal sense of the term.  The copyright is retained by the
author, Graham Nelson.  He is perfectly happy for Inform to be used by
anybody for any recreational purpose.  It may be freely distributed
provided no profit is involved, and provided the copyright message is
retained.  Please do not circulate heavily modified versions, and please
comment any private changes of your own at the top of the source code.
Story files produced by Inform belong to whoever wrote the source for them;
I think, however, it is fair to ask that game-writers put some message into
their credits saying that Inform was used, and giving the version number
used to compile it.

  And now the author stands back, and looks forward to seeing new games with
bated breath...

                                                              Graham Nelson
                                                   Magdalen College, Oxford
                                                                 April 1993


  Since the first release, much improvement has been made in memory
management, which is now quite efficient: it allocates between 50 and 75K of
memory, as opposed to 800K in the first edition.  The code is in ANSI C, is
contained in a single file (without needing non-standard headers) and some
effort has been made to improve its portability.  Hopefully it doesn't
assume an ASCII character set, or 32-bit integers, or any particular
byte-orientation within integers.  PC versions now ought to be feasible.

  The code has been annotated to some extent, and contains notes which
should be useful to anyone trying to port the code to a new machine.

  This documentation has changed only within the above introduction, in the
new "objectloop" construction, and in Appendix C (sample output for the
given programs).  The language which Inform compiles has not changed (except
that two defunct features, which had not in any case been documented, have
been withdrawn).  Details of changes to the ANSI source code of Inform may
be found in detailed comments at its head.

  The author's email address may be found at the bottom of this file. 
Comments and bug reports (by email) are welcomed with whatever degree of
enthusiasm he can muster.

                                                                        GAN
                                                                  June 1993

---------------------------------------------------------------------------
                             Contents
---------------------------------------------------------------------------

           1.   Command line format
           2.   Source file format
           3.   Compiler directives
           4.   Variables
           5.   Constants
           6.   Routines
           7.   Expressions
           8.   Commands
           9.   Conditions
           10.  Built-in functions
           11.  Objects
           12.  Verbs and grammar
           13.  The Dictionary
           14.  Indirect function calls
           15.  Text spacing
           
           A1.  The Z-machine
           A2.  How text is encoded
           A3.  How Z-code is encoded
           A4.  Using Inform as an assembler

           B1.  A Bill of Player's Rights
           B2.  What makes a good game?

           C1.  A Hello Cruel World program
           C2.  "Deja Vu": a toy game


---------------------------------------------------------------------------
1. Command line format
---------------------------------------------------------------------------

  inform [-options] <filename>

where six switches may be given in options:

  h   help information
  l   list assembly lines
  s   give statistics
  p   give statistics after both passes
  m   print memory allocation made
  d   contract double spaces in text

Samples of -s output can be found in Appendix C.  For -d, see section (15).

-m reveals how many bytes were malloc'ed.  The program can be compiled in
several different versions: the default (and most economical) settings
use about 75K.  With judicious adjustment of various #defines at the
beginning, this could be reduced a little further.

Inform will write its output to a file with the same name, but prefixed with
"z3".  (This is easy to alter by changing #defines at the beginning of the
source.)


---------------------------------------------------------------------------
2. Source file format
---------------------------------------------------------------------------


Lines in an Inform file are terminated by semicolons.  Exclamation marks
! thus...
denote that the rest of that physical line is a comment.  Backslashes "fold"
lines, thus:

initpos "A hinged trapdoor in the floor stands open, and light streams in \
         from below.";
         
is treated as if the "f" in "from below." follows directly from where the
backslash \ is; i.e., the carriage return and leading spaces are removed.
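
The folding rule might be sketched like this in Python.  This is an
illustration only, not Inform's own code; the name fold_lines is made up
for the purpose, and it assumes the backslash is the very last character
on its physical line.

```python
import re

def fold_lines(source: str) -> str:
    """Join folded lines: drop each backslash-newline pair together
    with any spaces or tabs that begin the following line."""
    return re.sub(r"\\\n[ \t]*", "", source)

# The trapdoor example from above, shortened a little.
folded = fold_lines('initpos "light streams in \\\n         from below.";')
print(folded)   # initpos "light streams in from below.";
```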

These lines may either be compiler directives (all of which fit on one line)
or routines (which take more than one line).

Inform command names are not case sensitive.


---------------------------------------------------------------------------
3. Compiler directives
---------------------------------------------------------------------------


ATTRIBUTE <name>                     Make new attribute flag
CONSTANT <name> <value>              Declare a constant
DICTIONARY <name> <text>             Enter <text> in dictionary, and make
                                     a new constant for its address
END                                  End compilation here (this is optional)
GLOBAL <name> [ = <a> ]              Make a new global variable;
                                       [give it the initial value a]
              [ string <a> ]           [make it point to an (a+1)-byte array,
                                         which has <a> as first byte, and is
                                         otherwise zeros]
              [ data <a> ]             [make it point to an a-byte array,
                                         which is all zeros]
              [ initial <i1> ... ]     [make it point to an array, the bytes
                                         of which are as given]
              [ initstr "text" ]       [make it point to an array, the bytes
                                         of which are the ASCII values of the
                                         characters in the string]
OBJECT ...                           Make an object (see below)
PROPERTY ...                         Make a new property (see below)
RELEASE <a>                          Set the release number to <a>
VERB ...                             Enter a line of grammar (see below)

The following are mainly for debugging the compiler (should anyone ever
get around to doing this) but might sometimes be amusing or helpful:

LIST                                 List the symbol table
SHOWDICT                             Show dictionary
TREE                                 List object tree
VERBS                                List verb table

TRACE                                Trace assembler
LTRACE                               List the lines of input
ETRACE                               Trace expression evaluator
BTRACE                               Trace assembler on both passes
NOTRACE, NOLTRACE, etc               Turn off appropriate tracing


---------------------------------------------------------------------------
4. Variables
---------------------------------------------------------------------------


Variables are all two-byte integers, which are treated as signed when it
makes sense to do so (eg in asking whether one is positive or not) but
not when it isn't (eg when it is used as an address).
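
The two readings of a two-byte value can be sketched as follows (Python,
for illustration only; the function names are invented here, and the
signed reading is taken to be two's complement, as on the Z-machine).

```python
def as_signed(word: int) -> int:
    """Read a 16-bit value as a signed number (-32768 to 32767)."""
    word &= 0xFFFF
    return word - 0x10000 if word >= 0x8000 else word

def as_unsigned(value: int) -> int:
    """Read the same 16 bits as an address (0 to 65535)."""
    return value & 0xFFFF

print(as_signed(0xFFFF))    # -1
print(as_unsigned(-1))      # 65535
```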

There can be up to 240 global variables; as indicated in (3), these can be
initialised to point to dynamic workspace, so as to achieve the effect of
strings and arrays.

In any routine, there can be up to 15 local variables.

There is also a stack, but it should be tampered with only with care.  Never
call a variable "sp", as this is the stack pointer variable which you might
occasionally need to use.

The observant reader will have noticed that 240+15+1 = 256.  This is of
course no coincidence.


---------------------------------------------------------------------------
5. Constants
---------------------------------------------------------------------------


Constants may be prefixed with a # character if desired.  This can be useful
if they are alphabetical and might otherwise be confused with something else.

A constant in "double quotes" assembles the given text at a suitable (even)
address, and gives half this address as the integer value.  Inside this text
the character ^ is replaced by a newline character, and the character ~ by
a double-quote mark.

A character in single quotes, such as 'e', means the ASCII value of that
character.

A dollar $ indicates that a hexadecimal constant follows; $$ indicates that
binary follows.
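
As a rough sketch of the notations described so far (Python, not Inform's
own code; constant_value and expand_text are made-up names, and named
constants, a$ and w$ forms are deliberately left out):

```python
def constant_value(token: str) -> int:
    """Evaluate a decimal, $hex, $$binary or 'c' constant."""
    if token.startswith("#"):            # optional # prefix
        token = token[1:]
    if token.startswith("$$"):           # binary
        return int(token[2:], 2)
    if token.startswith("$"):            # hexadecimal
        return int(token[1:], 16)
    if len(token) == 3 and token[0] == "'" and token[2] == "'":
        return ord(token[1])             # ASCII value of the character
    return int(token)                    # plain decimal

def expand_text(text: str) -> str:
    """Apply the ^ (newline) and ~ (double-quote) substitutions."""
    return text.replace("^", "\n").replace("~", '"')

print(constant_value("$ff"))         # 255
print(constant_value("$$1001001"))   # 73
print(constant_value("'X'"))         # 88
```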

Declared constants can be given, and so can the special constants

  adjectives_table
  preactions_table
  actions_table

which give the code address of these tables.

A constant beginning a$, followed by the name of a routine which is an
action routine, will have as value the number of the action.

A constant beginning w$, followed by a word of text, has as value the
address of the given word in the dictionary (Inform will give an error
at compile time if no such word is there).

Thus, for instance, the following are legal constants:

  31415
  $ff
  $$1001001
  #adjectives_table
  #a$LookSub
  #w$invent
  'X'
  "an emerald the size of a plover's egg"
  "~Hello,~ said Peter.^~Hello, Peter,~ said Jane.^"


---------------------------------------------------------------------------
6. Routines
---------------------------------------------------------------------------


The syntax to begin a routine is

  [ RoutineName <l1> ... <ln>;

and to end it, is

  ];

l1 to ln are the names of local variables, which are also the call
parameters.  For example, if you have a routine

  [ Look i j k;
    ...some code...
  ];

and it is called by

  Look(attic);

then i will initially have the value "attic" when this is executed.
Any local variables not specified (in this case, j and k) are initially
zero.

Every routine returns a value to the caller; if no such value is
explicitly given, this value is the integer 1.

Inside a routine, labels may be declared with a line of their own:

  .labelname;

but note that whereas local variables have names which only mean anything
locally, labels have names which are global.  In other words, you can't
have a label called "loop" more than once in the file.

There is one special routine, which you must define, called Main.  This is
where execution of the game will begin, and it _must_ be the first one
defined.  Returning from Main will cause the interpreter to crash: you
should explicitly QUIT instead.  Also, uniquely and for peculiar reasons,
Main is _not_ permitted to have any local variables of its own.  This means
it is usually only used as an outer shell.


---------------------------------------------------------------------------
7. Expressions
---------------------------------------------------------------------------


The usual arithmetic expressions are allowed, including the operators:

    =             set variable (only) on left equal to value on right
    + -           plus, minus
    * / % & |     times, divide, remainder, bitwise and, bitwise or
    -> -->        byte, word array entry
                  (eg: buffer->4 gives contents of the byte with address
                  buffer+4, while table-->3 gives the word at table+6)
                  
In addition one may call a function, either a built-in function or a
routine.

For example:

  4*(x+3/y)
  i=j-->1
  Fish(x)+Fowl(y)
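
The arithmetic behind -> and --> can be sketched as below.  This is a
Python illustration with invented names, treating memory as a flat array
of bytes; words are read high byte first, which is the Z-machine's
convention.

```python
def byte_entry(memory: bytes, addr: int, index: int) -> int:
    """addr->index: the byte at address addr+index."""
    return memory[addr + index]

def word_entry(memory: bytes, addr: int, index: int) -> int:
    """addr-->index: the two-byte word at address addr+2*index."""
    offset = addr + 2 * index
    return (memory[offset] << 8) | memory[offset + 1]

memory = bytes(range(16))           # byte n simply holds the value n
print(byte_entry(memory, 2, 4))     # 6, the byte at address 2+4
print(word_entry(memory, 2, 3))     # 2057, i.e. $0809 at address 2+6
```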

Warning: for a few commands, strange results may occur if two or more
complicated expressions are used in the same command, for instance:

  put buffer+6 byte i+j+1 56*prime(4);

One can only describe this as a hideous bug, but in practice the need
seldom arises and the solution would be quite difficult to implement.


---------------------------------------------------------------------------
8. Commands
---------------------------------------------------------------------------


The "high level" commands in Inform are as follows:

NEW_LINE                     Print a carriage return
QUIT                         Quit the game (at once, with no confirmatory
                             question to the user)
RESTART                      Restart the game from its initial state (ditto)
SHOW_SCORE                   Redisplay the score bar immediately, without
                             waiting for the next keyboard input
PRINT "text"                 Print text
PRINT_RET "text"             Print text, print a newline and return 1
PRINT_NUM <a>                Print a as a (signed) decimal number
PRINT_CHAR <a>               Print the character whose ASCII value is a
PRINT_ADDR <a>               Print the string whose address is a
PRINT_PADDR <a>              Print the string whose address is 2*a
PRINT_OBJ <a>                Print the short name of object a

READ <a> <b>                 Reads keyboard into buffer a and decomposes it
                             to the buffer b:
                             on entry, a[0] = buffer size, b[0] similarly
                             on exit a[1] = no chars typed,
                                       2 to a[1]+1 are the chars (unterminated)
                             From byte 2, b contains 4-byte chunks, one for
                             each word of input:
                                 address of dictionary entry if recognised,
                                   0000 otherwise
                                 number of letters in word
                                 first char of word in a
                             This command automatically redisplays the status
                             (score) line.  Precisely, it prints the short name
                             of the object whose number is the first declared
                             global variable, then prints the next two
                             globals in the form "45/34".  It is assumed that
                             these are the location, score and number of turns
                             so far.

REMOVE <a>                   Remove object a from the tree of objects
                             (it may certainly be later put back)
MOVE <a> TO <b>              Add object a to the things possessed by b

PUT <addr> BYTE <index> <v>  Write byte value v into index'th byte after addr
PUT <addr> WORD <index> <v>  ...and similarly for words

PUT_PROP <o> <p> <v>         Set property p of object o to value v

INC <var>                    Increment variable
DEC <var>                    Decrement variable

RETURN <a>                   Return the value a
RET#TRUE                     Return true, i.e. the value 1
RET#FALSE                    Return false, i.e. the value 0

INVERSION                    Print the version number of Inform used to
                             compile the story file

IF <condition>               If the condition is true, execute the code
  {  ... code ... }          (braces are _compulsory_) [else execute the
[ ELSE { ... other ...} ]    other code instead]

WHILE <condition>            While loop
  {  ... code ... }

FOR <var> <init> TO <final>  For loop: the final value must be a constant
  {  ... code ... }          or another variable.  If the range is empty, it
                             does not execute even once.
DO                           Until loop
  {  ... code ... }
UNTIL <condition>

OBJECTLOOP <var> FROM/IN <obj>  A form of while loop.  The var first holds
                             either the obj value (if it is FROM) or its
                             child (if IN), and runs through the sibling
                             objects.  So, for instance,

                       objectloop x in lamp { print_obj x; new_line; }

                             is equivalent to

                       x=child(lamp); while x~=0 { print_obj x; new_line;
                       x=sibling(x); }

BREAK                        Break out of current loop (not block)
JUMP <label>                 Jump to label (warning: exercise caution in
                             jumping out of one routine into another)
SAVE <label>                 Try to save the game (asking the user for a file
                             to put it in): if successful, jump to the label,
                             otherwise carry on
RESTORE <label>              Ditto, but restore

If a command matches none of these, or if it began with an @ character, the
line is sent to the assembler instead.
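
The two buffers filled in by READ can be picked apart along the following
lines.  This Python sketch follows the layout given under READ above, with
two assumptions which that description does not spell out: that b[1] holds
the number of words found, and that "first char of word in a" is an index
into the buffer a.  The function names are invented.

```python
def typed_text(a: bytes) -> str:
    """The characters typed: a[1] of them, starting at a[2]."""
    return bytes(a[2:2 + a[1]]).decode("ascii")

def parsed_words(a: bytes, b: bytes):
    """Return (dictionary address, text) for each word of input."""
    words = []
    for i in range(b[1]):                 # assumption: b[1] = word count
        chunk = b[2 + 4 * i: 6 + 4 * i]   # one 4-byte chunk per word
        dict_addr = (chunk[0] << 8) | chunk[1]   # 0 if not recognised
        length, start = chunk[2], chunk[3]
        text = bytes(a[start:start + length]).decode("ascii")
        words.append((dict_addr, text))
    return words

# Made-up buffers for the input "take lamp", with "lamp" unrecognised
# and $1234 standing in for the dictionary address of "take".
a = bytes([20, 9]) + b"take lamp" + bytes(10)
b = bytes([5, 2, 0x12, 0x34, 4, 2, 0x00, 0x00, 4, 7])
print(typed_text(a))        # take lamp
print(parsed_words(a, b))   # [(4660, 'take'), (0, 'lamp')]
```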


---------------------------------------------------------------------------
9. Conditions
---------------------------------------------------------------------------


These take the form

  <a>  <relation>  <b>

where the relation is one of

  ==            a equals b
  ~=            a doesn't equal b
  < > >= <=     comparisons
  has           object a has attribute b at the moment
  hasnt         ...hasnt...
  near          objects a and b have the same parent
  far           ...haven't...

These may _not_ be used in expressions (as if the language were C) and
there is no AND/OR construction.  There is a reason for this, but not a
very good one (unless you count laziness).  However, one tiny concession
towards such a feature is provided, viz. the construction

  <something> == <v1> [or <v2> [or <v3>]]

which is true if the first something is any of the values given.

---------------------------------------------------------------------------
10. Built-in functions
---------------------------------------------------------------------------


The built-in functions are

  PARENT(obj)  SIBLING(obj)  CHILD(obj)

for reading the object tree (see (11) below), and

  RANDOM(x)

which returns a uniformly random number between 1 and x, and

  PROP_LEN(addr)  PROP_ADDR(o,p)  PROP(o,p)

for which see (11) below.

Warning: some interpreters set up the random number generator with poor
choices of seed value, which means that the first few random numbers may be
rather peculiarly distributed.  After a time, it settles down.  To get
around this, "Curses" (for example) takes and throws away 100 random numbers
when it begins.


---------------------------------------------------------------------------
11. Objects
---------------------------------------------------------------------------


The object hierarchy is a tree of up to 255 "objects", which you might use
for many different game elements: rooms, compass points, scenery, things
which can be picked up, and so on.

They are numbered from 1 to 255, and the number 0 by convention means
"nothing".  Attempting to print_obj object 0 will produce a string full of
peculiar letters and (if you are very unlucky indeed) even random ASCII
values.

In the tree, each object has a parent, a sibling, and a child.  Thus, for
instance, a portion may resemble

            Meadow
               |
            Mailbox -> Player
               |          |
             Note      Sceptre -> Cucumber -> Torch -> Magic Rod
                                                |
                                              Battery

in which -> shows siblings, and | parents and children.  In this case, the
Meadow has nothing as its parent.  Anything with no possessions, such as the
note, has nothing as its child, and so on.

When an object is moved, its possessions move with it, of course.
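
A toy model of the tree may make this concrete.  The Python below is a
sketch, not the Z-machine's actual storage format: the class and method
names are invented, and placing a moved object first among its new
siblings is an assumption.  The possessions() method walks the children in
the same way as the "objectloop x in ..." construction of section (8).

```python
class ObjectTree:
    """Parent/sibling/child links for objects 1..count; 0 is nothing."""

    def __init__(self, count: int):
        self.parent = [0] * (count + 1)
        self.sibling = [0] * (count + 1)
        self.child = [0] * (count + 1)

    def remove(self, obj: int) -> None:
        """Detach obj from its parent; its own children stay attached."""
        owner = self.parent[obj]
        if owner == 0:
            return
        if self.child[owner] == obj:
            self.child[owner] = self.sibling[obj]
        else:
            prev = self.child[owner]
            while self.sibling[prev] != obj:
                prev = self.sibling[prev]
            self.sibling[prev] = self.sibling[obj]
        self.parent[obj] = 0
        self.sibling[obj] = 0

    def move(self, obj: int, owner: int) -> None:
        """Make obj the first child of owner (assumed placement)."""
        self.remove(obj)
        self.sibling[obj] = self.child[owner]
        self.child[owner] = obj
        self.parent[obj] = owner

    def possessions(self, owner: int):
        """Yield the children of owner in sibling order."""
        x = self.child[owner]
        while x != 0:
            yield x
            x = self.sibling[x]

(MEADOW, MAILBOX, PLAYER, NOTE, SCEPTRE,
 CUCUMBER, TORCH, MAGIC_ROD, BATTERY) = range(1, 10)

tree = ObjectTree(9)
# Each move puts the object first, so sibling chains are built backwards.
for obj, owner in [(PLAYER, MEADOW), (MAILBOX, MEADOW), (NOTE, MAILBOX),
                   (MAGIC_ROD, PLAYER), (TORCH, PLAYER),
                   (CUCUMBER, PLAYER), (SCEPTRE, PLAYER),
                   (BATTERY, TORCH)]:
    tree.move(obj, owner)

tree.move(TORCH, MAILBOX)            # the battery travels with the torch
print(tree.parent[BATTERY] == TORCH)             # True
print(list(tree.possessions(MAILBOX)))           # [7, 4]
```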

In practice an object needs rather more data than just a position in a tree. 
It also has a collection of variables attached to it.

Firstly, there are 32 flags, called "attributes", which can be either set or
clear.  These might be such conditions as "giving light", "currently worn"
or "is one of the featureless white cubes".  All 32 are free for the user to
use.  They must be declared before use, by commands like

  ATTRIBUTE locked;

which will allocate a new attribute and make a constant "locked" to have the
value of its number.  You never then need to know about these numbers,
because you can use commands like

  IF obj HAS locked { print_ret "But it's locked!"; }

  SET_ATTR obj locked;

  CLEAR_ATTR obj locked;

Warning: 32 sounds like plenty, but the limit can quite easily be hit.  The
author has found it useful to declare one as "general", to be used for
different things for different objects.
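
In outline, attributes behave like 32 bits of a flag word.  The Python
below is a sketch under stated assumptions: the names are invented,
numbering attributes from 0 upwards is a guess, and the bit order used
here is not claimed to match the story file's.

```python
class AttributeTable:
    """Hands out attribute numbers, as ATTRIBUTE declarations do."""
    LIMIT = 32

    def __init__(self):
        self.numbers = {}

    def declare(self, name: str) -> int:
        if len(self.numbers) >= self.LIMIT:
            raise ValueError("all 32 attributes are already in use")
        self.numbers[name] = len(self.numbers)   # assumed: count from 0
        return self.numbers[name]

def set_attr(flags: int, attr: int) -> int:
    return flags | (1 << attr)

def clear_attr(flags: int, attr: int) -> int:
    return flags & ~(1 << attr)

def has_attr(flags: int, attr: int) -> bool:
    return bool((flags >> attr) & 1)

table = AttributeTable()
locked = table.declare("locked")
flags = set_attr(0, locked)
print(has_attr(flags, locked))                       # True
print(has_attr(clear_attr(flags, locked), locked))   # False
```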

Secondly, there are 30 "properties".  These are far more elaborate.  For one
thing, not every object has every property.  The following all declare new
properties:

  PROPERTY door_to;
  PROPERTY article "a";
  PROPERTY blorpleroutine $ffff;

The value given, in the case of article and blorpleroutine, is the default
value: that is, the value of the property which an object will have if it
doesn't explicitly have some other value.  If you don't define a default
value, it will by default be 0.

The data for a given property can be a number, or up to four numbers in a
row, or up to eight bytes of data.  The simplest way to get at the current
value is something like

  i=PROP(location,door_to);

which will get the first number in the property door_to of object location.
Similarly, it can be written to with

  PUT_PROP location door_to hall_of_mists;

A subtle point is that numbers smaller than 256 are stored differently from
larger ones.  In order to decide whether the property is one byte's worth or
two, the Z-machine looks at the number of bytes which the property has in
all, and sees whether it is odd or even; if even, it presumes the number is
a 2-byte word; if odd, it presumes it is just one byte.
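
The odd/even rule can be sketched as follows (Python, with invented names;
what happens to a value too large for a one-byte field is an assumption
made for the sake of the example, namely that the high byte is lost).

```python
def read_property(data: bytes) -> int:
    """Even length: read a 2-byte word.  Odd length: read one byte."""
    if len(data) % 2 == 0:
        return (data[0] << 8) | data[1]
    return data[0]

def write_property(data: bytearray, value: int) -> None:
    """Store value in the first word or byte, by the same rule."""
    if len(data) % 2 == 0:
        data[0] = (value >> 8) & 0xFF
        data[1] = value & 0xFF
    else:
        data[0] = value & 0xFF     # assumption: the high byte is lost

short_field = bytearray(1)         # e.g. a property with initial value 0
long_field = bytearray(2)
write_property(short_field, 1000)
write_property(long_field, 1000)
print(read_property(short_field))  # 232, not 1000
print(read_property(long_field))   # 1000
```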

This is seldom something you need to know about, but occasionally you will
want a property which will, later in the game, need to hold a value of, say,
1000, but which initially will be zero.  This is particularly the case with
timing mechanisms, for instance.  The command

  PROPERTY LONG timeleft;

declares the property "timeleft" and requires Inform to make sure that all
"timeleft" fields are 2 bytes wide, even if they have small initial values.

More elaborate manipulation has to be done by hand.

  k=PROP_ADDR(o,weird);

sets k to the address of the "weird" data of object o.  To find out how many
bytes there are, apply PROP_LEN to this address.

  l=PROP_LEN(k);
    
Once you have the address you can read and write to it directly.  Be careful
not to overrun the length, which may not be changed.

Warning: the Z-machine crashes if you attempt to write to a property field
which an object hasn't got.

An object is declared (before the body of the code) by something like:

OBJECT trapdoor "hinged trapdoor" attic
  WITH name "hinged" "trap" "door" "trapdoor",
       initpos "A hinged trapdoor in the floor stands open, and light \
                streams in from below.",
       closedpos "There is a closed trapdoor in the middle of the floor.",
       portalto house,
       postroutine TrapdoorPost,
       dirprop d_to
  HAS  portal static open light openable;

trapdoor is a constant which is set to its object number; "hinged trapdoor"
is its attached short name; attic is the object which initially possesses
it.  If it was to be initially unowned, this would be "nothing" instead of
"attic".

After WITH is a list of property definitions, in the form

   <property> <data1> ...
   [[, <property> <data1> ...]]

Warning: an excellent source of mysterious errors is missing off the commas
between these, since property names are themselves legal constants.

There is one special property, called "name" and numbered 1.  Its data must
be words (at most four of them), as above, and these are entered into the
dictionary as nouns (if they aren't already present): the data actually
stored is the dictionary addresses.

Note that the dictionary itself does _not_ know that "door" refers to this
object: there might be any number of objects which could be called "door".

After HAS is a list of attributes which the object initially has.


---------------------------------------------------------------------------
12. Verbs and grammar
---------------------------------------------------------------------------


Whereas objects should be declared at the start of the file, the grammar
to be allowed by the game should be declared at the end.  This is done with
the VERB command.  VERB does something very complicated, but probably not
what you think.  A typical VERB command would be:

VERB "take" "get" "pick" "lift"  * "out"                    -> ExitSub
                                 * multi                    -> TakeSub
                                 * multiinside "from" noun  -> RemoveSub
                                 * "in" noun                -> EnterSub
                                 * "off" held               -> DisrobeSub;

This declares a verb, for which "take", "get" etc are synonyms, and which
can take five different courses.  In the first, it must be followed by the
word "out".  In the last, it must be followed by "off" and then an item
which is currently held by the player.  In the second, it can be followed by
one object, or a list, perhaps specified as "everything", for instance.
There can be no grammar at all, for example

VERB "invent" "i"                *                          -> InvSub;

After the "->" is the name of a routine which is to be called when this is
matched.

For traditional reasons unclear to the author, previous Infocom hackers have
called words such as "out" and "off", adjectives.  This is monstrously
illiterate since they are of course prepositions.  We shall wearily follow
convention anyway.

Remember that the Z-machine does _not_ contain the bulk of a game parser,
only the computationally expensive and low-level part which works out what
the words are.  So this command only sets up a table with some numbers in. 
If you want a parser, you have to write code to deal with the table again.

By convention, adjectives are numbered downwards from $ff.  Thus, if
the above were the opening lines of grammar, "from" would be $fe, and so on. 
As they are created, they are entered into the dictionary, and also into the
adjective table, which has four-byte entries

  <dictionary address of word>  00  <adjective number>
  ----2 bytes-----------------  ----2 bytes----------- 

In order to make life more interesting, these entries are stored in reverse
order (i.e., lowest adjective number first).  The address of this table is
rather difficult to deduce from the file header information, so the constant
#adjectives_table is set up by Inform to refer to it.  In any event, the
table isn't very useful and is created only for the sake of conforming to
Infocom internal conventions.
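
The numbering and ordering just described can be sketched in Python.  The
function name and the dictionary addresses used in the example are
invented; only the layout follows the description above.

```python
def build_adjective_table(adjectives, dictionary_address) -> bytes:
    """adjectives: words in order of first appearance in the grammar.
    dictionary_address: maps each word to its dictionary address."""
    numbered = [(0xFF - i, word) for i, word in enumerate(adjectives)]
    table = bytearray()
    for number, word in reversed(numbered):    # lowest number first
        addr = dictionary_address[word]
        table += bytes([addr >> 8, addr & 0xFF, 0x00, number])
    return bytes(table)

# The adjectives of the "take" grammar, with made-up addresses.
addresses = {"out": 0x2010, "from": 0x2002, "in": 0x2009, "off": 0x200B}
table = build_adjective_table(["out", "from", "in", "off"], addresses)
print(len(table))            # 16, four entries of four bytes
print(table[3], table[15])   # 252 255
```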

The important tables are the grammar and action tables.

The grammar table address is stored in word 7 (ie bytes 14 and 15) of the
header.  The table consists of a list of two-byte addresses to the entries
for each word.  This list is immediately followed by these entries, one
after another.

An entry consists of one byte giving the number of lines (eg, 5 for the
"take" definition above) and then that many 8-byte lines.  These lines
have the form

  <objects>  <sequence of words>  <action number>
  --1 byte-  ----6 bytes--------  --1 byte-------

<objects> is the number of objects which need to be supplied: eg, 0 for
"inventory", 1 for "take frog", 2 for "tie rope to dog".  The sequence
of words gives up to 6 blocks of syntax to follow the verb, which must
be matched in order.  Large numbers such as $ff mean that the appropriate
adjective must appear; small numbers are inserted by special words such as 
"held" or "noun" in the VERB command:

   Word         Byte       What the "Deja Vu" parser uses it for
   ====         ====       =====================================
   noun          0         any visible object
   held          1         object held
   multi         2         one or more visible objects
   multiheld     3         one or more held objects
   multiexcept   4         one or more objects, except the other object 
   multiinside   5         one or more objects, inside the other object
   creature      6         an animate creature
   special       7         any word or number

The sequence is padded out to 6 bytes with zeros.

The action numbers begin at 0.  The first routine mentioned as an action (in
the above example, ExitSub) is assigned action number 0; the next (TakeSub)
is given 1, and so on.  The appropriate number is stored in the last byte of
the line.

Thus, a little later on in the grammar, the line

VERB "exit" "leave"              *                          -> ExitSub;

might well appear, and ExitSub will mean "action 0" as before.

So this table does not store the address of the action routine, as one might
expect.  Instead the addresses corresponding to the action numbers are
stored in the actions table.  Once again, Inform puts this table in its
conventional place, but this address being difficult to work out, the
constant #actions_table is set up to hold it.  The actions table is simply
a list of 2-byte entries giving the routine addresses (divided by 2).
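
Looking up a routine from its action number might therefore go like this
(a Python sketch with an invented function name and a made-up memory
image; words are taken high byte first):

```python
def action_routine_address(memory: bytes, actions_table: int,
                           action: int) -> int:
    """Read entry number `action` and undo the division by 2."""
    offset = actions_table + 2 * action
    packed = (memory[offset] << 8) | memory[offset + 1]
    return 2 * packed

# A made-up image: the actions table starts at address 4.
memory = bytes([0, 0, 0, 0,  0x10, 0x00,  0x12, 0x34])
print(hex(action_routine_address(memory, 4, 0)))   # 0x2000
print(hex(action_routine_address(memory, 4, 1)))   # 0x2468
```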

There is also a preactions table, with another constant #preactions_table,
created only to conform to Infocom conventions; it is set up containing 0000
for each action.  ("Curses", for instance, makes no use of this.)

In the meantime, what has happened to the actual words, "take", "get",
"pick" and "lift"?  Note that these do not appear in the grammar table at
all.  Instead they are entered into the dictionary, along with the verb
number.  As a final baroque twist, these numbers also count down from $ff.
Any number of words can be given, all referring to the same verb number;
"Curses" has 11 synonyms for "attack", for instance.

Of course, Inform does not know or care what is done with any of these
tables.   For instance, the "take" verb has the entry

005
000 255 000 000 000 000 000 000
001 002 000 000 000 000 000 001
002 005 254 000 000 000 000 002
001 253 000 000 000 000 000 003
001 252 001 000 000 000 000 004

but it is up to the code you write to deal with this.  (The VERBS command
will print out the full verb table in a similar format.)
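
By way of illustration, here is a minimal sketch (in Python, not Inform) of
how a parser might take such lines apart.  It rests on two assumptions which
are not part of the format: that any byte above 7 is an adjective number, and
that, since "noun" is byte 0 and so looks exactly like padding, the object
count in the first byte is what tells real tokens from padding.  The name
decode_grammar_line is invented for the example.

```python
# Names for the small token bytes, taken from the table of special words.
TOKEN_NAMES = ["noun", "held", "multi", "multiheld",
               "multiexcept", "multiinside", "creature", "special"]

def decode_grammar_line(line):
    """Split one 8-byte grammar line into (tokens, action number).

    ASSUMPTIONS: bytes above 7 are adjective numbers; the object count
    in the first byte says how many small bytes are genuine tokens.
    """
    if len(line) != 8:
        raise ValueError("a grammar line is 8 bytes long")
    objects_left, body, action = line[0], line[1:7], line[7]
    tokens = []
    for byte in body:
        if byte > 7:
            tokens.append("adjective %d" % byte)
        elif objects_left > 0:
            tokens.append(TOKEN_NAMES[byte])
            objects_left -= 1
        else:
            break          # a small byte with no objects owed: padding
    return tokens, action

# The five lines of the "take" entry quoted above.
TAKE = [
    [0, 255, 0, 0, 0, 0, 0, 0],
    [1, 2, 0, 0, 0, 0, 0, 1],
    [2, 5, 254, 0, 0, 0, 0, 2],
    [1, 253, 0, 0, 0, 0, 0, 3],
    [1, 252, 1, 0, 0, 0, 0, 4],
]
for line in TAKE:
    print(decode_grammar_line(line))
```

The third line, for instance, comes out as "multiinside", an adjective and
"noun", leading to action 2.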


---------------------------------------------------------------------------
13. The Dictionary
---------------------------------------------------------------------------


This section describes what Inform does with the dictionary.

Word 4 of the file header (bytes 8 and 9) contains the dictionary table's
address.

The table begins with a 7-byte header:

  03 '.' ',' '"'

meaning there are three characters used to separate words in typed input:
full stops, commas and quotation marks.  (The Z-machine will allow any list
to be given here but Inform decides on this for you.)

  07  <number_of_entries>
      ----2 bytes--------

meaning there are that many entries in the dictionary, all 7 bytes long. 
(This could again be in principle varied, but allows for six significant
letters in words, while still enabling the text of the word to occupy a
4-byte integer - which is convenient and fast when the compiler is
alphabetically sorting.)

The seven-byte entries are in alphabetical order, and look like:

  <the text of the word>  <flags>  <verb number>  <adjective number>
  ----4 bytes-----------  --1 b--  ----1 byte---  ----1 byte--------

The text is stored in the usual text format, thus allowing up to 6
characters.  The flags (chosen once again to conform loosely to Infocom
conventions, not for any sensible reason) have the eight bits

  7      6  5  4  3     2      1  0
  <noun> .. .. .. <adj> <spec> .. <verb>

<verb>, <noun> and <adj> mean the word can be a verb, noun or adjective; the
<spec> bit means the word was inserted by a DICTIONARY command in the
program, except that <verb> words also have the <spec> bit set (ours not to
wonder why).

Note that a word can be any combination of these at once.  It can even be
simultaneously a verb, adjective and noun.
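
The layout above can be read back with very little code.  The following
Python sketch is only that: the function name and the dictionary keys are
invented for the example, 2-byte values are assumed to be stored high byte
first, and the four text bytes are left undecoded.

```python
import struct

def parse_dictionary(table):
    """Parse a dictionary table (a bytes object) as laid out above.

    Returns (separators, entries), where each entry is a dict.
    """
    n_seps = table[0]
    separators = [chr(c) for c in table[1:1 + n_seps]]
    pos = 1 + n_seps
    # Entry length (normally 7), then the 2-byte number of entries.
    entry_len, count = struct.unpack_from(">BH", table, pos)
    pos += 3
    entries = []
    for _ in range(count):
        text = bytes(table[pos:pos + 4])
        flags, verb, adjective = table[pos + 4:pos + 7]
        entries.append({
            "text": text,
            "noun": bool(flags & 0x80),   # bit 7
            "adj":  bool(flags & 0x08),   # bit 3
            "spec": bool(flags & 0x04),   # bit 2
            "verb": bool(flags & 0x01),   # bit 0
            "verb_number": verb,
            "adjective_number": adjective,
        })
        pos += entry_len
    return separators, entries
```

A flags byte of $85, for example, would mark a word which is both a noun and
a verb (with the <spec> bit along for the ride).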

Typically a full game contains about 600 dictionary entries - about ten
times the number of portable objects.  Even so it only consumes about 4K, or
1/32nd of the available memory.  It's never worth economising on dictionary
entries; nothing else a designer can do with 4K will be as good to the user.


---------------------------------------------------------------------------
14. Indirect function calls
---------------------------------------------------------------------------


Occasionally one needs to call a function whose address is in a variable:
for example, if the routine address has been looked up from a table, or an
object's property list.

For this, the function "indirect" is provided:

  a=indirect(b);

sets a to the return value of calling the function whose address is in b.

If you want to pass arguments as well, you should use the assembler-level
@icall.  But do so with care: it is dangerously easy to leave values lying
about on the stack, which will overflow causing a mysterious crash
hundreds of turns later.


---------------------------------------------------------------------------
15.  Text spacing
---------------------------------------------------------------------------


Typewritten English, like this file, normally puts a double space after a
full stop.  This is much easier to read.  Unfortunately Infocom-standard
interpreters do not usually understand that.  When they fold text across
lines, they can easily turn

  ...and a pomegranate.  After all, you always hated fruit.

into something which looks like

   |You decline the offer of a banana, an apple and a pomegranate.   |
   | After all, you always hated fruit.                              |
   |                                                                 |
   |>                                                                |

which looks awful.  It would be easy to fix the interpreter not to do this;
but nobody does.  In case (like the author's) your typing is habitually
double-spaced, Inform provides a command line option -d to change it back
again.  It does this only by replacing the string ".  " by ". " in text
conversion.
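
The effect of the replacement is easily imitated.  This is a one-line Python
sketch of the idea, not Inform's own code, and the function name is made up:

```python
def contract_double_spaces(text):
    """Replace a full stop and two spaces by a full stop and one space."""
    return text.replace(".  ", ". ")

print(contract_double_spaces(
    "...and a pomegranate.  After all, you always hated fruit."))
```

Nothing cleverer is attempted: double spaces which do not follow a full stop
are left exactly as typed.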


---------------------------------------------------------------------------
A1. The Z-machine
---------------------------------------------------------------------------


The so-called Z-machine (the imaginary machine for which story files are
programs) is quite well-adapted to its task.  It maintains a hierarchy of
objects and possessions, and does the computationally-intensive part of
parsing input itself.  That said, it does not contain the bulk of the
parser.  The parsing tables which some investigators think are part of the
Z-machine format are in fact the same across different Infocom games only
because they all contain essentially the same parser code.  Thus, Inform is
in principle free not to compile such tables, but it does so in order that
INFODUMP will work properly.  Some tables are put to subtly different uses,
however.

The following description is fairly complete, but only covers version 3.
It would be helpful if someone public-spirited would write an account of the
differences in later versions.

The version 3 Z-machine is 128K long at most.  Addresses within it are
nonetheless held in 2-byte words, which is why some addresses are stored as
half their actual values, and why some items (routines and static strings)
are always stored at even addresses.
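
The arithmetic is simple enough, but a sketch may make the point.  (Python is
used here for illustration only, and the function names are invented.)

```python
def pack_address(byte_address):
    """Halve a byte address so that it fits in a 2-byte word."""
    if not 0 <= byte_address < 0x20000:
        raise ValueError("outside the 128K of a version 3 story file")
    if byte_address % 2:
        raise ValueError("only even addresses can be halved exactly")
    return byte_address // 2

def unpack_address(word_address):
    """Recover the byte address from its halved form."""
    return word_address * 2

print(hex(pack_address(0x1F000)))    # an address above 64K
print(hex(unpack_address(0xFFFF)))   # the highest address reachable
```

An odd address has no halved form, which is the whole reason for the
even-address rule.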


The first 64 bytes contain a header.  The first 4 bytes are:

03  <Flags>  <Release Number>
             ----2 bytes-----

3 indicates version 3; the release number is as set in the program; the
flags byte contains bits:

  1   Status line type (clear for Score/Turns, set for Hours:Mins)
  3   Censorship bit (used by some games, but not by the Z-machine)
  4   Alternative prompts - sometimes used by primitive interpreters
  5   Status window support - used only by "Seastalker"

Next come seven word addresses, at words 2 to 8:

2     <Start of Routines>    Where routines begin, in bytes
3     <Main Routine>         Address of main routine, in bytes, +1

(This +1 is why Main cannot have local variables - it is a peculiarity
of the standard.  Note also that this is uniquely a routine address in
bytes and not words: Main must occur in the lower 64K of the file.  Inform
always sets word 3 to be word 2, plus 1.)

4     <Dictionary>           The dictionary table address, in bytes
5     <Object tree>          Object table address, in bytes
6     <Variables>            Global variables address, in bytes
7     <Save area size>       The total number of bytes in a saved game

(Saving the game is done by saving this many bytes from the beginning of
the machine.  (Saved games also contain the current state of the Z-machine
stack; the stack is _not_ stored anywhere in the Z-machine's memory.))

8     <More flags>

This word of flags has bits:

  0   Scripting on: send output to printer
  1   Disable proportional fonts while this is set
  4   Something mysterious to do with sound effects in The Lurking Horror

This is followed by the six bytes from byte 18 to 23, which are the version
number string.  (Inform sets these to the current date, in the form YYMMDD.)
Then more words:

12    <Synonyms table>       Synonym table address in bytes
13    <Length>               Length of file, in words
14    <Checksum>             Sum of bytes from 64 upwards, mod $10000

(The length and checksum are not actually used at all by many interpreters.)

The remaining bytes in the header are used by the interpreter and should be
left alone by the game code.
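
As a check on the above, here is a minimal Python sketch which unpacks the
documented part of the header and recomputes the checksum.  It assumes that
2-byte words are stored high byte first and that bit 0 is the least
significant bit; the function names and dictionary keys are invented for the
example.

```python
import struct

# Bytes 0 to 29: version, flags, release, words 2 to 8, the six bytes
# of the version number string, then words 12 to 14.
HEADER_FORMAT = ">BBH7H6s3H"

def parse_header(story):
    """Unpack the documented header fields from a story file's bytes."""
    (version, flags, release,
     routines, main_plus_1, dictionary, objects, variables,
     save_size, more_flags,
     serial,
     synonyms, length_words, checksum) = struct.unpack_from(
        HEADER_FORMAT, story)
    return {
        "version": version,
        "hours_mins_status": bool(flags & 0x02),   # flags bit 1
        "release": release,
        "routines": routines,
        "main": main_plus_1 - 1,                   # stored with +1
        "dictionary": dictionary,
        "objects": objects,
        "variables": variables,
        "save_size": save_size,
        "more_flags": more_flags,
        "serial": serial.decode("ascii", "replace"),
        "synonyms": synonyms,
        "length_bytes": 2 * length_words,          # stored in words
        "checksum": checksum,
    }

def compute_checksum(story):
    """Sum of the bytes from 64 upwards, mod $10000."""
    return sum(story[64:]) % 0x10000
```

Comparing compute_checksum(story) with the "checksum" field is, presumably,
all that a VERIFY need amount to.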


By convention, the next item in the memory map, beginning at $40, is the
synonyms table.  There are 3*32=96 strings stored here (entries 0 to 31 in
three dictionaries), one after another.  This means they all have even
addresses, conveniently.  Once these 96 strings are entered, the actual
table begins, and this is what the synonyms address points to.  The table
contains 96 two-byte entries, which are the word addresses of the strings
before it.

(Since Inform never makes use of synonyms, this could just be left out
altogether, but for the sake of convention it creates a null table
containing 96 copies of "   " (three spaces).) 


Next is the object table.  In fact it begins with what is sometimes called
the "global properties table", though it is actually a table of default
values of properties.  This is a list of 31 2-byte words.  There is no
property 0, so the first word is always 0000.

(Inform also sets the default for property 1 - the special "name" property -
to 0000; the remainder are set in property definitions.)

After these 62 bytes, the objects begin, beginning from object 1.  An object
entry consists of 9 bytes, looking like:

   <the 32 attribute flags>   <parent>  <sibling>  <child>  <properties>
   ---32 bits in 4 bytes---   ---3 bytes------------------  ---2 bytes--

Each of the <parent>, <sibling> and <child> bytes is 00 when the object
pointed to is "nothing".  The <properties> is an address (in bytes) of the
properties attached to the given object.
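
Finding and reading an object entry is then a matter of arithmetic.  The
sketch below is in Python and assumes 2-byte and 4-byte values are stored
high byte first; read_object and its arguments are names made up for the
purpose.  The 32 attribute flags are returned as a single number, since
nothing said here settles which bit belongs to which attribute.

```python
import struct

DEFAULTS_SIZE = 62   # the 31 default property words before object 1
ENTRY_SIZE = 9       # 4 attribute bytes + 3 tree bytes + 2 address bytes

def read_object(memory, object_table, number):
    """Read the 9-byte entry for the given object (numbered from 1)."""
    if number < 1:
        raise ValueError("objects are numbered from 1; 0 is 'nothing'")
    offset = object_table + DEFAULTS_SIZE + ENTRY_SIZE * (number - 1)
    attributes, parent, sibling, child, properties = struct.unpack_from(
        ">IBBBH", memory, offset)
    return {
        "attributes": attributes,    # all 32 flags as one integer
        "parent": parent,            # 0 means "nothing"
        "sibling": sibling,
        "child": child,
        "properties": properties,    # byte address of the property table
    }
```

Since each of the tree fields is a single byte, there can be at most 255
objects.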

When all these 9-byte entries are out of the way, the properties tables
begin.  (Inform keeps these in the same order as the objects they are
attached to.)  An individual property table has the brief header

  03  <text of short name of object>
      --some even number of bytes---

and then lists the properties held, in descending numerical order.  (This
order is essential.)  A property is stored as

  <size byte>   <the actual property data>
                ---between 1 and 8 bytes--

The size byte is arranged as 32*(the number of data bytes, minus 1), plus the
property number.

Each list of properties is ended by a 00 size byte.  This is why there is no
property 0.
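
A property list can be walked as follows.  This Python sketch assumes that
the top three bits of the size byte hold the number of data bytes minus one
(which is what lets lengths of 1 to 8 fit into three bits), and it expects
to be handed the address of the first size byte, that is, just past the
short name.  The function name is invented.

```python
def read_properties(memory, offset):
    """Return a list of (property number, data bytes) pairs.

    ASSUMPTION: size byte = 32 * (data length - 1) + property number.
    """
    properties = []
    while True:
        size = memory[offset]
        if size == 0:                 # the 00 byte which ends the list
            return properties
        number = size & 0x1F          # bottom 5 bits
        length = (size >> 5) + 1      # top 3 bits, holding length - 1
        data = bytes(memory[offset + 1:offset + 1 + length])
        properties.append((number, data))
        offset += 1 + length

# Property 10 with two data bytes, then property 5 with one.
print(read_properties(bytes([0x2A, 0x12, 0x34, 0x05, 0xFF, 0x00]), 0))
```

Because the list is in descending order, a search for a given property can
stop as soon as it meets a smaller number.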


When all the property tables are done, we come to the global variable table.
Global variables are numbered from 0 to 239, and this table begins with 240
initial 2-byte values for them.  After this, space is conventionally left
for all the arrays, dynamic strings and so on which they point to.


We have now reached the top of the save area.  Everything above here is
never altered.


Next is the table of grammar, which is as described above.  It is
immediately followed by the actions table, the preactions table and then the
adjectives table, also described above.


And next the dictionary table, described above.


Next is the code area.  Not all Infocom games begin with Main, but all
Informed ones do.  The code area simply contains a list of routines.

All routines (and static strings) must occur at even addresses, so as to
enable them to have word addresses instead.  (Inform occasionally inserts
00 bytes between routines to ensure this.)

A routine begins with one byte indicating the number of local variables the
routine has (from 0 to 15), and then with that many 2-byte words giving
their initial values, if not supplied by the call to the routine.  (Inform
never makes use of this initialisation, and simply stores 0000's here.) 
Unlike global variables, these bytes are _not_ used for the current values
of the variables: they are kept on the stack.
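
To find where the code proper starts, one must step over this header.  A
small Python sketch, again assuming high-byte-first words and using an
invented function name:

```python
import struct

def read_routine_header(memory, address):
    """Return (initial local values, address of the first instruction)."""
    count = memory[address]
    if count > 15:
        raise ValueError("a routine has at most 15 local variables")
    initial = list(struct.unpack_from(">%dH" % count, memory, address + 1))
    code_start = address + 1 + 2 * count
    return initial, code_start
```

For an Inform-compiled routine the initial values should all come out as
zero.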

Executable code follows this header.  There is no special marker for the end
of a routine; it is simply expected that in every case a legal return
instruction will be hit.

Finally, from the end of the code to the top of memory are the static
strings.  These are put up here to be out of the way, where they won't clog
up the bottom 64K of memory.  There's no table of their addresses, or pointer
to where they begin; each is referred to by an address in the code or data
given earlier.


---------------------------------------------------------------------------
A2. How text is encoded
---------------------------------------------------------------------------


Text is stored as a sequence of 2-byte words.  Each of these is divided into
three 5-bit pieces, plus 1 bit left over, arranged as

   --first byte-------   --second byte---
   7    6 5 4 3 2  1 0   7 6 5  4 3 2 1 0 
   bit  --first--  --second---  --third--

The bit is set only on the last 2-byte word of the text, and so marks the
end.

The pieces are then characters, with values in the range 0 to 31.

There are three alphabets, in which the numbers 6 to 31 mean:

  A0     abcdefghijklmnopqrstuvwxyz
  A1     ABCDEFGHIJKLMNOPQRSTUVWXYZ
  A2      ^0123456789.,!?_#'~/\-:()

('^' being actually the new-line character.)

Character 0 is a space in all alphabets.  Characters 1, 2 and 3 are used for
abbreviations.  Inform makes no use of these, but the Z-machine provides for
commonly occurring strings to be printed out as if they were characters. 
Being plainly abbreviations, these are for some reason called "synonyms".

By default, a character is presumed to be in A0, i.e. to be a lower-case
English letter.  However, the character 4 means that the next one (only) is
in A1; and 5 means the next is in A2.

Notice that character 6 in A2 is blank.  It isn't a space: it simply isn't
there.  The sequence 5 followed by 6 indicates that the next two characters
define an ASCII value.  This is the way to get at the characters not in any
of the three alphabets.  For example, the familiar message

  *** You are dead ***

takes four "characters" to produce each of the *'s.

Finally, note that the end-bit only comes up once every three characters,
so that a way is needed to safely use up any spare characters in the last
2-byte block.  This is done by padding out with 5's.  (5 followed by 5 does
nothing.)
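
The rules so far can be put together into a decoder.  What follows is a
Python sketch under stated assumptions, not a reference implementation: the
two characters of an ASCII escape are assumed to be combined as 32 times the
first plus the second, words are assumed to be stored high byte first, and
abbreviations (characters 1 to 3) are simply refused.  All the names are
invented.

```python
A0 = "abcdefghijklmnopqrstuvwxyz"
A1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# The leading space is only a placeholder for character 6, which is never
# printed from A2; the second entry is the new-line written '^' above.
A2 = " \n0123456789.,!?_#'~/\\-:()"
ALPHABETS = (A0, A1, A2)

def unpack_characters(data):
    """Split 2-byte words into 5-bit characters, up to the end-bit word."""
    chars = []
    for i in range(0, len(data) - 1, 2):
        word = (data[i] << 8) | data[i + 1]
        chars += [(word >> 10) & 31, (word >> 5) & 31, word & 31]
        if word & 0x8000:            # end bit: this was the last word
            break
    return chars

def decode_text(data):
    """Decode encoded text (a bytes object) into an ordinary string."""
    chars = unpack_characters(data)
    out, shift, i = [], 0, 0
    while i < len(chars):
        c = chars[i]
        i += 1
        if c == 0:
            out.append(" ")
        elif c in (1, 2, 3):
            raise ValueError("abbreviations are not handled in this sketch")
        elif c in (4, 5):
            shift = c - 3            # 4 selects A1, 5 selects A2
            continue                 # the shift applies to the next one
        elif shift == 2 and c == 6:
            if i + 1 >= len(chars):
                break                # escape cut short: nothing to print
            # ASSUMPTION: ASCII value = 32 * first + second
            out.append(chr((chars[i] << 5) | chars[i + 1]))
            i += 2
        else:
            out.append(ALPHABETS[shift][c - 6])
        shift = 0
    return "".join(out)
```

Under that assumption a '*' (ASCII 42) is the four characters 5, 6, 1, 10,
and the two 5's needed to fill out its second word print nothing.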

This is especially the case with dictionary entries.  Some dictionary
entries, like "i", ought only to take one 2-byte block, but in order to make
all entries exactly two 2-byte blocks long and alphabetically sortable by
number, they are padded out by up to five 5's in a row.
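
For instance, the word "i" might be encoded as in the following Python
sketch.  It handles lower-case letters only (a simplification made for the
example, not a rule of the format), assumes words are stored high byte
first, and uses an invented function name.

```python
def encode_dictionary_word(word):
    """Encode up to 6 lower-case letters as the 4 text bytes of an entry."""
    chars = []
    for letter in word[:6]:               # only six letters are significant
        if not "a" <= letter <= "z":
            raise ValueError("this sketch handles lower-case letters only")
        chars.append(ord(letter) - ord("a") + 6)
    chars += [5] * (6 - len(chars))       # pad out with 5's
    first = (chars[0] << 10) | (chars[1] << 5) | chars[2]
    second = 0x8000 | (chars[3] << 10) | (chars[4] << 5) | chars[5]
    return first.to_bytes(2, "big") + second.to_bytes(2, "big")

print(encode_dictionary_word("i").hex())
```

Since every letter has a larger value than the padding 5, comparing the four
bytes as one number puts "take" before "taken", which is the point of the
exercise.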

In practice the text compression factor is not really very good: "Curses"
contains about 127000 characters of text, stored in 91000 bytes.  (Text
usually accounts for about three quarters of a story file.)  But the
encoding does at least encrypt the text so that casual browsers can't read
it.


---------------------------------------------------------------------------
A3. How Z-code is encoded
---------------------------------------------------------------------------


The encoding of version 3 Z-code is, to say the least, complicated.  The
reader is warned that it is also different to that in all other versions. 
There are all kinds of exceptions intended either to make small economies of
code size (these are very seldom worth the effort, in fact) or to provide
new features tacked on at the last minute.

Experimenting with Inform as an assembler, while tracing is turned on, may
be helpful.

Z-code understands four kinds of operand, and describes these in 2-bit
fields:

  $$00    Large constant (>=256)         2 bytes
  $$01    Small constant (0 to 255)      1 byte
  $$10    Variable                       1 byte
  $$11    Omitted altogether             0 bytes

Variables are described in one byte.  00 means the top of the stack, 01 to
$0f are the local variables of the current routine and $10 to $ff are the
global variables, 0 to 239.  Writing to 00 pushes something onto the stack
and reading from it pulls it off.  The stack can also be manipulated (with
care) using the PUSH, PULL and POP instructions.  The stack is guaranteed to
be at least 512 bytes long, and some interpreters are more generous.  There
isn't any way to check stack overflowing, so be careful with recursion.
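
The variable byte can be classified as in this small Python sketch (the
function name is invented; locals are reported by their byte value, 1 to 15,
and globals by their number, 0 to 239):

```python
def describe_variable(byte):
    """Say what a one-byte variable reference refers to."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("a variable is described in one byte")
    if byte == 0x00:
        return ("stack", None)        # push on write, pull on read
    if byte <= 0x0F:
        return ("local", byte)        # 01 to $0f
    return ("global", byte - 0x10)    # $10 to $ff are globals 0 to 239

print(describe_variable(0x10))
```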

(One of the trickiest problems in compiling Z-code is throwing away unwanted
return values of routines which are left on the stack... it can take
hundreds of turns before a game crashes if this is got wrong.)

Z-code opcodes are 1 byte only.  To begin with, look at the top two bits.
If these are $$11, we shall call it "variable"; if $$10, "short"; and
otherwise "long".

In this description, we shall adopt the opcode names used by the existing
Infocom disassembler "TXD".

For short opcodes, look at the next two bits (4 and 5).  These give the kind
of operand which the code has.  If this is $$11, there isn't an operand and
the opcode has no argument at all.  In this event, the remaining part of the
opcode gives what it is:

    $00   RET#TRUE                       (1) The opcode is followed by text
    $01   RET#FALSE                          in 2-byte chunks as usual
    $02   PRINT             (1)
    $03   PRINT_RET         (1)          (2) Opcode followed by a branch
    $05   SAVE              (2)
    $06   RESTORE           (2)          (3) This is an abbreviation for
    $07   RESTART                            RET SP, to save one byte
    $08   RET(SP)+          (3)
    $09   POP
    $0A   QUIT
    $0B   NEW_LINE
    $0C   SHOW_SCORE
    $0D   VERIFY            (2)

If the type wasn't $$11, then an operand follows, and moreover the "code"
part of the opcode means something different:

    $00   JZ                (2)          (4) Followed by a store opcode
    $01   GET_SIBLING       (2) (4)          (before the branch, if there
    $02   GET_CHILD         (2) (4)          is also a branch)
    $03   GET_PARENT        (4)
    $04   GET_PROP_LEN      (4)          (5) Refers indirectly to variables
    $05   INC               (5)              by their number (Inform
    $06   DEC               (5)              suppresses this feature, so
    $07   PRINT_ADDR                         "@inc sp" produces the constant
                                             0 instead of variable no. 0 as
    $09   REMOVE_OBJ                         operand)
    $0A   PRINT_OBJ
    $0B   RET
    $0C   JUMP
    $0D   PRINT_PADDR
    $0E   LOAD              (4) (5)
    $0F   NOT               (4)

"Long" opcodes have two operands.  The bottom 5 bits of the opcode say what
it is:

    $01   JE                (2) (6)      (6) If this is encoded as
    $02   JLE               (2)              "variable", then operands 3 and
    $03   JGE               (2)              4 (if present) are used as a
    $04   DEC_CHK           (2) (5)          kind of OR command: eg,
    $05   INC_CHK           (2) (5)          branch if o1 = o2, o3 or o4
    $06   COMPARE_POBJ      (2)   
    $07   TEST              (2)   
    $08   OR                (4)   
    $09   AND               (4)   
    $0A   TEST_ATTR         (2)   
    $0B   SET_ATTR
    $0C   CLEAR_ATTR
    $0D   STORE             (5)
    $0E   INSERT_OBJ
    $0F   LOADW             (4)   
    $10   LOADB             (4)   
    $11   GET_PROP          (4)   
    $12   GET_PROP_ADDR     (4)   
    $13   GET_NEXT_PROP     (4)   
    $14   ADD               (4)   
    $15   SUB               (4)   
    $16   MUL               (4)   
    $17   DIV               (4)   
    $18   MOD               (4)

The alert reader will notice that bits 5 and 6 are left spare to be used. 
Now there are two operands to specify, which ought to take up 4 bits, which
obviously won't fit.  So a more economical form is used instead.  Bit 6
refers to the first operand, and bit 5 to the second.  A value of 0 means a
small constant and 1 means a variable.  Now, type $$11 (not really there)
operands can't happen, so that's no problem, but there might well be type
$$00 (large constant) operands, for example in "@mul x #666 sp".  In this
event, the opcode is instead programmed as a "variable" opcode.
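
The dissection of the opcode byte described so far might be sketched like
this.  (Python, for illustration; the function names are invented, and bit 0
is taken to be the least significant bit.)

```python
OPERAND_TYPES = ["large constant", "small constant", "variable", "omitted"]

def opcode_form(opcode):
    """Classify an opcode byte by its top two bits."""
    top = opcode >> 6
    if top == 0b11:
        return "variable"
    if top == 0b10:
        return "short"
    return "long"

def decode_short(opcode):
    """Return (code, operand type) for a short opcode."""
    return opcode & 0x0F, OPERAND_TYPES[(opcode >> 4) & 3]

def decode_long(opcode):
    """Return (code, first operand type, second operand type)."""
    first = "variable" if opcode & 0x40 else "small constant"    # bit 6
    second = "variable" if opcode & 0x20 else "small constant"   # bit 5
    return opcode & 0x1F, first, second

print(opcode_form(0xB0), decode_short(0xB0))   # RET#TRUE, no operand
print(opcode_form(0x54), decode_long(0x54))    # ADD variable, small
```

Note that in the long form bit 6 does double duty: it is one of the "top two
bits" and also describes the first operand, which is why both $$00 and $$01
count as "long".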

So we must now describe the "variable" opcode form.  In addition to the
possible opcodes which can arise from overflowing "long" opcodes, there are
others which can only be "variable".  Here all of the bottom 6 bits are
available to describe the opcode, and this either holds the above numbers
$00 to $18 or else:

    $20   CALL              (4)          (7) These codes are somewhat
    $21   STOREW                             conjectural and only apply
    $22   STOREB                             to a few Infocom games; Inform
    $23   PUT_PROP                           never uses them unless told to
    $24   READ                               explicitly
    $25   PRINT_CHAR
    $26   PRINT_NUM               
    $27   RANDOM            (4)
    $28   PUSH                    
    $29   PULL              (5)
    $2A   STATUS_SIZE       (7)      
    $2B   SET_WINDOW        (7)      

    $33   SET_PRINT         (7)      
    $34   #RECORD_MODE      (7)      
    $35   SOUND             (7)      

Some of these are only of "variable" type because the available codes for
the other types had run out - PRINT_CHAR, for instance.  Others, especially
CALL, need the flexibility to have between 1 and 4 operands.

In the "variable" type opcode, all eight bits of the opcode have been used
up, so we have to add another byte describing the operands.  This is divided
into four 2-bit fields.  For example, $$00101111 means large constant
followed by variable (and no third or fourth operand).

Once the opcode is out of the way, the operands are simply stored in one or
two-byte form as appropriate.
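
The operand byte of a "variable" opcode can be unpicked as in this Python
sketch, which reads the four 2-bit fields from the top down and stops at the
first omitted one.  (The names are invented for the example.)

```python
OPERAND_TYPES = ["large constant", "small constant", "variable", "omitted"]
OPERAND_SIZES = [2, 1, 1, 0]     # bytes taken up by each kind

def decode_type_byte(type_byte):
    """Return the list of operand types described by the byte."""
    kinds = []
    for shift in (6, 4, 2, 0):
        kind = (type_byte >> shift) & 3
        if kind == 3:            # $$11: no more operands
            break
        kinds.append(kind)
    return kinds

kinds = decode_type_byte(0b00101111)
print([OPERAND_TYPES[k] for k in kinds])
print(sum(OPERAND_SIZES[k] for k in kinds), "bytes of operands follow")
```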

PRINT and PRINT_RET are followed by text: this is assembled in the usual way
immediately after the opcode (which may well be at an odd address, but this
doesn't matter) and execution resumes after the last 2-byte chunk of text
(the one with top bit set).

Opcodes marked as "store" in the above tables return a value: for example,
MUL multiplies its two arguments together, and CALL calls a routine which
must return a value.  Such instructions are followed by a single byte giving
the variable (stack pointer, local or global as usual) to put it in.  This
may look like an extra operand but is not: there is no need to tell the
Z-machine what type it has, since it must be a variable.

Finally, there are instructions which test a condition.  Apart from the
obvious branch instructions (JE and so on), SAVE does this, for example, the
test in question being whether or not the save was successful.  Branches are
stored in two different ways for economy reasons: nearby ones in a single
byte at the end of the instruction, farther ones in two bytes.

The top bit of the first byte of a branch is the "flag".  If this is clear,
then a branch occurs when the condition came out false.  If it is set, then
the branch occurs when it was true.

If the next bit (bit 6) is set, then the branch is in abbreviated 1-byte
format and the offset is in the bottom 6 bits (0 to 5).  If not, the offset
is in the bottom 14 bits (0 to 5 of the first byte, and all of the second).
This offset can be positive or negative.  (Eg., all 1's means -1 in the
usual way.)

In the abbreviated form, an offset of 1 in fact means "return true from the
current routine" and an offset of $20 (i.e., -31) means "return false".  An
offset of 1 is never useful but -31 might arise, and so it is essential to
use the long form for such branches.

Working out what the offset ought to be is more complicated than it appears
because the PC has already moved on from the start of the instruction when
it reaches the branch.  The bizarre formula in question is

  Offset = Destination address - Address of this instruction - Length + B

where

  Length = number of bytes in instruction (not counting the branch)

and B is 1 for short branches, 0 for long ones.

In practice Inform compiles branches in the long form, considering the
economy to be not worth the nightmarish computation needed to make the
long/short decision.  (One problem is that the number of bytes in each
instruction _must_ be the same in both passes, so that the decision needs to
be made before the value of the offset is known... in a 2-pass compiler this
is insoluble.  Another is that the offsets are affected by the size of the
branch, confusing things considerably on forward branches.)  However, its
assembler mode allows you to make an explicit choice.
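
For the long form, the formula might be applied as in the Python sketch
below.  It assumes that, bit 6 being taken up by the form flag, the offset
occupies the 14 bits beneath it in two's-complement form; the function and
argument names are invented.

```python
def encode_long_branch(destination, instruction_address, length,
                       branch_if_true):
    """Return the two branch bytes for a long-form branch.

    length is the number of bytes in the instruction, not counting
    the branch itself.  B is 0 for the long form.
    """
    offset = destination - instruction_address - length
    if not -0x2000 <= offset <= 0x1FFF:
        raise ValueError("offset does not fit in 14 bits")
    field = offset & 0x3FFF                  # two's complement, 14 bits
    first = (0x80 if branch_if_true else 0x00) | (field >> 8)
    second = field & 0xFF
    return bytes([first, second])            # bit 6 of first stays clear

print(encode_long_branch(0x1010, 0x1000, 3, True).hex())    # forwards
print(encode_long_branch(0x0FF0, 0x1000, 3, False).hex())   # backwards
```

Since the length of the result never depends on the offset, this is safe to
use in both passes, which is exactly why Inform prefers it.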

JUMP instructions similarly encode their address operand as an offset, but
always as a two-byte (signed) constant.  In this respect they differ from
CALL instructions.  In a CALL, the address is half the absolute routine
address.


---------------------------------------------------------------------------
A4. Using Inform as an assembler
---------------------------------------------------------------------------


Inform can also act as an assembler.  A line beginning with an @ character
is sent straight to the assembly routines.  Constants and variable names
can be given as operands but not compound expressions.  The following are
supported:

jump <label>                               go to (local) label

jz <a> [~]<label>                          If a==0 go to label
                                           (or return if "rtrue" or "rfalse")
je <a> <b> [~]<label>                      a=b
jge <a> <b> [~]<label>                     a>b    (note: not >=)
jle <a> <b> [~]<label>                     a<b
test_attr <a> <b> [~]<label>               object a has attribute b
test <a> <b> [~]<label>                    a&b != 0
compare_pobj <a> <b> [~]<label>            objects a, b have same parent
                                           In the above, if the ~ is set,
                                           the condition is negated

ret#true                                   return true
ret#false                                  return false
ret
ret(sp)+                                   return sp
ret <a>                                    return a

save [~]<label>                            save; go to label if successful
restore [~]<label>                         restore; ...
verify                                     verify file integrity
restart                                    reset Z-machine
quit                                       exit Z-machine
show_score                                 redisplay status line immediately

store <v> <a>                              v=a
loadw <a> <b> <v>                          v=word at (word address) a+b
loadb <a> <b> <v>                          v=byte at (byte address) a+b
storew <a> <b> <c>                         word at (word address) a+b=c
storeb <a> <b> <c>                         byte at (byte address) a+b=c
get_prop <a> <b> <v>                       v=property b of object a
get_prop_addr <a> <b> <v>                  v=address of...
get_next_prop <a> <b> <v>                  ? - seldom used
put_prop <a> <b> <c>                       property b of obj a is c
get_parent <a> <v>                         v=parent of a
get_prop_len <a> <v>                       v=property length of a
get_sibling <a> <v> [~]<label>             v=sibling of a, branch if this =0
get_child <a> <v> [~]<label>               ...child
inc_chk <v> <a> [~]<label>                 if v++=a then label
dec_chk <v> <a> [~]<label>                 if v--=a then label

load <v1> <v2>                             ?

random <a> <v>                             v=random number up to a

inc <v>                                    v=v+1
dec <v>                                    v=v-1

or <a> <b> <v>                             v=a | b
and <a> <b> <v>                            v=a & b
add <a> <b> <v>                            v=a + b
sub <a> <b> <v>                            v=a - b
div <a> <b> <v>                            v=a / b
mod <a> <b> <v>                            v=a % b

set_attr <a> <b>                           set attribute bit b on object a
clear_attr <a> <b>                         clear...

push <a1> [... <a4>]                       push a1 to a4 onto the stack
pull <a1> [... <a4>]                       pull a1 to a4 from it
pop                                        throw away top of stack

insert_obj <a> <b>                         give object a to b
remove_obj <a>                             remove a from hierarchy

call <rname> [ <a1>... ] <v>               v=rname(a1,...)

icall                                      indirect call: sp=(sp)()

read <a> <b>                               see above
print_num <v>                              print v in decimal
print "<text>"                             print text
print_ret "<text>"                         print text, newline and return 1
new_line                                   print newline
print_addr <a>                             print string at address a
print_paddr <a>                            print string at address 2*a
print_obj <a>                              print name of object a
print_char <a>                             print ASCII char <a>

Branch statements use the long form of the branch code if the label (or
tilde) is prefaced with a question mark '?', and otherwise use the short
form.


---------------------------------------------------------------------------
B1. A Bill of Player's Rights
---------------------------------------------------------------------------


  Perhaps the most important point about designing a game is to think as a
player and not a designer.  I think the least a player deserves is:

    1.  Not to be killed without warning

  At its most basic level, this means that a room with three exits, two of
which lead to instant death and the third to treasure, is unreasonable
without some hint.  Mention of which brings us to:

    2.  Not to be given horribly unclear hints
 
  Many years ago, I played a game in which going north from a cave led to a
lethal pit.  The hint was: there was a pride of lions carved above the
doorway.  Good hints can be skilfully hidden, or very brief (I think, for
example, the hint in the moving-rocks plain problem in "Spellbreaker" is a
masterpiece) but should not need explaining even after the event.

  A more sophisticated version of (1) leads us to:

    3.  To be able to win without experience of past lives

  Suppose, for instance, there is a nuclear bomb buried under some anonymous
floor somewhere, which must be disarmed.  It is unreasonable to expect a
player to dig up this floor purely because in previous games, the bomb blew
up there.  To take a more concrete example, in "The Lurking Horror" there is
something which needs cooking for the right length of time.  As far as I can
tell, the only way to find out the right time is by trial and error.  But
you only get one trial per game.  In principle a good player should be able
to play the entire game out without doing anything illogical.  In similar
vein:

    4.  To be able to win without knowledge of future events

  For example, the game opens near a shop.  You have one coin and can buy a
lamp, a magic carpet or a periscope.  Five minutes later you are transported
away without warning to a submarine, whereupon you need a periscope.  If you
bought the carpet, bad luck.

    5.  Not to have the game closed off without warning

  Closed off meaning that it would become impossible to proceed at some
later date.  If there is a papier-mache wall which you can walk through at
the very beginning of the game, it is extremely annoying to find that a
puzzle at the very end requires it to still be intact, because every one of
your saved games will be useless.  Similarly it is quite common to have a
room which can only be visited once per game.  If there are two different
things to be accomplished there, this should be hinted at.

    6.  Not to need to do unlikely things

  For example, a game which depends on asking a policeman about something he
could not reasonably know about.  (Less extremely, the problem of the
hacker's keys in "The Lurking Horror".)  Another unlikely thing is waiting
in uninteresting places.  If you have a junction such that after five turns
an elf turns up and gives you a magic ring, a player may well never spend
five turns there and never solve what you intended to be straightforward. 
On the other hand, if you were to put something which demanded investigation
in the junction, it might be fair enough.  ("Zork III" is especially poor in
this respect.)

    7.  Not to need to do boring things for the sake of it

  In the bad old days many games would make life difficult by putting
objects needed to solve a problem miles away from where the problem was,
despite all logic - say, putting a boat in the middle of a desert.  Or, for
example, it might be fun to have a four-disc Tower of Hanoi puzzle in a
game.  But not an eight-disc one.

    8.  Not to have to type exactly the right verb

  For instance, looking inside a box finds nothing, but searching it does. 
Or consider the following dialogue (amazingly, from "Sorcerer"):

    >unlock journal
    (with the small key)
    No spell would help with that!

    >open journal
    (with the small key)
    The journal springs open.

This is so misleading as to constitute a bug.  But it's an easy design fault
to fall into.  (Similarly, the wording needed to use the brick in Zork II
strikes me as quite unfair.  Or perhaps I missed something obvious.)

    9.  To be allowed reasonable synonyms

  In the same room in "Sorcerer" is a "woven wall hanging" which can instead
be called "tapestry" (though not "curtain").  This is not a luxury, it's an
essential.

    10.  To have a decent parser

  This goes without saying.  At the very least it should provide for taking
and dropping multiple objects.

  The last few are more a matter of taste, but I believe in them:

    11.  To have reasonable freedom of action

  Being locked up in a long sequence of prisons, with only brief escapes
between them, is not all that entertaining.  After a while the player begins
to feel that the designer has tied him to a chair in order to shout the plot
at him.

    12.  Not to depend much on luck

  Small chance variations add to the fun, but only small ones.  The thief in
"Zork I" seems to me to be just about right in this respect, and similarly
the spinning room in "Zork II".  But a ten-ton weight which fell down and
killed you at a certain point in half of all games is just annoying.

    13.  To be able to understand a problem once it is solved

  This may sound odd, but many problems are solved by accident or trial and
error.  A guard-post which can be passed only if you are carrying a spear,
for instance, ought to have some indication that this is why you're allowed
past.  (The most extreme example must be the notorious Bank of Zork.)

    14.  Not to be given too many red herrings

  A few red herrings make a game more interesting.  A very nice feature of
"Zork I", "II" and "III" is that they each contain red herrings explained in
the others (in one case, explained in "Sorcerer").  But difficult puzzles
tend to be solved last, and the main technique players use is to look at
their maps and see what's left that they don't understand.  This is
frustrated when there are many insoluble puzzles and useless objects.  So
you can expect players to lose interest if you aren't careful.  My personal
view is that red herrings ought to have some clue provided (even if only much
later): for instance, if there is a useless coconut near the beginning, then
perhaps much later an absent-minded botanist could be found who wandered
about dropping them.  The coconut should at least have some rationale.

  The very worst game I've played for red herrings is "Sorcerer", which by
my reckoning has 10.

    15.  To have a good reason why something is impossible

  Unless it's also funny, a very contrived reason why something is
impossible just irritates.  (The reason one can't walk on the grass in
"Trinity" is only just funny enough, I think.)

    16.  Not to need to be American to understand hints

  The diamond maze in "Zork II" being a case in point.  Similarly, it's
polite to allow the player to type English or American spellings or idiom. 
For instance "Trinity" endears itself to English players in that the soccer
ball can be called "football" - soccer is a word almost never used in
England.

    17.  To know how the game is getting on

  In other words, when the end is approaching, or how the plot is
developing.  Once upon a time, score was the only measure of this, but
hopefully not any more.


---------------------------------------------------------------------------
B2. What makes a good game?
---------------------------------------------------------------------------


1.  The Plot

The days of games which consisted of wandering around doing unrelated things
to get treasures are long past: the original Adventure was fun, and so
was Zork, but two such games are enough.  There should be some overall task
to be achieved, and it ought to be apparent to the player in advance.

This isn't to say that it should be apparent at once.  Instead, one can
begin with just an atmosphere or mood.  But if so, there must be a
consistent style throughout and this isn't easy to keep up.  "The Lurking
Horror" is an excellent example of a successful genre style; so is "Leather
Goddesses of Phobos".

At its most basic, this means there should be no electric drills lying about
in a medieval-style fantasy.  The original Adventure was very clean in this
respect, whereas Zork was less so: I think this is why Adventure remains the
better game even though virtually everything in Zork was individually
better.

If the chosen genre isn't fresh and relatively new, then the game had better
be very good.

Plot begins with the opening message, rather the way an episode of Star Trek
begins before the credits come up.  It ought to be striking and concise (not
an effort to sit through, like the title page of "Beyond Zork").  By and
large Infocom were good at this.  A fine example is the overture to
"Trinity" (by Brian Moriarty):

  Sharp words between the superpowers. Tanks in East Berlin. And now,
  reports the BBC, rumors of a satellite blackout. It's enough to spoil your
  continental breakfast.

  But the world will have to wait. This is the last day of your $599 London
  Getaway Package, and you're determined to soak up as much of that
  authentic English ambience as you can. So you've left the tour bus behind,
  ditched the camera and escaped to Hyde Park for a contemplative stroll
  through the Kensington Gardens.

Already you know: who you are (an unadventurous American tourist, of no
significance in the world); exactly where you are (Kensington Gardens, Hyde
Park, London, England); and what is going on (World War III is about to
break out).  Notice the careful details: mention of the BBC, of continental
breakfasts, of the camera and the tour bus.  More subtly, "Trinity" is a
game which starts as a kind of escapism from a disastrous world out of
control: notice the way the first paragraph is in tense, blunt,
headline-like sentences, whereas the second is much more relaxed.  So a lot
has been achieved by these two opening paragraphs.

The most common plots boil down to saving the world, by exploring until
eventually you vanquish something ("Lurking Horror" again, for instance) or
collecting some number of objects hidden in awkward places ("Leather
Goddesses" again, say).  The latter can get very hackneyed (got to find the
nine magic spoons of Zenda to reunite the Kingdom...), so much so that it
becomes a bit of a joke ("Hollywood Hijinx") but still it isn't a bad idea,
because it enables many different problems to be open at once.

Most games have a prologue, a middle game and an end game, which are usually
quite closed off from each other.  Usually once one of these phases has been
left, it cannot be returned to.

2.  The Prologue

In establishing an atmosphere, the prologue gives a good head start.  In the
original mainframe Adventure, this was the above-ground landscape; the fact
that it was there gave a much greater sense of claustrophobia and depth to
the underground bulk of the game.

Sometimes a dream-sequence is used (for instance, in "Lurking Horror"), or
sometimes simply a more mundane region of game (for instance, the
guild-house in "Sorcerer").  It should not be too large or too hard.

As well as establishing the mood of the game, and giving out some background
information, the prologue has to attract a player enough to make him carry
on playing.  It's worth imagining that the player is only toying with the
game at this stage, and isn't drawing a map or being at all careful.  If the
prologue is big, the player will quickly get lost and give up.  If it is too
hard, then many players simply won't reach the middle game.

Perhaps eight to ten rooms is the largest a prologue ought to be, and even
then it should have a simple (easily remembered) map layout.

3.  The Middle Game

A useful exercise is to draw out a tree (or more accurately a lattice) of
all the puzzles in a game.  At the top is a node representing the start of
the game, and then lower nodes represent solved puzzles.  An arrow is drawn
between two puzzles if one has to be solved before the other can be.  For
instance, a simple portion might look like:

                               Start
                              /     \
                             /       \
                      Find key     Find car
                             \        |
                              \       |
                               Start car
                                   |
                                   |
                             Reach motorway

This is useful because it checks that the game is soluble (for example, if
the ignition key had been kept in a phone box on the motorway, it wouldn't
have been) but also because it shows the overall structure of the game.
The questions to ask are:

  How much is visible at once?
  Do large parts of the game depend on one difficult puzzle?
  How many steps does a typical problem need?
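
The solubility check can be mechanised.  What follows is only a sketch, in
Python rather than Inform: the function name solve_order and the dictionary
layout are invented for illustration.  It runs a standard topological sort
over the lattice drawn above, and reports failure for the variant in which
the ignition key is kept on the motorway.

```python
from collections import deque


def solve_order(requires):
    """Return an order in which every puzzle can be solved, or None.

    `requires` maps each puzzle to the puzzles that must be solved first.
    (Hypothetical layout, chosen for this sketch.)
    """
    # Count unmet prerequisites, and record what each puzzle unlocks.
    waiting = {p: len(needs) for p, needs in requires.items()}
    unlocks = {p: [] for p in requires}
    for p, needs in requires.items():
        for n in needs:
            unlocks[n].append(p)

    ready = deque(p for p, count in waiting.items() if count == 0)
    order = []
    while ready:
        p = ready.popleft()
        order.append(p)
        for q in unlocks[p]:
            waiting[q] -= 1
            if waiting[q] == 0:
                ready.append(q)

    # Anything never reached sits on a circular dependency: insoluble.
    return order if len(order) == len(requires) else None


lattice = {
    "Start": [],
    "Find key": ["Start"],
    "Find car": ["Start"],
    "Start car": ["Find key", "Find car"],
    "Reach motorway": ["Start car"],
}

# The broken variant: the key is in a phone box on the motorway.
broken = dict(lattice)
broken["Find key"] = ["Start", "Reach motorway"]

print(solve_order(lattice))
print(solve_order(broken))
```

The same table answers the other questions too: the number of puzzles
waiting in the queue at any moment is a rough measure of how wide the game
is at that point.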

Some games, such as the original Adventure, are very wide: there are thirty or
so puzzles, all easily available, none leading to each other.  Others, such as
"Spellbreaker", are very narrow: a long sequence of puzzles, each of which
leads only to a chance to solve the next.

A compromise is probably best.  Wide games are not very interesting, while
narrow ones can in a way be easy: if only one puzzle is available at a time,
the player will just concentrate on it, and will not be held up by trying to
use objects which are provided for different puzzles.

Bottlenecks should be avoided unless they are reasonably guessable:
otherwise many players will simply get no further.

Puzzles ought not to be simply a matter of typing in one well-chosen line. 
One hallmark of a good game is not to get any points for picking up an
easily-available key and unlocking a door with it.  This sort of low-level
achievement - like wearing an overcoat found lying around, for instance -
should not be enough.  A memorable puzzle will need several different ideas 
to solve (the Babel fish dispenser in "Hitch-hikers", for instance).

4.  Density

Once upon a time, the sole measure of quality in advertisements for
adventure games was the number of rooms.  Even quite small programs would
have 200 rooms, which meant only minimal room descriptions and simple
puzzles which were scattered thinly over the map.

Nowadays a healthier principle has been adopted: that (barring a few
junctions and corridors) there should be something out of the ordinary about
every room.

One reason for the quality of the "Infocom" games is that the version 3
system has an absolute maximum of 255 objects, which needs to cover rooms,
objects and many other things (eg, compass directions, or the spells in
"Enchanter" et al).  Many "objects" are not portable anyway: walls,
tapestries, thrones, control panels, coal-grinding machines and so on.

As a rule of thumb, four objects to one room is about right: this means
there will be, say, 50-60 rooms.  Of the remaining 200 objects, one can
expect 15-20 to be used up by the game's administration (eg, a "darkness"
room, 10 compass directions, a player and so on).  Another 50-75 or so
objects will be portable but the largest number, at least 100, will be
furniture.
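
The arithmetic can be checked with a few lines of Python.  This is only a
sketch: the function name object_budget is invented, and the figures fed to
it are simply the ends of the ranges quoted above.

```python
MAX_OBJECTS = 255   # the version-3 limit quoted above


def object_budget(rooms, admin, portable, max_objects=MAX_OBJECTS):
    """Return how many objects are left over for furniture."""
    furniture = max_objects - rooms - admin - portable
    if furniture < 0:
        raise ValueError("over the object limit")
    return furniture


# Most generous on rooms, administration and portable objects:
print(object_budget(rooms=60, admin=20, portable=75))   # 100
# Least generous:
print(object_budget(rooms=50, admin=15, portable=50))   # 140
```

So even at the top of every range there is room for the hundred or so
items of furniture.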

So an object limit can be a blessing as well as a curse: it forces the
designer to make the game dense.  Rooms are too precious to be wasted.

5.  Rewards

There are two kinds of reward which need to be given to a player in return
for solving a puzzle.  One is obvious: that the game should advance a
little.  But the player at the keyboard needs a reward as well, that the
game should offer something new to look at.  In the old days, when a puzzle
was solved, the player simply got a bar of gold and had one less puzzle to
solve.

Much better is to offer the player some new rooms and objects to play
with, as this is a real incentive.  If no new rooms are on offer, at least
the "treasure" objects can be made interesting, like the spells in the
"Enchanter" trilogy or the cubes in "Spellbreaker".

6.  Mazes

Almost every game contains a maze.  Nothing nowadays will ever equal the
immortal

  You are in a maze of twisty little passages, all alike.

But now we are all jaded.  A maze should offer some twist which hasn't been
done before (the ones in "Enchanter" and "Sorcerer" being fine examples).

The point is not to make it hard and boring.  The standard maze solution is
to litter the rooms with objects in order to make the rooms distinguishable.
It's easy enough to obstruct this, the thief in "Zork I" being about the
wittiest way of doing so.  But that only makes a maze tediously difficult.

Instead there should be an elegant quick solution: for instance a guide who
needs to be bribed, or fluorescent arrows painted on the floor which can
only be seen in darkness (plus a hint about darkness, of course).

Above all, don't design a maze which appears to be a standard impossibly
hard one: even if it isn't, a player may lose heart and give up rather than
go to the trouble of mapping it.

7.  Wrong guesses

For some puzzles, a perfectly good alternative solution will occur to
players.  It's good style to code two or more solutions to the same puzzle,
if that doesn't upset the rest of the game.  But even if it does, at least
a game should say something when a good guess is made.  (Trying to cross the
volcano on the magic carpet in "Spellbreaker" is a case in point.)

One reason why "Zork" held the player's attention so firmly (and why it took
about ten times the code size, despite being slightly smaller than the
original mainframe Adventure) was that it had a huge stock of usually funny
responses to reasonable things which might be tried.

My favourite funny response, which I can't resist reprinting here, is:

   You are falling towards the ground, wind whipping around you.
   >east
   Down seems more likely.                                  ["Spellbreaker"]

(Though I also recommend trying to take the sea serpent in "Zork II".)  This
is a good example because it's exactly the sort of boring rule (can't move
from the midair position) which most designers usually want to code as fast
as possible, and don't write with any imagination.

Just as some puzzles should have more than one solution, some objects should
have more than one purpose.  In bad old games, players automatically threw
everything away as soon as they had used it.  In better designed games,
obviously useful things (like the crowbar and the gloves in "Lurking
Horror") should be hung on to by the player throughout.

8.  The Map

To maintain an atmosphere throughout it's vital that the map should be
continuous.  Adventure games used to have maps like

                            Glacier
                               |
                          Oriental Room  --  Fire Station
                           (megaphone)        (pot plant)
                               |
                           Cheese Room

in which the rooms bore no relation to each other, so that the game had no
overall geography at all, and objects were unrelated to the rooms they were
in.  Much more believable is something like

                   Snowy Mountainside
                            \  
                         Carved Tunnel
                               |
                         Oriental Room  -- Jade Passage -- Fire Dragon
                            (buddha)       (bonsai tree)      Room
                               |
                         Blossom Room

Try to have some large-scale geography too: the mountainside should extend
across the map in both directions.  If there is a stream passing through a
given location, what happens to it?  And so on.

In designing a map, it adds to the interest to make a few connections in the
rarer compass directions (NE, NW, SE, SW) to prevent the player from feeling
that the game has a square grid.  Also, it's nice to have a few
(possibly long) loops which can be walked around, to prevent endless
retracing of steps.

If the map is very large, or if a good deal of to-and-froing is called for,
there should be some rapid means of moving across it, such as the magic
words in Adventure, or the cubes in "Spellbreaker".

9.  The End Game

Some end games are small ("Lurking Horror", or "Sorcerer" for instance),
others large (the master game of the mainframe Adventure).  Nonetheless
almost all games have one.

End games serve two purposes.  Firstly they give the player a sense of being
near to success, and can be used to culminate the plot, to reveal the game's
secrets.  This is obvious enough.  But secondly they also serve to stop the
final stage of the game from being too hard.

As a designer, you don't usually want the last step to be too difficult; you
want to give the player the satisfaction of finishing, as a reward for
having got through the game.  (But of course you want to make him work for
it.)  An end game helps, because it narrows the game, so that only a few
rooms and objects are accessible.

The most annoying thing is requiring the player to have brought a few
otherwise useless objects with him.  The player should not be thinking that
the reason for being stuck on the master game is that something very obscure
should have been done 500 turns before.

10.  And Finally...

Finally, the winner gets some last message (which, like the opening message,
should have something amusing in it and should not be too long).  That
needn't quite be all, though.  In its final incarnations (alas, not the one
included in Lost Treasures), "Zork I" offered winners access to the hints
system at the RESTART, RESTORE or QUIT prompt.

As a last word on game design: Inform makes it easy to knock up games which
look a little bit like the classics, until you play them.  Infocom, Inc,
acquired a certain mystique in their time, but a game is not good merely
because it runs under one of their interpreters.  Still, I hope a few people
may put in the effort to turn out a finished game or two, and add to the
canon.


---------------------------------------------------------------------------
C1. A Hello Cruel World program
---------------------------------------------------------------------------


>   !
>   ! A great step backward in interactive fiction...
>   !
>
>   Object hillside "Bare hillside" nothing;
>   global place = hillside;
>   global score = 0;
>   global turns = 1;
>   
>   [ Main;
>   
>     print "^^^^^^^^^^^You wake up, shivering to see that Morgoth \
>            the Flatulent Devil is towering over you...^^";
>     Message();
>     print "^^\
>       ...and he squashes you effortlessly.^^   *** You have died ***^^^^^";
>   
>     quit;
>   ];
>   
>   [ Message i;
>     print "HELLO CRUEL WORLD^";
>     print "A Non-Interactive Demonstration^\
>            Copyright (c) 1993 by Graham Nelson. All rights reserved.^";
>     print "Release ";
>     print_num 0-->1;
>     print " / Serial number ";
>   
>     for i 18 to 23 { print_char 0->i; } new_line;
>   ];
>   

Note that the familiar banner has to be produced by your code.  By
convention, the word at bytes 2 and 3 of the file (read above as 0-->1) is
the release number, and this is what is set by the RELEASE command.  In this
file there isn't a RELEASE command, so it comes out as 1.  Bytes 18 to 23
contain the serial number, or in fact the serial string of ASCII characters.
By custom and tradition, these are the date of compilation arranged YYMMDD,
and Inform sets these automatically.
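
The same two fields can be read from outside the game.  The following is a
sketch in Python, not part of Inform: the function name banner_fields is
invented, the 24-byte header is made up for the example, and it assumes
that words are stored high byte first.

```python
def banner_fields(story):
    """Extract (release, serial) from the first bytes of a story file."""
    # Bytes 2 and 3 form the release number (assumed high byte first).
    release = int.from_bytes(story[2:4], "big")
    # Bytes 18 to 23 hold the serial string as ASCII characters.
    serial = story[18:24].decode("ascii")
    return release, serial


# A made-up header, just long enough to hold both fields.
header = bytearray(24)
header[2:4] = (1).to_bytes(2, "big")
header[18:24] = b"930516"

print(banner_fields(bytes(header)))
```

With a real story file one would pass in the first 24 bytes read from disc
instead of the made-up header.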

Note also that Message had to be a separate routine since we needed a local
variable, and Main is not permitted to have local variables of its own.

(The above source has changed a little since the first release.  Some
interpreters (not the InfoTaskForce one) display the status line even when
not asked to do so, and if the object is not included they get into a
quandary printing a location, time and score.  So for their benefit, here
are all three.)

On my machine (an Acorn Archimedes A5000), compiling with statistics
produces something like:

*inform -s hellow
Archimedes Inform 0.6 (v613)
   1 objects (maximum 255)        0 dictionary entries (maximum 800)
   0 attributes (maximum 32)      2 properties (maximum 30)
   0 adjectives (maximum 240)     0 verbs (maximum 250)
   0 actions (maximum 110)       1K long (maximum 128K)
   3 globals (maximum 240)      480 variable space (maximum 2048)
  28 symbols (maximum 3000)       2 routines
 314 characters of text (compressed to 276)

Offsets in story file:
0042 Synonyms     0102 Defaults     0140 Objects    0149 Properties
0155 Variables    0335 Parse table  0335 Actions    0335 Preactions
0337 Adjectives   0337 Dictionary   033e Code       0480 Strings

Completed in 1 seconds.

(This second being mostly consumed in printing out the statistics.  In
practice the compilation time is roughly proportional to the output
length, and typically takes 1.5-2 seconds per K of story file on my
machine.)

Note: if you try compiling the example games, and get different
statistics outputs, do not worry; it probably means you're using a later
version of Inform than the one that printed the above.  Similarly,
do not worry if your compiled story file shows differences with the
object code in the archive.

If the "Deja Vu" example (see below) compiles without Inform errors and
plays at all properly, then Inform is probably working OK.


---------------------------------------------------------------------------
C2. "Deja Vu": a toy game
---------------------------------------------------------------------------


The "Hello Cruel World" program above is enticingly short and easy, but only
because it doesn't contain a parser.  A fully functioning parser is hard
work to write, and occupies a good deal of "Inform" code.  Besides this, the
everyday mechanics of an adventure game involve more coding than most
designers want to go into, at least at the outset.

"Deja Vu" is a toy game in terms of how large it is to play, but contains a
fairly good parser and a kernel of routines which is easily adapted to any
ordinary adventure game.  The source should be available from where this
file is.  It can freely be used or adapted by anybody who would like to.

For the most part it should be self-explanatory with a little experiment. 
Roughly it works like this: the main loop calls the parser.  This returns
an actor (person to carry out the action, usually the player), the action
number and one or two argument objects (one of which may in fact be a list
of argument objects).  Next, the address of the action routine is looked up. 
For each object the verb is to be applied to, the room containing the actor
is asked via its "preroutine" whether it objects; then the object in question
is similarly asked; then the action takes place, via an indirect jump; then
the object and room are told that it has taken place, via their
"postroutines", and given the opportunity to give their own output instead
of the usual one.  (Thus, "Taken" can become "You drop the snake in horror. 
It is still alive!" or some such.)  Finally, the Time routine is called, to
run timed and random events.
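
The shape of that loop can be sketched in a few lines.  The following is
Python, not the "Deja Vu" source, and every name in it (Thing, run_turn,
action_table, snake_post) is invented for illustration.  It also simplifies
matters by having the action routine hand back its usual message rather
than print it, and it leaves out the parser and the actor.

```python
class Thing:
    """A room or object; `pre` and `post` are optional hook functions."""

    def __init__(self, name, pre=None, post=None):
        self.name = name
        self.pre = pre      # pre(action, obj) -> objection text, or None
        self.post = post    # post(action, obj) -> replacement text, or None


def run_turn(room, action, objects, action_table):
    """Apply one parsed command and return the lines of output."""
    output = []
    routine = action_table[action]          # look up the action routine
    for obj in objects:
        # The room is asked first, then the object, whether it objects.
        objection = None
        for thing in (room, obj):
            if thing.pre is not None:
                objection = thing.pre(action, obj)
                if objection:
                    break
        if objection:
            output.append(objection)
            continue
        text = routine(obj)                 # the action takes place
        # The object and room may substitute their own output.
        for thing in (obj, room):
            if thing.post is not None:
                override = thing.post(action, obj)
                if override:
                    text = override
                    break
        output.append(text)
    output.append("(Time routine runs)")    # stand-in for timed events
    return output


def snake_post(action, obj):
    if action == "take":
        return "You drop the snake in horror.  It is still alive!"
    return None


hillside = Thing("hillside")
lamp = Thing("lamp")
snake = Thing("snake", post=snake_post)
action_table = {"take": lambda obj: "Taken."}

result = run_turn(hillside, "take", [lamp, snake], action_table)
for line in result:
    print(line)
```

Here the lamp gets the ordinary "Taken." while the snake's postroutine
substitutes its own reply.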

For the sake of contrast, compiling "Deja Vu" with statistics should give
something like:

*inform -s dejavu
Archimedes Inform 0.6 (v613)
  43 objects (maximum 255)      253 dictionary entries (maximum 800)
  28 attributes (maximum 32)     24 properties (maximum 30)
  16 adjectives (maximum 240)    71 verbs (maximum 250)
  78 actions (maximum 110)      21K long (maximum 128K)
  43 globals (maximum 240)     1044 variable space (maximum 2048)
1116 symbols (maximum 3000)     127 routines
16363 characters of text (compressed to 11982)

Offsets in story file:
0042 Synonyms     0102 Defaults     0140 Objects    02c3 Properties
0636 Variables    0a4a Parse table  0ef7 Actions    0f93 Preactions
1031 Adjectives   1071 Dictionary   1764 Code       4c5a Strings

Completed in 36 seconds.


---------------------------------------------------------------------------
    Graham Nelson,            gan10@uk.ac.cam.phx  (preferably)
    Magdalen College,         nelson@uk.ac.ox.vax  (my Mr Hyde)
    Oxford OX1 4AU,
    UK.                       18th April 1993
                              and 16th May 1993
---------------------------------------------------------------------------