PICKLE specification version 1
("Platform-Independent Composite Kumquat Lickable Format")

Written by Andrew Plotkin
Updated 12/25/95 (merry greed-day!)


Introduction

PICKLE is a way to encapsulate many resources of different types into a
single file, which can easily be transferred between systems. The
archetypical use is an enhanced text adventure, containing a program (in
a platform-independent format such as Z-code) and a set of graphics and
sound resources (in platform-independent formats such as GIF and AIFF.)

(Smart-alecks may point out that this is what the MIME (Multipurpose
Internet Mail Extensions) standard is for. I know. I feel that while
MIME is simple to parse, it's not *trivial*, and it's a little too
powerful for this use. It also worries about encoding as well as format,
which I'd rather not.)


What's the point?

This format was inspired by the following notation in the Z-machine
specification: "The interpreter is simply expected [by the story file]
to know where to find them [the sounds or graphics.]" The de facto
standard set by Infocom is to put the resources in a separate directory,
one per file.

I think this is terrible. It places the burden of keeping all the game
resources together on the user, instead of on the game author. The
convenience of downloading a single story file and knowing that it will
work has, I think, been taken for granted. PICKLE is an attempt to
extend that convenience to games which include more resources than just
a story file.

(Note that this is very close to the idea of stand-alone games. PICKLE
solves a potential problem here as well: how do you create a stand-alone
game from an arbitrary number of resource files? How does the player
determine what to include? Making a stand-alone game from a PICKLE file
is as easy as making one from a single story file. And PICKLE provides a
model, if not an actual format, for assembling many resources into one
package.)

Although I was thinking of the Z-machine when I invented this stuff, it
can equally well be used by any interpreter system that uses the concept
of "the story file requests some data, and the interpreter is expected
to know where to find it." If the story file has more low-level control
(such as access to direct file manipulation) then PICKLE is of less use.
But then, I hope that future interpreter designers will take the
existence of PICKLE into account.


What a PICKLE File Is

Conceptually, a PICKLE file is a collection of chunks. Each chunk has a
*use*, a *number*, and a *format*. A chunk's use might be executable,
text, image, sound, animation, and so on. A chunk whose use is image
might have a format of GIF, JPEG, ASCII-art, and so on. The number of a
chunk is just a label, used to request it.

The idea is that the file can provide several alternative formats for a
given resource, by containing chunks with the same use and number, but
different formats. All such chunks should have the same content.

There are only two operations defined on a PICKLE file: you can request
a chunk, or check for the existence of a usable chunk.

When you wish to request a chunk, you request it by use and number. For
example, you might request image number 4. The interpreter (or whatever
is parsing the PICKLE file) would look through for a chunk whose use is
image and whose number is 4. There may be no such chunk, or one, or
several:

  * If there is none, the interpreter should fail gracefully, displaying
    an error message, and go on with what it's doing (if possible.) Do
    not assume that "nothing" will happen. The error message may be
    obtrusive or cause the interpreter to exit.
  * If there is one such chunk, the interpreter may or may not
    understand the format of the chunk. If it does, it will give it to
    you, to do with as you please. If not, it will again fail gracefully
    and display an error.
  * If there are several chunks which have the given use and number, the
    interpreter may pick *any one of them*. The interpreter will of
    course try to pick a chunk whose format it understands, and to pick
    one which will be the nicest (a GIF image is nicer than an ASCII-art
    one, for example.) However, if it has more than one nice option,
    there is no guarantee as to which it will pick. Whichever chunk it
    picks, it will return it or fail with an error, as above.

When you wish to check for a chunk, you again check by use and number.
The interpreter then tells you whether there is a chunk with that use
and number, and a format the interpreter understands. This means that if
the interpreter says "yes", you may request that use and number safely.
If the interpreter says "no", you know that such a request will produce
an error.

There is no restriction as to what chunks may be in a PICKLE file. (In
particular, numbered chunks of the same use need not be sequential.) It
is even legal to have more than one chunk with the same use, number, and
format. There is no point to it, however, unless you like leaving
decisions to the whim of the interpreter.

Who is "you" in this discussion? It is, of course, possible to write a
simple interpreter which accepts command-line input, requests chunks,
and displays the results (or plays them, or whatever.) However, the
default action of a user-level PICKLE interpreter should be to *request
executable chunk number zero, and execute it.* This executable chunk may
then request further chunks in due course.

Note that there may be more than one executable chunk zero.
Theoretically, you could create a PICKLE file which contains both a TADS
and a Z-code program, with the interpreter running whichever it knows
how to run. (In practice, if I ever see this done with a full-length
game, I will run gibbering for the hills.)

A simple example, in case the above verbosity wasn't *quite* enough for
you.

The Z-machine version 6 understands an opcode @draw_picture(num,y,x)
which means to draw the picture number *num* at coordinates (*x*,*y*) in
the game window. We put together a PICKLE file containing the following
chunks:

    use         #   (format)     : contents
    -----------------------------------------
    executable  0   (Z-code V6)  : A game file
    image       1   (GIF)        : Picture #1
    image       1   (JPEG)       : Picture #1
    image       2   (GIF)        : Picture #2
    image       2   (JPEG)       : Picture #2
    etc...

The player loads this into his interpreter, which starts executing the
game (since it is executable 0.) When the Z-machine emulator reaches a
@draw_picture(num,y,x) opcode, it requests image *num* from the
interpreter, takes what is returned, and draws it on the screen. Simple,
and will work as long as the interpreter is capable of drawing either
GIFs or JPEGs.

Not all languages have the capacity to check for the existence of
chunks. The Z-machine has an opcode to check how many images it holds,
but although it can play sounds, it has no opcode to check for the
existence of sounds. This is tough noogies on the Z-machine. If someone
deletes all the sound chunks from a game that expects them, the player
will get error messages.

Standards go two ways, so here is a list of various things promised and
assumed by various members of the PICKLE-using community:

The file creator guarantees, and thus the reader library assumes, that
the order of the chunks is not significant. (This is why the reader
library API has no functions to find the first, second, ... nth chunk.
The interpreter doesn't have to care.)

The file creator guarantees, and thus the reader library and game file
assume, that two chunks with the same use and number (but different
formats) have the same content -- the same image or sound or whatever.
(This is why the game file doesn't have to care what formats are
available. It can just request a use and number, and trust that the
interpreter will get the right information out to the player.)
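
The request and check operations described above can be sketched in a
few lines of Python. This is only an illustration under stated
assumptions, not part of the specification: the names Chunk and
ChunkLibrary are hypothetical, and so is the idea of giving the library
a per-use list of formats it understands, nicest first. A real
interpreter is free to choose among alternatives in any way it likes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    use: str      # e.g. "image", "executable"
    number: int   # just a label, used to request the chunk
    fmt: str      # e.g. "GIF", "JPEG"
    data: bytes


class ChunkLibrary:
    """Hypothetical reader library: request and check by use and number."""

    def __init__(self, chunks, understood):
        # understood maps a use to the formats this interpreter can
        # handle, nicest first (an assumption of this sketch).
        self.chunks = list(chunks)
        self.understood = understood

    def _pick(self, use, number):
        formats = self.understood.get(use, [])
        usable = [c for c in self.chunks
                  if c.use == use and c.number == number and c.fmt in formats]
        if not usable:
            return None
        # Chunk order in the file is not significant; only niceness counts.
        return min(usable, key=lambda c: formats.index(c.fmt))

    def check(self, use, number):
        """True if a request for this use and number would succeed."""
        return self._pick(use, number) is not None

    def request(self, use, number):
        """Return a usable chunk, or raise so the caller can show an error."""
        chunk = self._pick(use, number)
        if chunk is None:
            raise LookupError(f"no usable {use} chunk number {number}")
        return chunk
```

With the example file above and an interpreter that understands JPEG
and GIF images (preferring JPEG), request("image", 1) would hand back
the JPEG chunk, and check("image", 3) would say "no" rather than
produce an error.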


The PICKLE Format

After all this discussion of the semantics, the syntax is very simple. A
PICKLE file consists of a header, followed by one chunk descriptor for
each chunk, followed by the chunks themselves.

(A NUM is a 32-bit integer, stored MSB first. A TYPE is a 32-bit value
representing four ASCII characters, stored first character first, in the
manner common on the Macintosh. TYPEs are generally manipulated as long
integers, so every bit of every byte is significant. The characters can
in theory be any 8-bit values. All the TYPEs I suggest in this document
will be made of lower-case letters, but interpreters should *not* take
it on themselves to ignore capitalization differences or skip
unprintable characters.)

Header: (16 bytes)
    TYPE: 'pikl' (that is, hex value 70696b6c)
    NUM : the version number of the PICKLE file (1 for the version
          described in this document)
    NUM : number of chunks in the file
    NUM : length of the file (in bytes)

Descriptor: (24 bytes for each)
    (The descriptors are not in any guaranteed order.)
    TYPE: use of chunk
    NUM : number of chunk
    TYPE: major format of chunk
    NUM : minor format of chunk
    NUM : position of beginning of chunk data (in bytes, from the
          beginning of the PICKLE file)
    NUM : length of chunk (in bytes)

Chunks:
    All the chunk data stuck together. The chunks do not have to be in
    the same order as their descriptors.

Note that the use of a chunk is represented by a 32-bit TYPE. The format
is represented by 64 bits, a TYPE and a NUM. (The meaning of the NUM is
determined by the TYPE; it will usually be a version number.)

Here are some suggested uses and formats. I have no suggestion for most
of the version numbers, because in most cases I don't know much about
the format or the version numbers it's had. If there's a question mark,
you should do some research and start some net discussion before you
make any PICKLE files containing that type. If a format has never had
different versions, just use 0 in the minor format field.

Use
    Major format
---------------
'exec' : Executable chunk.
    'zcod' 1-8 : Z-code file, with the version (1 through 8 these days)
                 stored in the minor format field.
    'tads' ?   : TADS game file
    'hugo' ?   : HUGO game file
'text' : Text. (No idea why anyone would want one, but we might as well
         define it.)
    'text' 0   : ASCII text, with newline (ctrl-J, '\012') characters
                 delineating the ends of paragraphs (not lines).
                 Characters with the high bit set are legal, and should
                 be interpreted via ISO 8859 Latin-1.
'pict' : Two-D image.
    'giff' ?   : GIF (I made up the other 'f'.) Versions are 87 and 89,
                 maybe?
    'jpeg' ?   : JPEG
    'text' 0   : ASCII text, as described above, which the interpreter
                 will display in a fixed-width font. Newlines (ctrl-J,
                 '\012') can go at the end of each line, as is usual for
                 ASCII graphics.
'audi' : Sound.
    'aiff' ?   : AIFF
    'idat' 0   : Infocom DAT sound format, as described in Stefan
                 Jokisch's sound format article. The data should be
                 unsigned; that is, values range from 0 to 255, with 128
                 in the middle.
    'imid' 0   : Infocom MID sound format, as described in the same
                 article.
    'midi' ?   : General MIDI file
    'text' 0   : ASCII text, as described above. (Which might contain
                 the message "You hear a horrid scream!".)
'anim' : Two-D animation (possibly with built-in sound).
    'mpeg' ?   : MPEG
    'qktm' 0   : QuickTime (flattened)

Note that the format 'text' can be used in several ways. This is a
simple way to ensure, say, that sounds will be "audible" even on a
machine with no sound capacity. (The interpreter might display the
'text' contents in a separate window.) However, you are certainly under
no obligation to provide a 'text' equivalent for every sound and
picture. Your language may have much better mechanisms to detect and
work around the capacities of the player's machine.

A few other formats may also be flexible enough to work under several
different uses. (One might have an 'audi' 'qktm', which would be a
QuickTime movie with only audio tracks.)
However, it is not really required that a given format TYPE mean the
same thing in different uses. It is merely convenient.

Finally, note that it is very simple to outfit an existing text game
interpreter (Z-code, TADS, etc) to accept PICKLE-packed game files as
well as normal game files. If the first four bytes of the file are
'pikl', you know you have a PICKLE file; you then scan through the chunk
descriptors looking for use 'exec', number 0, format 'zcod' (or
whatever.) If you find one, you pull out its offset and length, read it
from the file, and go on as usual. If not, you display an error.
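
As a rough illustration of how little work that scan is, here is a
minimal Python sketch of the header and descriptor layout given in "The
PICKLE Format". The function names (read_descriptors, find_chunk,
build_pickle) are made up for this example, and it assumes that NUM
fields are unsigned, which the format description does not actually
say. It also ignores the file-length field when reading.

```python
import struct

# Big-endian ("MSB first") layouts; 4s is a four-character TYPE.
HEADER = struct.Struct(">4sIII")         # 'pikl', version, chunk count, file length
DESCRIPTOR = struct.Struct(">4sI4sIII")  # use, number, major, minor, offset, length


def read_descriptors(data):
    """Return the list of descriptor tuples from a PICKLE file image."""
    magic, version, count, _file_length = HEADER.unpack_from(data, 0)
    if magic != b"pikl":
        raise ValueError("not a PICKLE file")
    if version != 1:
        raise ValueError("unknown PICKLE version %d" % version)
    return [DESCRIPTOR.unpack_from(data, HEADER.size + i * DESCRIPTOR.size)
            for i in range(count)]


def find_chunk(data, use, number, major):
    """Return (minor format, chunk bytes) for the first match, else None."""
    for d_use, d_number, d_major, d_minor, offset, length in read_descriptors(data):
        # TYPEs are compared byte for byte: no case folding.
        if (d_use, d_number, d_major) == (use, number, major):
            return d_minor, data[offset:offset + length]
    return None


def build_pickle(chunks):
    """Pack (use, number, major, minor, payload) tuples into a PICKLE file."""
    offset = HEADER.size + DESCRIPTOR.size * len(chunks)
    descriptors, payloads = [], []
    for use, number, major, minor, payload in chunks:
        descriptors.append(
            DESCRIPTOR.pack(use, number, major, minor, offset, len(payload)))
        payloads.append(payload)
        offset += len(payload)
    # After the loop, offset equals the total length of the file.
    header = HEADER.pack(b"pikl", 1, len(chunks), offset)
    return header + b"".join(descriptors) + b"".join(payloads)
```

An interpreter outfitted this way would call something like
find_chunk(data, b"exec", 0, b"zcod") on a file that begins with
'pikl', hand the returned bytes to its usual story-file loader, and
display an error if the result is None.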