BlockValues

BlockValues.i6t

Overview.

Each I7 value is represented at run-time by an I6 word: on the Z-machine, a 16-bit number, and on Glulx, a 32-bit number. The correspondence between these numbers and the original values depends on the kind of value: "number" comes out as a signed two's-complement number, but "time" as an integer number of minutes since midnight, "rulebook" as the index of the rulebook in order of creation, and so on.
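As a rough illustration of this encoding, the Python sketch below packs a signed number into a 16-bit or 32-bit two's-complement word, and a time of day into minutes since midnight. The helper names to_word, from_word and time_to_word are invented for the example and are not part of the template.

```python
def to_word(n, bits):
    """Encode signed integer n as an unsigned two's-complement word."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= n <= hi:
        raise ValueError(f"{n} does not fit in {bits} bits")
    return n & ((1 << bits) - 1)

def from_word(w, bits):
    """Decode an unsigned word back into the signed integer it stands for."""
    sign_bit = 1 << (bits - 1)
    return (w ^ sign_bit) - sign_bit

def time_to_word(hours, minutes):
    """A "time" is stored as minutes since midnight."""
    return hours * 60 + minutes

z_word = to_word(-1, 16)       # Z-machine: 16-bit word
g_word = to_word(-1, 32)       # Glulx: 32-bit word
half_past_nine = time_to_word(9, 30)
```

The same I7 value thus yields different bit patterns on the two virtual machines, which is why the template so often branches on WORDSIZE.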

Even if a 32-bit number is available, this is not enough to represent the full range of values we might want: consider all the possible hundred-word essays of text, for instance. So for a whole range of kinds – "text", "list of K", "stored action" and so on – the I6 value at run-time is only a pointer to what is called a "short block". This is typically only a few words long, and often only a single word: hence the term "short". It has no header or other overhead, and its contents depend on the kind of value.

If we know that a given kind of value can be stored in, say, exactly 128 bits, then it's possible simply to store the whole thing in the short block. More often, though, the data needs to be flexible in size, or needs to be large. In that case, the short block will include (and sometimes, will consist only of) a pointer to data stored in a "long block". Unlike the short block, the long block is a chunk of memory stored using the Flex system, and thus is genuinely a "block" in the sense of the Flex documentation.

It's possible to have several different short blocks each pointing to the same long block of underlying data: for example, the result of the I7 code

let L1 be { 2, 3, 5, 7, 11 };
let L2 be L1;

is to create L1 and L2 as pointers to two different short blocks, but the two SBs each point to the same long block, which contains the data for the list 2, 3, 5, 7, 11. Note that this makes it very fast to copy L1's contents into L2, because only L2's short block needs to change.

The rules for customers who want to deal with values like this are much like the rules for allocating memory with Flex. Calling BlkValueCreate creates a new value, and every value so created must later be disposed of, exactly once, using BlkValueFree.

So if the short blocks of L1 and L2 both point to the same long block of actual data, what happens when only one of them is freed? The answer is that every long block has a reference count attached, which counts the number of short blocks pointing to it. In our example, this count is 2. If list L1 is freed, the long block's reference count is decremented to 1, but it remains in memory, and only L1's short block is given up; when list L2 is subsequently freed, both its short block and the now unwanted long block are given up.
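The counting described here can be modelled in a few lines of Python. This is only a toy: the classes LongBlock and ShortBlock and the function free_value are invented for the illustration, and setting a "freed" flag stands in for returning memory to Flex.

```python
class LongBlock:
    """Shared data plus a count of the short blocks pointing at it."""
    def __init__(self, data):
        self.data = list(data)
        self.refs = 0
        self.freed = False

class ShortBlock:
    """A short block that points at a long block and is counted by it."""
    def __init__(self, long_block):
        self.long_block = long_block
        long_block.refs += 1

def free_value(sb):
    """Give up a short block; release the long block only if nobody else uses it."""
    lb = sb.long_block
    sb.long_block = None
    lb.refs -= 1
    if lb.refs == 0:
        lb.freed = True   # stands in for handing the memory back to Flex

l1 = ShortBlock(LongBlock([2, 3, 5, 7, 11]))
l2 = ShortBlock(l1.long_block)   # "let L2 be L1": share the data, don't duplicate it
shared = l1.long_block
free_value(l1)                   # count drops from 2 to 1, data survives
free_value(l2)                   # count reaches 0, data is released
```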

The harder case to handle is what happens when L1 and L2 share a long block containing 2, 3, 5, 7, 11, but when the source text asks to "add 13 to L1". If we simply changed the long block, that would affect L2 as well. So we must first make L1 "mutable". This means copying the long block to make a new unique copy with reference count 1; assigning that to L1 in place of the original; and decrementing the reference count of the original from 2 to 1. L1 and L2 now point to two different long blocks, so it's safe to modify L1's.
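A minimal Python sketch of this copy-on-write step follows. The names LongBlock, ShortBlock, make_mutable and add_entry are made up for the example; the real work is done by the template's BlkMakeMutable and the KOVS for the kind.

```python
class LongBlock:
    def __init__(self, data, refs=1):
        self.data = list(data)
        self.refs = refs

class ShortBlock:
    def __init__(self, long_block):
        self.long_block = long_block

def make_mutable(sb):
    """Ensure sb owns its long block outright (reference count 1)."""
    old = sb.long_block
    if old.refs == 1:
        return                            # already the sole owner
    sb.long_block = LongBlock(old.data)   # private copy, count 1
    old.refs -= 1                         # the original loses one owner

def add_entry(sb, value):
    make_mutable(sb)                      # must happen before the data changes
    sb.long_block.data.append(value)

shared = LongBlock([2, 3, 5, 7, 11], refs=2)
l1, l2 = ShortBlock(shared), ShortBlock(shared)
add_entry(l1, 13)                         # "add 13 to L1"
```

After the call, l1 holds 2, 3, 5, 7, 11, 13 in a long block of its own, while l2 still sees the original five entries.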

Subtle and beautiful bugs can occur as a result of making a value mutable at the wrong moment. Beware in particular of reading data out of a long block, then writing it back again, because the act of writing may force the value owning the long block to become mutable; this will make a new copy of the data; but you will be left holding the old copy. Since these are functionally identical, you may not even notice, but calamities will occur later because the version of the value you're holding really belongs to somebody else and may be freed at any point.
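The trap is easy to reproduce in a toy model. In the Python sketch below (all names invented for the illustration), write_entry plays the part of a write which forces mutability, and the variable stale is the long block we looked up before writing.

```python
class LongBlock:
    def __init__(self, data, refs=1):
        self.data = list(data)
        self.refs = refs

class ShortBlock:
    def __init__(self, long_block):
        self.long_block = long_block

def write_entry(sb, pos, val):
    """Write through the value: this forces the value to become mutable first."""
    lb = sb.long_block
    if lb.refs > 1:
        lb.refs -= 1
        sb.long_block = LongBlock(lb.data)   # sb silently moves to a private copy
    sb.long_block.data[pos] = val

shared = LongBlock([2, 3, 5], refs=2)
l1, l2 = ShortBlock(shared), ShortBlock(shared)

stale = l1.long_block      # read: remember where the data lives
write_entry(l1, 0, 99)     # write: l1 now owns a different long block
fresh = l1.long_block      # the remedy: look the long block up again after writing
```

Here stale still holds 2, 3, 5 and now belongs to l2 alone, so anything further done through it affects the wrong value.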

Finally, note that the I7 compiler also creates block values representing constants. For example, the source text

let L1 be { 2, 3, 5, 7, 11 };

causes a block value representing this list to be stored in memory. The long block for a constant needs to be immortal, since this memory must never be freed: it's therefore given a reference count of "infinity".
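One way to picture an "infinite" reference count is as a marker which counting operations refuse to touch. The Python sketch below assumes exactly that; how the template really encodes infinity is not shown in this section, so the IMMORTAL marker and the change_refs helper are purely hypothetical.

```python
IMMORTAL = "infinity"   # hypothetical marker, not the template's actual encoding

class LongBlock:
    def __init__(self, data, refs=1):
        self.data = list(data)
        self.refs = refs
        self.freed = False

def change_refs(lb, delta):
    """Adjust a reference count, leaving immortal blocks untouched."""
    if lb.refs == IMMORTAL:
        return lb.refs            # constants are never counted down to zero
    lb.refs += delta
    if lb.refs == 0:
        lb.freed = True           # stands in for releasing the memory
    return lb.refs

constant = LongBlock([2, 3, 5, 7, 11], refs=IMMORTAL)
for _ in range(3):
    change_refs(constant, -1)     # however often it is "freed", it stays put
```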

[ BlkValueRead from pos do_not_indirect
    long_block chunk_size_in_bytes header_size_in_bytes flags entry_size_in_bytes seek_byte_position;
    if (from == 0) rfalse;
    if (do_not_indirect)
        long_block = from;
    else
        long_block = BlkValueGetLongBlock(from);

    flags = long_block->BLK_HEADER_FLAGS;
    entry_size_in_bytes = 1;
    if (flags & BLK_FLAG_16_BIT) entry_size_in_bytes = 2;
    else if (flags & BLK_FLAG_WORD) entry_size_in_bytes = WORDSIZE;

    if (flags & BLK_FLAG_MULTIPLE) header_size_in_bytes = BLK_DATA_MULTI_OFFSET;
    else header_size_in_bytes = BLK_DATA_OFFSET;

    seek_byte_position = pos*entry_size_in_bytes;
    for (: long_block~=NULL: long_block=long_block-->BLK_NEXT) {
        chunk_size_in_bytes = FlexSize(long_block) - header_size_in_bytes;
        if ((seek_byte_position >= 0) && (seek_byte_position<chunk_size_in_bytes)) {
            long_block = long_block + header_size_in_bytes + seek_byte_position;
            switch(entry_size_in_bytes) {
                1: return long_block->0;
                2: #Iftrue (WORDSIZE == 2); return long_block-->0;
                   #ifnot; return (long_block->0)*256 + (long_block->1);
                   #endif;
                4: return long_block-->0;
            }
        }
        seek_byte_position = seek_byte_position - chunk_size_in_bytes;
    }
    "*** BlkValueRead: reading from index out of range: ", pos, " in ", from, " ***";
];

[ BlkValueWrite to pos val do_not_indirect
    long_block chunk_size_in_bytes header_size_in_bytes flags entry_size_in_bytes seek_byte_position;
    if (to == 0) rfalse;
    if (do_not_indirect)
        long_block = to;
    else {
        BlkMakeMutable(to);
        long_block = BlkValueGetLongBlock(to);
    }

    flags = long_block->BLK_HEADER_FLAGS;
    entry_size_in_bytes = 1;
    if (flags & BLK_FLAG_16_BIT) entry_size_in_bytes = 2;
    else if (flags & BLK_FLAG_WORD) entry_size_in_bytes = WORDSIZE;

    if (flags & BLK_FLAG_MULTIPLE) header_size_in_bytes = BLK_DATA_MULTI_OFFSET;
    else header_size_in_bytes = BLK_DATA_OFFSET;

    seek_byte_position = pos*entry_size_in_bytes;
    for (:long_block~=NULL:long_block=long_block-->BLK_NEXT) {
        chunk_size_in_bytes = FlexSize(long_block) - header_size_in_bytes;
        if ((seek_byte_position >= 0) && (seek_byte_position<chunk_size_in_bytes)) {
            long_block = long_block + header_size_in_bytes + seek_byte_position;
            switch(entry_size_in_bytes) {
                1: long_block->0 = val;
                2: #Iftrue (WORDSIZE == 2); long_block-->0 = val;
                   #ifnot; long_block->0 = (val/256)%256; long_block->1 = val%256;
                   #endif;
                4: long_block-->0 = val;
            }
            return;
        }
        seek_byte_position = seek_byte_position - chunk_size_in_bytes;
    }
    "*** BlkValueWrite: writing to index out of range: ", pos, " in ", to, " ***";
];
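The index arithmetic in BlkValueRead and BlkValueWrite may be easier to follow in a stripped-down model. The Python sketch below represents a long block as a list of bytearrays, one per chunk, with the headers left out; the function names locate, blk_read and blk_write are invented, and entries are assumed never to straddle a chunk boundary. Multi-byte entries are stored high byte first, as in the two-byte branch of the routines above.

```python
def locate(chunks, pos, entry_size):
    """Return (chunk, byte offset) for entry number pos, walking the chain."""
    seek = pos * entry_size
    for chunk in chunks:
        if 0 <= seek < len(chunk):
            return chunk, seek
        seek -= len(chunk)            # skip this chunk's data, try the next
    raise IndexError(f"index out of range: {pos}")

def blk_read(chunks, pos, entry_size):
    chunk, off = locate(chunks, pos, entry_size)
    return int.from_bytes(chunk[off:off + entry_size], "big")

def blk_write(chunks, pos, entry_size, val):
    chunk, off = locate(chunks, pos, entry_size)
    val %= 1 << (8 * entry_size)      # keep only the low-order bytes
    chunk[off:off + entry_size] = val.to_bytes(entry_size, "big")

chunks = [bytearray(4), bytearray(4)]   # two chunks, four 2-byte entries in all
blk_write(chunks, 2, 2, 0x1234)         # entry 2 is the first entry of chunk 1
value = blk_read(chunks, 2, 2)
```

Note that a negative pos only ever becomes more negative as chunks are skipped, so it falls through to the error just as an overlarge pos does.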

[ BlkValueSeekZeroEntry from
    long_block chunk_size_in_bytes header_size_in_bytes flags entry_size_in_bytes
    byte_position addr from_addr to_addr;
    if (from == 0) return -1;
    long_block = BlkValueGetLongBlock(from);

    flags = long_block->BLK_HEADER_FLAGS;
    entry_size_in_bytes = 1;
    if (flags & BLK_FLAG_16_BIT) entry_size_in_bytes = 2;
    else if (flags & BLK_FLAG_WORD) entry_size_in_bytes = WORDSIZE;

    if (flags & BLK_FLAG_MULTIPLE) header_size_in_bytes = BLK_DATA_MULTI_OFFSET;
    else header_size_in_bytes = BLK_DATA_OFFSET;

    byte_position = 0;
    for (: long_block~=NULL: long_block=long_block-->BLK_NEXT) {
        chunk_size_in_bytes = FlexSize(long_block) - header_size_in_bytes;
        from_addr = long_block + header_size_in_bytes;
        to_addr = from_addr + chunk_size_in_bytes;
        switch(entry_size_in_bytes) {
            1:
                for (addr = from_addr: addr < to_addr: addr++)
                    if (addr->0 == 0)
                        return byte_position + addr - from_addr;
            2:
                #iftrue (WORDSIZE == 2);
                for (addr = from_addr: addr < to_addr: addr=addr+2)
                    if (addr-->0 == 0)
                        return (byte_position + addr - from_addr)/2;
                #ifnot;
                for (addr = from_addr: addr < to_addr: addr=addr+2)
                    if ((addr->0 == 0) && (addr->1 == 0))
                        return (byte_position + addr - from_addr)/2;
                #endif;
            4:
                for (addr = from_addr: addr < to_addr: addr=addr+4)
                    if (addr-->0 == 0)
                        return (byte_position + addr - from_addr)/4;
        }
        byte_position = byte_position + chunk_size_in_bytes;
    }
    return -1;
];
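In outline, BlkValueSeekZeroEntry scans the chunks in order and converts the byte position of the first all-zero entry into an entry index. A Python model of that search, with chunks as bytearrays and headers omitted (the function name seek_zero_entry is invented for the example), might read:

```python
def seek_zero_entry(chunks, entry_size):
    """Index of the first entry whose bytes are all zero, or -1 if there is none."""
    byte_position = 0                       # bytes of data in the chunks already scanned
    for chunk in chunks:
        for off in range(0, len(chunk), entry_size):
            if not any(chunk[off:off + entry_size]):
                return (byte_position + off) // entry_size
        byte_position += len(chunk)
    return -1

text_like = [bytearray(b"ab"), bytearray(b"c\x00x")]   # 1-byte entries
first_zero = seek_zero_entry(text_like, 1)
```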

[ BlkValueMassCopyEntries to_bv from_bv no_entries_to_copy
    from_long_block from_addr from_bytes_left from_header_size_in_bytes
    to_long_block to_addr to_bytes_left to_header_size_in_bytes
    bytes_to_copy flags entry_size_in_bytes min;

    BlkMakeMutable(to_bv);

    from_long_block = BlkValueGetLongBlock(from_bv);
    to_long_block = BlkValueGetLongBlock(to_bv);

    flags = from_long_block->BLK_HEADER_FLAGS;
    entry_size_in_bytes = 1;
    if (flags & BLK_FLAG_16_BIT) entry_size_in_bytes = 2;
    else if (flags & BLK_FLAG_WORD) entry_size_in_bytes = WORDSIZE;

    if ((flags & (BLK_FLAG_MULTIPLE + BLK_FLAG_TRUNCMULT)) &&
        (BlkValueSetLBCapacity(to_bv, no_entries_to_copy) == false))
        BlkValueError("copy resizing failed");

    if (flags & BLK_FLAG_MULTIPLE) from_header_size_in_bytes = BLK_DATA_MULTI_OFFSET;
    else from_header_size_in_bytes = BLK_DATA_OFFSET;
    flags = to_long_block->BLK_HEADER_FLAGS;
    if (flags & BLK_FLAG_MULTIPLE) to_header_size_in_bytes = BLK_DATA_MULTI_OFFSET;
    else to_header_size_in_bytes = BLK_DATA_OFFSET;

    from_addr = from_long_block + from_header_size_in_bytes;
    from_bytes_left = FlexSize(from_long_block) - from_header_size_in_bytes;
    to_addr = to_long_block + to_header_size_in_bytes;
    to_bytes_left = FlexSize(to_long_block) - to_header_size_in_bytes;

    bytes_to_copy = entry_size_in_bytes*no_entries_to_copy;
    while (true) {
        if (from_bytes_left == 0) {
            from_long_block = from_long_block-->BLK_NEXT;
            if (from_long_block == 0) BlkValueError("copy destination exhausted");
            from_addr = from_long_block + from_header_size_in_bytes;
            from_bytes_left = FlexSize(from_long_block) - from_header_size_in_bytes;
        } else if (to_bytes_left == 0) {
            to_long_block = to_long_block-->BLK_NEXT;
            if (to_long_block == 0) BlkValueError("copy source exhausted");
            to_addr = to_long_block + to_header_size_in_bytes;
            to_bytes_left = FlexSize(to_long_block) - to_header_size_in_bytes;
        } else {
            min = from_bytes_left; if (to_bytes_left < min) min = to_bytes_left;
            if (bytes_to_copy <= min) {
                Memcpy(to_addr, from_addr, bytes_to_copy);
                return;
            }
            Memcpy(to_addr, from_addr, min);
            bytes_to_copy = bytes_to_copy - min;
            from_addr = from_addr + min;
            from_bytes_left = from_bytes_left - min;
            to_addr = to_addr + min;
            to_bytes_left = to_bytes_left - min;
        }
    }
];
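The heart of BlkValueMassCopyEntries is a loop which copies the largest run that fits in both the current source chunk and the current destination chunk, then advances whichever chain has run dry. The Python sketch below models only that loop: chunks are bytearrays without headers, the mutability and capacity steps are omitted, and the name mass_copy is invented for the illustration.

```python
def mass_copy(to_chunks, from_chunks, n_bytes):
    """Copy n_bytes from one chain of chunks into another, in place."""
    ti = fi = 0              # index of the current chunk in each chain
    to_off = from_off = 0    # offset within the current chunk
    while n_bytes > 0:
        if fi == len(from_chunks):
            raise ValueError("source chain exhausted")
        if ti == len(to_chunks):
            raise ValueError("destination chain exhausted")
        from_left = len(from_chunks[fi]) - from_off
        to_left = len(to_chunks[ti]) - to_off
        if from_left == 0:
            fi, from_off = fi + 1, 0          # move to the next source chunk
        elif to_left == 0:
            ti, to_off = ti + 1, 0            # move to the next destination chunk
        else:
            run = min(from_left, to_left, n_bytes)
            to_chunks[ti][to_off:to_off + run] = from_chunks[fi][from_off:from_off + run]
            from_off += run
            to_off += run
            n_bytes -= run

source = [bytearray(b"abc"), bytearray(b"defgh")]
dest = [bytearray(5), bytearray(2), bytearray(4)]
mass_copy(dest, source, 8)
```

The two chains need not be divided into chunks at the same places, which is why the run length is recomputed on every pass.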

[ BlkValueMassCopyFromArray to_bv from_array from_entry_size no_entries_to_copy
    to_long_block to_addr to_entries_left to_header_size to_entry_size
    flags;

    BlkMakeMutable(to_bv);

    to_long_block = BlkValueGetLongBlock(to_bv);

    flags = to_long_block->BLK_HEADER_FLAGS;
    to_entry_size = 1;
    if (flags & BLK_FLAG_16_BIT) to_entry_size = 2;
    else if (flags & BLK_FLAG_WORD) to_entry_size = WORDSIZE;

    if ((flags & (BLK_FLAG_MULTIPLE + BLK_FLAG_TRUNCMULT)) &&
        (BlkValueSetLBCapacity(to_bv, no_entries_to_copy) == false))
        BlkValueError("copy resizing failed");

    if (flags & BLK_FLAG_MULTIPLE) to_header_size = BLK_DATA_MULTI_OFFSET;
    else to_header_size = BLK_DATA_OFFSET;

    to_addr = to_long_block + to_header_size;
    to_entries_left = (FlexSize(to_long_block) - to_header_size)/to_entry_size;

    while (no_entries_to_copy > to_entries_left) {
        Arrcpy(to_addr, to_entry_size, from_array, from_entry_size, to_entries_left);
        no_entries_to_copy = no_entries_to_copy - to_entries_left;
        from_array = from_array + to_entries_left*from_entry_size;
        to_long_block = to_long_block-->BLK_NEXT;
        if (to_long_block == 0) BlkValueError("copy source exhausted");
        to_addr = to_long_block + to_header_size;
        to_entries_left = (FlexSize(to_long_block) - to_header_size)/to_entry_size;
    }
    if (no_entries_to_copy > 0) {
        Arrcpy(to_addr, to_entry_size, from_array, from_entry_size, no_entries_to_copy);
    }
];

KOVS Routines.

Different kinds of value use different data formats for both their short and long blocks, so it follows that each kind needs its own routines to carry out the fundamental operations of creating, destroying, copying and comparing. This is organised at run-time by giving each kind of block value a "KOVS", a "kind of value support" routine. These are named systematically by suffixing _Support: that is, the support function for TEXT_TY is called TEXT_TY_Support and so on.

I7 automatically compiles a function called KOVSupportFunction which returns the KOVS for a given kind. Note that this depends only on the weak kind, not the strong one: so "list of numbers" and "list of texts", for example, share a common KOVS which handles all list support.

The support function can be called with any of the following task constants as its first argument: it then has a further one to three arguments depending on the task in hand.
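The dispatch scheme can be pictured as follows. This Python sketch is only a model: the numeric values of the task constants, the tuple representation of strong kinds, and the table SUPPORT_BY_WEAK_KIND are all invented for the illustration; only the names KOVSupportFunction (rendered here as kov_support_function), CREATE_KOVS, COPY_KOVS and DESTROY_KOVS come from the text.

```python
CREATE_KOVS, DESTROY_KOVS, COPY_KOVS = 1, 2, 3   # made-up values for the model

def list_support(task, *args):
    """Stand-in for a list support routine: one function handles every task."""
    return ("list", task, args)

def text_support(task, *args):
    """Stand-in for TEXT_TY_Support."""
    return ("text", task, args)

SUPPORT_BY_WEAK_KIND = {"list": list_support, "text": text_support}

def weak_kind(strong_kind):
    """Strip the parameters: ('list', 'number') becomes 'list'."""
    return strong_kind[0] if isinstance(strong_kind, tuple) else strong_kind

def kov_support_function(strong_kind):
    """Model of KOVSupportFunction: the answer depends on the weak kind only."""
    return SUPPORT_BY_WEAK_KIND[weak_kind(strong_kind)]

kovs = kov_support_function(("list", "text"))
result = kovs(DESTROY_KOVS, "bv")    # first argument is the task, the rest depend on it
```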

Creation.

To create a block value, call:

BlkValueCreate(K)

where K is its (strong) kind ID. Optionally, call:

BlkValueCreate(K, short_block)

to mandate that the short block needs to be located at the given address outside the heap: but don't do this unless you can guarantee that space of the necessary length will be available there for as long as the lifetime of the value; and please note, it really does matter that this address lies outside the heap, for reasons to be seen below.

These work by delegating to:

kovs(CREATE_KOVS, strong_kind, short_block)

which returns the address of the short block for the new value.

Slow Copy.

Why don't we always make a quick copy, simply pointing one short block at another value's long block? Consider the case where B is a list of rooms, and A is a list of objects. If we give A's short block a pointer to the long block of B, A will suddenly change its kind as well as its contents, because the strong kind of a list is stored inside the long block. So there are a few cases where it's not safe to make a quick copy. In any case, sooner or later you have to duplicate actual data, not just rearrange pointers to it, and here's where.

We first call:

kovs(KINDDATA_KOVS, to_bv)

which asks for an ID for the kinds stored in the BV: for example, for a list of rooms it would return the kind ID for "room". We ask for this because it's information stored in the long block, which is about to be overwritten.

As with the quick copy, we must now make sure any data currently in the destination is properly destroyed. We could do so by making the destination mutable and then destroying its contents, but that would be inefficient, in that it might create a whole lot of temporary copies and then delete them again. So if the long block has a high reference count, we decrement it and then replace the short block (in place) with a fresh one pointing to empty data; we only destroy the contents if the long block has reference count 1.
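The choice between the two ways of clearing the destination can be sketched like this in Python. The names LongBlock, ShortBlock and clear_destination are invented, and the destroy_contents callback stands in for whatever the KOVS would do to dispose of the old data.

```python
class LongBlock:
    def __init__(self, data, refs=1):
        self.data = list(data)
        self.refs = refs

class ShortBlock:
    def __init__(self, long_block):
        self.long_block = long_block

def clear_destination(to_bv, destroy_contents):
    """Get the destination ready to be overwritten without disturbing other owners."""
    lb = to_bv.long_block
    if lb.refs > 1:
        lb.refs -= 1                        # others keep the old data
        to_bv.long_block = LongBlock([])    # fresh, empty long block with count 1
    else:
        destroy_contents(lb)                # sole owner: the old contents really go

destroyed = []
def destroy(lb):
    destroyed.append(list(lb.data))
    lb.data.clear()

shared = LongBlock(["kitchen", "hall"], refs=2)
a = ShortBlock(shared)
clear_destination(a, destroy)               # shared case: nothing is destroyed
```

In the shared case no temporary copy of the old data is ever made, which is the saving the paragraph above describes.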

All of which finally means we can scribble over the destination without spoiling anybody else's day. We resize it to make room for the incoming data; we copy the raw data of the long block; and finally we call:

kovs(COPY_KOVS, to_bv, from_bv, k)

This is where the KOVS should make a proper copy, using BlkValueCopy and thus perhaps recursing, if any of that data contained block values in turn: as for instance it will if we're copying a list of texts. Note that k is the value given us by KINDDATA_KOVS.

Destruction.

We will also need primitives for two different forms of destruction. This is something which should happen whenever a block value is thrown away, not to be used again: either because it's being freed, or because new contents are being copied into it.

The idea of destruction is that any data stored in the long block should safely be disposed of. If the reference count of the long block is 2 or more, there's no problem, because we can simply decrement the count and let other people worry about the data from now on. But if it's only 1, then destroying the data is on us. Since we don't know what's in the long block, we have to ask the KOVS to do this by means of:

kovs(DESTROY_KOVS, bv)

Note that all of this frequently causes recursion: destruction leads to freeing of some of the data, which in turn means that that data must be destroyed, and so on. So it's essential that block values be well-founded: a list must not, for example, contain itself.
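To see the recursion at work, here is a toy Python model in which a list holds texts, each of which is a block value in its own right. The class and function names are invented, and an isinstance test stands in for the DESTROY_KOVS call which, in the template, knows what a long block contains.

```python
class LongBlock:
    def __init__(self, data, refs=1):
        self.data = list(data)
        self.refs = refs
        self.freed = False

class BlockValue:
    def __init__(self, long_block):
        self.long_block = long_block

def free_value(bv):
    """Free a value, destroying its data only if nobody else shares it."""
    lb = bv.long_block
    bv.long_block = None
    if lb.refs > 1:
        lb.refs -= 1                 # other owners will deal with the data
        return
    for entry in lb.data:            # last owner: destroy what the block contains
        if isinstance(entry, BlockValue):
            free_value(entry)        # the recursion
    lb.refs = 0
    lb.freed = True

private_text = LongBlock("hello")
shared_text = LongBlock("world", refs=2)   # also used by some other value
texts = LongBlock([BlockValue(private_text), BlockValue(shared_text)])
free_value(BlockValue(texts))
```

Freeing the list releases the text it owned outright and merely decrements the count of the one it shared.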

Mutability.

A block value is by definition mutable if it has a long block with reference count 1, because then the data in the long block can freely be changed without corrupting other block values.

We offer the KOVS a chance to handle this for us:

kovs(MAKEMUTABLE_KOVS, bv)

should return 0 to say that it has done so, or else return the size of the short block in words to ask us to handle it. We do this by creating a temporary value to make a safe copy into. It would be unnecessarily slow to allocate the short block for this safe copy on the heap and then free it again moments later, so we put the short block on the stack instead, in a temporary one-value stack frame made to hold it.

Serialisation.

Some block values can be written to external files (on Glulx): others cannot. The following routines abstract that.

If ch is -1, then:

kovs(READ_FILE_KOVS, bv, auxf, ch)

returns true or false according to whether it is possible to read data from an auxiliary file auxf into the block value bv. If ch is any other value, then the routine should do exactly that, taking ch to be the first character of the text read from the file which makes up the serialised form of the data.

kovs(WRITE_FILE_KOVS, bv)

is simpler because, strictly speaking, it doesn't write to a file at all: it simply prints a serialised form of the data in bv to the output stream. Since it is called only when that output stream has been redirected to an auxiliary file, and since the serialised form would often be illegible on screen, it seems reasonable to call it a file input-output function just the same. The WRITE_FILE_KOVS should return true or false according to whether it was able to write the data.
