Text

Text.i6t

Block Format.

The short block for a text is two words long: the first word selects which form of storage will be used to represent the content, and the second word is a reference to that content. This reference is an I6 String or Routine in all cases except one, when it's a pointer to a long block containing a null-terminated array of characters, like a C string.

Clearly we need PACKED_TEXT_STORAGE and UNPACKED_TEXT_STORAGE to distinguish between the two basic methods of text storage, roughly equivalent to the pre-2013 kinds "text" and "indexed text". But why do we need four?

CONSTANT_PACKED_TEXT_STORAGE is easy to explain: the BlkValue routines normally detect constants using metadata in their long blocks, but of course that won't work for values which haven't got any long blocks. We use this instead. We don't need a CONSTANT_UNPACKED_TEXT_STORAGE because I7 never compiles constant text in unpacked form.

The surprising one is CONSTANT_PERISHABLE_TEXT_STORAGE. This is a constant created by the I7 compiler which is marked as being tricky because its value is a text substitution containing references to local variables. Unlike other text substitutions, this can't meaningfully be stored away to be expanded later: it must be expanded into unpacked text before it perishes.

    Constant CONSTANT_PACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_CONSTANT + 1;
    Constant CONSTANT_PERISHABLE_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_CONSTANT + 2;
    Constant PACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + 3;
    Constant UNPACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_LONGBLOCK + 4;
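As an illustration only, the four storage forms can be modelled in Python. The numeric values given to the three BLK_BVBITMAP bits here are assumptions made for the sketch (their real definitions are not part of this section), and the helper names has_long_block and is_constant are hypothetical.

```python
# Assumed bit values, chosen only so that the flags do not collide with
# the low-order form numbers 1..4. The real values are defined elsewhere.
BLK_BVBITMAP_LONGBLOCK = 0x10
BLK_BVBITMAP_TEXT = 0x20
BLK_BVBITMAP_CONSTANT = 0x40

CONSTANT_PACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_CONSTANT + 1
CONSTANT_PERISHABLE_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_CONSTANT + 2
PACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + 3
UNPACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_LONGBLOCK + 4


def has_long_block(storage):
    """True if the second word of the short block points to a long block."""
    return (storage & BLK_BVBITMAP_LONGBLOCK) != 0


def is_constant(storage):
    """True if the short block was compiled as a constant."""
    return (storage & BLK_BVBITMAP_CONSTANT) != 0


# A short block is two words: the storage form, then a reference to the content.
short_block = [PACKED_TEXT_STORAGE, "packed string or routine"]
```

Under these assumed values, only UNPACKED_TEXT_STORAGE reports a long block, and only the two CONSTANT forms report as constants, which is the distinction the four forms exist to make.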

Extent Of Long Block.

    [ TEXT_TY_Extent arg1 x;
        x = BlkValueSeekZeroEntry(arg1);
        if (x < 0) return -1; ! should not happen, of course
        return x+1;
    ];

Character Set.

On the Z-machine, we use the 8-bit ZSCII character set, stored in bytes; on Glulx, we use the opening 16-bit subset of Unicode (which though only a subset covers almost all letter forms used on Earth), stored in half-words.

The Z-machine does have very partial Unicode support, but not in a way that can help us here. It is capable of printing a wide range of Unicode characters, and on a good interpreter with a good font (such as Zoom for Mac OS X, using the Lucida Grande font) can produce many thousands of glyphs. But it is not capable of printing those characters into memory rather than the screen, an essential technique for texts: it can only write each character to a single byte, and it does so in ZSCII. That forces our hand when it comes to choosing the indexed-text character set.
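The difference between the two storage widths can be pictured with a small Python model. This is a sketch under stated assumptions: Unicode code points stand in for ZSCII codes in the 8-bit case (the real ZSCII assignments differ for accented letters), and replacing an unrepresentable character with a question mark is a choice made for the sketch, not something this section specifies. The name to_storage is hypothetical.

```python
from array import array


def to_storage(text, sixteen_bit):
    """Return a null-terminated array of character codes.

    sixteen_bit=False models byte storage (Z-machine);
    sixteen_bit=True models half-word storage (Glulx).
    """
    typecode = "H" if sixteen_bit else "B"   # unsigned 16-bit or unsigned 8-bit
    limit = 0xFFFF if sixteen_bit else 0xFF
    codes = array(typecode)
    for ch in text:
        code = ord(ch)
        # Assumption for the sketch: out-of-range characters become '?'.
        codes.append(code if code <= limit else ord("?"))
    codes.append(0)                          # the null terminator
    return codes
```

For example, a Greek alpha (code 0x3B1) survives in the 16-bit form but not in the 8-bit one, and a character beyond the first 65536 code points survives in neither.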

    #IFDEF TARGET_ZCODE;
    Constant TEXT_TY_Storage_Flags = BLK_FLAG_MULTIPLE;
    Constant ZSCII_Tables;
    #IFNOT;
    Constant TEXT_TY_Storage_Flags = BLK_FLAG_MULTIPLE + BLK_FLAG_16_BIT;
    Constant Large_Unicode_Tables;
    #ENDIF;

    {-segment:UnicodeData.i6t}
    {-segment:Char.i6t}

KOV Support.

    [ TEXT_TY_Support task arg1 arg2 arg3;
        switch(task) {
            CREATE_KOVS:      return TEXT_TY_Create(arg2);
            CAST_KOVS:        TEXT_TY_Cast(arg1, arg2, arg3);
            MAKEMUTABLE_KOVS: return TEXT_TY_Mutable(arg1);
            COPYQUICK_KOVS:   rtrue;
            COPYSB_KOVS:      TEXT_TY_CopySB(arg1, arg2);
            KINDDATA_KOVS:    return 0;
            EXTENT_KOVS:      return TEXT_TY_Extent(arg1);
            COMPARE_KOVS:     return TEXT_TY_Compare(arg1, arg2);
            READ_FILE_KOVS:   if (arg3 == -1) rtrue;
                              return TEXT_TY_ReadFile(arg1, arg2, arg3);
            WRITE_FILE_KOVS:  return TEXT_TY_WriteFile(arg1);
            HASH_KOVS:        return TEXT_TY_Hash(arg1);
            DEBUG_KOVS:       TEXT_TY_Debug(arg1);
        }
        ! We choose not to respond to: DESTROY_KOVS, COPYKIND_KOVS, COPY_KOVS
        rfalse;
    ];

Debugging.

    [ TEXT_TY_Debug txt;
        switch (txt-->0) {
            CONSTANT_PACKED_TEXT_STORAGE:     print " = cp~", (PrintI6Text) txt-->1, "~";
            CONSTANT_PERISHABLE_TEXT_STORAGE: print " = cp~", (PrintI6Text) txt-->1, "~";
            PACKED_TEXT_STORAGE:              print " = p~", (PrintI6Text) txt-->1, "~";
            UNPACKED_TEXT_STORAGE:            print " = ~", (TEXT_TY_Say) txt, "~";
            default:                          print " broken?";
        }
    ];

Creation.

    [ TEXT_TY_Create short_block x;
        return BlkValueCreateSB2(short_block, PACKED_TEXT_STORAGE, EMPTY_TEXT_PACKED);
    ];

Copy Short Block.

    [ TEXT_TY_CopySB to_bv from_bv;
        BlkValueCopySB2(to_bv, from_bv);
        if (to_bv-->0 & BLK_BVBITMAP_CONSTANTMASK) to_bv-->0 = PACKED_TEXT_STORAGE;
    ];

Transmutation.

    [ TEXT_TY_Transmute txt;
        TEXT_TY_Temporarily_Transmute(txt);
    ];

    [ TEXT_TY_Temporarily_Transmute txt x;
        if ((txt) && (txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0)) {
            x = txt-->1; ! The old value was a packed string

            txt-->0 = UNPACKED_TEXT_STORAGE;
            txt-->1 = FlexAllocate(32, TEXT_TY, TEXT_TY_Storage_Flags);
            if (x ~= EMPTY_TEXT_PACKED) TEXT_TY_CastPrimitive(txt, false, x);

            return x;
        }
        return 0;
    ];

    [ TEXT_TY_Untransmute txt pk cp x;
        if ((pk) && (txt-->0 == UNPACKED_TEXT_STORAGE)) {
            x = txt-->1; ! The old value was an unpacked string
            FlexFree(x);
            txt-->0 = cp;
            txt-->1 = pk; ! The value earlier returned by TEXT_TY_Temporarily_Transmute
        }
        return txt;
    ];

Mutability.

    [ TEXT_TY_Mutable txt;
        if (txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) {
            TEXT_TY_Transmute(txt);
            return 0;
        }
        return 2; ! Tell BlockValue there's a long block pointer
    ];

Casting.

    [ TEXT_TY_Cast to_txt from_kind from_value;
        if (from_kind == TEXT_TY) {
            BlkValueCopy(to_txt, from_value);
        } else if (from_kind == SNIPPET_TY) {
            TEXT_TY_Transmute(to_txt);
            TEXT_TY_CastPrimitive(to_txt, true, from_value);
        } else BlkValueError("impossible cast to text");
    ];

    [ SNIPPET_TY_to_TEXT_TY to_txt snippet;
        return BlkValueCast(to_txt, SNIPPET_TY, snippet);
    ];

Data Conversion.

We use a single routine to handle two kinds of format translation: a packed I6 string into an unpacked text, or a snippet into an unpacked text.

In each case, what we do is simply to print out the value we have, but with the output stream set to memory rather than the screen. That gives us the character by character version, neatly laid out in an array, and all we have to do is to copy it into the text and add a null termination byte.
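The print-to-memory idea can be sketched in Python, where redirecting standard output into a string buffer plays the part of the virtual machine's memory stream. This is only a model of the technique: the names cast_by_printing and say_greeting are hypothetical, and nothing here reflects how either virtual machine actually implements its streams.

```python
import io
from contextlib import redirect_stdout


def cast_by_printing(print_value):
    """Run a printing routine with output diverted to memory, then
    return the captured characters as a null-terminated list of codes."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):      # output goes to memory, not the screen
        print_value()
    chars = [ord(c) for c in buffer.getvalue()]
    chars.append(0)                    # the null terminator
    return chars


def say_greeting():
    # Stands in for printing a packed string or a snippet.
    print("Hello", end="")
```

Calling cast_by_printing(say_greeting) yields the five character codes of "Hello" followed by a zero, without anything appearing on screen.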

What complicates things is that the two virtual machines handle printing to memory quite differently, and that the original text has unpredictable length. We are going to try printing it into the array TEXT_TY_Buffers, but what if the text is too big? Disastrously, the Z-machine simply writes on in memory, corrupting all subsequent arrays and almost certainly causing the story file to crash soon after. There is nothing we can do to predict or avoid this, or to repair the damage: this is why the Inform documentation warns users to be wary of using text with large strings in the Z-machine, and advises the use of Glulx instead. Glulx does handle overruns safely, and indeed allows us to dynamically allocate memory as necessary so that we can always avoid overruns entirely.

In either case, though, it's useful to have TEXT_TY_BufferSize, the size of the temporary buffer, large enough that it will never be overrun in ordinary use. This is controllable with the use option "maximum indexed text length".
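The safe handling of overruns on Glulx amounts to a retry loop: print into the buffer, see how many characters the stream says were attempted, and if that exceeds the space available, allocate a larger buffer and print again. A minimal Python sketch of that strategy follows. The function names are hypothetical, and the growth rule (double until the attempted length plus a terminator fits) is a simplification chosen for the sketch rather than a transcription of the routine in this section.

```python
def write_to_buffer(text, capacity):
    """Model of a memory stream: keeps at most `capacity` characters,
    but reports how many characters the caller tried to write."""
    stored = [ord(c) for c in text[:capacity]]
    return stored, len(text)


def capture_with_retry(text, initial_size=8):
    """Return (null-terminated character codes, final buffer size)."""
    if initial_size < 1:
        raise ValueError("initial_size must be at least 1")
    size = initial_size
    while True:
        stored, attempted = write_to_buffer(text, size)
        if attempted < size:           # everything fits, with room for the 0
            return stored + [0], size
        while size <= attempted:       # grow by doubling, then print again
            size *= 2
```

A three-character text fits the initial buffer at once, while a twenty-character text forces one retry with a buffer of 32 entries. The Z-machine has no equivalent of the attempted-length report, which is why no such loop is possible there.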

    #ifndef TEXT_TY_BufferSize;
    Constant TEXT_TY_BufferSize = 512;
    #endif;
    Constant TEXT_TY_NoBuffers = 2;

    #ifdef TARGET_ZCODE;
    Array TEXT_TY_Buffers -> TEXT_TY_BufferSize*TEXT_TY_NoBuffers; ! Where characters are bytes
    #ifnot;
    Array TEXT_TY_Buffers --> (TEXT_TY_BufferSize+2)*TEXT_TY_NoBuffers; ! Where characters are words
    #endif;

    Global RawBufferAddress = TEXT_TY_Buffers;
    Global RawBufferSize = TEXT_TY_BufferSize;

    Global TEXT_TY_CastPrimitiveNesting = 0;

Z Version.

    #ifdef TARGET_ZCODE;
    [ TEXT_TY_CastPrimitive to_txt from_snippet from_value len news buffer;
        if (to_txt == 0) BlkValueError("no destination for cast");
        SuspendRTP();
        buffer = RawBufferAddress + TEXT_TY_CastPrimitiveNesting*TEXT_TY_BufferSize;
        TEXT_TY_CastPrimitiveNesting++;
        if (TEXT_TY_CastPrimitiveNesting > TEXT_TY_NoBuffers)
            FlexError("ran out with too many simultaneous text conversions");

        @push say__p; @push say__pc;
        ClearParagraphing(6);
        @output_stream 3 buffer;
        if (from_value) {
            if (from_snippet) print (PrintSnippet) from_value;
            else print (PrintI6Text) from_value;
        }
        @output_stream -3;
        @pull say__pc; @pull say__p;
        ResumeRTP();

        len = buffer-->0;
        if (len > RawBufferSize-1) len = RawBufferSize-1;
        buffer->(len+2) = 0;

        TEXT_TY_CastPrimitiveNesting--;
        BlkValueMassCopyFromArray(to_txt, buffer+2, 1, len+1);
    ];

Glulx Version.

    #ifnot; ! TARGET_ZCODE
    [ TEXT_TY_CastPrimitive to_txt from_snippet from_value
        len i stream saved_stream news buffer buffer_size memory_to_free results;

        if (to_txt == 0) BlkValueError("no destination for cast");

        buffer_size = (TEXT_TY_BufferSize + 2)*WORDSIZE;

        RawBufferSize = TEXT_TY_BufferSize;
        buffer = RawBufferAddress + TEXT_TY_CastPrimitiveNesting*buffer_size;
        TEXT_TY_CastPrimitiveNesting++;
        if (TEXT_TY_CastPrimitiveNesting > TEXT_TY_NoBuffers) {
            buffer = VM_AllocateMemory(buffer_size); memory_to_free = buffer;
            if (buffer == 0)
                FlexError("ran out with too many simultaneous text conversions");
        }

        if (unicode_gestalt_ok) {
            SuspendRTP();
            .RetryWithLargerBuffer;
            saved_stream = glk_stream_get_current();
            stream = glk_stream_open_memory_uni(buffer, RawBufferSize, filemode_Write, 0);
            glk_stream_set_current(stream);

            @push say__p; @push say__pc;
            ClearParagraphing(7);
            if (from_snippet) print (PrintSnippet) from_value;
            else print (PrintI6Text) from_value;
            @pull say__pc; @pull say__p;

            results = buffer + buffer_size - 2*WORDSIZE;
            glk_stream_close(stream, results);
            if (saved_stream) glk_stream_set_current(saved_stream);
            ResumeRTP();

            len = results-->1;
            if (len > RawBufferSize-1) {
                ! Glulx had to truncate text output because the buffer ran out:
                ! len is the number of characters which it tried to print
                news = RawBufferSize;
                while (news < len) news=news*2;
                i = VM_AllocateMemory(news*WORDSIZE);
                if (i ~= 0) {
                    if (memory_to_free) VM_FreeMemory(memory_to_free);
                    memory_to_free = i;
                    buffer = i;
                    RawBufferSize = news;
                    buffer_size = (RawBufferSize + 2)*WORDSIZE;
                    jump RetryWithLargerBuffer;
                }
                ! Memory allocation refused: all we can do is to truncate the text
                len = RawBufferSize-1;
            }
            buffer-->(len) = 0;

            TEXT_TY_CastPrimitiveNesting--;
            BlkValueMassCopyFromArray(to_txt, buffer, 4, len+1);
        } else {
            RunTimeProblem(RTP_NOGLULXUNICODE);
        }
        if (memory_to_free) VM_FreeMemory(memory_to_free);
    ];
    #endif;

Comparison.

    [ TEXT_TY_Compare left_txt right_txt rv;
        @push say__comp;
        say__comp = true;
        rv = TEXT_TY_Compare_Inner(left_txt, right_txt);
        @pull say__comp;
        return rv;
    ];

    [ TEXT_TY_Compare_Inner left_txt right_txt
        pos ch1 ch2 capacity_left capacity_right fl fr cl cr cpl cpr;
        if (left_txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) fl = true;
        if (right_txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) fr = true;

        if (fl && fr) {
            if ((left_txt-->1 ofclass String) && (right_txt-->1 ofclass String))
                return left_txt-->1 - right_txt-->1;
            if ((left_txt-->1 ofclass Routine) && (right_txt-->1 ofclass Routine))
                return left_txt-->1 - right_txt-->1;
            cpl = left_txt-->0; cl = TEXT_TY_Temporarily_Transmute(left_txt);
            cpr = right_txt-->0; cr = TEXT_TY_Temporarily_Transmute(right_txt);
        } else if (fl) {
            cpl = left_txt-->0; cl = TEXT_TY_Temporarily_Transmute(left_txt);
        } else if (fr) {
            cpr = right_txt-->0; cr = TEXT_TY_Temporarily_Transmute(right_txt);
        }
        if ((cl) || (cr)) {
            pos = TEXT_TY_Compare(left_txt, right_txt);
            TEXT_TY_Untransmute(left_txt, cl, cpl);
            TEXT_TY_Untransmute(right_txt, cr, cpr);
            return pos;
        }
        capacity_left = BlkValueLBCapacity(left_txt);
        capacity_right = BlkValueLBCapacity(right_txt);
        for (pos=0:(pos<capacity_left) && (pos<capacity_right):pos++) {
            ch1 = BlkValueRead(left_txt, pos);
            ch2 = BlkValueRead(right_txt, pos);
            if (ch1 ~= ch2) return ch1-ch2;
            if (ch1 == 0) return 0;
        }
        if (pos == capacity_left) return -1;
        return 1;
    ];

    [ TEXT_TY_Distinguish left_txt right_txt;
        if (TEXT_TY_Compare(left_txt, right_txt) == 0) rfalse;
        rtrue;
    ];

Hashing.

    [ TEXT_TY_Hash txt rv len i p cp;
        cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);
        rv = 0;
        len = BlkValueLBCapacity(txt);
        for (i=0: i<len: i++)
            rv = rv * 33 + BlkValueRead(txt, i);
        TEXT_TY_Untransmute(txt, p, cp);
        return rv;
    ];

Printing.

    [ TEXT_TY_Say txt ch i dsize;
        if (txt==0) rfalse;
        if (txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) return PrintI6Text(txt-->1);
        dsize = BlkValueLBCapacity(txt);
        for (i=0: i<dsize: i++) {
            ch = BlkValueRead(txt, i);
            if (ch == 0) break;
            #ifdef TARGET_ZCODE;
            print (char) ch;
            #ifnot; ! TARGET_ZCODE
            @streamunichar ch;
            #endif;
        }
        if (i == 0) rfalse;
        rtrue;
    ];

Capitalised printing.

    [ TEXT_TY_Say_Capitalised txt mod rc;
        mod = BlkValueCreate(TEXT_TY);
        TEXT_TY_SubstitutedForm(mod, txt);
        if (TEXT_TY_CharacterLength(mod) > 0) {
            BlkValueWrite(mod, 0, CharToCase(BlkValueRead(mod, 0), 1));
            TEXT_TY_Say(mod);
            rc = true;
            say__p = 1;
        }
        BlkValueFree(mod);
        return rc;
    ];

Serialisation.

    [ TEXT_TY_WriteFile txt len pos ch p cp;
        cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);
        len = BlkValueLBCapacity(txt);
        print "S";
        for (pos=0: pos<=len: pos++) {
            if (pos == len) ch = 0; else ch = BlkValueRead(txt, pos);
            if (ch == 0) {
                print "0;"; break;
            } else {
                print ch, ",";
            }
        }
        TEXT_TY_Untransmute(txt, p, cp);
    ];

Unserialisation.

    [ TEXT_TY_ReadFile txt auxf ch i v dg pos tsize p;
        TEXT_TY_Transmute(txt);
        tsize = BlkValueLBCapacity(txt);
        while (ch ~= 32 or 9 or 10 or 13 or 0 or -1) {
            ch = FileIO_GetC(auxf);
            if (ch == ',' or ';') {
                if (pos+1 >= tsize) {
                    if (BlkValueSetLBCapacity(txt, 2*pos) == false) break;
                    tsize = BlkValueLBCapacity(txt);
                }
                BlkValueWrite(txt, pos++, v);
                v = 0;
                if (ch == ';') break;
            } else {
                dg = ch - '0';
                v = v*10 + dg;
            }
        }
        BlkValueWrite(txt, pos, 0);
        return txt;
    ];

Substitution.

    [ TEXT_TY_SubstitutedForm to txt;
        if (txt) {
            BlkValueCopy(to, txt);
            TEXT_TY_Transmute(to);
        }
        return to;
    ];

    [ TEXT_TY_IsSubstituted txt;
        if ((txt) &&
            (txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) &&
            (txt-->1 ofclass Routine)) rfalse;
        rtrue;
    ];

Perishability.

    [ TEXT_TY_ExpandIfPerishable to from;
        if ((from) && (from-->0 == CONSTANT_PERISHABLE_TEXT_STORAGE))
            return TEXT_TY_SubstitutedForm(to, from);
        return from;
    ];

Recognition-only-GPR.

An I6 general parsing routine to look at words from the position marker wn in the player's command to see if they match the contents of the text txt, returning either GPR_PREPOSITION or GPR_FAIL according to whether a match could be made. This is used when an object's name is set to include one of its properties, and the property in question is a text: "A flowerpot is a kind of thing. A flowerpot has a text called pattern. Understand the pattern property as describing a flowerpot." When the player types EXAMINE STRIPED FLOWERPOT, and there is a flowerpot in scope, the following routine is called to test whether its pattern property – a text – matches any words at the position STRIPED FLOWERPOT. Assuming a pot does indeed have the pattern "striped", the routine advances wn by 1 and returns GPR_PREPOSITION to indicate a match.

This kind of GPR is called a "recognition-only-GPR", because it only recognises an existing value: it doesn't parse a new one.
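The flowerpot example can be traced with a short Python model of the matching. It is a sketch with several stated simplifications: the command is a list of words indexed from 0 rather than I6's word numbering, the numeric values given to GPR_FAIL and GPR_PREPOSITION are placeholders, case is folded with lower() rather than by the template's own case routines, and on failure the model restores wn whereas the I6 routine leaves that to its caller. The name recognise is hypothetical.

```python
GPR_FAIL = -1          # placeholder values for the sketch
GPR_PREPOSITION = 0


def recognise(text, command_words, wn):
    """Try to match every word of `text` against the command, starting at
    position `wn`. Return (result, new value of wn)."""
    wanted = text.split()              # words of the text, split at whitespace
    if not wanted:
        return GPR_FAIL, wn            # no progress would be made
    for offset, word in enumerate(wanted):
        position = wn + offset
        if position >= len(command_words):
            return GPR_FAIL, wn        # ran out of command
        if command_words[position].lower() != word.lower():
            return GPR_FAIL, wn        # a word failed to match
    return GPR_PREPOSITION, wn + len(wanted)


command = ["examine", "striped", "flowerpot"]
```

With the pattern "striped" and wn at the second word, recognise("striped", command, 1) reports a match and advances wn by one, leaving FLOWERPOT for the rest of the grammar to deal with.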

    [ TEXT_TY_ROGPR txt p cp r;
        if (txt == 0) return GPR_FAIL;
        cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);
        r = TEXT_TY_ROGPRI(txt);
        TEXT_TY_Untransmute(txt, p, cp);
        return r;
    ];
    [ TEXT_TY_ROGPRI txt
        pos len wa wl wpos bdm ch own;
        bdm = true; own = wn;
        len = BlkValueLBCapacity(txt);
        for (pos=0: pos<=len: pos++) {
            if (pos == len) ch = 0; else ch = BlkValueRead(txt, pos);
            if (ch == 32 or 9 or 10 or 0) {
                if (bdm) continue;
                bdm = true;
                if (wpos ~= wl) return GPR_FAIL;
                if (ch == 0) break;
            } else {
                if (bdm) {
                    bdm = false;
                    if (NextWordStopped() == -1) return GPR_FAIL;
                    wa = WordAddress(wn-1);
                    wl = WordLength(wn-1);
                    wpos = 0;
                }
                if (wa->wpos ~= ch or TEXT_TY_RevCase(ch)) return GPR_FAIL;
                wpos++;
            }
        }
        if (wn == own) return GPR_FAIL; ! Progress must be made to avoid looping
        return GPR_PREPOSITION;
    ];

Blobs.

That completes the compulsory services required for this KOV to function: from here on, the remaining routines provide definitions of text-related phrases in the Standard Rules.

What are the basic operations of text-handling? Clearly we want to be able to search, and replace, but that is left for the segment RegExp.i6t to handle. More basically we would like to be able to read and write characters from the text. But texts in I7 tend to be of natural language, rather than containing arbitrary material – that's indeed why we call them texts rather than strings. This means they are likely to be punctuated sequences of words, divided up perhaps into sentences and even paragraphs.

So we provide facilities which regard a text as being an array of "blobs", where a "blob" is a unit of text. The user can choose whether to see it as an array of characters, or words (of three different sorts: see the Inform documentation for details), or paragraphs, or lines.
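To make the three sorts of word concrete, here is a rough Python model using regular expressions. It is a sketch, not the scanner used by the template: the punctuation set is an assumption (full stop, comma, exclamation mark, question mark, hyphen, slash, double quote, colon, semicolon and the three kinds of bracket), and the function names are hypothetical. Paragraph and line blobs are left out.

```python
import re

# Assumed punctuation set, written for use inside a character class.
PUNCT = r'.,!?\-/":;()\[\]{}'

# Ordinary words: runs of characters that are neither space nor punctuation.
WORD_RE = re.compile(r'[^\s%s]+' % PUNCT)
# Punctuated words: as above, but each punctuation mark counts as a word too,
# with a run of hyphens or a run of full stops counting only once.
PWORD_RE = re.compile(r'[^\s%s]+|-+|\.+|[,!?/":;()\[\]{}]' % PUNCT)


def words(text):
    return WORD_RE.findall(text)


def punctuated_words(text):
    return PWORD_RE.findall(text)


def unpunctuated_words(text):
    # Punctuation is not treated specially: only whitespace divides words.
    return text.split()
```

For the text "Frankly, yes - I agree..." this model finds four words, seven punctuated words (the comma, the hyphen and the ellipsis each count once) and five unpunctuated words (the isolated hyphen counts as a word, and "agree..." keeps its dots).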

    Constant CHR_BLOB = 1; ! Construe as an array of characters
    Constant WORD_BLOB = 2; ! Of words
    Constant PWORD_BLOB = 3; ! Of punctuated words
    Constant UWORD_BLOB = 4; ! Of unpunctuated words
    Constant PARA_BLOB = 5; ! Of paragraphs
    Constant LINE_BLOB = 6; ! Of lines

    Constant REGEXP_BLOB = 7; ! Not a blob type as such, but needed as a distinct value

Blob Access.

    Constant WS_BRM = 1;
    Constant SKIPPED_BRM = 2;
    Constant ACCEPTED_BRM = 3;
    Constant ACCEPTEDP_BRM = 4;
    Constant ACCEPTEDN_BRM = 5;
    Constant ACCEPTEDPN_BRM = 6;

    [ TEXT_TY_BlobAccess txt blobtype ctxt wanted rtxt
        p1 p2 cp1 cp2 r;
        if (txt==0) return 0;
        if (blobtype == CHR_BLOB) return TEXT_TY_CharacterLength(txt);
        cp1 = txt-->0; p1 = TEXT_TY_Temporarily_Transmute(txt);
        cp2 = rtxt-->0; p2 = TEXT_TY_Temporarily_Transmute(rtxt);
        TEXT_TY_Transmute(ctxt);
        r = TEXT_TY_BlobAccessI(txt, blobtype, ctxt, wanted, rtxt);
        TEXT_TY_Untransmute(txt, p1, cp1);
        TEXT_TY_Untransmute(rtxt, p2, cp2);
        return r;
    ];
    [ TEXT_TY_BlobAccessI txt blobtype ctxt wanted rtxt
        brm oldbrm ch i dsize csize blobcount gp cl j;
        dsize = BlkValueLBCapacity(txt);
        if (ctxt) csize = BlkValueLBCapacity(ctxt);
        else if (rtxt) "*** rtxt without ctxt ***";
        brm = WS_BRM;
        for (i=0:i<dsize:i++) {
            ch = BlkValueRead(txt, i);
            if (ch == 0) break;
            oldbrm = brm;
            if (ch == 10 or 13 or 32 or 9) {
                if (oldbrm ~= WS_BRM) {
                    gp = 0;
                    for (j=i:j<dsize:j++) {
                        ch = BlkValueRead(txt, j);
                        if (ch == 0) { brm = WS_BRM; break; }
                        if (ch == 10 or 13) { gp++; continue; }
                        if (ch ~= 32 or 9) break;
                    }
                    ch = BlkValueRead(txt, i);
                    if (j == dsize) brm = WS_BRM;
                    switch (blobtype) {
                        PARA_BLOB: if (gp >= 2) brm = WS_BRM;
                        LINE_BLOB: if (gp >= 1) brm = WS_BRM;
                        default: brm = WS_BRM;
                    }
                }
            } else {
                gp = false;
                if ((blobtype == WORD_BLOB or PWORD_BLOB or UWORD_BLOB) &&
                    (ch == '.' or ',' or '!' or '?'
                        or '-' or '/' or '"' or ':' or ';'
                        or '(' or ')' or '[' or ']' or '{' or '}'))
                    gp = true;
                switch (oldbrm) {
                    WS_BRM:
                        brm = ACCEPTED_BRM;
                        if (blobtype == WORD_BLOB) {
                            if (gp) brm = SKIPPED_BRM;
                        }
                        if (blobtype == PWORD_BLOB) {
                            if (gp) brm = ACCEPTEDP_BRM;
                        }
                    SKIPPED_BRM:
                        if (blobtype == WORD_BLOB) {
                            if (gp == false) brm = ACCEPTED_BRM;
                        }
                    ACCEPTED_BRM:
                        if (blobtype == WORD_BLOB) {
                            if (gp) brm = SKIPPED_BRM;
                        }
                        if (blobtype == PWORD_BLOB) {
                            if (gp) brm = ACCEPTEDP_BRM;
                        }
                    ACCEPTEDP_BRM:
                        if (blobtype == PWORD_BLOB) {
                            if (gp == false) brm = ACCEPTED_BRM;
                            else {
                                if ((ch == BlkValueRead(txt, i-1)) &&
                                    (ch == '-' or '.')) blobcount--;
                                blobcount++;
                            }
                        }
                    ACCEPTEDN_BRM:
                        if (blobtype == WORD_BLOB) {
                            if (gp) brm = SKIPPED_BRM;
                        }
                        if (blobtype == PWORD_BLOB) {
                            if (gp) brm = ACCEPTEDP_BRM;
                        }
                    ACCEPTEDPN_BRM:
                        if (blobtype == PWORD_BLOB) {
                            if (gp == false) brm = ACCEPTED_BRM;
                            else {
                                if ((ch == BlkValueRead(txt, i-1)) &&
                                    (ch == '-' or '.')) blobcount--;
                                blobcount++;
                            }
                        }
                }
            }
            if (brm == ACCEPTED_BRM or ACCEPTEDP_BRM) {
                if (oldbrm ~= brm) blobcount++;
                if ((ctxt) && (blobcount == wanted)) {
                    if (rtxt) {
                        BlkValueWrite(ctxt, cl, 0);
                        TEXT_TY_Concatenate(ctxt, rtxt, CHR_BLOB);
                        csize = BlkValueLBCapacity(ctxt);
                        cl = TEXT_TY_CharacterLength(ctxt);
                        if (brm == ACCEPTED_BRM) brm = ACCEPTEDN_BRM;
                        if (brm == ACCEPTEDP_BRM) brm = ACCEPTEDPN_BRM;
                    } else {
                        if (cl+1 >= csize) {
                            if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;
                            csize = BlkValueLBCapacity(ctxt);
                        }
                        BlkValueWrite(ctxt, cl++, ch);
                    }
                } else {
                    if (rtxt) {
                        if (cl+1 >= csize) {
                            if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;
                            csize = BlkValueLBCapacity(ctxt);
                        }
                        BlkValueWrite(ctxt, cl++, ch);
                    }
                }
            } else {
                if ((rtxt) && (brm ~= ACCEPTEDN_BRM or ACCEPTEDPN_BRM)) {
                    if (cl+1 >= csize) {
                        if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;
                        csize = BlkValueLBCapacity(ctxt);
                    }
                    BlkValueWrite(ctxt, cl++, ch);
                }
            }
        }
        if (ctxt) BlkValueWrite(ctxt, cl++, 0);
        return blobcount;
    ];

Get Blob.

    [ TEXT_TY_GetBlob ctxt txt wanted blobtype;
        if (txt==0) return;
        if (blobtype == CHR_BLOB) return TEXT_TY_GetCharacter(ctxt, txt, wanted);
        TEXT_TY_BlobAccess(txt, blobtype, ctxt, wanted);
        return ctxt;
    ];

Replace Blob.

    [ TEXT_TY_ReplaceBlob blobtype txt wanted rtxt ctxt ilen rlen i p cp;
        TEXT_TY_Transmute(txt);
        cp = rtxt-->0; p = TEXT_TY_Temporarily_Transmute(rtxt);
        if (blobtype == CHR_BLOB) {
            ilen = TEXT_TY_CharacterLength(txt);
            rlen = TEXT_TY_CharacterLength(rtxt);
            wanted--;
            if ((wanted >= 0) && (wanted<ilen)) {
                if (rlen == 1) {
                    BlkValueWrite(txt, wanted, BlkValueRead(rtxt, 0));
                } else {
                    ctxt = BlkValueCreate(TEXT_TY);
                    TEXT_TY_Transmute(ctxt);
                    if (BlkValueSetLBCapacity(ctxt, ilen+rlen+1)) {
                        for (i=0:i<wanted:i++)
                            BlkValueWrite(ctxt, i, BlkValueRead(txt, i));
                        for (i=0:i<rlen:i++)
                            BlkValueWrite(ctxt, wanted+i, BlkValueRead(rtxt, i));
                        for (i=wanted+1:i<ilen:i++)
                            BlkValueWrite(ctxt, rlen+i-1, BlkValueRead(txt, i));
                        BlkValueWrite(ctxt, rlen+ilen, 0);
                        BlkValueCopy(txt, ctxt);
                    }
                    BlkValueFree(ctxt);
                }
            }
        } else {
            ctxt = BlkValueCreate(TEXT_TY);
            TEXT_TY_BlobAccess(txt, blobtype, ctxt, wanted, rtxt);
            BlkValueCopy(txt, ctxt);
            BlkValueFree(ctxt);
        }
        TEXT_TY_Untransmute(rtxt, p, cp);
    ];

Replace Text.

This is the general routine which searches for any instance of ftxt, as a blob, in txt, and replaces it with the text rtxt. It works on any of the above blob-types, but two cases are special: first, if the blob-type is CHR_BLOB, then it can do more than search and replace for any instance of a single character: it can search and replace any instance of a substring, so that ftxt is not required to be only a single character. Second, if the blob-type is the special value REGEXP_BLOB then ftxt is interpreted as a regular expression rather than something literal to find: see RegExp.i6t for what happens next.
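The difference between character-level and word-level replacement can be illustrated with a short Python model. This is a sketch only: the boundary set (whitespace plus an assumed list of punctuation marks) and the function names are assumptions, matching is case-sensitive, and the model does not try to reproduce every edge case of the routine in this section.

```python
import re

# Assumed word boundaries: whitespace and these punctuation marks.
BOUNDARY = r'\s.,!?\-/":;()\[\]{}'


def replace_chars(text, find, replacement):
    """Character-level model: replace every occurrence of a substring."""
    return text.replace(find, replacement)


def replace_word(text, find, replacement):
    """Word-level model: replace `find` only where it stands as a whole word,
    that is, bounded on each side by a boundary character or an end of text."""
    pattern = re.compile(
        r'(?<![^%s])%s(?![^%s])' % (BOUNDARY, re.escape(find), BOUNDARY))
    # A function is used so that backslashes in the replacement are literal.
    return pattern.sub(lambda match: replacement, text)
```

So replace_chars("other", "the", "a") gives "oar", because the substring is found inside the word, whereas replace_word("other", "the", "a") leaves "other" alone and changes only free-standing occurrences of "the".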

    [ TEXT_TY_ReplaceText blobtype txt ftxt rtxt
        r p1 p2 cp1 cp2;
        TEXT_TY_Transmute(txt);
        cp1 = ftxt-->0; p1 = TEXT_TY_Temporarily_Transmute(ftxt);
        cp2 = rtxt-->0; p2 = TEXT_TY_Temporarily_Transmute(rtxt);
        r = TEXT_TY_ReplaceTextI(blobtype, txt, ftxt, rtxt);
        TEXT_TY_Untransmute(ftxt, p1, cp1);
        TEXT_TY_Untransmute(rtxt, p2, cp2);
        return r;
    ];

    [ TEXT_TY_ReplaceTextI blobtype txt ftxt rtxt
        ctxt csize ilen flen i cl mpos ch chm whitespace punctuation;
        if (blobtype == REGEXP_BLOB or CHR_BLOB)
            return TEXT_TY_Replace_RE(blobtype, txt, ftxt, rtxt);

        ilen = TEXT_TY_CharacterLength(txt);
        flen = TEXT_TY_CharacterLength(ftxt);
        ctxt = BlkValueCreate(TEXT_TY);
        TEXT_TY_Transmute(ctxt);
        csize = BlkValueLBCapacity(ctxt);
        mpos = 0;

        whitespace = true; punctuation = false;
        for (i=0:i<=ilen:i++) {
            ch = BlkValueRead(txt, i);
            .MoreMatching;
            chm = BlkValueRead(ftxt, mpos++);
            if (mpos == 1) {
                switch (blobtype) {
                    WORD_BLOB:
                        if ((whitespace == false) && (punctuation == false)) chm = -1;
                }
            }
            whitespace = false;
            if (ch == 10 or 13 or 32 or 9) whitespace = true;
            punctuation = false;
            if (ch == '.' or ',' or '!' or '?'
                or '-' or '/' or '"' or ':' or ';'
                or '(' or ')' or '[' or ']' or '{' or '}') {
                if (blobtype == WORD_BLOB) chm = -1;
                punctuation = true;
            }
            if (ch == chm) {
                if (mpos == flen) {
                    if (i == ilen) chm = 0;
                    else chm = BlkValueRead(txt, i+1);
                    if ((blobtype == CHR_BLOB) ||
                        (chm == 0 or 10 or 13 or 32 or 9) ||
                        (chm == '.' or ',' or '!' or '?'
                            or '-' or '/' or '"' or ':' or ';'
                            or '(' or ')' or '[' or ']' or '{' or '}')) {
                        mpos = 0;
                        cl = cl - (flen-1);
                        BlkValueWrite(ctxt, cl, 0);
                        TEXT_TY_Concatenate(ctxt, rtxt, CHR_BLOB);
                        csize = BlkValueLBCapacity(ctxt);
                        cl = TEXT_TY_CharacterLength(ctxt);
                        continue;
                    }
                }
            } else {
                mpos = 0;
            }
            if (cl+1 >= csize) {
                if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;
                csize = BlkValueLBCapacity(ctxt);
            }
            BlkValueWrite(ctxt, cl++, ch);
        }
        BlkValueCopy(txt, ctxt);
        BlkValueFree(ctxt);
    ];

Character Length.

    [ TEXT_TY_CharacterLength txt ch i dsize p cp r;
        if (txt==0) return 0;
        cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);
        dsize = BlkValueLBCapacity(txt); r = dsize;
        for (i=0:i<dsize:i++) {
            ch = BlkValueRead(txt, i);
            if (ch == 0) { r = i; break; }
        }
        TEXT_TY_Untransmute(txt, p, cp);
        return r;
    ];

    [ TEXT_TY_Empty txt;
        if (txt==0) rtrue;
        if (txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) {
            if (txt-->1 == EMPTY_TEXT_PACKED) rtrue;
            rfalse;
        }
        if (TEXT_TY_CharacterLength(txt) == 0) rtrue;
        rfalse;
    ];

Get Character.

    [ TEXT_TY_GetCharacter ctxt txt i ch p cp;
        if (txt==0) return 0;
        cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);
        TEXT_TY_Transmute(ctxt);
        if ((i<=0) || (i>TEXT_TY_CharacterLength(txt))) ch = 0;
        else ch = BlkValueRead(txt, i-1);
        BlkValueWrite(ctxt, 0, ch);
        BlkValueWrite(ctxt, 1, 0);
        TEXT_TY_Untransmute(txt, p, cp);
        return ctxt;
    ];

Casing.

In many programming languages, characters are a distinct data type from strings, but not in I7. To I7, a character is simply a text which happens to have length 1 – this has its inefficiencies, but is conceptually easy for the user.

TEXT_TY_CharactersOfCase(txt, case) determines whether all the characters in txt are letters of the given casing: 0 for lower case, 1 for upper case. In the case of ZSCII, this is done correctly handling all of the European accented letters; in the case of Unicode, it follows the Unicode standard.

Note that there is no requirement for txt to be only a single character long.
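The test can be modelled in a few lines of Python. This sketch leans on Python's own Unicode case predicates rather than the ZSCII or Unicode tables used by the template, so results for unusual characters may differ; the function name characters_of_case is hypothetical.

```python
def characters_of_case(text, case):
    """True if every character of `text` is a letter of the given casing:
    case 0 means lower case, case 1 means upper case."""
    is_of_case = str.isupper if case == 1 else str.islower
    return all(is_of_case(ch) for ch in text)
```

As in the routine this models, a text of any length is accepted, an empty text passes vacuously, and a single digit or space is enough to make the test fail, since such characters have no casing.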

    [ TEXT_TY_CharactersOfCase txt case i ch len p cp r;
        if (txt==0) return 0;
        cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);
        len = TEXT_TY_CharacterLength(txt);
        r = true;
        for (i=0:i<len:i++) {
            ch = BlkValueRead(txt, i);
            if ((ch) && (CharIsOfCase(ch, case) == false)) { r = false; break; }
        }
        TEXT_TY_Untransmute(txt, p, cp);
        return r;
    ];

Change Case.

    [ TEXT_TY_CharactersToCase ctxt txt case i ch len bnd pk cp;
        if (txt==0) return 0;
        cp = txt-->0; pk = TEXT_TY_Temporarily_Transmute(txt);
        TEXT_TY_Transmute(ctxt);
        len = TEXT_TY_CharacterLength(txt);
        if (BlkValueSetLBCapacity(ctxt, len+1)) {
            bnd = 1;
            for (i=0:i<len:i++) {
                ch = BlkValueRead(txt, i);
                if (case < 2) {
                    BlkValueWrite(ctxt, i, CharToCase(ch, case));
                } else {
                    BlkValueWrite(ctxt, i, CharToCase(ch, bnd));
                    if (case == 2) {
                        bnd = 0;
                        if (ch == 0 or 10 or 13 or 32 or 9
                            or '.' or ',' or '!' or '?'
                            or '-' or '/' or '"' or ':' or ';'
                            or '(' or ')' or '[' or ']' or '{' or '}') bnd = 1;
                    }
                    if (case == 3) {
                        if (ch ~= 0 or 10 or 13 or 32 or 9) {
                            if (bnd == 1) bnd = 0;
                            else {
                                if (ch == '.' or '!' or '?') bnd = 1;
                            }
                        }
                    }
                }
            }
            BlkValueWrite(ctxt, len, 0);
        }
        TEXT_TY_Untransmute(txt, pk, cp);
        return ctxt;
    ];

Concatenation.

    [ TEXT_TY_Concatenate to_txt from_txt blobtype ref_txt
        p cp r;
        if (to_txt==0) rfalse;
        if (from_txt==0) return to_txt;
        TEXT_TY_Transmute(to_txt);
        cp = from_txt-->0; p = TEXT_TY_Temporarily_Transmute(from_txt);
        r = TEXT_TY_ConcatenateI(to_txt, from_txt, blobtype, ref_txt);
        TEXT_TY_Untransmute(from_txt, p, cp);
        return r;
    ];

    [ TEXT_TY_ConcatenateI to_txt from_txt blobtype ref_txt
        pos len ch i tosize x y case;
        switch(blobtype) {
            CHR_BLOB, 0:
                pos = TEXT_TY_CharacterLength(to_txt);
                len = TEXT_TY_CharacterLength(from_txt);
                if (BlkValueSetLBCapacity(to_txt, pos+len+1) == false) return to_txt;
                for (i=0:i<len:i++) {
                    ch = BlkValueRead(from_txt, i);
                    BlkValueWrite(to_txt, i+pos, ch);
                }
                BlkValueWrite(to_txt, len+pos, 0);
                return to_txt;
            REGEXP_BLOB:
                return TEXT_TY_RE_Concatenate(to_txt, from_txt, blobtype, ref_txt);
        }
        print "*** TEXT_TY_Concatenate used on impossible blob type ***^";
        rfalse;
    ];
