String Manipulation

Softcode

In MUSH softcode, every value is a string. Numbers, dbrefs, booleans, and lists are all represented as text. There are no separate data types. The string manipulation functions are therefore among the most heavily used parts of the function library.

Concatenation

cat() joins strings with a space between each one. strcat() joins them with no separator at all.

> say cat(hello, world)
You say, "hello world"
> say strcat(foo, bar, baz)
You say, "foobarbaz"

Use cat() when building readable output. Use strcat() when assembling tokens, dbrefs, or attribute names where spaces would break things.

Substrings

mid(<string>, <start>, <length>) extracts a substring. Positions are zero-based. If <length> is negative, extraction proceeds leftward from <start>.

left(<string>, <n>) returns the first <n> characters; strtrunc() is an alias for it. right(<string>, <n>) returns the last <n>.

delete(<string>, <first>, <len>) removes <len> characters starting at position <first>. As with mid(), positions are zero-based.

> say mid(abcdefgh, 2, 3)
You say, "cde"
> say delete(abcdefgh, 3, 2)
You say, "abcfgh"
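
For comparison, a short sketch of left() and right() on the same string, assuming they behave as described above:

> say left(abcdefgh, 3)
You say, "abc"
> say right(abcdefgh, 3)
You say, "fgh"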

Length

strlen(<string>) returns the number of visible characters (grapheme clusters). ANSI color codes do not count. strmem(<string>) returns the number of bytes the string occupies in memory. For pure ASCII text, the two are equal. For UTF-8, a single visible character such as e-acute (U+00E9) takes 2 bytes, so strlen() returns 1 while strmem() returns 2.
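
A quick way to see the distinction, sketched with plain ASCII input so that the byte count and the character count agree. The color codes added by ansi() should not affect strlen():

> say strlen(ansi(hr, hello))
You say, "5"
> say strmem(hello)
You say, "5"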

Searching

pos(<needle>, <haystack>) returns the 1-based position of the first occurrence of <needle> in <haystack>, or #-1 if not found.

lpos(<string>, <char>) returns a space-separated list of all 0-based positions where <char> appears.

strmatch(<string>, <pattern>) tests whether the entire string matches a wildcard pattern (* and ?), returning 1 or 0. Case-insensitive.

match(<list>, <pattern>) matches a wildcard pattern against each word in a list and returns the 1-based index of the first match (0 if none). member(<list>, <word>) does the same but with exact, case-sensitive comparison and no wildcards.

> say pos(man, superman)
You say, "6"
> say match(This is a test, *is*)
You say, "1"
> say member(This is a test, is)
You say, "2"
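
A few more cases, as a sketch assuming the behavior described above. Note that pos() counts from 1 while lpos() counts from 0, and that strmatch() must match the whole string, not just part of it:

> say pos(x, superman)
You say, "#-1"
> say lpos(a-bc-def-g, -)
You say, "1 4 8"
> say strmatch(Superman, super*)
You say, "1"
> say strmatch(Superman, man)
You say, "0"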

Transformation

edit(<string>, <from>, <to>) replaces all occurrences of <from> with <to>. The special <from> values ^ and $ prepend or append instead. Multiple pairs can be chained in a single call.

> say edit(Atlantic, ^, Trans)
You say, "TransAtlantic"
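
Two further illustrations, assuming the replace-all and append behavior described above:

> say edit(This is a test, is, at)
You say, "That at a test"
> say edit(Atlantic, $, Ocean)
You say, "AtlanticOcean"

The first example shows that every occurrence is replaced, including the one inside "This".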

lcstr(), ucstr(), and capstr() convert to lowercase, uppercase, or capitalize the first character respectively.

scramble(<string>) returns a random permutation of all characters. reverse(<string>) reverses character order.
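
The deterministic functions in this group can be sketched as follows (scramble() is omitted because its output is random):

> say lcstr(Hello World)
You say, "hello world"
> say ucstr(Hello World)
You say, "HELLO WORLD"
> say capstr(hello world)
You say, "Hello world"
> say reverse(abcdef)
You say, "fedcba"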

translate(<string>, <type>) converts raw ANSI codes and control characters. With type s (or 0), they become spaces. With type p (or 1), they become MUX substitutions like %c and %r.

Formatting and Alignment

ljust(), rjust(), and center() pad a string to a fixed width. All three accept an optional fill pattern.

> say -[ljust(foo, 6)]-
You say, "-foo   -"
> say =[center(*, 5, -)]=
You say, "=--*--="
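
Some further padding examples, as a sketch assuming the optional fill argument is a single character:

> say -[rjust(foo, 6)]-
You say, "-   foo-"
> say rjust(42, 5, 0)
You say, "00042"
> say ljust(Name, 10, .)
You say, "Name......"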

printf(<format>, <args>...) provides C-style formatted output. Specifiers: %s (string), %d (integer), %f (float), %c (single character). Modifiers: - left-justify, = center (MUX extension), 0 zero-pad, a number for field width, .N for precision.

> say printf(|%-10s|%5d|, Apples, 42)
You say, "|Apples    |   42|"

table(<list>, <width>, <line length>) arranges list elements in a grid. columns(<list>, <width>) does similar columnar formatting. wrap(<text>, <width>) word-wraps text with optional justification, borders, and hanging indent.

ANSI Color

ansi(<codes>, <string>) wraps a string in ANSI terminal color. Codes include h (highlight), u (underline), f (flash), i (inverse), and n (normal). Foreground colors use lowercase letters: r red, g green, b blue, y yellow, c cyan, m magenta, x black, w white. Uppercase (R, G, B, etc.) sets the background. TinyMUX also supports 24-bit color: ansi(<#FF8040>/<#800080>, text).

stripansi(<string>) removes all ANSI codes from a string. This is important when you need the plain text for length calculations or storage.
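
A minimal sketch of the two functions together. The first command should show "Warning:" in bright yellow on a color-capable client, so no output is reproduced here; the second shows that stripansi() recovers the plain text:

> @pemit me=ansi(hy, Warning:) low fuel
> say stripansi(ansi(hr, Alert))
You say, "Alert"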

Encoding and Safety

escape(<string>) adds a backslash at the start of the string and before each of the characters %\[]{};,()^$ so that the string survives one pass of evaluation unchanged. Use it when storing player-supplied text in attributes.

secure(<string>) replaces such dangerous characters with spaces instead of escaping them. It is a lossy alternative: the original string cannot be recovered.
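
As a rough sketch of where each function fits, the two commands below handle player input in different ways. The command names +note and +shout and the attribute names CMD_NOTE, CMD_SHOUT, and NOTE are hypothetical, chosen only for illustration:

> &CMD_NOTE me=$+note *:&NOTE me=[escape(%0)];@pemit %#=Note saved.
> &CMD_SHOUT me=$+shout *:@force me=say [secure(%0)]

The first stores the text with escape() so that a later evaluation of NOTE yields the player's original input rather than executing it. The second uses secure() because the text goes through @force and is evaluated again; any special characters are simply lost.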

encode64(<string>) and decode64(<string>) convert to and from Base64, useful for storing or transmitting binary-safe data.
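
A round trip through Base64, as a short sketch:

> say encode64(hello)
You say, "aGVsbG8="
> say decode64(aGVsbG8=)
You say, "hello"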

url_escape(<string>) percent-encodes a string per RFC 3986 for use in URLs, and url_unescape() reverses it.

Buffer Limits

All string operations are constrained by LBUF_SIZE, which is 8000 bytes in TinyMUX. Any function result that would exceed this limit is silently truncated. Because LBUF_SIZE is measured in bytes, Unicode text reaches the limit sooner than ASCII text of the same visible length.

Common strategies for working within the limit: build output line by line with @pemit inside @dolist rather than assembling one huge string; use printf() for formatting instead of repeated strcat() calls; and check strmem() rather than strlen() when measuring proximity to the byte limit.
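
The first strategy might look like the sketch below. It assumes the data lives in attributes matching DATA_* on the object itself (a hypothetical naming scheme) and emits one line per attribute, so no single result has to hold the whole listing:

> @dolist lattr(me/DATA_*)=@pemit %#=[ljust(##, 15)] [xget(me, ##)]

Each @pemit is evaluated separately, so the byte limit applies to each line rather than to the listing as a whole.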

Common Patterns

Safe display of user input. Store with escape(), or sanitize with secure() before passing through @force or attribute evaluation.

Case-insensitive search. Use strmatch() or match(), both of which ignore case. For exact case-sensitive lookup in a list, use member().
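
The difference shows up as soon as the case of the search term changes. A short sketch, assuming the behavior described above:

> say strmatch(HELLO, hello)
You say, "1"
> say match(This is a test, IS)
You say, "2"
> say member(This is a test, IS)
You say, "0"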

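Building a formatted table. Combine printf() for column alignment with iter() to loop over a list, passing %r as iter()'s output delimiter so that each row lands on its own line. The example assumes each DATA_* attribute holds one record of the form name|class|status:

@pemit %#=iter(lattr(me/DATA_*),
  printf(%-15s %-10s %s, first(xget(me,##),|), extract(xget(me,##),2,1,|), last(xget(me,##),|)), %b, %r)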

Trimming and cleaning. Use squish() to collapse runs of spaces into a single space, and trim() to remove leading and trailing whitespace. Use strip(<string>, <chars>) to remove specific unwanted characters.
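
A short sketch of the cleanup functions, assuming stock TinyMUX behavior. Here squish() collapses runs of spaces, strip() removes every listed character, and trim() is called with the style b (both ends) and an explicit trim character:

> say squish(too    many     spaces)
You say, "too many spaces"
> say strip(a-b-c, -)
You say, "abc"
> say trim(---hello---, b, -)
You say, "hello"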