Strings

The standard library contains a type for strings, named core.Str. The types core.Object and core.TObject both contain a member toS that creates a string representation of the object. There is also a template function toS that converts values to strings using the output operator for a string buffer. As such, most types can be converted to a string by calling <obj>.toS().

The Str type is immutable: a string cannot be modified once it has been created. Its interface gives access to individual codepoints without exposing the internal representation of the string. Since the internal representation differs from the one that is exposed, the Str class does not provide indexed access to codepoints, as this would not be efficient. Instead, iterators are used to refer to codepoints in the string. Since iterators provide a + operator, it is convenient to advance an iterator by a specific number of codepoints.
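
As a minimal sketch of iterator-based access in Basic Storm, consider the function below. It assumes that Str provides a member begin that returns an iterator to the first codepoint, that an iterator exposes the codepoint it refers to through a member named v, and that a for-in loop can traverse a Str. None of these names are stated above, so they should be checked against the reference documentation. The function name iterate is purely illustrative.

void iterate() {
    Str s = "storm";

    // Assumption: begin() returns an iterator to the first codepoint.
    // The + operator then steps the iterator two codepoints forward.
    var third = s.begin() + 2;

    // Assumption: the member v yields the Char the iterator refers to.
    print(third.v.toS());

    // Assumption: a for-in loop visits each codepoint in order,
    // which avoids the need for indexed access.
    for (c in s) {
        print(c.toS());
    }
}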

String Operations

The Str class contains many functions for inspecting a string and for creating modified copies of it. Below is a selection of its functionality, categorized by theme:

Inspecting Content

Note that languages like Basic Storm automatically derive the remaining comparison operators from the ones that are provided.

Manipulation

Substrings

The following operations extract and inspect substrings:

The second parameter to find and findLast is optional. If it is omitted, the search starts at the start or end of the string respectively.

Conversion

The Str class contains functions for converting strings to numbers. As with other types, Str provides members named int, nat, long, word, float, and double to convert a string into another type. Since these conversions may fail, they all return Maybe<T> to indicate whether the conversion was successful. In cases where the conversion is expected to succeed (e.g. when the string originated from a match in a grammar), the Str class also provides functions toX, where X is one of the types Int, Nat, Long, Word, Float, or Double. These functions all throw a core.StrError if the format is invalid.

For conversion from hexadecimal, Str provides the functions hexNat and hexWord. They work like nat and word. Similarly, there are versions toHexNat and toHexWord that throw an exception instead of returning Maybe<T>.

As usual, conversion from numbers to a string can be done by calling toS on almost any type.
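
The two styles of conversion can be sketched in Basic Storm as follows. The example assumes that a Maybe<T> result can be unwrapped with the if (x = ...) form, and that hexNat accepts hexadecimal digits without a prefix. Both are assumptions that are not confirmed by the text above. The function name convert is illustrative only.

void convert() {
    // int() returns Maybe<Int>. The value is only available
    // inside the if-block, where the conversion has succeeded.
    if (x = "42".int()) {
        print(x.toS());
    }

    // toInt() returns an Int directly, and throws a StrError
    // when the string is not a valid number.
    try {
        Int y = "4x2".toInt();
        print(y.toS());
    } catch (StrError e) {
        print("Not a number");
    }

    // hexNat() works like nat(), but reads hexadecimal digits.
    // Assumption: no "0x" prefix is required.
    if (h = "ff".hexNat()) {
        // toS converts the number back to a string.
        print(h.toS());
    }
}

The Maybe-returning members fit input that may legitimately be malformed, while the toX functions keep code short when malformed input indicates a bug elsewhere.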

Other Utilities

There are also a few other utility functions provided:

Characters

The type Char represents a single Unicode codepoint. This is the type that the iterators in Str return, so a Str is conceptually a sequence of Char values, even though it uses a more compact internal representation.

The String Buffer

The string buffer, StrBuf, is a mutable string that is able to build strings efficiently. The toS member function that exists for all types usually calls an overloaded version of toS that accepts a core.StrBuf as a parameter. This makes it possible for objects to create their string representation efficiently, rather than relying on string concatenation. For example, a toS implementation for a simple class could look as follows:

class MyClass {
    Int value;

    protected void toS(StrBuf to) : override {
        to << "My class: " << value;
    }
}

As can be seen above, the StrBuf class utilizes the << operator to add strings to the end of the string buffer. There is also a member add that can be used in languages where the << operator is not available (e.g. the Syntax Language). The string buffer contains overloads for the primitive types in the standard library.
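
A string buffer can also be used directly to assemble a string from several parts. The following is a minimal sketch in Basic Storm. The function describe and its parameters are hypothetical, and the example assumes that add accepts a Str argument.

Str describe(Str name, Int count) {
    StrBuf buf;

    // << appends to the end of the buffer and can be chained.
    // The Int is formatted by the overload for primitive types.
    buf << name << ": " << count;

    // add is the member equivalent of <<.
    // Assumption: it accepts a Str.
    buf.add(" items");

    // toS produces an immutable Str from the accumulated content.
    return buf.toS();
}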

The string buffer also has the ability to format output. The following formatting options are available:

The StrBuf also contains the members indent and dedent, which increase and decrease the indentation of subsequent output by one copy of the indentation string. The indentation string can be set by calling indentBy. There is also a class Indent that can be used to indent a StrBuf for as long as the Indent object is in scope.
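
A minimal sketch of how the indentation members might be combined in Basic Storm is shown below. It assumes that indentBy accepts a Str, and that the indentation is applied at the start of each new line that is written to the buffer. Neither detail is stated above. The function name listing is illustrative only.

Str listing() {
    StrBuf buf;

    // Use two spaces for each level of indentation.
    // Assumption: indentBy accepts a Str.
    buf.indentBy("  ");

    buf << "outer\n";

    // Lines written from here on are indented one level further.
    buf.indent();
    buf << "inner\n";

    // Return to the previous indentation level.
    buf.dedent();
    buf << "outer again";

    return buf.toS();
}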