String

UnrealScript has a character string data type called string. Strings can contain any combination of Unicode characters. Note that the internal representation uses UTF-16 characters. UTF-8 is not recognized by the compiler or any of the built-in string operations.

String values are immutable, and UnrealScript neither provides a "character" type nor allows direct access to individual string characters. There are, however, functions for extracting substrings and for returning the Unicode value of the first character of a string. There is also a function that returns a string of length 1 containing the character that corresponds to a specified Unicode value.
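
Because there is no character type, per-character work is usually done by combining the substring and character-code functions. The following is a minimal sketch, not stock code: CountChar is a hypothetical helper name, and it relies only on the global functions Len, Mid and Asc described further down this page.

```unrealscript
// Hypothetical helper: counts how often the character with code CharCode occurs in S.
static final function int CountChar(string S, int CharCode)
{
    local int i, Count;

    for (i = 0; i < Len(S); i++)
    {
        // Mid(S, i, 1) yields a one-character string; Asc returns the code of its first character.
        if (Asc(Mid(S, i, 1)) == CharCode)
            Count++;
    }
    return Count;
}
```

For example, CountChar("Banana", Asc("a")) would walk the six characters of the string and count the three lowercase "a" characters.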

Very early builds of Unreal Engine 1 had a fixed-length string type, which was declared with the syntax string[length]. This type is no longer supported and is only mentioned here in case you run into its declaration in old code snippets.

In Unreal Engine 2 (or at least in UT2004) there is also an undocumented type called button, which is an alias for the standard string type but implies the cache variable modifier. It is not used in stock code, and there is no reason to ever declare a variable with the cache modifier, so this is purely informative.

Literals

String literals start and end with double quotes. Between these there may be any number of characters, except line breaks or the null character. To include a double quote or backslash character in the string, it must be "escaped" by a preceding backslash character. Starting with Unreal Engine 3, strings can also contain other escape sequences, such as \n for a newline character. In Unreal Engine 1 and 2, escaped letters only stand for themselves, i.e. the string "\n" is identical to the string "n" there.

Examples:

"abc"
"This is an \"example\"."
"" (the empty string and default string value)

Technically string literals are allowed to have a length of up to 1023 characters. String values generated at runtime may be longer. Internally strings are zero-terminated, which means no string can contain the null character because it would be recognized as the end of the string.
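
The escaping rules above can be illustrated with a short sketch. LiteralDemo and its local variable names are hypothetical, and the use of Chr(10) to obtain a line feed in Unreal Engine 1 and 2 is an assumption based on the description of Chr on this page rather than something stated here.

```unrealscript
// Hypothetical demo function illustrating escaped string literals.
function LiteralDemo()
{
    local string Quoted, Path, TwoLines;

    Quoted = "This is an \"example\".";   // value: This is an "example".
    Path = "C:\\UT2004\\System";          // value: C:\UT2004\System

    // In Unreal Engine 1 and 2, "\n" is just "n", so a line break cannot be
    // written in a literal. Assumption: Chr(10) builds a line feed at runtime.
    TwoLines = "first" $ Chr(10) $ "second";
}
```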

String operators

The following operators are commonly declared in the Object class (RTNP, U1, UT, U2, U2XMP, UE2Runtime, UT2003, UT2004, UDK, UT3) and thus are available globally:

==
Case-sensitive equality comparison. Returns true iff the string expressions on the left and right are identical.
~=
Case-insensitive equality comparison. Returns true iff the string expressions on the left and right are identical, ignoring any differences in the case of the letters A-Z. Note that non-ASCII characters will be matched case-sensitively, i.e. "a" ~= "A" is true, but "Ä" ~= "ä" is false.
<
Case-sensitive lexicographical comparison. Returns true iff the string expression on the left comes lexicographically before the string expression on the right.
>
Case-sensitive lexicographical comparison. Returns true iff the string expression on the left comes lexicographically after the string expression on the right.
<=
Case-sensitive lexicographical comparison. Returns false iff the string expression on the left comes lexicographically after the string expression on the right.
>=
Case-sensitive lexicographical comparison. Returns false iff the string expression on the left comes lexicographically before the string expression on the right.
!=
Case-sensitive inequality comparison. Returns false iff the string expressions on the left and right are identical.
$
String concatenation. "abc" $ "def" evaluates to the value "abcdef" at runtime. Note that the UnrealScript compiler will not optimize this kind of expression. This operator will "coerce" (i.e. automatically typecast) its operands to type string.
@
Spaced string concatenation. The expression a @ b essentially corresponds to a $ " " $ b, except that it is slightly more efficient. Still, you may want to limit its use to concatenating two non-literal expressions, as "abc " $ d is slightly more efficient than "abc" @ d. This operator will "coerce" (i.e. automatically typecast) its operands to type string.
$= (Unreal Engine 2 and later)
Combined concatenation and assignment. Left operand must be an assignable expression, i.e. a variable, struct member or array element. This operator will "coerce" (i.e. automatically typecast) the right operand to type string.
@= (Unreal Engine 2 and later)
Combined spaced concatenation and assignment. Left operand must be an assignable expression, i.e. a variable, struct member or array element. As with the @ operator, foo $= " bar" is slightly more efficient than foo @= "bar". This operator will "coerce" (i.e. automatically typecast) the right operand to type string.
-= (Unreal Engine 2 and later)
Substring removal. Removes from the variable, struct member or array element on the left side every substring that case-sensitively matches the string expression on the right side. For example, foo -= "an" turns the string "Banana Ananas" into "Ba Anas". This operator will "coerce" (i.e. automatically typecast) the right operand to type string. Note that there is no operator - for strings. a -= b; is essentially the same as a = Repl(a, b, "", True);.
=
The built-in assignment operator. Assigns the string value of the right side operand to the variable, struct member or array element represented by the left operand. Note that this operator does not "coerce" any operands and, unlike the combined operators, does not return a value.
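
The operators above can be sketched in use as follows. OperatorDemo is a hypothetical function name, the logged values are illustrative only, and the compound operators in the second half assume Unreal Engine 2 or later.

```unrealscript
// Hypothetical demo function; it could be placed in any class.
function OperatorDemo()
{
    local string S;
    local int Score;

    Score = 42;

    Log("abc" == "ABC");    // False: == is case-sensitive
    Log("abc" ~= "ABC");    // True: ~= ignores ASCII case
    Log("abc" < "abd");     // True: "abc" comes lexicographically before "abd"

    S = "Score:" @ Score;   // operands are coerced to string: "Score: 42"
    S = S $ "!";            // "Score: 42!"

    // Compound operators (Unreal Engine 2 and later).
    S = "Banana Ananas";
    S -= "an";              // "Ba Anas"
    S $= "!";               // "Ba Anas!"
    S @= "Done";            // "Ba Anas! Done"
}
```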

Global string functions

The following static string functions are commonly declared in the Object class (RTNP, U1, UT, U2, U2XMP, UE2Runtime, UT2003, UT2004, UDK, UT3) and thus are available globally:

int Len(coerce string S)
Returns the number of characters in the specified string value.
int InStr(coerce string S, coerce string T)
Locates the first case-sensitive match of T in S and returns its zero-based start position. Returns -1 if T cannot be found in S.
string Mid(coerce string S, int start, optional int length)
Returns the substring of S starting at position start, including up to length characters. If length is omitted, all characters up to the end of the string are returned.
string Left(coerce string S, int i)
Returns the first i (i.e. left-most) characters from S.
string Right(coerce string S, int i)
Returns the last i (i.e. right-most) characters from S.
string Caps(coerce string S)
Returns S with all lowercase ASCII letters (a-z) converted to their uppercase forms (A-Z).
string Locs(coerce string S) (Unreal Engine 2 and later)
Returns S with all uppercase ASCII letters (A-Z) converted to their lowercase forms (a-z).
string Chr(int i)
Returns a string of length 1 containing the Unicode character with the specified character code. Note that the empty string will be returned when specifying 0 as code.
int Asc(string S)
Returns the Unicode character code of the first character in S. This function will return 0 for the empty string. Note that unlike most other global string functions, Asc does not "coerce" (auto-convert) its parameter to type string.
string Repl(coerce string Src, coerce string Match, coerce string With, optional bool bCaseSensitive) (Unreal Engine 2 and later)
Replaces every occurrence of Match in Src with With. Matching is case-insensitive by default; pass True as bCaseSensitive for case-sensitive matching.
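
As an illustration of how these functions combine, the following sketch splits a string at the first occurrence of a divider using InStr, Left, Mid and Len. SplitOnce and its parameter names are hypothetical and not part of the Object class.

```unrealscript
// Hypothetical helper: splits Src at the first case-sensitive match of Divider.
// Returns false and leaves the out parameters untouched if Divider is not found.
static final function bool SplitOnce(string Src, string Divider, out string LeftPart, out string RightPart)
{
    local int Pos;

    Pos = InStr(Src, Divider);      // zero-based position, or -1 if there is no match
    if (Pos == -1)
        return false;

    LeftPart = Left(Src, Pos);                  // everything before the divider
    RightPart = Mid(Src, Pos + Len(Divider));   // everything after it; length omitted
    return true;
}
```

Called as SplitOnce("Key=Value", "=", K, V), this would set K to "Key" and V to "Value". Omitting the length argument of Mid keeps the remainder of the string intact regardless of how long it is.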