User:Wormbo/Random useful code snippets: Difference between revisions
simple linear search |
→Binary Search: abusing it for finding longest common string prefix |
||
Line 40: | Line 40: | ||
</uscript> | </uscript> | ||
You can assume that <code>Low</code> always is a valid insert index less than or equal to <code>A.Length</code>, so <code>A.Insert(Low, 1); A[Low] = Value;</code> (or <code>A.InsertItem(Low, Value);</code> in [[UE3]]) will always work correctly. | You can assume that <code>Low</code> always is a valid insert index less than or equal to <code>A.Length</code>, so <code>A.Insert(Low, 1); A[Low] = Value;</code> (or <code>A.InsertItem(Low, Value);</code> in [[UE3]]) will always work correctly. | ||
==Finding common string prefix== | |||
For [[wp:incremental encoding|incremental encoding]] you need to find the common prefix of two strings. Usually you'd do that by comparing corresponding characters of both strings, starting from the front, but UnrealScript is quite inefficient for that. I found that, at least in UT2004, you could use the StrCmp() function to compare string prefixes. Now you only need to find the largest prefix length at which both string starts are still equal. Technically this is a searching operation and the easy way to do it would be a linear search. But if there's a reasonably good chance the strings have large common parts, you could even abuse binary search for this. The following code snippet finds the largest common prefix length of strings A and B: | |||
<uscript> | |||
local int Low, High, Middle; | |||
Low = 0; | |||
High = Min(Len(A), Len(B)); | |||
while (Low < High) { | |||
Middle = (High + Low) / 2; | |||
if (StrCmp(A, B, Middle) == 0) | |||
Low = Middle + 1; | |||
else | |||
High = Middle; | |||
} | |||
</uscript> | |||
Looks scary? I hope so. ;-) | |||
==Collapsing whitespace== | |||
This code will remove leading and trailing whitespace from a string and simultaneously collapse any sequence of space and tab characters into a single space: | |||
<uscript> | |||
static final function string CollapseAndTrimWhitespace(string S) | |||
{ | |||
S = " " $ Repl(S, Chr(9), " ") $ " "; // replace tabs with spaces and enclose in space chars for collapsing | |||
if (InStr(S, " ") != -1) { // need whitespace collapsing? | |||
do { | |||
S = Repl(S, " ", " "); | |||
} until (InStr(S, " ") == -1); | |||
} | |||
return Mid(S, 1, Len(S) - 2); // remove enclosing space chars | |||
} | |||
</uscript> | |||
{{DEFAULTSORT:{{SUBPAGENAME}}}} | {{DEFAULTSORT:{{SUBPAGENAME}}}} | ||
[[Category:UnrealScript utils]] | [[Category:UnrealScript utils]] |
Revision as of 14:48, 5 September 2010
On this page I will post various code snippets that are too useful to keep them to myself. Some of these probably deserve their own article pages with background information, and maybe I'll create them at some point.
Linear search one-liner
The most simple way to search for a value in an array is linear search. The following implementation is (almost) a one-liner that works on all arrays, sorted or not. <uscript> local int i;
i = -1;
do {} until (++i == A.Length || A[i] == Value);
</uscript>
Success is indicated by i < A.Length
, failure by i == A.Length
.
If the array is sorted and has more than just a few elements, you should consider using binary search instead.
Binary Search
The following very simply piece of code implements the binary search algorithm with deferred detection of equality, looking for Value
in array A
. If <vode>Value isn't in A
, the code returns the index where Value
would have to be inserted.
<uscript>
local int Low, High, Middle;
Low = 0; High = A.Length; while (Low < High) {
Middle = (High + Low) / 2; if (A[Middle] < Value) Low = Middle + 1; else High = Middle;
} </uscript> As you can see, the advantage here is that each loop iteration only performs a single less-than comparison between the search value and an array element.
So, why would you combine the equality and greater-than comparisons and give up early abort? Quite simple: The equality comparison would fail most of the time anyway, so it doesn't actually give that much of an advantage, considering binary search only has an over all worst case time complexity of O(log2n)
. And, well, you only need to implement a less-than comparison as opposed to less-then and greater-than. Here "less than" could actually mean "greater than" if the array is sorted in descending order or anything else, as long as it is a strict total order on the possible values for the array's type.
To figure out whether you found the value or not, an additional comparison is required: <uscript> if (Low < A.Length && A[Low] == Value)
// found it
else
// not found
</uscript>
You can assume that Low
always is a valid insert index less than or equal to A.Length
, so A.Insert(Low, 1); A[Low] = Value;
(or A.InsertItem(Low, Value);
in UE3) will always work correctly.
Finding common string prefix
For incremental encoding you need to find the common prefix of two strings. Usually you'd do that by comparing corresponding characters of both strings, starting from the front, but UnrealScript is quite inefficient for that. I found that, at least in UT2004, you could use the StrCmp() function to compare string prefixes. Now you only need to find the largest prefix length at which both string starts are still equal. Technically this is a searching operation and the easy way to do it would be a linear search. But if there's a reasonably good chance the strings have large common parts, you could even abuse binary search for this. The following code snippet finds the largest common prefix length of strings A and B: <uscript> local int Low, High, Middle;
Low = 0; High = Min(Len(A), Len(B)); while (Low < High) {
Middle = (High + Low) / 2; if (StrCmp(A, B, Middle) == 0) Low = Middle + 1; else High = Middle;
} </uscript> Looks scary? I hope so. ;-)
Collapsing whitespace
This code will remove leading and trailing whitespace from a string and simultaneously collapse any sequence of space and tab characters into a single space: <uscript> static final function string CollapseAndTrimWhitespace(string S) {
S = " " $ Repl(S, Chr(9), " ") $ " "; // replace tabs with spaces and enclose in space chars for collapsing if (InStr(S, " ") != -1) { // need whitespace collapsing? do { S = Repl(S, " ", " "); } until (InStr(S, " ") == -1); } return Mid(S, 1, Len(S) - 2); // remove enclosing space chars
} </uscript>