I search for solutions in this order: Past Code, Unreal Source, Wiki, BUF, groups.yahoo, google, screaming at monitor. – RegularX

Difference between revisions of "User:Wormbo/Random useful code snippets"

From Unreal Wiki, The Unreal Engine Documentation Site
Jump to: navigation, search
m (Collapsing whitespace: why this is a good idea)
m (Binary Search: more observations and a potentially useful modification)
Line 14: Line 14:
  
 
==Binary Search==
 
==Binary Search==
The following very simply piece of code implements the [[wp:binary search algorithm|binary search algorithm]] with deferred detection of equality, looking for <code>Value</code> in array <code>A</code>. If <vode>Value</code> isn't in <code>A</code>, the code returns the index where <code>Value</code> would have to be inserted.
+
The following very simply piece of code implements the [[wp:binary search algorithm|binary search algorithm]] with deferred detection of equality, looking for ''Value'' in array ''A''. If ''Value'' isn't in ''A'', the code returns the index where ''Value'' would have to be inserted.
 
<uscript>
 
<uscript>
 
local int Low, High, Middle;
 
local int Low, High, Middle;
Line 30: Line 30:
 
As you can see, the advantage here is that each loop iteration only performs a single less-than comparison between the search value and an array element.
 
As you can see, the advantage here is that each loop iteration only performs a single less-than comparison between the search value and an array element.
  
So, why would you combine the equality and greater-than comparisons and give up early abort? Quite simple: The equality comparison would fail most of the time anyway, so it doesn't actually give that much of an advantage, considering binary search only has an over all worst case time complexity of <code>O(log<sub>2</sub>n)</code>. And, well, you only need to implement a less-than comparison as opposed to less-then and greater-than. Here "less than" could actually mean "greater than" if the array is sorted in descending order or anything else, as long as it is a [[wp:strict total order|strict total order]] on the possible values for the array's [[type]].
+
So, why would you combine the equality and greater-than comparisons and give up early abort? Quite simple: The equality comparison would fail most of the time anyway, so it doesn't actually give that much of an advantage, considering binary search only has an overall worst case time complexity of <code>O(log<sub>2</sub>n)</code>. And, well, you only need to implement a less-than comparison as opposed to less-then and greater-than. Here "less than" could actually mean "greater than" if the array is sorted in descending order or anything else, as long as it is a [[wp:strict total order|strict total order]] on the possible values for the array's [[type]].
  
 
To figure out whether you found the value or not, an additional comparison is required:
 
To figure out whether you found the value or not, an additional comparison is required:
Line 39: Line 39:
 
   // not found
 
   // not found
 
</uscript>
 
</uscript>
You can assume that <code>Low</code> always is a valid insert index less than or equal to <code>A.Length</code>, so <code>A.Insert(Low, 1); A[Low] = Value;</code> (or <code>A.InsertItem(Low, Value);</code> in [[UE3]]) will always work correctly.
+
You can assume that ''Low'' always is a valid insert index less than or equal to <code>A.Length</code>, so <code>A.Insert(Low, 1); A[Low] = Value;</code> (or <code>A.InsertItem(Low, Value);</code> in [[UE3]]) will always work correctly.
 +
 
 +
It is guaranteed that <code>Low < A.Length</code> implies <code>A[Low] >= Value</code> and <code>Low > 0</code> implies <code>A[Low-1] < Value</code>. That means if you want to remove all values less than ''Value'', you can simply do <code>A.Remove(0, Low)</code>. To remove all values greater than or equal to ''Value'', you use <code>A.Remove(Low, A.Length - Low)</code>.
 +
 
 +
To find the first index with a value greater than ''Value'', you just need a tiny modification:
 +
<uscript>
 +
local int Low, High, Middle;
 +
 
 +
Low = 0;
 +
High = A.Length;
 +
while (Low < High) {
 +
  Middle = (High + Low) / 2;
 +
  if (A[Middle] <= Value)
 +
    Low = Middle + 1;
 +
  else
 +
    High = Middle;
 +
}
 +
</uscript>
 +
This version can be used if you want to ensure that new instances of values considered as "equal" are inserted after all previous instances, whereas the other version of the algorithm finds an insert location before all other equal instances.
 +
 
 +
Again you can assume that ''Low'' always is a valid insert index less than or equal to <code>A.Length</code>. It is guaranteed that <code>Low < A.Length</code> implies <code>A[Low] > Value</code> and <code>Low > 0</code> implies <code>A[Low-1] <= Value</code>. That means if you want to remove all values less than or equal to ''Value'', you can simply do <code>A.Remove(0, Low)</code>. To remove all values greater than ''Value'', you use <code>A.Remove(Low, A.Length - Low)</code>. Note how the "or equal to" now is part of the other side of the array.
  
 
==Finding common string prefix==
 
==Finding common string prefix==

Revision as of 02:41, 2 October 2010

On this page I will post various code snippets that are too useful to keep them to myself. Some of these probably deserve their own article pages with background information, and maybe I'll create them at some point.

Linear search one-liner

The most simple way to search for a value in an array is linear search. The following implementation is (almost) a one-liner that works on all arrays, sorted or not.

local int i;
 
i = -1;
do {} until (++i == A.Length || A[i] == Value);

Success is indicated by i < A.Length, failure by i == A.Length.

If the array is sorted and has more than just a few elements, you should consider using binary search instead.

Binary Search

The following very simply piece of code implements the binary search algorithm with deferred detection of equality, looking for Value in array A. If Value isn't in A, the code returns the index where Value would have to be inserted.

local int Low, High, Middle;
 
Low = 0;
High = A.Length;
while (Low < High) {
  Middle = (High + Low) / 2;
  if (A[Middle] < Value)
    Low = Middle + 1;
  else
    High = Middle;
}

As you can see, the advantage here is that each loop iteration only performs a single less-than comparison between the search value and an array element.

So, why would you combine the equality and greater-than comparisons and give up early abort? Quite simple: The equality comparison would fail most of the time anyway, so it doesn't actually give that much of an advantage, considering binary search only has an overall worst case time complexity of O(log2n). And, well, you only need to implement a less-than comparison as opposed to less-then and greater-than. Here "less than" could actually mean "greater than" if the array is sorted in descending order or anything else, as long as it is a strict total order on the possible values for the array's type.

To figure out whether you found the value or not, an additional comparison is required:

if (Low < A.Length && A[Low] == Value)
  // found it
else
  // not found

You can assume that Low always is a valid insert index less than or equal to A.Length, so A.Insert(Low, 1); A[Low] = Value; (or A.InsertItem(Low, Value); in UE3) will always work correctly.

It is guaranteed that Low < A.Length implies A[Low] >= Value and Low > 0 implies A[Low-1] < Value. That means if you want to remove all values less than Value, you can simply do A.Remove(0, Low). To remove all values greater than or equal to Value, you use A.Remove(Low, A.Length - Low).

To find the first index with a value greater than Value, you just need a tiny modification:

local int Low, High, Middle;
 
Low = 0;
High = A.Length;
while (Low < High) {
  Middle = (High + Low) / 2;
  if (A[Middle] <= Value)
    Low = Middle + 1;
  else
    High = Middle;
}

This version can be used if you want to ensure that new instances of values considered as "equal" are inserted after all previous instances, whereas the other version of the algorithm finds an insert location before all other equal instances.

Again you can assume that Low always is a valid insert index less than or equal to A.Length. It is guaranteed that Low < A.Length implies A[Low] > Value and Low > 0 implies A[Low-1] <= Value. That means if you want to remove all values less than or equal to Value, you can simply do A.Remove(0, Low). To remove all values greater than Value, you use A.Remove(Low, A.Length - Low). Note how the "or equal to" now is part of the other side of the array.

Finding common string prefix

For incremental encoding you need to find the common prefix of two strings. Usually you'd do that by comparing corresponding characters of both strings, starting from the front, but UnrealScript is quite inefficient for that. I found that, at least in UT2004, you could use the StrCmp() function to compare string prefixes. Now you only need to find the largest prefix length at which both string starts are still equal. Technically this is a searching operation and the easy way to do it would be a linear search. But if there's a reasonably good chance the strings have large common parts, you could even abuse binary search for this. The following code snippet finds the largest common prefix length of strings A and B:

local int Low, High, Middle;
 
Low = 0;
High = Min(Len(A), Len(B));
while (Low < High) {
  Middle = (High + Low) / 2;
  if (StrCmp(A, B, Middle) == 0)
    Low = Middle + 1;
  else
    High = Middle;
}

Looks scary? I hope so. ;-)

Collapsing whitespace

This code will remove leading and trailing whitespace from a string and simultaneously collapse any sequence of space and tab characters into a single space:

static final function string CollapseAndTrimWhitespace(string S)
{
  S = " " $ Repl(S, Chr(9), " ") $ " "; // replace tabs with spaces and enclose in space chars for collapsing
  while (InStr(S, "  ") != -1) { // need whitespace collapsing?
    S = Repl(S, "  ", " ");
  }
  return Mid(S, 1, Len(S) - 2); // remove enclosing space chars
}

I know, InStr() and Repl() are really linear time operations, causing the algorithm to be something like O(n*logn). But as usual UnrealScript's slowness adds a high constant part to them, essentially making them constant time for reasonably short strings (up to a few hundred characters), and the whole function sort-of O(logn).