I search for solutions in this order: Past Code, Unreal Source, Wiki, BUF, groups.yahoo, google, screaming at monitor. – RegularX

Legacy:CRC32

From Unreal Wiki, The Unreal Engine Documentation Site
Jump to: navigation, search

CRC32 (Wikipedia:Cyclic Redundancy Check) is an algorithm that digests data into a 32-bit checksum. Small changes of the digested data yield large changes of the produced checksum. It is (fundamentally) impossible to retrieve the original data from the calculated checksum.

Checksums have a lot of uses. The implementation below is used in a Jailbreak class that transmits statistics data to a central stats server. The client knows the secret password necessary to authenticate itself with the server; but to keep that password from being sent over the Internet in clear text, the client calculates a checksum over the transmitted stats data and the password. The server, knowing the password as well, can then validate the data and the client's authenticity by calculating the same checksum itself and comparing it to the given one.

(A more sophisticated algorithm would have the server send the client a challenge first that's also digested by the checksum algorithm. That way the server could be sure that the client really is aware of the password and not just copying a prior overheard request.)

Implementation

The following implementation in UnrealScript has been created by Mychaeel for the Jailbreak mod. Feel free to use and modify.

var private int CrcTable[256];
 
 
// ============================================================================
// CrcInit
//
// Initializes CrcTable and prepares it for use with Crc.
// ============================================================================
 
final function CrcInit() {
 
  const CrcPolynomial = 0xedb88320;
 
  local int CrcValue;
  local int IndexBit;
  local int IndexEntry;
 
  for (IndexEntry = 0; IndexEntry < 256; IndexEntry++) {
    CrcValue = IndexEntry;
 
    for (IndexBit = 8; IndexBit > 0; IndexBit--)
      if ((CrcValue & 1) != 0)
        CrcValue = (CrcValue >>> 1) ^ CrcPolynomial;
      else
        CrcValue = CrcValue >>> 1;
 
    CrcTable[IndexEntry] = CrcValue;
    }
  }
 
 
// ============================================================================
// Crc
//
// Calculates and returns a checksum of the given string. Call CrcInit before.
// ============================================================================
 
final function int Crc(coerce string Text) {
 
  local int CrcValue;
  local int IndexChar;
 
  CrcValue = 0xffffffff;
 
  for (IndexChar = 0; IndexChar < Len(Text); IndexChar++)
    CrcValue = (CrcValue >>> 8) ^ CrcTable[Asc(Mid(Text, IndexChar, 1)) ^ (CrcValue & 0xff)];
 
  return CrcValue;
  }

Discussion

Wormbo: Looks like Asc()'s unicode-support breaks the Crc() function when passing in strings that contain certain characters in UT2004: E.g. when passing in an "ä" as the parameter of a commandlet from a DOS prompt Asc() returns 65508 and 65508 ^ (something & 0xff) will always be greater than 255, which is the CrcTable's maximum possible index.

A potential fix could be:

final function int Crc(coerce string Text)
{
  local int CrcValue;
  local int IndexChar;
  local int StrLen;
 
  CrcValue = 0xffffffff;
  StrLen = Len(Text);
  for (IndexChar = 0; IndexChar < StrLen; IndexChar++)
    CrcValue = (CrcValue >>> 8) ^ CrcTable[(Asc(Mid(Text, IndexChar, 1)) ^ CrcValue) & 0xff];
 
  return CrcValue;
}

This also improves performance for large strings by by calling Len() only once.

Jimboh: Hmm, what if I want to find the crc of a stream of data?

final function int Crc(coerce string Text, int oldCrc, bool useOldCrc)
{
    local int CrcValue;
    local int IndexChar;
    local int StrLen;
 
    CrcValue = 0xffffffff;
    if (useOldCrc)
        CrcValue = oldCrc;
 
    StrLen = Len(Text);
    for (IndexChar = 0; IndexChar < StrLen; IndexChar++)
        CrcValue = (CrcValue >>> 8) ^ CrcTable[(Asc(Mid(Text, IndexChar, 1)) ^ CrcValue) & 0xff];
 
    return CrcValue;
}

Also, is there a reason why this implementation uses a string instead of a byte array? Are byte arrays slower to read or something?

Wormbo: For small amounts of data it's not a problem. For larger data, however, strings are much slower than byte arrays. If you already have your data in byte array form, by all means calculate the CRC directly from it. Whether you convert a string to byte array first or use the string version of the CRC function directly doesn't make a difference, though.

Related Links