Legacy:Package File Format

The Unreal Engine uses a single file format to store all its game content. You may have seen many different file types, like .utx (textures), .unr (maps), .umx (sound) and .u (code), but from a technical standpoint there is no difference between those files; the different file endings are only used to help organize the packages in the directory structure. The following article describes the basic structure of this file format. It omits many details (such as tons of constants, for example), but there’s a good reference available on the net by Antonio Cordero Balcazar (see links).

Assumptions:

This is a rather technical article. It requires you to have a basic understanding of object-oriented programming as well as the will to use a hex editor, if needed. This is NOT intended to be a full documentation of the file format, but only a brief introduction.

The Structure of the File

Overview

Every package file can be roughly split into three logical parts: the header, the three index tables (name-table, import-table and export-table) and the data itself. Only the header has a fixed position (at offset 0); all other parts can be placed anywhere within the file without confusing the engine, because the header records where they are.

Most of the time, though, the layout looks like the following:

Header
Name-Table
Import-Table
Data
Export-Table

It may be useful to read a bit about the concept of serialization, which allows you to (rather) easily store the state of objects within a file. A brief introduction can be found on the Wiki: Package File Format/Serialization

Header

This global header can be found at the beginning of the file (offset 0). It is the starting point for every operation.

offset	Type	Property	Description
0	DWORD	Signature	Always 0x9E2A83C1; use this to verify that you are indeed reading an Unreal package
4	WORD	PackageVersion	Version of the file format; Unreal 1 mostly uses 61-63, UT 67-69. Note, however, that quite a few packages in use with UT have Unreal 1 versions. See the appendix for more details
6	WORD	LicenseMode	This is the license number. Different for each game.
8	DWORD	Package Flags	Global package flags, e.g. whether a package may be downloaded from a game server; described in the appendix
12	DWORD	Name Count	Number of entries in the name-table
16	DWORD	Name Offset	Offset of the name-table within the file
20	DWORD	Export Count	Number of entries in the export-table
24	DWORD	Export Offset	Offset of the export-table within the file
28	DWORD	Import Count	Number of entries in the import-table
32	DWORD	Import Offset	Offset of the import-table within the file
After the ImportOffset, the header differs between versions. The only interesting fact, though, is that file format versions >= 68 introduce a GUID. It can be found right after the ImportOffset:
36	16 BYTE	GUID	Unique identifier; used for package downloading from servers
Older package versions have a list of GUIDs (located through the same kind of count/offset pair as above) in a separate section rather than space for just one. Tests reveal that UT uses the last one in the list when there is more than one, but such packages do not seem to occur in the wild.
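
The header table above can be read in one step. The following is a minimal Python sketch, not a reference implementation: it assumes all fields are stored little-endian (the article does not state the byte order), and the names PackageHeader, parse_header and read_guid are made up for this example.

```python
import struct
from typing import NamedTuple, Optional

PACKAGE_SIGNATURE = 0x9E2A83C1
# DWORD, WORD, WORD, DWORD, then six DWORDs (counts and offsets).
# "<" means little-endian, which is an assumption of this sketch.
HEADER_FORMAT = "<IHHI6I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 36 bytes, matching the table


class PackageHeader(NamedTuple):
    signature: int
    package_version: int
    license_mode: int
    package_flags: int
    name_count: int
    name_offset: int
    export_count: int
    export_offset: int
    import_count: int
    import_offset: int


def parse_header(data: bytes) -> PackageHeader:
    """Parse the fixed part of the header found at offset 0."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file too short for a package header")
    header = PackageHeader(*struct.unpack_from(HEADER_FORMAT, data, 0))
    if header.signature != PACKAGE_SIGNATURE:
        raise ValueError("not an Unreal package (bad signature)")
    return header


def read_guid(data: bytes, header: PackageHeader) -> Optional[bytes]:
    """Return the 16 GUID bytes for versions >= 68, otherwise None."""
    if header.package_version >= 68 and len(data) >= HEADER_SIZE + 16:
        return data[HEADER_SIZE:HEADER_SIZE + 16]
    return None
```

Typical use would be to read the whole file into memory with open(path, "rb").read() and pass the result to parse_header. The GUID list of older versions is not handled here.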

Index Tables

The Unreal Engine introduces two new variable types. The first is a rather simple string type, called NAME from now on. The second is a bit trickier: the CompactIndex, called INDEX from now on, compresses an ordinary DWORD down to one to five BYTEs. Both types, as well as the ObjectReference, are described in the following paper: Package File Format/Data Details
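
To give an idea of how an INDEX shrinks a DWORD to one to five bytes, here is a small Python sketch of a decoder. The bit layout is an assumption taken from common descriptions of the format (it is not spelled out in this article): the first byte carries a sign bit (0x80), a "more bytes follow" bit (0x40) and six value bits; the next three bytes carry a "more" bit (0x80) and seven value bits each; a fifth byte supplies the remaining bits. The function name read_compact_index is made up for this example.

```python
from typing import Tuple


def read_compact_index(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode one INDEX starting at pos; return (value, next position)."""
    first = data[pos]
    pos += 1
    negative = bool(first & 0x80)   # assumed: top bit is the sign
    more = bool(first & 0x40)       # assumed: next bit means "continue"
    value = first & 0x3F            # low six bits of the value
    shift = 6
    # Bytes two to four: seven value bits plus a continuation bit.
    while more and shift < 27:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        more = bool(byte & 0x80)
    # Fifth byte: only five bits are needed to reach 32 bits in total.
    if more:
        value |= (data[pos] & 0x1F) << shift
        pos += 1
    return (-value if negative else value), pos
```

Small values, which are by far the most common in the tables, therefore take a single byte.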

Name-Table

The first and simplest of the three tables is the name-table. The name-table can be considered an index of all unique names used for objects and references within the file. Later on, you’ll often find indexes into this table instead of a string containing the object name.

Type	Property	Description
NAME	Object Name
DWORD	Object Flags	Flags for the object; described in the appendix
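
As a rough illustration, the name-table can be walked entry by entry once Name Count and Name Offset are known from the header. The encoding of NAME is covered on the Data Details page, not here, so this Python sketch makes a loud assumption: each name is stored as one length byte (counting the trailing zero) followed by a zero-terminated ASCII string, as commonly described for UT-era packages. Older packages may differ. The function name read_name_table is made up for this example, and little-endian byte order is assumed.

```python
import struct
from typing import List, Tuple


def read_name_table(data: bytes, offset: int, count: int) -> List[Tuple[str, int]]:
    """Return a list of (name, object_flags) pairs."""
    names = []
    pos = offset
    for _ in range(count):
        length = data[pos]                # assumed: length incl. trailing NUL
        pos += 1
        raw = data[pos:pos + length]
        pos += length
        name = raw.rstrip(b"\x00").decode("ascii", errors="replace")
        (flags,) = struct.unpack_from("<I", data, pos)  # DWORD Object Flags
        pos += 4
        names.append((name, flags))
    return names
```

The position of an entry in the returned list is the index that the other tables use to refer to that name.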

Export-Table

The export-table is an index for all objects within the package. Every object in the body of the file has a corresponding entry in this table, with information like offset within the file etc.

Type	Property	Description
INDEX	Class	Class of the object, e.g. ‘Texture’ or ‘Palette’; stored as an ObjectReference
INDEX	Super	Object Parent; again an ObjectReference
DWORD	Group	Internal package/group of the object, e.g. ‘Floor’ for floor-textures; ObjectReference
INDEX	Object Name	The name of the object; an index into the name-table
DWORD	Object Flags	Flags for the object; described in the appendix
INDEX	Serial Size	Total size of the object
INDEX	Serial Offset	Offset of the object; this field only exists if the SerialSize is larger than 0
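
The one trap in this table is the conditional Serial Offset field. The Python sketch below reads a single export entry under stated assumptions: little-endian byte order, the Group DWORD treated as signed (since it is an ObjectReference), and the commonly described compact-index layout (sign bit 0x80 and continuation bit 0x40 in the first byte, continuation bit 0x80 in the following ones). Reader, ExportEntry and read_export are names invented for this example.

```python
import struct
from typing import NamedTuple, Optional


class Reader:
    """Tiny cursor over a bytes object (helper invented for this sketch)."""

    def __init__(self, data: bytes, pos: int = 0) -> None:
        self.data = data
        self.pos = pos

    def u8(self) -> int:
        value = self.data[self.pos]
        self.pos += 1
        return value

    def u32(self) -> int:
        (value,) = struct.unpack_from("<I", self.data, self.pos)
        self.pos += 4
        return value

    def i32(self) -> int:
        (value,) = struct.unpack_from("<i", self.data, self.pos)
        self.pos += 4
        return value

    def index(self) -> int:
        # Compact index; the bit layout is an assumption of this sketch.
        byte = self.u8()
        negative = bool(byte & 0x80)
        more = bool(byte & 0x40)
        value = byte & 0x3F
        shift = 6
        while more and shift < 27:
            byte = self.u8()
            value |= (byte & 0x7F) << shift
            shift += 7
            more = bool(byte & 0x80)
        if more:
            value |= (self.u8() & 0x1F) << shift
        return -value if negative else value


class ExportEntry(NamedTuple):
    class_ref: int
    super_ref: int
    group_ref: int
    name_index: int
    object_flags: int
    serial_size: int
    serial_offset: Optional[int]


def read_export(r: Reader) -> ExportEntry:
    """Read one export-table entry in the field order of the table above."""
    class_ref = r.index()
    super_ref = r.index()
    group_ref = r.i32()
    name_index = r.index()
    object_flags = r.u32()
    serial_size = r.index()
    # Serial Offset is only present when the object has data.
    serial_offset = r.index() if serial_size > 0 else None
    return ExportEntry(class_ref, super_ref, group_ref, name_index,
                       object_flags, serial_size, serial_offset)
```

Calling read_export in a loop, Export Count times, starting from a Reader positioned at Export Offset, would yield the whole table.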

Import-Table

The third table holds references to objects in external packages. For example, a texture might have a DetailTexture (which provides the fine surface structure you see when you look at a texture from very close up). These DetailTextures are all stored in a single package, as they are used by many different textures in different package files. The property of the texture object then only needs to store an index into the import-table, as the entry in the import-table already points to the DetailTexture in the other package.

Type	Property	Description
INDEX	Class Package	Package file in which the class of the object is defined; an index into the name-table
INDEX	Class Name	Class of the object, i.e. ‘Texture’, ‘Palette’, ‘Package’, etc; an index into the name-table
DWORD	Package	Reference where the object resides; ObjectReference
INDEX	Object Name	The name of the object; an index into the name-table
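
How an ObjectReference picks between the export-table and the import-table is explained on the Data Details page. As a hedged Python sketch, assuming the usual convention (zero means "no object", a positive value n points at export entry n-1, a negative value -n points at import entry n-1), a resolver could look like this. The function name resolve_reference is made up for this example.

```python
from typing import Optional, Tuple


def resolve_reference(ref: int, export_count: int,
                      import_count: int) -> Optional[Tuple[str, int]]:
    """Map an ObjectReference to (table name, zero-based entry index)."""
    if ref == 0:
        return None                       # assumed: zero is the null reference
    if ref > 0:
        table, index, limit = "export", ref - 1, export_count
    else:
        table, index, limit = "import", -ref - 1, import_count
    if index >= limit:
        raise IndexError("reference %d points outside the %s-table" % (ref, table))
    return table, index
```

Under this convention, the DetailTexture property from the example above would hold a negative value, since the detail texture lives in another package.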

Body/Object

Each object consists of a list of properties at the beginning and the actual object itself.

Object Properties

When jumping to the offset of an object, you'll first be confronted with the object properties before the actual object starts. The format is rather straightforward. The first field is an INDEX-type reference into the Name-Table, giving you the property's name. The byte after it (the info byte) does the magic of telling you what kind of data follows; for example 0x02 flags a DWORD sized integer type. Then comes the actual property data. The procedure repeats itself until the reference into the Name-Table returns 'None' (case insensitive) as the name.

That said, there are some bit-tricks to deal with arrays, booleans and such. For more info on these, as well as a full list of info-bytes, read Antonio's package docs.
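
The bit-tricks mostly live in the info byte. This article does not list its layout, so the following Python sketch is an assumption that should be checked against Antonio's docs: the low four bits are taken as the type code, bits 4 to 6 as a size code, and the top bit as the array flag (which is also said to double as the value of a boolean property). The function name split_info_byte is made up for this example.

```python
from typing import Tuple


def split_info_byte(info: int) -> Tuple[int, int, bool]:
    """Split a property info byte into (type code, size code, top-bit flag)."""
    type_code = info & 0x0F          # assumed: e.g. 2 for an integer property
    size_code = (info >> 4) & 0x07   # assumed: selects how long the data is
    top_bit = bool(info & 0x80)      # assumed: array flag / boolean value
    return type_code, size_code, top_bit
```

A property reader would call this once per property, right after reading the name index, and then use the type and size codes to decide how many bytes of property data to consume.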

Sample Objects (Texture Class)

After the properties are finished the object starts. It basically consists of a predefined set of properties. As an example, the texture class (for good old UT) will be explained below. The texture class is a native one, which means that it doesn't have a generic header in addition to its own data. The layout looks like this:

Type	Property	Description
BYTE	MipMapCount	Count of MipMaps in object

The next set of variables repeats itself for each MipMap.

Type	Property	Description
DWORD	WidthOffset	Offset in file; should be the same as SerialOffset in the Export-Table. Only if PkgVer >= 63
INDEX	MipMapSize	Size of the image data (in bytes)
n BYTEs	MipMapData	Image data; one byte per pixel; n = MipMapSize
DWORD	Width	Texture-width
DWORD	Height	Texture-height
BYTE	BitsWidth	Number of bits of Width (e.g. 10 for 1024 pixels)
BYTE	BitsHeight	Number of bits of Height (e.g. 10 for 1024 pixels)
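
The fields of one MipMap are redundant enough to sanity-check a parser. The Python sketch below uses only what the table states (one byte per pixel, and BitsWidth/BitsHeight being the bit counts of the dimensions, e.g. 10 for 1024); it additionally assumes power-of-two dimensions. The function name mipmap_problems is made up for this example.

```python
from typing import List


def mipmap_problems(width: int, height: int, bits_width: int,
                    bits_height: int, mip_data: bytes) -> List[str]:
    """Return a list of inconsistencies found in one parsed MipMap record."""
    problems = []
    # One byte per pixel, so the data size must equal width * height.
    if len(mip_data) != width * height:
        problems.append("MipMapSize does not match Width * Height")
    # 1 << 10 == 1024, matching the example in the table.
    if width != 1 << bits_width:
        problems.append("BitsWidth does not match Width")
    if height != 1 << bits_height:
        problems.append("BitsHeight does not match Height")
    return problems
```

An empty list means the record is self-consistent; anything else usually indicates that the reader went out of step earlier in the object.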

Appendix

A. Links

http://www.acordero.org/: _The_ resource regarding package files. A very detailed reference of the package format, the UT-Package-Tool and a Delphi unit can be found there.
http://ut-files.com/index.php?dir=Utilities/&file=utcms_source.zip: A C++ class for reading packages. Totally free for use. [Link updated with new location]

B. Notes

The last part about the object properties and the texture class was written in a hurry. I'm sorry it took so long for me to finish that piece.

The fileformat itself, btw, has not changed between the versions of UT (except the odd new property and such). Many of the objects however have changed a lot or were replaced by enhanced types (such as my beloved texture class...).

Comments/Discussion

Jesco: I will continue after here tomorrow. Now it's time for some sleep :)

Mychaeel: Good start. :-) Have a look at UMOD/File Format too if you haven't already. A common thing like the compact index format could move to a shared page, for instance.

Jesco: Ah, I haven't noticed that, yet. Saves me the hassle to explain the compact index ;) Where should the page for the compact index be put to? I suggest making it either a subpage of UMOD/File Format or Package File Format.

Mychaeel: Putting it on a subpage of Package File Format sounds more obvious to me.

Tarquin: Other pages to grab material from / link to / etc:

Package
Package redirects to the above.
UT Package Tool (just a link to a site)

Jesco: Ok, I'll work on it later today when I come back from university. Maybe I should also mail Antonio and ask him if I could post a copy of his reference docs for all those thousands of different objects that I don't have a clue of ;)

Jesco: I haven't forgot about this article, it just went down my priority list, unfortunately.

Diki: Hey Jesco, I dont suppose you seriously havent forgotten about this article. Im trying to find more info about this topic!

RmzVoid: Where I can get codes of Object Properties types?

Diablo: @ anyone who wants to dig deeper inside unreal file format structures: take a look at this project:

http://sourceforge.net/projects/ushock/

@3DBuzz: Can someone Please upload the "UTCMS_source.zip file" to another server? The current link is dead yet interest in reading the packages is still there.

Tarquin: Alternatively, someone could paste the code into a subpage here.

Plugwash: I wan't to make a tool that makes some changes to some of the tables without changing the bulk of the file. Is there any reason not to put the tables at the end of the file after everything else (yes i realise leaving the old tables in means a bit of bloat but it shouldn't be too significant)?

Xian: Well as far as I can remember, the order is: 1. Headers, 2. Linkers, 3. NameTable (+index where it begins to be used), 4. Compiled Code, 5. Decompiled Code (aka Core.TextBuffer). Although a completely rewritten file parser would be able to read it with the NT at the end, I don't see the point. The code uses pointers to each NT element. It is way more logical to say "Name <Pawn> has pointer 4F6G" and later make a reference to it in the Compiled Code, rather than for the code to memorize all used pointers and then parse the end of the file. I'd say the logic here is the same as compiling from end to beginning (if we'd compile from beginning to end, stuff like x = x + 2 or x += 2 wouldn't work, without pre-parsing, I guess). You might be able to add new elements to the table on the condition you change the index they get used at, you include serialization (to not get a serialization error) and modify the NT size and namespace used by linkers (also used by serialization I think). Excuse my raw descriptions, but it should be pretty accurate :)

Wormbo: The locations of the name, import and export tables are specified in the file header, and the locations of other objects in the file are specified in the export table. Where those tables or objects actually are in the file or in what order they appear is irrelevant, as long as everything is in the location mentioned by the header or export table.

Xian: True. The linker descriptions specify each linker size and its offset (i.e. names, exports and imports), setting classes within the package as exports and used classes of other packages as imports. I do like the way the current order of file contents is done, since it's pretty logical (unlike a random placement), and I guess you could shift them back and forth, but it would be readable only by your tool, so I don't see much of a point.

Side note: thinking of inserting names, I am curious how the Engine would react to finding a name that is never used (although in theory it should be ignored). There is one way to convert a string to a name, but the rule is that the name should exist in the nametable.

Anyway, back on topic, what changes did you have in mind, Plugwash ?

Plugwash: If I understand the format's intentions correctly, I don't believe it will matter where the table is in the file, but obviously opinion is split here, so trial and error is going to be the only way to find out ;). I want a string replacer mainly for use in dealing with conflicting packages (two packages with the same name but different contents). To some extent it's possible to change strings in place by hand (I've done it before, see the workaround I posted for the credits version mismatch issue on the UT troubleshooting page), but this limits you to replacing them with another string of the same length. On the other hand I really, really don't want to go to the trouble of writing a full package deserialiser and reserialiser.

Plugwash: Yep, UT doesn't seem to care if the names section is at the end. I'm just trying to clarify the situation with regard to GUIDs now ;).

BigBadaBoom: Can anyone direct me to a documentation of the ArrayProperty? I've figured out most stuff I need myself but arrays are still a total puzzle to me. :(

Dimension4: Export table structure is invalid.

Dimension4: Making a Reserializer is quite hard cuz you have to change many offsets:

ImportTable offset

ExportTable offset

All offsets in ExportTable

You got the picture :eek:
