Question about .drm's internal structure #1

Open
KillerBeer01 opened this issue Jul 21, 2021 · 3 comments

@KillerBeer01

Hi,

I'd like to know more about the layout of the data stored in .drm files from DXHR (more specifically, in DTPData sections). I managed to adapt your code to produce what I hoped would be more meaningful dumps of their content, but I still can't figure out how the data works. I suppose the key lies in the Resolvers table that each section has, but when I try to analyze those tables, they just don't make any sense to me. I can't see any data type or length markers, any consistent scheme in how sections refer to each other, or any "starting point" from which a chain to the data I need could be traced. The only rock-solid fact I have established is that every PointerOffset address refers to an EBBEEBBE byte sequence, and that's it. There are data segments not addressed by any DataOffset, there are DataOffset addresses pointing beyond the end of the MemoryStream data field, there are EBBEEBBEs not addressed by any PointerOffset or addressed by a DataOffset, and lots of other things I can't make sense of. Whatever shadow of a pattern I seem to establish by looking at the data in several .drm's is promptly debunked by a different .drm where the same pattern fails against all expectations.

I'm specifically interested in dialogue-oriented information: data fields containing audio file names and line numbers in the locals.bin file. I believe that DTPData sections with flag 0x54 are hubs for the sections that contain the real data, and in the .drm's I explored I could "visually" trace the connectivity, just not strictly enough to build a parser on it.

If you have any clues (or know somebody who does) about ways to extract meaningful data from .drm's, I'd be extremely grateful.

Thank you!

[Attached images: dtpdata, localsbindata, con02haasdump, con02haasgraph]

@gibbed
Owner

gibbed commented Jul 21, 2021

Sections in DRMs are flat, arbitrary structures loaded into memory. Wherever a structure contains a pointer, the file only reserves enough space for it; the actual pointer gets resolved (overwritten) when the section is loaded at runtime. That is why those bytes are EB BE EB BE by default (in some cases). Each section has a table of resolvers, and each resolver points either to a section in the current (local) file or to a remote section in another file.

Unfortunately the DRM format doesn't really have any indication of what type of structure any given section is. The game knows this based on what it's loading.

@KillerBeer01
Author

> pointers are enough space for actual pointers that get resolved (overwritten) when the section is loaded at runtime
Do you mean that the same data segments addressed by DataOffsets are each resolved into runtime memory at the PointerOffsets (so that the existence of these pointers provides no additional information for analysis), or do those pointers link to data segments from other resolvers/sections?

> The game knows this based on what it's loading.
This much I figured. But while this logic holds for simple structures like email_database.drm, for something as complex as dialogues it's not enough to know what to load; the game also has to know where to load it from, and that information must be stored somewhere. It would be logical to store it in the .drm's themselves and not in the program code, and that's why I'm trying to look for "cornerstones" from which paths to the necessary info could be navigated, possibly using custom rules once they are understood.

Have there been any insights at all into the fields marked "Unknown" since the release of your code, BTW?

Thanks again for your work. I imagine that analyzing it all from scratch must have been a hell of an effort.

@gibbed
Owner

gibbed commented Jul 22, 2021

Consider something like this:

struct foo
{
  int bar;
  baz* qux;
  int quux;
  quux* corge;
};

This would be stored flat in the .drm as a segment. The pointers would have nonsense values (EB BE EB BE) in the actual data, and there would be a list giving the offset of each pointer in the struct and how to resolve it. So, with 4-byte ints and pointers, there would be entries for offsets 4 and 12 into this data, each with local or remote resolver information.

00 00 00 00 EB BE EB BE 00 00 00 00 EB BE EB BE
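
As a rough illustration of the layout above, here is a minimal Python sketch that builds the same 16-byte image and checks that the listed pointer offsets land on the placeholder bytes. It assumes 4-byte little-endian ints and 4-byte pointers (the endianness is an assumption; the field sizes follow from offsets 4 and 12). The names `PLACEHOLDER`, `build_foo_segment`, and `placeholder_offsets` are hypothetical and not part of any existing tool.

```python
import struct

# Placeholder bytes that occupy each unresolved pointer slot.
PLACEHOLDER = bytes.fromhex("EBBEEBBE")

def build_foo_segment(bar=0, quux=0):
    """Flat image of `foo`: bar, qux (pointer), quux, corge (pointer)."""
    return (struct.pack("<i", bar) + PLACEHOLDER +
            struct.pack("<i", quux) + PLACEHOLDER)

def placeholder_offsets(segment):
    """Return 4-byte-aligned offsets whose bytes equal the placeholder.

    This is only a heuristic scan; the resolver table, not the
    placeholder bytes, is what actually identifies pointer slots.
    """
    return [off for off in range(0, len(segment) - 3, 4)
            if segment[off:off + 4] == PLACEHOLDER]

segment = build_foo_segment()
pointer_offsets = [4, 12]  # what the resolver list would record for `foo`

print(segment.hex(" ").upper())
print(placeholder_offsets(segment) == pointer_offsets)
```

Scanning for the placeholder is useful only as a sanity check, since the bytes are EB BE EB BE only "in some cases" and the same pattern can appear where no resolver points.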

There are some comments in the code that reads the resolvers.

Local resolver:

// ((value & 0xFFFFFFFF00000000) >> 32) = pointer offset
// ((value & 0x00000000FFFFFFFF) >> 0) = data offset
// buffer[pointer] = &buffer[data]

Remote resolver:

// ((value & 0x0000000000003FFF) >> 0) = section index
// ((value & 0x0000003FFFFFC000) >> 14) * 4 = pointer offset
// ((value & 0xFFFFFFC000000000) >> 38) = data offset
// buffer[pointer] = &sections[index].buffer[data]

The resolver types in my code extract the necessary parts of the resolver bitflags, so that's handled for you.
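
For anyone working outside that code, the bit layouts in the comments above can be sketched in Python as follows. This is only an illustration under stated assumptions: each resolver is taken to be an unsigned 64-bit integer already read from the file, pointers are assumed to be 4 bytes and little-endian, and `base_address` is a hypothetical stand-in for wherever the target buffer ends up in memory. The function names are invented for this example.

```python
import struct

def decode_local(value):
    """Split a local resolver into (pointer_offset, data_offset)."""
    pointer_offset = (value & 0xFFFFFFFF00000000) >> 32
    data_offset = value & 0x00000000FFFFFFFF
    return pointer_offset, data_offset

def decode_remote(value):
    """Split a remote resolver into (section_index, pointer_offset, data_offset)."""
    section_index = value & 0x0000000000003FFF
    pointer_offset = ((value & 0x0000003FFFFFC000) >> 14) * 4
    data_offset = (value & 0xFFFFFFC000000000) >> 38
    return section_index, pointer_offset, data_offset

def apply_local(buffer, value, base_address):
    """Emulate buffer[pointer] = &buffer[data] with a fake base address.

    Assumes a 4-byte little-endian pointer slot (an assumption).
    """
    pointer_offset, data_offset = decode_local(value)
    struct.pack_into("<I", buffer, pointer_offset, base_address + data_offset)

# Usage: patch the pointer slot at offset 4 so it "points" at offset 8.
buf = bytearray(bytes.fromhex("00000000EBBEEBBE00000000EBBEEBBE"))
apply_local(buf, (4 << 32) | 8, base_address=0x1000)
print(hex(struct.unpack_from("<I", buf, 4)[0]))
```

A remote resolver would be applied the same way, except that the address comes from the buffer of `sections[index]` rather than from the current section.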
