Building a .NET Disassembler (Part 2) – Reading Virtual Directories and Sections

At the end of Part 1 of this series, we had read in the DOS, COFF and PE headers. In this part, we are going to read in three more key pieces of data: the Virtual Directories, the Sections, and the CLR Header.

Reading the Virtual Directories

Immediately following the PE header, there is a set of Virtual Directory headers. The number of directories was specified in the PE Header in the ‘DirectoryLength’ (also known as ‘NumberOfRvaAndSizes’) property. Typically the value is 16, because there are 16 defined virtual directories. Interestingly, ILDasm only dumps the names of 15 of them when you use the /ALL switch, so I am just calling the last one “Reserved”:

    public enum DataDirectoryName
    {
        Export,
        Import,
        Resource,
        Exception,
        Security,
        Relocation,
        Debug,
        Copyright,
        GlobalPtr,
        ThreadLocalStorage,
        LoadConfig,
        BoundImport,
        ImportAddressTable,
        DelayLoadImportAddressTable,
        CLRHeader,
        Reserved
    }

Each of these virtual directory headers is 8 bytes (a 32-bit address followed by a 32-bit size), defined as:

    public class DataDirectory
    {
        public uint Address { get; set; }
        public uint Size { get; set; }
    }

Reading them in is pretty straightforward:

    private IList<DataDirectory> ReadDirectoriesList(uint directoryCount)
    {
        var result = new List<DataDirectory>((int)directoryCount);
        for(var i = 0; i < directoryCount; i++)
        {
            result.Add(new DataDirectory
                           {
                               Address = _assemblyReader.ReadUInt32(),
                               Size = _assemblyReader.ReadUInt32()
                           });
        }
        return result;
    }
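
Because the entries sit in the file in the same order as the DataDirectoryName enum, the enum value can double as an index into the list. Here is a minimal sketch of that lookup. The DirectoryLookup class and its Find method are names I made up for illustration, and the enum and DataDirectory class are repeated only so the snippet compiles on its own. It returns null when the file declares fewer directories than the enum names, or when the entry is empty (its size is zero).

```csharp
using System.Collections.Generic;

public enum DataDirectoryName
{
    Export, Import, Resource, Exception, Security, Relocation, Debug, Copyright,
    GlobalPtr, ThreadLocalStorage, LoadConfig, BoundImport, ImportAddressTable,
    DelayLoadImportAddressTable, CLRHeader, Reserved
}

public class DataDirectory
{
    public uint Address { get; set; }
    public uint Size { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class DirectoryLookup
{
    public static DataDirectory Find(IList<DataDirectory> directories, DataDirectoryName name)
    {
        // The enum's numeric value is the position of the entry in the file.
        var index = (int)name;
        if (index >= directories.Count)
            return null;

        // A size of zero means the directory is not present in this file.
        var entry = directories[index];
        return entry.Size == 0 ? null : entry;
    }
}
```

So `DirectoryLookup.Find(directories, DataDirectoryName.CLRHeader)` would hand back the entry we care about most for a .NET assembly, or null if the file is not a managed one.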

In a moment, we will cover reading the bytes of each virtual directory entry, but first, we need to load the Sections.

Reading the Sections

Sections are named blocks within the file. Each section has a text name that is at most 8 characters. The name is stored in an 8-byte field padded with null characters, so a name that uses all 8 characters has no terminator. By convention, the first character is a period (‘.’).

The list of section headers immediately follows the list of Virtual Directory headers. Each section header has the definition:

    public class Section
    {
        public string Name { get; set; }
        public uint VirtualSize { get; set; }
        public uint VirtualAddress { get; set; }
        public uint SizeOfRawData { get; set; }
        public uint PointerToRawData { get; set; }
        public uint PointerToRelocations { get; set; }
        public uint PointerToLinenumbers { get; set; }
        public ushort NumberOfRelocations { get; set; }
        public ushort NumberOfLinenumbers { get; set; }
        public uint Characteristics { get; set; }
    }

The number of sections is defined in the COFF header in the “NumberOfSections” property. The only out-of-the-ordinary field to read is the “Name”, because it is stored in the file as a fixed 8-byte field padded with null characters. .NET doesn’t normally read strings that way (BinaryReader.ReadString expects a length prefix), so we need to be a little tricky. I snagged some code off StackOverflow to deal with it, without resorting to any unsafe code:

        private IList<Section> ReadSectionsList(int numberOfSections)
        {
            var result = new List<Section>(numberOfSections);
            for (var i = 0; i < numberOfSections; i++)
            {
                result.Add(new Section
                               {
                                   Name = ReadNullTermString(_assemblyReader, 8),
                                   VirtualSize = _assemblyReader.ReadUInt32(),
                                   VirtualAddress = _assemblyReader.ReadUInt32(),
                                   SizeOfRawData = _assemblyReader.ReadUInt32(),
                                   PointerToRawData = _assemblyReader.ReadUInt32(),
                                   PointerToRelocations = _assemblyReader.ReadUInt32(),
                                   PointerToLinenumbers = _assemblyReader.ReadUInt32(),
                                   NumberOfRelocations = _assemblyReader.ReadUInt16(),
                                   NumberOfLinenumbers = _assemblyReader.ReadUInt16(),
                                   Characteristics = _assemblyReader.ReadUInt32()
                               });
            }
            return result;
        }

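        // Requires: using System; using System.Text;
        private static string ReadNullTermString(BinaryReader reader, int readLength)
        {
            // read the whole fixed-width field as bytes, so the reader always advances by readLength.
            var bytes = reader.ReadBytes(readLength);
            // keep everything before the first null byte (or the whole field if there is none).
            var length = Array.IndexOf(bytes, (byte)0);
            if (length < 0) length = bytes.Length;
            return Encoding.UTF8.GetString(bytes, 0, length);
        }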
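
To see the fixed-width behaviour in isolation, here is a small sketch that feeds two hand-made name fields through a BinaryReader over a MemoryStream. SectionNameDemo, ReadFixedName and Run are hypothetical names of mine. Decoding raw bytes rather than chars is my assumption about the safest way to keep the reader aligned on the next field.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo class, not part of the original code.
public static class SectionNameDemo
{
    // Reads exactly `width` bytes and returns the text before the first null byte.
    public static string ReadFixedName(BinaryReader reader, int width)
    {
        var raw = reader.ReadBytes(width);
        var end = Array.IndexOf(raw, (byte)0);
        if (end < 0) end = raw.Length; // no padding: the name fills the whole field
        return Encoding.UTF8.GetString(raw, 0, end);
    }

    public static string[] Run()
    {
        // Two 8-byte name fields back to back: ".text" padded with nulls,
        // then ".textbss", which uses all 8 bytes and has no terminator.
        var data = new byte[16];
        Encoding.ASCII.GetBytes(".text").CopyTo(data, 0);
        Encoding.ASCII.GetBytes(".textbss").CopyTo(data, 8);

        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            var first = ReadFixedName(reader, 8);
            var second = ReadFixedName(reader, 8);
            return new[] { first, second };
        }
    }
}
```

The second name only comes out right because the first read consumed all 8 bytes of its field, padding included.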

Now we have our Section headers. From there, it is easy to get the bytes that make up that section. The section “PointerToRawData” is the offset address within the file, and the “SizeOfRawData” is the length of the section in bytes, so to read the section’s bytes, simply:

        public byte[] ReadSection(BinaryReader reader, Section section)
        {
            reader.BaseStream.Seek((long)section.PointerToRawData, SeekOrigin.Begin);
            return reader.ReadBytes((int)section.SizeOfRawData);
        }

Reading the Virtual Directories (again)

I mentioned above that we needed to know the sections in order to read the virtual directories. That is because the virtual directory data is stored within a section (typically the “.text” section). The Address that we got in each Virtual Directory header is really a virtual address (relative to where the image is loaded in memory), not an offset into the file itself. Each Section has a property “VirtualAddress” that specifies the virtual address of the first byte of that section. Also, the section’s “PointerToRawData” property is the file offset of that same first byte. In other words, if you imagine the file is one big byte[], then

file[section.PointerToRawData]

is the same byte that is pointed to by the

section.VirtualAddress

value.
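
Here is that relationship as a tiny worked example. The numbers are made up for illustration (a “.text” section with VirtualAddress 0x2000 and PointerToRawData 0x200), and RvaExample and ToFileOffset are hypothetical names, not something from the code above.

```csharp
// Hypothetical helper, not part of the original code.
public static class RvaExample
{
    // Converts a virtual address inside a section to an offset in the file.
    public static uint ToFileOffset(uint virtualAddress, uint sectionVirtualAddress, uint sectionPointerToRawData)
    {
        // How far into the section the address is...
        var offsetInSection = virtualAddress - sectionVirtualAddress;
        // ...is also how far past the section's first byte it sits in the file.
        return sectionPointerToRawData + offsetInSection;
    }
}
```

With those made-up numbers, `RvaExample.ToFileOffset(0x2008, 0x2000, 0x200)` gives 0x208: the address is 8 bytes into the section, so it is 8 bytes past the section's first byte in the file.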

So anyway, to find the file offset of a virtual directory, we first need to find the section that contains it. To do this, just loop over all the section headers, and find the one where

sectionHeader.VirtualAddress <= virtualDataDirectory.Address && virtualDataDirectory.Address < sectionHeader.VirtualAddress + sectionHeader.SizeOfRawData

Then subtract the section’s VirtualAddress from the directory’s Address to get the offset within the section, add the section’s PointerToRawData, and read the directory’s bytes from there. So it looks like this:

    public byte[] ReadVirtualDirectory(BinaryReader reader, DataDirectory dataDirectory, IList<Section> sections)
    {
        // find the section whose virtual address range contains the data directory's virtual address.
        // (First throws InvalidOperationException if no section matches, e.g. for an empty directory.)
        var section = sections.First(s => s.VirtualAddress <= dataDirectory.Address
            && dataDirectory.Address < s.VirtualAddress + s.SizeOfRawData);

        // calculate the offset into the file.
        var fileOffset = section.PointerToRawData + (dataDirectory.Address - section.VirtualAddress);

        // read the virtual directory data.
        reader.BaseStream.Seek((long)fileOffset, SeekOrigin.Begin);
        return reader.ReadBytes((int)dataDirectory.Size);
    }
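
One thing to watch for: First throws an InvalidOperationException when no section matches, and that will happen for any directory entry that is empty (address and size both zero). Below is a hedged sketch of a non-throwing variant of the lookup. RvaResolver and TryGetFileOffset are hypothetical names, and the Section class is trimmed to the three fields the lookup needs so the snippet stands on its own.

```csharp
using System.Collections.Generic;

// Trimmed copy of the Section class, just the fields used here.
public class Section
{
    public uint VirtualAddress { get; set; }
    public uint SizeOfRawData { get; set; }
    public uint PointerToRawData { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class RvaResolver
{
    // Returns true and sets fileOffset if some section contains the virtual address;
    // returns false (and sets fileOffset to -1) if none does.
    public static bool TryGetFileOffset(uint virtualAddress, IEnumerable<Section> sections, out long fileOffset)
    {
        foreach (var s in sections)
        {
            // Widen to long so the end-of-section sum cannot overflow.
            long start = s.VirtualAddress;
            long end = start + s.SizeOfRawData;
            if (virtualAddress >= start && virtualAddress < end)
            {
                fileOffset = s.PointerToRawData + (virtualAddress - start);
                return true;
            }
        }

        fileOffset = -1;
        return false;
    }
}
```

A caller can then skip any directory where this returns false, instead of wrapping the read in a try/catch.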

Ain’t that grand?

Next time …

That seems like enough for tonight. Hopefully in the next part, we will get to dealing with actual .NET disassembly stuff, instead of just reading headers. Oh and by the way, I am actually learning this as I go, so I am doing the research, writing some code, then writing these blog posts, so sorry if I get some of it wrong here and there. I hope to put all my code on GitHub at some point too.

Update: Part 3 is here.

2 comments on “Building a .NET Disassembler (Part 2) – Reading Virtual Directories and Sections”

  1. c_edi says:

    Hey,

    i’m trying to follow your steps, but my code fails at the ReadVirtualDirectory code.
    I always get a InvalidOperationException.

    Then i replaced the Lambda with a simple foreach loop to get the first item by hand, then i realized, that there is no section in my IList where the if statement is true.
    The first part of the statement (s.VirtualAddress >= dataDirectory.Address) is sometimes true, but the second part (s.VirtualAddress + s.SizeOfRawData <= dataDirectory.Address) is never true.

  2. Should be this:

    // find the section whose virtual address range contains the data directory’s virtual address.
    var section = sections.First(s => s.VirtualAddress <= dataDirectory.Address && s.VirtualAddress + s.SizeOfRawData >= dataDirectory.Address);

CodingWithSpike is Jeff Valore. A professional software engineer, focused on JavaScript, Web Development, C# and the Microsoft stack. Jeff is currently a Software Engineer at Virtual Hold Technologies.

