from:
http://msdn.microsoft.com/zh-cn/magazine/cc301808%28en-us%29.aspx
Download the code for this article: PE.exe (98KB)
part1:
SUMMARY
A good understanding of the Portable Executable (PE) file format leads
to a good understanding of the operating system. If you know what's in
your DLLs and EXEs, you'll be a more knowledgeable programmer. This
article, the first of a two-part series, looks at the changes to the PE
format that have occurred over the last few years, along with an
overview of the format itself.
After this update, the author discusses how the PE format fits
into applications written for .NET, PE file sections, RVAs, the
DataDirectory, and the importing of functions. An appendix includes
lists of the relevant image header structures and their descriptions.
A long time ago, in a galaxy far away, I wrote one of my first articles
for Microsoft Systems Journal (now MSDN® Magazine). The article,
"Peering Inside the PE: A Tour of the Win32 Portable Executable File
Format,"
turned out to be more popular than I had expected. To this day, I still
hear from people (even within Microsoft) who use that article, which is
still available from the MSDN Library. Unfortunately, the problem with
articles is that they're static. The world of Win32® has changed quite a
bit in the intervening years, and the article is severely dated. I'll
remedy that situation in a two-part article starting this month.
You might be wondering why you should care about the executable
file format. The answer is the same now as it was then: an operating
system's executable format and data structures reveal quite a bit about
the underlying operating system. By understanding what's in your EXEs
and DLLs, you'll find that you've become a better programmer all around.
Sure, you could learn a lot of what I'll tell you by reading the
Microsoft specification. However, like most specs, it sacrifices
readability for completeness. My focus in this article will be to
explain the most relevant parts of the story, while filling in the hows
and whys that don't fit neatly into a formal specification. In addition,
I have some goodies in this article that don't seem to appear in any
official Microsoft documentation.
Bridging the Gap
Let me give you just a few examples of what has changed since I
wrote the article in 1994. Since 16-bit Windows® is history, there's no
need to compare and contrast the format to the Win16 New Executable
format. Another welcome departure from the scene is Win32s®. This was
the abomination that ran Win32 binaries very shakily atop Windows 3.1.
Back then, Windows 95 (codenamed "Chicago" at the time) wasn't
even released. Windows NT® was still at version 3.5, and the linker
gurus at Microsoft hadn't yet started getting aggressive with their
optimizations. However, there were MIPS and DEC Alpha implementations of
Windows NT that added to the story.
And what about all the new things that have come along since that
article? 64-bit Windows introduces its own variation of the Portable
Executable (PE) format. Windows CE adds all sorts of new processor
types. Optimizations such as delay loading of DLLs, section merging, and
binding were still over the horizon. There are many new things to
shoehorn into the story.
And let's not forget about Microsoft® .NET. Where does it fit in?
To the operating system, .NET executables are just plain old Win32
executable files. However, the .NET runtime recognizes data within these
executable files as the metadata and intermediate language that are so
central to .NET. In this article, I'll knock on the door of the .NET
metadata format, but save a thorough survey of its full splendor for a
subsequent article.
And if all these additions and subtractions to the world of Win32
weren't enough justification to remake the article with modern day
special effects, there are also errors in the original piece that make
me cringe. For example, my description of Thread Local Storage (TLS)
support was way out in left field. Likewise, my description of the
date/time stamp DWORD used throughout the file format is accurate only
if you live in the Pacific time zone!
In addition, many things that were true then are incorrect now. I
had stated that the .rdata section wasn't really used for anything
important. Today, it certainly is. I also said that the .idata section
is a read/write section, which has been found to be most untrue by
people trying to do API interception today.
Along with a complete update of the PE format story in this
article, I've also overhauled the PEDUMP program, which displays the
contents of PE files. PEDUMP can be compiled and run on both the x86 and
IA-64 platforms, and can dump both 32 and 64-bit PE files. Most
importantly, full source code for PEDUMP is available for download from
the link at the top of this article, so you have a working example of
the concepts and data structures described here.
Overview of the PE File Format
Microsoft introduced the PE File format, more commonly known as
the PE format, as part of the original Win32 specifications. However, PE
files are derived from the earlier Common Object File Format (COFF)
found on VAX/VMS. This makes sense since much of the original Windows NT
team came from Digital Equipment Corporation. It was natural for these
developers to use existing code to quickly bootstrap the new Windows NT
platform.
The term "Portable Executable" was chosen because the intent was
to have a common file format for all flavors of Windows, on all
supported CPUs. To a large extent, this goal has been achieved with the
same format used on Windows NT and descendants, Windows 95 and
descendants, and Windows CE.
OBJ files emitted by Microsoft compilers use the COFF format. You
can get an idea of how old the COFF format is by looking at some of its
fields, which use octal encoding! COFF OBJ files have many data
structures and enumerations in common with PE files, and I'll mention
some of them as I go along.
The addition of 64-bit Windows required just a few modifications
to the PE format. This new format is called PE32+. No new fields were
added, and only one field in the PE format was deleted. The remaining
changes are simply the widening of certain fields from 32 bits to 64
bits. In most of these cases, you can write code that simply works with
both 32 and 64-bit PE files. The Windows header files have the magic
pixie dust to make the differences invisible to most C++-based code.
The distinction between EXE and DLL files is entirely one of
semantics. They both use the exact same PE format. The only difference
is a single bit that indicates if the file should be treated as an EXE
or as a DLL. Even the DLL file extension is artificial. You can have
DLLs with entirely different extensions—for instance .OCX controls and
Control Panel applets (.CPL files) are DLLs.
A very handy aspect of PE files is that the data structures on
disk are the same data structures used in memory. Loading an executable
into memory (for example, by calling LoadLibrary) is primarily a matter
of mapping certain ranges of a PE file into the address space. Thus, a
data structure like the IMAGE_NT_HEADERS (which I'll examine later) is
identical on disk and in memory. The key point is that if you know how
to find something in a PE file, you can almost certainly find the same
information when the file is loaded in memory.
It's important to note that PE files are not just mapped into
memory as a single memory-mapped file. Instead, the Windows loader looks
at the PE file and decides what portions of the file to map in. This
mapping is consistent in that higher offsets in the file correspond to
higher memory addresses when mapped into memory. The offset of an item
in the disk file may differ from its offset once loaded into memory.
However, all the information is present to allow you to make the
translation from disk offset to memory offset (see
Figure 1).
Figure 1 Offsets
When PE files are loaded into memory via the Windows loader, the
in-memory version is known as a module. The starting address where the
file mapping begins is called an HMODULE. This is a point worth
remembering: given an HMODULE, you know what data structure to expect at
that address, and you can use that knowledge to find all the other data
structures in memory. This powerful capability can be exploited for
other purposes such as API interception. (To be completely accurate, an
HMODULE isn't the same as the load address under Windows CE, but that's a
story for yet another day.)
A module in memory represents all the code, data, and resources
from an executable file that is needed by a process. Other parts of a PE
file may be read, but not mapped in (for instance, relocations). Some
parts may not be mapped in at all, for example, when debug information
is placed at the end of the file. A field in the PE header tells the
system how much memory needs to be set aside for mapping the executable
into memory. Data that won't be mapped in is placed at the end of the
file, past any parts that will be mapped in.
The central location where the PE format (as well as COFF files)
is described is WINNT.H. Within this header file, you'll find nearly
every structure definition, enumeration, and #define needed to work with
PE files or the equivalent structures in memory. Sure, there is
documentation elsewhere. MSDN has the "Microsoft Portable Executable and
Common Object File Format Specification," for instance (see the October
2001 MSDN CD under Specifications). But WINNT.H is the final word on
what PE files look like.
There are many tools for examining PE files. Among them are
Dumpbin from Visual Studio, and Depends from the Platform SDK. I
particularly like Depends because it has a very succinct way of
examining a file's imports and exports. A great free PE viewer is
PEBrowse Professional, from Smidgeonsoft (http://www.smidgeonsoft.com).
The PEDUMP program included with this article is also very
comprehensive, and does almost everything Dumpbin does.
From an API standpoint, the primary mechanism provided by
Microsoft for reading and modifying PE files is IMAGEHLP.DLL.
Before I start looking at the specifics of PE files, it's
worthwhile to first review a few basic concepts that thread their way
through the entire subject of PE files. In the following sections, I
will discuss PE file sections, relative virtual addresses (RVAs), the
data directory, and how functions are imported.
PE File Sections
A PE file section represents code or data of some sort. While code
is just code, there are multiple types of data. Besides read/write
program data (such as global variables), other types of data in sections
include API import and export tables, resources, and relocations. Each
section has its own set of in-memory attributes, including whether the
section contains code, whether it's read-only or read/write, and whether
the data in the section is shared between all processes using the
executable.
Generally speaking, all the code or data in a section is logically
related in some way. At a minimum, there are usually at least two
sections in a PE file: one for code, the other for data. Commonly,
there's at least one other type of data section in a PE file. I'll look
at the various kinds of sections in Part 2 of this article next month.
Each section has a distinct name. This name is intended to convey
the purpose of the section. For example, a section called .rdata
indicates a read-only data section. Section names are used solely for
the benefit of humans, and are insignificant to the operating system. A
section named FOOBAR is just as valid as a section called .text.
Microsoft typically prefixes their section names with a period, but it's
not a requirement. For years, the Borland linker used section names
like CODE and DATA.
While compilers have a standard set of sections that they
generate, there's nothing magical about them. You can create and name
your own sections, and the linker happily includes them in the
executable. In Visual C++, you can tell the compiler to insert code or
data into a section that you name with #pragma statements. For instance,
the statement
#pragma data_seg( "MY_DATA" )
causes all data emitted by Visual C++ to go into a section called
MY_DATA, rather than the default .data section. Most programs are fine
using the default sections emitted by the compiler, but occasionally you
may have funky requirements which necessitate putting code or data into
a separate section.
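To flesh that out, here's a minimal sketch; the section name MY_DATA and the variable names are invented for illustration. Note that the variables are given initializers so they actually land in the named data section, and that a linker /SECTION directive can then adjust the section's attributes, for instance to make it shared between processes:
// A sketch only; MY_DATA and the variable names are made up for this example.
#pragma data_seg( "MY_DATA" )
int  g_sharedCounter = 0;           // initialized data lands in MY_DATA
char g_sharedBuffer[256] = { 0 };
#pragma data_seg()                  // subsequent data goes back to the default .data

// Optionally change the section's attributes via a linker directive
#pragma comment( linker, "/SECTION:MY_DATA,RWS" )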
Sections don't spring fully formed from the linker; rather, they
start out in OBJ files, usually placed there by the compiler. The
linker's job is to combine all the required sections from OBJ files and
libraries into the appropriate final section in the PE file. For
example, each OBJ file in your project probably has at least a .text
section, which contains code. The linker takes all the sections named
.text from the various OBJ files and combines them into a single .text
section in the PE file. Likewise, all the sections named .data from the
various OBJs are combined into a single .data section in the PE file.
Code and data from .LIB files are also typically included in an
executable, but that subject is outside the scope of this article.
There is a rather complete set of rules that linkers follow to
decide which sections to combine and how. I gave an introduction to the
linker algorithms in the July 1997 Under The Hood column in MSJ. A
section in an OBJ file may be
intended for the linker's use, and not make it into the final
executable. A section like this would be intended for the compiler to
pass information to the linker.
Sections have two alignment values, one within the disk file and
the other in memory. The PE file header specifies both of these values,
which can differ. Each section starts at an offset that's some multiple
of the alignment value. For instance, in the PE file, a typical
alignment would be 0x200. Thus, every section begins at a file offset
that's a multiple of 0x200.
Once mapped into memory, sections always start on at least a page
boundary. That is, when a PE section is mapped into memory, the first
byte of each section corresponds to a memory page. On x86 CPUs, pages
are 4KB aligned, while on the IA-64, they're 8KB aligned. The following
code shows a snippet of PEDUMP output for the .text and .data section of
the Windows XP KERNEL32.DLL.
Section Table
01 .text VirtSize: 00074658 VirtAddr: 00001000
raw data offs: 00000400 raw data size: 00074800
•••
02 .data VirtSize: 000028CA VirtAddr: 00076000
raw data offs: 00074C00 raw data size: 00002400
The .text section is at offset 0x400 in the PE file and will be 0x1000
bytes above the load address of KERNEL32 in memory. Likewise, the .data
section is at file offset 0x74C00 and will be 0x76000 bytes above
KERNEL32's load address in memory.
It's possible to create PE files in which the sections start at
the same offset in the file as they start from the load address in
memory. This makes for larger executables, but can speed loading under
Windows 9x or Windows Me. The default /OPT:WIN98 linker option
(introduced in Visual Studio 6.0) causes PE files to be created this
way. In Visual Studio® .NET, the linker may or may not use /OPT:NOWIN98,
depending on whether the file is small enough.
An interesting linker feature is the ability to merge sections. If
two sections have similar, compatible attributes, they can usually be
combined into a single section at link time. This is done via the linker
/merge switch. For instance, the following linker option combines the
.rdata and .text sections into a single section called .text:
/MERGE:.rdata=.text
The advantage to merging sections is that it saves space, both on
disk and in memory. At a minimum, each section occupies one page in
memory. If you can reduce the number of sections in an executable from
four to three, there's a decent chance you'll use one less page of
memory. Of course, this depends on whether the unused space at the end
of the two merged sections adds up to a page.
Things can get interesting when you're merging sections, as there
are no hard and fast rules as to what's allowed. For example, it's OK to
merge .rdata into .text, but you shouldn't merge .rsrc, .reloc, or
.pdata into other sections. Prior to Visual Studio .NET, you could merge
.idata into other sections. In Visual Studio .NET, this is not allowed,
but the linker often merges parts of the .idata into other sections,
such as .rdata, when doing a release build.
Since portions of the imports data are written to by the Windows
loader when they are loaded into memory, you might wonder how they can
be put in a read-only section. This situation works because at load time
the system can temporarily set the attributes of the pages containing
the imports data to read/write. Once the imports table is initialized,
the pages are then set back to their original protection attributes.
Relative Virtual Addresses
In an executable file, there are many places where an in-memory
address needs to be specified. For instance, the address of a global
variable is needed when referencing it. PE files can load just about
anywhere in the process address space. While they do have a preferred
load address, you can't rely on the executable file actually loading
there. For this reason, it's important to have some way of specifying
addresses that are independent of where the executable file loads.
To avoid having hardcoded memory addresses in PE files, RVAs are
used. An RVA is simply an offset in memory, relative to where the PE
file was loaded. For instance, consider an EXE file loaded at address
0x400000, with its code section at address 0x401000. The RVA of the code
section would be:
(target address) 0x401000 - (load address) 0x400000 = (RVA) 0x1000.
To convert an RVA to an actual address, simply reverse the
process: add the RVA to the actual load address to find the actual
memory address. Incidentally, the actual memory address is called a
Virtual Address (VA) in PE parlance. Another way to think of a VA is
that it's an RVA with the preferred load address added in. Don't forget
the earlier point I made that a load address is the same as the HMODULE.
Want to go spelunking through some arbitrary DLL's data structures
in memory? Here's how. Call GetModuleHandle with the name of the DLL.
The HMODULE that's returned is just a load address; you can apply your
knowledge of the PE file structures to find anything you want within the
module.
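Here's a minimal sketch of that kind of spelunking (KERNEL32.DLL is just an arbitrary example, and error handling is kept to the bare minimum):
// Treat an HMODULE as a load address and walk from the DOS header to the PE headers.
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HMODULE hMod = GetModuleHandleA("KERNEL32.DLL");    // load address of the module
    PIMAGE_DOS_HEADER pDosHdr;
    PIMAGE_NT_HEADERS pNtHdrs;

    if (!hMod)
        return 1;

    pDosHdr = (PIMAGE_DOS_HEADER)hMod;
    // e_lfanew is the offset from the load address to the IMAGE_NT_HEADERS
    pNtHdrs = (PIMAGE_NT_HEADERS)((BYTE *)hMod + pDosHdr->e_lfanew);

    printf("Load address (HMODULE): %p\n", (void *)hMod);
    printf("PE signature:           %08lX\n", pNtHdrs->Signature);
    printf("Number of sections:     %u\n", (unsigned)pNtHdrs->FileHeader.NumberOfSections);
    return 0;
}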
The Data Directory
There are many data structures within executable files that need
to be quickly located. Some obvious examples are the imports, exports,
resources, and base relocations. All of these well-known data structures
are found in a consistent manner, and the location is known as the
DataDirectory.
The DataDirectory is an array of 16 structures. Each array entry
has a predefined meaning for what it refers to. The
IMAGE_DIRECTORY_ENTRY_xxx #defines are array indexes into the
DataDirectory (from 0 to 15). Figure 2 describes what each of the
IMAGE_DIRECTORY_ENTRY_xxx values refers to. A more detailed description
of many of the pointed-to data structures will be included in Part 2 of
this article.
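To make the indexing concrete, here's a minimal sketch that reads one DataDirectory entry from a loaded module; the module and the choice of the import entry are arbitrary:
// Read the import table's RVA and size from the DataDirectory of a loaded module.
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HMODULE hMod = GetModuleHandleA("KERNEL32.DLL");
    PIMAGE_DOS_HEADER pDosHdr = (PIMAGE_DOS_HEADER)hMod;
    PIMAGE_NT_HEADERS pNtHdrs =
        (PIMAGE_NT_HEADERS)((BYTE *)hMod + pDosHdr->e_lfanew);

    IMAGE_DATA_DIRECTORY importDir =
        pNtHdrs->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];

    if (importDir.VirtualAddress)   // a zero RVA means the directory isn't present
        printf("Import table: RVA %08lX, size %08lX\n",
               importDir.VirtualAddress, importDir.Size);
    return 0;
}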
Importing Functions
When you use code or data from another DLL, you're importing it.
When any PE file loads, one of the jobs of the Windows loader is to
locate all the imported functions and data and make those addresses
available to the file being loaded. I'll save the detailed discussion of
data structures used to accomplish this for Part 2 of this article, but
it's worth going over the concepts here at a high level.
When you link directly against the code and data of another DLL,
you're implicitly linking against the DLL. You don't have to do anything
to make the addresses of the imported APIs available to your code. The
loader takes care of it all. The alternative is explicit linking. This
means explicitly making sure that the target DLL is loaded and then
looking up the address of the APIs. This is almost always done via the
LoadLibrary and GetProcAddress APIs.
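As a quick, hedged illustration of explicit linking (the target API here, MessageBoxA in USER32.DLL, is just an arbitrary example):
// Explicit linking: load the DLL ourselves, then look up the API by name.
#include <windows.h>

typedef int (WINAPI *PFNMESSAGEBOXA)(HWND, LPCSTR, LPCSTR, UINT);

int main(void)
{
    HMODULE hUser32 = LoadLibraryA("USER32.DLL");
    PFNMESSAGEBOXA pfnMessageBoxA;

    if (!hUser32)
        return 1;

    pfnMessageBoxA = (PFNMESSAGEBOXA)GetProcAddress(hUser32, "MessageBoxA");
    if (pfnMessageBoxA)
        pfnMessageBoxA(NULL, "Hello from explicit linking", "Demo", MB_OK);

    FreeLibrary(hUser32);
    return 0;
}
With implicit linking you'd simply call MessageBoxA and link against USER32.LIB; the loader and the IAT do the equivalent work for you.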
When you implicitly link against an API, LoadLibrary and
GetProcAddress-like code still executes, but the loader does it for you
automatically. The loader also ensures that any additional DLLs needed
by the PE file being loaded are also loaded. For instance, every normal
program created with Visual C++® links against KERNEL32.DLL.
KERNEL32.DLL in turn imports functions from NTDLL.DLL. Likewise, if you
import from GDI32.DLL, it will have dependencies on the USER32,
ADVAPI32, NTDLL, and KERNEL32 DLLs, which the loader makes sure are
loaded and all imports resolved. (Visual Basic 6.0 and the Microsoft
.NET executables directly link against a different DLL than KERNEL32,
but the same principles apply.)
When implicitly linking, the resolution process for the main EXE
file and all its dependent DLLs occurs when the program first starts. If
there are any problems (for example, a referenced DLL that can't be
found), the process is aborted.
Visual C++ 6.0 added the delayload feature, which is a hybrid
between implicit linking and explicit linking. When you delayload
against a DLL, the linker emits something that looks very similar to the
data for a regular imported DLL. However, the operating system ignores
this data. Instead, the first time a call to one of the delayloaded APIs
occurs, special stubs added by the linker cause the DLL to be loaded
(if it's not already in memory), followed by a call to GetProcAddress to
locate the called API. Additional magic makes it so that subsequent
calls to the API are just as efficient as if the API had been imported
normally.
Within a PE file, there's an array of data structures, one per
imported DLL. Each of these structures gives the name of the imported
DLL and points to an array of function pointers. The array of function
pointers is known as the import address table (IAT). Each imported API
has its own reserved spot in the IAT where the address of the imported
function is written by the Windows loader. This last point is
particularly important: once a module is loaded, the IAT contains the
address that is invoked when calling imported APIs.
The beauty of the IAT is that there's just one place in a PE file
where an imported API's address is stored. No matter how many source
files you scatter calls to a given API through, all the calls go through
the same function pointer in the IAT.
Let's examine what the call to an imported API looks like. There
are two cases to consider: the efficient way and inefficient way. In the
best case, a call to an imported API looks like this:
CALL DWORD PTR [0x00405030]
If you're not familiar with x86 assembly language, this is a call
through a function pointer. Whatever DWORD-sized value is at 0x405030 is
where the CALL instruction will send control. In the previous example,
address 0x405030 lies within the IAT.
The less efficient call to an imported API looks like this:
CALL 0x0040100C
•••
0x0040100C:
JMP DWORD PTR [0x00405030]
In this situation, the CALL transfers control to a small stub. The stub
is a JMP to the address whose value is at 0x405030. Again, remember that
0x405030 is an entry within the IAT. In a nutshell, the less efficient
imported API call uses five bytes of additional code, and takes longer
to execute because of the extra JMP.
You're probably wondering why the less efficient method would ever
be used. There's a good explanation. Left to its own devices, the
compiler can't distinguish between imported API calls and ordinary
functions within the same module. As such, the compiler emits a CALL
instruction of the form
CALL XXXXXXXX
where
XXXXXXXX is an actual code address that will be filled in
by the linker later. Note that this last CALL instruction isn't through a
function pointer. Rather, it's an actual code address. To keep the
cosmic karma in balance, the linker needs to have a chunk of code to
substitute for
XXXXXXXX. The simplest way to do this is to make
the call point to a JMP stub, like you just saw.
Where does the JMP stub come from? Surprisingly, it comes from the
import library for the imported function. If you were to examine an
import library, and examine the code associated with the imported API
name, you'd see that it's a JMP stub like the one just shown. What this
means is that by default, in the absence of any intervention, imported
API calls will use the less efficient form.
Logically, the next question to ask is how to get the optimized
form. The answer comes in the form of a hint you give to the compiler.
The __declspec(dllimport) function modifier tells the compiler that the
function resides in another DLL and that the compiler should generate
this instruction
CALL DWORD PTR [XXXXXXXX]
rather than this one:
CALL XXXXXXXX
In addition, the compiler emits information telling the linker to
resolve the function pointer portion of the instruction to a symbol
named __imp_functionname. For instance, if you were calling MyFunction,
the symbol name would be __imp_MyFunction. Looking in an import library,
you'll see that in addition to the regular symbol name, there's also a
symbol with the __imp__ prefix on it. This __imp__ symbol resolves
directly to the IAT entry, rather than to the JMP stub.
So what does this mean in your everyday life? If you're writing
exported functions and providing a .H file for them, remember to use the
__declspec(dllimport) modifier with the function:
__declspec(dllimport) void Foo(void);
If you look at the Windows system header files, you'll find that they
use __declspec(dllimport) for the Windows APIs. It's not easy to see
this, but if you search for the DECLSPEC_IMPORT macro defined in
WINNT.H, and which is used in files such as WinBase.H, you'll see how
__declspec(dllimport) is prepended to the system API declarations.
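A common way to arrange this in practice is a single header shared by the DLL and its clients; the macro and function names below are invented for illustration, not taken from any Microsoft header:
// MyLib.h -- a sketch of the usual export/import macro pattern.
// The DLL project defines MYLIB_EXPORTS (for example with /D MYLIB_EXPORTS),
// so the DLL itself sees dllexport while clients see dllimport.
#ifdef MYLIB_EXPORTS
#define MYLIB_API __declspec(dllexport)
#else
#define MYLIB_API __declspec(dllimport)
#endif

MYLIB_API void Foo(void);   // clients will call Foo through its __imp_ IAT entry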
PE File Structure
Now let's dig into the actual format of PE files. I'll start from
the beginning of the file, and describe the data structures that are
present in every PE file. Afterwards, I'll describe the more specialized
data structures (such as imports or resources) that reside within a
PE's sections. All of the data structures that I'll discuss below are
defined in WINNT.H, unless otherwise noted.
In many cases, there are matching 32 and 64-bit data
structures—for example, IMAGE_NT_HEADERS32 and IMAGE_NT_HEADERS64. These
structures are almost always identical, except for some widened fields
in the 64-bit versions. If you're trying to write portable code, there
are #defines in WINNT.H which select the appropriate 32 or 64-bit
structures and alias them to a size-agnostic name (in the previous
example, it would be IMAGE_NT_HEADERS). The structure selected depends
on which mode you're compiling for (specifically, whether _WIN64 is
defined or not). You should only need to use the 32 or 64-bit specific
versions of the structures if you're working with a PE file with size
characteristics that are different from those of the platform you're
compiling for.
The MS-DOS Header
Every PE file begins with a small MS-DOS® executable. The need for
this stub executable arose in the early days of Windows, before a
significant number of consumers were running it. When executed on a
machine without Windows, the program could at least print out a message
saying that Windows was required to run the executable.
The first bytes of a PE file begin with the traditional MS-DOS
header, called an IMAGE_DOS_HEADER. The only two values of any
importance are e_magic and e_lfanew. The e_lfanew field contains the
file offset of the PE header. The e_magic field (a WORD) needs to be set
to the value 0x5A4D. There's a #define for this value, named
IMAGE_DOS_SIGNATURE. In ASCII representation, 0x5A4D is MZ, the initials
of Mark Zbikowski, one of the original architects of MS-DOS.
The IMAGE_NT_HEADERS Header
The IMAGE_NT_HEADERS structure is the primary location where
specifics of the PE file are stored. Its offset is given by the e_lfanew
field in the IMAGE_DOS_HEADER at the beginning of the file. There are
actually two versions of the IMAGE_NT_HEADERS structure, one for 32-bit
executables and the other for 64-bit versions. The differences are so
minor that I'll consider them to be the same for the purposes of this
discussion. The only correct, Microsoft-approved way of differentiating
between the two formats is via the value of the Magic field in the
IMAGE_OPTIONAL_HEADER (described shortly).
An IMAGE_NT_HEADERS structure comprises three fields:
typedef struct _IMAGE_NT_HEADERS {
    DWORD Signature;
    IMAGE_FILE_HEADER FileHeader;
    IMAGE_OPTIONAL_HEADER32 OptionalHeader;
} IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;
In a valid PE file, the Signature field is set to the value 0x00004550,
which in ASCII is "PE00". A #define, IMAGE_NT_SIGNATURE, is defined for
this value. The second field, a struct of type IMAGE_FILE_HEADER,
predates PE files. It contains some basic information about the file;
most importantly, a field describing the size of the optional data that
follows it. In PE files, this optional data is very much required, but
is still called the IMAGE_OPTIONAL_HEADER.
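Putting the Signature and Magic checks together, here's a minimal sketch of telling a PE32 header from a PE32+ one for a loaded module (the current EXE in this case):
// Validate the DOS and PE signatures, then use OptionalHeader.Magic to
// distinguish a 32-bit (PE32) image from a 64-bit (PE32+) one.
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HMODULE hMod = GetModuleHandleA(NULL);      // the current EXE
    PIMAGE_DOS_HEADER pDosHdr = (PIMAGE_DOS_HEADER)hMod;
    PIMAGE_NT_HEADERS pNtHdrs =
        (PIMAGE_NT_HEADERS)((BYTE *)hMod + pDosHdr->e_lfanew);
    WORD magic;

    if (pDosHdr->e_magic != IMAGE_DOS_SIGNATURE ||
        pNtHdrs->Signature != IMAGE_NT_SIGNATURE)
    {
        printf("Not a valid PE image\n");
        return 1;
    }

    magic = pNtHdrs->OptionalHeader.Magic;
    if (magic == IMAGE_NT_OPTIONAL_HDR32_MAGIC)
        printf("PE32 (32-bit optional header)\n");
    else if (magic == IMAGE_NT_OPTIONAL_HDR64_MAGIC)
        printf("PE32+ (64-bit optional header)\n");
    return 0;
}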
Figure 3
shows the fields of the IMAGE_FILE_HEADER structure, with additional
notes for the fields. This structure can also be found at the very
beginning of COFF OBJ files.
Figure 4
lists the common values of IMAGE_FILE_xxx.
Figure 5
shows the members of the IMAGE_OPTIONAL_HEADER structure.
The DataDirectory array at the end of the IMAGE_OPTIONAL_HEADER
is the address book for important locations within the executable. Each
DataDirectory entry looks like this:
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;   // RVA of the data
    DWORD Size;             // Size of the data
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
The Section Table
Immediately following the IMAGE_NT_HEADERS is the section table.
The section table is an array of IMAGE_SECTION_HEADER structures. An
IMAGE_SECTION_HEADER provides information about its associated section,
including location, length, and characteristics.
Figure 6
contains a description of the IMAGE_SECTION_HEADER fields. The number
of IMAGE_SECTION_HEADER structures is given by the
IMAGE_NT_HEADERS.FileHeader.NumberOfSections field.
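Here's a minimal sketch that walks the section table of a loaded module, using the IMAGE_FIRST_SECTION macro from WINNT.H; the module chosen is arbitrary:
// Enumerate the IMAGE_SECTION_HEADERs of a loaded module.
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HMODULE hMod = GetModuleHandleA("KERNEL32.DLL");
    PIMAGE_DOS_HEADER pDosHdr = (PIMAGE_DOS_HEADER)hMod;
    PIMAGE_NT_HEADERS pNtHdrs =
        (PIMAGE_NT_HEADERS)((BYTE *)hMod + pDosHdr->e_lfanew);
    PIMAGE_SECTION_HEADER pSection = IMAGE_FIRST_SECTION(pNtHdrs);
    WORD i;

    for (i = 0; i < pNtHdrs->FileHeader.NumberOfSections; i++, pSection++)
    {
        // Name is 8 bytes and not guaranteed to be null terminated
        printf("%-8.8s  VirtAddr: %08lX  VirtSize: %08lX  raw data offs: %08lX\n",
               (const char *)pSection->Name,
               pSection->VirtualAddress,
               pSection->Misc.VirtualSize,
               pSection->PointerToRawData);
    }
    return 0;
}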
The file alignment of sections in the executable file can have a
significant impact on the resulting file size. In Visual Studio 6.0, the
linker defaulted to a section alignment of 4KB, unless /OPT:NOWIN98 or
the /ALIGN switch was used. The Visual Studio .NET linker, while still
defaulting to /OPT:WIN98, determines if the executable is below a
certain size and if that is the case uses 0x200-byte alignment.
Another interesting alignment comes from the .NET file
specification. It says that .NET executables should have an in-memory
alignment of 8KB, rather than the expected 4KB for x86 binaries. This is
to ensure that .NET executables built with x86 entry point code can
still run under IA-64. If the in-memory section alignment were 4KB, the
IA-64 loader wouldn't be able to load the file, since pages are 8KB on
64-bit Windows.
Wrap-up
That's it for the headers of PE files. In Part 2 of this article
I'll continue the tour of portable executable files by looking at
commonly encountered sections. Then I'll describe the major data
structures within those sections, including imports, exports, and
resources. And finally, I'll go over the source for the updated and
vastly improved PEDUMP.
part2:
SUMMARY The Win32 Portable Executable
File Format (PE) was designed to be a standard executable format for use
on all versions of the operating systems on all supported processors.
Since its introduction, the PE format has undergone incremental changes,
and the introduction of 64-bit Windows has required a few more. Part 1
of this series presented an overview and covered RVAs, the data
directory, and the headers. This month in Part 2 the various sections of
the executable are explored. The discussion includes the exports
section, export forwarding, binding, and delayloading. The debug
directory, thread local storage, and the resources sections are also
covered.
Last month in Part 1 of this article, I began a comprehensive tour of Portable
Executable (PE) files. I described the history of PE files and the data
structures that make up the headers, including the section table. The PE
headers and section table tell you what kind of code and data exists in
the executable and where you should look to find it. This
month I'll describe the more commonly encountered sections. I'll talk a
bit about my updated and improved PEDUMP program, available in the
February 2002 download.
If you're not familiar with basic PE file concepts, you should read
Part 1 of this article first. Last month I described how a
section is a chunk of code or data that logically belongs together. For
example, all the data that comprises an executable's import tables are
in a section. Let's look at some of the sections you'll encounter in
executables and OBJs. Unless otherwise stated, the section names in Figure 1
come from Microsoft tools.
The Exports Section
When
an EXE exports code or data, it's making functions or variables usable
by other EXEs. To keep things simple, I'll refer to exported functions
and exported variables by the term "symbols." At a minimum, to export
something, the address of an exported symbol needs to be obtainable in a
defined manner. Each exported symbol has an ordinal number associated
with it that can be used to look it up. Also, there is almost always an
ASCII name associated with the symbol. Traditionally, the exported
symbol name is the same as the name of the function or variable in the
originating source file, although they can also be made to differ. Typically,
when an executable imports a symbol, it uses the symbol name rather
than its ordinal. However, when importing by name, the system just uses
the name to look up the export ordinal of the desired symbol, and
retrieves the address using the ordinal value. It would be slightly
faster if an ordinal had been used in the first place. Exporting and
importing by name is solely a convenience for programmers. The
use of the ORDINAL keyword in the Exports section of a .DEF file tells
the linker to create an import library that forces an API to be imported
by ordinal, not by name. I'll begin with the
IMAGE_EXPORT_DIRECTORY structure, which is shown in Figure 2.
The exports directory points to three arrays and a table of ASCII
strings. The only required array is the Export Address Table (EAT),
which is an array of function pointers that contain the address of an
exported function. An export ordinal is simply an index into this array
(see Figure 3).
Figure 3 The IMAGE_EXPORT_DIRECTORY Structure
Let's go through an
example to show exports at work. Figure 4
shows some of the exports from KERNEL32.DLL. Let's say you've called
GetProcAddress on the AddAtomA API in KERNEL32. The system begins by
locating KERNEL32's IMAGE_EXPORT_DIRECTORY. From that, it obtains the
start address of the Export Names Table (ENT). Knowing that there are
0x3A0 entries in the array, it does a binary search of the names until
it finds the string "AddAtomA". Let's say that the loader finds
AddAtomA to be the second array entry. The loader then reads the
corresponding second value from the export ordinal table. This value is
the export ordinal of AddAtomA. Using the export ordinal as an index
into the EAT (and taking into account the Base field value), it turns
out that AddAtomA is at a relative virtual address (RVA) of 0x82C2.
Adding 0x82C2 to the load address of KERNEL32 yields the actual address
of AddAtomA.
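The following sketch retraces those steps in code: a bare-bones, GetProcAddress-like lookup that walks the ENT, the export ordinal table, and the EAT of a loaded module. For brevity it uses a linear search rather than a binary one and ignores forwarded exports:
// Find an export by name the long way, mirroring the lookup described above.
#include <windows.h>
#include <stdio.h>
#include <string.h>

void *FindExportByName(HMODULE hMod, const char *name)
{
    BYTE *base = (BYTE *)hMod;
    PIMAGE_NT_HEADERS pNtHdrs =
        (PIMAGE_NT_HEADERS)(base + ((PIMAGE_DOS_HEADER)base)->e_lfanew);
    IMAGE_DATA_DIRECTORY expDir =
        pNtHdrs->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
    PIMAGE_EXPORT_DIRECTORY pExports;
    DWORD *nameRvas, *functionRvas;
    WORD *ordinals;
    DWORD i;

    if (expDir.VirtualAddress == 0)
        return NULL;

    pExports     = (PIMAGE_EXPORT_DIRECTORY)(base + expDir.VirtualAddress);
    nameRvas     = (DWORD *)(base + pExports->AddressOfNames);          // the ENT
    ordinals     = (WORD  *)(base + pExports->AddressOfNameOrdinals);   // export ordinal table
    functionRvas = (DWORD *)(base + pExports->AddressOfFunctions);      // the EAT

    for (i = 0; i < pExports->NumberOfNames; i++)
    {
        if (strcmp((const char *)(base + nameRvas[i]), name) == 0)
        {
            // ordinals[i] is already unbiased (ordinal minus Base),
            // so it indexes the EAT directly
            return base + functionRvas[ordinals[i]];
        }
    }
    return NULL;
}

int main(void)
{
    HMODULE hKernel32 = GetModuleHandleA("KERNEL32.DLL");
    printf("AddAtomA by hand: %p, via GetProcAddress: %p\n",
           FindExportByName(hKernel32, "AddAtomA"),
           (void *)GetProcAddress(hKernel32, "AddAtomA"));
    return 0;
}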
Export Forwarding
A particularly
slick feature of exports is the ability to "forward" an export to
another DLL. For example, in Windows NT®, Windows® 2000, and Windows XP,
the KERNEL32 HeapAlloc function is forwarded to the RtlAllocateHeap
function exported by NTDLL. Forwarding is performed at link time by a
special syntax in the EXPORTS section of the .DEF file. Using HeapAlloc
as an example, KERNEL32's DEF file would contain:
EXPORTS
    •••
    HeapAlloc = NTDLL.RtlAllocateHeap
How can you tell if a function is forwarded rather
than exported normally? It's somewhat tricky. Normally, the EAT contains
the RVA of the exported symbol. However, if the function's RVA is
inside the exports section (as given by the VirtualAddress and Size
fields in the DataDirectory), the symbol is forwarded. When a
symbol is forwarded, its RVA obviously can't be a code or data address
in the current module. Instead, the RVA points to an ASCII string of the
DLL and symbol name to which it is forwarded. In the prior example, it
would be NTDLL.RtlAllocateHeap.
The Imports Section
The
opposite of exporting a function or variable is importing it. In keeping
with the prior section, I'll use the term "symbol" to collectively
refer to imported functions and imported variables. The anchor
of the imports data is the IMAGE_IMPORT_DESCRIPTOR structure. The
DataDirectory entry for imports points to an array of these structures.
There's one IMAGE_IMPORT_DESCRIPTOR for each imported executable. The
end of the IMAGE_IMPORT_DESCRIPTOR array is indicated by an entry with
fields all set to 0. Figure 5
shows the contents of an IMAGE_IMPORT_DESCRIPTOR. Each
IMAGE_IMPORT_DESCRIPTOR typically points to two essentially identical
arrays. These arrays have been called by several names, but the two most
common names are the Import Address Table (IAT) and the Import Name
Table (INT). Figure 6 shows an executable importing some APIs
from USER32.DLL.
Figure 6 Two Parallel Arrays of
Pointers
Both arrays have elements of type
IMAGE_THUNK_DATA, which is a pointer-sized union. Each IMAGE_THUNK_DATA
element corresponds to one imported function from the executable. The
ends of both arrays are indicated by an IMAGE_THUNK_DATA element with a
value of zero. The IMAGE_THUNK_DATA union is a DWORD with these
interpretations:
DWORD Function;         // Memory address of the imported function
DWORD Ordinal;          // Ordinal value of imported API
DWORD AddressOfData;    // RVA to an IMAGE_IMPORT_BY_NAME with the imported API name
DWORD ForwarderString;  // RVA to a forwarder string
The IMAGE_THUNK_DATA structures within the IAT lead a
dual-purpose life. In the executable file, they contain either the
ordinal of the imported API or an RVA to an IMAGE_IMPORT_BY_NAME
structure. The IMAGE_IMPORT_BY_NAME structure is just a WORD, followed
by a string naming the imported API. The WORD value is a "hint" to the
loader as to what the ordinal of the imported API might be. When the
loader brings in the executable, it overwrites each IAT entry with the
actual address of the imported function. This is a key point to understand
before proceeding. I highly recommend reading Russell Osterlund's
article in this issue which describes the steps that the Windows loader
takes. Before the executable is loaded, is there a way you can
tell if an IMAGE_THUNK_DATA structure contains an import ordinal, as
opposed to an RVA to an IMAGE_IMPORT_BY_NAME structure? The key is the
high bit of the IMAGE_THUNK_DATA value. If set, the bottom 31 bits (or
63 bits for a 64-bit executable) is treated as an ordinal value. If the
high bit isn't set, the IMAGE_THUNK_DATA value is an RVA to the
IMAGE_IMPORT_BY_NAME. The other array, the INT, is essentially
identical to the IAT. It's also an array of IMAGE_THUNK_DATA structures.
The key difference is that the INT isn't overwritten by the loader when
brought into memory. Why have two parallel arrays for each set of APIs
imported from a DLL? The answer is in a concept called binding. When the
binding process rewrites the IAT in the file (I'll describe this
process later), some way of getting the original information needs to
remain. The INT, which is a duplicate copy of the information, is just
the ticket. An INT isn't required for an executable to load.
However, if not present, the executable cannot be bound. The Microsoft
linker seems to always emit an INT, but for a long time, the Borland
linker (TLINK) did not. The Borland-created files could not be bound. In
early Microsoft linkers, the imports section wasn't all that special to
the linker. All the data that made up an executable's imports came from
import libraries. You could see this for yourself by running Dumpbin or
PEDUMP on an import library. You'd find sections with names like
.idata$3 and .idata$4. The linker simply followed its rules for
combining sections, and all the structures and arrays magically fell
into place. A few years back, Microsoft introduced a new import library
format that creates significantly smaller import libraries at the cost
of the linker taking a more active role in creating the import data.
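To tie the pieces together, here's a minimal sketch that walks the IMAGE_IMPORT_DESCRIPTOR array of a loaded module and prints, for each imported DLL, the API names from the INT alongside the resolved addresses in the IAT. It relies on the size-agnostic IMAGE_THUNK_DATA alias in WINNT.H, so it matches whatever bitness it's compiled for, and it falls back to the IAT if a module has no INT:
// Enumerate imported DLLs and APIs by walking the import descriptors, INT, and IAT.
#include <windows.h>
#include <stdio.h>

void DumpImports(HMODULE hMod)
{
    BYTE *base = (BYTE *)hMod;
    PIMAGE_NT_HEADERS pNtHdrs =
        (PIMAGE_NT_HEADERS)(base + ((PIMAGE_DOS_HEADER)base)->e_lfanew);
    IMAGE_DATA_DIRECTORY dir =
        pNtHdrs->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
    PIMAGE_IMPORT_DESCRIPTOR pImp;

    if (dir.VirtualAddress == 0)
        return;

    pImp = (PIMAGE_IMPORT_DESCRIPTOR)(base + dir.VirtualAddress);
    for (; pImp->Name != 0; pImp++)             // an all-zero descriptor ends the array
    {
        // OriginalFirstThunk is the INT, FirstThunk is the IAT; read names from
        // the IAT too if the file was linked without an INT
        DWORD intRva = pImp->OriginalFirstThunk ? pImp->OriginalFirstThunk
                                                : pImp->FirstThunk;
        PIMAGE_THUNK_DATA pINT = (PIMAGE_THUNK_DATA)(base + intRva);
        PIMAGE_THUNK_DATA pIAT = (PIMAGE_THUNK_DATA)(base + pImp->FirstThunk);

        printf("%s\n", (const char *)(base + pImp->Name));

        for (; pINT->u1.AddressOfData != 0; pINT++, pIAT++)
        {
            if (pINT->u1.Ordinal & IMAGE_ORDINAL_FLAG)
                printf("    ordinal %u", (unsigned)IMAGE_ORDINAL(pINT->u1.Ordinal));
            else
                printf("    %s",
                       ((PIMAGE_IMPORT_BY_NAME)(base + pINT->u1.AddressOfData))->Name);

            printf("  -> %p\n", (void *)(ULONG_PTR)pIAT->u1.Function);
        }
    }
}

int main(void)
{
    DumpImports(GetModuleHandleA(NULL));        // dump the current EXE's imports
    return 0;
}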
Binding
When an executable is bound (via the Bind
program, for instance), the IMAGE_THUNK_DATA structures in the IAT are
overwritten with the actual address of the imported function. The
executable file on disk has the actual in-memory addresses of APIs in
other DLLs in its IAT. When loading a bound executable, the Windows
loader can bypass the step of looking up each imported API and writing
it to the IAT. The correct address is already there! This only happens
if the stars align properly, however. My May
2000 column contains some benchmarks on just how much load-time
speed increase you can get from binding executables. You
probably have a healthy skepticism about the safety of executable
binding. After all, what if you bind your executable and the DLLs that
it imports change? When this happens, all the addresses in the IAT are
invalid. The loader checks for this situation and reacts accordingly. If
the addresses in the IAT are stale, the loader still has all the
necessary information from the INT to resolve the addresses of the
imported APIs. Binding your programs at installation time is
the best possible scenario. The BindImage action of the Windows
installer will do this for you. Alternatively, IMAGEHLP.DLL provides the
BindImageEx API. Either way, binding is a good idea. If the loader
determines that the binding information is current, executables load
faster. If the binding information becomes stale, you're no worse off
than if you hadn't bound in the first place. One of the key
steps in making binding effective is for the loader to determine if the
binding information in the IAT is current. When an executable is bound,
information about the referenced DLLs is placed into the executable. The
loader checks this information to make a quick determination of the
binding validity. This information wasn't added with the first
implementation of binding. Thus, an executable can be bound in the old
way or the new way. The new way is what I'll describe here. The
key data structure in determining the validity of bound imports is an
IMAGE_BOUND_IMPORT_DESCRIPTOR. A bound executable contains a list of
these structures. Each IMAGE_BOUND_IMPORT_DESCRIPTOR structure
represents the time/date stamp of one imported DLL that has been bound
against. The RVA of the list is given by the
IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT element in the DataDirectory. The
elements of the IMAGE_BOUND_IMPORT_DESCRIPTOR are:
- TimeDateStamp,
a DWORD that contains the time/date stamp of the imported DLL.
- OffsetModuleName,
a WORD that contains an offset to a string with the name of the
imported DLL. This field is an offset (not an RVA) from the first
IMAGE_BOUND_IMPORT_DESCRIPTOR.
- NumberOfModuleForwarderRefs,
a WORD that contains the number of IMAGE_BOUND_FORWARDER_REF structures
that immediately follow this structure. These structures are identical
to the IMAGE_BOUND_IMPORT_DESCRIPTOR except that the last WORD (the
NumberOfModuleForwarderRefs) is reserved.
In a simple
world, the IMAGE_BOUND_IMPORT_DESCRIPTORs for each imported DLL would
be a simple array. But, when binding against an API that's forwarded to
another DLL, the validity of the forwarded DLL has to be checked too.
Thus, the IMAGE_BOUND_FORWARDER_REF structures are interleaved with the
IMAGE_BOUND_IMPORT_DESCRIPTORs. Let's say you linked against
HeapAlloc, which is forwarded to RtlAllocateHeap in NTDLL. Then you ran
BIND on your executable. In your EXE, you'd have an
IMAGE_BOUND_IMPORT_DESCRIPTOR for KERNEL32.DLL, followed by an
IMAGE_BOUND_FORWARDER_REF for NTDLL.DLL. Immediately following that
might be additional IMAGE_BOUND_IMPORT_DESCRIPTORs for other DLLs you
imported and bound against.
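A small sketch of walking that list follows. Keep in mind that OffsetModuleName is an offset from the first descriptor, not an RVA, and that on many systems executables simply aren't bound, in which case there's nothing to print:
// List the IMAGE_BOUND_IMPORT_DESCRIPTORs of a bound module, if any.
#include <windows.h>
#include <stdio.h>

void DumpBoundImports(HMODULE hMod)
{
    BYTE *base = (BYTE *)hMod;
    PIMAGE_NT_HEADERS pNtHdrs =
        (PIMAGE_NT_HEADERS)(base + ((PIMAGE_DOS_HEADER)base)->e_lfanew);
    IMAGE_DATA_DIRECTORY dir =
        pNtHdrs->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT];
    PIMAGE_BOUND_IMPORT_DESCRIPTOR pFirst, pBound;

    if (dir.VirtualAddress == 0)
    {
        printf("No bound import information\n");
        return;
    }

    pFirst = (PIMAGE_BOUND_IMPORT_DESCRIPTOR)(base + dir.VirtualAddress);
    for (pBound = pFirst; pBound->OffsetModuleName != 0; )
    {
        printf("%s  TimeDateStamp: %08lX  ForwarderRefs: %u\n",
               (const char *)pFirst + pBound->OffsetModuleName,
               pBound->TimeDateStamp,
               (unsigned)pBound->NumberOfModuleForwarderRefs);

        // IMAGE_BOUND_FORWARDER_REF entries are the same size as the descriptor,
        // so skipping them is simple pointer arithmetic
        pBound += 1 + pBound->NumberOfModuleForwarderRefs;
    }
}

int main(void)
{
    DumpBoundImports(GetModuleHandleA("KERNEL32.DLL"));
    return 0;
}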
Delayload Data
Earlier I
described how delayloading a DLL is a hybrid approach between an
implicit import and explicitly importing APIs via LoadLibrary and
GetProcAddress. Now let's take a look at the data structures and see how
delayloading works. Remember that delayloading is not an
operating system feature. It's implemented entirely by additional code
and data added by the linker and runtime library. As such, you won't
find many references to delayloading in WINNT.H. However, you can see
definite parallels between the delayload data and regular imports data. The
delayload data is pointed to by the IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT
entry in the DataDirectory. This is an RVA to an array of ImgDelayDescr
structures, defined in DelayImp.H from Visual C++. Figure 7
shows the contents. There's one ImgDelayDescr for each delayload
imported DLL. The key thing to glean from ImgDelayDescr is that
it contains the addresses of an IAT and an INT for the DLL. These
tables are identical in format to their regular imports equivalent, only
they're written to and read by the runtime library code rather than the
operating system. When you call an API from a delayloaded DLL for the
first time, the runtime calls LoadLibrary (if necessary), and then
GetProcAddress. The resulting address is stored in the delayload IAT so
that future calls go directly to the API. There is a bit of
goofiness about the delayload data that needs explanation. In its
original incarnation in Visual C++ 6.0, all ImgDelayDescr fields
containing addresses used virtual addresses, rather than RVAs. That is,
they contained actual addresses where the delayload data could be found.
These fields are DWORDs, the size of a pointer on the x86. Now
fast-forward to IA-64 support. All of a sudden, 4 bytes isn't enough to
hold a complete address. Oops! At this point, Microsoft did the
correct thing and changed the fields containing addresses to RVAs. As
shown in Figure 7,
I've used the revised structure definitions and names. There
is still the issue of determining whether an ImgDelayDescr is using RVAs
or virtual addresses. The structure has a field to hold flag values.
When the "1" bit of the grAttrs field is on, the structure members
should be treated as RVAs. This is the only option starting with Visual
Studio® .NET and the 64-bit compiler. If that bit in grAttrs is off, the
ImgDelayDescr fields are virtual addresses.
The Resources Section
Of all the sections within a PE, the resources are the
most complicated to navigate. Here, I'll describe just the data
structures that are used to get to the raw resource data such as icons,
bitmaps, and dialogs. I won't go into the actual format of the resource
data since it's beyond the scope of this article. The resources
are found in a section called .rsrc. The IMAGE_DIRECTORY_ENTRY_RESOURCE
entry in the DataDirectory contains the RVA and size of the resources.
For various reasons, the resources are organized in a manner similar to a
file system—with directory and leaf nodes. The resource
pointer from the DataDirectory points to a structure of type
IMAGE_RESOURCE_DIRECTORY. The IMAGE_RESOURCE_DIRECTORY structure
contains unused Characteristics, TimeDateStamp, and version number
fields. The only interesting fields in an IMAGE_RESOURCE_DIRECTORY are
the NumberOfNamedEntries and the NumberOfIdEntries. Following
each IMAGE_RESOURCE_DIRECTORY structure is an array of
IMAGE_RESOURCE_DIRECTORY_ENTRY structures. Adding the
NumberOfNamedEntries and NumberOfIdEntries fields from the
IMAGE_RESOURCE_DIRECTORY yields the count of
IMAGE_RESOURCE_DIRECTORY_ENTRYs. (If all these data structure names are
painful for you to read, let me tell you, it's also awkward writing
about them!) A directory entry points to either another
resource directory or to the data for an individual resource. When the
directory entry points to another resource directory, the high bit of
the second DWORD in the structure is set and the remaining 31 bits are
an offset to the resource directory. The offset is relative to the
beginning of the resource section, not an RVA. When a directory
entry points to an actual resource instance, the high bit of the second
DWORD is clear. The remaining 31 bits are the offset to the resource
instance (for example, a dialog). Again, the offset is relative to the
resource section, not an RVA. Directory entries can be named or
identified by an ID value. This is consistent with resources in an .RC
file where you can specify a name or an ID for a resource instance. In
the directory entry, when the high bit of the first DWORD is set, the
remaining 31 bits are an offset to the string name of the resource. If
the high bit is clear, the bottom 16 bits contain the ordinal
identifier. Enough theory! Let's look at an actual resource
section and decipher what it means. Figure 8
shows abbreviated PEDUMP output for the resources in ADVAPI32.DLL. Each
line that starts with "ResDir" corresponds to an
IMAGE_RESOURCE_DIRECTORY structure. Following "ResDir" is the name of
the resource directory, in parentheses. In this example, there are
resource directories named 0, MOFDATA, MOFRESOURCENAME, STRING, C36,
RCDATA, and 66. Following the name is the combined number of directory
entries (both named and by ID). In this example, the topmost directory
has three immediate directory entries, while all the other directories
contain a single entry. In everyday use, the topmost directory
is analogous to the root directory of a file system. Each directory
entry below the "root" is always a directory in its own right. Each of
these second-level directories corresponds to a resource type (strings
tables, dialogs, menus, and so on). Underneath each of the second-level
"resource type" directories, you'll find third-level subdirectories. There's
a third-level subdirectory for each resource instance. For example, if
there were five dialogs, there would be a second-level DIALOG directory
with five directory entries beneath it. Each of the five directory
entries would themselves be a directory. The name of the directory entry
corresponds to the name or ID of the resource instance. Under each of
these directory entries is a single item which contains the offset to
the resource data. Simple, no? If you learn more efficiently by
reading code, be sure to check out the resource dumping code in PEDUMP
(see the February 2002 code download for this article). Besides
displaying all the resource directories and their entries, it also dumps
out several of the more common types of resource instances such as
dialogs.
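If you'd like to see the first level of that hierarchy in code, here's a minimal sketch that lists the top-level (resource type) entries of a module's resource directory; ADVAPI32.DLL is used to match the example above:
// List the top-level entries of a module's resource directory.
#include <windows.h>
#include <stdio.h>

void DumpTopLevelResources(HMODULE hMod)
{
    BYTE *base = (BYTE *)hMod;
    PIMAGE_NT_HEADERS pNtHdrs =
        (PIMAGE_NT_HEADERS)(base + ((PIMAGE_DOS_HEADER)base)->e_lfanew);
    IMAGE_DATA_DIRECTORY dir =
        pNtHdrs->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_RESOURCE];
    BYTE *resBase;
    PIMAGE_RESOURCE_DIRECTORY pResDir;
    PIMAGE_RESOURCE_DIRECTORY_ENTRY pEntry;
    WORD i, total;

    if (dir.VirtualAddress == 0)
        return;

    resBase = base + dir.VirtualAddress;        // entry offsets are relative to this
    pResDir = (PIMAGE_RESOURCE_DIRECTORY)resBase;
    pEntry  = (PIMAGE_RESOURCE_DIRECTORY_ENTRY)(pResDir + 1);  // entries follow the directory
    total   = pResDir->NumberOfNamedEntries + pResDir->NumberOfIdEntries;

    for (i = 0; i < total; i++, pEntry++)
    {
        if (pEntry->NameIsString)       // high bit of the first DWORD: named entry
        {
            PIMAGE_RESOURCE_DIR_STRING_U pStr =
                (PIMAGE_RESOURCE_DIR_STRING_U)(resBase + pEntry->NameOffset);
            printf("Name: %.*S", (int)pStr->Length, pStr->NameString);
        }
        else
            printf("ID:   %u", (unsigned)pEntry->Id);

        printf("  (%s)\n", pEntry->DataIsDirectory ? "subdirectory" : "data");
    }
}

int main(void)
{
    HMODULE hAdvapi = LoadLibraryA("ADVAPI32.DLL");
    if (hAdvapi)
        DumpTopLevelResources(hAdvapi);
    return 0;
}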
Base Relocations
In many locations in an
executable, you'll find memory addresses. When an executable is linked,
it's given a preferred load address. These memory addresses are only
correct if the executable loads at the preferred load address specified
by the ImageBase field in the IMAGE_OPTIONAL_HEADER structure. If
the loader needs to load the DLL at another address, all the addresses
in the executable will be incorrect. This entails extra work for the
loader. The May 2000 Under The Hood column (mentioned earlier) describes
the performance hit when DLLs have the same preferred load addresses
and how the REBASE tool can help. The base relocations tell the
loader every location in the executable that needs to be modified if
the executable doesn't load at the preferred load address. Luckily for
the loader, it doesn't need to know any details about how the address is
being used. It just knows that there's a list of locations that need to
be modified in some consistent way. Let's look at an x86-based
example to make this clear. Say you have the following instruction,
which loads the value of a global variable (at address 0x0040D434) into
the ECX register:
00401020: 8B 0D 34 D4 40 00 mov ecx,dword ptr [0x0040D434]
The instruction is at address 0x00401020 and is six bytes
long. The first two bytes (0x8B 0x0D) make up the opcode of the
instruction. The remaining four bytes hold a DWORD address (0x0040D434).
In this example, the instruction is from an executable with a preferred
load address of 0x00400000. The global variable is therefore at an RVA
of 0xD434. If the executable does load at 0x00400000, the
instruction can run exactly as is. But let's say that the executable
somehow gets loaded at address of 0x00500000. If this happens, the last
four bytes of the instruction need to be changed to 0x0050D434. How
can the loader make this change? The loader compares the preferred and
actual load addresses and calculates a delta. In this case, the delta
value is 0x00100000. This delta can be added to the value of the
DWORD-sized address to come up with the new address of the variable. In
the previous example, there would be a base relocation for address
0x00401022, which is the location of the DWORD in the instruction. In
a nutshell, base relocations are just a list of locations in an
executable where a delta value needs to be added to the existing
contents of memory. The pages of an executable are brought into memory
only as they're needed, and the format of the base relocations reflects
this. The base relocations reside in a section called .reloc, but the
correct way to find them is from the DataDirectory using the
IMAGE_DIRECTORY_ENTRY_BASERELOC entry. Base relocations are a
series of very simple IMAGE_BASE_RELOCATION structures. The
VirtualAddress field contains the RVA of the memory range to which the
relocations belong. The SizeOfBlock field indicates how many bytes make
up the relocation information for this base, including the size of the
IMAGE_BASE_RELOCATION structure. Immediately following the
IMAGE_BASE_RELOCATION structure is a variable number of WORD values. The
number of WORDs can be deduced from the SizeOfBlock field. Each WORD
consists of two parts. The top 4 bits indicate the type of relocation,
as given by the IMAGE_REL_BASED_xxx values in WINNT.H. The bottom
12 bits are an offset, relative to the VirtualAddress field, where the
relocation should be applied. In the previous example of base
relocations, I simplified things a bit. There are actually multiple
types of base relocations and methods for how they're applied. For x86
executables, all base relocations are of type IMAGE_REL_BASED_HIGHLOW.
You will often see a relocation of type IMAGE_REL_BASED_ABSOLUTE at the
end of a group of relocations. These relocations do nothing, and are
there just to pad things so that the next IMAGE_BASE_RELOCATION is
aligned on a 4-byte boundary. For IA-64 executables, the
relocations seem to always be of type IMAGE_REL_BASED_DIR64. As with x86
relocations, there will often be IMAGE_REL_BASED_ABSOLUTE relocations
used for padding. Interestingly, although pages in IA-64 EXEs are 8KB,
the base relocations are still done in 4KB chunks. In Visual
C++ 6.0, the linker omits relocations for EXEs when doing a release
build. This is because EXEs are the first thing brought into an address
space, and therefore are essentially guaranteed to load at the preferred
load address. DLLs aren't so lucky, so base relocations should always
be left in, unless you have a reason to omit them with the /FIXED
switch. In Visual Studio .NET, the linker omits base relocations for
debug and release mode EXE files.
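Here's a minimal sketch of what applying those relocations looks like. It's written from the point of view of a loader (or of code that has mapped an image itself and computed the delta between the actual and preferred base); you wouldn't run it against a normally loaded module, since the loader has already applied the fixups and the pages may be read-only. Only the x86 IMAGE_REL_BASED_HIGHLOW type is handled:
// Walk the base relocation blocks and add the load-address delta to each fixup.
#include <windows.h>

void ApplyBaseRelocations(BYTE *pImageBase, LONG_PTR delta)
{
    PIMAGE_NT_HEADERS pNtHdrs =
        (PIMAGE_NT_HEADERS)(pImageBase + ((PIMAGE_DOS_HEADER)pImageBase)->e_lfanew);
    IMAGE_DATA_DIRECTORY dir =
        pNtHdrs->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC];
    PIMAGE_BASE_RELOCATION pReloc;
    BYTE *pEnd;

    if (dir.VirtualAddress == 0 || delta == 0)
        return;

    pReloc = (PIMAGE_BASE_RELOCATION)(pImageBase + dir.VirtualAddress);
    pEnd   = pImageBase + dir.VirtualAddress + dir.Size;

    while ((BYTE *)pReloc < pEnd && pReloc->SizeOfBlock != 0)
    {
        // The WORD entries immediately follow the IMAGE_BASE_RELOCATION header
        DWORD count  = (pReloc->SizeOfBlock - sizeof(IMAGE_BASE_RELOCATION)) / sizeof(WORD);
        WORD *pEntry = (WORD *)(pReloc + 1);
        DWORD i;

        for (i = 0; i < count; i++)
        {
            WORD type   = (WORD)(pEntry[i] >> 12);      // top 4 bits: relocation type
            WORD offset = (WORD)(pEntry[i] & 0x0FFF);   // bottom 12 bits: offset in the page

            if (type == IMAGE_REL_BASED_HIGHLOW)
            {
                DWORD *pFixup = (DWORD *)(pImageBase + pReloc->VirtualAddress + offset);
                *pFixup += (DWORD)delta;
            }
            // IMAGE_REL_BASED_ABSOLUTE entries are just padding; skip them
        }

        pReloc = (PIMAGE_BASE_RELOCATION)((BYTE *)pReloc + pReloc->SizeOfBlock);
    }
}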
The Debug Directory
When an executable is built with debug information, it's customary to include details about the format of that information and where it is. The operating system doesn't require this to run the executable, but it's useful for development tools. An EXE can have multiple forms of debug information; a data structure known as the debug directory indicates what's available. The debug directory is found via the IMAGE_DIRECTORY_ENTRY_DEBUG slot in the DataDirectory. It consists of an array of IMAGE_DEBUG_DIRECTORY structures (see Figure 9), one for each type of debug information. The number of elements in the debug directory can be calculated from the Size field of that DataDirectory entry.

By far, the most prevalent form of debug information today is the PDB file. The PDB file is essentially an evolution of CodeView-style debug information. The presence of PDB information is indicated by a debug directory entry of type IMAGE_DEBUG_TYPE_CODEVIEW. If you examine the data pointed to by this entry, you'll find a short CodeView-style header. The majority of this debug data is just a path to the external PDB file. In Visual Studio 6.0, the debug header begins with an NB10 signature. In Visual Studio .NET, the header begins with an RSDS signature.

In Visual Studio 6.0, COFF debug information can be generated with the /DEBUGTYPE:COFF linker switch. This capability is gone in Visual Studio .NET. Frame Pointer Omission (FPO) debug information comes into play with optimized x86 code, where a function may not have a regular stack frame. FPO data allows the debugger to locate local variables and parameters. The two types of OMAP debug information exist only for Microsoft programs. Microsoft has an internal tool that reorganizes the code in executable files to minimize paging (yes, more than the Working Set Tuner can do). The OMAP information lets tools convert between the original addresses in the debug information and the new addresses after the code has been moved.

Incidentally, DBG files also contain a debug directory like the one I just described. DBG files were prevalent in the Windows NT 4.0 era, and they contained primarily COFF debug information. However, they've been phased out in favor of PDB files in Windows XP.
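As a rough illustration of finding the debug directory, here is a sketch of my own (not PEDUMP's actual code). It assumes the executable has been brought into memory as an image, for example with LoadLibraryEx and DONT_RESOLVE_DLL_REFERENCES, so that an RVA can simply be added to the load address.

    #include <windows.h>
    #include <stdio.h>

    // pBase is the load address of a module mapped as an image, so RVAs
    // can be added directly to it.  IMAGE_NT_HEADERS here matches the
    // bitness this code is compiled for.
    void DumpDebugDirectory( const BYTE *pBase )
    {
        const IMAGE_DOS_HEADER *pDosHdr = (const IMAGE_DOS_HEADER *)pBase;
        const IMAGE_NT_HEADERS *pNtHdr  =
            (const IMAGE_NT_HEADERS *)( pBase + pDosHdr->e_lfanew );

        const IMAGE_DATA_DIRECTORY *pDir =
            &pNtHdr->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_DEBUG];
        if ( 0 == pDir->VirtualAddress )
            return;                         // no debug directory present

        // The element count comes from the Size field of the DataDirectory slot.
        DWORD cEntries = pDir->Size / sizeof(IMAGE_DEBUG_DIRECTORY);

        const IMAGE_DEBUG_DIRECTORY *pDbgDir =
            (const IMAGE_DEBUG_DIRECTORY *)( pBase + pDir->VirtualAddress );

        for ( DWORD i = 0; i < cEntries; i++, pDbgDir++ )
        {
            // A Type of IMAGE_DEBUG_TYPE_CODEVIEW usually means the raw data
            // is a small CodeView-style header followed by a path to the PDB.
            printf( "Type: %lu  SizeOfData: %lu  AddressOfRawData: %08lX\n",
                    pDbgDir->Type, pDbgDir->SizeOfData, pDbgDir->AddressOfRawData );
        }
    }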
The .NET Header
Executables produced for the Microsoft .NET environment are first and foremost PE files. However, in most cases the normal code and data in a .NET file are minimal. The primary purpose of a .NET executable is to get the .NET-specific information, such as metadata and intermediate language (IL), into memory. In addition, a .NET executable links against MSCOREE.DLL. This DLL is the starting point for a .NET process. When a .NET executable loads, its entry point is usually a tiny stub of code that just jumps to an exported function in MSCOREE.DLL (_CorExeMain or _CorDllMain). From there, MSCOREE takes charge and starts using the metadata and IL from the executable file. This setup is similar to the way apps in Visual Basic (prior to .NET) used MSVBVM60.DLL.

The starting point for .NET information is the IMAGE_COR20_HEADER structure, currently defined in CorHDR.H from the .NET Framework SDK and in more recent versions of WINNT.H. The IMAGE_COR20_HEADER is pointed to by the IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR entry in the DataDirectory. Figure 10 shows the fields of an IMAGE_COR20_HEADER. The format of the metadata, method IL, and other things pointed to by the IMAGE_COR20_HEADER will be described in a subsequent article.
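Here is a small sketch of my own showing how that header can be located, under the same mapped-as-image assumption as before. It requires a CorHDR.H or WINNT.H recent enough to define IMAGE_COR20_HEADER, and prints only a few of the fields from Figure 10.

    #include <windows.h>
    #include <stdio.h>

    // Locate the .NET header of a module mapped as an image.
    void DumpCorHeader( const BYTE *pBase )
    {
        const IMAGE_DOS_HEADER *pDosHdr = (const IMAGE_DOS_HEADER *)pBase;
        const IMAGE_NT_HEADERS *pNtHdr  =
            (const IMAGE_NT_HEADERS *)( pBase + pDosHdr->e_lfanew );

        const IMAGE_DATA_DIRECTORY *pDir =
            &pNtHdr->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR];
        if ( 0 == pDir->VirtualAddress )
            return;                         // not a .NET executable

        const IMAGE_COR20_HEADER *pCorHdr =
            (const IMAGE_COR20_HEADER *)( pBase + pDir->VirtualAddress );

        printf( "Runtime version: %u.%u\n",
                pCorHdr->MajorRuntimeVersion, pCorHdr->MinorRuntimeVersion );
        printf( "Flags:           %08lX\n", pCorHdr->Flags );
        printf( "Metadata RVA:    %08lX  (size %lu)\n",
                pCorHdr->MetaData.VirtualAddress, pCorHdr->MetaData.Size );
    }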
TLS Initialization
When you use thread local variables declared with __declspec(thread), the compiler puts them in a section named .tls. When the system sees a new thread starting, it allocates memory from the process heap to hold the thread local variables for that thread. This memory is initialized from the values in the .tls section. The system also puts a pointer to the allocated memory in the TLS array, pointed to by FS:[2Ch] (on the x86 architecture).

The presence of thread local storage (TLS) data in an executable is indicated by a nonzero IMAGE_DIRECTORY_ENTRY_TLS entry in the DataDirectory. If nonzero, the entry points to an IMAGE_TLS_DIRECTORY structure, shown in Figure 11. It's important to note that the addresses in the IMAGE_TLS_DIRECTORY structure are virtual addresses, not RVAs. Thus, they will get modified by base relocations if the executable doesn't load at its preferred load address. Also, the IMAGE_TLS_DIRECTORY itself is not in the .tls section; it resides in the .rdata section.
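A short sketch of my own follows, again assuming the module is mapped as an image and, additionally, that it loaded at its preferred base so no relocations have been applied to these fields. It converts the virtual addresses stored in a 32-bit IMAGE_TLS_DIRECTORY back into RVAs by subtracting the preferred ImageBase from the optional header.

    #include <windows.h>
    #include <stdio.h>

    // Dump the TLS directory of a 32-bit module mapped as an image at its
    // preferred load address.  The fields hold virtual addresses, not RVAs.
    void DumpTlsDirectory( const BYTE *pBase )
    {
        const IMAGE_DOS_HEADER *pDosHdr  = (const IMAGE_DOS_HEADER *)pBase;
        const IMAGE_NT_HEADERS32 *pNtHdr =
            (const IMAGE_NT_HEADERS32 *)( pBase + pDosHdr->e_lfanew );

        const IMAGE_DATA_DIRECTORY *pDir =
            &pNtHdr->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_TLS];
        if ( 0 == pDir->VirtualAddress )
            return;                         // no __declspec(thread) data

        const IMAGE_TLS_DIRECTORY32 *pTlsDir =
            (const IMAGE_TLS_DIRECTORY32 *)( pBase + pDir->VirtualAddress );

        // Subtract the preferred load address to turn the virtual addresses
        // back into RVAs.
        DWORD imageBase = pNtHdr->OptionalHeader.ImageBase;

        printf( "Raw data start RVA: %08lX\n",
                pTlsDir->StartAddressOfRawData - imageBase );
        printf( "Raw data end RVA:   %08lX\n",
                pTlsDir->EndAddressOfRawData - imageBase );
        printf( "Index variable RVA: %08lX\n",
                pTlsDir->AddressOfIndex - imageBase );
        printf( "Callbacks RVA:      %08lX\n",
                pTlsDir->AddressOfCallBacks - imageBase );
    }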
Program Exception Data
Some architectures (including the IA-64) don't use frame-based exception handling the way the x86 does; instead, they use table-based exception handling, in which there is a table containing information about every function that might be affected by exception unwinding. The data for each function includes the starting address, the ending address, and information about how and where the exception should be handled. When an exception occurs, the system searches through the table to locate the appropriate entry and handles it.

The exception table is an array of IMAGE_RUNTIME_FUNCTION_ENTRY structures. The array is pointed to by the IMAGE_DIRECTORY_ENTRY_EXCEPTION entry in the DataDirectory. The format of the IMAGE_RUNTIME_FUNCTION_ENTRY structure varies from architecture to architecture. For the IA-64, the layout looks like this:
    DWORD BeginAddress;
    DWORD EndAddress;
    DWORD UnwindInfoAddress;
The format of the UnwindInfoAddress data isn't given in
WINNT.H. However, the format can be found in Chapter 11 of the "IA-64
Software Conventions and Runtime Architecture Guide" from Intel.
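To illustrate, here is a sketch of mine that walks such a table. Rather than relying on a particular WINNT.H typedef, it defines a local structure mirroring the three-DWORD layout shown above, and it uses the same mapped-as-image assumption as the earlier sketches.

    #include <windows.h>
    #include <stdio.h>

    // Local mirror of the IA-64 per-function entry layout shown above.
    struct RUNTIME_FUNCTION_IA64
    {
        DWORD BeginAddress;
        DWORD EndAddress;
        DWORD UnwindInfoAddress;
    };

    // Walk the exception table of a module mapped as an image.
    void DumpExceptionTable( const BYTE *pBase )
    {
        const IMAGE_DOS_HEADER *pDosHdr = (const IMAGE_DOS_HEADER *)pBase;
        const IMAGE_NT_HEADERS *pNtHdr  =
            (const IMAGE_NT_HEADERS *)( pBase + pDosHdr->e_lfanew );

        const IMAGE_DATA_DIRECTORY *pDir =
            &pNtHdr->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXCEPTION];
        if ( 0 == pDir->VirtualAddress )
            return;                         // no exception table

        const RUNTIME_FUNCTION_IA64 *pFunc =
            (const RUNTIME_FUNCTION_IA64 *)( pBase + pDir->VirtualAddress );
        DWORD cEntries = pDir->Size / sizeof(RUNTIME_FUNCTION_IA64);

        for ( DWORD i = 0; i < cEntries; i++, pFunc++ )
            printf( "Begin: %08lX  End: %08lX  UnwindInfo: %08lX\n",
                    pFunc->BeginAddress, pFunc->EndAddress,
                    pFunc->UnwindInfoAddress );
    }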
The PEDUMP Program
My PEDUMP program (available for
download with Part 1 of this article) is significantly improved from the
1994 version. It displays every data structure described in this
article, including:
- IMAGE_NT_HEADERS
- Imports / Exports
- Resources
- Base relocations
- Debug directory
- Delayload imports
- Bound import descriptors
- IA-64 exception handling tables
- TLS initialization data
- .NET runtime header
In addition to dumping PE executables, PEDUMP can also dump COFF format OBJ files, COFF import libraries (new and old formats), COFF symbol tables, and DBG files. PEDUMP is a command-line program. Running it without any options on one of the file types just described produces a default dump that contains the more useful data structures. There are several command-line options for additional output (see Figure 12).

The PEDUMP source code is interesting for a couple of reasons. It compiles and runs as either a 32-bit or 64-bit executable, so if you have an Itanium box handy, give it a whirl! In addition, PEDUMP can dump both 32 and 64-bit executables, regardless of how it was compiled. In other words, the 32-bit version can dump 32 and 64-bit files, and the 64-bit version can do the same.

In thinking about making PEDUMP work on both 32 and 64-bit files, I wanted to avoid having two copies of every function, one for the 32-bit form of a structure and another for the 64-bit form. The solution was to use C++ templates; a simplified sketch of the approach appears at the end of this section. In several files (EXEDUMP.CPP in particular), you'll find various template functions. In most cases, the template function has a template parameter that expands to either an IMAGE_NT_HEADERS32 or an IMAGE_NT_HEADERS64. When invoking these functions, the code determines whether the executable file is 32 or 64-bit and calls the appropriate function with the appropriate parameter type, causing the appropriate template expansion.

With the PEDUMP sources, you'll find a Visual C++ 6.0 project file. Besides the traditional x86 debug and release configurations, there's also a 64-bit build configuration. To get this to work, you'll need to add the path to the 64-bit tools (currently in the Platform SDK) at the top of the Executable path under the Tools | Options | Directories tab. You'll also need to make sure that the paths to the 64-bit Include and Lib directories are set correctly. My project file has the correct settings for my machine, but you may need to change them to build on your machine.

In order to make PEDUMP as complete as possible, it was necessary to use the latest versions of the Windows header files. In the June 2001 Platform SDK, which I developed against, these files are in the .\include\prerelease and .\Include\Win64\crt\ directories. In the August 2001 SDK, there's no need to use the prerelease directories, since WINNT.H has been updated. The essential point is that the code does build; you may just need to have a recent enough Platform SDK installed or modify the project directories when building the 64-bit version.
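As a simplified sketch of the template approach described earlier in this section (my own illustration, not the actual PEDUMP source), a single template function can accept either header layout, and the caller picks the instantiation based on the Magic field of the optional header:

    #include <windows.h>
    #include <stdio.h>

    // T expands to either IMAGE_NT_HEADERS32 or IMAGE_NT_HEADERS64; the
    // same function body works for both layouts.
    template <class T>
    void DumpOptionalHeader( const T *pNtHdr )
    {
        printf( "Image base:        %I64X\n",
                (unsigned __int64)pNtHdr->OptionalHeader.ImageBase );
        printf( "Section alignment: %08lX\n",
                pNtHdr->OptionalHeader.SectionAlignment );
    }

    // The Magic field sits at the same offset in both optional header
    // formats, so it's safe to examine it through the 32-bit structure
    // before deciding which instantiation to call.
    void DumpExeFile( PIMAGE_DOS_HEADER pDosHdr )
    {
        PIMAGE_NT_HEADERS32 pNtHdr32 =
            (PIMAGE_NT_HEADERS32)( (PBYTE)pDosHdr + pDosHdr->e_lfanew );

        if ( IMAGE_NT_OPTIONAL_HDR32_MAGIC == pNtHdr32->OptionalHeader.Magic )
            DumpOptionalHeader( pNtHdr32 );
        else if ( IMAGE_NT_OPTIONAL_HDR64_MAGIC == pNtHdr32->OptionalHeader.Magic )
            DumpOptionalHeader( (PIMAGE_NT_HEADERS64)pNtHdr32 );
    }

PEDUMP's actual functions in EXEDUMP.CPP do much more than this, but the dispatch-on-Magic idea is the one described above.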
Wrap-up
The Portable Executable format is a
well-structured and relatively simple executable format. It's
particularly nice that PE files can be mapped directly into memory so
that the data structures on disk are the same as those Windows uses at
runtime. I've also been surprised at how well the PE format has held up
with all the various changes that have been thrown at it in the past 10
years, including the transition to 64-bit Windows and .NET. Although
I've covered many aspects of PE files, there are still topics that I
haven't gotten to. There are flags, attributes, and data structures that
occur infrequently enough that I decided not to describe them here.
However, I hope that this "big picture" introduction to PE files has
made the Microsoft PE specifications easier for you to understand.
For related articles see:
  Part 1 of this series
  "Peering Inside the PE: A Tour of the Win32 Portable Executable File Format"
For background information see:
  "The Common Object File Format (COFF)"