elva

Building the Smallest PE File

                     (bkbll#cnhonker.net 2005-9-18 9:01)

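I. Preface
       I have recently been tinkering with the PE file format on Windows. Once I had reached
   my original goal, I became curious about how small a PE file can get. I then came across
   an article that watercloud (watercloud_at_xfocus.org) wrote nearly two years ago,
   <<手工打造微型Win32可执行文件>> ("Hand-Crafting a Tiny Win32 Executable",
   http://www.xfocus.net/articles/200302/482.html), and decided to follow its lead and build
   what I believe is the smallest PE file. This is my first contact with the PE format, so
   corrections are welcome.
       Every program in this article was tested on win2k sp4 cn and windows xp sp1 cn.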

II. PE file format and structures
       winnt.h defines all of the PE structures, so I will not reproduce them here and only
   list the names of the relevant ones:
   IMAGE_DOS_HEADER,IMAGE_NT_HEADERS,IMAGE_FILE_HEADER,IMAGE_OPTIONAL_HEADER,
   IMAGE_DATA_DIRECTORY,IMAGE_SECTION_HEADER,IMAGE_IMPORT_DESCRIPTOR
   Because the goal is the smallest possible PE file, only the IMPORT table is used.
   The overall layout of a PE file is roughly as follows:

   | IMAGE_DOS_HEADER |
                               |       Signature       |
   | IMAGE_NT_HEADERS |   ->   |   IMAGE_FILE_HEADER   |
                               | IMAGE_OPTIONAL_HEADER |   ->   | IMAGE_DATA_DIRECTORY |
                                                                        ......

   | IMAGE_SECTION_HEADER |

         ........

   |     code section     |
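
   As a rough cross-check of this layout, the sizes of the structures can be computed from
   their winnt.h field lists. The Python sketch below is not part of the original article;
   the format strings and the helper name headers_size are shorthand I am assuming for the
   32-bit layouts.

```python
import struct

# struct format strings mirroring the 32-bit winnt.h layouts (little-endian, no padding).
DOS_HEADER      = "<30Hl"            # 30 WORDs (e_magic .. e_res2[10]) + LONG e_lfanew
FILE_HEADER     = "<2H3I2H"          # Machine .. Characteristics
OPTIONAL_HEADER = "<H2B9I6H4I2H6I"   # Magic .. NumberOfRvaAndSizes, without DataDirectory
DATA_DIRECTORY  = "<2I"              # VirtualAddress, Size
SECTION_HEADER  = "<8s6I2HI"         # Name .. Characteristics
IMPORT_DESC     = "<5I"              # OriginalFirstThunk .. FirstThunk


def headers_size(e_lfanew, num_dirs, num_sections):
    """Bytes from the start of the file to the end of the last section header."""
    return (e_lfanew
            + 4                                   # "PE\0\0" signature
            + struct.calcsize(FILE_HEADER)
            + struct.calcsize(OPTIONAL_HEADER)
            + num_dirs * struct.calcsize(DATA_DIRECTORY)
            + num_sections * struct.calcsize(SECTION_HEADER))


for name, fmt in [("IMAGE_DOS_HEADER", DOS_HEADER),
                  ("IMAGE_FILE_HEADER", FILE_HEADER),
                  ("IMAGE_OPTIONAL_HEADER (fixed part)", OPTIONAL_HEADER),
                  ("IMAGE_DATA_DIRECTORY", DATA_DIRECTORY),
                  ("IMAGE_SECTION_HEADER", SECTION_HEADER),
                  ("IMAGE_IMPORT_DESCRIPTOR", IMPORT_DESC)]:
    print(f"{name}: {struct.calcsize(fmt)} bytes")

# Conventional layout: NT headers right after the DOS header, 16 directories, one section.
print(hex(headers_size(0x40, 16, 1)))
```

   With e_lfanew = 0x40, 16 data directories and one section, the headers end at 0x160,
   which is where the section data begins in the first dump further down.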

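III. What I do differently
        watercloud's PE file is already quite small, but I handle a few things differently:
        1. Windows XP accepts a PE file with a single section (tested on xp sp1 cn).
        2. Windows requires the file alignment to be a power of 2, so it can be smaller than 0x200.
    Besides these two points, I also use one rather neat trick.
    
    When run, the PE file prints a Hello,world message to the screen.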
    
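IV. The build process
    1. Step 1:
        We start by filling in the structures one by one, in the order the PE format lays
    them out, to see how large the result is.
        We first choose an alignment of 0x10 (the value of both SectionAlignment and
    FileAlignment in the dump below).
        Here IMAGE_OPTIONAL_HEADER.DataDirectory keeps all 16 entries, but only the
    IMPORT table entry is used.
        
        There is no trick in this step. With a single section and a much smaller file
    alignment, the file comes to 496 bytes, 47 of which are our assembly code.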
           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000h: 4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF 00 00 ; MZ..............
00000010h: B8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 ; ........@.......
00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000030h: 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00 ; ............@...
00000040h: 50 45 00 00 4C 01 01 00 00 00 00 00 00 00 00 00 ; PE..L...........
00000050h: 00 00 00 00 E0 00 0F 01 0B 01 06 00 00 00 00 00 ; ................
00000060h: 00 00 00 00 00 00 00 00 B4 01 00 00 00 00 00 00 ; ................
00000070h: 00 00 00 00 00 00 40 00 10 00 00 00 10 00 00 00 ; ......@.........
00000080h: 04 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 ; ................
00000090h: 00 10 00 00 00 00 00 00 00 00 00 00 03 00 00 00 ; ................
000000a0h: 00 00 10 00 00 10 00 00 00 00 10 00 00 10 00 00 ; ................
000000b0h: 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 ; ................
000000c0h: 60 01 00 00 28 00 00 00 00 00 00 00 00 00 00 00 ; `...(...........
000000d0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000000e0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000000f0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000100h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000110h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000120h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000130h: 00 00 00 00 00 00 00 00 2E 74 65 78 74 00 00 00 ; .........text...
00000140h: 00 08 00 00 60 01 00 00 00 08 00 00 60 01 00 00 ; ....`.......`...
00000150h: 00 00 00 00 00 00 00 00 00 00 00 00 20 00 00 E0 ; ............ ...
00000160h: 88 01 00 00 00 00 00 00 00 00 00 00 98 01 00 00 ; ................
00000170h: 90 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000180h: 00 00 00 00 00 00 00 00 A8 01 00 00 00 00 00 00 ; ................
00000190h: A8 01 00 00 00 00 00 00 6B 65 72 6E 65 6C 33 32 ; ........kernel32
000001a0h: 2E 64 6C 6C 00 00 00 00 00 00 57 72 69 74 65 46 ; .dll......WriteF
000001b0h: 69 6C 65 00 8B 43 10 8B 40 1C 33 D2 52 68 72 6C ; ile..C..@.3.Rhrl
000001c0h: 64 0A 68 6F 2C 77 6F 68 68 65 6C 6C 8B CC 52 54 ; d.ho,wohhell..RT
000001d0h: 6A 0C 51 50 68 90 01 00 00 58 03 43 08 FF 10 83 ; j.QPh....X.C....
000001e0h: C4 10 C3 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
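
    The offsets in the dump above can be reproduced by placing the section contents back to
    back after the headers. This is only a sketch: the item names and the helper lay_out are
    mine, the item sizes are read off the dump, and the alignment of 0x10 is the value found
    in the dump's SectionAlignment and FileAlignment fields.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def lay_out(start, items):
    """Place items back to back from `start`; return ({name: offset}, end offset)."""
    offsets, pos = {}, start
    for name, size in items:
        offsets[name] = pos
        pos += size
    return offsets, pos


HEADERS_END = 0x160   # DOS header + NT headers (16 directories) + one section header
SECTION_ITEMS = [
    ("import descriptors", 2 * 20),    # one real descriptor + an all-zero terminator
    ("OriginalFirstThunk array", 8),   # one entry + zero terminator
    ("FirstThunk array (IAT)", 8),     # one entry + zero terminator
    ("dll name", 16),                  # "kernel32.dll" padded with zero bytes
    ("hint/name", 12),                 # 2-byte hint + "WriteFile" + NUL
    ("code", 47),
]

offsets, end = lay_out(HEADERS_END, SECTION_ITEMS)
for name, off in offsets.items():
    print(f"{off:#06x}  {name}")
print("file size:", align_up(end, 0x10))
```

    The computed positions (descriptors at 0x160, name at 0x198, code at 0x1b4) match the
    RVAs stored in the dump, and rounding the end up to 0x10 gives the 496 bytes quoted above.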
    
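    2. Step 2:
       Next we shrink the DataDirectory array of IMAGE_OPTIONAL_HEADER. Only the import
   table is used, so IMAGE_OPTIONAL_HEADER.NumberOfRvaAndSizes can be 2. That drops 14
   entries of 8 bytes each, i.e. 0x70 bytes, and the file comes to 384 bytes, 47 of which are
   our assembly code. There is no trick here either and the result looks much like the
   previous one, so the file contents are not shown.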
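
   The arithmetic behind this step, as a small sketch. The constant and function names are
   mine; the 96-byte fixed part and the 8-byte directory entry are the 32-bit winnt.h sizes.
   The dumps in this article show FileHeader.SizeOfOptionalHeader as 0xE0 with 16 entries and
   0x70 with 2.

```python
OPTIONAL_HEADER_FIXED = 96   # IMAGE_OPTIONAL_HEADER bytes before DataDirectory (32-bit)
DATA_DIRECTORY_SIZE = 8      # sizeof(IMAGE_DATA_DIRECTORY)


def size_of_optional_header(num_dirs):
    """Value for IMAGE_FILE_HEADER.SizeOfOptionalHeader with num_dirs directory entries."""
    return OPTIONAL_HEADER_FIXED + num_dirs * DATA_DIRECTORY_SIZE


saved = size_of_optional_header(16) - size_of_optional_header(2)
print(hex(size_of_optional_header(16)))   # all 16 entries
print(hex(size_of_optional_header(2)))    # export + import entries only
print(hex(saved), 496 - saved)            # bytes saved, new file size
```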
  
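    3. Step 3:
       Looking at the 0x40-byte IMAGE_DOS_HEADER, we find that apart from the two fields
    e_magic and e_lfanew, nothing in it seems to matter to our mini-PE. Can the unused part
    be put to work? It can: I decided to overlap IMAGE_NT_HEADERS with IMAGE_DOS_HEADER.
    e_lfanew is the only value that records the offset of IMAGE_NT_HEADERS, so it must not
    be overwritten. And because the two headers overlap, the file offset of e_lfanew has to
    coincide with an IMAGE_NT_HEADERS field whose value can be ignored.
    Let us look at the first 0x40 bytes of IMAGE_NT_HEADERS: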
typedef struct _IMAGE_NT_HEADERS
{
    DWORD Signature;                     //+0
    IMAGE_FILE_HEADER FileHeader;
    IMAGE_OPTIONAL_HEADER OptionalHeader;
} IMAGE_NT_HEADERS, *PIMAGE_NT_HEADERS;

typedef struct _IMAGE_FILE_HEADER
{
    WORD Machine;                //+4
    WORD NumberOfSections;            //+6
    DWORD TimeDateStamp;            //+8
    DWORD PointerToSymbolTable;        //+12
    DWORD NumberOfSymbols;            //+16
    WORD SizeOfOptionalHeader;        //+20
    WORD Characteristics;            //+22
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;

typedef struct _IMAGE_OPTIONAL_HEADER
{
    WORD    Magic;                //+24
    BYTE    MajorLinkerVersion;        //+26
    BYTE    MinorLinkerVersion;        //+27  
    DWORD   SizeOfCode;            //+28
    DWORD   SizeOfInitializedData;        //+32
    DWORD   SizeOfUninitializedData;    //+36
    DWORD   AddressOfEntryPoint;        //+40
    DWORD   BaseOfCode;             //+44
    DWORD   BaseOfData;            //+48
    DWORD   ImageBase;            //+52
    DWORD   SectionAlignment;         //+56
    DWORD   FileAlignment;            //+60
    WORD    MajorOperatingSystemVersion;    //+64
    ..........
}  
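       e_lfanew sits at offset 0x3c = 60 of IMAGE_DOS_HEADER. Scanning the IMAGE_NT_HEADERS
   fields backwards from +56 for one that can be overwritten and serves no purpose, the
   nearest candidate appears to be BaseOfData (+48). Putting BaseOfData on top of e_lfanew
   gives e_lfanew = 60 - 48 = 12 = 0xc.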
  
   The overlapped IMAGE_DOS_HEADER and IMAGE_NT_HEADERS are laid out as follows (first
   offset: position in the file; second offset: position within IMAGE_NT_HEADERS):
WORD   e_magic;                 //+0
WORD   e_cblp;                  //+2
WORD   e_cp;                    //+4
WORD   e_crlc;                  //+6
WORD   e_cparhdr;               //+8
WORD   e_minalloc;              //+10
WORD   e_maxalloc; WORD e_ss;   //+12 IMAGE_NT_HEADERS.Signature                      //+0
WORD   e_sp;                    //+16 IMAGE_FILE_HEADER.Machine                       //+4
WORD   e_csum;                  //+18 IMAGE_FILE_HEADER.NumberOfSections              //+6
WORD   e_ip; WORD e_cs;         //+20 IMAGE_FILE_HEADER.TimeDateStamp                 //+8
WORD   e_lfarlc; WORD e_ovno;   //+24 IMAGE_FILE_HEADER.PointerToSymbolTable          //+12
WORD   e_res[4];                //+28 IMAGE_FILE_HEADER.NumberOfSymbols               //+16
                                //+32 IMAGE_FILE_HEADER.SizeOfOptionalHeader          //+20
                                //+34 IMAGE_FILE_HEADER.Characteristics               //+22
WORD   e_oemid;                 //+36 IMAGE_OPTIONAL_HEADER.Magic                     //+24
WORD   e_oeminfo;               //+38 IMAGE_OPTIONAL_HEADER.MajorLinkerVersion        //+26
                                //+39 IMAGE_OPTIONAL_HEADER.MinorLinkerVersion        //+27
WORD   e_res2[10];              //+40 IMAGE_OPTIONAL_HEADER.SizeOfCode                //+28
                                //+44 IMAGE_OPTIONAL_HEADER.SizeOfInitializedData     //+32
                                //+48 IMAGE_OPTIONAL_HEADER.SizeOfUninitializedData   //+36
                                //+52 IMAGE_OPTIONAL_HEADER.AddressOfEntryPoint       //+40
                                //+56 IMAGE_OPTIONAL_HEADER.BaseOfCode                //+44
LONG   e_lfanew;                //+60 IMAGE_OPTIONAL_HEADER.BaseOfData                //+48
                                //+64 IMAGE_OPTIONAL_HEADER.ImageBase                 //+52

    Overlapping just this part already saves some space: the file is now 336 bytes, 47 of
    which are our assembly code.
        File contents:
           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F        
00000000h: 4D 5A 90 00 03 00 00 00 04 00 00 00 50 45 00 00 ; MZ..........PE..
00000010h: 4C 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 ; L...............
00000020h: 70 00 0F 01 0B 01 06 00 00 00 00 00 00 00 00 00 ; p...............
00000030h: 00 00 00 00 14 01 00 00 00 00 00 00 0C 00 00 00 ; ................
00000040h: 00 00 40 00 10 00 00 00 10 00 00 00 04 00 00 00 ; ..@.............
00000050h: 00 00 00 00 04 00 00 00 00 00 00 00 00 10 00 00 ; ................
00000060h: 00 00 00 00 00 00 00 00 03 00 00 00 00 00 10 00 ; ................
00000070h: 00 10 00 00 00 00 10 00 00 10 00 00 00 00 00 00 ; ................
00000080h: 02 00 00 00 00 00 00 00 00 00 00 00 C0 00 00 00 ; ................
00000090h: 28 00 00 00 2E 74 65 78 74 00 00 00 00 08 00 00 ; (....text.......
000000a0h: C0 00 00 00 00 08 00 00 C0 00 00 00 00 00 00 00 ; ................
000000b0h: 00 00 00 00 00 00 00 00 20 00 00 E0 00 00 00 00 ; ........ .......
000000c0h: E8 00 00 00 00 00 00 00 00 00 00 00 F8 00 00 00 ; ................
000000d0h: F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000000e0h: 00 00 00 00 00 00 00 00 08 01 00 00 00 00 00 00 ; ................
000000f0h: 08 01 00 00 00 00 00 00 6B 65 72 6E 65 6C 33 32 ; ........kernel32
00000100h: 2E 64 6C 6C 00 00 00 00 00 00 57 72 69 74 65 46 ; .dll......WriteF
00000110h: 69 6C 65 00 8B 43 10 8B 40 1C 33 D2 52 68 72 6C ; ile..C..@.3.Rhrl
00000120h: 64 0A 68 6F 2C 77 6F 68 68 65 6C 6C 8B CC 52 54 ; d.ho,wohhell..RT
00000130h: 6A 0C 51 50 68 F0 00 00 00 58 03 43 08 FF 10 83 ; j.QPh....X.C....
00000140h: C4 10 C3 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
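
    To see the overlap at work, the first 0x40 bytes of the dump above can be read once as a
    DOS header and once as NT headers. The following is an illustrative sketch only; the
    variable names are mine and the field offsets are the ones listed in the structures above.

```python
import struct

# First 0x40 bytes of the 336-byte file shown above.
HEAD = bytes.fromhex(
    "4D 5A 90 00 03 00 00 00 04 00 00 00 50 45 00 00 "
    "4C 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "70 00 0F 01 0B 01 06 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 14 01 00 00 00 00 00 00 0C 00 00 00 "
)

# Read as a DOS header: only e_magic and e_lfanew matter.
e_magic = HEAD[0:2]
(e_lfanew,) = struct.unpack_from("<l", HEAD, 0x3C)

# Read as NT headers starting at e_lfanew.
signature = HEAD[e_lfanew:e_lfanew + 4]
machine, num_sections = struct.unpack_from("<2H", HEAD, e_lfanew + 4)
size_of_opt, characteristics = struct.unpack_from("<2H", HEAD, e_lfanew + 20)
opt = e_lfanew + 24                                   # start of IMAGE_OPTIONAL_HEADER
(entry_point,) = struct.unpack_from("<I", HEAD, opt + 16)
base_of_data_offset = opt + 24                        # file offset of BaseOfData

print(e_magic, hex(e_lfanew), signature)
print(hex(machine), num_sections, hex(size_of_opt), hex(entry_point))
print(hex(base_of_data_offset))   # the same file offset as e_lfanew (0x3c)
```

    BaseOfData and e_lfanew share the four bytes at 0x3c, so the value 0x0c is read both as
    the offset of the NT headers and as a BaseOfData value.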
    
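    4. Step 4:
       The file is much smaller now, and little is left to overlap in the leading structures
    IMAGE_DOS_HEADER, IMAGE_NT_HEADERS and IMAGE_SECTION_HEADER. What remains is the data
    that describes the IMPORT table.
  
    So far the import table has been laid out like this:
        import descriptor                   // PEDataDir->Size bytes
        OriginalFirstThunk + 0x0000         // 8 bytes
        FirstThunk(IAT) + 0x0000            // 8 bytes
        "kernel32.dll" + 0x0                // 12 + 4 = 16 bytes
        0x00 + iatfunction1("WriteFile")    // 2 + 10 = 12 bytes
    That takes PEDataDir->Size + 8 + 8 + 16 + 12 bytes, which is rather a lot, so this is
    what we attack next. Let us see whether it can be folded into structures that already exist.
    First adjust the contents of the only section:
    IMAGE_SECTION_HEADER.VirtualAddress = 0x00;
    IMAGE_SECTION_HEADER.PointerToRawData = 0x00;
    The section now covers the file from offset 0, so every structure in the file can be
    addressed, and an RVA is simply the file offset.
    The import descriptor needs at least sizeof(IMAGE_IMPORT_DESCRIPTOR) * 2 bytes,
    and the last 4 bytes must be 0.
    Comparing the existing structures, we find that IMAGE_OPTIONAL_HEADER meets this
    requirement from DllCharacteristics onward, so the import descriptor points here:
    WORD    DllCharacteristics;
    DWORD   SizeOfStackReserve;
    DWORD   SizeOfStackCommit;
    DWORD   SizeOfHeapReserve;
    DWORD   SizeOfHeapCommit;
    DWORD   LoaderFlags;
    DWORD   NumberOfRvaAndSizes;
    IMAGE_DATA_DIRECTORY DataDirectory[0];
    The first DataDirectory entry happens to be all zeros, which is exactly what we need.
    What remains is to find 8 bytes of space whose first 4 bytes can change without
    affecting execution (they hold the first IAT entry) and whose second 4 bytes are 0.
    This space holds the thunk array that both OriginalFirstThunk and FirstThunk of
    IMAGE_IMPORT_DESCRIPTOR point to; for our purposes the two addresses can obviously be equal.
    Fortunately, IMAGE_IMPORT_DESCRIPTOR's own layout satisfies this requirement: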
typedef struct _IMAGE_IMPORT_DESCRIPTOR
{
    union
    {
        DWORD   Characteristics;
        DWORD   OriginalFirstThunk;
    };
    DWORD   TimeDateStamp;
    DWORD   ForwarderChain;
    DWORD   Name;
    DWORD   FirstThunk;
} IMAGE_IMPORT_DESCRIPTOR;
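    When ForwarderChain is 0, the slot at the TimeDateStamp offset can hold the IAT entry we need.
    OK, two things are still unresolved: the DLL name and the name of the imported function.
    As before, we look for usable space in the existing structures.
    The requirement for the DLL name is simple: it must end with '\0', and the bytes it
    occupies must not affect execution.
    IMAGE_SECTION_HEADER seems to fit:
    
    DWORD   PointerToRelocations;
    DWORD   PointerToLinenumbers;
    WORD    NumberOfRelocations;
    WORD    NumberOfLinenumbers;
    DWORD    Characteristics;
    That gives 12 bytes for the DLL name, and one byte of the Characteristics field that
    follows can also be used, so 13 bytes are available, exactly enough for "Kernel32.dll".
    IMAGE_FILE_HEADER also has 12 bytes of space: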

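    DWORD TimeDateStamp;                 //+8 can hold anything
    DWORD PointerToSymbolTable;          //+12
    DWORD NumberOfSymbols;               //+16
    
    so the function name goes there.
    
    The IMAGE_IMPORT_DESCRIPTOR and the rest of the import data have now been taken apart
    and folded into existing structures.
    Next we optimize the shellcode by calling printf from msvcrt.dll to print the message.
    
    After this careful trimming, the whole PE file is 224 bytes, 28 of which are assembly code.
    Final result: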
           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F    
00000000h: 4D 5A 90 00 03 00 00 00 04 00 00 00 50 45 00 00 ; MZ..........PE..
00000010h: 4C 01 01 00 00 00 70 72 69 6E 74 66 00 00 00 00 ; L.....printf....
00000020h: 70 00 0F 01 0B 01 06 00 00 00 00 00 00 00 00 00 ; p...............
00000030h: 00 00 00 00 C0 00 00 00 00 00 00 00 0C 00 00 00 ; ................
00000040h: 00 00 40 00 10 00 00 00 10 00 00 00 04 00 00 00 ; ..@.............
00000050h: 00 00 00 00 04 00 00 00 00 00 00 00 00 10 00 00 ; ................
00000060h: 00 00 00 00 00 00 00 00 03 00 6E 00 00 00 14 00 ; ..........n.....
00000070h: 00 00 00 00 00 00 AC 00 00 00 6E 00 00 00 00 00 ; ..........n.....
00000080h: 02 00 00 00 00 00 00 00 00 00 00 00 6A 00 00 00 ; ............j...
00000090h: 14 00 00 00 2E 74 65 78 74 00 00 00 00 08 00 00 ; .....text.......
000000a0h: 00 00 00 00 00 08 00 00 00 00 00 00 6D 73 76 63 ; ............msvc
000000b0h: 72 74 2E 64 6C 6C 00 00 20 00 10 E0 00 00 00 00 ; rt.dll.. .......
000000c0h: 50 68 72 6C 64 0A 68 6F 2C 77 6F 68 68 65 6C 6C ; Phrld.ho,wohhell
000000d0h: 54 B0 6E 03 43 08 FF 10 83 C4 14 C3 00 00 00 00 ; T.n.C...........

    Note that this PE file cannot be launched directly under the windbg (6.3.0017.0)
    debugger. To debug the code, put an int 3 (0xcc) in front of the assembly code.
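
    Because the section maps file offset 0 to RVA 0, the scattered import data of the
    224-byte file can be followed directly in the raw bytes. The sketch below is my own
    illustration, not the Windows loader's algorithm: it follows only the first descriptor,
    ignores ordinal imports, and the helper names c_string and walk_imports are hypothetical.

```python
import struct

# First 0xC0 bytes (all headers) of the 224-byte file shown above.
HEADERS = bytes.fromhex(
    "4D 5A 90 00 03 00 00 00 04 00 00 00 50 45 00 00 "
    "4C 01 01 00 00 00 70 72 69 6E 74 66 00 00 00 00 "
    "70 00 0F 01 0B 01 06 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 C0 00 00 00 00 00 00 00 0C 00 00 00 "
    "00 00 40 00 10 00 00 00 10 00 00 00 04 00 00 00 "
    "00 00 00 00 04 00 00 00 00 00 00 00 00 10 00 00 "
    "00 00 00 00 00 00 00 00 03 00 6E 00 00 00 14 00 "
    "00 00 00 00 00 00 AC 00 00 00 6E 00 00 00 00 00 "
    "02 00 00 00 00 00 00 00 00 00 00 00 6A 00 00 00 "
    "14 00 00 00 2E 74 65 78 74 00 00 00 00 08 00 00 "
    "00 00 00 00 00 08 00 00 00 00 00 00 6D 73 76 63 "
    "72 74 2E 64 6C 6C 00 00 20 00 10 E0 00 00 00 00 "
)


def c_string(data, offset):
    """Read a NUL-terminated ASCII string starting at offset."""
    return data[offset:data.index(b"\0", offset)].decode("ascii")


def walk_imports(data):
    """Yield (dll, function) pairs of the first descriptor; RVAs are used as file offsets."""
    (e_lfanew,) = struct.unpack_from("<l", data, 0x3C)
    data_dirs = e_lfanew + 24 + 96                  # optional header start + fixed part
    import_rva, _size = struct.unpack_from("<2I", data, data_dirs + 8)   # DataDirectory[1]
    oft, _stamp, _fwd, name, _ft = struct.unpack_from("<5I", data, import_rva)
    thunk = oft
    while True:
        (entry,) = struct.unpack_from("<I", data, thunk)
        if entry == 0:                              # a zero entry ends the thunk array
            break
        yield c_string(data, name), c_string(data, entry + 2)   # skip the 2-byte hint
        thunk += 4


imports = list(walk_imports(HEADERS))
print(imports)
```

    The descriptor starts at 0x6a (DllCharacteristics), its thunk array is its own
    TimeDateStamp/ForwarderChain pair at 0x6e, the DLL name sits in the section header at
    0xac, and the hint/name entry sits in the file header at 0x14.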
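    5. Step 5:
    
    Finally, consider the alignment fields IMAGE_OPTIONAL_HEADER.SectionAlignment and
    IMAGE_OPTIONAL_HEADER.FileAlignment. Since they only have to be powers of 2, we can
    use 2 to the power 0, i.e. 1, as our alignment.
    The resulting EXE slims down again to just 216 bytes, including 28 bytes of assembly code.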
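
    The effect of the alignment on the file size can be sketched as follows. The model is an
    assumption of mine that happens to match the dumps in this article: the 188 bytes of
    headers and the 28 bytes of code are each padded to the alignment.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


HEADERS = 188   # overlapped DOS/NT headers (2 directories) + one section header
CODE = 28       # assembly code


def file_size(alignment):
    """File size when headers and code are each padded to `alignment` (assumed model)."""
    return align_up(HEADERS, alignment) + align_up(CODE, alignment)


for alignment in (0x10, 1):
    print(hex(alignment), file_size(alignment))
```

    With an alignment of 0x10 this gives the 224 bytes of step 4, and with an alignment of 1
    the padding disappears and 216 bytes remain.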
        
           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000h: 4D 5A 90 00 03 00 00 00 04 00 00 00 50 45 00 00 ; MZ..........PE..
00000010h: 4C 01 01 00 00 00 70 72 69 6E 74 66 00 00 00 00 ; L.....printf....
00000020h: 70 00 0F 01 0B 01 06 00 00 00 00 00 00 00 00 00 ; p...............
00000030h: 00 00 00 00 BC 00 00 00 00 00 00 00 0C 00 00 00 ; ................
00000040h: 00 00 40 00 01 00 00 00 01 00 00 00 04 00 00 00 ; ..@.............
00000050h: 00 00 00 00 04 00 00 00 00 00 00 00 00 10 00 00 ; ................
00000060h: 00 00 00 00 00 00 00 00 03 00 6E 00 00 00 14 00 ; ..........n.....
00000070h: 00 00 00 00 00 00 AC 00 00 00 6E 00 00 00 00 00 ; ..........n.....
00000080h: 02 00 00 00 00 00 00 00 00 00 00 00 6A 00 00 00 ; ............j...
00000090h: 14 00 00 00 2E 74 65 78 74 00 00 00 00 08 00 00 ; .....text.......
000000a0h: 00 00 00 00 00 08 00 00 00 00 00 00 6D 73 76 63 ; ............msvc
000000b0h: 72 74 2E 64 6C 6C 00 00 20 00 10 E0 50 68 72 6C ; rt.dll.. ...Phrl
000000c0h: 64 0A 68 6F 2C 77 6F 68 68 65 6C 6C 54 B0 6E 03 ; d.ho,wohhellT.n.
000000d0h: 43 08 FF 10 83 C4 14 C3                         ; C.......
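
    The 28 code bytes are not listed as assembly in the article. By my own hand decoding
    (not the author's listing) they read: push eax; three push imm32; push esp; mov al,0x6e;
    add eax,[ebx+8]; call [eax]; add esp,0x14; ret. The sketch below only recovers the string
    built by the three push imm32 instructions; the helper name pushed_string is hypothetical.

```python
# The 28 code bytes at the end of the 216-byte file above (file offsets 0xbc..0xd7).
CODE = bytes.fromhex(
    "50 68 72 6C 64 0A 68 6F 2C 77 6F 68 68 65 6C 6C "
    "54 B0 6E 03 43 08 FF 10 83 C4 14 C3"
)

PUSH_IMM32 = 0x68   # x86 opcode: push a 4-byte immediate


def pushed_string(code, start=1):
    """Collect consecutive `push imm32` operands and return them in memory order."""
    chunks, pos = [], start
    while pos < len(code) and code[pos] == PUSH_IMM32:
        chunks.append(code[pos + 1:pos + 5])
        pos += 5
    # The stack grows downward, so the chunk pushed last sits at the lowest address.
    return b"".join(reversed(chunks))


message = pushed_string(CODE)   # start=1 skips the leading push eax (0x50)
print(message)
```

    The recovered bytes are the format string whose address push esp then hands to printf.
    The 0x6e loaded into al is the RVA of the IAT slot named by the import descriptor, and
    add esp,0x14 removes the five DWORDs pushed before the call.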

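V. Afterword
    In theory the assembly part at the end can be replaced with any code of your own. Once
    you have the address of GetProcAddress in kernel32.dll, writing code that does what you
    want is not a problem, and the code length can be controlled through SizeOfRawData in
    IMAGE_SECTION_HEADER. I have not tried it, but I am sure a PE file built on this 188-byte
    PE header would be very cool. Can the PE header shrink further? If you want to, you can.
    Finally, happy Mid-Autumn Festival to everyone!
VI. References
    1.MSDN.
    2.winnt.h
    3.watercloud<<手工打造微型Win32可执行文件>>http://www.xfocus.net/articles/200302/482.html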

posted on 2007-05-14 00:46 叶子

