Notes on the ELF chapter of «深入理解Android:Java虚拟机ART»
As you can see from the description above, an ELF file consists of two parts: an ELF header and file data. The file data can contain a program header table describing zero or more segments, a section header table describing zero or more sections, and the data referred to by the entries of those two tables. ==Each segment contains information needed for run-time execution of the file, while sections contain data needed for linking and relocation==. Figure 1 illustrates this schematically.

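The ELF header is 52 bytes long for 32-bit files and 64 bytes long for 64-bit files (the readelf output below reports 64 bytes), and it identifies the format of the file. It starts with a sequence of four magic bytes: 0x7F followed by 0x45, 0x4c, and 0x46, the last three being the ASCII codes of the letters E, L, and F.
Among other values, the header also indicates whether the file is 32- or 64-bit, its byte order, the ELF version, the target OS/ABI, the file type, and the machine architecture.
Debian GNU/Linux offers the readelf command, which is provided in the GNU ‘binutils’ package. With the switch -h (short for --file-header) it displays the header of an ELF file. Listing 3 illustrates this for the command touch; the run recorded here uses libxcrash.so instead.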
real command:
Sdk\ndk-bundle\toolchains\llvm\prebuilt\windows-x86_64\bin\x86_64-linux-android-readelf.exe -h git\demo\xCrash\src\native\libxcrash\obj\local\x86_64\libxcrash.so
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          382432 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         8
  Size of section headers:           64 (bytes)
  Number of section headers:         39
  Section header string table index: 38
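To make the header layout concrete, here is a minimal Python sketch that decodes the 64-byte ELF64 header with the standard `struct` module. It is an illustration rather than a replacement for readelf: the helper name `parse_elf64_header` is made up for these notes, the sketch only handles little-endian ELF64, and the sample bytes are rebuilt from the readelf -h values above instead of being read from the real libxcrash.so.

```python
import struct

# Elf64_Ehdr for a little-endian file: 16-byte e_ident, then e_type, e_machine,
# e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize,
# e_phnum, e_shentsize, e_shnum, e_shstrndx.
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")

FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
          "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
          "e_shentsize", "e_shnum", "e_shstrndx")


def parse_elf64_header(data: bytes) -> dict:
    """Decode the first 64 bytes of a little-endian ELF64 file (hypothetical helper)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: bad magic")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64, little endian")
    return dict(zip(FIELDS, ELF64_EHDR.unpack_from(data)))


# Synthetic header rebuilt from the readelf -h output above
# (e_type 3 = DYN, e_machine 62 = x86-64).
ident = bytes.fromhex("7f454c4602010100" + "00" * 8)
sample = ELF64_EHDR.pack(ident, 3, 62, 1, 0, 64, 382432, 0, 64, 56, 8, 64, 39, 38)

header = parse_elf64_header(sample)
print(ELF64_EHDR.size, header["e_phnum"], header["e_shnum"], hex(header["e_shoff"]))
# 64 8 39 0x5d5e0
```

To try it on a real file, pass the first 64 bytes read in binary mode, for example `parse_elf64_header(open(path, "rb").read(64))`.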
The program header table lists the segments used at run time and tells the system how to create a process image. The header output above shows that this ELF file has 8 program headers of 56 bytes each, and that the first one starts at byte 64.
Again, the readelf command helps to extract the information from the ELF file. The switch -l (short for --program-headers or --segments) reveals more details, as shown in Listing 4.
real command: 
Sdk\ndk-bundle\toolchains\llvm\prebuilt\windows-x86_64\bin\x86_64-linux-android-readelf.exe -l git\demo\xCrash\src\native\libxcrash\obj\local\x86_64\libxcrash.so
Elf file type is DYN (Shared object file)
Entry point 0x0
There are 8 program headers, starting at offset 64
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                 0x00000000000001c0 0x00000000000001c0  R      8
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000000135e8 0x00000000000135e8  R E    1000
  LOAD           0x0000000000013830 0x0000000000014830 0x0000000000014830
                 0x00000000000009a0 0x0000000000001648  RW     1000
  DYNAMIC        0x0000000000013a68 0x0000000000014a68 0x0000000000014a68
                 0x0000000000000240 0x0000000000000240  RW     8
  NOTE           0x0000000000000200 0x0000000000000200 0x0000000000000200
                 0x00000000000000bc 0x00000000000000bc  R      4
  GNU_EH_FRAME   0x0000000000013154 0x0000000000013154 0x0000000000013154
                 0x0000000000000494 0x0000000000000494  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO      0x0000000000013830 0x0000000000014830 0x0000000000014830
                 0x00000000000007d0 0x00000000000007d0  RW     10
 Section to Segment mapping:
  Segment Sections...
   00
   01     .note.android.ident .note.gnu.build-id .dynsym .dynstr .hash .gnu.version .gnu.version_d .gnu.version_r .rela.dyn .rela.plt .plt .text .rodata .eh_frame .eh_frame_hdr
   02     .fini_array .data.rel.ro .init_array .dynamic .got .got.plt .data .bss
   03     .dynamic
   04     .note.android.ident .note.gnu.build-id
   05     .eh_frame_hdr
   06
   07     .fini_array .data.rel.ro .init_array .dynamic .got .got.plt
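Each row of the program header table above is one 56-byte `Elf64_Phdr` record. The following sketch decodes such a record and renders the flags the way readelf prints them. The helper names are invented for illustration, and the sample record is packed by hand from the second LOAD row of the listing rather than read from the file.

```python
import struct

# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
ELF64_PHDR = struct.Struct("<IIQQQQQQ")
PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4

NAMES = ("p_type", "p_flags", "p_offset", "p_vaddr", "p_paddr",
         "p_filesz", "p_memsz", "p_align")


def parse_phdr(raw: bytes) -> dict:
    """Decode one little-endian Elf64_Phdr record (hypothetical helper)."""
    return dict(zip(NAMES, ELF64_PHDR.unpack(raw)))


def flags_to_str(flags: int) -> str:
    """Render segment flags in readelf's fixed-width 'RWE' style."""
    pairs = (("R", PF_R), ("W", PF_W), ("E", PF_X))
    return "".join(ch if flags & bit else " " for ch, bit in pairs)


# Second LOAD segment from the listing above, packed by hand.
raw = ELF64_PHDR.pack(PT_LOAD, PF_R | PF_W,
                      0x13830, 0x14830, 0x14830, 0x9a0, 0x1648, 0x1000)
seg = parse_phdr(raw)

# Bytes that exist only in memory and are zero-filled by the loader
# (this is where .bss lives, since NOBITS sections occupy no file space).
memory_only = seg["p_memsz"] - seg["p_filesz"]
print(ELF64_PHDR.size, repr(flags_to_str(seg["p_flags"])), hex(memory_only))
```

The gap between MemSiz and FileSiz is why the RW segment is larger in memory than in the file: it ends at 0x14830 + 0x1648 = 0x15e78, which is exactly where .bss ends in the section listing.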
The third part of the ELF structure is the section header table, which lists the individual sections of the binary. The switch -S (short for --section-headers or --sections) lists the headers. For libxcrash.so there are 39 section headers, all shown in the output below. Each entry covers the section's name, type, address, file offset, size, entry size, flags, link, info, and alignment.
windows real command: 
Sdk\ndk-bundle\toolchains\llvm\prebuilt\windows-x86_64\bin\x86_64-linux-android-readelf.exe -S git\demo\xCrash\src\native\libxcrash\obj\local\x86_64\libxcrash.so
Ubuntu real command:
Sdk/ndk/21.3.6528147/toolchains/llvm/prebuilt/linux-x86_64/bin/x86_64-linux-android-readelf
Or use the one found through the PATH environment variable:
~/Android/Source/android-9.0.0_r3$ which readelf
/usr/bin/readelf
There are 39 section headers, starting at offset 0x5d5e0:
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .note.android.ide NOTE             0000000000000200  00000200
       0000000000000098  0000000000000000   A       0     0     2
  [ 2] .note.gnu.build-i NOTE             0000000000000298  00000298
       0000000000000024  0000000000000000   A       0     0     4
  [ 3] .dynsym           DYNSYM           00000000000002c0  000002c0
       0000000000000b70  0000000000000018   A       4     1     8
  [ 4] .dynstr           STRTAB           0000000000000e30  00000e30
       00000000000005fc  0000000000000000   A       0     0     1
  [ 5] .hash             HASH             0000000000001430  00001430
       0000000000000374  0000000000000004   A       3     0     8
  [ 6] .gnu.version      VERSYM           00000000000017a4  000017a4
       00000000000000f4  0000000000000002   A       3     0     2
  [ 7] .gnu.version_d    VERDEF           0000000000001898  00001898
       000000000000001c  0000000000000000   A       4     1     4
  [ 8] .gnu.version_r    VERNEED          00000000000018b4  000018b4
       0000000000000040  0000000000000000   A       4     2     4
  [ 9] .rela.dyn         RELA             00000000000018f8  000018f8
       0000000000000708  0000000000000018   A       3     0     8
  [10] .rela.plt         RELA             0000000000002000  00002000
       0000000000000990  0000000000000018  AI       3    11     8
  [11] .plt              PROGBITS         0000000000002990  00002990
       0000000000000670  0000000000000010  AX       0     0     16
  [12] .text             PROGBITS         0000000000003000  00003000
       000000000000c6f6  0000000000000000  AX       0     0     16
  [13] .rodata           PROGBITS         000000000000f700  0000f700
       0000000000002070  0000000000000000   A       0     0     16
  [14] .eh_frame         PROGBITS         0000000000011770  00011770
       00000000000019e4  0000000000000000   A       0     0     8
  [15] .eh_frame_hdr     PROGBITS         0000000000013154  00013154
       0000000000000494  0000000000000000   A       0     0     4
  [16] .fini_array       FINI_ARRAY       0000000000014830  00013830
       0000000000000010  0000000000000008  WA       0     0     8
  [17] .data.rel.ro      PROGBITS         0000000000014840  00013840
       0000000000000220  0000000000000000  WA       0     0     16
  [18] .init_array       INIT_ARRAY       0000000000014a60  00013a60
       0000000000000008  0000000000000008  WA       0     0     8
  [19] .dynamic          DYNAMIC          0000000000014a68  00013a68
       0000000000000240  0000000000000010  WA       4     0     8
  [20] .got              PROGBITS         0000000000014ca8  00013ca8
       0000000000000010  0000000000000000  WA       0     0     8
  [21] .got.plt          PROGBITS         0000000000014cb8  00013cb8
       0000000000000348  0000000000000000  WA       0     0     8
  [22] .data             PROGBITS         0000000000015000  00014000
       00000000000001d0  0000000000000000  WA       0     0     16
  [23] .bss              NOBITS           0000000000015200  00014200
       0000000000000c78  0000000000000000  WA       0     0     64
  [24] .comment          PROGBITS         0000000000000000  000141d0
       0000000000000065  0000000000000001  MS       0     0     1
  [25] .debug_str        PROGBITS         0000000000000000  00014235
       0000000000006eff  0000000000000001  MS       0     0     1
  [26] .debug_loc        PROGBITS         0000000000000000  0001b134
       0000000000014b4c  0000000000000000           0     0     1
  [27] .debug_abbrev     PROGBITS         0000000000000000  0002fc80
       00000000000026fa  0000000000000000           0     0     1
  [28] .debug_info       PROGBITS         0000000000000000  0003237a
       0000000000018d5f  0000000000000000           0     0     1
  [29] .debug_ranges     PROGBITS         0000000000000000  0004b0d9
       00000000000018b0  0000000000000000           0     0     1
  [30] .debug_macinfo    PROGBITS         0000000000000000  0004c989
       0000000000000012  0000000000000000           0     0     1
  [31] .debug_pubnames   PROGBITS         0000000000000000  0004c99b
       00000000000013f5  0000000000000000           0     0     1
  [32] .debug_pubtypes   PROGBITS         0000000000000000  0004dd90
       00000000000026ed  0000000000000000           0     0     1
  [33] .debug_line       PROGBITS         0000000000000000  0005047d
       0000000000009205  0000000000000000           0     0     1
  [34] .debug_aranges    PROGBITS         0000000000000000  00059682
       0000000000000060  0000000000000000           0     0     1
  [35] .note.gnu.gold-ve NOTE             0000000000000000  000596e4
       000000000000001c  0000000000000000           0     0     4
  [36] .symtab           SYMTAB           0000000000000000  00059700
       0000000000002058  0000000000000018          37   224     8
  [37] .strtab           STRTAB           0000000000000000  0005b758
       0000000000001cd7  0000000000000000           0     0     1
  [38] .shstrtab         STRTAB           0000000000000000  0005d42f
       00000000000001ac  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)
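The Size and EntSize columns above can be cross-checked with a little arithmetic, which also hints at how the PLT and GOT line up. The sketch below only divides numbers copied from the listing. The 8-byte slot size used for .got.plt is an assumption (its EntSize is printed as 0), based on pointers being 8 bytes on x86-64.

```python
# (size, entsize) pairs copied from the section header listing above.
sections = {
    ".dynsym": (0xb70, 0x18),
    ".rela.dyn": (0x708, 0x18),
    ".rela.plt": (0x990, 0x18),
    ".plt": (0x670, 0x10),
}
counts = {name: size // entsize for name, (size, entsize) in sections.items()}

# .got.plt reports EntSize 0; assume one 8-byte pointer per slot on x86-64.
got_plt_slots = 0x348 // 8

for name, count in counts.items():
    print(f"{name:10s} {count} entries")
print(f".got.plt   {got_plt_slots} slots")
```

The numbers fit together: 102 relocations in .rela.plt, 103 entries in .plt (one per relocation plus the special first entry PLT[0]), and 105 slots in .got.plt (one per relocation plus three reserved slots, the third of which holds the resolver address on x86-64).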
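The sections most relevant to hooking are:
- .dynstr: holds the strings used for dynamic linking, mainly symbol names and library names.
- .dynsym: holds symbol information (the symbol's type, start address, size, the index of its name in .dynstr, and so on). Functions are symbols too.
- .text: the machine instructions produced by compiling the program code.
- .dynamic: the information used by the dynamic linker. It records the external dependencies of the current ELF and the start positions of the other important sections.
- .got: Global Offset Table. It records the entry addresses of external calls. When the dynamic linker (linker) performs relocation, it fills in the real absolute addresses of the external calls here.
- .plt: Procedure Linkage Table. The trampoline for external calls, mainly used to support relocating external calls by lazy binding. (On Android, currently only the MIPS architecture supports lazy binding.)
- .rel.plt: relocation information for direct calls to external functions. (On x86-64, as in the listing above, the section is named .rela.plt.)
- .rel.dyn: relocation information other than .rel.plt, for example calls to external functions through global function pointers. (On x86-64 the section is named .rela.dyn.)
graph LR
dynamic(".dynamic")-->deps("external dependencies of the current ELF")
dynamic-->starts("start positions of the other important sections")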
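The relationship between .dynsym and .dynstr can be sketched in a few lines: each 24-byte `Elf64_Sym` record stores an offset into the string table instead of the name itself. The example below is a simplified illustration on synthetic data. `find_symbol_index` is a made-up helper that scans linearly, whereas a real linker would normally use the .hash section to speed up the lookup.

```python
import struct

# Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size (24 bytes)
ELF64_SYM = struct.Struct("<IBBHQQ")


def read_cstring(strtab: bytes, offset: int) -> str:
    """Return the NUL-terminated string that starts at `offset` in a string table."""
    end = strtab.index(b"\0", offset)
    return strtab[offset:end].decode("ascii")


def find_symbol_index(dynsym: bytes, dynstr: bytes, wanted: str) -> int:
    """Linear scan of .dynsym for a symbol name; returns -1 if it is absent."""
    for index in range(len(dynsym) // ELF64_SYM.size):
        st_name = ELF64_SYM.unpack_from(dynsym, index * ELF64_SYM.size)[0]
        if read_cstring(dynstr, st_name) == wanted:
            return index
    return -1


# Synthetic tables: index 0 is the reserved null symbol, as in real files.
dynstr = b"\0malloc\0free\0"
GLOBAL_FUNC = 0x12  # STB_GLOBAL << 4 | STT_FUNC
dynsym = (ELF64_SYM.pack(0, 0, 0, 0, 0, 0)
          + ELF64_SYM.pack(1, GLOBAL_FUNC, 0, 0, 0, 0)
          + ELF64_SYM.pack(8, GLOBAL_FUNC, 0, 0, 0, 0))

print(find_symbol_index(dynsym, dynstr, "free"))  # 2
```

The index found this way is what relocation entries refer to, which is why a hook library needs it before it can search the relocation sections.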
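Once you understand the dynamic linking process, we can come back and think about what ".got" and ".plt" actually mean.
PLT and GOT entries correspond one to one, and after the first resolution a GOT entry holds the actual address of the called function. So what is the PLT for? In a sense, the PLT gives us lazy loading. When a shared library is first loaded, none of its function addresses have been resolved yet. Let us walk through a first function call with the help of the figure; note that black arrows in the figure are jumps and purple ones are pointers.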

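A PLT[n] entry does three things:
- It jumps through the GOT (jmp *GOT[n]).
- It prepares the argument for the address-resolving routine in entry 0 mentioned above.
- It calls PLT[0]; the actual address of the resolver is stored in GOT[2].
Before resolution, GOT[n] points straight at the instruction that follows jmp *GOT[n]. Once resolution completes, we have the actual address of func; the dynamic loader writes that address into GOT[n] and then calls func.
If this call flow is still unclear, see the article 《GOT 表和 PLT 表》, which contains a very clear diagram.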

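After the first call has happened, later calls to func are much simpler and faster. The call enters PLT[n], which executes jmp *GOT[n]. GOT[n] now points directly at func, so the call completes efficiently.
To sum up: many functions may never be used during a program's run, such as error handlers or rarely used feature modules, so linking every function up front is wasteful. To improve dynamic linking performance, the PLT is used to implement lazy binding.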
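The first-call and later-call behaviour described above can be imitated with a toy model. This is only an analogy: Python callables stand in for machine addresses, the class name `LazyBinder` is invented for these notes, and nothing here touches a real GOT.

```python
class LazyBinder:
    """Toy model of PLT/GOT lazy binding; callables stand in for addresses."""

    def __init__(self, library: dict):
        self.library = library        # symbol name -> real function
        self.names = list(library)    # slot n belongs to names[n]
        self.resolve_count = 0
        # Before resolution, every GOT slot leads to the resolver path.
        self.got = [self._make_resolver_stub(n) for n in range(len(self.names))]

    def _make_resolver_stub(self, n: int):
        def stub(*args):
            self.resolve_count += 1
            target = self.library[self.names[n]]  # resolver finds the real address
            self.got[n] = target                  # the loader patches GOT[n]
            return target(*args)                  # then control passes to func
        return stub

    def call_plt(self, n: int, *args):
        """PLT[n]: always an indirect jump through GOT[n]."""
        return self.got[n](*args)


binder = LazyBinder({"add": lambda a, b: a + b})
first = binder.call_plt(0, 1, 2)    # goes through the resolver
second = binder.call_plt(0, 3, 4)   # jumps straight to the function
print(first, second, binder.resolve_count)  # 3 7 1
```

The caller never changes: it always goes through `call_plt`. Only the content of the GOT slot differs between the first call and the ones after it.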
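The actual run-time address of a function still has to be obtained from the GOT. The whole process, simplified, is as follows:
By now you probably have a first idea of how to hack this process. The industry usually distinguishes GOT Hook from PLT Hook depending on whether the GOT entry or the PLT entry is modified, but the underlying principle is very similar.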
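The core idea of a GOT hook is simply to overwrite one table slot while keeping the old value so the original can still be called. The sketch below shows that idea on a plain Python list. The names `install_hook`, `real_puts`, and `my_puts` are hypothetical, and a real hook writes a pointer into process memory instead.

```python
def install_hook(got: list, slot: int, replacement):
    """Swap one GOT slot and hand back the previous target."""
    original = got[slot]
    got[slot] = replacement
    return original


def real_puts(text: str) -> str:
    return f"puts:{text}"


got = [real_puts]   # slot 0 already resolved to the real function
calls = []          # what the hook observed


def my_puts(text: str) -> str:
    calls.append(text)               # observe the call
    return original(text.upper())    # then forward to the saved original


original = install_hook(got, 0, my_puts)
result = got[0]("hi")                # the caller still jumps through GOT[0]
print(result, calls)                 # puts:HI ['hi']
```

Keeping the returned original is what lets the hook wrap the function and not merely replace it, and it is also what makes unhooking possible.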
The dynamic linker program on Android is linker. Its source code is part of the AOSP bionic project.
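The rough steps of dynamic linking (for example, when calling dlopen) are:
1. Check the list of already-loaded ELFs.
2. Read the list of libtest.so's external ELF dependencies from its .dynamic section.
3. Load each ELF in the list. The loading steps are:
   - Use mmap to reserve a block of memory large enough for mapping the ELF later (MAP_PRIVATE mode).
   - Read the ELF's program header table (PHT) and use mmap to map every segment of type PT_LOAD into memory, one after another.
   - Read the entries of the .dynamic segment, mainly the relative virtual addresses of the sections, and compute their absolute addresses.
   - Perform relocation. The relocation information may live in one or more of .rel.plt, .rela.plt, .rel.dyn, .rela.dyn, .rel.android, .rela.android. The dynamic linker has to process the relocation requests in these .relxxx sections one by one. Using the information of the ELFs already loaded, it looks up the address of each required symbol (for example libtest.so's symbol malloc) and, once found, writes the address into the target location named in .relxxx. These target locations usually lie in .got or .data.
   - Increment the ELF's reference count.
4. Call the constructors of each ELF in the list (the entries of type DT_INIT and DT_INIT_ARRAY). Constructors are called level by level following the dependency relationships: those of the ELFs being depended on first, and libtest.so's own constructor last. (An ELF can also define its own destructors, which are called automatically when the ELF is unloaded.)
graph LR
dlopen("dynamic linking (calling dlopen)")-->check("check the list of already-loaded ELFs")
dlopen-->read("read libtest.so's external ELF dependencies from the .dynamic section")
dlopen-->loadEach("load each ELF in the list; loading steps")
loadEach-->mmap("use mmap to reserve a block of memory large enough for mapping the ELF")
loadEach-->mmapPT_LOAD("read the ELF's PHT and mmap every PT_LOAD segment into memory in order")
loadEach-->dynamic("read the entries of the .dynamic segment, mainly each section's relative virtual address, and compute the absolute addresses")
loadEach-->|relocate|relocate("process each relocation request in the .relxxx sections: find the needed symbol's address and write it to the target named in .relxxx, in .got or .data")
loadEach-->ELFRefCount("increment the ELF's reference count")
dlopen-->cons("call the constructor of each ELF in the list")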
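The relocation step is the one that a PLT hook later imitates, so it is worth sketching. The example below decodes 24-byte `Elf64_Rela` records and, for the two x86-64 relocation types that store a symbol address (R_X86_64_GLOB_DAT = 6 and R_X86_64_JUMP_SLOT = 7), writes the resolved address to `base + r_offset`. It is a simplified model: memory is a plain dict, and the base address, the symbol addresses, and the function name `apply_relocations` are all made up for illustration.

```python
import struct

# Elf64_Rela: r_offset, r_info, r_addend (24 bytes)
ELF64_RELA = struct.Struct("<QQq")
R_X86_64_GLOB_DAT = 6
R_X86_64_JUMP_SLOT = 7


def apply_relocations(rela: bytes, symbol_names: list, exported: dict,
                      base: int, memory: dict) -> None:
    """Write resolved symbol addresses into the slots named by each relocation."""
    for pos in range(0, len(rela), ELF64_RELA.size):
        r_offset, r_info, _addend = ELF64_RELA.unpack_from(rela, pos)
        sym_index, rel_type = r_info >> 32, r_info & 0xFFFFFFFF
        if rel_type in (R_X86_64_GLOB_DAT, R_X86_64_JUMP_SLOT):
            # Both types store the plain symbol address in the target slot.
            memory[base + r_offset] = exported[symbol_names[sym_index]]


def rela_entry(offset: int, sym_index: int, rel_type: int) -> bytes:
    return ELF64_RELA.pack(offset, (sym_index << 32) | rel_type, 0)


# Hypothetical inputs: two PLT relocations targeting slots inside .got.plt.
symbol_names = ["", "malloc", "free"]
exported = {"malloc": 0x70001000, "free": 0x70002000}
base = 0x7F1C2A400000
rela_plt = (rela_entry(0x14CD0, 1, R_X86_64_JUMP_SLOT)
            + rela_entry(0x14CD8, 2, R_X86_64_JUMP_SLOT))

memory = {}
apply_relocations(rela_plt, symbol_names, exported, base, memory)
print({hex(addr): hex(value) for addr, value in memory.items()})
```

A PLT hook performs the same walk over the relocation entries, except that it writes the address of the replacement function into the slot.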
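graph LR
issue("directly replacing the function at its address raises three issues")-->baseAddr("base address")-->|solution|maps("/proc/self/maps")
issue-->perm("memory access permission")-->|solution|mprotect
issue-->icache("instruction cache")-->|solution|__builtin___clear_cache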
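The base-address problem is solved by reading /proc/self/maps, whose lines have the form `start-end perms offset dev inode path`. The sketch below parses such text and returns the start of the mapping with file offset 0 for a given library. The maps content is a hypothetical sample laid out to match the two LOAD segments of libxcrash.so shown earlier, and `find_base_address` is a made-up helper. Treating that start address as the base assumes the first LOAD segment has virtual address 0, as it does in the listing.

```python
def find_base_address(maps_text: str, library: str) -> int:
    """Return the start of the mapping with file offset 0 for `library`."""
    for line in maps_text.splitlines():
        fields = line.split()
        # Anonymous mappings have no path column, so they are skipped here.
        if len(fields) < 6 or not fields[5].endswith(library):
            continue
        start = int(fields[0].split("-")[0], 16)
        offset = int(fields[2], 16)
        if offset == 0:
            return start
    raise LookupError(f"{library} is not mapped")


# Hypothetical /proc/self/maps excerpt (addresses and paths are invented).
SAMPLE_MAPS = """\
7f1c2a400000-7f1c2a414000 r-xp 00000000 fd:01 1234 /data/app/demo/lib/x86_64/libxcrash.so
7f1c2a414000-7f1c2a416000 rw-p 00013000 fd:01 1234 /data/app/demo/lib/x86_64/libxcrash.so
7f1c2a416000-7f1c2a417000 rw-p 00000000 00:00 0
"""

base = find_base_address(SAMPLE_MAPS, "libxcrash.so")
got_plt = base + 0x14CB8   # .got.plt address taken from the section listing
print(hex(base), hex(got_plt))
```

On a real device the text would come from `open("/proc/self/maps").read()`. The absolute address of any section is then the base plus the Address column printed by readelf.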
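To summarize the flow of performing a PLT hook in xhook:
- Find the segment of type PT_LOAD with offset 0 and compute the ELF base address.
- Find the segment of type PT_DYNAMIC, obtain the .dynamic section from it, and from the .dynamic section obtain the memory addresses of the other sections.
- In the .dynstr section, find the index value corresponding to the symbol to be hooked.
- Use mprotect to change the access permission to readable and writable.
- If mprotect was used to change the memory access permission, restore the previous permission.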
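One practical detail behind the mprotect steps is that mprotect works on whole pages, so the address of the slot being patched has to be rounded down to a page boundary first. The sketch below shows only that arithmetic. The 4096-byte page size is an assumption, the helper names are invented, and no permissions are actually changed.

```python
PAGE_SIZE = 4096  # assumption; query the real value on the target system


def page_start(addr: int) -> int:
    """Round an address down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


def page_range(addr: int, length: int = 8) -> tuple:
    """Return (start, size) of the page span covering [addr, addr + length)."""
    start = page_start(addr)
    end = page_start(addr + length - 1) + PAGE_SIZE
    return start, end - start


slot = 0x7F1C2A414CB8            # hypothetical address of a GOT slot
start, size = page_range(slot)   # an 8-byte pointer inside a single page
print(hex(start), size)          # 0x7f1c2a414000 4096
```

In C these two values would be the first two arguments of the mprotect call, and the same span is the natural argument for the instruction-cache flush afterwards.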

https://linux.die.net/man/1/readelf
https://linuxtools-rst.readthedocs.io/zh_CN/latest/tool/readelf.html
https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
https://github.com/iqiyi/xHook/blob/master/docs/overview/android_plt_hook_overview.zh-CN.md