1. Design

The goal of the PaX project is to research various defense mechanisms
against the exploitation of software bugs that give an attacker arbitrary
read/write access to the attacked task's address space. This class of bugs
includes, among others, various forms of buffer overflow bugs (be they stack
or heap based), user supplied format string bugs, etc.

It is important to realize that our focus is not on finding and fixing
such bugs but rather on the prevention and containment of exploit
techniques. For our purposes these techniques can affect the attacked task
at three different levels:

    (1) introduce/execute arbitrary code
    (2) execute existing code out of original program order
    (3) execute existing code in original program order with arbitrary data

For example, the well known shellcode injection technique belongs to (1)
while the so-called return-to-libc style technique belongs to (2).

Introducing code into a task's address space is possible by either creating
an executable mapping or modifying an already existing writable/executable
mapping. The first method can be prevented by controlling what can be mapped
into the task and is beyond the scope of the PaX project; access control
systems are the proper way of handling this. The second method can be
prevented by not allowing the creation of writable/executable mappings at
all. While this solution breaks some applications that do need such
mappings, until they are rewritten to handle such mappings more carefully
this is the best we can do. The details of this solution are in a separate
document describing NOEXEC.

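The rule of refusing writable/executable mappings can be illustrated with a
small model. This is only a sketch and not PaX code: the function name
request_allowed is hypothetical, and the flag values are simply the ones
Linux uses for PROT_READ, PROT_WRITE and PROT_EXEC.

```python
# Hypothetical model of the "no writable/executable mappings" rule.
# Flag values match the Linux PROT_* constants; the function itself is
# illustrative and is not part of PaX.
PROT_READ = 0x1
PROT_WRITE = 0x2
PROT_EXEC = 0x4


def request_allowed(prot: int) -> bool:
    """Return False if the requested protection is both writable and executable."""
    return not (prot & PROT_WRITE and prot & PROT_EXEC)


if __name__ == "__main__":
    for name, prot in [
        ("r-x", PROT_READ | PROT_EXEC),
        ("rw-", PROT_READ | PROT_WRITE),
        ("rwx", PROT_READ | PROT_WRITE | PROT_EXEC),
    ]:
        print(name, "allowed" if request_allowed(prot) else "denied")
```

The actual restrictions are described in the NOEXEC document; this model
only captures the single rule stated above.
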
Executing code (be that introduced by the attacker or already present in
the task's address space) requires the ability to change the execution flow
using already existing code. Such changes occur when code dereferences a
function pointer. An attacker can intervene if such a pointer is stored in
writable memory. Although it would seem a good idea to not have function
pointers in writable memory at all, it is unfortunately not possible (e.g.,
saved return addresses from procedures are on the stack), so a different
approach is needed. Since the changes need to be in userland and PaX has
so far been a kernel oriented project, they will be implemented in the
future; see the details in a separate document.

The next category of features PaX employs is a form of diversification:
address space layout randomization (ASLR). The generic idea behind this
approach is based on the observation that in practice most attacks require
advance knowledge of various addresses in the attacked task. If we can
introduce entropy into such addresses each time a task is created then we
will force the attacker to guess or brute force them, which in turn will
make the attack attempts quite 'noisy' because any failed attempt will
likely crash the target. It will then be easy to watch for and react to such
events. The details of this solution are in a separate document describing
ASLR.

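To give a feel for why guessing becomes noisy, the sketch below computes the
chance that at least one of a limited number of guesses succeeds. It rests
on assumptions not stated in this document: every attempt faces a freshly
and uniformly randomized address, and the bit counts in the example are
arbitrary, not the entropy PaX actually provides (see the ASLR document for
that). The function name success_probability is hypothetical.

```python
def success_probability(entropy_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent guesses is right.

    Assumption of this sketch: each attempt meets a newly randomized
    address carrying `entropy_bits` uniformly random bits.
    """
    per_guess = 1.0 / (2 ** entropy_bits)
    return 1.0 - (1.0 - per_guess) ** attempts


if __name__ == "__main__":
    # Arbitrary example values, chosen only to show the trend.
    for bits in (8, 16, 24):
        print(bits, success_probability(bits, attempts=10))
```

Every failed guess in this model corresponds to a likely crash of the
target, which is what makes the attempts easy to notice.
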
Before going into the analysis of the above techniques, let's note an often
overlooked or misunderstood property of combining defense mechanisms. Some
like to look at the individual pieces of a system and arrive at a conclusion
regarding the effectiveness of the whole based on that (or worse, dismiss
one mechanism because it is not effective without employing another, and
vice versa). In our case this approach can lead to misleading results.
Consider that one has a defense mechanism against (1) and (2) such as NOEXEC
and the future userland changes in PaX. If only NOEXEC is employed, one
could argue that it is pointless since (2) can still be used (in practice
this reasoning has often been used to dismiss non-executable stack
approaches, which are not to be confused with NOEXEC however). If one
protects against (2) only, then one could equally well ask why bother at all
if the attacker can go directly for (1), and the final conclusion would be
that none of these defense mechanisms is effective. As hinted at above, this
turns out to be the wrong conclusion here: deploying both kinds of defense
mechanisms will protect against both (1) and (2) at the same time. Where one
defense line would fail, the other prevents that (i.e., NOEXEC can be broken
by a return-to-libc style attack only, and vice versa).

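The argument about combining mechanisms can be restated as simple set
coverage. The sketch below is an illustration only: the table BLOCKS, the
label userland_changes and the function remaining_attacks are hypothetical
names, and the model deliberately ignores that an uncovered attack class may
be used to get around a deployed mechanism.

```python
# Attack classes as numbered in this document:
# 1 = introduce/execute arbitrary code
# 2 = execute existing code out of original program order
# 3 = execute existing code in original order with arbitrary data
ATTACK_CLASSES = {1, 2, 3}

# Which class each mechanism is meant to stop (labels are illustrative).
BLOCKS = {
    "NOEXEC": {1},
    "userland_changes": {2},
}


def remaining_attacks(deployed):
    """Return the attack classes not covered by any deployed mechanism."""
    covered = set()
    for mechanism in deployed:
        covered |= BLOCKS[mechanism]
    return ATTACK_CLASSES - covered


if __name__ == "__main__":
    print(remaining_attacks(["NOEXEC"]))
    print(remaining_attacks(["userland_changes"]))
    print(remaining_attacks(["NOEXEC", "userland_changes"]))
```

Judged alone, each mechanism leaves one of (1) and (2) open; only the
combination removes both.
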
In the following we will assume that both NOEXEC (the non-executable page
feature and the mmap/mprotect restrictions) and full ASLR (using ET_DYN
executables) are active in the system. Furthermore we also require that
there be only PIC ELF libraries on the system and that a crash detection
and reaction system be in place that will prevent the execution of the
attacked program after a fixed (low) number of crashes. The possible avenues
of attack against such a system are as follows:

  - attack method (3) is possible with 100% reliability if the attacker
    does not need advance knowledge of addresses in the attacked task.

  - attack methods (2) and (3) are possible with 100% reliability if the
    attacker needs advance knowledge of addresses and can derive them by
    reading the attacked task's address space (i.e., the target has an
    information leaking bug).

  - attack methods (2) and (3) are possible with a small probability if the
    attacker needs advance knowledge of addresses but cannot derive them
    without resorting to guessing or a brute force search ('small' can be
    further quantified, see the ASLR documentation).

  - attack method (1) is possible if the attacker can have the attacked
    task create, write to and mmap a file. This in turn requires attack
    method (2), so the analysis of that applies here as well (note that
    although it is not part of PaX per se, it is recommended, among other
    things, that production systems use an access control system that
    would prevent this avenue of attack).

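The cases listed above can be condensed into a small decision function. This
is one reading of the list and not a definitive model: the name
attack_outlook and the result labels are hypothetical, the sketch treats
method (2) as always needing addresses, and for method (1) it assumes the
attacker can indeed make the task create, write to and mmap a file.

```python
def attack_outlook(method: int, needs_addresses: bool, has_info_leak: bool) -> str:
    """Classify an attack against a system with NOEXEC, full ASLR and crash
    detection, following the list above (labels are illustrative)."""
    if method not in (1, 2, 3):
        raise ValueError("method must be 1, 2 or 3")
    if method in (1, 2):
        # Method (1) relies on method (2); this sketch assumes that
        # method (2) always needs advance knowledge of addresses.
        needs_addresses = True
    if not needs_addresses or has_info_leak:
        return "reliable"
    return "small probability"


if __name__ == "__main__":
    print(attack_outlook(3, needs_addresses=False, has_info_leak=False))
    print(attack_outlook(2, needs_addresses=True, has_info_leak=True))
    print(attack_outlook(1, needs_addresses=True, has_info_leak=False))
```
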
Based on the above it should come as no surprise that the future direction
of PaX will be to prevent or at least reduce the efficiency of method (2)
and eliminate or reduce the number of ways method (3) can be done (which
will also help counter the other methods of course).

2. Implementation

The main line of development is Linux 2.4 on IA-32 (i386) although most
features already exist for alpha, ia64, parisc, ppc, sparc, sparc64 and
x86_64 as well and other architectures are coming as hardware becomes
available (thanks to the grsecurity and Hardened Gentoo projects). For
this reason all implementation documentation is i386 specific (the generic
design ideas apply to all architectures though).

The non-executable page feature exists for alpha, i386, ia64, parisc, ppc,
sparc, sparc64 and x86_64 while ppc64 can have the same implementation as
ppc. The mips and mips64 architectures are hopeless in general as they have
a unified TLB (the models with a split one will be supported by PaX). The
main document on the non-executable pages and related features is NOEXEC;
the two i386 specific approaches are described by PAGEEXEC and SEGMEXEC.

The mmap/mprotect restrictions are mainly architecture independent, only
special case handling needs architecture specific code (various pieces of
code that need to be executed from writable and therefore non-executable
memory, e.g., the stack or the PLT on some architectures). Here the main
document is MPROTECT whereas EMUTRAMP and EMUSIGRT describe the i386
specific emulations.

ASLR is also mostly architecture independent, only the randomizable bits
of various addresses vary among the architectures. The documents are split
based on the randomized region, so RANDKSTACK and RANDUSTACK describe the
feature for the kernel and user stack, respectively. RANDMMAP and RANDEXEC
are about randomizing the regions used for (among others) ELF libraries
and executables, respectively. The infrastructure that makes both SEGMEXEC
and RANDEXEC possible is vma mirroring, described by the VMMIRROR document.

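One way to observe the effect of region randomization from userland is to
compare two snapshots of a task's memory map. The sketch below assumes the
snapshots are in the textual /proc/<pid>/maps format and that the kernel
labels regions there (such as [stack]); the helper names region_starts and
moved_regions are hypothetical and the sample addresses are made up.

```python
def region_starts(maps_text: str) -> dict:
    """Map each named region to the lowest start address seen for it."""
    starts = {}
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a name
        start = int(fields[0].split("-")[0], 16)
        name = fields[5].strip()
        if name not in starts or start < starts[name]:
            starts[name] = start
    return starts


def moved_regions(maps_a: str, maps_b: str) -> list:
    """Names present in both snapshots whose start address differs."""
    a, b = region_starts(maps_a), region_starts(maps_b)
    return sorted(name for name in a.keys() & b.keys() if a[name] != b[name])


# Made-up sample snapshots from two hypothetical runs of the same program.
RUN_A = (
    "08048000-0804c000 r-xp 00000000 03:01 12345 /bin/cat\n"
    "bffdf000-c0000000 rw-p 00000000 00:00 0 [stack]\n"
)
RUN_B = (
    "08048000-0804c000 r-xp 00000000 03:01 12345 /bin/cat\n"
    "bf8a1000-bf8c2000 rw-p 00000000 00:00 0 [stack]\n"
)

if __name__ == "__main__":
    print(moved_regions(RUN_A, RUN_B))
```

In the sample only the stack moves between the two runs, while the
executable stays at the same address in both.
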
Since some applications need to do things that PaX prevents (runtime code
generation) or make assumptions that are no longer true under PaX (e.g.,
fixed or at least predictable addresses in the address space), we provide
a tool called 'chpax' that gives the end user fine grained control over the
various PaX features on a per executable basis.