[ubuntu-hardened] Edgy and Proactive Security

Sat Jun 3 17:54:10 BST 2006

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Now that Dapper has been released, it may be good to look at Edgy and
start setting goals.  One important goal is to increase the security
baseline of Ubuntu, not only OpenBSD-style "no services are running at
install time" but also a more durable security baseline which reduces
the threat of later introduced security holes that may appear after
installing vulnerable daemons such as Apache, MySQL, OpenSSH, and other
services.

There is already a Proactive Security Roadmap[1], created originally as
a Breezy specification but never brought to fruition.  The specification
for this details several steps that can be taken to reduce the risk of
exploitation of existing vulnerabilities.  This e-mail contains my
suggestions for first steps that should be taken to give Ubuntu users
the benefit of largely increased security.

It should first be noted that many of the security measures that may be
taken do not create any specific guarantees on their own; they must be
combined in interesting ways to create strong, quantifiable guarantees.
As an example, implementing a non-executable stack does not provide a
security guarantee because an application may mprotect() its stack or an
attacker may execute a return-to-libc style attack.  However, combining
a non-executable stack with SELinux enforcement to prevent
mprotect(...|PROT_EXEC) on the stack AND a randomized stack and mmap()
base creates a situation where certain known classes of attacks succeed
in only one of a known number of possible states, so the chance of
success per attempt can be quantified.

It is important that enhancements made be examined to understand what
guarantees they do and do not make and what kind of added security
benefits they provide.  It is also important to avoid implementing
enhancements that become unmaintainable in the future.  That being said,
it is wise to examine possible directions for proactive security
implementation in Ubuntu for all desktop and server level installations
to give maximum protection to end users from security threats.

Each section here has a bulleted summary at the end for convenience.
There is also a bullet point summary of everything at the end of this.
Reading both helps, but the lazy can just skip to the end.

ADDRESS SPACE RANDOMIZATION

The easiest target is address space randomization.  Ubuntu already
includes low-entropy address space randomization on the order of 19 bits
of stack and 8 bits of mmap() base entropy.  This has been in Linux
mainline since 2.6.12.  It places the stack at a position aligned to 16
bytes across an 8MiB range of VMA, and the mmap() base at a position
aligned to 4096 bytes (1 page) across an additional 1MiB range below
that.
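
As a rough sketch of where those bit counts come from: if a base address
is picked uniformly from a range at a fixed alignment, the number of
possible positions is the range divided by the alignment, and the
entropy is the base-2 logarithm of that count.  The helper name below is
made up for illustration; the ranges and alignments are the ones quoted
above.

```python
import math

MIB = 1024 * 1024

def entropy_bits(range_bytes, alignment_bytes):
    """Bits of entropy for a base picked uniformly within range_bytes
    at multiples of alignment_bytes (illustrative helper, not kernel code)."""
    positions = range_bytes // alignment_bytes
    return math.log2(positions)

# Stack: 16-byte alignment across an 8MiB range
print(entropy_bits(8 * MIB, 16))     # 19.0
# Second region: 4096-byte (one page) alignment across a 1MiB range
print(entropy_bits(1 * MIB, 4096))   # 8.0
```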

With an executable stack, stack randomization can be easily reduced to
approximately 11 bits, or 2048 possible states.  If the stack is
executable, then the position of the stack is all that matters for an
attack; system() can be called through the GOT during attack (position
of GOT is stored in a register).

With a non-executable stack, the position of system() has to be guessed
for return-to-libc and a stack frame has to be injected into the stack.
This requires breaking both the 8 bits of mmap() randomization (256
states) and the 19 bits of stack randomization (524288 states), which
multiply straight together to give 27 bits (134 million states) to
break.  The stack randomization can be reduced to 15 bits or so by
repeating and aligning stack frames for system() in the stack during
the attack, leaving about 23 bits (8 million states) in total.
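
The multiplication can be checked directly.  This is only arithmetic on
the figures quoted above; the constant and function names are mine.

```python
STACK_BITS = 19   # stack randomization
MMAP_BITS = 8     # mmap() base randomization

def states(bits):
    """Number of equally likely layouts for a given number of entropy bits."""
    return 2 ** bits

# Independent randomizations multiply, so their bit counts add.
full = states(STACK_BITS + MMAP_BITS)
# With repeated, aligned stack frames the stack is worth roughly 15 bits.
reduced = states(15 + MMAP_BITS)

print(full)     # 134217728
print(reduced)  # 8388608
```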

Note that the mmap() base has exactly 8 bits of randomization.  The
mmap() base can land anywhere across a little over 9MiB (2304 page
positions, roughly 11 bits), but it is always tied to the stack, and
thus not all of those 2304 positions can be reached in any given state
of the stack.  Because of this, the real entropy is always 8 bits.

Fedora Core, RHEL, and PaX also implement heap randomization.  Some
programs, notably Emacs, are very fragile when faced with this; most
work fine.

The main binary is not randomized; a very real attack is possible by
returning to code in the main executable which contains a call to the
desired function to exploit, such as system().  This requires more
complex instrumentation, but is very possible and nullifies security
gains from mmap() randomization.  This attack can be mitigated with
Position Independent Executables, discussed later.

Currently entropy is very low to avoid breaking VMA-intensive
applications.  On 64-bit architectures, entropy should be increased to
provide better protection.  Systematic brute forcing of daemons that
fork(), such as Apache, has been shown to break high-order
randomization in systems such as PaX in a mere 216 seconds[2].  An
anti-brute-force mechanism can slow or stop the attack; a good one does
not exist yet, but a workable one exists in GrSecurity.  I may be able
to describe a better system for this later.
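
To see why a fork()ing daemon is a soft target, consider a simplified
model.  This is my own assumption for illustration, not something taken
from the cited paper: children inherit the parent's layout, so an
attacker can cross off each wrong guess, whereas a target that
re-randomizes before every attempt forces guessing with replacement.
The function names are hypothetical and the 16-bit figure is only an
example.

```python
from fractions import Fraction

def expected_attempts_fixed_layout(states):
    """Expected guesses when the layout never changes between attempts
    (e.g. fork()ed children share the parent's layout), so wrong guesses
    are never repeated: (N + 1) / 2."""
    return Fraction(states + 1, 2)

def expected_attempts_rerandomized(states):
    """Expected guesses when the layout is re-drawn before every attempt:
    a geometric distribution with success probability 1/N, mean N."""
    return Fraction(states)

n = 2 ** 16  # illustrative entropy level, not a measurement
print(float(expected_attempts_fixed_layout(n)))
print(float(expected_attempts_rerandomized(n)))
```

Under this model re-randomizing only about doubles the expected work,
which suggests that slowing or stopping repeated attempts matters at
least as much as the layout itself when entropy is this low.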

I have created a patch for 2.6.16 that implements a framework useful for
per-process entropy instrumentation, which can later be fleshed out into
an SELinux policy control for per-binary entropy levels; this would
allow entropy to be decreased without disabling randomization on
troublesome applications that dislike high-order entropy.  It currently
does nothing of the sort, but does allow a kernel command line parameter
to set the maximum mmap() and stack entropy to up to 1/12 of VMA space.

The patch needs wider testing, slight code adjustment to handle more
architectures, and integration into mainline.  Mainline does not seem to
be interested, so an argument needs to be formed.  One possible argument is
that it would make it trivial to set the default level of entropy per
platform; 64-bit architectures can use far more entropy safely and
sanely than 32-bit.

We can make good guarantees on the probability of exploitation of memory
corruption bugs here; unfortunately, there is an easy evasion path.
Position Independent Executables can close this evasion path, allowing
us to make this awesome guarantee.  Furthermore, we cannot guarantee the
nature of any attack on ASLR alone; without further protections, we can
make no general assumption that any particular attack will be forced to
evade anything but the lower 11 bits of stack randomization.  To make
these guarantees, we need a strictly enforced memory protection policy.

Summary:

 * ASLR exists in current Ubuntu, inherited from mainline since 2.6.12
 * Current randomization is 8 bits mmap(), 19 bits stack
 * Fedora Core, RHEL, and PaX implement heap randomization
 * Heap randomization may break Emacs if implemented
 * Main executables are not randomized, but may be via PIE
 * The low entropy can be quickly broken in certain conditions with
   daemons which fork() to create children to handle connections,
   notably Apache
 * Brute force deterrents can be made for systematic brute forcing;
   however, they are problematic and create major DoS scenarios when an
   attack occurs, saving integrity and confidentiality at the expense of
   accessibility.
 * More robust brute force deterrents can be created, with more
   fine-grained protections and a fall-back contingency for major
   distributed attacks.  These are yet to be designed or implemented,
   but are VERY important, especially if we want to make any security
   guarantees.
 * High-order entropy breaks certain programs, notably Oracle on 32-bit
   due to VMA fragmentation.  Typically the TASK_SIZE on 64-bit archs
   makes this a non-issue.
 * There is no per-process or per-architecture entropy control; but this
   is semi-trivial to implement.  A patch exists to add framework for
   this.  Its maintainer (me) wants to get it into mainline so that he
   does not have to continue up-porting it in the future, and so that
   more experienced kernel hackers can port it to all other
   architectures.
 * We can make no guarantees on exploitation probability for memory
   corruption attacks until we make our binaries PIE, because there is a
   possible attack using the fixed position main executable and heap
   that allows total evasion of needing to know the stack or heap base
   for these attacks (including buffer overflows).
 * We do not make a guarantee on the nature of any given attack; that
   is, we cannot guarantee that attacks won't try direct shellcode
   injection.  Making these guarantees requires memory protection
   policies.

STACK PROTECTION

Stack Smash Protection is an easy target for Edgy, if Ubuntu uses gcc
4.1.  Fedora Core 5 utilizes FORTIFY_SOURCE and the integrated stack
smash protector derived from ProPolice (which should be the
option -fstack-protector); Adamantix, Hardened Gentoo, and OpenBSD have
used ProPolice -fstack-protector for several years.  Activating these
everywhere is trivial.

Both of these protections work to provide low-overhead protection
against vanilla stack-based buffer overflows.  FORTIFY_SOURCE is mainly
compile time with some run-time wrappers around "dangerous functions"
like strcpy(); -fstack-protector is mainly run-time with direct testing
of the integrity of protected stack frames, with some compile-time work
rearranging function local variables to prevent overflows from causing
as much damage.

The -fstack-protector-all option provides protection to ALL functions,
but should be avoided due to added overhead.  Realistically, using
only -fstack-protector should be sufficient; it provides overhead-free
compile-time protection for all functions while only supplying added
logic with run-time overhead to functions that have a local char[]
array.  Note that -fstack-protector will not properly protect
variable-argument functions like printf(s, ...), and
that -fstack-protector-all will not fully protect them either, but will
do a better job at the cost of more overhead.

The -fstack-protector option uses canary-based detection with a 32-bit
canary randomly generated at each program run.  This gives 1 state in 4
billion that will provide a successful attack; all other states lead to
immediate program termination.  If the attack also relies on knowing the
position of the stack, heap, mmap() base (libraries), or main
executable, any randomization applied to them is added to this in terms
of bits of entropy; the number of total states possible is multiplied
together from each individual protection, and only one state leads to
success in any given attack.
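
The canary idea can be illustrated with a toy model.  This is a sketch
of the concept only, not of what gcc actually emits: the frame layout,
the names, and the placeholder return address are simplifications I am
assuming for illustration.

```python
import os

CANARY_SIZE = 4  # 32-bit canary, as described above

class CanaryError(Exception):
    """Stands in for immediate program termination on a bad canary."""

def call_with_canary(user_input, buf_size=8, canary=None):
    """Model a frame laid out as [buffer][canary][return address] and copy
    user_input into the buffer with no bounds check, like strcpy()."""
    if canary is None:
        canary = os.urandom(CANARY_SIZE)   # random value per "program run"
    ret_addr = b"RETN"                     # placeholder return address
    frame = bytearray(buf_size) + bytearray(canary) + bytearray(ret_addr)

    # Unchecked copy: input longer than buf_size spills into the canary
    # and then into the return address.
    frame[0:len(user_input)] = user_input

    # Epilogue check: compare the stored canary before "returning".
    stored = bytes(frame[buf_size:buf_size + CANARY_SIZE])
    if stored != canary:
        raise CanaryError("stack smashing detected")
    start = buf_size + CANARY_SIZE
    return bytes(frame[start:start + len(ret_addr)])
```

In this model an overflow only reaches the return address undetected if
the attacker writes the exact canary value along the way, which is the
1 in 4 billion case for a random 32-bit canary.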

The canaries used by -fstack-protector can be evaded by other types of
attacks besides vanilla buffer overflows.  Notably, format string bugs
that allow writing arbitrary values to arbitrary memory addresses can
skip the overflow step altogether and bypass the canary.  These are
visible as printf(USER_SUPPLIED_STRING) and such in code.

There may be some breakage due to FORTIFY_SOURCE or -fstack-protector.
By design, these protections never break correct code.  If something
breaks, there is a genuine bug in either gcc or the package being
compiled; FORTIFY_SOURCE and -fstack-protector both
have been known to identify bugs and previously undiscovered security
holes in packages at run time or, in the case of FORTIFY_SOURCE,
sometimes at compile time.

Because there may be breakage due to bugs in the implementations of
FORTIFY_SOURCE or -fstack-protector, it is important to implement these
early and start hunting for these bugs.  Fedora Core 5 uses both of
these, so there is already plenty of testing going on in this scene.
Additionally, FORTIFY_SOURCE and -fstack-protector will help with
quality assurance in Ubuntu by occasionally exposing bugs that would
otherwise go unnoticed.

In some cases, we may just be stuck with a package that doesn't want to
work with one of these protections or the other due to bugs in their
implementation (gcc bugs).  If we can NOT get gcc fixed, then these
packages must be compiled without the protections; I *STRONGLY*
*RECOMMEND* Ubuntu maintainer policies and procedures be amended to
include a *SPECIFIC* contingency for this when these protections are
*OFFICIALLY* part of Ubuntu Linux, which *REQUIRES* packages without
protection to be *CLEARLY* *MARKED* at the *TOP* of their description in
a standard way.  This may be automated eventually.

The guarantees made by FORTIFY_SOURCE and -fstack-protector are very
narrow and unfortunately not well definable.  They do not define what
the system allows; they define a process to attempt to detect attacks
just in time.  The advantages here are very real; however, we can make
NO GENERALIZED GUARANTEES on the effectiveness of this protection.

As one example of breaking -fstack-protector, a function pointer in a
structure in the local stack frame of the caller may be destroyed.  If a
pointer to this structure was passed to the current function AND the
current function calls through that function pointer before returning,
then an overflow occurring before the call can overwrite the function
pointer and lead to a successful attack.  This scenario is
HIGHLY UNLIKELY; however, the probability of this occurrence CANNOT BE
QUANTIFIED and thus we CANNOT GUARANTEE a specific level of protection.

Summary:

 * FORTIFY_SOURCE and -fstack-protector prevent stack buffer overflows
   from carrying out injected attack logic (instead the program crashes)
 * Fedora Core 5 uses both of these protections; RHEL5 will inherit
   these.  This means there is already wide testing happening for these.
 * FORTIFY_SOURCE uses code analysis at compile time and replaces
   functions like strcpy() and sprintf() for run-time protection if it
   knows an attack is possible to detect at run-time
 * -fstack-protector uses canary-based protection at run-time
 * Canaries are random and 32-bits, and thus can be broken in 1 of every
   4 billion attacks
 * Many attacks that -fstack-protector stops also are hindered by stack
   and heap randomization; the effectiveness of these protections is
   multiplied in these cases.
 * Our current best possible protection is theoretically 32 + 19 + 8
   bits, 59 bits, approximately 5.8x10^17 states with only 1 being a
   success case, assuming a stack buffer overflow on a non-executable
   stack forcing a return-to-libc.
 * Canaries can be evaded by certain attacks like printf(USER_STRING);
   however, these attacks require other forms of bugs not characteristic
   of vanilla-style buffer overflows
 * Bugs in packages may be exposed by either FORTIFY_SOURCE or
   -fstack-protector; thus, these are not only security tools, but
   valuable quality assurance tools
 * Bugs in gcc may cause miscompilation when using FORTIFY_SOURCE or
   -fstack-protector.  This should lead to fixing gcc.
 * If a bug in FORTIFY_SOURCE or -fstack-protector in gcc cannot be
   fixed, the only solution is to build the affected packages without
   protection; these packages should be marked in their description as
   NOT protected to aid system administrators in making security
   conscious decisions.
 * Unlike with ASLR, we CANNOT make guarantees based on stack smash
   protection.  We CAN benefit from these protections, just not in a
   mathematically quantifiable way.

POSITION INDEPENDENT EXECUTABLES

It was noted that an attack is possible in which existing code in the
main executable image is executed to call various library functions
without identifying the mmap() base, thus breaking ASLR.  It is also
possible to place the data for these function calls in the heap, which
is not currently randomized; this escapes the need for guessing the
stack's position, and gives 0 bits of entropy and a 100% successful
attack to a crafty attacker despite ASLR.

These attacks rely on the main executable base and heap base being
known; thus, the logical way to stop them is by destroying the guarantee
of the address of the heap and main executable.  This can be done by
building the main executable as a position independent executable.

An example PIE can be seen by examining /lib/libc.so.6, which is a
library but also an executable.  Hardened Gentoo and Adamantix both use
a full PIE base, with all programs compiled as PIE unless they have a
specific issue (such as hand-coded assembly etc) which breaks them;
Fedora Core 5 uses PIE for exposed daemons such as Apache, and this will
be inherited by RHEL5.
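
One way to tell the two kinds of binary apart is the e_type field of the
ELF header: a fixed-address executable is marked ET_EXEC, while shared
libraries and PIEs are both marked ET_DYN.  The following is a minimal
sketch that reads only that field; the function names are hypothetical,
and it does not distinguish a PIE from an ordinary shared library.

```python
import struct

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # shared object, which is also how a PIE is marked

def elf_type(header):
    """Return the e_type field from the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA): 1 = little-endian; anything else is read as
    # big-endian here.
    endian = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type

def looks_position_independent(path):
    """True if the file at path is ET_DYN (a library or a PIE)."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_DYN
```

Calling looks_position_independent() on the libc path mentioned above
would be expected to return True where that path exists; the location of
libc varies between distributions and releases.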

Instrumenting PIE into a distribution is somewhat complex.  Maintainers
of distributions such as Hardened Gentoo and Adamantix spend a
significant amount of time fixing the few applications that break when
compiled as PIE and submitting patches to mainline.  The benefits of
having position independent executables, however, are quite significant
as stated before.  Furthermore, once the initial trial-by-fire is done,
maintaining the distribution in a workable state can be made trivial
while packages that cannot be made PIE are fixed and/or reported upstream.

PIE has a near negligible overhead on many platforms, including x86-64.
Unfortunately on x86, position independent code costs 0.99% in
overhead; and further, the -fomit-frame-pointer optimization, which
supplies a 5% performance boost, becomes ineffective.  Thus the overall
performance hit is about 6%.

The overhead of PIE is restricted to code in the main executable binary
image ONLY.  A quick look with oprofile after about a half hour shows
that 10% of CPU usage for the entire system occurs in this code; 8% of
this is in X, followed by The Battle for Wesnoth (0.54%), oprofiled
(0.477%), gtk-gnutella (0.2581%), and metacity (0.0757%).  This means
overall the performance loss is 6% * 10% == 0.6%.  This makes sense;
most work is done in libraries, including compression, decompression,
encoding, painting of graphical controls (GTK+), memory management,
database access (libmysql), and image manipulation.  Libraries are not
modified by PIE compilation, and are PIC already anyway.

It would be wise and very useful to examine individual programs at a
closer distance to see how they distribute their workload.  Some good
targets are Gimp, Gaim, Firefox, Evolution, Konqueror, gnome-panel,
nautilus, metacity, and OpenOffice.org.  I suspect that in terms of
their own execution these may range anywhere from 0.1% to 30% in time
spent executing in the main executable, which at 6% overhead translates
to a 0.006% to 1.8% performance hit (and 1.8% is still 26 minutes added
execution time per 24 hours of solid 100% CPU maxing; or about 1 extra
minute per hour of maxing your CPU).
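
The estimates above are simple products, sketched here so the figures
can be rechecked with other inputs.  The function names are made up; the
6%, 10%, and 30% inputs are the ones quoted in the text, and the 30%
figure is a guess rather than a measurement.

```python
def system_wide_hit(code_overhead, fraction_in_main_executable):
    """Overall slowdown when only main-executable code pays the PIE cost."""
    return code_overhead * fraction_in_main_executable

def extra_minutes(hit, busy_minutes):
    """Added minutes for busy_minutes of fully CPU-bound work."""
    return hit * busy_minutes

X86_PIE_OVERHEAD = 0.06  # ~1% PIC cost plus the lost 5% frame-pointer gain

print(system_wide_hit(X86_PIE_OVERHEAD, 0.10))  # whole-system estimate
print(system_wide_hit(X86_PIE_OVERHEAD, 0.30))  # upper guess for one program
print(extra_minutes(0.018, 60))                 # per hour at 100% CPU
print(extra_minutes(0.018, 24 * 60))            # per day at 100% CPU
```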

X may unfortunately suffer the most; but it also has interesting
permissions (privileged IO) that once compromised allow it to usurp
better-than-kernel access to the system (the SMM attack), making it
highly important to protect.  Other interesting programs that definitely
need protection include Gaim, Firefox, and Thunderbird, all of which
have an executable stack even on x86-64; although really PIE should just
be applied to everything due to the negligible overhead it causes.

PIE actually allows us to make a security guarantee on the order that an
attacker will not be able to successfully perform memory corruption
based exploits more than 1 in 2^(STACK_RANDOM_BITS + MMAP_RANDOM_BITS)
times.  This guarantee is probabilistic.  We should possibly decrease

Summary:

 * Stack/mmap() ASLR can be completely evaded by using the main
   executable and the heap for an attack.
 * This evasion can be avoided by making the main executable and heap
   move around with the mmap() base, which is what PIE allows.
 * glibc is an example of a PIE; Hardened Gentoo, Adamantix, and Fedora
   Core already employ PIE.
 * Implementing PIE can be complex, as a small but significant number of
   packages break from it.
 * Once deployed, maintaining a PIE-based distribution should be a
   negligible task.
 * PIE fixes should be submitted upstream.  Packages that don't build
   PIE should be reported upstream.
 * PIE can cause a 1% overhead on x86, and nullify a 5% performance gain
   from the -fomit-frame-pointer optimization, giving a total 6%
   performance hit on x86.
 * Performance overhead on x86-64 is negligible (measured 0.02% total),
   trust me we don't care.
 * Total system time spent in main executable images is apparently 10%;
   this gives the total performance hit of PIE as 0.6% on x86.
 * 8% of that 10% is apparently Xorg.
 * Individual processes should be profiled for better estimates on the
   real impact of PIE.  It is not likely to exceed 1.8% for any given
   application.
 * a 1.8% added overhead is about 1 minute added execution time per hour
   of spinning the CPU at 100% usage.  Honestly, do we care given the
   benefit?
 * We can make a guarantee on the likelihood of exploitation of certain
   classes of attacks (buffer overflows, double free()s, heap
   corruption..) when PIE binaries are executed with proper ASLR.  It's
   a very good guarantee.  We need PIE for this because there is an
   evasion path with fixed position executables.

MEMORY PROTECTION POLICIES

Memory protections are very helpful in preventing exploitation of
existing known and, more interestingly, unknown security holes in
applications.  In particular, preventing data memory from being
executable and program code from being writable gives a known security
benefit; PaX, OpenBSD W^X, and Exec Shield all bank on this principle in
some way to provide increased security.  PaX finds its place in Hardened
Gentoo and Adamantix; W^X and Exec Shield, formed on the same principle,
are used in OpenBSD, Fedora Core, and RHEL.

W^X and Exec Shield both provide simple code segment limit tracking,
separating executable memory from non-executable memory by simply making
low addresses executable and higher addresses not.  This typically has
the result of leaving the stack non-executable; everything else is
executable.  With PaX or with any CPU that has a hardware NX bit, any
page anywhere in memory can be non-executable.

Simply providing an NX bit, via hardware or emulation, is not enough.
It is true that even Exec Shield style emulation is useful in protecting
the stack; however, in classic form, any program can mprotect() its
stack or any other memory to be executable if it chooses.  Currently
Gaim and anything Gecko-based are known to do this, even on
x86-64[3][4][5].
Supplying facilities for non-executable memory does not guarantee or
even imply that programs benefit from it.

It is possible via simple SELinux policy to prevent a process from
turning its stack into an executable segment.  It is also possible to
exercise this control over the heap and other mappings, preventing
anonymous or writable file-backed mmap() segments and shared memory from
being executable.  Another obvious target is library and main executable
code, which should never be altered at run-time.

This type of policy is implemented in Fedora Core 5, although it is
disabled by default; I experienced breakage of Metacity with a forced NX
stack, but not much other difficulty.  It will predictably break Java
and Mono, and some of the stricter policy is likely to break any 3D apps
using nVidia's GLX library.  When such breakage is experienced, the
applications should be fixed to work properly under policy; however, it
is entirely possible as a work-around to place the applications in a
less restrictive SELinux role that does not enforce this memory
protection policy.

Implementing this kind of protection policy can lead to true data-code
separation, a configuration in which executable memory is non-alterable,
and non-executable memory never becomes executable.  Executable memory
is code, and non-executable memory is data; only data should ever be
alterable.  Memory being alterable at one point and executable
afterwards is data-code confused; this should obviously never happen,
aside from a few special cases such as with Java, Mono, or Qemu.  Other
cases are usually bugs (like nVidia's GLX library trying to squeeze a
little added benchmark performance with run-time generated code).
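
A quick way to spot data-code confused memory in a running process is to
look for mappings that are both writable and executable.  The sketch
below parses text in the format of /proc/<pid>/maps; the function name
is hypothetical and the sample mappings are invented for illustration,
not captured from a real process.

```python
def find_wx_mappings(maps_text):
    """Return (address_range, perms, path) for every mapping in
    /proc/<pid>/maps-style text that is both writable and executable."""
    found = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        addr_range, perms = fields[0], fields[1]
        path = fields[5].strip() if len(fields) > 5 else ""
        if "w" in perms and "x" in perms:
            found.append((addr_range, perms, path))
    return found

# Invented sample: one code mapping, one data mapping, and two W+X ones.
SAMPLE = """\
08048000-08056000 r-xp 00000000 03:01 12345 /usr/bin/example
08056000-08057000 rw-p 0000e000 03:01 12345 /usr/bin/example
b7e00000-b7e21000 rwxp 00000000 00:00 0
bffeb000-c0000000 rwxp 00000000 00:00 0 [stack]
"""

for mapping in find_wx_mappings(SAMPLE):
    print(mapping)
```

On Linux the same function can be fed the contents of /proc/self/maps,
or of another process's maps file given sufficient permissions.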

True data-code separation is present in PaX already, ingrained in the
kernel rather than exported to policy.  This is used by Hardened Gentoo
and Adamantix, and requires a handful of exceptions be made to get
everything working.  As stated before, this protection model can be
implemented with SELinux, and exceptions can be made via policy.

The advantage of true data-code separation is that guarantees can be
made about which programs are protected.  Programs under the full policy
cannot make their stacks or heaps executable, and thus are immune to
buffer overflow and double-free() attacks that end in shellcode
injection on the stack or heap.  These attacks may be re-instrumented to
use a return-to-libc style attack; at this point, they must defeat stack
and mmap() randomization to succeed.

Summary:

 * Using memory protections properly can prevent vulnerabilities from
   being exploited.
 * An NX bit or a form of emulation is necessary to utilize useful
   memory protections.
 * W^X and Exec Shield provide non-guaranteed NX approximation to
   possibly protect the stack (mapping an executable page above the
   stack makes the whole stack executable).
 * Many x86 processors and almost all other architectures now come with
   a hardware NX bit Linux can use.
 * Programs may not utilize memory protections properly on their own; no
   guarantees can be made simply because memory protections are there.
 * A protection policy can be put in place via SELinux to enforce a
   memory protection policy, forcing the stack, heap, anonymous and
   writable mmap() segments, and shared memory to be non-executable and
   the main executable and library .text segments to be non-writable.
 * This kind of enforcement leads to data-code separation, the clear
   distinction between memory to be executed (code) and memory to be
   altered and read but NOT executed (data).
 * Data-code separation may break a few programs, notably Java, Mono,
   Qemu, and anything linking to nVidia's broken GLX implementation
   instead of Mesa.
 * SELinux policy can target known-breaking and known-broken apps to
   reduce enforcement of data-code separation until they are fixed.
 * Data-code separation gives a system-wide guarantee on how memory
   protections are applied; this allows us to give a guarantee that
   certain classes of attacks--particularly those involving memory
   corruption--against protected programs will be forced to use a
   return-to-libc style exploit, which is protected by address space
   randomization.

SUMMARY

Here is a bullet point summary of protections discussed.

 * Address Space Layout Randomization
  * Already present to a light degree in mainline 2.6.12
  * No main executable or heap randomization currently
  * Gives a probabilistic guarantee on rate of success exploiting
    programs via various methods
  * Evadable via return-to-main and heap tricks
  * Attacks leading to heap execution are not protected against, as
    there is no heap randomization
  * Ignoring ret2main, buffer overflow attacks on the stack cannot be
    guaranteed to not use shellcode; this reduces our guarantee in this
    case to stack randomization possibly minus 8 bits.
  * Supplemented by PIE and memory protection policies to give better
    guarantees.
  * Randomization can be increased, especially per-architecture
 * ProPolice/FORTIFY_SOURCE
  * Already present in gcc 4.1 and used in FC5, Hardened Gentoo,
    Adamantix, and OpenBSD
  * Provides compile-time and run-time buffer overflow protection for
    stack-based buffers
  * Exposes bugs in programs and in many cases squelches attacks on
    previously unknown or unpatched bugs
  * Helpful and genuinely useful, but gives NO SOLID GUARANTEES; the
    details of individual vulnerabilities must be examined to show if
    each is protected by this
 * Position Independent Executables
  * Used by Hardened Gentoo and Adamantix system-wide
  * Fedora Core 5 uses PIE for certain servers like bind and apache
  * Incurs 6% execution overhead in code in the main executable only
    on x86
   * Overhead incurred is also only valid in the executable; this may
     only affect 10% of code executed, so overall this is likely 0.6%
   * Every 1.8% of overhead adds 1 minute of execution time per hour of
     execution spent with the CPU constantly at 100% load
  * Overhead on other architectures is negligible
  * Initial deployment of PIE can be complex for maintainers
  * PIE gives ASLR stronger guarantees
   * The main executable now moves around with the mmap() base
   * The heap base can be easily randomized as well, although the code
     used must be added to the kernel
   * Any memory corruption attack has to guess at least the 8 bits of
     mmap() offset AND either the heap entropy level OR the stack
     entropy.  This is our GUARANTEED minimum complexity.
    * This guarantee can ONLY be made with proper memory protection
      policies
     * Without these policies, consider it possible to break this
       guarantee with a stack smash (format string bug can evade
       ProPolice-style canary)
     * Stack smash style attack only needs to break stack entropy minus
       8 bits, i.e. 11 bits on i386, 2048 states.
    * i386
     * 16 bits of mmap() are easily possible usually; some programs will
       break, so this needs to be evaluated and possibly
       policy-controlled for robustness.
     * The heap can be randomized to 8 bits separate from mmap() to
       follow current trends; but it is also usually safe to increase
       this to 16 bits of entropy.
     * These give 16 bits (65536 states) or 32 bits (4 billion states)
       of total minimum entropy that needs to be broken per attack,
       guaranteed, for the attack to succeed.  Launching 4 billion
       attacks to succeed once is quite noisy....
    * Other archs
     * on x86-64 and other 64-bit, we can raise this to 43 bits safely
       (8.8x10^12 states) for each of heap and mmap()
   * Any memory corruption attack that can't use the heap to store
     attack instrumentation data will require guessing the 19 bits of
     stack and 8 bits of mmap()
    * This is not a general guarantee.
 * Memory protection policies
  * Implemented originally at kernel level by PaX
  * Implemented currently via SELinux policy options
  * SELinux policy implementing this is in Fedora Core 5, and shows some
    minor breakage
  * PaX is used in Hardened Gentoo and Adamantix, and also shows some
    minor breakage
  * The few apps that break can have these policies disabled for them
    via SELinux
   * Java, Mono, Qemu, anything linking to nVidia GLX instead of Mesa.
     You may find others.
  * Improves guarantees of non-exploitability given by ASLR
   * Injection of code into the stack or heap is NOT POSSIBLE
   * Altering of existing executable code is NOT POSSIBLE
   * Return-to-libc and return-to-main attacks MUST be used in most if
     not all classical memory corruption attacks
   * A stack smash and shellcode injection is simply not possible, and
     our minimum guarantee TRULY becomes what is given in the PIE
     section (mmap() + heap randomization)
  * This protection is easily disabled at boot time, with associated
    loss of security guarantees.

In the end, if played right, most if not all attacks in the below
classes can be easily made to have a 1 in 4 billion success rate on i386
and 1 in 7.7x10^25 on x86-64, guaranteed.

 * Stack smashes (buffer overflows on the stack)
 * Heap smashes (buffer overflows on the heap)
 * Pointer overwriting
 * Double free()

(Possibly format string bugs as well; I have not examined all possible
uses of format string bugs, but am rather certain that, as they read and
write to arbitrary memory addresses, they must be very sensitive to
changing address space layouts)

Other attacks that may result in remote code execution include
off-by-one errors and integer overflows; however, these can be used in
other interesting ways in some situations.  It CAN be guaranteed that
using these attacks to perform remote code execution is affected by the
same guarantee of probabilistic exploitation.
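
The headline figures can be rechecked by adding the proposed entropy
bits, and the same numbers show how noisy an attack would have to be.
This sketch assumes each attempt is independent with a 1 in 2^bits
chance, which is my simplification and ignores the brute-force shortcuts
discussed earlier; the function names are hypothetical.

```python
import math

def success_probability(bits, attempts):
    """Chance that at least one of `attempts` independent tries succeeds
    when each try has a 1 in 2**bits chance."""
    p = 2.0 ** -bits
    # log1p/expm1 keep precision when p is tiny
    return -math.expm1(attempts * math.log1p(-p))

def attempts_for(bits, target_probability):
    """Attempts needed to reach target_probability under the same model."""
    p = 2.0 ** -bits
    return math.ceil(math.log1p(-target_probability) / math.log1p(-p))

I386_BITS = 16 + 16     # mmap() + heap, as proposed above
X86_64_BITS = 43 + 43   # mmap() + heap on 64-bit

print(2 ** I386_BITS)                 # 4294967296
print(float(2 ** X86_64_BITS))        # on the order of 7.7e25
print(attempts_for(I386_BITS, 0.5))   # attempts for an even chance on i386
```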

I believe the first step should be the easiest: implementing
FORTIFY_SOURCE and -fstack-protector via gcc 4.1 in Ubuntu.  Although
this gives no solid guarantees, it does give very real benefits (in the
same way a shield gives no guarantees you won't get arrowed in the
crotch, but does protect your chest, abdomen, and face from normal
attacks).

I have a patch for kernel 2.6.16 to provide quick and dirty control over
mmap() and stack entropy.  This patch makes it easy to implement
per-architecture base entropy as well as policy-based per-execve()
(read: per-binary) entropy if anyone wants to add SELinux hooks and
appropriate code.  Eventually this patch needs to be argued into
mainline, but this is rather difficult.

PIE will be rather painful to implement I'm sure.  Once it's in place,
maintaining it should be much easier.  It will give the benefit of
closing a potential quick-and-dirty ASLR evasion hole.

An enhanced memory protection policy will be a huge asset, as it greatly
increases guaranteed protection (from 2048 states to 64k or 4 billion,
depending on heap/mmap() entropy).  Implementing this will require a
little policy work, but in general should also be easy.  It is mostly a
bug-hunting game from there, finding packages that don't like the policy
and reducing restrictions on them or patching them to work.

Also somebody needs to get heap entropy into mainline.  Fedora Core/RHEL
both have this, as does PaX (thus Hardened Gentoo and Adamantix).  This
would be easier if the entropy control patch got into mainline; heap
randomization breaks a few notable programs, so control over it is kind
of needed.

Questions?  Comments?  Flames?

[1] https://wiki.ubuntu.com/UbuntuDownUnder/BOFs/ProactiveSecurityRoadmap
[2] http://www.stanford.edu/~blp/papers/asrandom.pdf
[3] https://launchpad.net/distros/ubuntu/+source/gaim/+bug/34129
[4] https://launchpad.net/distros/ubuntu/+source/firefox/+bug/34131
[5] https://launchpad.net/distros/ubuntu/+source/firefox/+bug/34132

- --
All content of all messages exchanged herein are left in the
Public Domain, unless otherwise explicitly stated.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iQIVAwUBRIG+rws1xW0HCTEFAQJkbw//ep+q/DNjyOcVNbC1kKKoH9GaXJFTbT2F
SycxkViXxJbKTWiDilfETBYB2VjOvepIuMSl73rFKY8O8wWipf5+klTHrDuTW3Tq
2+wsFlKZ38iG02LYmP7nXADC4Ovjt7S3H0LNbFYCdKRWVMViU4tFn9/uvGpSuEmF
Dhyhzk+honmtLqjDvbW5iF1V7cVzcNq2fKnkFooKUbZcPW0+kU/tiLaF7TkWnsva
7O0h9L637uMYlycffuS7ZeoQBWitSVAvjms008Cw8YYstfB/Bp4qCXBKYnVq1bud
dj0K+nCgcEFqWE2xraDAl5JT4GsLy/Qk3S0oJ1RI4ZMo49acX5ywonVyzWy/cPLL
RlnS2nr7CVs78RzZHwZkpCideXlc/cASUtVNWb5CJffbrZtk72EkJHS3R4Iakwp+
OeyAtYyq+X5jRwf+q4q/k2UCWqirKlrPZhysvHUevdjj5dqc6LrVYpizxue8iVdg
yzwqH6OSzH0dbNByb3lDWWgC1n/ADXrlfDLyOqXy/6qF7RxVmnVYYP009s9BLD2k
yHxtLPhEGpnU/faH5VSBAauX8hUdeIfpScdv88uvhV9M+LJq7KIyV/2pR6bB3JFy
GgvbrCKxsSVsosUVx2jaRZqn76VyhkgoYuqFhsZZy4I6P1iy6pUDfJTT0u5a+lap
c7EhcuwpQuo=
=27Cz
-----END PGP SIGNATURE-----