When Hypervisor Met Snapshot Fuzzing

source: https://null2root.github.io/blog/2022/07/21/When-Hypervisor-Met-Snapshot-Fuzzing.html

1. Introduction

Hypervisors have been known as hard targets to fuzz for several years. Even so, many pioneers (Peter Hlavaty, Chaitin Tech, StarLabs, Peleg Hadar and Ophir Harpaz, and many others) have done amazing work to overcome this limitation and have found interesting bugs.

But it is not an easy topic for a newcomer like me to start from scratch. Had I started my research a few years ago, without What The Fuzz, I think manual code (or binary) auditing and static analysis would have been the only viable options.

What The Fuzz (a.k.a. WTF) is a snapshot-based fuzzer targeting Windows Ring-3 and Ring-0 components. WTF’s snapshot is based on a memory dump covering both kernel and user land. Because of that, WTF cannot emulate any functionality that depends on state outside the memory dump, and it cannot create a new process or thread, because the dump is limited to a single process. However, WTF supports breakpoint handlers: you can set a breakpoint on any address within the memory dump, and when fuzz execution reaches that address, the pre-defined handler is executed. Building on this, we can try to overcome some of WTF’s limitations, such as file access.
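To make the breakpoint-handler idea concrete, here is a toy model of it. Nothing below is WTF's real code: `ToyCpu`, `ToyBackend`, and the one-unit-per-step "execution" are all made up for illustration. It only shows the shape of the mechanism: a handler registered on an address runs when execution reaches it, and may fake a result and redirect or stop execution.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Toy model only: none of these types are WTF's real classes.
struct ToyCpu {
  uint64_t Rip = 0;
  uint64_t Rax = 0;
  bool Stopped = false;
};

using ToyHandler = std::function<void(ToyCpu &)>;

class ToyBackend {
public:
  // Register a handler for Address. Returns false if one is already set.
  bool SetBreakpoint(uint64_t Address, ToyHandler Handler) {
    return Breakpoints_.emplace(Address, std::move(Handler)).second;
  }

  // "Execute" by stepping Rip one unit at a time from Start up to End.
  // Returns how many breakpoint handlers fired.
  size_t Run(ToyCpu &Cpu, uint64_t Start, uint64_t End) {
    size_t Hits = 0;
    Cpu.Rip = Start;
    Cpu.Stopped = false;
    while (!Cpu.Stopped && Cpu.Rip < End) {
      const auto It = Breakpoints_.find(Cpu.Rip);
      if (It != Breakpoints_.end()) {
        Hits++;
        const uint64_t Before = Cpu.Rip;
        It->second(Cpu);
        // The handler stopped the run or redirected execution: do not step.
        if (Cpu.Stopped || Cpu.Rip != Before) {
          continue;
        }
      }
      Cpu.Rip++;
    }
    return Hits;
  }

private:
  std::unordered_map<uint64_t, ToyHandler> Breakpoints_;
};
```

In this model, a handler at the entry of a function the snapshot cannot service (a file read, for example) would set `Rax` to a fake result and move `Rip` to the return address, while a handler at the end of the interesting code would set `Stopped`.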

I really love WTF’s flexibility and potential, and I am going to show one example of using WTF against VirtualBox to prove how awesome it is.

2. Developing Fuzz Module

First, you should define your own fuzz module for your target. There are a few examples on GitHub (fuzzer_hevd.cc, fuzzer_tlv_server.cc) and in blog posts (ret2systems, Doar-e).

My target was VirtualBox’s SVGA component.

SVGA is something like the “JavaScript of hypervisors”. Because of its complex nature, many hypervisor bugs originate in SVGA. That is why I chose it as my first target.

VirtualBox has a function called vmsvgaR3FifoLoop. It waits until the guest submits new command data through a guest physical address (GPA). So this is a good spot (or should I call it a source?) to take a snapshot.

I set a breakpoint on vmsvgaR3FifoLoop+4E2 to take the snapshot. It is the same position as here, the start of the switch-case routine for SVGA and SVGA3D commands.

After creating the snapshot, I had to decide whether a single execution should run through multiple commands or not. Because SVGA consists of various commands (define, update, delete something…), I thought the fuzz campaign must handle multiple SVGA commands during a single round.

First, I had to define a structure that contains multiple testcases. Luckily, 0vercl0k had already written a nice example of doing that, so I did the same thing as below.

#define SVGA_CMD_BODY_MAX 0xA000

// SVGA (below 1040) and SVGA3D (1040 and above) command ids to fuzz.
// 1045 (SVGA_3D_CMD_CONTEXT_DEFINE) is left out on purpose.
const uint32_t SvgaCmdList[] = {1, 3, 19, 22, 25, 29, 30, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 1040, 1041, 1042, 1043, 1044, 1046, 1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081};

// 58 entries; computed so it cannot drift from the list above.
const uint32_t SvgaCmdListLen = sizeof(SvgaCmdList) / sizeof(SvgaCmdList[0]);

// One SVGA command: its id and the raw body bytes.
struct SvgaCmd_t {
  uint32_t SvgaCmdCode;
  std::vector<uint8_t> SvgaBody;
  NLOHMANN_DEFINE_TYPE_INTRUSIVE(SvgaCmd_t, SvgaCmdCode, SvgaBody);
};

// One testcase: a sequence of SVGA commands.
struct SvgaCmds_t {
  std::vector<SvgaCmd_t> SvgaCmds;
  NLOHMANN_DEFINE_TYPE_INTRUSIVE(SvgaCmds_t, SvgaCmds);
};

struct {
  // Commands of the current testcase that have not been executed yet.
  std::deque<SvgaCmd_t> SvgaCmds;
  CpuState_t Context;
  // Payload of the command currently being executed.
  uint8_t SvgaCmdBody[SVGA_CMD_BODY_MAX];
  uint32_t SvgaCmdBodySize;

  // Rewind the general purpose registers to the snapshot state.
  void RestoreGprs(Backend_t *B) {
    const auto &C = Context;
    B->Rsp(C.Rsp);
    B->Rip(C.Rip);
    B->Rax(C.Rax);
    B->Rbx(C.Rbx);
    B->Rcx(C.Rcx);
    B->Rdx(C.Rdx);
    B->Rsi(C.Rsi);
    B->Rdi(C.Rdi);
    B->R8(C.R8);
    B->R9(C.R9);
    B->R10(C.R10);
    B->R11(C.R11);
    B->R12(C.R12);
    B->R13(C.R13);
    B->R14(C.R14);
    B->R15(C.R15);
  }
} GlobalState;

SvgaCmds_t Deserialize(const uint8_t *Buffer, const size_t BufferSize) {
  const auto &Root = json::json::parse(Buffer, Buffer + BufferSize);
  return Root.get<SvgaCmds_t>();
}

bool InsertTestcase(const uint8_t *Buffer, const size_t BufferSize) {
  GlobalState.SvgaCmds.clear();

  // Root is a non-const local so the commands can be moved into the deque.
  auto Root = Deserialize(Buffer, BufferSize);
  for (auto &SvgaCmd : Root.SvgaCmds) {
    GlobalState.SvgaCmds.emplace_back(std::move(SvgaCmd));
  }

  return true;
}

It is almost identical to the example.

The next question was: how can I insert data into the snapshot? I had to find an insertion point and a proper memory address. Luckily (again), VirtualBox uses a function called vmsvgaR3FifoGetCmdPayload to receive command data from the guest. I defined a breakpoint handler in the Init() callback function as below.
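For reference, the on-disk testcase is a JSON document. The sketch below hand-builds the string that I would expect the two NLOHMANN_DEFINE_TYPE_INTRUSIVE definitions to produce through dump(). It does not use nlohmann::json at all; `ToySvgaCmd` and `ToJsonSketch` are hypothetical names, and the exact layout (keys in alphabetical order, bytes as plain numbers) is my assumption about the library's default behaviour, so check it against a real corpus file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical mirror of SvgaCmd_t, without the JSON macro.
struct ToySvgaCmd {
  uint32_t SvgaCmdCode;
  std::vector<uint8_t> SvgaBody;
};

// Hand-rolled stand-in for the expected JSON layout of one testcase
// (assumption: keys sorted alphabetically, body bytes as an array of numbers).
std::string ToJsonSketch(const std::vector<ToySvgaCmd> &Cmds) {
  std::string Out = "{\"SvgaCmds\":[";
  for (size_t i = 0; i < Cmds.size(); i++) {
    if (i != 0) {
      Out += ",";
    }
    Out += "{\"SvgaBody\":[";
    for (size_t j = 0; j < Cmds[i].SvgaBody.size(); j++) {
      if (j != 0) {
        Out += ",";
      }
      // uint8_t promotes to int, so this prints the numeric value.
      Out += std::to_string(Cmds[i].SvgaBody[j]);
    }
    Out += "],\"SvgaCmdCode\":" + std::to_string(Cmds[i].SvgaCmdCode) + "}";
  }
  Out += "]}";
  return Out;
}
```

A testcase with two commands, id 1 with body "AB" and id 1040 with an empty body, would then look like `{"SvgaCmds":[{"SvgaBody":[65,66],"SvgaCmdCode":1},{"SvgaBody":[],"SvgaCmdCode":1040}]}`.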

//
// unsigned __int8 *__fastcall vmsvgaR3FifoGetCmdPayload(
//     unsigned int cbPayloadReq,
//     volatile unsigned int *pFIFO,
//     unsigned int offCurrentCmd,
//     unsigned int offFifoMin,
//     unsigned int offFifoMax,
//     unsigned __int8 *pbBounceBuf,
//     unsigned int *pcbAlreadyRead,
//     PDMTHREAD *pThread,
//     VGAState *pThis,
//     VMSVGAR3STATE *pSVGAState,
//     PDMDEVINSR3 *pDevIns)
//
if (!g_Backend->SetBreakpoint("VBoxDD!vmsvgaR3FifoGetCmdPayload", [](Backend_t *Backend) {
      uint32_t cbPayloadReq = uint32_t(Backend->GetArg(0));
      Gva_t pbBounceBuf_gva = Backend->GetArgGva(5);
      Gva_t pcbAlreadyRead_gva = Backend->GetArgGva(6);

      // The target asks for more bytes than the testcase has: give up on
      // this command and restart the loop with the next one.
      if (cbPayloadReq > GlobalState.SvgaCmdBodySize) {
        DebugPrint("cbPayloadReq({:#x}) > SvgaCmdBodySize({:#x}), restore context and goto next round\n", cbPayloadReq, GlobalState.SvgaCmdBodySize);
        return GlobalState.RestoreGprs(Backend);
      }

      // u32PbBounceBufMaxSize is defined elsewhere in the module (not shown).
      if (cbPayloadReq > u32PbBounceBufMaxSize) {
        const uint64_t RetAddr = Backend->VirtRead8(Gva_t(Backend->Rsp()));
        DebugPrint("check this, RetAddr = {:#x}\n", RetAddr);
        std::abort();
      }

      // Copy the fuzzed payload into the bounce buffer inside the snapshot.
      if (!Backend->VirtWriteDirty(pbBounceBuf_gva, GlobalState.SvgaCmdBody, cbPayloadReq)) {
        fmt::print("Failed to write pbBounceBuf, pbBounceBuf = {:#x}, cbPayloadReq = {:#x}, SvgaCmdBodySize = {:#x}\n",
                   pbBounceBuf_gva.U64(), cbPayloadReq, GlobalState.SvgaCmdBodySize);
        std::abort();
      }

      if (!Backend->VirtWriteStructDirty(pcbAlreadyRead_gva, &cbPayloadReq)) {
        fmt::print("Failed to write pcbAlreadyRead, pcbAlreadyRead = {:#x}\n", pcbAlreadyRead_gva.U64());
        std::abort();
      }

      // Skip the real function and hand the bounce buffer back to the caller.
      Backend->SimulateReturnFromFunction(pbBounceBuf_gva.U64());
    })) {
  fmt::print("Failed to SetBreakpoint on VBoxDD!vmsvgaR3FifoGetCmdPayload\n");
  return false;
}

I also had to define the end point of execution. After some reversing, I found two spots and used them as end points.

const Gva_t SvgaLoopEnd = Gva_t(g_Dbg.GetSymbol("VBoxDD!vmsvgaR3FifoLoop") + 0x24AB);
if (!g_Backend->SetBreakpoint(SvgaLoopEnd, [](Backend_t *Backend) {
      DebugPrint("loop end reached, restore context\n");
      GlobalState.RestoreGprs(Backend);
    })) {
  fmt::print("Failed to SetBreakpoint on SvgaLoopEnd\n");
  return false;
}

const Gva_t SvgaLoopEnd2 = Gva_t(g_Dbg.GetSymbol("VBoxDD!vmsvgaR3FifoLoop") + 0x24B2);
if (!g_Backend->SetBreakpoint(SvgaLoopEnd2, [](Backend_t *Backend) {
      DebugPrint("loop end2 reached, restore context\n");
      GlobalState.RestoreGprs(Backend);
    })) {
  fmt::print("Failed to SetBreakpoint on SvgaLoopEnd2\n");
  return false;
}

As I said above, I created the snapshot at vmsvgaR3FifoLoop+4E2. So if I restore the register context, the next execution flow starts from there. Because of that, I had to parse a new testcase using a breakpoint handler.

const Gva_t SvgaLoopStart = Gva_t(g_Dbg.GetSymbol("VBoxDD!vmsvgaR3FifoLoop") + 0x4e2);
if (!g_Backend->SetBreakpoint(SvgaLoopStart, [](Backend_t *Backend) {
      //
      // Drop blacklisted commands (1045) before taking a reference to the
      // front element: pop_front() invalidates references to it.
      //
      while (!GlobalState.SvgaCmds.empty() &&
             GlobalState.SvgaCmds.front().SvgaCmdCode == 1045) {
        GlobalState.SvgaCmds.pop_front();
      }

      if (GlobalState.SvgaCmds.empty()) {
        DebugPrint("testcase deque empty! goto next round...\n");
        return Backend->Stop(Ok_t());
      }

      const auto &Testcase = GlobalState.SvgaCmds.front();

      // At the snapshot point, the command id lives in rbx.
      Backend->Rbx(Testcase.SvgaCmdCode);

      DebugPrint("SvgaCmdCode = {:#x}\n", Testcase.SvgaCmdCode);

      // SVGA3D commands (1040 and above) carry a 4-byte size header.
      const bool IsSvga3dCmd = Testcase.SvgaCmdCode >= 1040;
      const size_t HeaderSize = IsSvga3dCmd ? sizeof(uint32_t) : 0;

      // Added guard: never write past GlobalState.SvgaCmdBody.
      if (Testcase.SvgaBody.size() + HeaderSize > SVGA_CMD_BODY_MAX) {
        fmt::print("SvgaBody({:#x}) does not fit in SvgaCmdBody, abort\n", Testcase.SvgaBody.size());
        std::abort();
      }

      if (IsSvga3dCmd) {
        DebugPrint("Have to avoid AssertBreak(pHdr->size < pThis->svga.cbFIFO), write {:#x} on first DWORD\n", Testcase.SvgaBody.size());

        const uint32_t Svga3dCmdSize = Testcase.SvgaBody.size();
        if (Svga3dCmdSize >= 0x200000) {
          fmt::print("Svga3dCmdSize({:#x}) >= cbFIFO, abort\n", Svga3dCmdSize);
          std::abort();
        }

        memcpy(&GlobalState.SvgaCmdBody[0], &Svga3dCmdSize, 4);
        memcpy(&GlobalState.SvgaCmdBody[4], Testcase.SvgaBody.data(), Testcase.SvgaBody.size());
        GlobalState.SvgaCmdBodySize = Testcase.SvgaBody.size() + 4;
      } else {
        memcpy(GlobalState.SvgaCmdBody, Testcase.SvgaBody.data(), Testcase.SvgaBody.size());
        GlobalState.SvgaCmdBodySize = Testcase.SvgaBody.size();
      }

      GlobalState.SvgaCmds.pop_front();
    })) {
  fmt::print("Failed to SetBreakpoint on SvgaLoopStart\n");
  return false;
}

After some investigation, I found that the CreateDeviceEx call in vmsvga3dContextDefine keeps causing CR3 context switches, and I did not find any way to handle it using a breakpoint handler. So I just blacklisted it (SVGA_3D_CMD_CONTEXT_DEFINE = 1045).

I have disclosed almost every part of the fuzz module except the mutation part. I just use the libFuzzer mutator to mutate the body data of each SVGA command and randomly pick one of the commands in the array.
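The per-command logic of the loop-start handler (skip the blacklisted command, then frame the body) can be isolated into a small standalone sketch. This is not the module's code: `ToySvgaCmd`, `NextPayload`, and the two constants are names I made up, and the payload is returned as a vector instead of being copied into a fixed buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <deque>
#include <optional>
#include <utility>
#include <vector>

constexpr uint32_t kSvga3dCmdBase = 1040;       // first SVGA3D command id
constexpr uint32_t kSvga3dContextDefine = 1045; // blacklisted command id

// Hypothetical mirror of SvgaCmd_t.
struct ToySvgaCmd {
  uint32_t SvgaCmdCode;
  std::vector<uint8_t> SvgaBody;
};

// Pops the next usable command and returns the bytes the payload hook would
// serve for it, or std::nullopt once the queue is exhausted.
std::optional<std::vector<uint8_t>> NextPayload(std::deque<ToySvgaCmd> &Queue) {
  // Skip blacklisted commands first; only then touch the front element.
  while (!Queue.empty() && Queue.front().SvgaCmdCode == kSvga3dContextDefine) {
    Queue.pop_front();
  }
  if (Queue.empty()) {
    return std::nullopt;
  }

  const ToySvgaCmd Cmd = std::move(Queue.front());
  Queue.pop_front();

  std::vector<uint8_t> Payload;
  if (Cmd.SvgaCmdCode >= kSvga3dCmdBase) {
    // SVGA3D commands: prepend the body size as a 4-byte header, in host
    // byte order, as the memcpy in the handler does.
    const uint32_t Size = static_cast<uint32_t>(Cmd.SvgaBody.size());
    Payload.resize(sizeof(Size));
    std::memcpy(Payload.data(), &Size, sizeof(Size));
  }
  Payload.insert(Payload.end(), Cmd.SvgaBody.begin(), Cmd.SvgaBody.end());
  return Payload;
}
```

Copying the command out of the deque before popping it avoids holding a reference to an element that pop_front() has already destroyed.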

explicit CustomMutator_t(std::mt19937_64 &Rng, const size_t TestcaseMaxSize)
    : Rng_(Rng), TestcaseMaxSize_(TestcaseMaxSize) {

  // Scratch space for the serialized (multi-command) testcase (default = 1MB).
  ScratchBuffer__ = std::make_unique<uint8_t[]>(_1MB);
  ScratchBuffer_ = {ScratchBuffer__.get(), _1MB};

  BodyMutator_ = std::make_unique<LibfuzzerMutator_t>(Rng, TestcaseMaxSize);
}

// skip for brevity...

std::string Mutate(uint8_t *Data, const size_t DataLen, const size_t MaxSize) {
  auto Root = Deserialize(Data, DataLen);
  auto &SvgaCmds = Root.SvgaCmds;

  for (auto &SvgaCmd : SvgaCmds) {

    //
    // 50%: swap the command id for a random one.
    // GetUint32 is inclusive on both ends, so the last valid index is
    // SvgaCmdListLen - 1.
    //
    if (GetUint32(0, 1) == 1) {
      uint32_t SvgaCmdRandomIdx = GetUint32(0, SvgaCmdListLen - 1);
      SvgaCmd.SvgaCmdCode = SvgaCmdList[SvgaCmdRandomIdx];
    }

    // Always mutate the body with the libFuzzer mutator.
    memcpy(GlobalMutBuffer, SvgaCmd.SvgaBody.data(), SvgaCmd.SvgaBody.size());
    size_t NewTestcaseSize = BodyMutator_->Mut_.Mutate(GlobalMutBuffer, SvgaCmd.SvgaBody.size(), MaxSize);
    SvgaCmd.SvgaBody.resize(NewTestcaseSize);
    memcpy(SvgaCmd.SvgaBody.data(), GlobalMutBuffer, NewTestcaseSize);
  }

  json::json Serialized;
  to_json(Serialized, Root);
  return Serialized.dump();
}
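One detail worth spelling out is the random pick from the command array. The "50%" comment on GetUint32(0, 1) suggests that GetUint32(A, B) draws from the closed interval [A, B]; if so, the last valid index is the array length minus one. The sketch below assumes that shape. `GetUint32Sketch` and `PickCommand` are hypothetical names, not the module's helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Assumed shape of the module's GetUint32 helper: both bounds are inclusive,
// which is how std::uniform_int_distribution behaves.
uint32_t GetUint32Sketch(std::mt19937_64 &Rng, uint32_t A, uint32_t B) {
  std::uniform_int_distribution<uint32_t> Dist(A, B);
  return Dist(Rng);
}

// Pick a uniformly random element of a fixed-size array. The upper bound is
// N - 1 because the distribution may return its upper bound.
template <size_t N>
uint32_t PickCommand(std::mt19937_64 &Rng, const uint32_t (&List)[N]) {
  static_assert(N > 0, "cannot pick from an empty list");
  return List[GetUint32Sketch(Rng, 0, static_cast<uint32_t>(N - 1))];
}
```

Taking the array by reference lets the compiler deduce N, so the bound cannot go out of sync with the list.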

I also defined a generator function. It is very useful if you are too lazy to create random inputs, like me ^~^.

std::string GenerateTestcase() {
  SvgaCmds_t Root;
  const auto N = GetUint32(1, 10);

  for (size_t Idx = 0; Idx < N; Idx++) {
    SvgaCmd_t SvgaCmd;

    // GetUint32 is inclusive on both ends, so the last valid index is
    // SvgaCmdListLen - 1.
    SvgaCmd.SvgaCmdCode = SvgaCmdList[GetUint32(0, SvgaCmdListLen - 1)];
    SvgaCmd.SvgaBody.resize(GetUint32(0x100, 0x400));

    if (GetUint32(0, 1) == 0) {
      // Half of the time: a body made of 'A' bytes.
      for (auto &Byte : SvgaCmd.SvgaBody) {
        Byte = 0x41;
      }
    } else {
      // Otherwise: a body made of random bytes.
      for (auto &Byte : SvgaCmd.SvgaBody) {
        Byte = (uint8_t)GetUint32(0, 255);
      }
    }

    Root.SvgaCmds.emplace_back(std::move(SvgaCmd));
  }

  json::json Serialized;
  to_json(Serialized, Root);
  return Serialized.dump();
}

std::string GetNewTestcase(const Corpus_t &Corpus) override {

  //
  // 20%: generate a fresh testcase instead of mutating one.
  //
  if (GetUint32(1, 5) == 1) {
    return GenerateTestcase();
  }

  const Testcase_t *Testcase = Corpus.PickTestcase();

  if (!Testcase) {
    fmt::print("The corpus is empty, generate random one\n");
    return GenerateTestcase();
  }

  //
  // Copy the input in a buffer we're going to mutate.
  //

  memcpy(ScratchBuffer_.data(), Testcase->Buffer_.get(),
         Testcase->BufferSize_);

  // return Mutate(ScratchBuffer_.data(), Testcase->BufferSize_, TestcaseMaxSize_);

  // Leave room for the 4-byte size header of SVGA3D commands.
  return Mutate(ScratchBuffer_.data(), Testcase->BufferSize_, u32PbBounceBufMaxSize - sizeof(uint32_t));
}

Aaaaannnd, this is everything you need to fuzz! I skipped some boilerplate code for brevity, but I think you can easily find what you need to define a fully working fuzz module.

…..Ooooh wait, I forgot something. After some hours of struggle, I defined a blacklist of functions that cause CR3 context switching. I just put it at the end of the Init() function.

bool SetupVBoxBlacklistHooks() {
  // Capture-less handler: skip the function and return 0 to the caller.
  const auto ReturnZero = [](Backend_t *Backend) {
    Backend->SimulateReturnFromFunction(0);
  };

  // Failing to hook any of these is fatal.
  const char *RequiredHooks[] = {
      "VBoxRT!RTLogLoggerEx",
      "VBoxVMM!PDMCritSectLeave",
      "VBoxC!util::AutoLockBase::release",
      "VBoxC!util::AutoLockBase::acquire",
      "VirtualBoxVM!UIFrameBufferPrivate::NotifyChange",
      "VBoxRT!SUPSemEventWaitNoResume",
  };

  for (const char *Symbol : RequiredHooks) {
    if (!g_Backend->SetBreakpoint(Symbol, ReturnZero)) {
      fmt::print("Failed to SetBreakpoint on {}\n", Symbol);
      std::abort();
    }
  }

  // Failing to hook these is only reported.
  const char *OptionalHooks[] = {
      "d3d9!CBaseDevice::Release",
      "d3d9!CD3DBase::Clear",
      "d3d9!CMipMap::SetAutoGenFilterType",
      "d3d9!CMipMap::GenerateMipSubLevels",
  };

  for (const char *Symbol : OptionalHooks) {
    if (!g_Backend->SetBreakpoint(Symbol, ReturnZero)) {
      fmt::print("Failed to SetBreakpoint on {}\n", Symbol);
    }
  }

  // GetDC returns a fake, non-null value instead of 0.
  if (!g_Backend->SetBreakpoint("USER32!GetDC", [](Backend_t *Backend) {
        Backend->SimulateReturnFromFunction(0x1337);
      })) {
    fmt::print("Failed to SetBreakpoint on USER32!GetDC\n");
  }

  return true;
}

Yep. That’s really everything. I ran this fuzz module for several hours and it found an interesting crash.

3. Vulnerability( TL;DR )

The crash occurred in the vmsvga3dSurfaceCopy function (PageHeap needed).

This function copies surface data from one surface to another using surface IDs, and there is no boundary check between the surfaces, so it becomes an exploitable wildcopy vulnerability in heap memory.
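To illustrate what kind of check is missing, here is a toy model of a bounded surface copy. It is neither VirtualBox's code nor its patch: `ToySurface`, `ToyBox`, `BoxFits`, and `CopyIsInBounds` are hypothetical, and real surfaces have more dimensions (mip levels, faces, depth) than the two used here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; these are not VirtualBox structures.
struct ToySurface {
  uint32_t Width;
  uint32_t Height;
};

struct ToyBox {
  uint32_t X, Y; // top-left corner
  uint32_t W, H; // extent
};

// True when the box lies fully inside the surface. Written with subtractions
// so that X + W cannot wrap around in 32-bit arithmetic.
bool BoxFits(const ToySurface &S, const ToyBox &B) {
  return B.X <= S.Width && B.W <= S.Width - B.X &&
         B.Y <= S.Height && B.H <= S.Height - B.Y;
}

// A copy is in bounds only when the source box fits the source surface and
// the same extent fits the destination surface at the destination offset.
bool CopyIsInBounds(const ToySurface &Src, const ToyBox &SrcBox,
                    const ToySurface &Dst, uint32_t DstX, uint32_t DstY) {
  const ToyBox DstBox{DstX, DstY, SrcBox.W, SrcBox.H};
  return BoxFits(Src, SrcBox) && BoxFits(Dst, DstBox);
}
```

Without a check of this kind, copying a box taken from a large surface into a smaller one writes past the end of the smaller surface's allocation.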

spot the bug

This vulnerability was patched in the 6.1.36 release in July 2022.

4. Conclusion

I think the importance of snapshot fuzzing is that it lets the researcher focus on the target itself.

Fuzzers based on runtime instrumentation or DBI often create (very) unreasonable side effects or need lots of time to produce a working harness. The concept of snapshot fuzzing makes it possible to reduce this wasted time.


When Hypervisor Met Snapshot Fuzzing
https://usmacd.com/cn/2022-07-21-When-Hypervisor-Met-Snapshot-Fuzzing/
Author
henices
Published on
July 21, 2022
License