Skip to content

Linux Kernel: Vulnerability in the eBPF verifier register limit tracking

Moderate
rcorrea35 published GHSA-hfqc-63c7-rj9f Jul 16, 2024

Package

Kernel (Linux)

Affected versions

< 6.8

Patched versions

None

Description

Summary

A bug in the verifier’s register limit tracking was found by using https://github.com/google/buzzer that allows an attacker to trick the eBPF verifier into thinking a register has a value different from the one it takes when executing the program.
Using this bug, an LPE exploit was written allowing the attacker to gain an arbitrary kernel memory R/W primitive that can then be used to gain a full system compromise.

Severity

Moderate -

Proof of Concept

This is a complex operation that involves following the possible branches a bpf program can take and calculate all the possible states a register can take for all given possible branches a program can take.
To do this, the verifier has a bpf_reg_state structure. Particularly, the verifier keeps track of the possible minimum and maximum values a register can take when it is interpreted as an uint64_t, int64_t, uint32_t, int32_t.
Additionally, the verifier keeps track of all the bits that are known to be set on a register, this is tracked in the var_off field of bpf_reg_state . Var_off works by keeping a u64 value that acts as a mask, for any given bit that is known to the verifier, the mask will take the value of 0 at that bit’s position and the bit’s value will be tracked in the value member of the var_off.
To better illustrate this, here are a few examples from the verifier’s perspective of the limits of a register for a few given operations. For simplicity we will only illustrate the limits for when the register gets interpreted as a signed 32 bit value. U64, S64 and U32 values would follow a similar idea:


Operation S32 min value S32 max value Var off (value, mask) Explanation
R1 = input() S32_MIN (0x80000000) S32_MAX (0x7FFFFFFF) (0, 0xFFFFFFFF) If R1 takes an arbitrary value from the user, the verifier knows nothing about it, therefore the limits are the absolute maximum/minimum.Note that the mask has all 1’s indicating no bit has known value
R1 = 1337 1337 1337 (1337, 0x0) If the verifier is initialized to a constant value, then everything about it is known, and the limits and var off are adjusted accordingly
R1 = input()R1 |= 2 0x80000002 S32_MAX (0x7FFFFFFF) (2, 0xFFFFFFFD)(hex(FD) == binary(1111 1101) Here R1 has the second bit set via the or operation. This gives the verifier enough information to update the possible minimum value as well as marking one bit in the var off mask as known

After doing some changes to buzzer and implementing a new fuzzing strategy guided by coverage, we noticed the following log

REG INVARIANTS VIOLATION (true_reg1): range bounds violation u64=[0xffffffff, 0xfffffffe] s64=[0xffffffff, 0xfffffffe] u32=[0xffffffff, 0xfffffffe] s32=[0xffffffff, 0xfffffffe] var_off=(0xffffffff, 0x0)

REG INVARIANTS VIOLATION (true_reg2): range bounds violation u64=[0xffffffffffffffff, 0xfffffffffffffffe] s64=[0xffffffffffffffff, 0xfffffffffffffffe] u32=[0xffffffff, 0xfffffffe] s32=[0xffffffff, 0xfffffffe] var_off=(0xffffffffffffffff, 0x0)

REG INVARIANTS VIOLATION (false_reg2): const tnum out of sync with range bounds u64=[0x0, 0xffffffffffffffff] s64=[0x8000000000000000, 0x7fffffffffffffff] u32=[0x0, 0xffffffff] s32=[0x80000000, 0x7fffffff] var_off=(0xffffffffffffffff, 0x0)

Upon closer inspection, it became clear that there was a problem with the limit tracking of the registers; The minimum value is greater than the maximum value. And while this ended up not being too relevant for the bug that turned into a vuln, it led us in the right direction.
This error code is produced here, which in turn is called by the set_min_max function.
After reducing the bpf poc produced by buzzer, the program responsible for producing this error message looked similar to this (note: the read from map is oversimplified because it is not relevant to the bug):

R1 = read_from_map() // The verifier knows nothing about R1

R1 |= 2 // The verifier knows that bit 2 is set but knows nothing about the rest

if R1 != 0x7ffffffd goto b1:
Exit // False branch

b1:

R0 = 0 // True branch
Exit

After the or operation, the verifier knows the following about R1

Operation S32 min value S32 max value var_off
R1 = input()R1 |= 2 0x80000002 S32_MAX (0x7FFFFFFF) (2, 0xFFFFFFFD)(hex(FD) == binary(1111 1101)

When the verifier analyzes a branch operation, it splits the register states into the registers of the true branch (true_reg1, true_reg2) and the registers of the false branch (false_reg1, false_reg2). 

For the program above, since the second value in the conditional is a constant, the verifier creates a “fake” register initialized to the value of 0x7ffffffd


Further Analysis

While this exploit gives a powerful primitive to read/write arbitrary kernel memory. That said, the vulnerability is only reachable if an attacker has the CAP_BPF capability. This can be either an application or a container which as of now is not as widespread.

Timeline

Date reported: 06/07/2024
Date fixed: 06/13/2024
Date disclosed: 07/15/2024

Severity

Moderate

CVE ID

CVE-2024-41003

Weaknesses

No CWEs

Credits