This article explores the pgrx framework, which simplifies the creation of custom PostgreSQL extensions to bring more logic closer to your database. Traditionally, writing such extensions required familiarity with C and a deep understanding of PostgreSQL internals, which could be quite challenging. pgrx lowers the barrier and allows developers to use Rust, known for its safety and performance, making the process of creating efficient and safe database extensions much more accessible.

pg_sysload

When working with large datasets and migrations (as discussed in How Not to Change PostgreSQL Column Type), or during resource-intensive maintenance tasks, you'll want to optimise speed and minimise disruption to other processes. One way to control the pace of batch operations is to consider the load on the underlying system.

Many Unix-based systems (we will focus on Linux) provide a valuable metric called the system load average. This average consists of three values: the 1-minute, 5-minute, and 15-minute load averages. The load average is not normalised for the number of CPU cores, so a load average of 1 on a single-core system means full utilisation, while on a quad-core system, it indicates 25% utilisation.

In many cases, the system load average is also an excellent indicator of how ongoing operations are impacting a busy database cluster. In this article, we will create a PostgreSQL extension with a function called sys_loadavg() that retrieves this load information. We will use the /proc/loadavg file (part of the proc filesystem), which exposes underlying system details.

Getting Started with pgrx

Before we start, ensure you have:

With these prerequisites in place, you can install pgrx itself and create a new extension skeleton:

cargo install --locked cargo-pgrx
cargo pgrx new pg_sysload
cd pg_sysload

This gives you a complete environment for developing your own PostgreSQL extensions in Rust. While Rust might seem daunting at first (especially with its async and other features), the language itself is quite powerful. Its ease of extension creation makes it worth a second look.

Setting Up the Extension

While the pgrx new command sets up a basic template, we will create a more sophisticated extension. Its primary logic involves parsing the /proc/loadavg file, extracting its values, and returning them to the user.

The logic itself is straightforward:

#[pg_extern]
fn sys_loadavg() -> Option<Vec<f64>> {
    // Read the contents of the /proc/loadavg
    let mut file = match fs::File::open("/proc/loadavg") {
        Ok(file) => file,
        Err(err) => {
            pgrx::error!("Error reading /proc/loadavg: {}", err);
        }
    };
    let mut contents = String::new();
    if let Err(err) = file.read_to_string(&mut contents) {
        pgrx::error!("Error reading /proc/loadavg: {}", err);
    }

    // Extract the load average fields
    let fields = contents.split_whitespace().collect::<Vec<_>>();
    if fields.len() >= 3 {
        Some(
            fields[..3]
                .iter()
                .filter_map(|s| f64::from_str(s).ok())
                .collect(),
        )
    } else {
        pgrx::error!("Invalid format in /proc/loadavg");
    }
}

To limit usage to systems supporting the proc filesystem, we check for the presence of the file when the extension is loaded:

#[pg_guard]
fn _PG_init() {
    INIT.call_once(|| {
        let loadavg_available = fs::metadata("/proc/loadavg").is_ok();
        if !loadavg_available {
            pgrx::error!("/proc/loadavg not found. Extension cannot load.");
        }
    });
}

This basic check prevents users from loading the extension on incompatible systems. With minor details omitted, that's almost all there is to it.

You can find the complete lib.rs file in the accompanying GitHub repository.

Running the Extension

Where pgrx truly shines is in all the heavy lifting it does for you. It leverages the cargo command. To initialise the extension, run:

cargo pgrx init

This downloads and prepares PostgreSQL source code for versions 12 to 16 (at the time of writing). After a bit of waiting (hopefully successfully, if you have all the system dependencies), run the extension:

cargo pgrx run

You'll get a psql prompt for a dedicated PostgreSQL instance. Create and use the extension:

CREATE EXTENSION pg_sysload;

Now you can try the newly exposed function:

SELECT sys_loadavg();
   sys_loadavg
------------------
 {1.17, 0.57, 0.32}
(1 row)

And that's all. Your very first PostgreSQL extension is working.

Throttling Batch Processing

With the extension ready, let's revisit our original goal: throttling long-running batch data processing. The data from sys_loadavg can be used in various ways:

  • Dynamic Sleep Times: Insert calculated sleep intervals based on the system load.
  • Dynamic Batch Sizes: Adjust the number of rows processed per batch based on available resources.

Here's an example:

-- Sleep for 5 seconds multiplied by the 1-minute load of the system
SELECT pg_sleep((sys_loadavg())[1] * 5);

-- Assume 20 cores and for each one available, add 100 rows to process
SELECT ... LIMIT (SELECT (20 - (sys_loadavg())[1])::int * 100);

This allows you to fine-tune processing for unsupervised operation, automatically adapting to the current system load.

Conclusion

And there you have it! You've just built your first PostgreSQL extension using Rust and the pgrx framework. I have always found the prospect of writing an extension quite daunting (probably because I'm long gone from the C-ecosystem), but this wasn't so bad. You've now got a handy tool for monitoring system load, perfect for keeping tabs on how your database processes long-running migrations.

This is just scratching the surface. We haven't even touched on the really fun stuff – extending PostgreSQL's internals with custom data types, operators, or indexes. You could build extensions that transform or aggregate data, hook into external APIs, or create background workers, bringing various business logic directly into the database engine. And yes, tapping into the database's core can be a bit risky, and it is always important to assess risks and recovery options. But whatever you choose to create, remember, with pgrx at your side, you've got the power of Rust to keep things safe and sound.