order process part 1: Increasing purity and testability

Photo by eberhard 🖐 grossgasteiger / Unsplash

In this article I want to show you how to increase function purity, modularity and testability in inherently impure programs, based on a realistic code example.

🤠
The following code snippets are not necessarily best practices. Each snippet is supposed to highlight one specific characteristic, without mixing in too many concepts at once. However, the code will get progressively better over time.

The task

Create a function that takes an input string as follows:

Hamburger, 6.95, 2
Cheeseburger, 7.95, 2
Fries, 3.25, 3

The first column is the name, the second column the net price, the third column the amount of ordered items.

Based on this input and a tax rate of 8%, print a receipt that looks like this:

You ordered 2 x Hamburger. That makes $15.01 (incl. $1.11 tax)
You ordered 2 x Cheeseburger. That makes $17.17 (incl. $1.27 tax)
You ordered 3 x Fries. That makes $10.53 (incl. $0.78 tax)
The total is $42.71 (incl. $3.16 tax)
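To make the numbers less magical, here is a small sketch of the arithmetic behind the first receipt line. It assumes, like the scripts in this article, that rounding only happens when the values are formatted; the variable names are picked for illustration.

```php
<?php

// Worked example for the input line "Hamburger, 6.95, 2" at a tax rate of 8%
$price = 6.95;
$amount = 2;
$taxRate = 0.08;

$net = $price * $amount; // 13.90
$tax = $net * $taxRate;  // 1.112
$gross = $net + $tax;    // 15.012

// Rounding to two decimals happens only here, during formatting
$line = sprintf("That makes \$%.2f (incl. \$%.2f tax)", $gross, $tax);
```

The totals line works the same way: the unrounded gross and tax values of all items are summed first and only formatted at the end.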

Ex1: Initial Script

For now, we just create a proof of concept:

<?php

function printReceipt(): void
{
    $orderString = <<<TEXT
        Hamburger, 6.95, 2
        Cheeseburger, 7.95, 2
        Fries, 3.25, 3
        TEXT;

    $lines = explode("\n", $orderString);
    $order = [];
    foreach ($lines as $line) {
        $parts = array_map('trim', explode(',', $line));
 
        $name = $parts[0];
        $price = (float)$parts[1];
        $amount = (int)$parts[2];
        $order[] = [$name, $price, $amount];
    }
    
    $taxRate = 0.08;
    $grossTotal = 0;
    $taxTotal = 0;

    foreach ($order as $item) {
        // Destructure array to get easy access to each variable
        [$name, $price, $amount] = $item;

        // Calculate totals and taxes for the current item
        $netTotal = $price * $amount;
        $tax = $netTotal * $taxRate;
        $gross = $netTotal + $tax;

        // Update sum totals
        $grossTotal += $gross;
        $taxTotal += $tax;

        // Print the receipt line with proper formatting
        echo sprintf(
            "You ordered %d x %s. That makes \$%.2f (incl. \$%.2f tax)\n",
            $amount,
            $name,
            $gross,
            $tax
        );
    }

    // Print totals with proper formatting
    echo sprintf(
        "The total is \$%.2f (incl. \$%.2f tax)\n",
        $grossTotal,
        $taxTotal
    );
}

printReceipt();

This script does the job 🎉, albeit only for one given $orderString and $taxRate.

Ex2: extracting stringToOrder

Before we continue with other things, let's make the code just a little bit more modular and accept an input.

<?php

function stringToOrder(string $input): array
{
    $lines = explode("\n", $input);
    $order = [];
    foreach ($lines as $line) {
        $parts = array_map('trim', explode(',', $line));
 
        $name = $parts[0];
        $price = (float)$parts[1];
        $amount = (int)$parts[2];
        $order[] = [$name, $price, $amount];
    }

    return $order;
}

function printReceipt(string $input): void
{
    $taxRate = 0.08;
    $order = stringToOrder($input);
    
    $grossTotal = 0;
    $taxTotal = 0;

    foreach ($order as $item) {
        // Destructure array to get easy access to each variable
        [$name, $price, $amount] = $item;

        // Calculate totals and taxes for the current item
        $netTotal = $price * $amount;
        $tax = $netTotal * $taxRate;
        $gross = $netTotal + $tax;

        // Update sum totals
        $grossTotal += $gross;
        $taxTotal += $tax;

        // Print the receipt line with proper formatting
        echo sprintf(
            "You ordered %d x %s. That makes \$%.2f (incl. \$%.2f tax)\n",
            $amount,
            $name,
            $gross,
            $tax
        );
    }

    // Print totals with proper formatting
    echo sprintf(
        "The total is \$%.2f (incl. \$%.2f tax)\n",
        $grossTotal,
        $taxTotal
    );
}

$orderString = <<<TEXT
    Hamburger, 6.95, 2
    Cheeseburger, 7.95, 2
    Fries, 3.25, 3
    TEXT;

printReceipt($orderString);

The code works exactly as before, but it's a bit more tidy and easier to read. The trained reader may notice it still has quite a few problems.

But for now I want to address one glaring issue:

⚠️
The entire function is impure and largely untestable. As it is now, you cannot write a simple script to verify that the function works as expected. Instead, you have to run it and look at the output.

The reason for its untestability is that it relies on side effects (in this case echo) to do its job. A telltale sign of such a function is its void return type.
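To illustrate how awkward this gets, here is a sketch of what checking an echoing function takes: its output has to be intercepted with PHP's output buffering functions ob_start() and ob_get_clean(). The function printGreeting() is a hypothetical stand-in and not part of the receipt code.

```php
<?php

// Hypothetical stand-in for a function that reports its result via echo
function printGreeting(string $name): void
{
    echo "Hello, {$name}!\n";
}

// The function returns nothing, so the only way to inspect its result
// is to capture everything it writes to the output buffer
ob_start();
printGreeting("World");
$captured = ob_get_clean();

assert($captured === "Hello, World!\n");
```

It works, but every test now needs this capture boilerplate, and anything else that echoes in between ends up in the same buffer.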

Ex3: making it testable

Let's change that and make the function testable:

<?php

function generateReceipt(string $input): string
{
    $order = stringToOrder($input);
    $taxRate = 0.08;

    $grossTotal = 0;
    $taxTotal = 0;
    $receipt = "";

    foreach ($order as $item) {
        // Destructure array to get easy access to each variable
        [$name, $price, $amount] = $item;

        // Calculate totals and taxes for the current item
        $netTotal = $price * $amount;
        $tax = $netTotal * $taxRate;
        $gross = $netTotal + $tax;

        // Update sum totals
        $grossTotal += $gross;
        $taxTotal += $tax;

        // Append the receipt line with proper formatting
        $receipt .= sprintf(
            "You ordered %d x %s. That makes \$%.2f (incl. \$%.2f tax)\n",
            $amount,
            $name,
            $gross,
            $tax
        );
    }

    // Append totals with proper formatting
    $receipt .= sprintf(
        "The total is \$%.2f (incl. \$%.2f tax)\n",
        $grossTotal,
        $taxTotal
    );

    return $receipt;
}

$input = <<<TEXT
    Hamburger, 6.95, 2
    Cheeseburger, 7.95, 2
    Fries, 3.25, 3
    TEXT;

$receipt = generateReceipt($input);

echo $receipt;

What we changed:

  • renamed printReceipt() to generateReceipt()
  • removed all the echo calls from the function
  • instead of each echo, we append to a string that will be returned in the end
  • changed the return type to string

We can now write tests for everything except the last line in the script, where we have to trust that echo works as intended. Everything else can be proven to work.

This maneuver of extracting side effects and moving them to the edges of your code is called "Functional Core, Imperative Shell", and it is precisely what we need to increase testability. Note that generateReceipt() isn't strictly pure yet, but it is at least pure enough to write some tests for it.

Ex4: writing a test

Here is a simple demonstration of that:

$input = <<<TEXT
    Hamburger, 6.95, 2
    Cheeseburger, 7.95, 2
    Fries, 3.25, 3
    TEXT;
    
$receipt = generateReceipt($input);

$expectedOutput = <<<TEXT
    You ordered 2 x Hamburger. That makes $15.01 (incl. $1.11 tax)
    You ordered 2 x Cheeseburger. That makes $17.17 (incl. $1.27 tax)
    You ordered 3 x Fries. That makes $10.53 (incl. $0.78 tax)
    The total is $42.71 (incl. $3.16 tax)
    TEXT;

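// generateReceipt() ends every line with "\n", including the last one,
// while the heredoc above has no trailing newline
assert($receipt === $expectedOutput . "\n");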

echo $receipt;

In fact, this is what a real unit test for this function could look like, minus the echo at the end. It's that easy.

Ex5: more pure* and realistic (with random exceptions)

For our final iteration, let's make the generateReceipt() function truly pure and include error handling for the impure parts of the program.

<?php

function stringToOrder(string $input): array
{
    $lines = explode("\n", $input);
    $order = [];
    foreach ($lines as $line) {
        $parts = array_map('trim', explode(',', $line));
 
        $name = $parts[0];
        $price = (float)$parts[1];
        $amount = (int)$parts[2];
        $order[] = [$name, $price, $amount];
    }

    return $order;
}

function generateReceipt(array $order, float $taxRate): string
{
    $grossTotal = 0;
    $taxTotal = 0;
    $receipt = "";

    foreach ($order as $item) {
        // Destructure array to get easy access to each variable
        [$name, $price, $amount] = $item;

        // Calculate totals and taxes for the current item
        $netTotal = $price * $amount;
        $tax = $netTotal * $taxRate;
        $gross = $netTotal + $tax;

        // Update sum totals
        $grossTotal += $gross;
        $taxTotal += $tax;

        // Append the receipt line with proper formatting
        $receipt .= sprintf(
            "You ordered %d x %s. That makes \$%.2f (incl. \$%.2f tax)\n",
            $amount,
            $name,
            $gross,
            $tax
        );
    }

    // Append totals with proper formatting
    $receipt .= sprintf(
        "The total is \$%.2f (incl. \$%.2f tax)\n",
        $grossTotal,
        $taxTotal
    );

    return $receipt;
}

function getInput(): string
{
    if (mt_rand(0, 9) === 0) {
        throw new RuntimeException("Could not fetch order");
    }
    
    return <<<TEXT
        Hamburger, 6.95, 2
        Cheeseburger, 7.95, 2
        Fries, 3.25, 3
        TEXT;
}

function getTaxRate(): float
{
    if (mt_rand(0, 9) === 0) {
        throw new RuntimeException("Could not fetch taxRate");
    }
    
    return 0.08;
}

try {
    $input = getInput();            // impure, can fail
    $taxRate = getTaxRate();        // impure, can fail
    $order = stringToOrder($input); // impure, can fail
    
    // if we reached this point, we now have a valid $order
    // and can perform pure operations without side effects 🙂
    
    $receipt = generateReceipt($order, $taxRate);        // pure
} catch (RuntimeException $e) {
    $receipt = "An error occurred: " . $e->getMessage(); // pure
}

echo $receipt; // impure

In this code, there are pure and impure parts.

All the impure parts:

  • getInput() has a random chance to throw an exception. Imagine it reads from a file or database. It can also return different values when called with the same parameters.
  • getTaxRate() has a random chance to throw an exception. Let's imagine the value is read from a config file or database. It can potentially return different values when called multiple times with the same parameters.
  • stringToOrder($input) always returns the same value given the same parameters. However, if the input string does not have exactly the correct format, the array accesses go out of bounds. As written, PHP then reports an undefined array key (a warning in PHP 8) rather than throwing an exception, which is still a side effect.
  • echo $receipt. Echo does not compute a value, its only function is to produce a side effect.
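A stricter parser could make the failure mode of the parsing step explicit. The following sketch assumes that a single malformed line should abort the whole order: it validates every line and throws a RuntimeException, which the try/catch block in the script above would pick up. The name parseOrderStrict() and the validation rules are my assumptions, not a fixed part of the example.

```php
<?php

// Hypothetical stricter variant of stringToOrder(): malformed lines
// raise an exception instead of relying on PHP's warnings
function parseOrderStrict(string $input): array
{
    $order = [];
    foreach (explode("\n", trim($input)) as $index => $line) {
        $parts = array_map('trim', explode(',', $line));

        // Each line needs a name, a numeric price and a whole-number amount
        $valid = count($parts) === 3
            && $parts[0] !== ''
            && is_numeric($parts[1])
            && preg_match('/^\d+$/', $parts[2]) === 1;

        if (!$valid) {
            throw new RuntimeException(
                sprintf("Invalid order line %d: '%s'", $index + 1, $line)
            );
        }

        $order[] = [$parts[0], (float)$parts[1], (int)$parts[2]];
    }

    return $order;
}

$order = parseOrderStrict("Hamburger, 6.95, 2\nFries, 3.25, 3");

try {
    parseOrderStrict("Hamburger, 6.95"); // amount is missing
    $error = null;
} catch (RuntimeException $e) {
    $error = $e->getMessage();
}
```

The function is still deterministic, but throwing keeps it on the impure side of the list, which is exactly why it belongs in the shell and not in the core.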

The pure parts:

  • The generateReceipt() function is pure*, because:
    • It never produces any side effects (file IO, db access, web requests, echo, throwing exceptions*)
    • It does not modify its input parameters
    • It always produces the same result given the same parameters

In other words: Everything that handles input from and output to the real world is impure. Everything else (our core business logic) is pure.

This makes our generateReceipt() function exceptionally easy to test, and even lets us further process the result, e.g. with a decorator, before printing it.
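As a rough sketch of such post-processing: because the receipt is just a string, a decorator can be an ordinary function that takes a receipt and returns a new one. The name withFrame() and the separator width are made up for illustration.

```php
<?php

// Hypothetical decorator: frames an already generated receipt
// with a separator line above and below
function withFrame(string $receipt): string
{
    $separator = str_repeat('-', 24) . "\n";

    return $separator . $receipt . $separator;
}

$decorated = withFrame("The total is \$42.71 (incl. \$3.16 tax)\n");
```

Since withFrame() is pure as well, it can be tested in the same way as generateReceipt(), and the echo still happens only once at the very end.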

While the code could still be more modular (e.g. splitting generateReceipt() into smaller sub functions), we do now have a test-case and can try to refactor the code more in the future without worrying about breaking anything. I'll leave this as an exercise for the reader.
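If you want a starting point for that exercise, here is one possible split, sketched under the assumption that calculation and formatting are the two responsibilities worth separating. All three function names are hypothetical, and the result is intended to match the output of generateReceipt().

```php
<?php

// Pure: computes tax and gross for one [$name, $price, $amount] item
function calculateLine(array $item, float $taxRate): array
{
    [$name, $price, $amount] = $item;
    $net = $price * $amount;
    $tax = $net * $taxRate;

    return ['name' => $name, 'amount' => $amount, 'tax' => $tax, 'gross' => $net + $tax];
}

// Pure: formats one calculated line
function formatLine(array $line): string
{
    return sprintf(
        "You ordered %d x %s. That makes \$%.2f (incl. \$%.2f tax)\n",
        $line['amount'],
        $line['name'],
        $line['gross'],
        $line['tax']
    );
}

// Pure: composes the two helpers and appends the totals line
function buildReceipt(array $order, float $taxRate): string
{
    $lines = array_map(fn(array $item) => calculateLine($item, $taxRate), $order);
    $receipt = implode('', array_map('formatLine', $lines));

    return $receipt . sprintf(
        "The total is \$%.2f (incl. \$%.2f tax)\n",
        array_sum(array_column($lines, 'gross')),
        array_sum(array_column($lines, 'tax'))
    );
}

$receipt = buildReceipt([['Hamburger', 6.95, 2], ['Fries', 3.25, 3]], 0.08);
```

Each helper can now be tested on its own, and the explicit mutation of running totals is replaced by array_sum() over the calculated lines.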

Hopefully you can see why separating out the impure parts of your code to the edges is not only very helpful, but also not all that hard.


😒
Bad news: the current version of generateReceipt() is still not really pure and can run into unexpected errors.

Why it's still not pure

Some may complain that the generateReceipt() function in our latest example performs a whole lot of mutation internally. However, since all of this "shady business" stays inside the function, it can still be considered pure from the outside.

The other issue is more pressing: generateReceipt() accepts an $order of type array. Our implementation assumes that $order is an array of arrays, and each of the sub-arrays has three fields: a string, a float and an int. Our function signature neither communicates nor enforces these constraints, so it is possible to call the function (at runtime) in such a way that it runs into a crash (side effect).
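One way to at least communicate the structure in plain PHP is a small value object. The sketch below assumes PHP 8.0 or newer (constructor promotion); the class OrderItem and the function netTotal() are hypothetical. Note that PHP still checks these types at runtime, when the function is called, not ahead of time.

```php
<?php

// Hypothetical value object describing one order line
final class OrderItem
{
    public function __construct(
        public string $name,
        public float $price,
        public int $amount,
    ) {
    }
}

// The variadic type hint makes PHP verify at call time
// that every argument is an OrderItem
function netTotal(OrderItem ...$items): float
{
    $total = 0.0;
    foreach ($items as $item) {
        $total += $item->price * $item->amount;
    }

    return $total;
}

$total = netTotal(new OrderItem('Fries', 3.25, 3));

try {
    netTotal(['Fries', 3.25, 3]); // plain array instead of an OrderItem
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}
```

The signature now documents what it expects, but a wrong call is only rejected once that line actually runs.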

Therefore I would argue:

💡
To actually guarantee purity and predictable code, you also need type-safety. Namely, the ability of a language to precisely express the structure of its inputs and outputs at compile time.

It is possible to write pure (deterministic, side-effect free) code in PHP. However, since PHP has no separate compile step that type-checks the whole program, it is impossible to guarantee that any given function meets these criteria before shipping bugs. The type system is also significantly weaker than that of alternatives like Scala, Kotlin, Java and Rust, all of which have generics, for example. (Generic types allow you to express nested type signatures, like List[OrderItem] as opposed to just array.)

Good news: With tools like PHPStan you can add a "compile" step (checking the soundness of your program) into a pipeline that prevents you from merging bad code. It also allows you to express (and enforce) generic types via annotations. I cannot recommend this static analyzer enough, if you are a professional developer!
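As a small taste of what such an annotation can look like, here is a sketch that describes the order structure from this article in a docblock, using PHPStan's list and array shape syntax. The function countItems() is a hypothetical example; PHP itself ignores the docblock at runtime, so only the static analyzer enforces it.

```php
<?php

/**
 * Sums up the ordered amounts.
 *
 * The @param line is read by the static analyzer only:
 * each entry is [name, net price, amount].
 *
 * @param list<array{string, float, int}> $order
 */
function countItems(array $order): int
{
    $count = 0;
    foreach ($order as $item) {
        $count += $item[2];
    }

    return $count;
}

$count = countItems([['Hamburger', 6.95, 2], ['Fries', 3.25, 3]]);
```

With an annotation like this, a call such as countItems([['Fries']]) is something the analyzer can flag before the code is merged, while plain PHP would happily accept the array.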

Next up

  • porting our generateReceipt() function to Scala 3 and easily building much more type-safety into it
  • adding PHPStan to a real project, and how it can help you not only catch bugs earlier but also make your developer experience so much better
