order process part 1: Increasing purity and testability
In this article I want to show you how to increase function purity, modularity and testability in inherently impure programs, based on a realistic code example.
The task
Create a function that takes an input string as follows:
Hamburger, 6.95, 2
Cheeseburger, 7.95, 2
Fries, 3.25, 3
The first column is the name, the second column the net price, the third column the amount of ordered items.
Based on this input, and a tax rate of 8%, print a receipt that looks like this:
You ordered 2 x Hamburger. That makes $15.01 (incl. $1.11 tax)
You ordered 2 x Cheeseburger. That makes $17.17 (incl. $1.27 tax)
You ordered 3 x Fries. That makes $10.53 (incl. $0.78 tax)
The total is $42.71 (incl. $3.16 tax)
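To see where these numbers come from, here is the arithmetic for the first line as a small sketch. The variable names are chosen for illustration only.

```php
<?php
// Worked example for the "Hamburger, 6.95, 2" line at a tax rate of 8%.
$price = 6.95;    // net price per item
$amount = 2;      // number of ordered items
$taxRate = 0.08;

$netTotal = $price * $amount;   // 13.90
$tax = $netTotal * $taxRate;    // 1.112
$gross = $netTotal + $tax;      // 15.012

// Rounding to two decimals only happens when formatting with %.2f
$line = sprintf("That makes \$%.2f (incl. \$%.2f tax)", $gross, $tax);
echo $line . "\n"; // That makes $15.01 (incl. $1.11 tax)
```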
Ex1: Initial Script
For now, we just create a proof of concept:
<?php
function printReceipt(): void
{
$orderString = <<<TEXT
Hamburger, 6.95, 2
Cheeseburger, 7.95, 2
Fries, 3.25, 3
TEXT;
$lines = explode("\n", $orderString);
$order = [];
foreach ($lines as $line) {
$parts = array_map('trim', explode(',', $line));
$name = $parts[0];
$price = (float)$parts[1];
$amount = (int)$parts[2];
$order[] = [$name, $price, $amount];
}
$taxRate = 0.08;
$grossTotal = 0;
$taxTotal = 0;
foreach ($order as $item) {
// Destructure array to get easy access to each variable
[$name, $price, $amount] = $item;
// Calculate totals and taxes for the current item
$netTotal = $price * $amount;
$tax = $netTotal * $taxRate;
$gross = $netTotal + $tax;
// Update sum totals
$grossTotal += $gross;
$taxTotal += $tax;
// Print the receipt line with proper formatting
echo sprintf(
"You ordered %d x %s. That makes \$%.2f (incl. \$%.2f tax)\n",
$amount,
$name,
$gross,
$tax
);
}
// Print totals with proper formatting
echo sprintf(
"The total is \$%.2f (incl. \$%.2f tax)\n",
$grossTotal,
$taxTotal
);
}
printReceipt();
This script does the job, albeit only for one given $orderString and $taxRate.
Ex2: extracting stringToOrder
Before we continue with other things, let's make the code just a little bit more modular and accept an input.
<?php
function stringToOrder(string $input): array
{
$lines = explode("\n", $input);
$order = [];
foreach ($lines as $line) {
$parts = array_map('trim', explode(',', $line));
$name = $parts[0];
$price = (float)$parts[1];
$amount = (int)$parts[2];
$order[] = [$name, $price, $amount];
}
return $order;
}
function printReceipt(string $input): void
{
$taxRate = 0.08;
$order = stringToOrder($input);
$grossTotal = 0;
$taxTotal = 0;
foreach ($order as $item) {
// Destructure array to get easy access to each variable
[$name, $price, $amount] = $item;
// Calculate totals and taxes for the current item
$netTotal = $price * $amount;
$tax = $netTotal * $taxRate;
$gross = $netTotal + $tax;
// Update sum totals
$grossTotal += $gross;
$taxTotal += $tax;
// Print the receipt line with proper formatting
echo sprintf(
"You ordered %d x %s. That makes \$%.2f (incl. \$%.2f tax)\n",
$amount,
$name,
$gross,
$tax
);
}
// Print totals with proper formatting
echo sprintf(
"The total is \$%.2f (incl. \$%.2f tax)\n",
$grossTotal,
$taxTotal
);
}
$orderString = <<<TEXT
Hamburger, 6.95, 2
Cheeseburger, 7.95, 2
Fries, 3.25, 3
TEXT;
printReceipt($orderString);
The code works exactly as before, but it's a bit tidier and easier to read. The trained reader may notice that it still has quite a few problems.
But for now I want to address one glaring issue: printReceipt() is practically untestable.
The reason for its untestability is that it relies on side effects (in this case echo) to do its job. A telltale sign of such a function is its void return type.
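To illustrate the point: a test for a function that echoes first has to intercept the output. Below is a minimal sketch using PHP's output buffering functions. printGreeting() is a hypothetical stand-in for printReceipt(), kept short so the snippet runs on its own.

```php
<?php
// Hypothetical stand-in for a void function that only produces a side effect.
function printGreeting(string $name): void
{
    echo "Hello, $name!\n";
}

// To check what was printed, the test has to capture the output itself.
ob_start();                 // start buffering everything that is echoed
printGreeting("World");
$output = ob_get_clean();   // fetch the buffer contents and stop buffering

assert($output === "Hello, World!\n");
```

This works, but the test now depends on global output state instead of a plain return value.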
Ex3: making it testable
Let's change that and make the function testable:
<?php
function generateReceipt(string $input): string
{
$order = stringToOrder($input);
$taxRate = 0.08;
$grossTotal = 0;
$taxTotal = 0;
$receipt = "";
foreach ($order as $item) {
// Destructure array to get easy access to each variable
[$name, $price, $amount] = $item;
// Calculate totals and taxes for the current item
$netTotal = $price * $amount;
$tax = $netTotal * $taxRate;
$gross = $netTotal + $tax;
// Update sum totals
$grossTotal += $gross;
$taxTotal += $tax;
// Append the receipt line with proper formatting
$receipt .= sprintf(
"You ordered %d x %s. That makes \$%.2f (incl. \$%.2f tax)\n",
$amount,
$name,
$gross,
$tax
);
}
// Append totals with proper formatting
$receipt .= sprintf(
"The total is \$%.2f (incl. \$%.2f tax)\n",
$grossTotal,
$taxTotal
);
return $receipt;
}
$input = <<<TEXT
Hamburger, 6.95, 2
Cheeseburger, 7.95, 2
Fries, 3.25, 3
TEXT;
$receipt = generateReceipt($input);
echo $receipt;
What we changed:
- renamed printReceipt() to generateReceipt()
- removed all the echo calls from the function; instead of each echo, we append to a string that will be returned in the end
- changed the return type to string
We can now write tests for everything except the last line in the script, where we have to trust that echo works as intended. Everything else can be verified to work.
This maneuver of extracting side effects and moving them to the edges of your code is called "Functional Core, Imperative Shell", and it is precisely what we need to increase testability. Note that generateReceipt() isn't strictly pure yet, but it is at least pure enough to write some tests for it.
Ex4: writing a test
Here is a simple demonstration of that:
$input = <<<TEXT
Hamburger, 6.95, 2
Cheeseburger, 7.95, 2
Fries, 3.25, 3
TEXT;
$receipt = generateReceipt($input);
$expectedOutput = <<<TEXT
You ordered 2 x Hamburger. That makes $15.01 (incl. $1.11 tax)
You ordered 2 x Cheeseburger. That makes $17.17 (incl. $1.27 tax)
You ordered 3 x Fries. That makes $10.53 (incl. $0.78 tax)
The total is $42.71 (incl. $3.16 tax)
TEXT;
// generateReceipt() ends its last line with "\n", the heredoc does not
assert($receipt === $expectedOutput . "\n");
echo $receipt;
In fact, this is what a real unit test for this function could look like, minus the echo at the end. It's that easy.
Ex5: more pure* and realistic (with random exceptions)
For our final iteration, let's make the generateReceipt() function truly pure and include error handling for the impure parts of the program.
<?php
function stringToOrder(string $input): array
{
$lines = explode("\n", $input);
$order = [];
foreach ($lines as $line) {
$parts = array_map('trim', explode(',', $line));
$name = $parts[0];
$price = (float)$parts[1];
$amount = (int)$parts[2];
$order[] = [$name, $price, $amount];
}
return $order;
}
function generateReceipt(array $order, float $taxRate): string
{
$grossTotal = 0;
$taxTotal = 0;
$receipt = "";
foreach ($order as $item) {
// Destructure array to get easy access to each variable
[$name, $price, $amount] = $item;
// Calculate totals and taxes for the current item
$netTotal = $price * $amount;
$tax = $netTotal * $taxRate;
$gross = $netTotal + $tax;
// Update sum totals
$grossTotal += $gross;
$taxTotal += $tax;
// Append the receipt line with proper formatting
$receipt .= sprintf(
"You ordered %d x %s. That makes \$%.2f (incl. \$%.2f tax)\n",
$amount,
$name,
$gross,
$tax
);
}
// Append totals with proper formatting
$receipt .= sprintf(
"The total is \$%.2f (incl. \$%.2f tax)\n",
$grossTotal,
$taxTotal
);
return $receipt;
}
function getInput(): string
{
if (mt_rand(0, 9) === 0) {
throw new RuntimeException("Could not fetch order");
}
return <<<TEXT
Hamburger, 6.95, 2
Cheeseburger, 7.95, 2
Fries, 3.25, 3
TEXT;
}
function getTaxRate(): float
{
if (mt_rand(0, 9) === 0) {
throw new RuntimeException("Could not fetch taxRate");
}
return 0.08;
}
try {
$input = getInput(); // impure, can fail
$taxRate = getTaxRate(); // impure, can fail
$order = stringToOrder($input); // impure, can fail
// if we reached this point, we now have a valid $order
// and can perform pure operations without side effects
$receipt = generateReceipt($order, $taxRate); // pure
} catch (RuntimeException $e) {
$receipt = "An error occurred: " . $e->getMessage(); // pure
}
echo $receipt; // impure
In this code, there are pure and impure parts.
All the impure parts:
- getInput() has a random chance to throw an exception. Imagine it reads from a file or database. It can also return different values when called with the same parameters.
- getTaxRate() has a random chance to throw an exception. Let's imagine the value is read from a config file or database. It can potentially return different values when called multiple times with the same parameters.
- stringToOrder($input) always returns the same value given the same parameters. However, if the input string does not have exactly the correct format, it will trigger an undefined-index warning or notice (or an exception, if your error handler converts these), which is a side effect.
- echo $receipt. Echo does not compute a value; its only function is to produce a side effect.
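One way to make the failure mode of the parsing step explicit is to validate each line and throw on purpose. The following is only a sketch, not the implementation used in this article: parseOrderLine() is a hypothetical helper, and the choice of UnexpectedValueException (a subclass of RuntimeException, so a catch for RuntimeException would handle it) is an assumption.

```php
<?php
// Hypothetical helper: parses one "name, price, amount" line or throws.
function parseOrderLine(string $line): array
{
    $parts = array_map('trim', explode(',', $line));

    if (count($parts) !== 3) {
        throw new UnexpectedValueException("Expected 3 columns: '$line'");
    }

    [$name, $price, $amount] = $parts;

    // is_numeric() accepts "6.95"; the regex accepts only unsigned integers like "2"
    if ($name === '' || !is_numeric($price) || preg_match('/^\d+$/', $amount) !== 1) {
        throw new UnexpectedValueException("Malformed order line: '$line'");
    }

    return [$name, (float)$price, (int)$amount];
}

$item = parseOrderLine("Hamburger, 6.95, 2");

try {
    parseOrderLine("Hamburger, 6.95"); // third column is missing
    $error = null;
} catch (UnexpectedValueException $e) {
    $error = $e->getMessage();
}
```

With this in place, a malformed line fails loudly at the edge of the program instead of silently producing a wrong value.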
The pure parts:
- The generateReceipt() function is pure*, because:
- It never produces any side effects (file IO, db access, web requests, echo, throwing exceptions*)
- It does not modify its input parameters
- It always produces the same result given the same parameters
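The last two properties can be checked directly. The sketch below uses a condensed copy of generateReceipt() so that it runs on its own; the shortened $order is made up for the example.

```php
<?php
// Condensed copy of generateReceipt() from above, so the snippet is self-contained.
function generateReceipt(array $order, float $taxRate): string
{
    $grossTotal = 0;
    $taxTotal = 0;
    $receipt = "";
    foreach ($order as [$name, $price, $amount]) {
        $tax = $price * $amount * $taxRate;
        $gross = $price * $amount + $tax;
        $grossTotal += $gross;
        $taxTotal += $tax;
        $receipt .= sprintf(
            "You ordered %d x %s. That makes \$%.2f (incl. \$%.2f tax)\n",
            $amount, $name, $gross, $tax
        );
    }
    return $receipt . sprintf("The total is \$%.2f (incl. \$%.2f tax)\n", $grossTotal, $taxTotal);
}

$order = [["Hamburger", 6.95, 2], ["Fries", 3.25, 3]];
$before = $order; // arrays are copied by value in PHP

$first = generateReceipt($order, 0.08);
$second = generateReceipt($order, 0.08);

assert($first === $second);   // same parameters, same result
assert($order === $before);   // the input was not modified
```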
In other words: Everything that handles input from and output to the real world is impure. Everything else (our core business logic) is pure.
This makes our generateReceipt() function exceptionally easy to test, and even lets us further process the result, e.g. with a decorator, before printing it.
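Because the receipt is a plain string, further processing is just another function call. Here is a small sketch of such a decorator. withFrame() and the frame layout are hypothetical and not part of the code above.

```php
<?php
// Hypothetical decorator: wraps any receipt text in a header and a footer line.
function withFrame(string $receipt, string $shopName): string
{
    $rule = str_repeat("=", 30);
    return "$rule\n$shopName\n$rule\n" . $receipt . "$rule\n";
}

// A literal stands in for the result of generateReceipt().
$receipt = "The total is \$42.71 (incl. \$3.16 tax)\n";

echo withFrame($receipt, "Burger Shop"); // impure: printing stays at the edge
```

The decorator is itself pure, so it can be tested the same way as generateReceipt().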
While the code could still be more modular (e.g. splitting generateReceipt() into smaller sub functions), we do now have a test-case and can try to refactor the code more in the future without worrying about breaking anything. I'll leave this as an exercise for the reader.
Hopefully you can see why separating out the impure parts of your code to the edges is not only very helpful, but also not all that hard.
Why it's still not pure
Some may complain that the generateReceipt() function in our latest example performs a whole lot of mutation internally. However, since all of this "shady business" stays inside, the function can still be considered pure from the outside.
The other issue is more pressing: generateReceipt() accepts an $order of type array. Our implementation assumes that $order is an array of arrays, and each of the sub-arrays has three fields: a string, a float and an int. Our function signature neither communicates nor enforces these constraints, so it is possible to call the function (at runtime) in such a way that it runs into a crash (side effect).
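One way to narrow this gap with the language's own means is a small value object plus a variadic parameter. This is a sketch under assumptions: OrderItem and netTotal() are hypothetical names, and the readonly promoted properties require PHP 8.1 or newer.

```php
<?php
// Hypothetical value object (assumes PHP 8.1+ for readonly promoted properties).
final class OrderItem
{
    public function __construct(
        public readonly string $name,
        public readonly float $price,
        public readonly int $amount,
    ) {
    }
}

// The variadic parameter makes PHP check that every element is an OrderItem.
function netTotal(OrderItem ...$items): float
{
    $total = 0.0;
    foreach ($items as $item) {
        $total += $item->price * $item->amount;
    }
    return $total;
}

$order = [new OrderItem("Hamburger", 6.95, 2), new OrderItem("Fries", 3.25, 3)];
$total = netTotal(...$order);

try {
    netTotal(...[["Hamburger", 6.95, 2]]); // a plain array is rejected
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}
```

Note that this check still happens at runtime, at the moment the function is called.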
Therefore I would argue:
It is possible to write pure (deterministic, side-effect free) code in PHP; however, since the language is not compiled, it is impossible to guarantee that any given function meets these criteria before shipping bugs. The type system is also significantly weaker than that of alternatives like Scala, Kotlin, Java and Rust, all of which have generics, for example. (Generic types allow you to express nested type signatures, like List[OrderItem] as opposed to just array.)
Good news: With tools like PHPStan you can add a "compile" step (checking the soundness of your program) into a pipeline that prevents you from merging bad code. It also allows you to express (and enforce) generic types via annotations. I cannot recommend this static analyzer enough, if you are a professional developer!
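As a sketch of what such annotations can look like: PHPStan understands array shapes and list<...> types in PHPDoc comments. The function netTotal() is hypothetical. PHP itself ignores the comment at runtime, so the check only happens when the analyzer runs.

```php
<?php
/**
 * @param list<array{string, float, int}> $order name, net price, amount
 */
function netTotal(array $order): float
{
    $total = 0.0;
    foreach ($order as [$name, $price, $amount]) {
        $total += $price * $amount;
    }
    return $total;
}

$total = netTotal([["Hamburger", 6.95, 2], ["Fries", 3.25, 3]]);

// A static analyzer can flag a call like the following without running it,
// because the element does not match array{string, float, int}:
// netTotal([["Hamburger", "6.95"]]);
```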
Next up
- Porting our generateReceipt() function to Scala 3 and easily building much more type safety into it
- Adding PHPStan to a real project, and how it can help you not only catch bugs earlier but also make your developer experience so much better
Consider subscribing to the newsletter if you would like to be notified when new articles release!