Exhaustiveness checking in Rust, Java, PHPStan


Intro: Simple enums and exhaustiveness checking in PHPStan

If you model a value as an enum (an enumeration of all its possible states), a compiler or static analyzer can perform exhaustiveness checking: it verifies that your code handles every one of those states.

Before we proceed to Rust and Java, I will use PHP with PHPStan to give a quick introduction to the concepts at hand.

enum GraphicsPreset {
    case Ultra;
    case High;
    case Medium;
}

/** @return Collection<int, GraphicsSetting> */
function getGraphicsSettings(GraphicsPreset $preset): Collection
{
    $baseSettings = getBaseSettings();

    $presetSettings = match($preset) {
        GraphicsPreset::Ultra => getUltraSettings(),
        GraphicsPreset::High => getHighSettings(),
        GraphicsPreset::Medium => getMediumSettings(),
    };

    return $baseSettings->merge($presetSettings);
}

$settings = getGraphicsSettings(GraphicsPreset::High);

In this code, we have a getGraphicsSettings function which accepts a GraphicsPreset parameter (must be one of the enum cases). So without any assertions or validation logic, we already limited the input to three possible states.

Next, we compile a list of settings based on the selected preset, and return that list. The match block in the middle is exhaustive because it handles every possible input.

Before enums were introduced in PHP, you could rely on a good old class hierarchy to limit the input types.

class Ultra extends GraphicsSetting {}

However, PHP enums give us the following advantages:

  • syntactic sugar: creating another subtype takes one line of code in the same file, rather than a new class in a new file
  • the enum cases are instantiated as singletons, so a === check between two enum values behaves like a value comparison, whereas === between regular objects is only true when both operands are the very same instance
  • exhaustiveness checking: a regular class (if it is not final) can always be extended by more child classes. For enums, the opposite is true: you cannot create additional subtypes outside of the enum definition. This is good news for static analysis

If you use static analysis (e.g. PHPStan), you can benefit from the following. Let's say we add a new possible state:

enum GraphicsPreset {
    case Ultra;
    case High;
    case Medium;
    case Low;
}

Now, if we "compile" (run phpstan), we will receive a violation:

$presetSettings = match($preset) {
    GraphicsPreset::Ultra => getUltraSettings(),
    GraphicsPreset::High => getHighSettings(),
    GraphicsPreset::Medium => getMediumSettings(),
};
^^^^^^^^^^^ match statement does not exhaustively check all possibilities

To fix this error, we have to handle the new case:

$presetSettings = match($preset) {
    GraphicsPreset::Ultra => getUltraSettings(),
    GraphicsPreset::High => getHighSettings(),
    GraphicsPreset::Medium => getMediumSettings(),
    GraphicsPreset::Low => getLowSettings(),
}; //👌

Why you should avoid a default case

The above adds a really nice property to your code. All possible paths are explicit and navigable, and when you add more possibilities, the tooling will remind you to handle them everywhere.

If you add a default case, that last property is gone. Whenever you add something new, you are no longer told which places need updating, because the default case already catches everything. So technically, 'exhaustiveness checking' is still in place, but the default case makes it largely useless and your code more error-prone.
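Rust, which this article turns to later, has the same trade-off: its `_` wildcard arm plays the role of default. The following is a minimal sketch, assuming a hypothetical texture-size lookup (the function names and numbers are invented for illustration and are not part of the examples above):

```rust
// Hypothetical example: the preset names mirror the PHP enum above,
// the functions and texture sizes are made up for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum GraphicsPreset {
    Ultra,
    High,
    Medium,
    Low, // imagine this variant was added later
}

// With a wildcard arm, this function kept compiling unchanged when `Low`
// was added, so `Low` silently receives the fallback value.
fn texture_size_with_wildcard(preset: GraphicsPreset) -> u32 {
    match preset {
        GraphicsPreset::Ultra => 4096,
        GraphicsPreset::High => 2048,
        _ => 1024,
    }
}

// Without a wildcard, leaving out `Low` is a compile error
// (E0004, non-exhaustive patterns), so the new variant cannot be forgotten.
fn texture_size_explicit(preset: GraphicsPreset) -> u32 {
    match preset {
        GraphicsPreset::Ultra => 4096,
        GraphicsPreset::High => 2048,
        GraphicsPreset::Medium => 1024,
        GraphicsPreset::Low => 512,
    }
}

fn main() {
    let presets = [
        GraphicsPreset::Ultra,
        GraphicsPreset::High,
        GraphicsPreset::Medium,
        GraphicsPreset::Low,
    ];
    for preset in presets {
        println!(
            "{:?}: wildcard = {}, explicit = {}",
            preset,
            texture_size_with_wildcard(preset),
            texture_size_explicit(preset)
        );
    }
}
```

For `Low`, the wildcard version quietly returns the Medium value (1024), while the explicit version returns the value someone was forced to choose for it (512).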

Sometimes, you need a default case, for example here:

$number = mt_rand(1, PHP_INT_MAX);

match ($number) {
    1 => doSomething(),
    2 => doSomethingElse(),
    3 => doYetAnotherThing(),
    default => handleTheBillionOtherCases(),
};

But when this happens, you have to ask yourself: Why am I using an integer/String or other data type with countless possible states as a control mechanism? Maybe it is better to introduce an enum and get rid of default.
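One way to act on that question, sketched here in Rust (covered later in this article): convert the open-ended integer into an enum once, at the boundary, and keep every later match free of a default. The `Action` enum, the numeric codes and the `run` function are hypothetical names chosen for illustration.

```rust
use std::convert::TryFrom;

// Hypothetical closed set of actions that replaces a raw integer code.
#[derive(Debug, PartialEq)]
enum Action {
    Start,
    Stop,
    Restart,
}

// The only place that needs a catch-all arm: the boundary where an
// unbounded integer is validated and turned into one of three states.
impl TryFrom<u32> for Action {
    type Error = String;

    fn try_from(code: u32) -> Result<Self, Self::Error> {
        match code {
            1 => Ok(Action::Start),
            2 => Ok(Action::Stop),
            3 => Ok(Action::Restart),
            other => Err(format!("unknown action code: {other}")),
        }
    }
}

// Everything behind the boundary matches on the enum, without a default.
fn run(action: &Action) -> &'static str {
    match action {
        Action::Start => "starting",
        Action::Stop => "stopping",
        Action::Restart => "restarting",
    }
}

fn main() {
    for code in [1u32, 2, 3, 4] {
        match Action::try_from(code) {
            Ok(action) => println!("{code}: {}", run(&action)),
            Err(message) => println!("{code}: {message}"),
        }
    }
}
```

The catch-all does not disappear entirely, but it is confined to a single conversion function instead of being repeated in every place that branches on the value.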

Does it work without enums?

Technically yes, PHPStan is often smart enough to do something like this:

$word = match (mt_rand(1, 4)) {
   1 => 'one',
   2 => 'two',
   3 => 'three',
};
^^^^^^^^^^ phpstan error: handle case 4
$word = match (mt_rand(1, 4)) {
   1 => 'one',
   2 => 'two',
   3 => 'three',
   4 => 'four',
}; //👌

Enums EXTREME edition: Rust

Over the last year I have used the programming languages Rust and Scala quite extensively, and both of them invest heavily in enums. I will not show Scala code here, but I want to point out that it helped me tremendously in learning other languages, including Rust, Java and yes, even PHPStan.

I want to show you an example of how Rust takes enums and pattern matching to the next level:

enum CreditCardType {
    Visa(String),       // card number
    MasterCard(String), // card number
    Amex(String),       // card number
}

enum BankTransferType {
    SEPA { iban: String, bic: String },
    SWIFT { account: String, swift_code: String },
}

enum CryptoType {
    Bitcoin(String),     // wallet address
    Ethereum(String),    // wallet address
    Custom { name: String, address: String },
}

enum PaymentMethod {
    CreditCard(CreditCardType),
    BankTransfer(BankTransferType),
    Crypto(CryptoType),
}

fn describe_payment(method: PaymentMethod) -> String {
    match method {
        PaymentMethod::CreditCard(card) => match card {
            CreditCardType::Visa(number) => format!("Paid with Visa: {number}"),
            CreditCardType::MasterCard(number) => format!("Paid with MasterCard: {number}"),
            CreditCardType::Amex(number) => format!("Paid with Amex: {number}"),
        },
        PaymentMethod::BankTransfer(transfer) => match transfer {
            BankTransferType::SEPA { iban, bic } => {
                format!("Paid via SEPA transfer. IBAN: {iban}, BIC: {bic}")
            }
            BankTransferType::SWIFT { account, swift_code } => {
                format!("Paid via SWIFT. Account: {account}, SWIFT: {swift_code}")
            },
        },
        PaymentMethod::Crypto(crypto) => match crypto {
            CryptoType::Bitcoin(addr) => format!("Paid with Bitcoin. Wallet: {addr}"),
            CryptoType::Ethereum(addr) => format!("Paid with Ethereum. Wallet: {addr}"),
            CryptoType::Custom { name, address } => {
                format!("Paid with {name}. Address: {address}")
            }
        },
    }
}

fn main() {
    let method = PaymentMethod::BankTransfer(BankTransferType::SEPA {
        iban: "DE89 3704 0044 0532 0130 00".to_string(),
        bic: "COBADEFFXXX".to_string(),
    });

    let description = describe_payment(method);
    
    println!("{}", description);
}

You can run this code here and play around with it: https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=ee18c2efcbd6b373cc734012637c7a7b

The above code shows the following:

  • enums with multiple properties (e.g. BankTransferType::SEPA has iban:String and bic:String)
  • Wrapper enums (CreditCardType::Visa(String) and CreditCardType::MasterCard(String))
  • enum cases with a different number of properties, and different property names, within the same group
  • nested enums - PaymentMethod::CreditCard, PaymentMethod::BankTransfer, PaymentMethod::Crypto all contain another enum with multiple cases. To construct one you have to be very specific, e.g.
    • let method = PaymentMethod::CreditCard(CreditCardType::Visa("1234 5678".to_string()));
  • nested pattern matching and deconstruction, as in this snippet:
match method {
    PaymentMethod::CreditCard(card) => match card {
        CreditCardType::Visa(number) => println!("Paid with Visa: {number}"),

Honestly, after trying this style of programming it is very hard to go back.
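Nested patterns do not have to be written as nested match blocks. As a sketch, the same enum shapes can be matched in a single flat match, and the compiler still checks that every combination is covered. The function name `describe_payment_flat` is an assumption made for this illustration; the enums repeat the definitions from the example above.

```rust
// Same enum shapes as in the example above, repeated to keep this self-contained.
enum CreditCardType {
    Visa(String),
    MasterCard(String),
    Amex(String),
}

enum BankTransferType {
    SEPA { iban: String, bic: String },
    SWIFT { account: String, swift_code: String },
}

enum CryptoType {
    Bitcoin(String),
    Ethereum(String),
    Custom { name: String, address: String },
}

enum PaymentMethod {
    CreditCard(CreditCardType),
    BankTransfer(BankTransferType),
    Crypto(CryptoType),
}

// Hypothetical flat variant of describe_payment: each arm destructures
// the outer and the inner enum in one pattern.
fn describe_payment_flat(method: &PaymentMethod) -> String {
    match method {
        PaymentMethod::CreditCard(CreditCardType::Visa(number)) => {
            format!("Paid with Visa: {number}")
        }
        PaymentMethod::CreditCard(CreditCardType::MasterCard(number)) => {
            format!("Paid with MasterCard: {number}")
        }
        PaymentMethod::CreditCard(CreditCardType::Amex(number)) => {
            format!("Paid with Amex: {number}")
        }
        PaymentMethod::BankTransfer(BankTransferType::SEPA { iban, bic }) => {
            format!("Paid via SEPA transfer. IBAN: {iban}, BIC: {bic}")
        }
        PaymentMethod::BankTransfer(BankTransferType::SWIFT { account, swift_code }) => {
            format!("Paid via SWIFT. Account: {account}, SWIFT: {swift_code}")
        }
        PaymentMethod::Crypto(CryptoType::Bitcoin(addr)) => {
            format!("Paid with Bitcoin. Wallet: {addr}")
        }
        PaymentMethod::Crypto(CryptoType::Ethereum(addr)) => {
            format!("Paid with Ethereum. Wallet: {addr}")
        }
        PaymentMethod::Crypto(CryptoType::Custom { name, address }) => {
            format!("Paid with {name}. Address: {address}")
        }
    }
}

fn main() {
    let method = PaymentMethod::BankTransfer(BankTransferType::SEPA {
        iban: "DE89 3704 0044 0532 0130 00".to_string(),
        bic: "COBADEFFXXX".to_string(),
    });

    println!("{}", describe_payment_flat(&method));
}
```

Whether the flat or the nested form reads better is a matter of taste. Removing any single arm from either form makes the match non-exhaustive, and the compiler rejects it.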

Java can do this too?

Java allows a similar style in newer versions (record patterns in switch were finalized in Java 21), with sealed interfaces, records and switch expressions:

public class Main {
    public static void main(String[] args) {
        PaymentMethod method = new BankTransferPayment(new SEPA("DE89...", "COBADEFFXXX"));
        describePayment(method);
    }

    public static void describePayment(PaymentMethod method) {
        String result = switch (method) {
            case CreditCardPayment(var card) -> switch (card) {
                case Visa(var number)       -> "Paid with Visa: " + number;
                case MasterCard(var number) -> "Paid with MasterCard: " + number;
                case Amex(var number)       -> "Paid with Amex: " + number;
            };

            case BankTransferPayment(var transfer) -> switch (transfer) {
                case SEPA(var iban, var bic) -> "Paid via SEPA. IBAN: " + iban + ", BIC: " + bic;
                case SWIFT(var acc, var swift) -> "Paid via SWIFT. Account: " + acc + ", SWIFT: " + swift;
            };

            case CryptoPayment(var crypto) -> switch (crypto) {
                case Bitcoin(var addr)         -> "Paid with Bitcoin. Address: " + addr;
                case Ethereum(var addr)        -> "Paid with Ethereum. Address: " + addr;
                case CustomCrypto(var name, var addr) -> "Paid with " + name + ". Address: " + addr;
            };
        };

        System.out.println(result);
    }
}

// ----- Interfaces -----
sealed interface PaymentMethod permits CreditCardPayment, BankTransferPayment, CryptoPayment {}
sealed interface CreditCard permits Visa, MasterCard, Amex {}
sealed interface BankTransfer permits SEPA, SWIFT {}
sealed interface Crypto permits Bitcoin, Ethereum, CustomCrypto {}

// ----- Top-Level Records (implementing PaymentMethod) -----
record CreditCardPayment(CreditCard card) implements PaymentMethod {}
record BankTransferPayment(BankTransfer transfer) implements PaymentMethod {}
record CryptoPayment(Crypto crypto) implements PaymentMethod {}

// ----- Nested Record Types -----
record Visa(String number) implements CreditCard {}
record MasterCard(String number) implements CreditCard {}
record Amex(String number) implements CreditCard {}

record SEPA(String iban, String bic) implements BankTransfer {}
record SWIFT(String account, String swiftCode) implements BankTransfer {}

record Bitcoin(String address) implements Crypto {}
record Ethereum(String address) implements Crypto {}
record CustomCrypto(String name, String address) implements Crypto {}

Here you can run the code and play around with it: https://tinyurl.com/javaadt

Notice how we have all the same nice properties of the code here:

  • differently shaped datatypes that all adhere to the same interface / supertype
  • ergonomic destructuring of the data via pattern matching (switch expressions), including nested expressions
  • exhaustiveness checking (thanks to the sealed keyword): if a branch is not covered you get a compiler error. If you have a branch that cannot be reached you get a warning.
  • You cannot accidentally call swiftCode() on a SEPA BankTransfer, because it does not exist on that subtype. So you avoid all kinds of type errors and misunderstandings

Wow this is so cool! Surely PHP cannot do this?

Well... that's what I thought at first, because in PHP, enum cases cannot carry their own data, let alone be nested.

Quick refresher:

enum PaymentType {
   case Card;
   case Crypto;
   case BankTransfer;
}

This is how deep the modeling goes with PHP enums. 😢 What about the card types? What about the different types of BankTransfer? What about the different sets of data you need to handle each case?

However: PHP has an extremely potent union type system. In fact, enums in languages such as Rust and Scala are often described as tagged unions.

Behold, you can perform exhaustiveness checking on union types with PHPStan:

function giveMeSomething(): A | B | C {}

function doThing1(A $input) {}
function doThing2(B $input) {}
function doThing3(C $input) {}

$x = giveMeSomething();

match (true) {
    $x instanceof A => doThing1($x),
    $x instanceof B => doThing2($x),
    $x instanceof C => doThing3($x),
}; // no need for default case, type A | B | C is exhaustively matched

Here is how we can combine enums and union types to model the complex PaymentMethod datatypes:

<?php

enum PaymentMethodKind {
    case CreditCard;
    case BankTransfer;
    case Crypto;
}

enum CreditCardType {
    case Visa;
    case MasterCard;
    case Amex;
}

enum BankTransferType {
    case SEPA;
    case SWIFT;
}

enum CryptoType {
    case Bitcoin;
    case Ethereum;
    case Custom;
}

final readonly class SEPA {
    public function __construct(
        public string $iban,
        public string $bic
    ) {}
}

final readonly class SWIFT {
    public function __construct(
        public string $account,
        public string $swiftCode
    ) {}
}

final readonly class CreditCard {
    public function __construct(
        public CreditCardType $type,
        public string $number
    ) {}
}

final readonly class BankTransfer {
    public function __construct(
        public BankTransferType $type,
        public SEPA | SWIFT $details
    ) {}
}

final readonly class Crypto {
    public function __construct(
        public CryptoType $type,
        public string $address,
        public ?string $name = null
    ) {}
}

final readonly class PaymentMethod {
    public function __construct(
        public PaymentMethodKind $kind,
        public CreditCard | BankTransfer | Crypto $payload
    ) {}
}

function describeCreditCard(CreditCard $card): string {
    return match ($card->type) {
        CreditCardType::Visa       => "Paid with Visa: {$card->number}",
        CreditCardType::MasterCard => "Paid with MasterCard: {$card->number}",
        CreditCardType::Amex       => "Paid with Amex: {$card->number}",
    };
}

function describeBankTransfer(BankTransfer $bt): string {
    return match (true) {
        $bt->details instanceof SEPA => "Paid via SEPA. IBAN: {$bt->details->iban}, BIC: {$bt->details->bic}",
        $bt->details instanceof SWIFT => "Paid via SWIFT. Account: {$bt->details->account}, SWIFT: {$bt->details->swiftCode}",
    };
}

function describeCrypto(Crypto $crypto): string {
    return match ($crypto->type) {
        CryptoType::Bitcoin  => "Paid with Bitcoin. Address: {$crypto->address}",
        CryptoType::Ethereum => "Paid with Ethereum. Address: {$crypto->address}",
        CryptoType::Custom   => "Paid with {$crypto->name}. Address: {$crypto->address}",
    };
}

function describePayment(PaymentMethod $method): string {
    $payload = $method->payload;

    return match (true) {
        $payload instanceof CreditCard    => describeCreditCard($payload),
        $payload instanceof BankTransfer  => describeBankTransfer($payload),
        $payload instanceof Crypto        => describeCrypto($payload),
    };
}

$payment = new PaymentMethod(
    PaymentMethodKind::BankTransfer,
    new BankTransfer(BankTransferType::SEPA, new SEPA('DE89...', 'COBADEFFXXX'))
);

echo describePayment($payment);

You can play around with the code here: https://phpstan.org/r/036e7355-80a1-44c5-99cc-70e66ef3bf54. I commented out line 73 to force an error (to prove the power of the static analysis of PHPStan).

Conclusion

All three languages can effectively make use of datatype modeling, enums and pattern matching to create highly correct programs. By highly correct, I mean programs that are validated to a high degree before they run for the first time, are extremely resistant to typos, prevent you from forgetting to handle a case, and give you a very clear picture of all the states and execution paths your program can take.

Rust and Java can do this without any third party packages, while PHP needs PHPStan or another static analyzer (that your team may not be familiar with). Also, interestingly enough, the Java code needed the fewest lines, while PHP needed more than twice as many. Talk about "scripting".

Ending Thoughts

In Rust, enums and pattern matching are a core pillar of the language design, and the language embraces multi-paradigm programming as well as putting many classes, functions and enums into the same file.

In Java and PHP, on the other hand, each class, interface, enum and so on is supposed to go into its own file. So even if the above snippets work, you would violate the coding standards of these languages unless you create all these additional files and folders. In practice this causes a lot of friction, especially if the team members are not sold on the benefits of compile-time guarantees.

For quick reference, here are all three code snippets side by side: