I want to write about petal. It’s a small functional language that makes effects explicit through a simple monadic type system, compiled to LLVM IR, with a fuzzer that generates test properties.

The language - petal

A small functional langauge that

  • makes effects explicit through a simple monadic type system
  • compiles to LLVM IR via closure conversion
  • has a built-in property fuzzer

The Plan - three phases

Phase 1: Core langauge

Implement parser, type checker, closure conversion, llvm codegen. In this phase, only deal with pure functions.

Phase 2: Effect system

Add a small set of monads to the type system. Candidates are: Maybe, IO, Maybe use Do-notation desugaring as a compiler pass. Type checker can be extended to track and enforce effect types.

Phase 3: Type-directed fuzzer

Build a fuzzer that uses type information to generate meaningful tests.

pub enum Expr {
    Var(String),
    Num(i64),
    Bool(bool),
    Cond(Box<Expr>, Box<Expr>, Box<Expr>),
    Lambda(String, Type, Box<Expr>),
    App(Box<Expr>, Box<Expr>),
    Let(String, Box<Expr>, Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
}

Comes with simple tyles.

pub enum Type {
    Int,
    Bool,
    Fun(Box<Type>, Box<Type>)
}

I used fmt::Display to implement pretty-printing.

impl fmt::Display for Type {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Type::Int => write!(f, "Int"),
            Type::Bool => write!(f, "Bool"),
            Type::Fun(t1, t2) => write!(f, "{t1} -> {t2}"),
        }
    }
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Expr::Var(s) => write!(f, "{s}"),
            Expr::Num(n) => write!(f, "{n}"),
            Expr::Bool(b) => write!(f, "{b}"),
            Expr::Lambda(x, t, e) => write!(f, "λ{x}: {t}. {e}"),
            Expr::App(e1, e2) => write!(f, "({e1})({e2})"),
            Expr::Add(e1, e2) => write!(f, "({e1} + {e2})"),
            Expr::Let(x, e1, e2) => write!(f, "let {x} = {e1} in {e2}"),
            Expr::Cond(cond, e1, e2) => write!(f, "if {cond} then {e1} else {e2}")
        }
    }
}

But this is too generic. I wanted to make it a little more special. After all, I have a PL background!

So I decided to add a little bit of richness in the types. That richness is refinement types — types annoted with logical predicates that contain values like

x : { n : Int | n > 0 }   -- positive integers only

Plus, a type-directed fuzzer built into the compiler.

Rather than a separate fuzzing tool, build a fuzzer directly into Petal that uses the type information from the type checker to generate well-typed random programs automatically. The insite is if you know the type of an exrpession, you can generate valid inputs for it without random guessing, which is exactly the problem my previous work at PLR-Prague was solving for R, but now solved using types.

Refinement types

Refinement types tell you what values are valid for each expression. A type-directed fuzzer uses that information to generate valid inputs automatically. These two features are deeply complementary. The richer the type system, the smarter my fuzzer can be!

The result is a langauge where:

  • the type checker statically guarantees properties about values (refinement types)
  • the fuzzer uses those same type annotations to generate meaningful test inputs automatically
  • I can test whether the compiler correctly preserves those properties through compilation

A rough phased plan

Phase 1: get the foundation right

  • Parser, type checker, closure conversion, LLVM codegen
  • A working minimal ML-like language

Phase 2: add refinement types Start simple. The simplest useful refinement type system tracks integer contraints:

x : { n : Int | n > 0 }       -- positive integers
y : { n : Int | n >= 0 }      -- non-negative integers
z : { n : Int | 0 <= n < 10 } -- bounded integers

This requires

  • Extending the Type enum to carry predicates
  • A constraint checker that verifies refinements at type-checking time
  • Deciding how much to verify statically vs dynamically
    • Full static verification may require a SMT solver, like Z3, which could be ambitious and interesting

Phase 3: Type-directed fuzzer Once I add refinement types, build a fuzzer that

  • Reads the type of an expression
  • Generates random values that satisfy the refinement predicate
  • Uses those values to automatically construct well-typed test programs
  • Runs the generated programs through the compiler and checks for crashes, incorrect outputs, or violated refinements

Don’t get excited too soon

I ask myself this question: for a simple language like Petal, would refinement types and built-in fuzzer be even meaningful?

No. For a tiny language, the programs you can write are so simple that refinement types don’t buy you much. Knowing that n > 0 in a language that only does arithmetic and booleans doesn’t prevent many interesting bugs. The fuzzer also has a limited surface area to explore. There’s not much that can go wrong in a language this small. The refinement types + built-in fuzzer combination is somewhat artificial.

Yes. The value of Petal wouldn’t be the language itself. It would be the ideas it demonstraits. I can think of it less as a useful tool but more as a proof of concept. The questions it ccould answer are

  • Can refinement type information automatically derive a smart fuzzer?
  • Does type-directed fuzzing find bugs that untyped fuzzing misses?
  • How do you propagate refinement constraints through compilation and verify they’re preserved in the generated LLVM IR?

So that’s the direction I’d like to take: Petal as a research prototype for exploring the intersection of refinement types and type-directed fuzzing.

Existing languages

To complete this note, I’d like to add the existing languages with refinement types and type-directed fuzzing.

Refinement types

  • Liquid Haskell: the most well-known. Adds refinement types to Haskell via SMT solving (Z3). Types like {v: Int | v > 0} are verified statically. Very influential in the PL research community

  • Rust: doesn’t have refinement types natively, but there are research projects adding them (e.g. Flux, which adds Liquid Haskell-style refinements to Rust)

  • Stainless: refinement types for Scala

Dependently typed languages

They can express refinement-like properties, but that’s not the same thing as being refinement type systems.

  • F*: Microsoft Research’s dependently typed language used for verified cryptography and systems code. More powerful than Liquid Haskell but also more complex

  • Idris: a dependently typed language where types can express arbitrary properties. More of a proof assistant than a practical language

Built-in / type-directed fuzzing

  • QuickCheck (Haskell): the most influential. Generates random test inputs using type class instances. You define a generator for each type and QuickCheck uses it automatically. This is the closest thing to type-directed fuzzing in wide use

  • Hypothesis (Python): a modern fuzzing library inspired by QuickCheck
  • proptest (Rust): similar to QuickCheck for Rust