Typesafe F# configuration binding

At Symbolica we’re building a symbolic execution service that explores every reachable state of a user’s program and verifies assertions at each of these states to check that the program is correct. By default it checks for common undefined behaviours, such as out-of-bounds memory reads or division by zero, but it can also check custom, application-specific assertions, just like the kind you’d write in a unit test. Seen from this perspective it’s a bit like FsCheck (or Haskell’s QuickCheck or Python’s Hypothesis), but much more exhaustive and without the randomness.

As much as we like finding bugs with Symbolica, we prefer to not write any in the first place. Our first line of defence is a strong type system, so that we can try to design types that make invalid states impossible and let the compiler tell us off when we make a mistake. For that reason we’ve opted to build our service using F#, as it also interops nicely with the core part of our symbolic executor, which is written in C#.

One of the many things we love about F# is that, by default, it doesn’t permit null as a regular value. This feature eliminates a whole class of errors caused by null values, most notably the cursed NullReferenceException. Another feature that we like is having access to the vast wealth of .NET libraries that exist. However, many of these are written in C# and so they are often places where null values can sneak into an F# program through the backdoor at runtime.

One area where this was frequently biting us was the binding of configuration data using the Microsoft.Extensions.Configuration library. Due to this and other problems that we’ll go into below, we created a safer alternative for configuration binding for F# projects called Symbolica.Extensions.Configuration.FSharp and open-sourced it on GitHub.

The problems with Microsoft.Extensions.Configuration.Binder

The best way to highlight the shortcomings of the de facto config binder is with an example. Let’s say we want to model some logging options in our code. We might start out with a simple record type like this to represent the options.

type LoggingOptions = 
    { Level: string
      Sink: string }

Where Level represents how verbose we want the logging output to be, e.g. "Debug" or "Error", and Sink is where we want to send the logs, for example "Console" or "File".

Let’s test this out with a little fsx script that we can run with FSI.

#r "nuget: Microsoft.Extensions.Configuration.Binder"

open Microsoft.Extensions.Configuration

type LoggingOptions = { Level: string; Sink: string }

let config =
    ConfigurationBuilder()
        .AddInMemoryCollection(
            [ "Logging:Level", "Debug"
              "Logging:Sink", "Console" ]
            |> Map.ofList
        )
        .Build()

config.GetSection("Logging").Get<LoggingOptions>()

Here we’re just seeding the in-memory configuration provider with a dictionary of config data and then attempting to retrieve and bind the LoggingOptions.

Problem 1: Mutable data and null values

If we run the above script we might expect it to print out a LoggingOptions with a Level of "Debug" and a Sink of "Console". However, we actually hit a different problem. The above script throws the following exception.

System.InvalidOperationException: Cannot create instance of type 'FSI_0008+LoggingOptions' because it is missing a public parameterless constructor.

That’s because an F# record doesn’t have a parameterless constructor: all of the record’s properties must be properly initialised and null isn’t an allowed value. To make matters worse, the de facto binder mandates that the properties of the type being bound must be settable too, breaking immutability and making the use of a record to model options kind of pointless.

There are two typical workarounds to this:

Define a mutable class instead of a record for the options type, like we would in C#.
Add the [<CLIMutable>] attribute to the LoggingOptions record.

Neither of these is particularly pleasing. The first one means we have to give up on having immutable options types and the rest of the code base has to deal with the added complexity of potential mutability. The second is basically a hack which provides a mutable backdoor at runtime to our immutable type.

Using [<CLIMutable>] actually opens up a can of worms because our types are now deceiving us. Our simple record purports to be immutable and never contain null values and so in the rest of the code base we program as if this is the case. On the other hand the config binder isn’t abiding by these compile time invariants and may in fact initialise the record’s properties as null at runtime.

To see this in action, let’s rerun the above example, but this time with the [<CLIMutable>] attribute added to the LoggingOptions and a missing value for the Level in the raw config. The modified script looks like this.

#r "nuget: Microsoft.Extensions.Configuration.Binder"

open Microsoft.Extensions.Configuration

[<CLIMutable>]
type LoggingOptions = { Level: string; Sink: string }

let config =
    ConfigurationBuilder()
        .AddInMemoryCollection([ "Logging:Sink", "Console" ] |> Map.ofList)
        .Build()

config.GetSection("Logging").Get<LoggingOptions>()

Running it produces this output.

val it: LoggingOptions = { Level = null
                           Sink = "Console" }

We see that the type system has lied to us because the value of Level was actually null at runtime. In this case it’s relatively harmless, but in a real application it’s likely that we’ll have a more complex hierarchy of option types and so we’d end up trying to dereference a potentially null object, leading to the dreaded NullReferenceException.
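The mechanism behind this can be seen without the binder at all. The following is a minimal sketch using plain reflection; the type names PlainOptions and MutableOptions are made up for the illustration. It checks for a public parameterless constructor on each record and then creates an instance through it, which is roughly what a reflection-based binder has to do before setting properties.

```fsharp
open System

// Hypothetical names, used only for this illustration.
type PlainOptions = { Level: string; Sink: string }

[<CLIMutable>]
type MutableOptions = { Level: string; Sink: string }

// A plain record exposes no public parameterless constructor...
let plainHasDefaultCtor =
    not (isNull (typeof<PlainOptions>.GetConstructor(Type.EmptyTypes)))

// ...whereas [<CLIMutable>] makes the compiler emit one.
let mutableHasDefaultCtor =
    not (isNull (typeof<MutableOptions>.GetConstructor(Type.EmptyTypes)))

// Going through that constructor bypasses normal record construction,
// so the reference-typed fields start out as null.
let created = Activator.CreateInstance<MutableOptions>()
let levelIsNull = isNull created.Level

printfn "plain: %b, CLIMutable: %b, Level is null: %b"
    plainHasDefaultCtor mutableHasDefaultCtor levelIsNull
```

Any property the binder doesn’t find a value for is simply left in this initial state, which is how the null ends up in a record that claims it can’t hold one.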

When working in F# we’d rather the config binder returned a Result if the config couldn’t be parsed and allow us to use an Option type for config data that is, well, optional. Which leads us to the next problem.

Problem 2: No native support for binding DUs

As the de facto binder uses reflection to bind the raw config to “strongly typed objects”, it only has support for a limited set of types. This includes all the primitive types, like int and string, and a few of the common BCL collection types like List and Dictionary. This is frustrating for both C# and F# developers who wish to use more complex types to model their options.

Particularly frustrating for F# developers though is that this means it doesn’t support discriminated unions (DUs) and therefore doesn’t support types like Option. To highlight this let’s imagine we wanted to improve our LoggingOptions so that the Level was restricted to a discrete set of values. To do this we’ll create a DU called LogLevel and use it as the type for the Level property.

#r "nuget: Microsoft.Extensions.Configuration.Binder"

open Microsoft.Extensions.Configuration

[<RequireQualifiedAccess>]
type LogLevel =
    | Debug
    | Info
    | Warning
    | Error

[<CLIMutable>]
type LoggingOptions = { Level: LogLevel; Sink: string }

let config =
    ConfigurationBuilder()
        .AddInMemoryCollection(
            [ "Logging:Level", "Debug"
              "Logging:Sink", "Console" ]
            |> Map.ofList
        )
        .Build()

config.GetSection("Logging").Get<LoggingOptions>()

We’re now supplying a config dictionary that looks correct: it has entries for both "Logging:Level" and "Logging:Sink". So let’s run it and see what the output is.

val it: LoggingOptions = { Level = null
                           Sink = "Console" }

So we can see here that the binder has silently failed to bind the Level property now that its type is LogLevel.

If we want to bind a more complex type, we’ll first have to bind to a simple type, like a string, and then write a parser ourselves to turn that into a LogLevel. That’s a slippery slope because it then probably means having something like a ParsedLoggingConfig which we create from the more loosely typed LoggingConfig, resulting in us needing to define a fair amount of config parsing “boilerplate” anyway.
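To make that slippery slope concrete, here’s a rough sketch of what the two-stage approach tends to look like. It’s only an illustration: LoggingConfig and ParsedLoggingConfig are the hypothetical types mentioned above, and parseLevel and parse are helper names we’ve invented here.

```fsharp
[<RequireQualifiedAccess>]
type LogLevel =
    | Debug
    | Info
    | Warning
    | Error

// Loosely typed shape that a reflection-based binder is able to populate.
[<CLIMutable>]
type LoggingConfig = { Level: string; Sink: string }

// Strongly typed shape that the rest of the application would rather use.
type ParsedLoggingConfig = { Level: LogLevel; Sink: string }

// Hand-written parser from the raw string to the DU.
let parseLevel (raw: string) : Result<LogLevel, string> =
    match raw with
    | null -> Error "Level is missing"
    | "Debug" -> Ok LogLevel.Debug
    | "Info" -> Ok LogLevel.Info
    | "Warning" -> Ok LogLevel.Warning
    | "Error" -> Ok LogLevel.Error
    | other -> Error(sprintf "'%s' is not a valid LogLevel" other)

// Second stage: turn the loosely typed config into the parsed one.
let parse (raw: LoggingConfig) : Result<ParsedLoggingConfig, string> =
    parseLevel raw.Level
    |> Result.map (fun level -> { ParsedLoggingConfig.Level = level; Sink = raw.Sink })

let good = parse { LoggingConfig.Level = "Debug"; Sink = "Console" }
let bad = parse { LoggingConfig.Level = "Critical"; Sink = "Console" }
```

Every options type needs its own pair of types and its own parse function, and nothing stops the rest of the code from depending on the unparsed LoggingConfig by mistake.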

Problem 3: Parse, don’t validate

The de facto binder doesn’t really give us much help when our configuration is faulty. We can write some [options validators](https://docs.microsoft.com/en-us/aspnet/core/fundamentals/configuration/options?view=aspnetcore-5.0#options-validation) and wire these up with DI, but as Alexis King has taught us: parse, don’t validate.

In short, “parse, don’t validate” tells us that it’s better to parse data into a type, that once constructed must be valid, than it is to read the data into a more loosely typed object and then run some post-validation actions over the values to make sure they’re correct. The primary reason being that if we know that our type only permits valid values, then we no longer have to wonder whether or not it’s already been validated.

The de facto configuration binder doesn’t make it easy to adhere to this. It’s easy to forget to register a validator for the options and then when they’re accessed at runtime we instead get a rather unhelpful null value, like we observed earlier. What we’d prefer is for the compiler to prevent us from making such a mistake, by enforcing validation through the type system.

To give a specific example, let’s imagine we want to restrict the logging level to only the values "Info", "Debug", "Warning" and "Error". We’ve already seen we can’t use a DU to model this. So we have no way of knowing whether or not Level is valid when we come to use it; all we know is that it’s a string. So if we want to be sure, we’re forced to keep validating the logging level at every point of use.
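As a small sketch of the difference, compare a consumer of the string-typed level with a consumer of a DU. The function names isVerboseFromString and isVerbose are invented for this example and aren’t part of any library.

```fsharp
// String-typed level: every consumer has to decide what to do with an
// unexpected value, and nothing forces different consumers to agree.
let isVerboseFromString (level: string) : Result<bool, string> =
    match level with
    | "Debug"
    | "Info" -> Ok true
    | "Warning"
    | "Error" -> Ok false
    | other -> Error(sprintf "Unknown logging level '%s'" other)

[<RequireQualifiedAccess>]
type LogLevel =
    | Debug
    | Info
    | Warning
    | Error

// Level parsed once into a DU: the match is exhaustive, so there is no
// invalid case left to handle at the point of use.
let isVerbose (level: LogLevel) : bool =
    match level with
    | LogLevel.Debug
    | LogLevel.Info -> true
    | LogLevel.Warning
    | LogLevel.Error -> false
```

With the DU, the compiler also warns us about any consumer that forgets a case if a new level is added later.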

A better binder for F#

Given these shortcomings we decided to write our own config binder with the following design goals in mind:

Binding failures should be expected and be reflected in the type returned from the binder. We should be made to deal with the unhappy path.
Binding should not break immutability.
Binding should work for all types including complex user defined types.
Binding should be composable, such that if I can bind a type X which is then later used within Y, I should be able to reuse the binder for X when defining the binder for Y.
Error reporting should be greedy and descriptive so that developers can quickly fix as many errors as possible when binding fails.

To that end we opted to write a binder that didn’t use any reflection. The trade-off we’re making here is that we’re forced to be much more explicit when we bind a type and so we end up with what some people might consider to be boilerplate. However, we’d personally rather have code that is explicit than have to read through documentation to discover the implicit behaviours of something magic, because when the magic thing breaks we usually spend more time debugging that than we would have spent writing the explicit “boilerplate” to begin with.

Also, thanks to the composable nature of functional programming languages and the power of F#’s computation expressions it’s possible to be both explicit and terse. It’s probably best appreciated with an example. So let’s see how we’d bind the above LoggingOptions using our new approach.

#r "nuget: Symbolica.Extensions.Configuration.FSharp"

open Microsoft.Extensions.Configuration
open Symbolica.Extensions.Configuration.FSharp

[<RequireQualifiedAccess>]
type LogLevel =
    | Debug
    | Info
    | Warning
    | Error

module LogLevel =
    let bind =
        Binder(
            fun (s: string) -> s.ToLowerInvariant()
            >> (function
            | "info" -> Success LogLevel.Info
            | "debug" -> Success LogLevel.Debug
            | "warning" -> Success LogLevel.Warning
            | "error" -> Success LogLevel.Error
            | _ -> Failure ValueError.invalidType<LogLevel>)
        )

type LoggingOptions = { Level: LogLevel; Sink: string }

let bindConfig =
    Bind.section
        "Logging"
        (bind {
            let! level = Bind.valueAt "Level" LogLevel.bind
            and! sink = Bind.valueAt "Sink" Bind.string
            return { Level = level; Sink = sink }
         })

let config =
    ConfigurationBuilder()
        .AddInMemoryCollection(
            [ "Logging:Level", "Debug"
              "Logging:Sink", "Console" ]
            |> Map.ofList
        )
        .Build()

bindConfig
|> Binder.eval config
|> BindResult.mapFailure(fun e -> e.ToString())

Running this script produces the following output.

val it: BindResult<LoggingOptions,string> = 
    Success { Level = Debug
              Sink = "Console" }

From this example we can see that it’s successfully bound our more complex LoggingOptions type that contains a DU. There’s also zero magic: the binding process is clear to see and simple to customise. Let’s check that it’s met our design goals.

Failures are expected - We can see this by the fact that right at the end, after we’ve called eval on the Binder, it’s produced a BindResult.
Binding doesn’t break immutability - No [<CLIMutable>] required here.
Binding works for complex types - Binding a DU was no problem. We were also able to make it case insensitive just through a little function composition with ToLowerInvariant.
Binding is composable - We defined the binder for the LogLevel in isolation to the overall config binder.
Error reporting is greedy and informative - Let’s simulate some failures and see what happens.

Let’s run the script again but this time with the following input config.

[ "Logging:Level", "Critical" ]

So that the Level is invalid and the Sink is missing. We get the following output.

Failure(
  "@'Logging':
    all of these:
      @'Level':
        Value: 'Critical'
        Error:
          Could not parse value as type 'LogLevel'.
      @'Sink':
        The key was not found.")

It’s shown us all of the paths in the config for which it found errors and what those errors are.

The Implementation Details

At the heart of all of this is a Binder<'config, 'value, 'error> type. This type is just a wrapper around a function of the form 'config -> BindResult<'value, 'error>. For the category theory inclined, it’s just a reader monad whose return type has been specialised to a BindResult.
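To give a feel for that shape, here’s a minimal sketch of such a wrapper. This is not the library’s actual source: the BindResult below is a simplified stand-in, and only eval mirrors a function used earlier in this post, while map and the levelBinder example are our own additions for illustration.

```fsharp
// Simplified stand-in for the library's BindResult.
type BindResult<'value, 'error> =
    | Success of 'value
    | Failure of 'error

// A binder is just a function from some config to a result, wrapped in a type.
type Binder<'config, 'value, 'error> =
    | Binder of ('config -> BindResult<'value, 'error>)

module Binder =
    // Run the wrapped function against a concrete config (the "reader" part).
    let eval config (Binder f) = f config

    // Transform the bound value, leaving failures untouched.
    let map g (Binder f) =
        Binder(fun config ->
            match f config with
            | Success value -> Success(g value)
            | Failure error -> Failure error)

// Example binder over a plain Map instead of IConfiguration.
let levelBinder: Binder<Map<string, string>, string, string> =
    Binder(fun config ->
        match Map.tryFind "Level" config with
        | Some value -> Success value
        | None -> Failure "The key 'Level' was not found.")

let upperLevel =
    levelBinder |> Binder.map (fun (s: string) -> s.ToUpperInvariant())

let found = upperLevel |> Binder.eval (Map.ofList [ "Level", "debug" ])
let missing = upperLevel |> Binder.eval Map.empty
```

Nothing touches the config until eval is called, which is what lets binders be defined in isolation and composed before any configuration exists.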

The BindResult type is very similar to a regular F# Result except that its applicative instance will accumulate errors, whereas the regular Result will typically short-circuit on the first error it encounters.
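The difference is easiest to see side by side. The sketch below uses a simplified stand-in for BindResult whose failure case carries a list of strings; the library’s real error types are richer than this, and zip is just a name we’ve chosen for the applicative combination.

```fsharp
// Simplified stand-in: failures carry a list so they can be accumulated.
type BindResult<'value, 'error> =
    | Success of 'value
    | Failure of 'error list

module BindResult =
    // Applicative combination: if both sides fail, keep both sets of errors.
    let zip a b =
        match a, b with
        | Success x, Success y -> Success(x, y)
        | Failure e1, Failure e2 -> Failure(e1 @ e2)
        | Failure e, Success _ -> Failure e
        | Success _, Failure e -> Failure e

// With the regular Result, chaining stops at the first error.
let levelResult: Result<string, string> = Error "Level is invalid"
let sinkResult: Result<string, string> = Error "Sink is missing"

let shortCircuited =
    levelResult
    |> Result.bind (fun level -> sinkResult |> Result.map (fun sink -> level, sink))

// With the accumulating version, both errors survive.
let levelBind: BindResult<string, string> = Failure [ "Level is invalid" ]
let sinkBind: BindResult<string, string> = Failure [ "Sink is missing" ]

let accumulated = BindResult.zip levelBind sinkBind
```

Here shortCircuited only ever reports the Level problem, whereas accumulated reports both, which is the behaviour we want when fixing a broken config file.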

Binder and BindResult are defined generically to keep them as flexible as possible. However at some point we want to provide some specialisations for the common binding scenarios. There are really two primary specialisations to consider; one for binding sections and another for binding values.

Section binders are of the form Binder<#IConfiguration, 'a, Error> and value binders are of the form Binder<string, 'a, ValueError>. By fixing 'error to the custom types Error and ValueError it’s easy to compose Binders and also ensure that the errors can be properly accumulated in both applicative and alternative computations.

One of the primary specialisations comes from the bind [applicative computation expression](https://docs.microsoft.com/en-us/dotnet/fsharp/whats-new/fsharp-50#applicative-computation-expressions). We saw in the example above how bind lets us compose a Binder for an IConfigurationSection by binding its properties using existing Binders and at the same time ensures all binding errors from this section are accumulated. The bind CE gives us a declarative looking DSL for defining new binders for our application specific config objects.
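For readers who haven’t written an applicative computation expression before, the following sketch shows the two builder methods that make let! ... and! ... work in F# 5 and later. It’s a toy, not the library’s bind builder: Validation, ValidationBuilder, validation and required are all names invented for this example, and the config is a plain Map rather than an IConfiguration.

```fsharp
type Validation<'value> =
    | Success of 'value
    | Failure of string list

type ValidationBuilder() =
    // Used for `let! ... return ...`: map over a successful value.
    member _.BindReturn(result: Validation<'a>, f: 'a -> 'b) : Validation<'b> =
        match result with
        | Success value -> Success(f value)
        | Failure errors -> Failure errors

    // Used for `and!`: combine two independent results, keeping every error.
    member _.MergeSources(a: Validation<'a>, b: Validation<'b>) : Validation<'a * 'b> =
        match a, b with
        | Success x, Success y -> Success(x, y)
        | Failure e1, Failure e2 -> Failure(e1 @ e2)
        | Failure e, Success _ -> Failure e
        | Success _, Failure e -> Failure e

let validation = ValidationBuilder()

type LoggingOptions = { Level: string; Sink: string }

// Look up a required key, failing with a message if it's absent.
let required (key: string) (config: Map<string, string>) : Validation<string> =
    match Map.tryFind key config with
    | Some value -> Success value
    | None -> Failure [ sprintf "@'%s': The key was not found." key ]

let bindLogging (config: Map<string, string>) : Validation<LoggingOptions> =
    validation {
        let! level = required "Level" config
        and! sink = required "Sink" config
        return { Level = level; Sink = sink }
    }

let ok = bindLogging (Map.ofList [ "Level", "Debug"; "Sink", "Console" ])
let bothMissing = bindLogging Map.empty
```

Because the two lookups are joined with and! rather than a second let!, neither depends on the other, so the builder is free to evaluate both and merge their errors.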

In the Bind module the library also provides various combinators for building new Binders, such as Bind.section and Bind.valueAt, which take an existing Binder and bind it to a section or to a value at a particular key, and which are typically used inside a bind CE. It also contains many binders for types like int, bool, System.DateTime and System.Uri, as well as more complex structures like List and IDictionary.

Try it out

The code is available on GitHub and you can install the library via NuGet. If you want to see even more sophisticated examples that show how to do things like handle optional values, deal with alternatives and bind units of measure then check out the IntegrationTests. Of course if there’s something that you think is missing then open an issue or a pull request. I’m sure there are plenty of other Binders that we can add to the Bind module to cover other common .NET types.

Future Improvements

If you want to use things like IOptionsSnapshot then it requires interaction with the IServiceCollection and a call to Configure<MyOptionsType>(configureAction). Unfortunately the way that Microsoft have designed this means that a parameterless public constructor is required on the options type being configured so that an instance can be passed to configureAction, which goes against our design principles here. So currently this library won’t play nicely with things like reactive options updates. If this is something that you’d like then it should be possible to provide a way around this by providing an alternative IOptionsFactory, so please open an issue and let us know. See the [README](https://github.com/Symbolica/Symbolica.Extensions.Configuration.FSharp#usage-with-di) for more details.

This post is also available on DEV.