Finding solutions using genetic algorithms in F#

A genetic algorithm simulates the process of evolution in order to find an optimal solution to a problem.  This sort of algorithm can be used when there isn't a single perfect answer; instead, we build software to search for the best solution it can find in the amount of time we're willing to spend on it.

Here, we will solve a simple problem to introduce the structure of a genetic algorithm in F#.  We want the machine to find a path around an obstruction.  Say you have a grid with the starting point at the origin, and there is a vertical wall from (3,-2) to (3,+2) that the algorithm needs to find a way around.  Not an incredibly difficult problem; there are certainly many ways to solve it and many possible solutions, but I'll use it to introduce the mechanics of solving such a problem.

First, a few definitions (types).  A “trait” is a characteristic, and a list of these characteristics makes up an “individual”, which might be a solution to our problem.  Given that our problem is to navigate a grid, our traits will be four operations – Forward, Up, Down, and Back.  For a genetic algorithm, you start with a pool of individuals, which we will refer to as a “population”.

type Trait =
    | Forward
    | Up
    | Down
    | Back
type Individual = Trait list
type Population = Individual list

A genetic algorithm will combine two (or sometimes more) individuals in one generation to create the next generation of individuals.  We’ll refer to those as “parents”.

type Parents = Individual * Individual

The general process is to take an initial population, evaluate each individual against some fitness function to rank each of them, then combine the best performers into a new population – a generation.  Repeat until an optimal solution is found amongst the individuals in a population.  Prevent stagnation between generations by adding mutation, which can “tweak” individuals a bit so each generation will differ from the previous at least a little bit.

To begin the search for a solution, we need to create some initial population of individuals.  Individuals can be generated completely at random (as we will here) or with some predefined logic.

let createIndividual (rng:Random) (size:int) : Individual =
    let rec buildList numLeftToAdd lst =
        match numLeftToAdd with
        | 0 -> lst
        | left -> 
            let nextRandom = 
                match rng.Next(4) with
                | 0 -> Forward
                | 1 -> Up
                | 2 -> Down
                | 3 -> Back
                | n -> failwithf "No operation mapped for %i" n
            nextRandom :: lst |> buildList (left - 1)
    buildList size List.empty

So what did that do?  It recursively builds a list, prepending a random trait to the beginning of it.  Next, we need a function to create a population of a given size.  This function should accept the population size and a function to create individuals for the population.

let createPopulation populationSize newIndividual : Population = 
    let rec buildPopulation numLeft population =
        match numLeft with
        | 0 -> population
        | n -> (newIndividual ()) :: population |> buildPopulation (n - 1)
    buildPopulation populationSize List.empty

Not a lot to this, as it's delegating the hard work to the newIndividual function.  And now we need to run our simulation.  This would accept a candidate individual, run it through the simulation, and return the result for that candidate.  This can be represented as a Result record that contains the final coordinate and the candidate, which we will need if this is one of the members of the population that is selected for the next generation.

type Coordinate = { X:int; Y:int }

type Result = { FinalPosition:Coordinate; Candidate:Individual }

let runSimulation initialPosition (candidate: Individual) =
    let newPosition coordinate operation =
        match operation with
        // First the case of moving forward but hitting the obstacle 
        | Forward when coordinate.X + 1 = 3 
            && coordinate.Y >= -2 
            && coordinate.Y <= 2 -> coordinate // goes nowhere
         // With no obstruction, go ahead and move forward
        | Forward -> { coordinate with X=(coordinate.X + 1) }
        // Also can't move backward and hit the obstacle
        | Back when coordinate.X - 1 = 3
            && coordinate.Y >= -2
            && coordinate.Y <= 2 -> coordinate // goes nowhere
        // Go back without obstruction
        | Back -> { coordinate with X=(coordinate.X - 1) }
        // Can always go up or down - the block is a line.
        | Up -> { coordinate with Y=(coordinate.Y + 1) }
        | Down -> { coordinate with Y=(coordinate.Y - 1) }
    // Fold over each operation going from initial to final position
    let finalCoordinate = 
        candidate |> List.fold newPosition initialPosition
    { FinalPosition=finalCoordinate; Candidate=candidate }

Now we determine the fitness for a result.  The goal here is to move as far to the right as possible, so the position along the x-axis is effectively the “score” for an individual’s result.

type Fitness = { X_Position:int; Result:Result }

let fitness result =
    { X_Position=result.FinalPosition.X; Result=result }

A whole population can be evaluated by running the simulation and then evaluating the fitness of each individual.  We can compose the runSimulation and fitness functions, and pass each member of the population through this composition to get the fitness of each.  Each member starts at the same initial position, the origin (0,0).

let evaluateGeneration initialPosition (generation:Population) =
    |> (runSimulation initialPosition >> fitness)

After evaluating the population, we will want to create a new population for the next generation. This consists of selecting the best individuals, mutating some, and crossing them with each other in some way to result in a new population.

First, we can define a crossover function:

let crossover crossoverPoint (parents:Parents) : Individual =
    let parent1, parent2 = parents
        parent1 |> List.take crossoverPoint
        parent2 |> List.skip crossoverPoint
    ] |> List.concat

This will concatenate the first part of one individual with the second part of another to create a new individual. This is one of the areas where you may want to try different strategies to see what works best. You could interleave traits from the two individuals, or combine traits from more than two; the general point is that you need to build a new individual with building blocks from the parents. A little randomness is often good here as well. If you always cross over parents at the middle, then after a few generations they are always swapping the same traits back and forth, and the algorithm won't evolve as well.
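As a sketch of one such alternative, a hypothetical interleave function alternates traits from each parent.  It assumes both parents are the same length, which holds here since every individual has the same number of moves, and it has the same Parents -> Individual shape used later:

let interleave (parents:Parents) : Individual =
    let parent1, parent2 = parents
    // Pair up the traits, then keep parent1's trait on even positions and parent2's on odd.
    List.zip parent1 parent2
    |> List.mapi (fun idx (t1, t2) -> if idx % 2 = 0 then t1 else t2)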

Next, we need mutation. Without some diversity, these same traits will always be selected as the best and never really get any better. By adding mutation, different individuals can be pushed to the top so they crossover with others, preventing stagnation and moving the population forward. The mutation below will create an entirely new individual in the population, although this is another place where different strategies should be attempted.
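As an alternative sketch (the name pointMutation and the 1-in-10 rate are made up for illustration), a gentler mutation could tweak a single trait within occasional individuals rather than replacing them entirely:

let pointMutation (rng:System.Random) (population:Population) : Population =
    population
    |> List.map (fun individual ->
        if rng.Next(10) = 0 then // mutate roughly 1 in 10 individuals
            let position = rng.Next(individual.Length)
            individual
            |> List.mapi (fun idx t ->
                if idx = position then
                    // Swap in a new random trait at this position.
                    match rng.Next(4) with
                    | 0 -> Forward | 1 -> Up | 2 -> Down | _ -> Back
                else t)
        else individual)

Partially applied, pointMutation rng has the same Population -> Population shape the rest of the code expects.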

let mutation newIndividual (population:Population) =
    |> List.mapi (fun idx individual -> 
        if idx % 13 = 0 then // replace every 13th member of the population
            newIndividual ()
        else
            individual)

If we just wanted to skip mutation altogether, we could have this sort of mutation function:

let noMutation (population: Population) = population

Armed with a crossover function to make an Individual from Parents and a mutation function to modify some elements of a population, we can take the results for one generation and apply the mutation and crossover to get the next generation.

We want the better part of the current population, and we will make our next generation by combining the best two with the better half of the population.

/// 1. Get the better half + 2 for the parents
/// 2. Apply the mutation function to that group
/// 3. Crossover the best two with the rest of the population to get a new population
let nextGeneration (mutation:Population -> Population) (crossover:Parents -> Individual) (generationWithResults:Fitness list) =
    let betterHalf : Population =
        |> List.sortByDescending (fun fitness -> fitness.X_Position)
        |> List.truncate (generationWithResults.Length / 2 + 2)
        |> (fun fitness -> fitness.Result.Candidate) // and keep just the individuals

    // Apply some mutation before the crossover.
    betterHalf |> mutation
    |> function // take the first two and crossover with the rest.
    | first::second::rest ->
            rest |> (fun prevGen -> crossover (first, prevGen) )
            rest |> (fun prevGen -> crossover (second, prevGen) )
        ] |> List.concat
    | _ -> failwithf "Population is too small (%i) for another generation." betterHalf.Length

It’s important to note that the mutation should generally occur before the crossover. If it is applied afterwards, then you just have a few unusual candidates that will probably perform poorly and be eliminated before they ever crossover with others.

With initial generation, fitness, crossover, and mutation in place, let's put it all together.  We make an evolve function that processes a generation and then continues with the next, up to a maximum number of generations.

let main argv =
    /// Some randomness for generating individuals' traits.
    let rng = Random ()

    /// We'll have 25 moves to get around the obstacle and head to the right.
    let numberOfMoves = 25

    /// Start at the origin.
    let initialPosition = { X=0; Y=0 }

    /// make a function to create an individual
    let newIndividual () = createIndividual rng numberOfMoves

    /// define a function to mutate
    let mutate = mutation newIndividual

    /// Function that will evolve a population through generations
    let rec evolve max current population =
        let randomPoint = rng.Next(0, numberOfMoves)
        let crossoverAtPoint = crossover randomPoint 
        match current with
        | i when i = max -> ()
        | i -> 
            let results = population |> evaluateGeneration initialPosition
            let best = results |> List.sortByDescending (fun fitness -> fitness.X_Position) |> List.head
            printfn "gen %i results: %A" i best
            let nextGen = results |> nextGeneration mutate crossoverAtPoint
            nextGen |> evolve max (i + 1)

    let initialPopulationSize = 35
    /// make the initial candidates
    let initialPopulation = newIndividual |> createPopulation initialPopulationSize
    printfn "Initial candidates: %A" initialPopulation
    /// Run 1000 iterations starting with the initial population
    evolve 1000 0 initialPopulation

Running the application will generate a population and run it through generations to find an optimal solution. Here are a few sample generations:

gen 0 results: {X_Position = 3;
 Result =
  {FinalPosition = {X = 3;
                    Y = -8;};
   Candidate =
    [Back; Down; Back; Down; Forward; Forward; Up; Forward; Back; Back; Forward;
     Back; Down; Forward; Down; Down; Forward; Forward; Back; Down; Forward;
     Forward; Down; Down; Down];};}

Wow, kind of just wandering aimlessly, going backwards a bunch. But by the 4th generation, it figured out to go forward a lot more. Still hit that wall a few times, wasting moves.

gen 4 results: {X_Position = 7;
 Result =
  {FinalPosition = {X = 7;
                    Y = 2;};
   Candidate =
    [Forward; Forward; Back; Forward; Forward; Up; Forward; Up; Up; Up; Forward;
     Down; Forward; Up; Forward; Forward; Down; Forward; Up; Back; Down; Down;
     Down; Up; Forward];};}

It starts to figure things out, going up or down early on, then going forward a lot.

gen 112 results: {X_Position = 13;
 Result =
  {FinalPosition = {X = 13;
                    Y = -8;};
   Candidate =
    [Down; Forward; Forward; Down; Down; Forward; Forward; Down; Up; Down; Back;
     Forward; Forward; Forward; Forward; Forward; Down; Forward; Down; Forward;
     Down; Down; Forward; Forward; Forward];};}

I let it go 1000 times. Probably not so necessary, but here’s where it ended up, heading down below the obstruction then mostly forward after that.

gen 999 results: {X_Position = 17;
 Result =
  {FinalPosition = {X = 17;
                    Y = -2;};
   Candidate =
    [Down; Forward; Down; Down; Forward; Forward; Forward; Up; Forward; Forward;
     Up; Forward; Up; Forward; Forward; Forward; Forward; Forward; Forward; Down;
     Forward; Forward; Forward; Down; Forward];};}
Not too complex, and we’ve evolved a set of data that meets our goal. Interestingly, we can change our obstruction or add more obstructions to our simulation and see the algorithm evolve differently.
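As a sketch of that idea (the second wall here is entirely made up), the obstruction checks in newPosition could be generalized to a set of blocked coordinates:

// Hypothetical: describe obstructions as data instead of hard-coded guards.
let blocked =
    [ for y in -2 .. 2 -> { X=3; Y=y } ]   // the original wall
    @ [ for y in 0 .. 4 -> { X=7; Y=y } ]  // an additional, made-up wall
    |> Set.ofList

let newPosition coordinate operation =
    let candidate =
        match operation with
        | Forward -> { coordinate with X = coordinate.X + 1 }
        | Back    -> { coordinate with X = coordinate.X - 1 }
        | Up      -> { coordinate with Y = coordinate.Y + 1 }
        | Down    -> { coordinate with Y = coordinate.Y - 1 }
    // A move into any blocked cell goes nowhere.
    if blocked |> Set.contains candidate then coordinate else candidate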
A quick summary:
  1. Define the traits that make up an individual.
  2. Define functions to create individuals and an initial population
  3. Define a function to simulate what you want the individuals to attempt
  4. Define a fitness function to determine which individuals perform the best
  5. Define a crossover function and a mutation function to create better generations
  6. Evolve the individuals through generations until you achieve an optimal result.

EntityFramework in F# on .NET Core

There is a lot of scattered information these days on using EntityFramework with .NET Core, especially when it comes to F#, so I put this together as a very quick example of the steps.

First, set up your .NET Core .fsproj with the needed packages:

dotnet add package Microsoft.EntityFrameworkCore
dotnet add package Npgsql.EntityFrameworkCore.PostgreSQL (or whatever DB driver you use)

In some module, make your DB types and a DbContext to hold them:

open Microsoft.EntityFrameworkCore
open System.ComponentModel.DataAnnotations.Schema

// First, your "code-first" models:
[<Table("some_things",Schema="public")>] // Schema needed for Postgres, YMMV
type SomeThing () = 
  member val ID : System.Guid = System.Guid() with get, set
  member val Name : string = "" with get, set

// Then a type to inherit from DbContext with properties for your models:
type Db (connectionString : string) =
  inherit DbContext()

  [<DefaultValue>]
  val mutable private somethings : DbSet<SomeThing>
  member public this.SomeThings
    with get() = this.somethings
    and set s = this.somethings <- s

  // Tell EntityFramework how to connect; UseNpgsql comes from the Postgres driver package.
  override this.OnConfiguring (optionsBuilder:DbContextOptionsBuilder) =
    optionsBuilder.UseNpgsql(connectionString) |> ignore

Finally, use the DbContext with an F# query expression, and in an async expression if you like:

async {
  use db = new Db (connectionString)
  let q = query {
    for s in db.SomeThings do
    select s
  }
  let! records = q.ToListAsync () |> Async.AwaitTask
  return records
}

I hope this helps readers to get started quickly with the EntityFramework on F# and .NET Core.

Communication between Docker containers

There is a default bridge network that exists on all docker hosts, and any containers you launch will be added to that bridge by default.  You can see it with:

$ docker network inspect bridge

With no containers running, you’ll see nothing listed here under “Containers”, but once you start a container, it will be on that bridge:

$ docker run -itd postgres

$ docker network inspect bridge

// result snipped for brevity
"Containers": {
 "f94674a1a0df545bb64058f3b763426466e4f0daa80397504581d537a4907e11": {
 "Name": "prickly_lovelace",
 "EndpointID": "d3d721da42c5b2d7b50266e49112c0590af80c01d96b272c83e2fcd05e9c2e73",
 "MacAddress": "02:42:ac:11:00:02",
 "IPv4Address": "",
 "IPv6Address": ""

Now start a container with bash, and you can ping the first container:

$ docker run -it debian /bin/bash
root@176b02a2a6e2:/# ping -c 1

PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=64 time=0.136 ms

That’s cool…they can see each other, and they can actually connect over any ports that are exposed in the Dockerfile, 5432 in this case.

root@176b02a2a6e2:/# </dev/tcp/ && echo connected
connected


On the default bridge network, you link two containers with the --link option, which does little more than add a /etc/hosts entry:

$ docker run -it --rm  --link prickly_lovelace:pg debian /bin/bash
root@84e87ecb1060:/# ping -c 1 pg

PING pg ( 56 data bytes
64 bytes from icmp_seq=0 ttl=64 time=0.101 ms

root@84e87ecb1060:/# cat /etc/hosts | grep pg pg da01e1088f69 prickly_lovelace


Docker also exposed port 5432 on the bridge network only.  It’s not available on the host, but another container on that network can connect to it.

$ docker run -it --rm  --link prickly_lovelace:pg postgres psql -h pg -U postgres

This drops us into a psql shell in one container, connected to the postgres server in the other.

You may have heard that --link is deprecated.  It is, and it's only really needed on the default bridge network to connect containers by name.  To really isolate your applications, you should create your own bridge network, where you get DNS entries for each docker container.

$ docker network create --driver bridge mynet

Now you have a network and you can inspect it, e.g. docker network inspect mynet.  When launching containers on that bridge, if you give each a name, then other containers on the same bridge can resolve them by name using Docker's embedded DNS server:

$ docker run -itd --name pgserver --network mynet postgres

Here you named the container “pgserver”, put it on the network “mynet”, and launched it from the “postgres” image.  You can verify that a container launched in “mynet” can connect, while a container launched elsewhere will not.

$ docker run -it --rm postgres psql -h pgserver -U postgres

psql: could not translate host name "pgserver" to address: Name or service not known

$ docker run -it --rm --network mynet postgres psql -h pgserver -U postgres

psql (9.6.2)
Type "help" for help.
postgres=# \q

Looks great, we have network isolation and hostname resolution between containers in our bridge network!

We were able to connect to 5432 since that is exposed in the Dockerfile for the postgres image, but what if you deployed an application to a standard image that doesn’t expose any ports?  For example, you might mount a directory containing your application in a container launched from the microsoft/dotnet:runtime image, and wish to expose the port for your application.  To do this, pass --expose=<port> as an option to docker run.  As a very simplistic example, the following will install python3 on a container based off debian and start the simple http server:

$ docker run -itd --network mynet --expose 8000 --name littlehttp debian /bin/bash -c "apt-get update && apt-get install -y python3 && python3 -m http.server"

Once the container is done installing python3, it will listen on port 8000, which you can connect to from another container.

$ docker run -it --rm --network mynet debian /bin/bash -c "apt-get update && apt-get install -y curl && curl http://littlehttp:8000"

You can also publish a port from a container on the bridge network to the host network:

$ docker run -itd --network mynet --expose 8000 -p 1234:8000 --name littlehttp debian /bin/bash -c "apt-get update && apt-get install -y python3 && python3 -m http.server"

In the above, we tell the container to expose 8000 and then bind host port 1234 to that container’s port 8000.  Now we can browse to our docker host’s port 1234 to connect to the simple http service from the container.

When it’s time to tear down the application, after removing the containers, you can remove the “mynet” bridge network for them with docker network remove mynet.


Building .NET Core applications in TeamCity

Continuous integration is a critical part of the development lifecycle, moving changes quickly from code through testing and packaging to deployment.  These steps apply to TeamCity, but the work is really performed in docker, so it could be done on nearly any CI.

It's hard to guarantee that TeamCity build agents are up to date with the latest runtime.  If you can't count on that, then running command line tasks with Docker is your best bet.

First, the build.  Add a “command line” runner.  The working directory on the agent will be the location where the source is checked out, so you’ll need to attach this directory to a docker container with the -v option.  Then within that container, you’ll need to run the dotnet commands, so the container needs to be based off the SDK images provided from Microsoft.  The latest image at time of writing supports the msbuild project file format.

To actually build, you need to run a bash command to

  1. Change to the directory that is mounted from the host (the build agent’s working directory).
  2. Restore dependencies with dotnet restore
  3. Build the project or solution with dotnet build
  4. Test the project with dotnet test
  5. Build a nuget package with dotnet pack

All of that can be done within the same container using the following command line:

docker run -v `pwd`:/src --rm -i microsoft/dotnet:sdk /bin/bash -c "cd src; dotnet restore && dotnet build && dotnet test mytestproject/mytestproject.fsproj && dotnet pack myproject/myproject.fsproj"

Once executed, a successful build will result in a nuget package under the working directory, in this case, at myproject/bin/Debug/myproject.1.0.0-alpha.nupkg.  The && chains commands together upon success, so, for example, if the tests fail, the package won’t be produced.  The container was started with --rm and so it will be removed upon exit of the command, cleaning up any build artifacts.  An additional build step can publish that package to a nuget repository.

Distributed Coordination with Zookeeper

Zookeeper is a system for coordinating applications and provides a framework for solving several problems that can arise when building applications that must be highly available, distributed, tolerant to network partitions and node failures:

  1. Data update notifications. Imagine you have a few processes running to process some data. Whenever one process is done, it needs to let the others know it's ready for the next process to pick it up. A rudimentary way to accomplish this would be for all the processes to periodically read some file or database record to see if it is their turn. That works, but everything needs to poll, and that can lead to locking of the file or database table. With Zookeeper, you have a tree of nodes, like a mini file system, and can query for some data and set a watch on the path to that data. If the data is added, deleted, or otherwise changed on that path, your watch will be immediately notified. This is a very helpful tool for passing status information or other data between local or distributed processes, which is something you will do quite a lot when increasing scalability.
  2. Distributed lock (semaphore). If you’ve written any multithreaded code, you likely had to occasionally use some mechanism to lock access to some shared resource. If you’re working with distributed resources and want to lock access to those, it’s not sufficient to use a locking mechanism in your own process or even in the OS – you need a distributed lock. To accomplish this in Zookeeper, create a node that represents the resource you are locking, and anything that wishes to have access to that resource must be able to create the node. If the node already exists, that means another application has the lock, so you should set a watch to know when it is released (the node is deleted). Because applications crash and network partitions occur, you will want a lock created by an application released even if the application abnormally disconnects, which can be accomplished by creating an “ephemeral” node for the lock. When the session associated with an ephemeral node ends, the node will be automatically deleted, and for your distributed application, that means the lock is released.
  3. Leadership election. Occasionally, your clustered application will need an active/passive capability. An election needs to occur to determine the active member, and if that member dies, another election must occur to find a new active member. This can be accomplished in Zookeeper by all cluster members attempting to create data at a specific path in Zookeeper. The first one to create the node becomes the active member, and it should delete the node when it shuts down. The rest watch for the node to be deleted, at which point, they all try to create the node, and the winner becomes the active member. Because applications crash, Zookeeper nodes can be created as "ephemeral" so that if the connection closes and the session ends, the ephemeral nodes created by that session will be removed automatically. So if the active member crashes, Zookeeper removes the node and the passive members will hold an election.

Lots of nice use cases, so how do we use it? First you need a Zookeeper instance, or an "ensemble" if you want high availability of Zookeeper itself. There are plenty of tutorials for that; one of the quickest ways to get a simple instance up and running is to install the "zookeeperd" package.

yum install -y zookeeperd

or

apt-get install -y zookeeperd

After installing, you may need to start Zookeeper (sudo service zookeeper start), and then by default it will listen on port 2181.

This isn’t a tutorial on administration of Zookeeper itself, and I suggest you read the documentation on clustering, backup, purging the transaction logs, etc. On to actually using it from .NET.

The authors of Zookeeper didn't do a great job of documenting the protocol, nor did they create a .NET client. A few have reverse engineered the Java client, and the best library for this currently is ZooKeeperNetEx. This library is well maintained by developers who are responsive to reported issues and keep up to date with the latest .NET technologies. Big features of this library:

  • Fully supports Windows .NET, Mono, and CoreCLR
  • Asynchronous through and through, so you will not find yourself blocking threads waiting on the Zookeeper protocol.
  • Comes with a companion package, ZooKeeperNetEx.Recipes, that has implementations of common Zookeeper usage patterns.

open System.Threading.Tasks
open org.apache.zookeeper
open org.apache.utils

// A Zookeeper Watcher that watches for the connection to be made successfully.
type WaitToConnect() =
    inherit Watcher()
    // Need to return a task that only completes once connected to a Zookeeper node.
    let connectedCompletionSource = TaskCompletionSource<bool>()
    // WaitAsync returns a task that completes once connected.
    member cw.WaitAsync () = connectedCompletionSource.Task
    // The process method is called by the framework when the connection state changes.
    override cw.``process``(evt:WatchedEvent) =
        match evt.getState() with
        | Watcher.Event.KeeperState.SyncConnected -> connectedCompletionSource.TrySetResult(true) |> ignore
        | _ -> ()
        Task.FromResult(true) :> Task

// A function to asynchronously connect to Zookeeper, wait until successfully connected, create a node, and disconnect.
let connectCreateNodeAndDisconnect () =
    async {
        let zkUrl = "localhost:2181"
        let sessionTimeout = 10000
        let newPath = "/mynode"
        let watcher = WaitToConnect ()
        let zk = ZooKeeper(zkUrl, sessionTimeout, watcher)
        let! connected = watcher.WaitAsync () |> Async.AwaitTask
        let! newNode = zk.createAsync (newPath, null, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT) |> Async.AwaitTask
        do! zk.closeAsync() |> Async.AwaitTask
    }

// Call this function, in this case synchronously
connectCreateNodeAndDisconnect () |> Async.RunSynchronously

That's nice, we created a node. For fun, you can run this with your Zookeeper node stopped (sudo service zookeeper stop), and the client will sit and wait. It's an async wait – just a TPL task, not a thread blocked on I/O with Zookeeper – although Async.RunSynchronously will block our main thread so the whole application doesn't just exit. Start Zookeeper (sudo service zookeeper start) and the client connects, the Task returned by WaitAsync() completes, and the code moves forward.

You can use a similar pattern to implement other watchers, such as waiting for a node to be created or deleted so you can perform the leader election mentioned above. ZooKeeperNetEx has a companion package, ZooKeeperNetEx.Recipes, which contains several common patterns, like distributed locking and leadership election.  An example of leadership election from these recipes is shown below.

open org.apache.zookeeper
open org.apache.zookeeper.recipes.leader
open org.apache.utils

let ElectLeader (zkUrl:string) =
    let electionNode = "/election" // This is the parent node where this election will occur.
    let sessionTimeout = 10000

    // Connect to Zookeeper and create the node for this election.
    let connectAndSetupElectionAsync () = async {
        let w = WaitToConnect()
        let zk = ZooKeeper(zkUrl, sessionTimeout, w)
        let! connected = w.WaitAsync() |> Async.AwaitTask
        let! electionNodeExists = zk.existsAsync(electionNode) |> Async.AwaitTask
        match electionNodeExists with
        | null -> // When a node doesn't exist, you get a null
            let! electionNodeCreated = zk.createAsync(electionNode, [||], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT) |> Async.AwaitTask
            printfn "Created election node %s" electionNodeCreated
        | _ -> ()
        return zk
    }

    // When we are done, we will need to close.
    let closeAsync (zk:ZooKeeper) = async {
        do! zk.closeAsync() |> Async.AwaitTask
    }

    // Define an async workflow for starting the election.  Each candidate will do this.
    let startElectionAsync (zk:ZooKeeper) hostname = async {
        let election = LeaderElectionSupport(zk, electionNode, hostname)
        do! election.start() |> Async.AwaitTask
        return election
    }

    // Who won the election?
    let getLeaderAsync (election:LeaderElectionSupport) = async {
        return! election.getLeaderHostName() |> Async.AwaitTask
    }

    // We're all done, we can stop the election.
    let stopElectionAsync (election:LeaderElectionSupport) = async {
        do! election.stop() |> Async.AwaitTask
    }

    // Let's hold an election.
    async {
        let! zk = connectAndSetupElectionAsync ()
        let announceWinner candidate = printfn "%s won!" candidate
        // Create 3 candidates; they all run, and they all agree on the winner.
        let candidacy hostname = async {
            let! election = startElectionAsync zk hostname
            let! leader = getLeaderAsync election
            leader |> announceWinner
            do! Async.Sleep(10000)
            do! election |> stopElectionAsync
        }
        do! [candidacy "candidate1"; candidacy "candidate2"; candidacy "candidate3"]
            |> Async.Parallel
            |> Async.Ignore
        do! zk |> closeAsync
    } |> Async.RunSynchronously

If you run the code multiple times, you’ll see that different candidates win each time. There is no reason this code needs to run on the same machine – typically it would run on each member in a cluster, and any node would check to see if it is the leader before performing its leadership duties. If one wants to drop out, even the current leader, it calls stopElectionAsync and one of the other candidates will be chosen.


A few operational things to watch out for when using Zookeeper:

  • Logs – Zookeeper keeps a transaction log for synchronization between nodes, and you need to clean this up or it will fill up the disks. This is not hard to automate, just don't forget to do it.
  • Connection limit – the default is 50 connections from a single IP address. You probably want to share a single connection for your entire application, or at least create some sort of pool.
  • Heavy weight connection – the connection takes time to negotiate – it’s not a stateless protocol like HTTP. Avoid scenarios where you might create a connection per HTTP request or something that you cannot really control, because they can really hinder your application performance if constantly connecting and disconnecting.
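For the log cleanup in particular, Zookeeper (3.4 and later) can purge old snapshots and transaction logs on its own schedule via two settings in zoo.cfg – the values below are illustrative:

# Keep only the 3 most recent snapshots and their transaction logs...
autopurge.snapRetainCount=3
# ...and run the purge task every 24 hours.
autopurge.purgeInterval=24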

.NET in the Linux Ecosystem

Running .NET applications on Linux provides access to APIs and libraries that augment the capabilities of horizontally scaled applications; more importantly, living in the Linux ecosystem provides access to a rich set of tools for distributed application development and management.

You may be a seasoned .NET developer, excited about .NET Core and Mono and being able to take your code over to Linux, where you don't have to pay for the OS license, or even run your code on your toaster. That is exciting, but Linux is also great for software development because of all the software available to scale up your distributed application. I'll discuss the following benefits and show the tools, libraries, and code I'm using for each.

  1. Cluster Coordination with Zookeeper
    a. Distributed semaphore
    b. Leader election
    c. Emerging alternatives – Consul + .NET client
  2. Mesos – a layer above the (V)M
    a. Lifetime and HA for clustered applications
    b. Mono & CoreCLR
    c. Docker
    d. Service discovery & mesos-dns
    e. Windows mesos-slave – work in progress
  3. Storm
    a. Scaling processing horizontally
    b. F# on Storm

All code samples are in F#, as it provides a relatively succinct language for demonstrating the functionality and also has rich async primitives, which are used quite frequently in distributed applications.