🎉 End-of-year Sale! Save 25% when you subscribe today.

Backtracking Parsers

Wednesday February 9, 2022

Today we are releasing 0.6.0 of our swift-parsing library that changes its backtracking behavior. Most users of the library will not notice a difference, though some of your parsers may start to run slightly faster. If you are interested in why we made this decision, then read on!

What is backtracking?

Backtracking in parsing is the process of restoring the input to its original state when it fails. From the first release of swift-parsing we made each parser responsible for backtracking its input upon failure. For example, the Many parser keeps track of the original input so that if something goes wrong we can restore the input to its original state.

A more complex example is the .flatMap operator. This operator allows you to run one parser after another such that the second parser can use the output of the first to customize its logic. In order to implement this parser with backtracking we need to check if something fails after each step of the sequence so that we can restore the original input:

public func parse(
  _ input: inout Upstream.Input
) -> NewParser.Output? {
  let original = input
  guard
    let newParser = upstream.parse(&input)
      .map(transform)
  else {
    input = original
    return nil
  }
  guard let output = newParser.parse(&input)
  else {
    input = original
    return nil
  }
  return output
}

Why remove backtracking?

Making parsers responsible for backtracking seems reasonable, but in practice was fraught. At best, if done correctly, it meant that your parser was littered with additional logic not related to parsing in order to properly restore the input to its original state. For example, removing backtracking from the .flatMap parser above greatly simplifies the parser’s logic:

public func parse(
  _ input: inout Upstream.Input
) -> NewParser.Output? {
  upstream.parse(&input).map(transform)?.parse(&input)
}

And at worst, backtracking in each parser could could be done incorrectly, or not at all, which would lead to subtle bugs where some parsers backtrack and some do not.

So, this is why we decided to remove the requirement that parsers individually implement backtracking logic. Backtracking is still useful, but if you need backtracking you can use a dedicated parser for it: OneOf. It’s a parser that tries multiple parsers on the same input. If one fails, it backtracks the input to its original state, and if one succeeds, then that output is returned and no other parsers are tried.

This coalesces all backtracking responsibilities into a single parser, which allows you, the library user, to decide when and where you want backtracking.

Does this affect me?

Most likely it does not. If you only use the parsers and operators that ship with the library you will probably not notice any different behavior in your parsers.

If you implement custom parsers, meaning you create new types that conform to the Parser protocol, then you no longer need additional logic for tracking the orginal input and restoring it when parsing fails. You can just leave the partially consumed input as-is.

And regardless of whether you implement custom parsers or not, it can be a good idea to be mindful of how backtracking affects the performance of your parsers. If used naively, backtracking can lead to less performant parsing code. For example, if we wanted to parse two integers from a string that were separated by either a dash “-” or slash “/”, then we could write this as:

OneOf {
  Parse { Int.parser(); "-"; Int.parser() } // 1️⃣
  Parse { Int.parser(); "/"; Int.parser() } // 2️⃣
}

However, parsing slash-separated integers is not going to be performant because it will first run the entire 1️⃣ parser until it fails, then backtrack to the beginning, and run the 2️⃣ parser. In particular, the first integer will get parsed twice, unnecessarily repeating that work.

On the other hand, we can factor out the common work of the parser and localize the backtracking OneOf work:

Parse {
  Int.parser()
  OneOf { "-"; "/" }
  Int.parser()
}

This is a much more performant parser.

Try it today

Update your projects to use 0.6.0 of swift-parsing today, and let us know if you encountered any strange behavior that you did not expect.

Get started with our free plan

Our free plan includes 1 subscriber-only episode of your choice, access to 64 free episodes with transcripts and code samples, and weekly updates from our newsletter.

View plans and pricing