Episode #119: Parser Combinators Recap: Part 1

Episode #119 • Oct 5, 2020 • Subscriber-Only

It’s time to revisit one of our favorite topics: parsing! We want to discuss lots of new parsing topics, such as generalized parsing, performance, reversible parsing and more, but before all of that we will start with a recap of what we have covered previously, and make a few improvements along the way.

Introduction

Today we will pick up a topic that we last covered over a year ago, and it’s one of the most popular series of episodes we have ever done. And that’s parsing.

We spent 9 episodes exploring the problem space of parsing. First we gave a concise definition of what it means to parse something in general, and looked at the tools Apple gives us in its frameworks for parsing. Then we gave a proper functional programming definition of a parser, and of course it basically boils down to a single function signature. With that function signature we were able to define all the usual functional operators that we have grown to love, such as map, zip and flatMap, and each one had a very precise job in breaking a large, complex problem down into smaller ones. And then finally we discussed “parser combinators”, which are custom little functions that take parsers as input and produce parsers as output, and these allowed us to take parsing to the next level.
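To make that summary a little more concrete, here is a minimal sketch of what such a setup could look like. The text above does not spell out the signature, so the shape of `Parser`, and the helper names `int`, `literal` and `oneOf`, are assumptions chosen for illustration rather than the exact code from the earlier episodes.

```swift
// Assumed shape: a parser consumes a prefix of the input and returns a value,
// or returns nil and leaves the input untouched.
struct Parser<Output> {
  let run: (inout Substring) -> Output?
}

extension Parser {
  // map: transform a parser's output without changing what it consumes.
  func map<NewOutput>(_ transform: @escaping (Output) -> NewOutput) -> Parser<NewOutput> {
    Parser<NewOutput> { input in self.run(&input).map(transform) }
  }

  // flatMap: use the first output to decide which parser runs next.
  func flatMap<NewOutput>(_ next: @escaping (Output) -> Parser<NewOutput>) -> Parser<NewOutput> {
    Parser<NewOutput> { input in
      let original = input
      guard let output = self.run(&input) else { return nil }
      guard let newOutput = next(output).run(&input) else {
        input = original // undo the first parser's consumption on failure
        return nil
      }
      return newOutput
    }
  }
}

// zip: run two parsers in sequence and keep both outputs.
func zip<A, B>(_ a: Parser<A>, _ b: Parser<B>) -> Parser<(A, B)> {
  Parser<(A, B)> { input in
    let original = input
    guard let x = a.run(&input) else { return nil }
    guard let y = b.run(&input) else {
      input = original
      return nil
    }
    return (x, y)
  }
}

// A small parser: reads leading ASCII digits as a non-negative Int.
let int = Parser<Int> { input in
  let digits = input.prefix(while: { $0.isASCII && $0.isNumber })
  guard let value = Int(digits) else { return nil }
  input.removeFirst(digits.count)
  return value
}

// A small parser: consumes an exact string from the front of the input.
func literal(_ string: String) -> Parser<Void> {
  Parser<Void> { input in
    guard input.hasPrefix(string) else { return nil }
    input.removeFirst(string.count)
    return ()
  }
}

// A combinator: takes parsers as input and produces a parser that tries each in turn.
func oneOf<A>(_ parsers: [Parser<A>]) -> Parser<A> {
  Parser<A> { input in
    for parser in parsers {
      if let value = parser.run(&input) { return value }
    }
    return nil
  }
}

// Gluing the small pieces together: two integers separated by "," or ";".
let separator = oneOf([literal(","), literal(";")])
let pair: Parser<(Int, Int)> = zip(int, zip(separator, int)).map { ($0.0, $0.1.1) }

var input: Substring = "12,34 rest"
let result = pair.run(&input)
// result is (12, 34) and input is now " rest"
```

Each piece knows how to do one tiny job, and the operators decide how those jobs are sequenced. Taking the input as `inout Substring` is one possible design: it lets a parser report both its result and how much it consumed without returning a second value.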

However, even though the things we covered previously were quite powerful, they don’t even begin to scratch the surface of parsing. There is so much we want to cover.

First, we want to improve the ergonomics of the parsers that we have defined so far, because we haven’t given much attention to that yet and there are a few sharp edges right now.

Next, we want to show how to generalize parsing so that we can parse more than just strings. We’d like to be able to parse any kind of data type.

And then, amazingly, the very act of generalizing our parser will give us just the tools we need to fine-tune the performance of our parsers. If done correctly they can be nearly as efficient as hand-rolled parsers, which is pretty amazing.

And then finally, we want to show how to flip parsing on its head by describing what it means to “unparse” something. That is, if you were to “unparse” a parsed result, and then re-parse it, you should get back to where you started. That may not sound very interesting, but trust us, it’s surprising and amazing to see, and we don’t want to give away any spoilers right now.
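Without spoiling anything, the round-trip property described above can be sketched with a pair of plain functions. The names `parseInt` and `printInt` are hypothetical and only handle non-negative integers; this is an illustration of the property, not the design the series will arrive at.

```swift
// Hypothetical parser: reads leading ASCII digits from the input as an Int.
func parseInt(_ input: inout Substring) -> Int? {
  let digits = input.prefix(while: { $0.isASCII && $0.isNumber })
  guard let value = Int(digits) else { return nil }
  input.removeFirst(digits.count)
  return value
}

// Hypothetical "unparser": writes a value back onto the front of the input.
func printInt(_ value: Int, into output: inout Substring) {
  output.insert(contentsOf: String(value), at: output.startIndex)
}

// Round trip: unparse a value, re-parse it, and end up where we started.
var text: Substring = " apples"
printInt(42, into: &text)        // text is now "42 apples"
let reparsed = parseInt(&text)   // reparsed is 42, text is " apples" again
```

Note that the property runs in one direction: unparsing a value and re-parsing it recovers the value, but parsing a string such as "007" and unparsing the result would not reproduce the original text.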

So, there’s still a ton to cover, and we’ll get there in the coming months, but right now we want to take a step back and give a little recap on everything we covered before. There are a lot of people out there that haven’t watched our parser episodes, though we highly encourage you to, and others that have watched may have gotten a bit rusty. This episode will quickly summarize everything we covered in the past 9 episodes to get us all on the same foundation from which we can build much bigger concepts. It won’t have a ton of new content for those who are already familiar with parsing, but we will make a few tweaks to the previous code in order to improve its ergonomics along the way, so hopefully everyone will get at least something from this episode.

So let’s get started!

What is a parser?

References

Combinators
Daniel Steinberg • Sep 14, 2018
Daniel gives a wonderful overview of how the idea of “combinators” infiltrates many common programming tasks.
“Just as with OO, one of the keys to a functional style of programming is to write very small bits of functionality that can be combined to create powerful results. The glue that combines the small bits are called Combinators. In this talk we’ll motivate the topic with a look at Swift Sets before moving on to infinite sets, random number generators, parser combinators, and Peter Henderson’s Picture Language. Combinators allow you to provide APIs that are friendly to non-functional programmers.”
https://vimeo.com/290272240

Parser Combinators in Swift
Yasuhiro Inami • May 2, 2016
At the first-ever try! Swift conference, Yasuhiro Inami gives a broad overview of parsers and parser combinators, and shows how they can accomplish very complex parsing.
“Parser combinators are one of the most awesome functional techniques for parsing strings into trees, like constructing JSON. In this talk from try! Swift, Yasuhiro Inami describes how they work by combining small parsers together to form more complex and practical ones.”
https://academy.realm.io/posts/tryswift-yasuhiro-inami-parser-combinator/

Regex
Alexander Grebenyuk • Aug 10, 2019
This library for parsing regular expression strings into a Swift data type uses many of the ideas developed in our series of episodes on parsers. It’s a great example of how to break a very large, complex problem into many tiny parsers that glue back together.
https://github.com/kean/Regex

Regexes vs Combinatorial Parsing
Soroush Khanlou • Dec 3, 2019
In this article, Soroush Khanlou applies parser combinators to a real world problem: parsing notation for a music app. He found that parser combinators improved on regular expressions not only in readability, but in performance!
http://khanlou.com/2019/12/regex-vs-combinatorial-parsing/

Learning Parser Combinators With Rust
Bodil Stokke • Apr 18, 2019
A wonderful article that explains parser combinators from start to finish. The article assumes you are already familiar with Rust, but it is possible to look past the syntax and see that there are many shapes in the code that are similar to what we have covered in our episodes on parsers.
https://bodil.lol/parser-combinators/

Sparse
John Patrick Morgan • Jan 12, 2017
A parser library built in Swift that uses many of the concepts we cover in our series of episodes on parsers.
“Sparse is a simple parser-combinator library written in Swift.”
https://github.com/johnpatrickmorgan/Sparse

parsec
Daan Leijen, Paolo Martini, Antoine Latter
Parsec is one of the first and most widely used parsing libraries, built in Haskell. It’s built on many of the same ideas we have covered in our series of episodes on parsers, but using some of Haskell’s most powerful type-level features.
http://hackage.haskell.org/package/parsec

Parse, don’t validate
Alexis King • Nov 5, 2019
This article demonstrates that parsing can be a great alternative to validating. When validating you often check for certain requirements of your values, but don’t have any record of that check in your types. Parsing, by contrast, allows you to upgrade the types to something more restrictive so that you cannot misuse the value later on.
https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/

Ledger Mac App: Parsing Techniques
Chris Eidhof & Florian Kugler • Aug 26, 2016
In this free episode of Swift Talk, Chris and Florian discuss various techniques for parsing strings as a means to process a ledger file. It contains a good overview of various parsing techniques, including parser grammars.
https://talk.objc.io/episodes/S01E13-parsing-techniques
