And we think this is pretty incredible. We now have the basic infrastructure of printers in place. At its core it is just a protocol that exposes a single method for turning some type of output back into an input, and that process can fail. It is further expected that this print requirement plays nicely with the corresponding parse requirement, which consumes some input to turn it into an output, and that process can also fail.
This expectation is something we called “round-tripping”. If you parse an input into an output and print that output back into an input, then you should roughly get the same input as what you started with. Further, if you print an output to an input and parse that input to an output, then you should again roughly get the same output as what you started with. We say roughly because parsing and printing can be lossy operations, but the underlying content of the input or output should be unchanged.
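To make the round-tripping expectation concrete, here is a minimal sketch. The protocol shapes and the `BoolParserPrinter` and `roundTrip` names are simplified assumptions for illustration, not the library's exact declarations.

```swift
// Hypothetical, simplified protocols mirroring the description above.
protocol Parser {
  associatedtype Input
  associatedtype Output
  // Consumes some input to produce an output; can fail.
  func parse(_ input: inout Input) throws -> Output
}

protocol Printer {
  associatedtype Input
  associatedtype Output
  // Turns an output back into input; can also fail.
  func print(_ output: Output, to input: inout Input) throws
}

struct ParsingError: Error {}

// A tiny parser-printer for the literals "true" and "false".
struct BoolParserPrinter: Parser, Printer {
  func parse(_ input: inout Substring) throws -> Bool {
    if input.starts(with: "true") {
      input.removeFirst(4)
      return true
    } else if input.starts(with: "false") {
      input.removeFirst(5)
      return false
    }
    throw ParsingError()
  }

  func print(_ output: Bool, to input: inout Substring) throws {
    input.append(contentsOf: output ? "true" : "false")
  }
}

// Parse an input, then print the result into a fresh, empty input.
func roundTrip(_ string: String) throws -> String {
  let parserPrinter = BoolParserPrinter()
  var input = Substring(string)
  let output = try parserPrinter.parse(&input)
  var printed: Substring = ""
  try parserPrinter.print(output, to: &printed)
  return String(printed)
}
```

For this simple type the round trip is exact. For lossier parsers, such as one that skips whitespace, the printed input would only match the original in its underlying content.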
With this infrastructure we’ve already converted a pretty impressive parser to also be a printer. We had a parser that could process a list of comma-separated fields into an array of tuples that represent a user’s id, name and admin status, and now it can print an array of tuple values back into a comma-separated string of users.
Further, there is some very nuanced logic embedded in this parser-printer, where we want to be able to parse quoted name fields so that we can allow for commas in the name, but at the same time we want to prefer printing without the quotes if the name doesn’t contain a comma. This logic can be tricky to get right, and previously we had it scattered in two different places: once for the parser and once for the printer. And now it’s all in just one place.
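The quoting rule can be sketched by hand as follows. This is not the combinator-based parser-printer built in the episodes. `NameField` and `FieldError` are hypothetical names, and the sketch assumes names never contain a quote character.

```swift
struct FieldError: Error {}

// Hypothetical type holding both directions of the name-field logic in one place.
struct NameField {
  let quote: Character = "\""
  let comma: Character = ","

  func parse(_ input: inout Substring) throws -> String {
    if input.first == quote {
      // Quoted field: everything up to the closing quote, commas included.
      let rest = input.dropFirst()
      guard let close = rest.firstIndex(of: quote) else { throw FieldError() }
      input = rest[rest.index(after: close)...]
      return String(rest[..<close])
    } else {
      // Unquoted field: everything up to the next comma or the end of input.
      let end = input.firstIndex(of: comma) ?? input.endIndex
      let name = String(input[..<end])
      input = input[end...]
      return name
    }
  }

  func print(_ output: String, to input: inout Substring) throws {
    // Assumption of this sketch: a name containing a quote cannot be printed.
    guard !output.contains(quote) else { throw FieldError() }
    if output.contains(comma) {
      // Quotes are needed so the comma is not read as a field separator.
      input.append(quote)
      input.append(contentsOf: output)
      input.append(quote)
    } else {
      // Prefer the unquoted form whenever it is unambiguous.
      input.append(contentsOf: output)
    }
  }
}
```

Note that parsing `"Blob"` with quotes and printing the result yields `Blob` without them, which is one of the lossy cases that "roughly" round-tripping allows for.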
Now, we could keep moving forward by converting more and more parser combinators to be printer combinators, which would allow us to build up more and more complex printers, but there are some problems we need to address first. In order for us to get as far as we did with printers in the previous episodes, we needed to make some simplifying assumptions that have greatly reduced the generality that our library aspires to have.
The first problem is that in a few places we had to needlessly constrain our printer conformances because their inputs were far too general for us to know how to print to them. For example, in the Prefix printer we just assumed that we were working on substrings rather than any collection, and for the integer and boolean parsers we assumed we were working on UTF8Views rather than any collection of UTF8 code units. This unnecessarily limits the number of situations we can make use of these parsers.
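The shape of that constraint can be sketched as follows. This `Prefix` is a simplified stand-in written for illustration, and `PrintingError` is a hypothetical name. Parsing works for any collection that is its own subsequence, while printing is only available when the input is a `Substring`.

```swift
struct PrintingError: Error {}

// Simplified stand-in: parsing is generic over any self-slicing collection.
struct Prefix<Input: Collection> where Input.SubSequence == Input {
  let predicate: (Input.Element) -> Bool

  func parse(_ input: inout Input) -> Input {
    let output = input.prefix(while: predicate)
    input = input[output.endIndex...]
    return output
  }
}

// Printing is constrained: a bare Collection has no way to be appended to,
// so this sketch only knows how to print into a Substring.
extension Prefix where Input == Substring {
  func print(_ output: Substring, to input: inout Substring) throws {
    // Only print values that the parser would have accepted.
    guard output.allSatisfy(predicate) else { throw PrintingError() }
    input.append(contentsOf: output)
  }
}

// Parsing UTF-8 code units works...
var bytes = Array("Blob,true".utf8)[...]
let bytePrefix = Prefix<ArraySlice<UInt8>>(predicate: { $0 != UInt8(ascii: ",") })
let parsedBytes = bytePrefix.parse(&bytes)
// ...but `bytePrefix.print` does not exist, because Input is not Substring.

// With Substring, both directions are available.
var text: Substring = "Blob,true"
let textPrefix = Prefix<Substring>(predicate: { $0 != "," })
let parsedText = textPrefix.parse(&text)
var printedText: Substring = ""
try! textPrefix.print(parsedText, to: &printedText)
```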
The second problem we ran into was that we couldn’t figure out how to make the map operation into a printer. The types just didn’t line up correctly, so we abandoned it and decided not to use map anywhere in our parser-printers. This means that we aren’t actually parsing and printing User structs, but rather we are really dealing with just tuples of the data. We currently don’t have a printer-friendly way of bundling up the data in those tuples into a proper User struct. Losing map due to printing is a pretty big deal as it is a fundamental operation for massaging parser outputs into more well-structured data, so ideally we can recover this operation somehow.
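The mismatch can be seen in a small sketch. The `Parser` protocol, `Map`, `Digits`, and `UserID` below are simplified, hypothetical stand-ins, with `UserID` playing the role of the User struct to keep the example short.

```swift
// Hypothetical, simplified protocol for illustration.
protocol Parser {
  associatedtype Input
  associatedtype Output
  func parse(_ input: inout Input) throws -> Output
}

struct ParsingError: Error {}

// Parses a run of ASCII digits into an Int.
struct Digits: Parser {
  func parse(_ input: inout Substring) throws -> Int {
    let digits = input.prefix(while: { $0 >= "0" && $0 <= "9" })
    guard let value = Int(digits) else { throw ParsingError() }
    input = input[digits.endIndex...]
    return value
  }
}

// Stand-in for a well-structured type such as a User struct.
struct UserID: Equatable {
  var rawValue: Int
}

struct Map<Upstream: Parser, NewOutput>: Parser {
  let upstream: Upstream
  let transform: (Upstream.Output) -> NewOutput

  func parse(_ input: inout Upstream.Input) throws -> NewOutput {
    transform(try upstream.parse(&input))
  }

  // A printer would need a method shaped like:
  //
  //   func print(_ output: NewOutput, to input: inout Upstream.Input) throws
  //
  // but the upstream printer expects an Upstream.Output, and `transform`
  // only goes from Upstream.Output to NewOutput. Nothing here can turn
  // a NewOutput back into an Upstream.Output, so the types do not line up.
}

var row: Substring = "42,Blob"
let userID = try! Map(upstream: Digits(), transform: UserID.init(rawValue:)).parse(&row)
```

Parsing works as expected, but there is no way to write the printing direction from the one-way `transform` function alone.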