Introduction
Pretty incredible. More than 10 times faster than the substring parser. This really does show that the performance gains to be had by working on UTF-8 can be truly substantial. It's the difference between being able to process more than 50 megabytes of logs per second and being able to process a measly 4 megabytes of logs per second.
So this is pretty huge. We have found that our parser type is so general and so composable that we are not only able to parse at the low UTF-8 level in order to get huge performance gains, but we can also fluidly move between abstraction levels allowing us to choose the right balance of correctness and speed. This is absolutely incredible, and honestly it’s not something we’ve really ever seen in other parsing libraries.
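To make the idea of moving between abstraction levels concrete, here is a minimal sketch, assuming the episode's `Parser` is roughly a struct wrapping a `run` function (the `intSubstring` and `intUTF8` names are illustrative, not the library's own): the same shape of parser can consume Unicode-correct `Substring` input or raw `Substring.UTF8View` bytes.

```swift
// A minimal sketch of a parser type generic over its input.
// The real library's Parser has more machinery; this is illustrative only.
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

// An integer-prefix parser at the Substring level: Unicode-aware,
// grapheme-by-grapheme processing.
let intSubstring = Parser<Substring, Int> { input in
  let digits = input.prefix(while: { $0.isNumber })
  guard let n = Int(digits) else { return nil }
  input.removeFirst(digits.count)
  return n
}

// The same parser at the UTF-8 level: compares raw bytes against
// ASCII "0"..."9" (48...57), skipping grapheme breaking entirely.
let intUTF8 = Parser<Substring.UTF8View, Int> { input in
  var value = 0
  var length = 0
  for byte in input {
    guard (48...57).contains(byte) else { break }
    value = value * 10 + Int(byte - 48)
    length += 1
  }
  guard length > 0 else { return nil }
  // Substring.UTF8View is its own SubSequence, so removeFirst(_:)
  // can cheaply slice the consumed bytes off the front.
  input.removeFirst(length)
  return value
}

var substringInput: Substring = "123 hello"
var utf8Input = substringInput.utf8
print(intUTF8.run(&utf8Input) as Any)  // Optional(123)
```

Because both parsers share one generic shape, combinators written once compose at either level, which is what lets us pick the right trade-off between correctness and speed per parser.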
And as hard as it may be to believe, it gets even better. There is one more change we can make to our parser library that unlocks the final level of performance, and it will bring the performance of our parsers within a very slim margin of hand-rolled, ad hoc, imperative parsers, which are widely believed to be the fastest way to write parsers, even if they are a bit messy.
To see where this last performance gain could be hiding, let's run one of our benchmarks in the Time Profiler instrument and see what it exposes to us.
Exercises
To better understand the difference between the Parser struct and ParserProtocol, let's take a look at the stack of a deeply-nested parser. Print out the stack frames (using Thread.callStackSymbols) inside both Parser.int and IntParser and run only the nested benchmarks. You can reduce the number of iterations of a benchmark by passing the Iterations(1) option to the suite:

```swift
let protocolSuite = BenchmarkSuite(
  name: "Protocol",
  settings: Iterations(1)
) { suite in
  // …
}
```

How do the stacks compare? How do they differ from the stack when viewed in the debugger and the Time Profiler instrument?
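The stack-probing half of this exercise can be sketched in isolation like so. This is a minimal, self-contained sketch: the `level1`/`level2`/`level3` functions are hypothetical stand-ins for layers of nested parser combinators; in the actual exercise you would place the `Thread.callStackSymbols` print inside `Parser.int`'s and `IntParser`'s run functions instead.

```swift
import Foundation

// Stand-ins for deeply nested parser calls: each level adds a stack
// frame, just as each layer of parser composition does.
func level3() -> [String] { Thread.callStackSymbols }
func level2() -> [String] { level3() }
func level1() -> [String] { level2() }

// Print every symbol so the frame depth can be compared across
// implementations (and against the debugger/Time Profiler views).
for frame in level1() {
  print(frame)
}
```

Counting only the frames that mention your parsing functions (rather than runtime or benchmark-harness frames) gives the "parser-specific" numbers the solution below compares.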
Solution
Int.parser's call stack array contains 27 frames of parsing code. IntParser, on the other hand, has only 5 parser-specific frames. Quite the difference!

In the debugger and Time Profiler instrument, Int.parser's parsing-specific stack is 37 frames, so we're seeing that Swift was able to optimize 10 frames out. Meanwhile, IntParser shows 20 frames in the debugger, which means it optimized 15 frames out. Pretty impressive!
References
Why Combine has so many Publisher types
Thomas Visser • Thursday Jul 4, 2019
A detailed article on the technique of "operator fusion" that Combine employs.

An operator fusion primer
Jasdev Singh • Wednesday Apr 1, 2020
A detailed article on the technique of "operator fusion" that Combine employs.

swift-benchmark
Google • Friday Mar 13, 2020
A Swift library for benchmarking code snippets, similar to google/benchmark.

UTF-8
Michael Ilseman • Wednesday Mar 20, 2019
Swift 5 made a fundamental change to the String API, making the preferred encoding UTF-8 instead of UTF-16. This brings many usability and performance improvements to Swift strings.

Strings in Swift 4
Ole Begemann • Monday Nov 27, 2017
An excerpt from the book Advanced Swift that provides a deep discussion of the low-level representations of Swift strings. Although it pre-dates the transition of strings to UTF-8 in Swift 5, it is still a factually correct accounting of how to work with code units in strings.

Improve performance of Collection.removeFirst(_:) where Self == SubSequence
Stephen Celis • Tuesday Jul 28, 2020
While researching the string APIs for this episode we stumbled upon a massive inefficiency in how Swift implements removeFirst on certain collections. This PR fixes the problem, turning the method from an O(n) operation (where n is the length of the array) into an O(k) operation (where k is the number of elements being removed).