So we are now starting to see the problems with the testing tools that are provided by Swift and Xcode. They really only work with the most basic types of async functionality. In particular, where the user does something, async work happens, and then we want to assert on what was changed after that async work.
There are no tools that allow us to deterministically assert on what happens in between units of async work. We have to sprinkle in some Task.yields and hope that is enough, and as we have seen a few times now, often it is not. We should probably be yielding many more times in these tests, and possibly even waiting for a duration of time to pass, which would unfortunately slow down our test suite. And still we could never be 100% certain that the test won’t flake some day.
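To make that concrete, here is roughly the shape these tests have taken so far. This is not the exact code from earlier episodes, just the fragile pattern of kicking off some work, yielding, and hoping the runtime has made enough progress:
func testGetFact() async {
  let model = withDependencies {
    $0.numberFact.fact = { "\($0) is a good number." }
  } operation: {
    NumberFactModel()
  }

  // Kick off the async work the user triggered.
  let task = Task { await model.getFactButtonTapped() }

  // Hope a single yield is enough for the request to be "in flight"… (it often isn't)
  await Task.yield()
  XCTAssertEqual(model.isLoading, true)

  // …then wait for the work to finish and assert on the final state,
  // assuming the count starts at 0.
  await task.value
  XCTAssertEqual(model.isLoading, false)
  XCTAssertEqual(model.fact, "0 is a good number.")
}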
Let’s add another piece of functionality to our feature that is also quite common in real world development, and that’s effect management. We actually have a bug in our code right now: if you tap the “Get fact” button multiple times in quick succession, it is possible to get multiple responses back from the API, and they can arrive completely out of order.
We are going to write a test to prove that this bug exists in our code, and then fix the bug to make sure that our test would have caught the bug in the first place. But first, let’s quickly see how this bug can manifest itself right in the preview.
Now when we run the preview we can tap “Get fact” many times rapidly, and then after a second we will start to see a flood of facts come through. We probably don’t want that behavior. It would be better to cancel any inflight request when a new one starts.
But before doing that let’s write a test to prove that this race condition really does exist in our code.
It’s a little convoluted to do since we need to figure out a way to get two inflight fact requests at the same time, and then have the second one finish before the first one.
I’m just going to paste in the code to show this off:
func testBackToBackGetFact() async throws {
  let fact0 = AsyncStream.makeStream(of: String.self)
  let fact1 = AsyncStream.makeStream(of: String.self)
  let callCount = LockIsolated(0)

  let model = withDependencies {
    $0.numberFact.fact = { number in
      callCount.withValue { $0 += 1 }
      if callCount.value == 1 {
        return await fact0.stream
          .first(where: { _ in true }) ?? ""
      } else if callCount.value == 2 {
        return await fact1.stream
          .first(where: { _ in true }) ?? ""
      } else {
        fatalError()
      }
    }
  } operation: {
    NumberFactModel()
  }

  let task0 = Task { await model.getFactButtonTapped() }
  let task1 = Task { await model.getFactButtonTapped() }
  await Task.yield()
  await Task.yield()
  await Task.yield()
  await Task.yield()
  await Task.yield()
  await Task.yield()
  fact1.continuation.yield("0 is a great number.")
  try await Task.sleep(for: .milliseconds(100))
  fact0.continuation.yield("0 is a better number.")
  await task0.value
  await task1.value
  XCTAssertEqual(model.fact, "0 is a great number.")
}
It’s a lot, but most of it is just necessary to test this kind of nuanced behavior.
We need two streams and their continuations to represent the two different requests for facts. And we need to keep track of some mutable state so that we can figure out whether it’s the first call to the fact endpoint or the second.
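If LockIsolated is new to you, it is a small utility that ships alongside our Dependencies library: a Sendable box that guards its value with a lock. A stripped-down sketch of the idea (not the library’s actual implementation) looks something like this:
import Foundation

// A stripped-down sketch of a LockIsolated-style box: a Sendable,
// lock-guarded wrapper around some mutable state.
final class LockIsolatedSketch<Value>: @unchecked Sendable {
  private var _value: Value
  private let lock = NSLock()

  init(_ value: Value) {
    self._value = value
  }

  var value: Value {
    self.lock.lock()
    defer { self.lock.unlock() }
    return self._value
  }

  func withValue<T>(_ operation: (inout Value) throws -> T) rethrows -> T {
    self.lock.lock()
    defer { self.lock.unlock() }
    return try operation(&self._value)
  }
}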
Then we can emulate the sequence of actions the user performed. They tap the “Get fact” button twice in rapid succession, we wait a moment for the tasks to start, then have the second request finish, followed by the first, and finally we assert on what we expect the fact state to be.
We would hope that the second request wins and updates the model with its fact, even though it emits before the other request, but sadly that is not the case. This test fails:
XCTAssertEqual failed: ("Optional("0 is a better number.")") is not equal to ("Optional("0 is a great number.")")
And it fails 100% of the time. If we run it 100 times we see that it fails every single time:
Test Suite 'Selected tests' failed.
Executed 100 tests, with 100 failures (0 unexpected) in 10.546 (10.582) seconds
So this proves we do have a race condition in our model. We are getting old data put into our model even though the freshest data was already populated.
The fix is to keep track of the inflight request so that we can reference it later:
@MainActor
class NumberFactModel: ObservableObject {
  @Published var factTask: Task<String, Error>?
  …
}
In particular, we can now cancel any inflight task when the “Get fact” button is tapped, and then store the new task when it starts:
func getFactButtonTapped() async {
  self.factTask?.cancel()
  self.isLoading = true
  defer { self.isLoading = false }
  self.fact = nil
  self.factTask = Task {
    try await self.numberFact.fact(self.count)
  }
  defer { self.factTask = nil }
  do {
    self.fact = try await self.factTask?.value
  } catch {
    // TODO: Handle error
  }
}
We can also beef up the increment and decrement logic a bit by cancelling any inflight requests since that data is now going to be stale:
func incrementButtonTapped() {
  self.count += 1
  self.fact = nil
  self.factTask?.cancel()
  self.factTask = nil
}

func decrementButtonTapped() {
  self.count -= 1
  self.fact = nil
  self.factTask?.cancel()
  self.factTask = nil
}
With this little bit of work done in the model we would hope our test passes, and it does!
However, it’s not 100% deterministic. If we run it 100 times we will see it fails about 20 times:
Test Suite 'NumberFactModelTests' failed.
Executed 100 tests, with 20 failures (0 unexpected) in 12.808 (13.138) seconds
So that’s a bummer. We have fixed an actual, real world bug in our feature but we have no way of deterministically testing it.
We could even improve our feature by adding a new button that allows canceling an inflight request. First to our model:
func cancelButtonTapped() {
self.factTask?.cancel()
self.factTask = nil
}
Heck, we can even drive the isLoading state off of factTask now:
// @Published var isLoading = false
var isLoading: Bool { self.factTask != nil }
And stop managing this state explicitly in the getFactButtonTapped method.
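Concretely, that means getFactButtonTapped slims down to something like this, with the isLoading bookkeeping removed and everything else unchanged:
func getFactButtonTapped() async {
  self.factTask?.cancel()
  self.fact = nil
  self.factTask = Task {
    try await self.numberFact.fact(self.count)
  }
  defer { self.factTask = nil }
  do {
    self.fact = try await self.factTask?.value
  } catch {
    // TODO: Handle error
  }
}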
Down in the view we can wire up a cancel button:
Section {
  if self.model.isLoading {
    HStack(spacing: 4) {
      Button("Cancel") {
        self.model.cancelButtonTapped()
      }
      Spacer()
      ProgressView()
        .id(UUID())
    }
  } else {
    Button("Get fact") {
      Task { await self.model.getFactButtonTapped() }
    }
  }
}
We can try running this in the preview and now we can see it works how we expect.
Further, our test suite still passes, even with the changes we made to the isLoading property. But the suite still does not pass 100% deterministically.
Let’s quickly see what it takes to write a test for the new cancellation behavior we implemented. That’s a pretty complex piece of behavior we’ve added, so it would be nice to have some test coverage.
This time we want to construct a NumberFactModel whose fact endpoint simply suspends forever. It never returns any data, and the only way to get it to un-suspend is to cancel its asynchronous context. We want this because we want to show that while the request is in flight we can cancel it, and so we want the request to stay in flight for as long as possible.
We can do this by constructing an async stream that never emits and then for await-ing over it. If the stream ever finishes, it must mean that the surrounding async context was cancelled, and so we can throw a cancellation error:
func testCancel() async throws {
  let factStream = AsyncStream<Never> { _ in }
  let model = withDependencies {
    $0.numberFact.fact = { _ in
      for await _ in factStream {}
      throw CancellationError()
    }
  } operation: {
    NumberFactModel()
  }
}
In fact, this pattern of a forever-suspending-until-cancelled async task is pretty common when testing features, and for that reason our Dependencies library actually ships with a tool that makes it very easy to construct such an async task.
We call it Task.never, and it can be used like so:
func testCancel() async throws {
  let model = withDependencies {
    $0.numberFact.fact = { _ in try await Task.never() }
  } operation: {
    NumberFactModel()
  }
}
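For the curious, a tool like this can be built from the same trick we just wrote by hand: iterate over a stream that never emits, and throw once the surrounding task is cancelled. Here is a rough sketch of the idea, which is not necessarily the exact code that ships in the library:
extension Task where Success == Never, Failure == Never {
  // A sketch of a "never" helper: it suspends until the surrounding task is
  // cancelled, and only then does it throw.
  static func neverEnding<Output>() async throws -> Output {
    // This stream never yields and never finishes on its own…
    for await element in AsyncStream<Output> { _ in } {
      return element
    }
    // …so the only way the loop can end is cancellation of the enclosing
    // task, at which point AsyncStream terminates its iteration.
    throw CancellationError()
  }
}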
With that done I would hope that I could now emulate the user tapping on the “Get fact” button, yielding to allow the task to start, then immediately tapping the “Cancel” button, and seeing that the model’s fact state is still nil:
let task = Task { await model.getFactButtonTapped() }
await Task.yield()
model.cancelButtonTapped()
await task.value
XCTAssertEqual(model.fact, nil)
This does seem to pass sometimes, but if I run it enough times it will eventually just hang. It seems like we are stuck on something.
This is happening because the cancel happened before the fact request had even started. There was nothing to cancel yet, so the request that starts afterwards hangs on the never-ending endpoint, and the await task.value never resumes. Maybe we need a few more yields:
let task = Task { await model.getFactButtonTapped() }
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
model.cancelButtonTapped()
But even that isn’t enough. Maybe I need more:
let task = Task { await model.getFactButtonTapped() }
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
await Task.yield()
model.cancelButtonTapped()
Incredible, but even that isn’t enough. If we run it repeatedly we will see it eventually gets hung up.
For whatever reason the Swift concurrency runtime is not scheduling the task quickly enough for us to cancel it. It’s things like this that led us to define:
let task = Task { await model.getFactButtonTapped() }
for _ in 1...20 {
  await Task.detached(priority: .background) {
    await Task.yield()
  }
  .value
}
model.cancelButtonTapped()
It detaches an unstructured task at a lower priority, yields inside of it, and does that 20 times. Finally this is enough for the task to start up and for us to cancel it.
And sadly this tool is also often needed, and so it might be nice to have it around in a nicer form, which we affectionately call the “mega yield”:
extension Task where Success == Never, Failure == Never {
  static func megaYield() async {
    for _ in 1...20 {
      await Task<Void, Never>
        .detached(priority: .background) {
          await Task.yield()
        }
        .value
    }
  }
}
And now we can simply do:
let task = Task { await model.getFactButtonTapped() }
await Task.megaYield()
model.cancelButtonTapped()
It is worth noting that mega yields can be a little erratic, probably due to the low priority. Sometimes they execute super fast, and sometimes they take quite a bit of time.
We can see that if we run the test suite repeatedly. It does seem to pass 100% of the time, but some tests take a tiny fraction of a second to pass and others take nearly a half second.
If we run the suite 1,000 times we see it takes 16 seconds:
Test Suite 'Selected tests' passed at 2023-05-31 14:34:30.341.
Executed 1000 tests, with 0 failures (0 unexpected) in 16.348 (16.628) seconds
Other tests we’ve been running have been able to run 1,000 times in just a few seconds, or sometimes even less than a second. But because of all these yields we are severely slowing down our test suite.
And of course I don’t have a ton of confidence that this will always pass deterministically, even though it does seem to be passing when run 1,000 times.
And at the end of the day, this is just a hack we are employing to try to get at least some test coverage on our feature’s behavior. And we’ve even seen regressions in concurrency performance in new iOS releases that force us to bump the mega yield up even higher. But, that’s just the cost of dealing with concurrency and the testing tools we have available today.
OK, we have seen quite a few drawbacks to testing async code in Swift. We are adding random yields all over the place just to give the concurrency runtime space to do its job, and we have no confidence that we are even doing it in the right way. Sometimes we only need a few yields, and other times we need a ton of them.
So, all of that is enough to make you throw your hands up in the air and give up on ever getting any kind of meaningful test coverage in your async code, but things get worse. We want to show one more example of how difficult it can be to test async code.
While we do have some async behavior in this feature, we are not using one of the biggest tools in Swift’s concurrency arsenal: async sequences. Async sequences are what unlock the ability to perform a for await, which lets you iterate over a sequence where producing each element can suspend for however long it needs.
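As a quick refresher, here is a tiny, self-contained example of that shape, using an AsyncStream as a stand-in for any async sequence:
// A stream that asynchronously produces a few numbers, with a small delay
// between each one to emphasize that elements can arrive over time.
let numbers = AsyncStream<Int> { continuation in
  Task {
    for n in 1...3 {
      try? await Task.sleep(for: .seconds(1))
      continuation.yield(n)
    }
    continuation.finish()
  }
}

// Each turn of the loop suspends until the next element is available.
for await number in numbers {
  print("Received", number)
}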
Let’s figure out a way to shoehorn some async sequence behavior into this feature and then figure out how to test it.
We are going to subscribe to an async sequence when the view appears, and each time it emits we will increment the counter in our model. But, which sequence should we use? One async sequence that ships with Apple’s platforms comes from the notification center. You can listen for notifications using a simple for await syntax:
func onTask() async {
  for await _ in NotificationCenter.default.notifications(named: <#???#>) {
  }
}
And so what notification should we listen for? Well one simple one is the notification when screenshots are taken on the device:
func onTask() async {
  let screenshots = NotificationCenter
    .default
    .notifications(
      named: UIApplication
        .userDidTakeScreenshotNotification
    )
  for await _ in screenshots {
    self.count += 1
  }
}
And we can call this method from the task view modifier in the view:
.task { await self.model.onTask() }
It’s of course a very silly example, but it gets the point across. We are often going to need to subscribe to async sequences in our applications, and we will want to test how the emissions of those sequences cause our feature’s logic to execute.
Unfortunately we can’t try out this feature in Xcode previews since there is no way to trigger a screenshot in previews. But, we can run it in the simulator, trigger a screenshot and see that the count increases by one.
So, how do we test this behavior?
Well, we can start with a model:
func testScreenshots() async throws {
  let model = NumberFactModel()
}
And this time we don’t need to override any dependencies because we don’t expect any to be used.
Then we can emulate the user coming to the feature by invoking the onTask method:
await model.onTask()
But it is async, and it does suspend basically forever since it subscribes to notification center, and so we actually need to wrap it up in an unstructured task so that it can run concurrently with the rest of the test:
let task = Task { await model.onTask() }
Then we might hope we can immediately post a notification and assert that the count went up by 1:
NotificationCenter.default.post(
  name: UIApplication.userDidTakeScreenshotNotification,
  object: nil
)
XCTAssertEqual(model.count, 1)
And we could try again:
NotificationCenter.default.post(
  name: UIApplication.userDidTakeScreenshotNotification,
  object: nil
)
XCTAssertEqual(model.count, 2)
Well, if we’ve learned anything so far in this episode it is that this is probably too optimistic. If we run this test it of course fails.
This is happening because we are posting notifications before the onTask method can really start up and subscribe to notifications. So those posted notifications go out into the void and no one hears them.
So, just like we fixed all tests before, we need some yields:
let task = Task { await model.onTask() }
await Task.yield()
await Task.yield()
await Task.yield()
However, apparently 3 yields are not enough. Let’s try a mega yield:
let task = Task { await model.onTask() }
await Task.megaYield()
That gets the first assertion to pass, but the last one still fails. Maybe we need multiple mega yields sprinkled throughout the test:
let task = Task { await model.onTask() }
await Task.megaYield()

NotificationCenter.default.post(
  name: UIApplication.userDidTakeScreenshotNotification,
  object: nil
)
await Task.megaYield()
XCTAssertEqual(model.count, 1)

NotificationCenter.default.post(
  name: UIApplication.userDidTakeScreenshotNotification,
  object: nil
)
await Task.megaYield()
XCTAssertEqual(model.count, 2)
And now the test passes, but of course I don’t have a ton of confidence that this test will always pass.
There’s another approach to testing these kinds of things. Rather than mega yielding and hoping the async sequence emitted, we can also suspend until we detect the async sequence emits.
Now, such an operation is surprisingly complicated to build in Swift concurrency. It requires racing two independent async operations and building in support for timeouts so that we don’t just hang the test suite forever if the condition is never met.
We aren’t going to do any of that, and instead do the poor man’s version, which is to perform a while loop with a Task.yield on the inside:
func testScreenshots_Waiting() async throws {
  let model = NumberFactModel()
  let task = Task { await model.onTask() }
  await Task.megaYield()

  NotificationCenter.default.post(
    name: UIApplication.userDidTakeScreenshotNotification,
    object: nil
  )
  while model.count != 1 {
    await Task.yield()
  }
  XCTAssertEqual(model.count, 1)

  NotificationCenter.default.post(
    name: UIApplication.userDidTakeScreenshotNotification,
    object: nil
  )
  while model.count != 2 {
    await Task.yield()
  }
  XCTAssertEqual(model.count, 2)
}
This test passes and I have a lot more confidence in it. We are now explicitly waiting until the model’s count increases, which is mostly due to the async sequence emitting. The only flaky part of the test is the initial mega yield we have to do in order to give the async sequence enough time to start up. And, as we mentioned a moment ago, if we ever introduce a bug into our feature code that prevents the async sequence from emitting, then we will have an infinite loop on our hands that never breaks. That will just leave our test suite hanging, which can be annoying.
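If the possibility of hanging the suite bothers you, one small improvement is to give each of those yield loops a crude deadline so that a broken feature fails the test instead of spinning forever. A sketch of what that could look like for the first assertion:
// Same idea as above, but with a crude deadline so a broken feature fails
// the test instead of hanging the suite. (A sketch, not a library tool.)
let deadline = ContinuousClock().now.advanced(by: .seconds(1))
while model.count != 1 {
  if ContinuousClock().now > deadline {
    XCTFail("Count never became 1 before the timeout.")
    break
  }
  await Task.yield()
}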
And while we are primarily focused on testing right now, we do want to show that these problems can unexpectedly bleed out into non-testing situations too.
For example, many months ago we released a library of new Clock protocol conformances that allow you to write tests against time-based asynchronous code. This includes what we call an “immediate clock”, which is a clock that doesn’t actually suspend when you tell it to sleep, and a “test clock”, which suspends until you advance the clock’s internal time. These clocks allow you to write succinct and expressive tests for really complex behavior.
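For example, a test of a hypothetical model that waits one second before flipping a flag might look something like this with a test clock (the DelayedLoadModel here is made up purely for illustration):
import Clocks  // the swift-clocks package
import XCTest

// A made-up model, purely for illustration: it waits one second, then flips a flag.
@MainActor
final class DelayedLoadModel {
  var isLoaded = false
  let clock: any Clock<Duration>

  init(clock: any Clock<Duration>) {
    self.clock = clock
  }

  func onAppear() async {
    try? await self.clock.sleep(for: .seconds(1))
    self.isLoaded = true
  }
}

@MainActor
final class DelayedLoadTests: XCTestCase {
  func testDelayedLoad() async {
    let clock = TestClock()
    let model = DelayedLoadModel(clock: clock)

    let task = Task { await model.onAppear() }

    // The model stays un-loaded until we explicitly advance the clock.
    await clock.advance(by: .seconds(1))
    await task.value
    XCTAssertTrue(model.isLoaded)
  }
}
Notice there are no explicit yields between starting the task and advancing the clock. That only works because, as we are about to discuss, the clocks do some of that waiting internally.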
However, due to the unpredictability of the concurrency runtime in Swift, we were forced to sprinkle Task.megaYields into a number of places throughout the immediate and test clock code.
This was necessary to make sure that during tests our feature code did not interact with time before a sleep or a timer had actually started, which is the same problem we’ve seen over and over.
But that also had an unintended consequence: it made the immediate clock quite a bit slower than it should be. As the name suggests, an immediate clock should be… well, immediate. But because of the mega yields, each yield can take a noticeable amount of time, and those delays really start to add up. It makes tests slower than they need to be, but it can also affect previews.
Let’s take a quick look at that.
I’m going to start a new file called Countdown.swift and paste in a bunch of code:
import Clocks
import SwiftUI
struct CountdownDemo: View {
  @State var countdown = 10
  @State var isConfettiVisible = false
  let clock: any Clock<Duration>

  init(clock: some Clock<Duration> = ContinuousClock()) {
    self.clock = clock
  }

  var body: some View {
    ZStack {
      if self.isConfettiVisible {
        ForEach(1...100, id: \.self) { _ in
          ConfettiView().offset(
            x: .random(in: -20...20),
            y: .random(in: -20...20)
          )
        }
      }
      Text("\(self.countdown)")
        .font(.system(size: 200).bold())
    }
    .task {
      while true {
        if self.countdown == 0 {
          self.isConfettiVisible = true
          break
        }
        try? await self.clock.sleep(for: .seconds(1))
        self.countdown -= 1
      }
    }
  }
}
struct ParticlesModifier: ViewModifier {
  @State var duration = Double.random(in: 2...5)
  @State var time = 0.0
  @State var scale = 0.3

  func body(content: Content) -> some View {
    content
      .scaleEffect(self.scale)
      .modifier(ParticlesEffect(time: self.time))
      .opacity(1 - self.time / self.duration)
      .onAppear {
        withAnimation(.easeOut(duration: self.duration)) {
          self.time = self.duration
          self.scale = 2.0
        }
      }
  }
}
struct ParticlesEffect: GeometryEffect {
  var direction = Double.random(in: -.pi ... .pi)
  var distance = Double.random(in: 20...400)
  var time: Double

  var animatableData: Double {
    get { self.time }
    set { self.time = newValue }
  }

  func effectValue(size: CGSize) -> ProjectionTransform {
    ProjectionTransform(
      CGAffineTransform(
        translationX: self.distance
          * cos(self.direction)
          * self.time,
        y: self.distance
          * sin(self.direction)
          * self.time
      )
    )
  }
}
struct ConfettiView: View {
  @State var anchor = CGFloat.random(in: 0...1).rounded()
  @State var color = pointFreeColors.randomElement()!
  @State var isAnimating = false
  @State var rotationOffsetX = Double.random(in: 0...360)
  @State var rotationOffsetY = Double.random(in: 0...360)
  @State var speedX = Double.random(in: 0.5...2)
  @State var speedZ = Double.random(in: 0.5...2)

  var body: some View {
    Rectangle()
      .fill(self.color)
      .frame(width: 20, height: 20, alignment: .center)
      .onAppear(perform: { isAnimating = true })
      .rotation3DEffect(
        .degrees(
          rotationOffsetX + (isAnimating ? 360 : 0)
        ),
        axis: (x: 1, y: 0, z: 0)
      )
      .animation(
        .linear(duration: self.speedX)
          .repeatForever(autoreverses: false),
        value: isAnimating
      )
      .rotation3DEffect(
        .degrees(
          rotationOffsetY + (isAnimating ? 360 : 0)
        ),
        axis: (x: 0, y: 0, z: 1),
        anchor: UnitPoint(x: self.anchor, y: self.anchor)
      )
      .animation(
        .linear(duration: self.speedZ)
          .repeatForever(autoreverses: false),
        value: isAnimating
      )
      .modifier(ParticlesModifier())
  }
}
private let pointFreeColors = [
  Color.init(red: 152/255, green: 239/255, blue: 181/255),
  Color.init(red: 252/255, green: 241/255, blue: 143/255),
  Color.init(red: 113/255, green: 201/255, blue: 250/255),
  Color.init(red: 141/255, green: 81/255, blue: 246/255),
]

struct CountdownDemo_Previews: PreviewProvider {
  static var previews: some View {
    CountdownDemo()
  }
}
We don’t really care about most of the details of the code, but I can run this in the preview and we will see it’s just a simple little countdown, and when the count reaches 0 there are some confetti effects. That’s nice.
But what isn’t nice is that every single time I make a change to this file the countdown starts all over again. In fact, I don’t know if you noticed, but when the confetti burst happened the confetti was actually behind the number, not in front. I’d like it to be in front.
We can see the problem really easily. We have a ZStack and the number is on top. So, let’s move it behind:
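Concretely, that just means listing the Text before the confetti, since later children of a ZStack render on top of earlier ones. Something like this, with the rest of the body unchanged:
ZStack {
  Text("\(self.countdown)")
    .font(.system(size: 200).bold())

  if self.isConfettiVisible {
    ForEach(1...100, id: \.self) { _ in
      ConfettiView().offset(
        x: .random(in: -20...20),
        y: .random(in: -20...20)
      )
    }
  }
}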
And now we can run the preview again and see that indeed the confetti is on top. But it’s kind of annoying that we have to wait for the full 10 seconds all over again just to see the effect.
One thing we certainly could do is change the hard coded countdown value so that it starts at a lower number, like 1:
struct CountdownDemo: View {
  @State var countdown = 1
  …
}
But this isn’t great. We are adding a hack to our view just to make the preview more usable. What if we forget to change this value back, commit the code, push it to our repo, and a release goes out to all of our users? That’s a pretty serious bug to ship just so that we could more easily iterate on this feature.
So, that’s not the best way to make this preview more usable. Instead what we can do is use an “immediate clock” to squash all of time into a single instant. So, when you ask it to sleep it just ignores you and lets time zip right by.
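As a tiny illustration of what that means, using the ImmediateClock from our swift-clocks package:
import Clocks

let clock = ImmediateClock()

// Every sleep on an immediate clock resumes right away, so even a "10 second"
// wait collapses into a single instant.
try await clock.sleep(for: .seconds(10))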
And luckily for us this code sample was already built in a way that makes this possible. The view takes an explicit clock as an argument:
struct CountdownDemo: View {
  …
  let clock: any Clock<Duration>

  init(clock: some Clock<Duration> = ContinuousClock()) {
    self.clock = clock
  }
  …
}
And that allows us to use a live, continuous clock in the simulator and on devices, but we can use a different kind of clock in previews and tests.
In particular, we can now use an immediate clock in the preview by passing it explicitly to the view:
struct CountdownDemo_Previews: PreviewProvider {
  static var previews: some View {
    CountdownDemo(clock: .immediate)
  }
}
Now that is much better, but it’s also not quite as “immediate” as I expected. It counts down super fast, but also in a kind of glitchy manner, sometimes pausing for a brief moment.
This is entirely because of the mega yields inside the immediate clock, which are necessary in order to write tests against clocks. If we update the initial countdown to something higher, like 100:
struct CountdownDemo: View {
  @State var countdown = 100
  …
}
We’ll see that it now takes a decent amount of time to fully count down. Definitely faster than a live, continuous clock, but still far too slow. And if I up it to 1,000 then it will take a very, very long time to count down.
So, that was a pretty deep overview of the current state-of-the-art when it comes to testing async code in Swift. We’re not going to mince words here: it can be incredibly frustrating to try to test features that make use of any asynchrony beyond some very simple constructs. And unfortunately we’re not really sure there is much work being done by the core Swift team to improve the situation.
And while what we have built is a pretty silly toy application, we promise that your application has these exact shapes somewhere. You are certainly making async network requests, and maybe you are juggling some state to let the view know when that request is in flight. And maybe you have some additional logic somewhere that determines when one should cancel that request if it is still in flight. And further, as async sequences become more and more ubiquitous in Apple’s APIs, and as Combine begins its slow march towards deprecation, you will be using async sequences a lot more.
As soon as you have some complex and nuanced logic involving asynchronous code you are in for a world of hurt when it comes to testing. You have no choice but to either leave that code untested, or sprinkle in some mega yields or sleeps just to get things passing, and hope it doesn’t flake on your CI server.
So, everything we have encountered so far is what pushed us to start a discussion on the Swift forums asking others how they are going about testing complex, async code. And unfortunately it seems that pretty much everyone is in the same boat. They either don’t test their complex async code, or at least not as deeply as they could, or they insert lots of yields and sleeps to force the concurrency runtime to push forward.
We wanted to find a better way, and around that time Apple had just open sourced their new “async algorithms” package, and it had a couple of interesting, advanced usages of Swift’s underlying concurrency runtime. We dug in a bit deeper and saw that there was maybe something we could leverage to bend the will of the concurrency runtime to our advantage in tests.
So, let’s take a look at Apple’s code and see what clues it gives us in how we can better predict how Swift’s concurrency tools will behave at runtime…next time!