TestDispatcher: Become the Clock Master | by Michał Klimczak | Jun, 2022

A deep dive into the subtleties of testing Kotlin coroutines.


Even if you are fluent in Kotlin coroutines, you might still find it difficult to test them. Concurrency is just inherently hard to reason about, especially if you’re aiming for 100% deterministic behavior, which is necessary in testing.

Version 1.6 of kotlinx.coroutines introduced a lot of changes to the coroutine testing environment. runTest is not a simple renaming of runBlockingTest: fundamental changes have been introduced to how tests work. This won’t be another article about the changes between 1.5 and 1.6, though; there have been a plethora of these lately. Instead, we will focus on a few not-so-obvious TestDispatcher characteristics that might have you pulling your hair out in frustration.

So, in this article you will find some theory:

  • What is the difference between a task being scheduled and executed?
  • What are the main characteristics of StandardTestDispatcher and UnconfinedTestDispatcher and how are they different?
  • How can you control the virtual time of a TestDispatcher (with visualization)?

And subtle TestDispatcher traps explained in code:

  • How never-ending coroutines will prevent a test from finishing.
  • How UnconfinedTestDispatcher might mess up your tests in an unexpected way.
  • How you might need an extra 1 ms using advanceTimeBy.

Please note that this article is based on the state of affairs in Kotlin 1.7.0. Most of the APIs used here are still under @ExperimentalCoroutinesApi opt-in, so they are subject to change in the future. Also, we will use Turbine and MockK in these examples to simplify test cases.

Before we get to the practical examples, let’s make sure we understand what TestDispatcher is and why it’s the very heart of coroutine testing.

TestDispatcher is nothing more than a special case of CoroutineDispatcher (so a cousin of Dispatchers.Main, Dispatchers.IO, etc.). Like any other dispatcher, it is attached to a CoroutineScope and its job is to orchestrate the execution of the coroutines launched in that scope. The difference is that, unlike with a regular dispatcher, we get to control this orchestration via TestCoroutineScheduler: its virtual time and the manual execution of scheduled tasks.

Scheduling and execution of coroutines

One thing that really helped me understand how TestDispatchers work was realizing the difference between a task being scheduled and executed. In production code, this distinction isn’t that important because everything happens on a real clock and, in the vast majority of situations, scheduled tasks are executed right away. With TestDispatchers, the difference becomes apparent. Basically, even if we’re considering a task that is scheduled to execute in this exact millisecond, it does not mean that its subscribers will receive it immediately. It needs to be executed by the CoroutineDispatcher first. A quick look at the TestCoroutineScheduler source reveals a lot:

public class TestCoroutineScheduler {

    /** This heap stores the knowledge about which dispatchers are interested in which moments of virtual time. */
    private val events = ThreadSafeHeap<TestDispatchEvent<Any>>()

    /** Establishes that [currentTime] can't exceed the time of the earliest event in [events]. */
    private val lock = SynchronizedObject()

    /** This counter establishes some order of the events that happen at the same virtual time. */
    private val count = atomic(0L)

    // ... (rest of the class omitted)
}

These 3 properties tell us the following:

  • events: Dispatchers use the scheduler to register events — specific moments of time they are interested in.
  • count: Even if there are multiple events scheduled at the same virtual time, there is a mechanism to ensure their deterministic order (although it is not used by UnconfinedTestDispatcher).
  • lock: It’s guaranteed that if the virtual clock is moved past the time scheduled for the task — that task is executed.

This will all become easier to understand with the timeline example below.
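To make the scheduled-versus-executed distinction concrete, here is a toy model of such a scheduler. It is only a sketch for illustration, not the real TestCoroutineScheduler: the ToyScheduler class and its schedule function are made up here, and the real implementation is thread-safe and works with dispatchers rather than plain lambdas.

```kotlin
import java.util.PriorityQueue

// Toy model only: NOT the real TestCoroutineScheduler.
class ToyScheduler {
    private data class Event(val time: Long, val count: Long, val task: () -> Unit)

    // Events are ordered by virtual time first, then by registration order.
    private val events = PriorityQueue<Event>(compareBy<Event>({ it.time }, { it.count }))
    private var count = 0L

    var currentTime = 0L
        private set

    // Scheduling only registers interest in a moment of virtual time.
    fun schedule(delayMs: Long, task: () -> Unit) {
        events += Event(currentTime + delayMs, count++, task)
    }

    // Executes everything that is due at the current virtual time.
    fun runCurrent() {
        while (true) {
            val next = events.peek() ?: return
            if (next.time > currentTime) return
            events.poll().task()
        }
    }

    // Moves the clock, executing only tasks scheduled strictly before the target.
    fun advanceTimeBy(delayMs: Long) {
        val target = currentTime + delayMs
        while (true) {
            val next = events.peek()
            if (next == null || next.time >= target) break
            events.poll()
            currentTime = next.time
            next.task()
        }
        currentTime = target
    }
}

fun main() {
    val scheduler = ToyScheduler()
    val log = mutableListOf<String>()
    scheduler.schedule(0) { log += "first" }
    scheduler.schedule(0) { log += "second" }
    println(log) // [] : both tasks are scheduled, none is executed yet
    scheduler.runCurrent()
    println(log) // [first, second] : same virtual time, registration order kept
}
```

The counter plays the same role as count in the excerpt above: it breaks ties between events registered for the same virtual moment.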

StandardTestDispatcher vs UnconfinedTestDispatcher

There are only two test dispatchers used in tests. StandardTestDispatcher is the default provided whenever we use runTest. It has strict guarantees about the execution order of the tasks. However, the execution is not eager, i.e., we need to use runCurrent to trigger it at the current moment of virtual time or advance the time manually with advanceTimeBy or advanceUntilIdle.

UnconfinedTestDispatcher is eager: it does not need poking with the runCurrent stick. Coroutines launched on it start executing right away, and when one of them is resumed it continues immediately as well. The downside is that it does not guarantee the execution order of several coroutines queued in it. It basically works like Dispatchers.Unconfined, but driven by virtual time. This can cause a lot of confusion, as will be explained below.
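A small sketch of that difference, assuming kotlinx-coroutines-test 1.6 or newer on the classpath. The function name eagerVersusLazy and the log strings are made up for illustration.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.UnconfinedTestDispatcher
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest

@OptIn(ExperimentalCoroutinesApi::class)
fun eagerVersusLazy(): List<String> {
    val log = mutableListOf<String>()
    runTest { // uses StandardTestDispatcher by default
        // Only scheduled: the body does not run until we ask for it.
        launch { log += "standard" }
        log += "after standard launch"
        runCurrent() // now "standard" is executed

        // Sharing testScheduler keeps both dispatchers on the same virtual clock.
        launch(UnconfinedTestDispatcher(testScheduler)) { log += "unconfined" }
        log += "after unconfined launch" // "unconfined" has already run by now
    }
    return log
}
```

Under these assumptions the log should read: after standard launch, standard, unconfined, after unconfined launch.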

Moving the virtual clock hand

With StandardTestDispatcher we can precisely control the execution of scheduled coroutines via one of 3 methods:

  • runCurrent: will execute any tasks scheduled at the current moment of virtual time.
  • advanceTimeBy: advances the virtual time by the given number of milliseconds and executes any tasks scheduled in the meantime.
  • advanceUntilIdle: works similarly to advanceTimeBy, but instead of advancing virtual time by a specific number of milliseconds, it advances it until there are no more tasks scheduled in the queue.

Now let’s visualize this virtual timeline. Say, we have 4 tasks scheduled:

A: scheduled at 0 ms (immediately).
B: scheduled at 1000 ms.
C: scheduled at 1000 ms, but registered after B.
D: scheduled at 2000 ms.

Let’s see how each method will affect the timeline.

runCurrent() will not move the virtual time, but will execute task A, which is scheduled at the current time (0 ms).

advanceTimeBy(1000) will move the clock by 1000 ms and will execute task A, which was scheduled in the meantime. It will not, however, execute B and C yet.

To execute them, we need to explicitly call runCurrent() after advanceTimeBy(). This will be shown in code in one of the examples below. Please note that because B was registered before C, that order will be maintained for execution by StandardTestDispatcher. UnconfinedTestDispatcher would not guarantee that.

And finally, we can call advanceUntilIdle(), which will advance the time by 2000 ms, i.e., until all the currently scheduled tasks are executed.
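The same timeline can be replayed in code. This is a sketch assuming kotlinx-coroutines-test 1.6 or newer; the tasks are stand-ins that only record their letter, and timelineSnapshots is a made-up helper name.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.advanceTimeBy
import kotlinx.coroutines.test.advanceUntilIdle
import kotlinx.coroutines.test.currentTime
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest

@OptIn(ExperimentalCoroutinesApi::class)
fun timelineSnapshots(): List<String> {
    val executed = mutableListOf<String>()
    val snapshots = mutableListOf<String>()
    runTest {
        fun snapshot(step: String) {
            snapshots += "$step @ $currentTime ms: $executed"
        }

        launch { executed += "A" }              // due at 0 ms
        launch { delay(1000); executed += "B" } // due at 1000 ms
        launch { delay(1000); executed += "C" } // due at 1000 ms, registered after B
        launch { delay(2000); executed += "D" } // due at 2000 ms

        runCurrent()        // runs A; the others start and suspend in delay
        snapshot("runCurrent")
        advanceTimeBy(1000) // clock moves to 1000 ms, B and C are not run yet
        snapshot("advanceTimeBy(1000)")
        runCurrent()        // B then C, in registration order
        snapshot("runCurrent")
        advanceUntilIdle()  // runs D and leaves the clock at 2000 ms
        snapshot("advanceUntilIdle")
    }
    return snapshots
}
```

With the default StandardTestDispatcher the snapshots should read: [A] at 0 ms, [A] at 1000 ms, [A, B, C] at 1000 ms and [A, B, C, D] at 2000 ms.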

Now that we have the basics out of the way, let’s take a look at a few subtleties that have made quite a few grown men scream at their computers.

Never-ending coroutines will prevent a test from finishing

Let’s consider a MediaPlayer class that launches sound playback.

Sometimes, the coroutine that is supposed to play the sound throws an exception, possibly due to a corrupted file. MediaPlayer will notify its clients about these errors and we can test that behavior.
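A minimal sketch of what such a class and its test could look like. Everything here is an assumption made for illustration: the SoundPlayer interface, the use of IOException for a corrupted file, the buffered SharedFlow, and the firstPlayerError helper. It uses first() instead of Turbine to stay dependency-free.

```kotlin
import java.io.IOException
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.flow.SharedFlow
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest

// Hypothetical collaborator that actually plays the sound.
interface SoundPlayer {
    suspend fun playSound()
}

// Hypothetical shape of the class described in the text.
class MediaPlayer(
    private val scope: CoroutineScope,
    private val soundPlayer: SoundPlayer,
) {
    private val _playerErrors = MutableSharedFlow<Throwable>(extraBufferCapacity = 1)
    val playerErrors: SharedFlow<Throwable> = _playerErrors

    fun play() {
        scope.launch {
            try {
                soundPlayer.playSound()
            } catch (e: IOException) {
                _playerErrors.emit(e) // notify clients instead of crashing
            }
        }
    }
}

@OptIn(ExperimentalCoroutinesApi::class)
fun firstPlayerError(): String? {
    var message: String? = null
    runTest {
        val failing = object : SoundPlayer {
            override suspend fun playSound() {
                throw IOException("corrupted file")
            }
        }
        val sut = MediaPlayer(this, failing)
        launch { message = sut.playerErrors.first().message }
        runCurrent() // let the collector subscribe before anything is emitted
        sut.play()
        runCurrent() // run the playback coroutine and deliver the error
    }
    return message
}
```

Because first() completes after one item, no coroutine is left running when the test body ends.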

So far, so good. Now let’s assume that apart from delivering these errors to the UI layer, we also want to report them to Crashlytics and we have a special IssueReporter class to handle it. We can just observe the playerErrors Flow and report them.

This test will run for one minute and then time out with:

Caused by: kotlinx.coroutines.test.UncompletedCoroutinesError: After waiting for 60000 ms, the test coroutine is not completing, there were active child jobs: ["coroutine#4":StandaloneCoroutine{Active}@36790bec]

Why is that? Because we leaked a coroutine job running on the test scope: obviously playerErrors.collect. Test dispatchers make sure that if there’s an active coroutine, runTest will block until all of its children are finished (or until timeout, which is 60 s by default and can be changed with runTest(dispatchTimeoutMs=x)).

This is actually quite handy: we leaked a resource, tests told us about it and now we need to clean it up. So let’s add a dispose() method that will take care of it by canceling the scope. This should fix it, right?

Nope. Well, yes, the leak is fixed, but the test still fails:

TestScopeImpl was cancelled
kotlinx.coroutines.JobCancellationException: TestScopeImpl was cancelled; job=TestScope[test ended]

Test scopes don’t like being canceled. What we can do is make sure its child jobs are canceled so that it can die peacefully. If we replace sut.dispose() with this.coroutineContext.cancelChildren(), the test will pass.

Personally, I don’t find this line of code very self-explanatory, so I like to wrap it into a function for the reader to know what’s happening.

private fun TestScope.cancelNeverEndingCoroutines() = this.coroutineContext.cancelChildren()
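Put together, a passing version of the test could look roughly like the sketch below. The IssueReporter constructor, the report callback (standing in for Crashlytics) and the reportedErrors helper are assumptions, not the article's actual classes.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.cancelChildren
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest

// Hypothetical reporter: collects errors forever and forwards them.
class IssueReporter(
    scope: CoroutineScope,
    errors: Flow<Throwable>,
    report: (Throwable) -> Unit,
) {
    init {
        // A SharedFlow never completes, so this coroutine never ends on its own.
        scope.launch { errors.collect { report(it) } }
    }
}

@OptIn(ExperimentalCoroutinesApi::class)
fun reportedErrors(): List<String> {
    val reported = mutableListOf<String>()
    runTest {
        val playerErrors = MutableSharedFlow<Throwable>(extraBufferCapacity = 1)
        IssueReporter(this, playerErrors) { reported += it.message.orEmpty() }
        runCurrent() // start the collector so it is subscribed

        playerErrors.tryEmit(IllegalStateException("corrupted file"))
        runCurrent() // deliver the buffered error to the collector

        // Without this line runTest would wait for the collector until timeout.
        coroutineContext.cancelChildren()
    }
    return reported
}
```

Canceling only the children leaves the TestScope itself alive, so runTest can complete normally.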

Note that this behavior is a bit controversial, so it’s possible it will change in the future. See, e.g., https://github.com/Kotlin/kotlinx.coroutines/issues/1531.

UnconfinedTestDispatcher will mess up conflated StateFlow emissions

This actually shouldn’t come as a surprise because it’s documented. However, the nondeterministic nature of UnconfinedTestDispatcher (and of the original Dispatchers.Unconfined too, fwiw) can be pretty subtle. It’s sometimes useful because it saves us all the runCurrent calls, but from time to time it can blow up in our faces.

Let’s use the MediaPlayer example again.

If run in a production app, this piece of code will properly emit Playing, then play the sound, and then turn back to Stopped. Things get a bit weird when we test that and mock playSound to be a 0 ms coroutine.

We expect the player state to switch to Playing. But this test will time out after 60 seconds — the second awaitItem() will suspend forever. Why is that?

For two reasons combined:

  • StateFlow is allowed to conflate emissions. A collector only ever receives the latest value, and if that value equals the one it saw last (like two Stopped one after another), nothing is emitted to it.
  • UnconfinedTestDispatcher, citing the docs, “does not provide guarantees about the execution order when several coroutines are queued in this dispatcher.”

Now, we have launched from the scope with an unconfined dispatcher and have run 3 suspend functions inside of it.

scope.launch {
    playerState.emit(PlayerState.Playing)
    soundPlayer.playSound()
    playerState.emit(PlayerState.Stopped)
}

Even though a lot happened between sut.play() and awaitItem(), our collector (Turbine’s sut.playerState.test) missed the whole show. It still only sees PlayerState.Stopped.

It’s easily fixed by replacing UnconfinedTestDispatcher with StandardTestDispatcher, which guarantees that the second awaitItem() will wait for playerState.emit(PlayerState.Playing) and only after that will it resume.

This example is here to show that problems with UnconfinedTestDispatcher are not always super obvious like expecting events in an A-B-C-D order and receiving A-C-B-D. Combined with the rest of the coroutines machinery, like StateFlow’s implicit conflation, it can get really obscure.
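The effect can be reproduced without Turbine or MockK. The sketch below makes several assumptions: the MediaPlayer shape is hypothetical, playSound is an empty lambda standing in for the 0 ms mock, and the collector runs on Dispatchers.Unconfined to approximate how Turbine observes the flow. Only the dispatcher of the player's scope is varied.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.StandardTestDispatcher
import kotlinx.coroutines.test.TestCoroutineScheduler
import kotlinx.coroutines.test.TestDispatcher
import kotlinx.coroutines.test.UnconfinedTestDispatcher
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest

enum class PlayerState { Playing, Stopped }

// Hypothetical player with the launch block quoted above.
class MediaPlayer(
    private val scope: CoroutineScope,
    private val playSound: suspend () -> Unit,
) {
    val playerState = MutableStateFlow(PlayerState.Stopped)

    fun play() {
        scope.launch {
            playerState.emit(PlayerState.Playing)
            playSound()
            playerState.emit(PlayerState.Stopped)
        }
    }
}

@OptIn(ExperimentalCoroutinesApi::class)
fun observedStates(
    playerDispatcher: (TestCoroutineScheduler) -> TestDispatcher,
): List<PlayerState> {
    val seen = mutableListOf<PlayerState>()
    runTest {
        val sut = MediaPlayer(CoroutineScope(playerDispatcher(testScheduler))) {
            // 0 ms "sound": returns without suspending
        }
        // Collects eagerly, similar in spirit to Turbine's collector.
        val collector = launch(Dispatchers.Unconfined) {
            sut.playerState.collect { seen += it }
        }
        sut.play()
        runCurrent() // needed only when the player uses StandardTestDispatcher
        collector.cancel()
    }
    return seen
}
```

As far as I can tell, observedStates { UnconfinedTestDispatcher(it) } yields only [Stopped], while observedStates { StandardTestDispatcher(it) } yields [Stopped, Playing, Stopped]. In the unconfined case the collector's resumption is queued behind the still-running play coroutine, and by the time it runs the value is back to Stopped.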

TestScope.advanceTimeBy will not execute coroutines scheduled exactly at the new virtual time, only those scheduled earlier

This is a confusing difference between the old 1.5 TestCoroutineScope and the new 1.6 TestScope. The new one moves the virtual time and executes the tasks scheduled before the target moment, but not the ones scheduled exactly at it. The old one would additionally call runCurrent. This is of course documented:

In contrast with TestCoroutineScope.advanceTimeBy, this function does not run the tasks scheduled at the moment currentTime + delayTimeMillis.

In practice, this new behavior means that after advanceTimeBy we have to call runCurrent explicitly or just advance the time a tiny bit more (by 1 ms).

Let’s see an example.

With our old friend MediaPlayer, let’s assume that we want to make sure we won’t increase playbackCounter until the playback has actually finished. The test below will let us do just that:

We make the mock sound player run for 1000 ms so that after 500 ms we can check that the counter is still 0. Now, let’s prepare another test to make sure that after it’s finished playing, it really does change to 1.

Unfortunately, the assertion for this test fails. It would pass with the deprecated runBlockingTest, but nowadays, we need to be explicit about the execution of scheduled tasks. We can fix the test by adding runCurrent() just after advanceTimeBy(1000) or (which is a bit less elegant, I think) by replacing it with advanceTimeBy(1001).
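A compact way to see all four variants side by side. This is a sketch assuming kotlinx-coroutines-test 1.6 or newer; the delay(1000) stands in for the mocked 1000 ms playSound, and counterAfter is a made-up helper rather than the article's test.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.advanceTimeBy
import kotlinx.coroutines.test.runCurrent
import kotlinx.coroutines.test.runTest

@OptIn(ExperimentalCoroutinesApi::class)
fun counterAfter(advanceMs: Long, alsoRunCurrent: Boolean = false): Int {
    var observed = -1
    runTest {
        var playbackCounter = 0
        launch {
            delay(1000)       // stand-in for a sound that plays for 1000 ms
            playbackCounter++ // resumes at virtual time 1000 ms
        }
        advanceTimeBy(advanceMs)
        if (alsoRunCurrent) runCurrent()
        // Read the counter here: runTest runs leftover tasks after the body ends.
        observed = playbackCounter
    }
    return observed
}
```

Here counterAfter(500) and counterAfter(1000) should both return 0, while counterAfter(1000, alsoRunCurrent = true) and counterAfter(1001) should return 1.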

There are a few other confusing behaviors in the coroutine testing framework, like Dispatchers.setMain implicitly providing a test dispatcher for all the default test (although I couldn’t imagine a convincing broken test for that). If you found a fragile coroutine test scenario yourself, let me know in the comments.

However, overall, the changes in 1.6 are a huge step forward and make testing concurrency more predictable. For the cases where they don’t, I hope these examples will help someone pull a little less hair from their head.


Also, thanks Artur Klamborowski for your help with this article.
