Fault Localisation with Tarantula

Photo by Anthony Martino on Unsplash

Sometimes unit tests fail, and you don’t know why. That’s when fault localisation comes in: it helps you find the fault that’s causing the tests to fail. Tarantula is one such algorithm; it determines which lines are most likely to be breaking the tests. I implemented this algorithm and used solidity-coverage results to localise bugs in Ethereum smart contracts.

Let’s start at the beginning: the motivation for fault localisation. Imagine you’ve just developed a new feature and you run your test suite. The goal? To make sure that the feature you just built didn’t break any existing functionality.

Unfortunately, the results don’t come back positive; a few tests failed. Sigh… You get your rubber ducky ready and prepare yourself to go in, debug, and figure out what’s wrong.

Does this story sound familiar to you? Maybe it does, or perhaps it doesn’t. You might find a use for fault localisation either way.

In this article, we will look at the Tarantula fault localisation algorithm. Tarantula is an algorithm that looks at the results for a test suite and at the lines covered by each test. Based on that, it’ll determine the suspiciousness of each line.

💡 Suspiciousness is a number between 0.0 and 1.0; a higher number means a higher chance that the line caused a test to fail.

The Algorithm

The paper Empirical Evaluation of the Tarantula Automatic Fault-Localization Technique presents the Tarantula fault localisation method and evaluates how well it works.

The idea behind the algorithm is simple: for each line in a program, we look at the test cases that cover it and compare the share of failing test cases that cover the line with the share of passing ones. The more a line is covered by failing tests relative to passing tests, the higher its suspiciousness.

The following formulas describe how we compute that value, the suspiciousness:
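As defined in the Tarantula paper, with passed(s) and failed(s) denoting the number of passing and failing test cases that execute line s, and totalpassed and totalfailed the number of passing and failing test cases in the whole suite, the two formulas can be written as:

```latex
\[
\mathrm{hue}(s) =
  \frac{\dfrac{\mathrm{passed}(s)}{\mathrm{totalpassed}}}
       {\dfrac{\mathrm{passed}(s)}{\mathrm{totalpassed}} + \dfrac{\mathrm{failed}(s)}{\mathrm{totalfailed}}}
\qquad
\mathrm{suspiciousness}(s) = 1 - \mathrm{hue}(s)
\]
```

For example, a line covered by 1 of 2 passing tests and by the only failing test has hue 0.5 / (0.5 + 1) = 1/3 and suspiciousness 2/3.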

💡 Hue shows how “unsuspicious” a line is! That’s why we need to invert it.
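To make the computation concrete, here is a minimal sketch in JavaScript. It is not the code of tarantula-fl; the function name `tarantulaSuspiciousness` and the input shape (an array of tests, each with a `passed` flag and the `lines` it covers) are assumptions made for this illustration.

```javascript
// Hypothetical helper for illustration; not the tarantula-fl API.
// tests: array of { passed: boolean, lines: number[] }
function tarantulaSuspiciousness(tests) {
  const totalPassed = tests.filter(test => test.passed).length
  const totalFailed = tests.length - totalPassed

  // Count, per line, how many passing and failing tests cover it.
  const counts = new Map()
  for (const test of tests) {
    for (const line of new Set(test.lines)) {
      const count = counts.get(line) || { passed: 0, failed: 0 }
      if (test.passed) {
        count.passed += 1
      } else {
        count.failed += 1
      }
      counts.set(line, count)
    }
  }

  const scores = new Map()
  for (const [line, count] of counts) {
    const passRatio = totalPassed > 0 ? count.passed / totalPassed : 0
    const failRatio = totalFailed > 0 ? count.failed / totalFailed : 0
    // A covered line always has passRatio + failRatio > 0.
    const hue = passRatio / (passRatio + failRatio)
    scores.set(line, 1 - hue)
  }
  return scores
}

const scores = tarantulaSuspiciousness([
  { passed: true, lines: [1, 2, 3] },
  { passed: true, lines: [1, 2] },
  { passed: false, lines: [1, 3] }
])
console.log(scores)
```

Here line 3 comes out as the most suspicious (about 0.67), line 1 gets 0.5, and line 2, which only passing tests cover, gets 0.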

What can we do with suspiciousness?

You can do two things with suspiciousness, and both are implemented by a VSCode plugin that we made: vscode-tarantula.

1. 🌈 Colour lines based on their suspiciousness/hue
2. 🔢 Rank lines based on their suspiciousness

Option 2 turns out to work rather well. You’ll get a list of the most suspect lines in a set of smart contracts, and you can browse through them one by one.
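Ranking itself is just a sort. The sketch below assumes the scores are available as a plain object mapping a location string such as `Token.sol:42` to a suspiciousness value; that shape, the contract names, and the name `rankLines` are made up for illustration and are not the output format of tarantula-fl or vscode-tarantula.

```javascript
// Hypothetical helper: returns the `limit` most suspicious locations,
// most suspicious first.
function rankLines(scores, limit = 10) {
  return Object.entries(scores)
    .map(([location, suspiciousness]) => ({ location, suspiciousness }))
    .sort((a, b) => b.suspiciousness - a.suspiciousness)
    .slice(0, limit)
}

const ranked = rankLines({
  'Token.sol:12': 0.1,
  'Token.sol:42': 0.9,
  'Vault.sol:7': 0.5
})
console.log(ranked.map(entry => entry.location))
```

Starting at the top of such a list and working down is usually quicker than reading every covered line in turn.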

💪 Using Tarantula fault localisation, you’re bound to find the bug in no time!

Take it for a spin!

There are two easy ways that you can get started using tarantula fault localisation.

The first is the library tarantula-fl, which implements the Tarantula fault localisation algorithm.

The second is vscode-tarantula, an add-on for VSCode that uses the results from the fault localisation algorithm to highlight the areas of the code that are likely to cause a failing test suite.

The Library

🔗 GitHub - JoranHonig/tarantula: Implementation of the tarantula fault localisation algorithm

Using the library is very straightforward; the following example should give you a good idea of how to use it:

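```javascript
const tarantula = require('tarantula-fl')

// exampleTestResult (Mocha test results) and exampleCoverage
// (solidity-coverage output) are placeholders that you load yourself.
const testData = {
  testResults: tarantula.TestData.fromMocha(exampleTestResult),
  coverage: tarantula.TestData.fromSolCover(exampleCoverage)
}

// Compute the suspiciousness scores from the test results and coverage.
const score = tarantula.Tarantula.tarantulaScore(testData)
```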

Using the add-on for VSCode is even easier. There are two steps that you need to take:

1. Install the add-on: Tarantula (Visual Studio Marketplace)
2. Run solidity-coverage with the `--matrix` option

If there is a failing test suite, then you’ll be able to access two features.

1. The first one is a colour gradient indicating the suspiciousness of a line.
2. Second, you’ll be able to use the tarantula add-on pane (see bottom left of the VSCode screenshot) to navigate between the lines that are the most likely to cause a test suite to fail.
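The colour gradient can be approximated with an HSL hue that runs from green (not suspicious) through yellow to red (very suspicious), in the spirit of the original Tarantula visualisation. This is a sketch under that assumption; `suspiciousnessToColour` is a hypothetical helper, and vscode-tarantula may pick its colours differently.

```javascript
// Hypothetical helper: maps a suspiciousness value to a CSS colour string.
function suspiciousnessToColour(suspiciousness) {
  // Clamp to [0, 1] so out-of-range input still yields a valid colour.
  const clamped = Math.min(1, Math.max(0, suspiciousness))
  // Tarantula's hue is the inverse of suspiciousness.
  const hue = 1 - clamped
  // HSL: 0 degrees is red, 60 is yellow, 120 is green.
  return `hsl(${Math.round(hue * 120)}, 100%, 50%)`
}

console.log(suspiciousnessToColour(0.5))
```

A line with suspiciousness 1.0 maps to pure red and a line with 0.0 to pure green, with yellow in the middle.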

Fin

• Thanks to cgewecke, who was insanely helpful and added the detailed coverage reports necessary for fault localisation to `solidity-coverage`!