Cognitive system testing: Smoke testing

Part 2 of the Cognitive System Testing series, originally posted in 2016 on IBM Developer.

Introduction

A cognitive system, like most software systems, is a collection of modular components, each with varying degrees of complexity. One quick way of testing a cognitive system is to use a “smoke test”, which tells you if the system is behaving so poorly that it may as well be on fire. The smoke test suite should tell you if the system as a whole is responsive to user inputs and if the inner components are generally speaking to each other.

With the whole system at your fingertips, you may want to make your tests as comprehensive as possible. Resist the urge! Our ideal testing approach is layered, and the smoke test is just the tip of the spear. The smoke test should give you just enough information to decide whether to run the rest of your test suites (you do have multiple suites, don’t you?). In fact, in our Watson solutions, our smoke tests don’t even verify that Watson provides the correct answer to a question, just that Watson provides an answer!

Motivation

With any build process (hopefully an automated one), software that does not compile will not be delivered. Compile-time errors are fairly easy to catch and provide a fail-fast mechanism. If all of your software modules compile, that is a good start, but the modules still may not run together. Smoke testing is a way of quickly flushing out batches of runtime errors by forcing the components to talk to each other.

A good smoke test suite

Your smoke test suite should aim for the extreme version of the 80/20 rule: write as few tests as possible to cover the major integration points within your application. Some of our Watson solutions started with only one test in their test suite, asking a single question of the Watson system. A medical Watson solution started with just a handful: each test covered one of the major disease types handled by the system, with varied mock patient data to cover interesting patient characteristics (gender, age classes, etc.). Your solution may need a few more test cases. The important thing is to hit each runtime component. In the Watson solutions described above, this included a UIMA pipeline, a REST layer, a database, and a machine learning model.

I mentioned earlier that our smoke tests only verify that Watson provides an answer, not the answer. This is an important distinction. By allowing some freedom in the responses, we prevent the smoke test from being brittle. As previously noted, Watson systems are probabilistic, non-deterministic systems, and that can play havoc with a strict test. We rely on other test suites to verify that the system is fully functioning.
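To make the "an answer, not the answer" idea concrete, here is a minimal Python sketch. It is not our actual Watson test code: the `ask` callable and the response shape (a mapping with an `answers` list whose items carry a `text` field) are assumptions made purely for illustration.

```python
from typing import Any, Callable, Mapping


def smoke_check(ask: Callable[[str], Mapping[str, Any]], question: str) -> bool:
    """Return True if the system produced any non-blank answer to the question.

    `ask` is a hypothetical function that sends one question to the system
    under test and returns its parsed response.
    """
    try:
        response = ask(question)
    except Exception:
        # Any runtime failure between components counts as smoke.
        return False

    # Assumed response shape: {"answers": [{"text": "..."}, ...]}
    answers = response.get("answers", [])

    # Tolerant check: at least one answer with non-blank text, whatever it says.
    return any(str(answer.get("text", "")).strip() for answer in answers)
```

A real suite would pass in a function that calls the deployed system through its REST layer. Because the check only asks whether some answer came back, it stays stable even when the wording or ranking of answers shifts between builds.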

In case of errors

Ideally all of our tests pass. If the smoke test suite fails, STOP: it’s as bad as if the system is on fire! A smoke test failure should prevent the other test suites from running and should start an immediate triage process. Our smoke test suites include a series of log scanners which look for error and exception messages and send a detailed email to all of the people who contributed to the current failed build, and this failure is immediately given high priority. Other channels, such as SMS alerts and Slack channel notifications, are also great for communicating this failure.
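The log scanning and gating described above can be sketched in a few lines of Python. This is an illustrative sketch, not our production scanner: the `*.log` file pattern, the `ERROR|Exception` regular expression, and the function names are all assumptions.

```python
import re
from pathlib import Path
from typing import List

# Assumed convention: problem lines contain "ERROR" or an exception class name.
ERROR_PATTERN = re.compile(r"ERROR|Exception")


def scan_logs(log_dir: str) -> List[str]:
    """Collect every matching line from *.log files under log_dir, recursively."""
    findings = []
    for path in sorted(Path(log_dir).rglob("*.log")):
        with path.open(errors="replace") as handle:
            for line in handle:
                if ERROR_PATTERN.search(line):
                    findings.append(f"{path.name}: {line.strip()}")
    return findings


def should_run_remaining_suites(smoke_passed: bool, log_dir: str) -> bool:
    """Gate the later suites: run them only if the smoke test passed cleanly."""
    return smoke_passed and not scan_logs(log_dir)
```

A build script could call `should_run_remaining_suites` right after the smoke tests finish and skip straight to triage when it returns `False`.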

Example failure email

The smoke test for build 20160916_0830 on system node123 failed with several errors.

grep -R ERROR /solution/20160916_0830/logs found:
install.log:   ERROR     Could not load messages.properties
runtime.log:   ERROR     Failed to initialize module ComponentA due to IllegalArgumentException
testcase.log:  ERROR     Question 1 did not return a response

The following code updates were added to the build:
abcd1234 (johndoe) Integrate ComponentA into build
efgh5678 (janedoe) Fix NullPointerException in installer

This email includes the key details: which build was installed, where it was installed, where the logs can be found, and what indications of failure exist.
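As a rough illustration of how such a message might be assembled, the Python sketch below builds the email body from those four pieces of information. The function name and parameters are hypothetical, and actually sending the message (by email, SMS, or Slack) is deliberately left out.

```python
from typing import Sequence


def format_failure_report(
    build_id: str,
    host: str,
    log_dir: str,
    findings: Sequence[str],
    commits: Sequence[str],
) -> str:
    """Build a plain-text failure report: what, where, which logs, which changes."""
    lines = [
        f"The smoke test for build {build_id} on system {host} failed.",
        "",
        f"Errors found under {log_dir}:",
        *findings,
        "",
        "The following code updates were added to the build:",
        *commits,
    ]
    return "\n".join(lines)
```

Keeping the report as a plain string makes it easy to reuse the same text across whichever notification channels your team prefers.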

Conclusion

A smoke test suite needs to quickly hit the major integration points and code paths in your application and determine whether the application is in a seriously failed state. The smoke test suite should be small, run quickly, and tolerate variation in the system’s output. It is just the first part of your overall testing solution, and it can leave the detailed verification to the other suites.