Test Sizes
December 13th, 2010 | Published in Google Testing
by Simon Stewart
What do you call a test that tests your application through its UI? An end-to-end test? A functional test? A system test? A Selenium test? I’ve heard them all, and more. I reckon you have too. Tests running against less of the stack? The same, equally frustrating, inconsistency. Just what, exactly, is an integration test? A unit test? How do we name these things?
Gah!
It can be hard to persuade your own team to settle on a shared understanding of what each name actually means. The challenge increases when you encounter people from another team or project who are using different terms than you. More (less?) amusingly, you and that other team may be using the same term for different test types. “Oh! That kind of integration test?” Two teams separated by a common jargon.
Double gah!
The problem with naming test types is that the names tend to rely on a shared understanding of what a particular phrase means. That leaves plenty of room for fuzzy definitions and confusion. There has to be a better way. Personally, I like what we do here at Google and I thought I’d share that with you.
Googlers like to make decisions based on data, rather than just relying on gut instinct or something that can’t be measured and assessed. Over time we’ve come to agree on a set of data-driven naming conventions for our tests. We call them “Small”, “Medium” and “Large” tests. They differ like so:
Feature | Small | Medium | Large |
Network access | No | localhost only | Yes |
Database | No | Yes | Yes |
File system access | No | Yes | Yes |
Use external systems | No | Discouraged | Yes |
Multiple threads | No | Yes | Yes |
Sleep statements | No | Yes | Yes |
System properties | No | Yes | Yes |
Time limit (seconds) | 60 | 300 | 900+ |
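One way to make the table concrete is to write it down as data. The sketch below is only an illustration: the `Capability` and `TestSize` names, and the `smallestAllowing` helper, are made up for this example and are not taken from any Google code. It treats "Discouraged" as allowed, and treats the Large limit of 900+ as a starting point rather than a cap.

```java
import java.util.EnumSet;
import java.util.Set;

/** Things a test might do, one per row of the table (network split in two). */
enum Capability {
    NETWORK_LOCALHOST, NETWORK_REMOTE, DATABASE, FILE_SYSTEM,
    EXTERNAL_SYSTEMS, MULTIPLE_THREADS, SLEEP, SYSTEM_PROPERTIES
}

/** Hypothetical encoding of the size table; all names are illustrative. */
enum TestSize {
    // Small tests may do none of the listed things.
    SMALL(60, EnumSet.noneOf(Capability.class)),
    // Medium tests may do everything except talk to non-local hosts.
    // External systems are "Discouraged", modeled here as allowed.
    MEDIUM(300, EnumSet.complementOf(EnumSet.of(Capability.NETWORK_REMOTE))),
    // Large tests may do anything; the table says 900+, so 900 is a floor.
    LARGE(900, EnumSet.allOf(Capability.class));

    private final int timeLimitSeconds;
    private final Set<Capability> allowed;

    TestSize(int timeLimitSeconds, Set<Capability> allowed) {
        this.timeLimitSeconds = timeLimitSeconds;
        this.allowed = allowed;
    }

    int timeLimitSeconds() {
        return timeLimitSeconds;
    }

    boolean allows(Capability capability) {
        return allowed.contains(capability);
    }

    /** Returns the smallest size whose rules permit everything in needed. */
    static TestSize smallestAllowing(Set<Capability> needed) {
        for (TestSize size : values()) {
            if (size.allowed.containsAll(needed)) {
                return size;
            }
        }
        throw new AssertionError("unreachable: LARGE allows everything");
    }
}
```

With this in place, `TestSize.smallestAllowing(EnumSet.of(Capability.DATABASE))` picks `MEDIUM`, which matches reading the Database row of the table from left to right.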
Going into the pros and cons of each type of test is a whole other blog entry, but it should be obvious that each type of test fulfills a specific role. It should also be obvious that this doesn’t cover every possible type of test that might be run, but it certainly covers most of the major types that a project will run.
A Small test equates neatly to a unit test, a Large test to an end-to-end or system test and a Medium test to tests that ensure that two tiers in an application can communicate properly (often called an integration test).
The major advantage that these test definitions have is that it’s possible to get the tests to police these limits. For example, in Java it’s easy to install a security manager for use with a test suite (perhaps using @BeforeClass) that is configured for a particular test size and disallows certain activities. Because we use a simple Java annotation to indicate the size of the test (with no annotation meaning it’s a Small test as that’s the common case), it’s a breeze to collect all the tests of a particular size into a test suite.
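To give a flavour of the annotation half of that, here is a minimal sketch. The post doesn't show the annotation we actually use, so `@Sized`, `Size` and `SizeFilter` are hypothetical names, and the test classes are empty placeholders. The enforcement half is left out, since `SecurityManager` has been deprecated for removal since Java 17.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

enum Size { SMALL, MEDIUM, LARGE }

/** Hypothetical marker annotation; kept at runtime so reflection can see it. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Sized {
    Size value();
}

final class SizeFilter {
    private SizeFilter() {}

    /** A class with no annotation counts as SMALL, the common case. */
    static Size sizeOf(Class<?> testClass) {
        Sized sized = testClass.getAnnotation(Sized.class);
        return sized == null ? Size.SMALL : sized.value();
    }

    /** Collects the candidate classes that have the wanted size, in order. */
    static List<Class<?>> select(Size wanted, List<Class<?>> candidates) {
        List<Class<?>> suite = new ArrayList<>();
        for (Class<?> candidate : candidates) {
            if (sizeOf(candidate) == wanted) {
                suite.add(candidate);
            }
        }
        return suite;
    }
}

// Placeholder test classes; real ones would contain test methods.
class ParserTest {}

@Sized(Size.MEDIUM)
class UserDaoTest {}

@Sized(Size.LARGE)
class CheckoutFlowTest {}
```

Calling `SizeFilter.select(Size.SMALL, ...)` on those three classes returns only `ParserTest`, because the missing annotation defaults to Small. A real suite builder would hand the selected classes to whatever test runner the project uses.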
We place other constraints, which are harder to define, around the tests. These include a requirement that tests can be run in any order (they frequently are!), which in turn means that tests need high isolation: you can’t rely on some other test leaving data behind. That’s sometimes inconvenient, but it makes it significantly easier to run our tests in parallel. The end result: we can build test suites easily, and run them consistently and as fast as possible.
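The any-order requirement is easy to illustrate in miniature. The sketch below is not how our test infrastructure works; `ShuffledRunner` and `CartTests` are invented names, and the "cart" is just a list. It shows the shape of the idea: every test builds its own fixture, so a runner can shuffle the order and still get the same result.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical runner that executes tests in a seed-dependent order. */
final class ShuffledRunner {
    private ShuffledRunner() {}

    /** Runs every test once and returns the names of the ones that failed. */
    static List<String> run(Map<String, Runnable> tests, long seed) {
        List<String> names = new ArrayList<>(tests.keySet());
        Collections.shuffle(names, new Random(seed));
        List<String> failures = new ArrayList<>();
        for (String name : names) {
            try {
                tests.get(name).run();
            } catch (AssertionError | RuntimeException e) {
                failures.add(name);
            }
        }
        return failures;
    }
}

/** Example tests: each one creates its own cart rather than sharing one. */
final class CartTests {
    private CartTests() {}

    static Map<String, Runnable> all() {
        Map<String, Runnable> tests = new LinkedHashMap<>();
        tests.put("startsEmpty", () -> {
            List<String> cart = new ArrayList<>(); // fresh fixture
            check(cart.isEmpty());
        });
        tests.put("addStoresItem", () -> {
            List<String> cart = new ArrayList<>(); // fresh fixture
            cart.add("book");
            check(cart.size() == 1);
        });
        return tests;
    }

    private static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("check failed");
        }
    }
}
```

If the two tests shared a single static cart, "startsEmpty" would fail whenever the shuffle happened to run it second, which is exactly the kind of leftover-data dependency the rule is meant to rule out.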
Not “gah!” at all.