If you’ve been a tester for some time, you have without a doubt run into scenarios where you had to write tests to verify the correctness of the implementation of business rules, algorithms, or other forms of backend logic. To get a fair amount of test coverage, you'll likely need more than a single test case.
Data-driven testing helps clean up your tests and make them more effective and maintainable. However, there are some pitfalls that you need to beware of when you start applying data-driven testing in your test suite. Let’s take a look.
Data-driven testing, in general terms, is running the same test multiple times using different sets of data. To clarify, let’s explore a scenario that would be a good candidate for a data-driven approach.
Consider the following piece of logic regarding the cost of and discounts for train tickets:
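As a stand-in for that logic, here is a minimal sketch in Python, assuming that children under 12 travel at half price, seniors aged 65 and over get a 20% discount, and everyone else pays the full fare (the thresholds and percentages are assumptions for illustration):

```python
# A hypothetical implementation of the ticket pricing rules; the age
# thresholds and discount percentages are assumptions for illustration.
def ticket_price(regular_price, age):
    """Return the fare charged to a passenger of the given age."""
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 12:
        return regular_price * 0.5   # children travel at half price
    if age >= 65:
        return regular_price * 0.8   # seniors get a 20% discount
    return regular_price             # everyone else pays the full fare
```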
If you want to write tests for this logic, and assuming a regular ticket costs $100, you'll likely write at least three test cases, and this doesn't even include specific checks on boundary values.
In order to get good test coverage, you need to run at least three different tests. We're essentially testing the same logic three times, just with different input values (the regular ticket price and a person's age) and expected output (the resulting ticket price charged to that person).
The naive approach to creating tests for this example would be to write three different tests that look similar, but use different input values and expected output values.
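For the sketched ticket_price() function, with a regular fare of $100, those three tests might look like this in pytest:

```python
# The naive approach: three near-identical tests, one per age group,
# exercising the hypothetical ticket_price() function sketched earlier.
def test_child_pays_half_price():
    assert ticket_price(100, age=8) == 50

def test_adult_pays_full_price():
    assert ticket_price(100, age=35) == 100

def test_senior_gets_discount():
    assert ticket_price(100, age=70) == 80
```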
This approach of writing tests creates a lot of duplication. Duplication is not necessarily bad all the time, but in this scenario it results in tests that are tedious to read, cause a lot of maintenance overhead, and do not scale well.
Fortunately, there is a better approach to writing this type of test, and that is by using data-driven testing.
Our three tests all exercise the same business logic, in this case, the logic that decides the actual ticket fare based on the regular ticket fare and the passenger's age. We use three different sets of data in our example.
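In pytest, for example, those three tests collapse into a single parametrized test, with the sets of data living in a simple list:

```python
import pytest

# One data-driven test replaces the three duplicated ones; each tuple is a
# set of test data (the passenger's age and the fare we expect to be charged).
@pytest.mark.parametrize(
    "age, expected_fare",
    [
        (8, 50),    # child travels at half price
        (35, 100),  # adult pays the full fare
        (70, 80),   # senior gets a 20% discount
    ],
)
def test_ticket_price(age, expected_fare):
    assert ticket_price(100, age) == expected_fare
```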
In the newest version of Xray, there is a test parameterization feature that gives you the ability to do data-driven testing. Test parameterization is a powerful practice that allows the same test to be executed multiple times with different parameters.
In Xray, you can define parameters within datasets and execute one test with multiple iterations or input values. This will minimize the number of test cases you need to create and increase your coverage.
(This section was written by the Xray team)
This way, you can create three different iterations for a single test. With a data-driven approach, there is only one test and no more duplication, because the test flow (the application logic exercised by the test) is defined only once.
Additionally, adding a new test case, or editing or removing an existing one, only requires you to update the dataset. This approach makes the tests a lot easier to read and maintain.
As mentioned at the start, data-driven testing is a great way to clean up your tests and make them more effective and maintainable. However, there are some pitfalls you need to beware of.
Data-driven testing is most useful when testing implementations of business rules or algorithms, or backend business logic in general. The concept can be applied just as easily to GUI-driven tests, for example when you use pytest to drive tests that use Selenium WebDriver in Python. To me, however, that is a test smell: an indication that you're not writing your tests in the most efficient way.
If your application is designed properly, it is incredibly unlikely that the business logic you want to test is implemented in your graphical user interface (GUI). As a result, your GUI should ideally not be among the components that are invoked when you run business logic tests. Should you find yourself wanting to apply data-driven testing to tests that are driven through your GUI, my advice would be to take a step back and ask yourself whether you have to use the GUI in that test in the first place.
One of the exercises I typically include when I teach data-driven testing is to have participants first transform a number of tests into a data-driven test, just as we've done in the example above. As a next step, I ask them to add another test that, upon closer inspection of the code being tested, exercises a different path in that code.
Almost all participants, however, try to add that test as a new test case to the existing data-driven test, often using if-then-else or try-catch constructs to make it all work. To me, again, having to use these constructs in your test methods is a test smell, because you're essentially replicating your business logic in your tests, with all the associated risks. You might end up having to write tests for your tests!
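Sticking with the hypothetical ticket example, the trap looks something like this: the negative-age case exercises the error-handling path in ticket_price(), yet it is forced into the same parametrized test with branching logic:

```python
import pytest

# A test smell: one of the data rows behaves differently from the others,
# so the test method needs if-then-else logic to handle it.
@pytest.mark.parametrize(
    "age, expected",
    [
        (8, 50),
        (35, 100),
        (70, 80),
        (-1, ValueError),  # exercises a different path in ticket_price()
    ],
)
def test_ticket_price(age, expected):
    if isinstance(expected, type) and issubclass(expected, Exception):
        with pytest.raises(expected):
            ticket_price(100, age)
    else:
        assert ticket_price(100, age) == expected
```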
My advice to avoid falling into this trap would be to ask yourself “am I really exercising the same path in my code or application when running this new iteration?” If the answer is a definitive “yes,” by all means add it to your data-driven test. If the answer is a “no,” you're probably better off writing a separate test that exercises this other path.
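In the hypothetical example, that means the negative-age case gets a dedicated test, keeping the data-driven test free of branching:

```python
import pytest

# The error path gets its own test; the parametrized test keeps only the
# cases that exercise the regular pricing logic.
def test_negative_age_is_rejected():
    with pytest.raises(ValueError):
        ticket_price(100, age=-1)
```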
If you take this advice into account, you'll find that data-driven testing is a great way to increase your coverage, keep your tests organized and tidy without duplication, and scale your testing.
Curious to see how data-driven testing works in Xray? Learn how to use data-driven testing with Xray’s newest test parameterization feature.