If you read part 1, you may be wondering how my automated check performed…
The programmer deployed the seeded bug and I’m happy to report that my automated check found it in 28 seconds!
Afterwards, he seeded two additional bugs. The automated check found those as well. I had to temporarily modify the automated check code to ignore the first bug in order to find the second. This is because the check stops checking as soon as it finds one problem. I could tweak the code to collect problems and keep checking but I prefer the current design.
Here is the high level generic design of said check:
Build the golden masters:
- Make scalable checks - Before test execution, build multiple golden masters based on your coverage goals. This is a one-time task (until the golden masters need to be updated to reflect expected changes).
- Bypass GUI when possible – Each of my golden masters consists of the response XML from a web service call, saved to a file. Each XML response has over half a million nodes, which are mapped to a complex GUI. In my case, my automated check bypasses the GUI. GUI automation could never have found the above seeded bug in 28 seconds. My product-under-test takes about 1.5 minutes just to log in and navigate to the module being tested. Waiting for the GUI to refresh after the countless service calls made by the automated check would have taken hours.
- Golden masters must be golden! Use a known good source for the service call. I used Production because my downstream environments are populated with data restored from production. You could use a test environment as long as it was in a known good state.
- Use static data - Build the golden masters using service request parameters that return a static response. In other words, when I call said service in the future, I want the same data returned. I used service request parameters to pull historical data because I expect it to be the same data next week, month, year, etc.
- Automate golden master building - I wrote a utility method to build my golden masters. This is basically re-used code from the test method, which builds the new objects to compare to the golden masters.
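The builder step above might be sketched in Python like this. The service call, scenario names, and file layout are all hypothetical stand-ins, not the author's actual code:

```python
from pathlib import Path

def fetch_response_xml(params):
    # Hypothetical stand-in for the real web service call.
    # In practice this would issue the request and return the raw XML response.
    return "<Response><Id>%s</Id><Total>100.00</Total></Response>" % params["id"]

def build_golden_masters(scenarios, out_dir, fetch=fetch_response_xml):
    """One-time utility: call the service for each scenario and
    archive the XML response as a golden master file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, params in scenarios.items():
        (out / f"{name}.xml").write_text(fetch(params))

# Example: two historical scenarios whose data should never change.
scenarios = {"invoice_2019_q1": {"id": "A1"}, "invoice_2019_q2": {"id": "A2"}}
build_golden_masters(scenarios, "golden_masters")
```

Because the builder reuses the same fetch code as the test method, the golden masters and the new objects under comparison are produced the same way.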
Do some testing:
- Compare - This is the test method. It calls the code-under-test using the same service request parameters used to build the golden masters. The XML service response from the code-under-test is then compared to that of the archived golden masters, line-by-line.
- Ignore expected changes - In my case there are some XML nodes the check ignores. These are nodes with values I expect to differ. For example, the CreatedDate node of the service response object will always be different from that of the golden master.
- Report - If any non-ignored XML line is different, it’s probably a bug: fail the automated check, report the differences with line-number and file references (see below), and investigate.
- Write Files - For my goals, I have 11 different golden masters (to compare with 11 distinct service response objects). The automated check loops through all 11 golden master scenarios, writing each service response XML to a file. The automated check doesn’t use the files; they are there for me. This gives me the option to manually compare suspect new files to golden masters with a diff tool, an effective way of investigating bugs and spotting patterns.
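The test-side loop above can be sketched in Python. The ignore list, node-matching helper, and sample XML are illustrative assumptions only:

```python
import re

IGNORED_NODES = ("CreatedDate",)  # nodes whose values are expected to differ

def node_name(line):
    """Extract the opening tag name from one XML line, if any."""
    m = re.match(r"\s*<(\w+)", line)
    return m.group(1) if m else None

def compare_to_golden_master(golden_xml, new_xml):
    """Line-by-line compare; return (line_number, golden_line, new_line)
    tuples for every difference, skipping ignored nodes."""
    diffs = []
    for i, (g, n) in enumerate(zip(golden_xml.splitlines(),
                                   new_xml.splitlines()), 1):
        if node_name(g) in IGNORED_NODES:
            continue
        if g != n:
            diffs.append((i, g, n))
    return diffs

golden = "<Response>\n<CreatedDate>2024-01-01</CreatedDate>\n<Total>100.00</Total>\n</Response>"
fresh  = "<Response>\n<CreatedDate>2024-06-01</CreatedDate>\n<Total>100.01</Total>\n</Response>"

diffs = compare_to_golden_master(golden, fresh)
# Only the Total line is reported; the CreatedDate difference is ignored.
```

Reporting the line number alongside both lines is what makes a failure immediately investigable with a diff tool.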
I’m feeling quite cocky at the moment. So cocky that I just asked my lead programmers to secretly insert a bug into the most complex area of the system under test.
Having just finished another epic automated check based on the Golden Master approach I discussed earlier, and seeing that most of the team is on pre-Thanksgiving vacation, this is the perfect time to “seed a bug”. Theoretically, this new automated check should help put our largest source of regression bugs to rest and I am going to test it.
The programmer says he will hide the needle in the haystack by tomorrow.
You’ve got this new thing to test.
You just read about a tester who used Selenium and he looked pretty cool in his bio picture. Come on, you could do that. You could write an automated check for this. As you start coding, you realize your initial vision was too ambitious so you revise it. Even with the revised design you’re running into problems with the test stack. You may not be able to automate the initial checks you wanted, but you can automate this other thing. That’s something. Besides, this is fun. The end is in sight. It will be so satisfying to solve this. You need some tasks with closure in your job, right? This automated check has a clear output. You’ve almost cracked this thing…cut another corner and it just might work. Success! The test passes! You see green! You rule! You’re the Henry Ford of testing! You should wear a cape to work!
Now that your automated thingamajig is working and bug free, you can finally get back to what you were going to test. Now what was it?
I’m not hating on test automation. I’m just reminding myself of its intoxicating trap. Keep your eyes open.
You must really be a sucky tester.
I’m kidding, of course. There may be several explanations as to why an excellent tester like yourself is not finding bugs. Here are four:
- There aren’t any bugs! Duh. If your programmers are good coders and testers, and perhaps writing a very simple Feature in a closely controlled environment, it’s possible there are no bugs.
- There are bugs but the testing mission is not to find them. If the mission is to do performance testing, survey the product, determine under which conditions the product might work, smoke test, etc., it is likely we will not find bugs.
- A rose by any other name… Maybe you are finding bugs in parallel with the coding and they are fixed before ever becoming “bug reports”. In that case, you did find bugs but are not giving yourself credit.
- You are not as excellent as you think. Sorry. Finding bugs might require skills you don’t have. Are you attempting to test data integrity without understanding the domain? Are you testing network transmissions without reading pcaps?
As testers, we often feel expendable when we don’t find bugs. We like to rally around battle cries like:
“If it ain’t broke, you’re not trying hard enough!”
“I don’t make software, I break it!”
“There’s always one more bug.”
But consider this: a skilled tester can do much more than merely find a bug. A skilled tester can also tell us what appears to work, what hasn’t broken in the latest version, what unanticipated changes have occurred in our product, how it might work better, how it might solve additional problems, etc.
And that may be just as important as finding bugs.
Hey testers, don’t say:
“Yesterday I tested a story. Today I’m going to test another story. No impediments.”
Per Scrum co-creator Jeff Sutherland, daily standups should not be “I did this…”, “I’ll do that…”. Instead, share things that affect others, with an emphasis on impediments. The team should leave the meeting with a sense of energy and urgency to rally around the solutions of the day. When the meeting ends, the team should be saying, “Let’s go do this!”.
Here are some helpful things a tester might say in a daily standup:
- Let’s figure out the repro steps for production Bug40011 today, who can help me?
- I found three bugs yesterday, please fix the product screen bug first because it is blocking further testing.
- Sean, I know you’re waiting on my feedback on your new service, I’ll get that to you first thing today.
- Yesterday I executed all the tests we discussed for Story102; unless someone can think of more, I am done with that testing. Carl, please drop by to review the results.
- I’m getting out of memory errors on some test automation, can someone stop by to help?
- If I had a script to identify data corruption, it would save hours.
- Paul, I understand data models, I’ll test that for you and let you know something by noon.
- The QA data seems stale. Don’t investigate any errors yet. I’m going to refresh data and retest it today. I’ll let you know when I’m done.
- Jolie, if you can answer my question on expected behavior, I can finish testing that Story this afternoon.
Your role as a tester affects so many people. Think about what they might be interested in and where your service might be most valuable today.
“Golden Master”, it sounds like the bad guy in a James Bond movie. I first heard the term used by Doug Hoffman at STPCon Spring 2012 during his Exploratory Test Automation workshop. Lately, I’ve been writing automated golden master tests that check hundreds of things with very little test code.
I think Golden-Master-Based testing is super powerful, especially when paired with automation.
A golden master is simply a known good version of something from your product-under-test. It might be a:
- web page
- reference table
- grid populated with values
- or some other file output by your product
Production is an excellent place to find golden masters because if users are using it, it’s probably correct. But golden masters can also be fabricated by a tester.
Let’s say your product outputs an invoice file. Here’s a powerful regression test in three steps:
- Capture a known good invoice file from production (or a QA environment). This file is your golden master.
- Using the same parameters that were used to create the golden master, re-create the invoice file on the new code under test.
- Programmatically compare the new invoice to the golden master using your favorite diff tool or code.
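The three steps above can be sketched with Python's standard difflib; the invoice content here is a hypothetical stand-in:

```python
import difflib

# Step 1: the archived known-good invoice (the golden master).
golden_invoice = ["Invoice #1001", "Widget x2  20.00", "Total  20.00"]

# Step 2: the invoice re-created on the new code under test
# (faked in-line here; in practice you'd regenerate it using the
# same order/customer parameters used for the golden master).
new_invoice = ["Invoice #1001", "Widget x2  20.00", "Total  22.00"]

# Step 3: programmatic diff.
diff = list(difflib.unified_diff(golden_invoice, new_invoice,
                                 fromfile="golden", tofile="new",
                                 lineterm=""))
for line in diff:
    print(line)
```

Any unexpected line in the diff is a regression candidate worth investigating.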
Tips and Ideas:
- Make sure the risky business logic code you want to test is being exercised.
- If you expand on this test, and fully automate it, account for differences you don’t care about (e.g., the invoice generated date in the footer, new features you are expecting to not yet be in production).
- Make it a data-driven test. Pass in a list of orders and customers, retrieve production golden masters and compare them to dynamically generated versions based on the new code.
- Use interesting dates and customers. Iterate through thousands of scenarios using that same automation code.
- Use examples from the past that may not be subject to changes after capturing the golden master.
- Structure your test’s assertions to help interpret failures. The first assertion on the invoice file might be: does the item line count match? The second: do each line’s values match?
- Get creative. Golden masters can be nearly anything.
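The layered-assertion tip above might look like this in Python; the helper name and invoice lines are hypothetical:

```python
def assert_invoice_matches(golden_lines, new_lines):
    """Layered assertions: fail fast on a count mismatch with a clear
    message before drowning the reader in line-by-line noise."""
    assert len(new_lines) == len(golden_lines), (
        f"line count differs: expected {len(golden_lines)}, "
        f"got {len(new_lines)}")
    for i, (g, n) in enumerate(zip(golden_lines, new_lines), 1):
        assert g == n, f"line {i} differs: expected {g!r}, got {n!r}"

golden = ["Invoice #1001", "Widget x2  20.00", "Total  20.00"]
assert_invoice_matches(golden, list(golden))  # identical copies pass
```

A failure message like “line count differs” points you at a missing or extra item immediately, instead of a wall of mismatched lines.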
Who else uses this approach? I would love to hear your examples.
Invigorated by the comments in my last post, I’ll revisit the topic.
I don’t think we can increase our tester reputations by sticking to the credo:
“Raise every bug, no matter how trivial”
Notice, I’m using the language “raise” instead of “log”. This is an effort to include teams that have matured to the point of replacing bug reports with conversations. I used the term “share” in my previous post but I like “raise” better. I think Michael Bolton uses it.
Here are a couple problems with said credo:
- Identifying bugs is so complex that one cannot commit to raising them all. As we test, there are countless evaluations our brains are making; “That screen seems slow today, that control might be better a hair to the right, why isn’t there a flag in the DB to persist that data?”. We are constantly making decisions of which observations are worth spending time on. The counter argument to my previous post seems to be, just raise everything and let the stakeholders decide. I argue, everything is too much. Instead, the more experience and skill a tester gains, the better she will know what to raise. And yes, she should be raising a lot, documenting bugs/issues as quickly as she can. I still think, with skill, she can skip the trivial ones.
- Raising trivial bugs hurts your reputation as a tester. I facilitate bug triage meetings with product owners. Trivial bugs are often mocked before being rejected: “Ha! Does this need to be fixed because it’s bugging the tester or the user? Reject it! Why would anyone log that?”. Important bugs get the opposite reaction. Sorry. That’s the way it is.
- Time is finite. If I’m testing something where bugs are rare, I’ll be more inclined to raise trivial bugs. If I’m testing something where bugs are common, I’ll be more inclined to spend my time on (what I think) are the most important bugs.
It’s not the tester’s job to decide what is important. Yes, in general I agree. But I’m not dogmatic about this. Maybe if I share some examples of trivial bugs (IMO), it will help:
- Your product has an administrative screen that can only be used by a handful of tech support people. They use it once a year. As a tester, you notice the admin screen does not scroll with your scroll wheel. Instead, one must use the scroll bar. Trivial bug.
- Your product includes a screen with two radio buttons. You notice that if you toggle between the radio buttons 10 times and then try to close the screen less than a second later, a system error gets logged behind the scenes. Trivial bug.
- Your product includes 100 different reports users can generate. These have been in production for 5 years without user complaints. You notice some of these reports include a horizontal line above the footer while others do not. Trivial bug.
- The stakeholders have given your development team 1 million dollars to build a new module. They have expressed their expectations that all energy be spent on the new module and they do not want you working on any bugs in the legacy module unless they report the bug themselves and specifically request its fix. You find a bug in the legacy module and can’t help but raise it…
You laugh, but the drive to raise bugs is stronger than you may think. I would like to think there is more to our jobs than “Raise every bug, no matter how trivial”.