My manager recently said she hates raking leaves because as soon as she rakes her yard, she turns around and there are more leaves to rake.
I immediately thought…weird, that’s exactly what testing feels like.
We get a build. It’s full of bugs. We work hard all day logging bugs. By the time we have a chance to turn around and admire our working AUT, we’ve gotten a new build and the bugs are back. So we get out the rake and start all over.
If I’m short on time, sometimes I just rake the important parts of the yard. If I’m not sure where to start, I usually look under the large Oak trees; I’ve noticed fewer leaves under the Loblolly Pines. If I’m expecting a windy day, I usually wait until the next day to rake, allowing the fallen leaves to accumulate. Sometimes, I find an obscure leaf…I have to ask my wife if I should rake it or leave it there. If I get done early, I might rake out some of those leaves from last season, from the garden bed on the side of the house.
It’s exhausting, really. But somebody’s got to keep the yard clean.
Two days ago I logged a slam dunk bug. It was easy to understand so I cut corners, skipping the screen capture and additional details.
Yesterday a dev rejected said bug! After re-reading the repro steps, I decided the dev was a fool. However, a brief conversation revealed that even a semi-intelligent being (like a dev) could justifiably get confused. That’s when it hit me. If someone doesn’t understand my bug, it’s my fault.
Most testers experience the AUT via the UI and what seems obvious to the tester may not be obvious to the dev.
So if the dev is confused about your bug, it’s your fault. Graciously apologize and remove the ambiguity. Remember, we’re not dealing with people. We’re dealing with devs.
Another of Robert Sabourin’s STPCon sessions I attended was “Deciding What Not to Test”.
Robert did not give me the magic, simple answer I was hoping for. His main point was that it’s better to not test certain things because you decided not to, rather than because you ran out of time.
I agree but I’m not convinced this is practical. In order to decide what not to test, one must spend time determining lots of extra tests that will be thrown out. I don’t like that. If there are an infinite number of tests to execute, when do I stop coming up with them? Instead, I prefer to come up with them as I test within my given testing time. My stopping point then becomes the clock. I would like to think, as a good tester, I will come up with the best tests first. Get it?
Maybe “which tests should I NOT execute?” is the wrong question. Maybe the better question is “which tests should I execute now?”
At any rate, it is feasible that a tester often finds themselves in a situation where they have too many tests to execute in the available time, er…time they are willing to work. When this situation arises, Robert suggests a few questions to help prioritize:
1.) What is the risk of failure?
2.) What is the consequence of failure?
3.) What is the value of success?
Here are my interpretations of Robert’s three questions:
1.) A tester can answer this. Does this dev usually create bugs with similar features? How complex is this feature? How detailed were the specs or how likely is it that the correct info was communicated to the dev?
2.) It should be answered from the stakeholder’s perspective, although a good tester can answer it as well. It all comes down to this question: will the company lose money?
3.) This one should also be answered from the stakeholder’s perspective. If this test passes, who cares? Will someone be relieved that the test passed?
So if you answer “high” to any of Robert’s three questions, I would say you had better execute the test.
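For what it's worth, here is how I might sketch that rule in Python. Everything in it is my own assumption rather than anything Robert presented: the `CandidateTest` class, the low/medium/high scale, and the tie-breaking score are hypothetical names and choices made up for illustration.

```python
from dataclasses import dataclass

# Assumed rating scale; the three questions themselves come from Robert's talk.
LEVELS = ("low", "medium", "high")


@dataclass
class CandidateTest:
    name: str
    risk_of_failure: str         # question 1
    consequence_of_failure: str  # question 2
    value_of_success: str        # question 3

    def ratings(self):
        return (self.risk_of_failure, self.consequence_of_failure, self.value_of_success)


def must_execute(test):
    """A 'high' answer to any of the three questions means: run it."""
    return "high" in test.ratings()


def prioritize(tests):
    """Order tests so must-execute tests come first.

    The secondary score (sum of rating indexes) is an assumption of this
    sketch, only used to break ties.
    """
    def key(test):
        score = sum(LEVELS.index(rating) for rating in test.ratings())
        return (must_execute(test), score)

    return sorted(tests, key=key, reverse=True)
```

When the clock runs out, whatever is left at the bottom of the list is what you decided not to test.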
Do you have any better advice on knowing what not to test? If so, please share!
Bob Galen’s STPCon session, entitled “Agile Testing within SCRUM”, had an interesting twist I did not expect. After a Scrum primer, Bob suggested that test teams can use a Scrum wrapper around their test activities, regardless of what the dev methodology may be.
In other words, even if you’re testing for one or more non-Scrum dev teams, you may still use Scrum to be a better test team. This is kind of a fun idea because I’ve been chomping at the bit to be part of a Scrum team. The idea is that your QA team holds the daily stand-up meetings, creates a sprint backlog list, tracks sprint progress with a burndown chart, and ends each sprint with a review meeting to reflect on sprint success/failure. You can add as many Scrum practices as you find valuable (e.g., invite project stakeholders like devs/customers to prioritize sprint backlog items or attend daily meetings).
Wrapping QA practices with Scrum is actually not that difficult. For example, sprint backlog items can be bugs to retest, features to test, or test cases to write. Daily stand-up reports can be “Yesterday I tested 5 features and logged 16 bugs, today I will test these other features, and Bug13346 is blocking me from executing several tests.”
My QA team actually started holding Scrum meetings (see picture) about three months ago and it seems to help us stay more focused each day. What’s lacking is a formal sprint goal and means to track progress towards it. Bob Galen’s little session has convinced me it’s worth a try. At least to tide me over till all my devs implement Scrum!
Many of my notes from Hans Buwalda’s STPCon session are test design tips that can also apply to manual testing. One of my favorite tips was to remember to go beyond requirement-based testing. A good QA Manager should say “I know you have tested these ten requirements, now write me some tests that will break them”.
As testers, we should figure out what everyone else forgot about. These are the good tests. These are where we can shine and provide extra value to the team. One way to do this is to take a simple test and make it more aggressive.
Example Requirement: A user can edit ItemA.
Requirement-based Test: UserA opens ItemA in edit mode.
How can I make this test more aggressive? Let’s see what happens if:
- UserA and UserB both open ItemA in edit mode at the same time.
- UserA opens ItemA in edit mode when UserA already has ItemA in edit mode.
- UserA opens ItemA in edit mode, makes changes, goes home for the weekend, then attempts to save changes to ItemA on Monday.
- UserA opens ItemA in edit mode, loses network connectivity, then attempts to save ItemA.
What else can you think of?
Here are 10 things I heard Hans Buwalda say about test automation. I have thought about each of these to some extent and I would love to discuss any that you disagree with or embrace.
- Stay away from “illegal checks”. Do not check something just because an automated test is there. Stay within the scope of the test, which should have been defined by the test designer. There should be a different test for each thing to check.
- If bugs found by tests will not be fixed soon, do not keep executing those tests.
- All Actions (AKA Keywords) with parameters should have defaults so the parameters do not have to be specified. This makes it easier for the test author to focus on the target.
- Group automated tests into modules (i.e., chunks of tests that target a specific area). These tests should not be dependent on other modules.
- Do not use copy and paste inside your automation code. Instead, be modular. Instead of copying low-level steps to another test, use a procedure that encapsulates those low-level steps. This prevents a maintenance nightmare when the AUT changes.
- Remove all hard-coded wait times. Instead use active timing. Never tell a test to wait 2 seconds before moving on. If it takes 3 seconds your test breaks. Instead, test for the ready state using a loop.
- Ask your devs to populate a specific object property (e.g., “accessibility name”) for you. If not, you will waste time determining how to map to each object.
- Attempt to isolate UI tests such that one failed test will not fail all the other tests.
- Something I didn’t expect Hans to say was not to worry about error handling in your automation framework. He says not to waste time on error handling because the tests should be written to “work”. At first I disagreed with this. But later I realized, in my own experiences with error handling, that it made me lazy. Often, instead of automating a solid test, I relied on error handling to keep my tests passing.
- When Hans recommends an “Action-Based” test automation framework, IMO what he means is that it should support both low-level and high-level descriptions for the test steps. Hans considers “Keyword-Driven” automation to be low-level; the keywords being things like “click button”, “type text”, “select item”. Hans also considers Business-Template-Driven automation to be high-level; things like “submit order”, “edit order”. Action-Based test automation uses all of the above. One reason is to build a test library that can check low-level stuff first. If the low level stuff passes, then the high-level tests should execute.
What do you think about these?
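To make the "active timing" tip concrete, here is a minimal sketch of a polling helper in Python. It is my own illustration, not code from Hans's session; `wait_until`, its parameters, and the default values are all hypothetical.

```python
import time


def wait_until(is_ready, timeout=10.0, interval=0.1):
    """Poll is_ready() until it returns True or the timeout expires.

    Returns True as soon as the condition holds, False if the timeout
    is reached first. The timeout is an upper bound, not a fixed wait.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would call something like `wait_until(lambda: grid.is_loaded(), timeout=5)`, where `grid.is_loaded()` stands in for whatever ready-state check your AUT exposes. If the AUT is ready in half a second, the test moves on in half a second; if it takes three seconds, the test still passes.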
Hans spent most of his time discussing test design rather than how to actually “do” the automation. After my experiences with test automation, I completely agree with Hans. The test design is the hardest part, and there is something about automation that magnifies poor test design.
Manual testing allows for dynamic test design; automated testing does not. A manual tester can read an AUT’s modified validation message and determine if it will make sense to a user. An automated test cannot.
Per Hans, when thinking about automated test design, the first two questions should be:
1.) What am I checking?
2.) How will I check it?
These seemingly simple questions are often more difficult than automating the test itself. And that may be why these questions are often neglected. It is more fun to automate for the sake of automation than for the sake of making valuable tests.
To eliminate this problem, Hans counters that Test Automation Engineers should never see a single test case. And Test Designers should never see a single piece of automation code. Hmmm…I’m still not sure how I feel about this. Mostly because I want to do both!
In the next post, I’ll share some of Mr. Buwalda’s test design ideas that apply to manual and automated testing.
I attended Robert Sabourin’s “To Infinity And Beyond” class at STPCon. This guy’s sessions are fun because of his sense of humor and pop culture references. He played us a screen captured video of someone sending an email and we were asked to write down as many boundaries as we could think of within the video. I had nearly 40 in five minutes but many testers got more.
The point was to start a conversation about boundary tests that go beyond the obvious. The first take-away I embraced was a simple mnemonic Robert gave us for thinking about boundaries. I usually hate mnemonics because most of them are only useful as PowerPoint slide filler. However, I’ve actually started using this one:
A – Application logic
I – Inputs
M – Memory storage
These are three ways to think about boundaries for test ideas. Most features you attempt to test will have these types of boundaries. The other cool thing about this mnemonic is it prioritizes your tests (i.e., test the application logic first, if you have it).
The typical example is testing a text box. But let's apply it to something different. If we apply Robert’s mnemonic to the spec, “The system must allow the user to drop up to four selected XYZ Objects on a target screen area at the same time”, we may get the following boundary tests.
Application logic (i.e., business rules)
- Drop one XYZ object.
- Drop four XYZ objects.
- Attempt to drop five XYZ objects.
- Attempt to drop one ABC object.
- Attempt to drop a selection set of three XYZ objects and one ABC object.
Inputs
- Can the objects get to the target screen area without being dropped? If so, try it.
- Attempt to drop 1,000,000 XYZ objects (there may be a memory constraint in just evaluating the objects)
Memory storage
- Refresh the screen. Are the four dropped XYZ objects still in the target screen area?
- Reopen the AUT. Are the four dropped XYZ objects still in the target screen area?
What other boundary tests can you think of and which AIM category do they fit into?
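If the drop rule were exposed somewhere we could call it directly, the application-logic boundaries above could be written down as a small table of cases. The sketch below is only an illustration: `can_drop`, `MAX_DROP`, and the idea that a mixed or empty selection is rejected are my assumptions, not something the spec states.

```python
MAX_DROP = 4  # from the spec: up to four selected XYZ objects


def can_drop(selection):
    """Hypothetical stand-in for the AUT's business rule.

    Assumes a drop is valid only for 1 to MAX_DROP objects, all of type "XYZ".
    """
    if not 1 <= len(selection) <= MAX_DROP:
        return False
    return all(kind == "XYZ" for kind in selection)


# (label, selection, expected result) for each boundary idea
BOUNDARY_CASES = [
    ("one XYZ", ["XYZ"], True),
    ("four XYZ", ["XYZ"] * 4, True),
    ("five XYZ", ["XYZ"] * 5, False),
    ("one ABC", ["ABC"], False),
    ("three XYZ and one ABC", ["XYZ"] * 3 + ["ABC"], False),
    ("1,000,000 XYZ", ["XYZ"] * 1_000_000, False),
]


def run_boundary_cases():
    """Return the labels of the cases whose result differs from what we expect."""
    failures = []
    for label, selection, expected in BOUNDARY_CASES:
        if can_drop(selection) != expected:
            failures.append(label)
    return failures
```

The refresh and reopen tests do not fit a table like this; they need the real AUT and real storage.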
Sorry about the blog lapse. I just got back from STPCon and a 10-day vacation in the Pacific Northwest.
I’ll share my STPCon ideas and takeaways in future posts. But first I have to mention my STPCon highlight.
During the conference, keynote speaker James Bach invited me to participate in an experiment to see how quickly testers can learn testing techniques. He approached me before lunch and I joined him and another tester, Michael Richardson, in the hotel lobby. James asked us each to play a different game with him (simultaneously). Michael played the “Dice Game”, in which he had to roll dice of his choosing, from several different dice styles, to determine a pattern that James made up.
I played a game much like “20 Questions”, in which I had to explain what happened based on vague information I was given. I could ask as many Yes/No questions as I wanted. During the games, James coached us on different techniques. After about 45 minutes, Michael jumped in and bailed me out. Afterwards, James asked me to play again with a different premise. I was able to crack the second premise in less than 5 minutes. I would like to think I actually learned something from my first, painful, attempt.
These games share similarities to software testing because they require skillful elimination of infinite variables. Some of the techniques James suggested were:
- Focus then defocus. Sometimes you should target a specific detail. Other times, you should step back and think less narrowly. For me, defocus was the most important approach.
- Forward then backward. Search for evidence to back up a theory. Then search for a theory based on evidence you have determined.
- Dumb questions can lead to interesting answers. Don’t be afraid to ask as many questions as you can, even if they are seemingly stupid questions.
- Flat lining. If you are getting nowhere with a technique, it’s time to stop. Try a different type of testing.
Later, James asked Michael to think up a challenging dice game pattern. James was able to solve it in about 30 seconds using only one die. Obviously, having played before, he knew it would eliminate most of the variables. I just used this exact idea today to understand a software behavior problem I initially thought complex.
After the stress of performing these challenges with James, we sat back and enjoyed a conversation I will not forget. James was more personable and less judgemental than I expected. Despite being a high school dropout, he is currently writing a book about learning (not about testing). I also thought it cool that during STPCon he headed across the street to the Boston Public Library and began reading from various history books. He was trying to determine why people with power would ever choose to give it up. I guess he’s got more on his mind than just testing.
For me, spending time with James was a huge privilege and I was grateful for the opportunity to discuss testing with him, as well as getting to know the person behind the celebrity.
Note: Michael Richardson was also an interesting guy. He once did QA for a fish company and was literally smacked in the face with a large fish due to poor testing.
Verifying fixed bugs logged by you is easy. You’re a tester. You logged a proper bug, with repro steps, expected results, and actual results. You remember the issue.
But what about those bugs logged by non-testers that you never even had a chance to experience? You certainly need to verify their fix. However, how can you be sure you understand the bug?
Here is what I try to do.
First, I track down the person who logged the bug. I ask them how it was discovered. Knowing this info will solidify the repro steps (if there were any).
Second, I experience the bug in its unfixed build. This sometimes means preventing the fix from being released to the environment I need to verify it on (so I have time to repro it). If I experience the problem I am usually confident I can verify its fix.
Third, I determine what was done to fix the bug. I can usually identify the dev from the bug history. Even if the bug has the dev comments about its fix, devs are always thrilled to discuss it at length.
Finally, while talking to the dev, I squeeze in the important question, “What related functionality is at risk of breaking?” The answer to this question helps narrow down my regression test goal to something realistic.
Am I missing anything?
While cleaning my desk, I discovered a brand new promotional Mercury cap. Since I’m not a big fan of Mercury (e.g., TestDirector, QuickTestPro), I decided to award the cap to the tester on my team who could log the most bugs in a given week.
Life would be so much easier if our main goal were simply to log as many bugs as possible. Most of you will agree with me that, given enough time, you can find bugs in any piece of software thrown at you. Sure, they may be stupid bugs that nobody cares about, but you’ll find them.
Let’s face it. Testers don’t have very concrete goals. “Ensure the quality of the software”. Sometimes it’s fun to get a little competitive with each other and just focus on primitive stuff, like flooding your devs with bugs!
Should non-testers log bugs? I think they should.
The less selfish you are about your bug list, the more bugs you’ll get logged, and the more info you’ll have about the state of the AUT. In my team, the devs, and the business people who write the requirements, all log bugs. This is an Agile team so we have the luxury of continuous business testing.
How are their bug reports? Terrible. The devs’ bug reports are usually too low level for me to understand. The business people’s bug reports usually lack repro steps, include multiple problems, and rarely describe what they expected. That being said, I wouldn’t trade them for anything.
As poorly as said bug reports may be written, they are tremendously valuable to me. Although a brief face-to-face is sometimes necessary to decipher the bug description, some of these bugs identify serious problems I have missed. Others give me ideas for tests (from a dev perspective or a business perspective). And it’s more convenient to retrieve these from the Bug list rather than rooting through email. Finally, I think most team members enjoy the ability to log bugs and even consider it a privilege.
I have some non-testers on my team who have logged more bugs than some of the testers on my team. Scary, huh? The point is, if a non-tester wants to log bugs, don’t feel threatened by it. Embrace it!
I’ll talk more specifically about how I handle bugs written by non-testers in a future post.
I sent an email to a dev today, asking why his new sort-by-column functionality for a grid was not actually sorting the values properly. For example: an ascending sort would rearrange alphanumeric data to something like "A", "B", "C", "A", "S", "B".
Shortly thereafter the dev stopped by my desk and explained, "It sorts. It’s just not perfect."
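For the record, "perfect" is not hard to define here. A minimal sketch of the check I have in mind, in Python; `first_unsorted_index` is a name I made up, and it assumes plain ascending order with no special collation rules:

```python
def first_unsorted_index(values, key=None):
    """Return the index of the first value that is smaller than its predecessor.

    Returns None when the values are in ascending order.
    """
    keyed = list(values) if key is None else [key(v) for v in values]
    for i in range(1, len(keyed)):
        if keyed[i - 1] > keyed[i]:
            return i
    return None
```

Fed the column from my email, `first_unsorted_index(["A", "B", "C", "A", "S", "B"])` points at index 3, the second "A".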
I know it looks better on tester resumes to emphasize one’s White Box Testing abilities and brag about how many bugs you caught before they manifested on the UI. It also serves for far more condescending trash talk amongst testers. But since the majority of the testing I do is manual Black Box Testing, I often feel depressed, wondering if I am inferior to my testing peers and fellow bloggers.
The other day something occurred to me… Black Box testing is actually more challenging than White Box Testing. That is, if it is good Black Box testing.
I’m testing a WinForms app that, at any given time, may have about 6 different panes or zones displaying. The bulk of the user paths require drag/drop between various zones into grids. The possible inputs are nightmarish compared to those of the underlying services. Determining bug repro steps takes creativity, patience, and lots of coffee. Communicating those repro steps to others takes solid writing skills or in-person demos. And predicting what a user may do is far more challenging than predicting how one service may call another service.
I’m not suggesting apps should be or can be tested entirely using a Black Box approach. But the fact is, no matter how much white box testing one does, the UI still needs to be tested from the user’s perspective.
So if you’re feeling threatened by all those smarty pants testers writing unit tests and looking down on the black box testers, don’t. Effective Black Box Testing is a highly skilled job and you should be proud of your testing abilities!
I finally added Perlclip to my tray. I use it several times a week when I have to test text field inputs on my AUT. Among other things, this helpful little tool, created by James Bach, allows one to generate a text string with a specific character count. That alone is not very cool. However, the fact that the generated text string is made up of numbers, each indicating the position of the asterisk that follows it, is way cool.
Example: If I create a "counterstring" of 256 characters, I get a string on my clipboard that can be pasted. The last portion of the string looks like this...
Each number is telling you the character number of its following asterisk. Thus, the last asterisk is character #256. The last "6" is character #255. Get it? So if you don't have a boundary requirement for a text input field, just paste in something huge and examine the string that got saved. If the last portion of the saved string looks like this...
...your AUT only accepted the first 62 characters.
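If you don't have Perlclip handy, the idea is easy to reproduce. Below is a minimal sketch in Python of how such a counterstring could be built. It is my own reconstruction from the behavior described above, not Perlclip's actual code; the function name and the `marker` parameter are assumptions.

```python
def counterstring(length, marker="*"):
    """Build a self-describing string of exactly `length` characters.

    Each number gives the 1-based position of the marker that follows it.
    The string is built from the end backwards so the final marker always
    lands on position `length`.
    """
    pieces = []
    pos = length
    while pos > 0:
        token = f"{pos}{marker}"
        if len(token) > pos:
            # Not enough room left at the front: keep only the tail of the token.
            token = token[-pos:]
        pieces.append(token)
        pos -= len(token)
    return "".join(reversed(pieces))
```

With this sketch, `counterstring(256)` ends in `248*252*256*`, which matches the description above: the last asterisk is character #256 and the "6" before it is character #255.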
The first Agile Manifesto value is…
“Individuals and interactions over processes and tools”
While reading this recently, something radical occurred to me. If I practice said value to its fullest, I can stop worrying about how to document everything I test, and start relying on my own memory.
I hate writing test cases. But the best argument for them is to track what has been tested. If someone (or myself) asks me “Did you test x and y with z?”, a test case with an execution result seems like the best way to determine the answer. However, in practice, it is not usually the way I answer. The way I usually answer is based on my memory.
And that, my friends, is my breakthrough. Maybe it’s actually okay to depend on your memory to know what you have tested instead of a tool (e.g., TestDirector). But no tester could ever remember all the details of prior test executions, right? True, but no tester could ever document all the details of prior test executions either, right? To tip the balance in favor of my memory being superior to my documentation skills, let me point out that my memory is free. It takes no extra time like documentation does. That means, instead of documenting test cases, I can be executing more test cases! And even if I did have all my test case executions documented, sometimes it is quicker to just execute the test on the fly than go hunt down the results of the previous run (quicker and more accurate).
It all seems so wonderful. Now if I can figure out how to use my memory for SOX compliance... shucks.
Before we test, we should plan. Before we plan, we should understand. Before we understand, we should discuss. As we discuss complex paths through software, we should use models.
As soon as you draw a model, the discussion becomes engaging for both parties. Each person is forced to pay attention because the items under discussion are now concrete entities we can point to. Our brains can focus less on what the names or actions for these objects are and more on how they interact.
Even the crudest models (e.g., circles and arrows) will facilitate understanding. Everyone wants to discuss things with a model, whether they know it or not. It’s easier to talk to the model than someone’s face, especially for us introverts. When I sense confusion during a discussion I say “let’s go to the whiteboard”. Once the other person (usually the dev) gets their butt out of their chair the rest is downhill.
My knowledge of the inner workings of my AUT is minuscule compared with that of my devs and I’m not afraid to show it. I’ll draw some first grader shapes with letters and before I know it the dev is reaching for the dry erase marker to show me how it really works. And the beauty of the whole thing is that your brain will use the model to develop test cases; the possible inputs/outputs become clearer. Say these out loud and your dev will help you find the weak points. Then thank the dev and go getcha some bugs!
I first attended one of IIST's international testing certification weeks five years ago. The certification requires certain classes, some of which are taught by Dr. Magdy Hanna. After taking his course, The Principles of Software Testing, I gave him a poor evaluation. I may be one of the few people in the world who actually bothers to provide valuable feedback on surveys and I'm not shy about giving it.
Anyway, two years later I talked my new company into letting me attend another certification week. I was only about three courses away from certification. I registered, signed up for the courses I needed, and bought an airline ticket. About a week later, I got an email saying I could not take one of the courses I signed up for because the instructor would not allow it (due to a poor evaluation I had given him two years earlier).
Was this guy for real? Dr. Magdy Hanna would not return my calls but eventually sent me an email saying since I didn't like his teaching style I would be banned from his classes. Oh, I would have taken someone else's class in a heartbeat but this was the last one I needed and Hanna happened to be the only guy teaching it that week. It would have been nice if they had told me before I registered and bought an airline ticket, which I had to use elsewhere.
Anyway, I hate IIST. Hanna's classes did suck and so did his teaching technique, which was more about his own ego than trying to help anyone become a better tester. That year I attended Michael Bolton's Rapid Software Testing course and learned more than I had learned in all the IIST classes. This year I am attending the Software Test & Performance Conference and next year I hope to try CAST. If you ever have a choice, don't choose IIST.
My first job out of college was teaching software applications like MS Access, FoxPro, and Act!. Back then, in the late 90's, demand for these types of classes was much higher than it is today. This, I believe, is because today's software users are more sophisticated. Most have already been exposed to some flavor of word processing, spreadsheet, or email applications. Many can even teach themselves software or look online for answers.
After accepting the above, it's not too great a leap to also accept that modern software users are aware that software is not perfect. They have experienced application hangs and strange system errors and many users learn to avoid these bugs or recover via a reboot or similar.
If the above is true, why can't all bug lists be public? The culture of my dev team prefers to keep the bug list hidden because they believe users will have trust issues if we admit to known production bugs. I disagree. In fact, if properly facilitated, I think a public bug list can actually build user trust. Users are smart enough to see the value in having their software earlier, even at the expense of known bugs.
What do you think?
Contrary to my previous post, about devs taking more blame for production bugs, devs also take most of the credit when users like an application. I’ll bet testers rarely get praise from end users.
The reason is simple. Users don’t read the fixed bug list. Users have no idea how crappy the app was before the testers started working their magic. Have you ever heard a user say, “Wow, these testers really worked hard and found a lot of bugs!”? Users don’t have this information. For all they know, testers didn’t do a damn thing. The devs could have written rock solid code and tested it themselves…who knows?
I’ve had the luxury of working on an AUT that hasn’t gone live for 3 years. Now that we’re live, the old familiar tester stress, guilt, and anger is back.
When the first major production bug was discovered I wanted to throw up. I felt horrible. Several people had to work around the clock to clean up corrupt data and patch the problem. I wanted to personally apologize, to each person on my team and hundreds of users, for not catching the bug in testing…and I did apologize to a couple individuals and offer my help. Apologies in these cases don’t help at all, other than for personal guilt and accountability.
During my selfish guilt, I opened my eyes and realized my fellow devs felt just as accountable as I did (if not more so), and never attempted to pass blame to me. I started asking myself who is really more at fault here: the tester who didn’t test the scenario or the developer who didn’t code to handle it?
I think the tester is 75% responsible for the bug and the developer, 25%. However, the dev probably gets the brunt of the blame because they are a more prominent part of the development team. I would guess more end users have heard of people called software developers than have heard of people called software testers.
Devs have a special ability to talk testers out of bugs by claiming said bugs cannot be fixed. It goes something like this…
Tester: This UI control doesn’t do what I expect.
Dev: It’s not a bug. It’s a technology constraint. It is not possible to make it do what you expect.
My current AUT uses a third party .Net grid control called “ComponentOne” and I often hear the following:
Dev: It’s not OUR bug. It’s a ComponentOne bug.
I’m fine with these dev responses and even grateful for the info. Where it becomes a problem is when I get soft and don’t log the bug. Criteria for a bug should not include whether we believe it can be fixed or not. Log it! There is so much value in even non-fixable bugs!
Our bug list should capture what is known about the quality of the AUT. If we know the UI widget does not work the way we want it to, let’s document that. Who knows, if these non-fixable UI widget bugs keep getting logged, maybe we can justify using a different UI widget. But a more likely result, from my experience, is that devs will do everything in their power to fix everything on that bug list. And they will suddenly become more creative and write a hack for the technology constraint that earlier looked impossible to overcome.
So don’t be fooled like the storm troopers in front of the Mos Eisley Cantina. If you start hearing “These are not the bugs you’re looking for”, snap out of it!
When I first began getting paid to test software, I was a little confused. I knew it was invigorating to catch scary bugs prior to production but I wasn't really sure how valuable my job was to my dev team or the rest of the world! In fact, I didn't really know if testing software was anything to make a career out of in the first place.
A few years ago I came across Harry Robinson's Bumper Stickers for Testers post on StickyMinds.com. It was at that point that I decided my job as a software tester was valuable (and even a little cool). Harry and all the other software testers who contributed the excellent material on said post inspired me to take pride in my job and now I even sport a couple bumper stickers to show it (see below). If you test software, I encourage you to do the same.
Allen, one of the developers on my team, recently told me about some poorly written bugs he had received from a tester. Allen said,
"Basically these bugs describe the application behavior as a result of user actions working with an application, without any explicit identification of the buggy behavior. This tester wrote many, many bugs with one or two sentences. His bug description pattern was like this:
When I do A, it does B.
- When I try to move the video object, the background image stays still. (OK. Where is the bug?)
- When I set the video file to a path located in another course, the file is copied to the current course when rendering the page in IE browser. (To some, this may be the correct behavior. So where is the bug?)
This is an ambiguous way to describe a bug. It frustrates me! "
About a year ago, Allen offered similar candor on several of my bugs. Since then, I have made it a point to force the following two statements into every bug I log: Expected Results and Actual Results.
No, it’s not an original idea. But it is an excellent idea that makes it nearly impossible for one to ever log another ambiguous bug. If the tester had used it for one of Allen’s examples above, we might see the following.
Expected Results: The background image moves with the video object.
Actual Results: The background image does not move with the video object. Instead, the background image stays still.
Expected Results/Actual Results. Don't log a bug without these!
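One way to make the habit stick is to have the tooling refuse a bug that lacks either statement. Here is a minimal sketch, assuming a hypothetical `BugReport` record (the class and its field names are invented for illustration and are not from any real bug tracker):

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    """Hypothetical bug record; the field names are illustrative only."""
    title: str
    expected: str  # what the tester believes should happen
    actual: str    # what was observed instead

    def __post_init__(self):
        # Refuse to create a report that omits either statement.
        for name in ("expected", "actual"):
            if not getattr(self, name).strip():
                raise ValueError(f"bug report is missing '{name}' results")


# Allen's first example, rewritten with both statements.
report = BugReport(
    title="Background image does not follow the video object",
    expected="The background image moves with the video object.",
    actual="The background image stays still.",
)
```

Leaving either field blank raises a `ValueError`, so a "When I do A, it does B" bug never gets saved.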
I have a recurring dream when I get sick. In the dream I’m tasked with riding an elevator up and down to various floors. On each floor I encounter a bunch of numbers bouncing around acting out of control. Part of my task is to make sense out of these numbers on each floor. Like, maybe put them in order…I’m never sure. However, I usually give up and ride the elevator to another random floor and try again. The dream just loops through this sequence. It sounds silly but during the dream it is quite scary and unstoppable for some reason.
What does this have to do with testing? The state of our project feels so chaotic that I keep getting flashes of this dream. My days are filled with emails and people stopping by my cube with error messages accompanied by vague or no repro steps. Each email is critical and the next seems to preempt the previous. The number of tests I’ve actually executed myself has been minimal over the past month. Instead, I’ve been attempting to determine what other people are doing. The email threads typically get hijacked by people chucking in non-related problems and I can often identify people misunderstanding each other because each thread becomes more ambiguous than the last. These threads contain bugs that get lost because nobody can figure out enough info to log them. Ahhhhhhh! So I get back on the elevator and see if I can make sense out of the next floor.
IMHO, much of this chaos could be avoided if people would log the bug, no matter how little info is known. In an extreme case, I still believe it would be valuable to log the bug if all you have is a crummy screen capture of an error. Something like:
Bug#20074 - “No repro steps but someone got this error once…”
The next time this error is encountered we now have something to compare it with. “Hey, this is the same error as Bug#20074, did we notice any clues this time? No? Well, we can at least update the bug to indicate we saw the error again in the next build and someone else got it.” The emails referring to this problem can say “This may be Bug#20074”. And so on. Once we have a bug, no matter how hollow the bug is, the problem becomes more than someone’s sighting of Bigfoot. It becomes a problem we can actually collect information against in an organized manner. And hopefully, I can stop riding the elevator.
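To make "collect information against it" concrete, here is a rough sketch of a hollow bug accumulating sightings. The in-memory dictionary, the `record_sighting` helper, the build labels, and the reporter names are all made up for illustration; a real bug DB would keep this as comments or history on the bug.

```python
from collections import defaultdict

# Hypothetical stand-in for a bug database: bug id -> list of sightings.
sightings = defaultdict(list)


def record_sighting(bug_id, build, reporter, note=""):
    """Attach one more sighting to an existing bug and return the running count."""
    sightings[bug_id].append({"build": build, "reporter": reporter, "note": note})
    return len(sightings[bug_id])


# The bug starts out hollow: an error, no repro steps.
record_sighting("20074", "build A", "tester 1", "no repro steps, screen capture only")
# Someone else hits it in the next build; still no clues, but now it is on record.
record_sighting("20074", "build B", "tester 2", "same error, no new clues")

for entry in sightings["20074"]:
    print(entry["build"], entry["reporter"], entry["note"])
```

Even with empty notes, the list of builds and reporters is already more than any single email thread provides.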
I recently began Integration Testing between two AUTs. Each AUT has its own bug DB and dev team. Let’s say we execute a simple test like this...
Step 1: Trigger EventA in AUT1.
Expected Result: EventB occurs in AUT2.
Actual Result: EventB does not occur in AUT2.
We’ve got the repro steps. Can we log it as a bug? Not yet, because where do we log it? AUT1 or AUT2’s bug DB? In this type of testing, I think fewer bugs are logged per hour. I believe this is because the burden on the tester to perform a deep investigation is much higher than during Integration Testing between different modules of the same AUT.
Part of the problem may be a pride-thing between the dev teams. Each dev team believes the bug belongs to the other dev team until proven wrong by the tester. Yikes...not very healthy! This dev team pride-thing may exist because devs are not always fond of collaborating with other dev teams. On my floor, the QA teams are forced to work together while the dev teams somehow manage to survive in their own clans. There are exceptions, of course.
Anyway, after adequate bug research, it is usually possible to determine bug ownership. But what if neither dev team is at fault? What if the bug exists merely due to an integration design oversight? Where do you log it then?
Boy, being a software tester is hard work!
When one of my devs asked me to repro a bug on his box because he didn't know how to execute my repro steps, I didn't think twice. But I was a little surprised when he stated his position that he was strongly in favor of QA always performing the repro steps on the developer's box instead of the developers doing it themselves. He argued the devs didn't always have time to go through time-consuming or complex repro steps. He attempted to retract portions of his statement once he found out I was blogging it...too late.
I should, however, add that this particular dev has been tremendously helpful to me in setting up tests and helping me understand their results.
That being said, I've never heard anything like this from a developer in all my years of testing and I think it's ridiculous. But it did make me think about how we speak to our devs via bug reports. When we log repro steps from blackbox tests we use a kind of domain specific language that requires a user-based understanding of the AUT. To perform a repro step like "Build an alternate invoice", one must understand all the micro steps required to build the alternate invoice and one may have to understand what an alternate invoice is in the first place. If the next repro step is "Release the invoice to SystemX", one must know how to release the invoice, etc.
I think it is realistic to expect developers to have an understanding of this business-centric language as well as knowing how to perform said procedures from the AUT. And in general, time spent learning and using the AUT will help the developers improve its overall quality.
Am I right?
That is the question. Or at least, hypothetically, it could be. As my current project nears its "Go-No-Go" (I really hate that phrase) date, my decisions on how I spend my dwindling time are becoming as critical as the bugs I still have to find.
My former manager, Alex, and I had an interesting disagreement today. If given a choice between verifying fixed bugs or searching for new ones, which would be more valuable to the project at the bitter end of its development life? Alex said verifying fixed bugs because you know the code around those bugs has been fiddled with and the chances of related bugs are significant. I would instead choose to spend this time searching for new bugs because I believe these unexecuted tests are more dangerous than those bugs said to have been fixed.
Well, it all boils down to a bunch of variables I guess...
- How critical are these unexecuted tests or do we even know what they are?
- What does history tell us about the percentage of our fixed bugs that get reopened after retest?
- How critical are said fixed bugs?
The main reason we entered this discussion in the first place was because I am stubborn when it comes to fixed bug retesting (we call it "verifying bug fixes"). I find it dull and a waste of my skills. It seems more like busy work. The test steps are the repro steps and the outcome is typically boring. "Yay! It's fixed!" is less interesting than "Yay! I can't wait to log this!".
What do you think?
The first software test blogger I read was The Braidy Tester. He is still my favorite.
I borrowed from his test automation framework, took Michael Bolton's Rapid Software Testing course based on his suggestion, and laugh at his testing song satires. But most of all, The Braidy Tester (AKA "Micahel" or "Michael Hunter") inspired me to think more about testing and how to improve it.
So when he asked to interview me for his Book of Testing series I was thrilled. I realize this post is nothing more than an attempt to rub my own ego, but perhaps my answers to the questions will help you think about your own answers. Here it is...
Labels: test blogs
At first glance, it appears my devs have the more challenging job. They have to string together code that results in a working application…usually based on ambiguous specs full of gaps.
But at second glance, I think the testers have it harder. Developers have a clear target to aim for. It’s called “Code Complete”. After which, their target may become “Fix The Bugs”. Each is a relatively objective target when compared to those targets of testers like “Write the Test Cases” or “Find the Bugs” or “Ensure the Quality”.
Arguably, a tester’s job is never complete because there is an infinite number of tests to run. A dev can sit back and admire a stopping point where their code does what the feature is supposed to do. The tester cannot. The tester is expected to go beyond verifying the code does what the feature is supposed to do. The tester must determine the code’s behavior under all possible paths through the application in various configurations. If the tester is attempting to use thorough test automation it would require more code to support the automated test library than that of the AUT itself. Even then, there would still be more tests left to automate.
It may be worth noting that I’ve always wanted to be a developer. Why aren't I? I don’t know, I guess it seems too hard…
I'm in love with the word “appears” when describing software. The word “appears” allows us to describe what we observe without making untrue statements or losing our point. And I think it leads to better communication.
For example, let’s say the AUT is supposed to add an item to my shopping cart when I click the “Add Item” button. Upon black box testing said feature, it looks like items are not added to the shopping cart. There are several possibilities here.
• The screen is not refreshing.
• The UI is not correctly displaying the items in my cart (e.g., the database indicates the added items are in my cart).
• The item is added to the wrong database location.
• The added item is actually displaying on the screen but it’s not displaying where I expect it to.
• A security problem is preventing me from seeing the contents of my cart.
The possibilities are endless. But so are the tests I want to execute. So like I said in my previous post, after determining how much investigation I can afford, I need to log a bug and move on. If I describe the actual result of my test as “the item was not added to my cart”, one could argue, “yes it was, I see it in the cart when I refresh...or look in the DB, etc.”. The clarification is helpful but the fact is, a bug still exists.
Here is where my handy little word becomes useful. If I instead describe the actual result as “the item does not appear to be added to my cart”, it becomes closer to an indisputable fact. Once you begin scrutinizing your observation descriptions, you may find yourself (as I did) making statements that are later proven untrue, and these may distract from the message.
Think about this a little before you decide this post appears to suck.
Okay, we (testers) found a bug and figured out the repro steps. Can we log it now or should we investigate it further? Maybe there is an error logged in the client’s log file. Maybe we should also check the server error logs. And wouldn’t the bug description be better if we actually figured out which services were called just before said error? We could even grab a developer and ask them to run our test in debug mode. Or better yet, we could look at the code ourselves because we’re smart tester dudes, aren’t we?
If you thought I was going to suggest testers do all this extra stuff, you’re wrong. I've read that we should. But I disagree. We’re the tester. We test stuff and tell people when it doesn’t work. We don’t have to figure out why. If we’ve got the repro steps, it may be okay to stop the investigation right now. That’s the whole point of the repro steps! So the Dev can repro the bug and figure it out.
Look, I get it...we’re cool if we can dig deep into the inner workings of our AUT and maybe we’re providing some value added service to our devs. The problem is, we’re not the devs. We didn’t write it. Thus, we are not as efficient as the devs when it comes to deep investigation. And time spent investigating is time NOT spent on other testing. For me, everything must be weighed against the huge number of tests I still want to run.
So unless you’ve run out of good tests, don’t spend your time doing what someone else can probably do better.
What do you think?
Managers noticed testers were complaining about being too busy. So they gave Mercury QuickTest Pro licenses to us (most of us had few automation skills, if any) and told us to start automating our tests because it would free up our time. Some of the managers offered many of the "reckless assumptions" in James Bach's classic Test Automation Snake Oil article; I think the plan fell flat.
Purchasing an application like QuickTest Pro can get most testers a handful of record-playback scripts that can provide superficial value. Anything beyond that requires a significant investment in learning, time, and creativity. For starters, one must select and understand an automation framework. But even after building the coolest automation framework in the world, one is still faced with the same damn question, "What should I automate?"
I have only been working with test automation seriously for about two years but along the way my small team and I have learned enough to throw a few suggestions out as far as what tests should be automated.
- Automate tests that cannot be performed by a human. An easy place to start is with performance testing from the user perspective. How long does a given user-triggered action take from the user’s perspective? This test cannot be performed manually (unless the tester is willing to do the same thing over and over with a stopwatch and the time spans are at least several seconds). This type of test is not designed to find bugs. Instead, it is designed to collect performance information.
- Another practical way to use a test automation tool is to write scripts that simply get your AUT into the precondition state necessary to perform other tests. Again, this type of automated test is not designed to find bugs. It's not even really a test. It's more like a macro. Yet, a good tester can use it to their advantage.
- Sanity or Smoke Testing – If your AUT undergoes frequent builds with new bits or frequent builds with the same bits on different environments, automated tests can find configuration problems or regression problems. Building this type of automation library is more ambitious. And getting it to run unattended requires a seemingly infinite amount of error handling. A good place to start is getting a few positive tests to navigate through some critical areas of the AUT, performing validations along the way.
- A final answer to the question of what to automate is "nothing". Don't automate anything. On smaller, in-house or custom apps that generally run on the same environment and seldom update, it would be silly to invest any significant effort into writing automated tests.
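The first suggestion, timing a user-triggered action, is the easiest to picture in code. Below is a minimal sketch in plain Python rather than QuickTest Pro. The `time_action` helper and the `fake_user_action` placeholder are assumptions for illustration; in a real script the placeholder would be whatever call drives the AUT through the action being measured.

```python
import statistics
import time


def time_action(action, runs=5):
    """Run the action several times and summarize how long it took, in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def fake_user_action():
    # Placeholder for driving the AUT (e.g. click a button, wait for the result).
    time.sleep(0.01)


result = time_action(fake_user_action, runs=3)
print(result)
```

Nothing here passes or fails. The script only collects numbers, which matches the point that this kind of automation gathers performance information rather than hunting for bugs.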
Unless your team gets together for drinks after work, you need a better way to get to know each other than arguing over bugs. Bring in some board games or card games and invite your devs to play over lunch and watch the magic begin. Here we are playing R-Eco.
The trick is to find the right game, that supports the right number of players, hits a sweet spot between luck and decision making, is easy to learn, and lasts about 30 to 45 minutes. Rob introduced us to Euro-Games, which tend to be shorter than American games and usually involve more decision making. My favorites are the ones that scale from about 6 players down to 2 so I can also play them at home with my wife.
If your company doesn't have a BBTest Assistant license, Microsoft's free Windows Media Encoder in its Windows Media Encoder 9 Series download has an awesome screen capture to video tool and a wizard that does all the setup for you. I've been having fun attaching videos to my bug reports and since they include even more info than still screen captures, they'll hopefully increase bug turnaround.
Here's a sample video of a little MS Word bug James Whittaker describes in his book "How to Break Software". (The message indicating the index columns must be between 0 and 4 displays twice.)
As with manual tests, for each automated test we write, we must make a decision about where the test begins/ends and how much stuff it attempts to verify.
My previous post uses an example test of logging into an app then editing an account. My perfect manual test would stick with the main test goal and only verify the account edit. But if I were to automate this, I would verify both the log-in page and account edit. In fact, I would repeat the log-in page verification even if it occurred in every test. I do this for two reasons.
1. Automated verification is more efficient than manual verification.
2. Automated tests need extra steps and cannot find workarounds for bugs.
In most cases, automated verification is more efficient than manual verification. Once the verification rules are programmed into the automated test, one no longer has to mentally determine whether said verification passes or fails. Sure, one can write just as many rules into a manual test, but it still takes a significant amount of time to visually check the UI. Worse yet, it takes a great deal of administrative work to record the results of manual verifications. So much time that I often get lazy and assume I will remember what I observed.
With manual tests, we can think on the fly and use our AUT knowledge to get the correct precondition state for each test. However, automated tests do not think on the fly. Thus, we have to ensure each automated test begins and ends in some known state (e.g., AUT is closed). This forces our automated tests to have a great deal more steps than our manual tests. An upstream bug may not have much impact on a test if a human finds a workaround. However, that same upstream bug will easily break an automated test if the test author did not plan for it. Thus, multiple verifications per test can help us debug our automated tests and easily spot upstream bugs (in both our AUT and our automated test library).
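Here is a rough sketch of what multiple verifications per test might look like, written in plain Python rather than any particular automation tool. The `Verifier` class and the `app_state` dictionary are hypothetical stand-ins; a real script would read these values from the AUT.

```python
class Verifier:
    """Collects named checks so a failed test says which check failed."""

    def __init__(self):
        self.results = []  # list of (description, passed) tuples

    def verify(self, description, condition):
        # Record the outcome and keep going instead of stopping at the first failure.
        self.results.append((description, bool(condition)))

    def failures(self):
        return [description for description, ok in self.results if not ok]

    def passed(self):
        return not self.failures()


def run_edit_account_test(app_state):
    """app_state is a made-up dict standing in for values read from the AUT."""
    checks = Verifier()
    checks.verify("log-in page is displayed", app_state["login_page_shown"])
    checks.verify("user is logged in", app_state["logged_in"])
    checks.verify("account edit is saved", app_state["account_saved"])
    return checks


# Simulate an upstream bug on the log-in page while the account edit still works.
outcome = run_edit_account_test(
    {"login_page_shown": False, "logged_in": True, "account_saved": True}
)
print("PASS" if outcome.passed() else "FAIL", outcome.failures())
```

Because every check is recorded by name, the failed run points straight at the log-in page rather than leaving someone to guess whether the account edit or the test script itself is broken.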
Let’s say each of our tests includes multiple verifications. If Test#1 has an overall PASS result it tells us all things verified in Test#1 worked as expected. I’m okay with this. However, if it gets a FAIL result it tells us at least one thing did not work as expected. Anyone who sees this failed test does not really understand the problem unless they drill down into some type of documented result details. And how do we know when we can execute this test again?
The simpler our tests, the easier they are to write, and the fewer working things they depend on to execute. I'll use an exaggerated example. The following test verifies a user can log in and edit an account.
1. Log in. Expected: Log in page works.
2. Edit an account. Expected: Account is edited.
What is this test really interested in? What if the log in page is broken but you know a workaround to get you to the same logged in state? Can you test editing an account? And if it works should this test still pass?
My ideal manual test is structured as follows (I’ll discuss automated tests next week).
Do A. Expect B.
It has one action and one expected result. Everything else I need to know prior to my action is documented in my test’s Preconditions. This helps me focus on what I am actually trying to test. If I discover upstream bugs along the way, good, I’ll log them. But they need not force this specific test to fail. Get it? Have you thought about this? Am I missing anything?