
Smoke Testing vs. Sanity Testing: What You Really Need to Know

If you spend any time in forums in which new testers can be found, it won’t be long before someone asks “What is the difference between smoke testing and sanity testing?”

“What is the difference between smoke testing and sanity testing?” is a unicorn question. That is, it’s a question that shouldn’t be answered except perhaps by questioning the question: Why does it matter to you? Who’s asking you? What would you do if I gave you an answer? Why should you trust my answer, rather than someone else’s? Have you looked it up on Google? What happens if people on Google disagree?

But if you persist and continue to ask me, here’s what I will tell you:

The distinction between smoke testing and sanity testing is not generally important. In fact, it’s one of the most trivial aspects of testing that I can think of, offhand. Yet it does point to something that is important.

Both smoke testing and sanity testing refer to a first-pass, shallow form of testing intended to establish whether a product or system can perform the most basic functions. Some people call such testing “smoke testing”; others call it “sanity testing”. “Smoke testing” derives from the hardware world; if you create an electronic circuit, power it up, and smoke comes out somewhere, the smoke test has failed. Sanity testing has no particular derivation that I’m aware of, other than the common dictionary definition of the word “sanity”. Does the product behave in some crazy fashion? If so, it has failed the sanity test.
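As a minimal sketch of what such a first-pass, shallow check might look like in code (call it a smoke test or a sanity test, as you please), consider the Python example below. Everything in it is an assumption made for illustration: `add` is a hypothetical stand-in for the product under test, and `run_first_pass` and the check names are invented here, not taken from any particular tool.

```python
from typing import Callable, List, Tuple


def add(a: int, b: int) -> int:
    """Hypothetical stand-in for the product under test."""
    return a + b


def run_first_pass(checks: List[Tuple[str, Callable[[], bool]]]) -> List[str]:
    """Run each shallow check and return the names of the ones that failed.

    A check counts as failed if it returns False or raises an exception.
    """
    failures = []
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures


# Two deliberately shallow checks: does the product respond at all,
# and is a basic result free of obviously crazy behaviour?
checks = [
    ("responds at all", lambda: add(0, 0) is not None),
    ("basic result is not crazy", lambda: add(100, 50) == 150),
]

failures = run_first_pass(checks)
print("first pass:", ("FAILED " + ", ".join(failures)) if failures else "passed")
```

Nothing in the sketch depends on which label you give it; the point is only that the checks are few, quick, and aimed at the most basic functions.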

Do you see the similarity between these two forms of testing? Can you make a meaningful distinction between them? Maybe someone can. If so, let them make it. If you’re talking to some person, and that person wants to make a big deal about the distinction, go with it. Some organizations make a distinction between smoke and sanity testing; some don’t. If it seems important in your workplace, then ask in your workplace, and adapt your thinking accordingly while you’re there. If it’s important that you provide a “correct” answer on someone’s idiotic certification exam, give them the answer they want according to their “body of knowledge”. Otherwise, it’s not important. Don’t worry about it.

Here’s what is important: wherever you find yourself in your testing career, people will use language that has evolved as part of the culture of that organization. Some consultancies or certification mills or standards bodies claim the goal of providing “a common worldwide standard language for testing”. This is as fruitless and as pointless a goal as a common worldwide standard language for humanity. Throughout all of human history, people have developed different languages to address things that were important in their cultures and societies and environments. Those languages continue to develop as change happens. This is not a bad thing. This is a good thing.

There is no term in testing of which I am aware whose meaning is universally understood and accepted. There’s nothing either wrong or unusual about that. It’s largely true outside the testing world too. Pick an English word at random, and odds are you’ll find multiple meanings for it. Examples:

  • Pick (choose, plectrum for a guitar)
  • English (a language, spin on a billiard ball)
  • word (a unit of speech, a 32-bit value)
  • random (without a definite path, of equal probability)
  • odds (probability, numbers not divisible by two)
  • multiple (more than one, divisible by)
  • meaning (interpretation, significance)

Never mind the shades and nuances of interpretation within each meaning of each word! And notice that “never mind”, in this context, is being used ironically. Here, “never mind” doesn’t mean “forget” or “ignore”; here, it really means the opposite: “also pay attention to”!

Not only is there no universally accepted term for anything, there’s no universally accepted authority that could authoritatively declare or enforce a given meaning for all time. (Some might point to law, claiming that there are specific terms which have solid interpretations. If that were true, we wouldn’t need courts or lawyers.)

If you find yourself in conversation (or in an interview) with someone who asks you “Do you do X?”, and you’re not sure what X is by their definition, a smart and pragmatic reply starts with, “I may do X, but not necessarily by that name.” After that,

  • You can offer to describe your notion of X (if you have one).
  • You can describe something that you do that could be interpreted as X. That can be risky, so offer this too: “Since I don’t know what you mean by X, here’s something that I do. I think it sounds similar to X, or could be interpreted as X. But I’d like to make sure that we both recognize that we could have different interpretations of what X means.”
  • You can say, “I’d like to avoid the possibility that we might be talking at cross-purposes. If you can describe what X means to you, I can tell you about my experiences doing similar things, if I’ve done them. What does X mean to you?” Upon hearing their definition of X, then truthfully describe your experience, or say that you haven’t done it.

If you searched online for an answer to the smoke vs. sanity question, you’d find dozens, hundreds of answers from dozens, hundreds of people. (Ironically, the very post that introduces the notion of the unicorn question includes, in the second-to-last paragraph, a description of a smoke test. Or a sanity test. Whatever.) The people who answer the smoke vs. sanity question don’t agree, and neither do their answers. Yet many, even most, of the people will seem very sure of their own answers. People will have their own firm ideas about how many angels can fit on the head of a pin, too. However, there is no “correct” definition for either term outside of a specific context, since there is no authority that is universally accepted. If someone claimed to be a universally accepted authority, I’d reject the claim, which would put an instant end to the claim of universal acceptance.

With the possible exception of the skills of memorization, there is no testing skill involved in memorizing someone’s term for something. Terms and their meanings are slippery, indistinct, controversial, and context-dependent. The real testing skill is in learning to deal with the risk of ambiguity and miscommunication, and the power of expressing ourselves in many ways.

19 replies to “Smoke Testing vs. Sanity Testing: What You Really Need to Know”

  1. you’re right, there’s no one definition. But as far as I’m concerned, you can only define smoke relative to sanity and vice versa. For example, if your sanity is comprised of the steps X1, X2, Y1, Y2, your smoke test will be only X & Y…

    Michael replies: I’m going to approve only one comment like this, and I’m going to reject anything else that looks even remotely like it. I’m going to leave this one as an example of the fact that even after someone explicitly makes the point that there’s no inherent meaning in the terms, someone else will immediately come along and try to establish theirs without explaining why or how the distinction is helpful to them or to anyone else. Moreover, they’ll often do so without explaining or giving a basis for the distinction or for the assumptions and vocabulary underlying it, which is even more unhelpful. You can see this sort of pattern repeated on testing forums the world over.

    Testers: there is no reason to pay attention to this kind of noise pollution.

  2. This kind of question presupposes that there’s some commonly accepted oracle for All Things Testing. An appeal to authority. A desire to Get It Right(tm).

    Where I am, as test manager, I have definitions for smoke and sanity testing (both are light, smoke lighter than sanity) because we needed labels for two layers of initial “is this thing basically borked or not” testing we do.

    Michael replies: This falls into the situation I note above: “Some organizations make a distinction between smoke and sanity testing; some don’t. If it seems important in your workplace, then ask in your workplace, and adapt your thinking accordingly while you’re there.” Mind you, I might hold out for Borked Level 1 and Borked Level 2, but that’s mostly because I like the word “Borked”. But then we’d only begin another round on what constitutes Level 1 and Level 2, ad infinitum.

    I sure hope nobody here decides that This Is How Testing Should Be Done!

    Me neither, but it’s completely okay for them (you; anyone) to decide This Is How Testing Should Be Done Here.

  3. I often need a small cricket on my shoulder whose job it is to keep perspective and at times remind me of the difference between asking after unicorns and asking after the giant beasts with a single horn out to gore me.

    Michael replies: I’d start by getting out of the way of the latter, if there’s one bearing down on you. Screw the cricket. Also, crickets on your shoulder might represent a whole new class of problem.

  4. Michael, by a strict definition, if the question really was a “Unicorn Question”, then the answer you gave should be worthless.

    You might like to look at the original unicorn question post again. It gives an example of providing a helpful reply that addresses the underlying issue while pointing out that the direct answers to the questions, as posed, are useless.

    But you say that the words “Smoke/Sanity test” do point to something that is important. That doesn’t sound worthless to me.

    Is there any chance you’re confusing the significance of the distinction between smoke and sanity (near zero, except in context) with the significance of the underlying issue (nowhere near zero)?

    Given that the words ‘sanity/smoke tests’ do refer to something worthwhile, isn’t the value to the newbie tester in this case to be told that (in your experience) there is no real difference between them?

    Is there any chance you’re confusing the value of the answer to the question (near zero) with the value of the discussion arising from the question (much greater than zero)?

    In a similar way, if a child or non-native English speaker asked me “What size unicorn do you wear”, I would simply assume they did not understand the (generally accepted) meaning of the word “Unicorn” and I would try and clarify it for them.

    That seems to me to be the most useful answer I can give in the context. If it actually turns out that the person believes that unicorns come in different sizes and can be worn, good luck to them.

    Yeah, that’s sort of my whole point here.

  5. The only use I can find for any debate over smoke testing vs sanity testing is to explain why software testers should not draw too much from the world of hardware testing.

  6. One of the reasons we keep on hitting the wall with such questions, is that in a world where someone from China can easily share information and discuss issues with someone from Australia or Canada, almost every 2nd term used in the discussion, is understood differently by each of them. That makes discussing quite challenging, and opens a huge door for miscommunication.

    Michael replies: That’s also true where someone from one cubicle can easily share information and discuss issues with someone from the next cubicle.

    Harry Collins refers to this activity he calls repair— the capacity of humans to fix up communication and information in various ways to suit some purpose. We use tacit knowledge to help us fill in the blanks or make the necessary corrections. We do this constantly, and we do it so well that we’re not aware we’re doing it. Indeed, we tend to notice it far more often when it results in a problem or a mistake. Most of the time it works so smoothly we don’t notice it at all.

    One key difference between the China-Canada case and the cubicle-next-door case is that presence—real, physical presence—affords greater opportunity for detecting certain kinds of communication problems more quickly and accurately, and repairing them more appropriately. When we’re face to face, the puzzled expression, the hesitant reply, the shuffling of the feet, the rapid-fire exchange of refinements and amplifications in the dialogue can all happen much more quickly, more easily, and more observably.

    We all hope we can solve that (at least many of us do), but we probably can’t, so we need to find other means for clear communication (as we will not clarify each term we use there and then). And maybe the reason is that we are testers, and like to investigate the finer details of everything.

    Surprisingly, though you have pointed out so many English words that might be ambiguous, we do find the means to use the language to communicate quite well in other fields of life.

    Collins highlights this in The Shape of Actions (and also to some degree in Tacit and Explicit Knowledge). We do an immense amount of repair work as we provide input to and receive output from machines and the programs that run on them, and it stands to reason that we would be doing that same kind of work as we plan and discuss building and testing those machines and programs. Humans have the advantage of being able to watch and listen to each other, to get clues and cues from our behaviour and our environment, and not just from our spoken words and written documents.

    I’m not a native English speaker, but still (I assume 🙂 ) I did get the point you wished to point out…
    Probably since you didn’t use any testing terms in this post.

    Anyhow – it was fun to read, since we just had that same discussion here where I work.
    Trying to find a common language within a rather small and local R&D group.
    No matter what we will define – just as long as when we do use it, we all mean the same thing.

  7. Very nice and interesting post. I can see some valuable comments and discussion. Can we have more such posts on localization and globalization testing services? Looking forward to your next post.

  8. Thanks, for another interesting and thought provoking post!

    For me the issue seems to be about defining vs. describing.
    I mean, “defining” is about explaining the meaning of a term (ex. sanity testing) and setting it as a standard.
    “Describing” is analyzing and describing how language is spoken by a group of people and agreeing on a version of it.

    For me it seems they should not be contrary but complementary.

    We do have a bunch of definitions in other fields. For example we have dictionaries (that contain definitions) for medicine, physics, marine, etc. And as far as I know they are proven to be useful.

    Of course as language evolves and the world around us evolves the use of terms also changes.
    At least, for me it seems useful to have some kind of dictionary where to look up definitions and later if necessary change/redefine the meaning of it for a certain context.

  9. Yes, generally smoke and sanity testing are used interchangeably. And these sorts of questions about terminology are often not very useful – ie, so long as all parties on a project agree on and understand the terminology, it doesn’t matter a whole lot. But I believe the terms originally referred to things that were a bit different. As was said above, smoke testing originated from the practise of plugging in a circuit board to see if it smoked. But I believe sanity testing originated from the idea of seeing if the product produces an output that could not possibly be correct, and it’s immediately pretty obvious that this is so (ie the result is insane). A simple and slightly silly example (that does illustrate the point) would be that the system is supposed to add two numbers together, so 100 and 50 are input, and it outputs 6,000,000.

  10. We work with electronic devices; most of these devices are not released for public use. We have defined “smoke testing” as the set of tests that we execute to check if the device emits smoke (such as vary power input to the device, play around with battery power, etc.). On more than one occasion, the device tends to go up in smoke :).

    We have also defined “sanity testing” in our teams. On many occasions, testers have to test a product for long periods (such as watching a video for more than 6 hours to observe the ability of the device to handle such videos and high resolution/volume, etc.). There comes a point at which the tester tends to lose his sanity (it has happened to me multiple times). We call the tests that can potentially cause us to lose our sanity “Sanity Tests”.

    How I wish that the Google SEO picks up these definitions from us 🙂 111

  11. Needed to compose one simple thing yet thanks for the suggestions that you are contributed here. Would like to read this blog regularly to get more updates regarding software testing types

  12. Interesting post. Most of the time we get confused between the meanings of sanity testing and smoke testing. First of all, these two kinds of testing are way “different” and are performed during different stages of a testing cycle.
    Smoke testing means to verify (basic) that the implementations done in a build are working fine.
    Sanity testing means to verify the newly added functionalities, bugs etc. are working fine.
    Smoke testing- This is the first testing on the initial build. Done on every build.
    Sanity Testing – Done when the build is relatively stable. Done on stable builds post regression.

    Michael replies: This works fine as long as you say “where I work” or “in my community”. That’s the point of the post.

  13. Very interesting analysis and clarification. It made me think about how we go along with the organisation’s culture: you adapt, and that adaptation becomes a definition.
    In my experience, I have always struggled to provide a straight answer to that question (the difference between smoke and sanity testing), so here is what we use the terms for.

    Smoke Testing
    In a release in DEV, QA and UAT, we do smoke testing, meaning a short bunch of tests in which we CAN enter dummy data.

    – Execute a bunch of happy path short tests.
    – A bunch of short negative tests.
    – Targeted regression tests.

    Sanity Checks
    In a release to production in STAGING (or PREPROD) and PROD, we do a sanity check, meaning a short bunch of spot checks in which we can NOT enter any data whatsoever.

    – Check if we can load the system.
    – Check for error logs/sdk messages.
    – Check for any error on interfaces, mailboxes, queues.

    Notice that I have not mentioned exploratory testing, as this is standard in any context at any time; I don’t just stick with the scripts, but that is a whole new conversation for another discussion elsewhere.

