Blog Posts from March, 2009

World Agile Qualifications Board; God Help Us

Monday, March 30th, 2009

The World Agile Qualifications Board should be seen as an embarrassment even to regular peddlers of certification.

The WAQB has apparently been established recently—but when? By whom? There is a Web site. I hesitate to link to it, because I don’t want people to see it… but on the other hand, I do want people to see it, so that they can observe from the outset how these things work.

Note that there is no information to be found on who the World Agile Qualifications Board is. No human’s name appears on the Web site. None of the people whom I respect in the field of Agile development (many), nor any of the people that I don’t respect (somewhat fewer, but still numerous) appear to be willing to identify themselves with this shadowy organization, either on the site or in other forums. There is a LinkedIn group; see below.

Go to the site and you’ll see just how comical this gets. Under the link “Find Training and Certification”, those reading as I write (March 29, 2009) will see a graphic that includes the logo for the London Underground (this trademarked logo is doubtless used without the permission of Transport for London) and the words “After London, you deside where to go next” and “Lets hear from you and your team”. Then there’s a list of countries, preceded by the suggestion “Clik on your chose:” (Yes, everything misspelled above is really spelled that way in the original.)

The WAQB is, apparently, offering a course certificate in Agile Testing. The fee for the course is £990.

What is the course about?

The course provides basic knowledge of agile testing. After the course there is the opportunity to sit an examination for the WAQB-T Agile Testing Foundation Certificate. Agile test is gaining recognition as a specialized field. In thedevelopment of systems and software, testing can account for 30-40% of the total cost. It is possible to reduce this cost significantly and still achieve improved quality by adapting agile testing mindset

Who should attend?

This is for developers and tester. We mention developer’s upfront, because this is NOT only for traditional testers The course is for anyone involved in agile development. It is worthwhile for project leaders and developers, who need an introduction to agile testing or who test software.

The text above, errors and all, including the missing periods, is cut and pasted from the original. That is, I’m not making any of this up. Here’s a little more snipped from the site:

Express courses 2 days* : Participants who hold another formal certification like: Scrum Master, ISTQB/ISEB Foundation, PMP, PMI, Prince2 or similar a 2 days intensive courses is available.

The WAQB-T Foundation Certification – WAQB-T Agile Testing Foundation Certificate – statup May 2009

The WAQB-D Foundation Certification – WAQB-D Agile Development Foundation Certificate – statup May 2009

The WAQB-M Foundation Certification – WAQB-M Agile Team Member Foundation Certificate – statup May 2009

As I mentioned above, there is a LinkedIn group. I am the 27th member. No Agilist that I recognize is a member, and very few members of the list identify themselves as members of other Agile-related LinkedIn groups. No one in my survey of the list members (hey, there are only 26 others, so I looked at most of them) appears to make any claims related to the founding of or involvement with the WAQB.

There is a review board, and you too can join!

WAQB will use the techniques from Open Source to ensure that the quality of the syllabus is of high quality.

You can now become member of the Review Board. As a member of this board you will be asked to review topics in the syllabus. But you will also have the opportunity to suggest new topics, or change of topics.

In this way WAQB are sure that we will get a high quality of the syllabus and course material at any time. The fast changing agile world will force us to be up front and to work in new stuff often.

…but not quite up front enough to identify ourselves from the get-go.

In my opinion, all this shows signs of the WAQB being a scam, and a racket. My opinion is that of an experienced tester, a member of several testing communities, and a teacher of and consultant in testing. I consider the WAQB to be a racket even worse, even more transparent, even more nakedly a way to separate people from their money than the usual certification schemes. Everyone is free to make his own decision, but I believe that one would be a fool to have his (her) pocket picked by these people (or this person). And you’re not a fool, right?

In fact, you’re an upstanding member of your community, and so you will warn your colleagues and your managers, and everyone else who might innocently or uncritically seek or support certification, agile or otherwise, that isn’t skills-based—the kind of certification that is roundly and rightly dismissed by many thoughtful people, including Elisabeth Hendrickson, James Bach, Tom DeMarco, and the Agile Alliance itself. Oh, and by me.

Requirements Development

Thursday, March 26th, 2009

Scott Berkun has an interesting post on requirements, and why they’re problematic. He says that “data collection” isn’t the issue. I agree with that. He also refers to “requirements gathering” (to me, not so good) and “requirements definition” (a little better, perhaps).

I like to think of this whole business not as requirements gathering, but as requirements development. (I credit Ian Heppell for pointing this out to me, but I believe that Karl Wiegers has said something along the same lines.) Some people think and write as though requirements are there to be gathered, like pretty stones or ripe blueberries in more-or-less plain sight. That’s often true of desires; they’re often visible right on the surface. But many requirements aren’t obvious at the start; they become clear only as we learn more about what we’re developing, about what’s feasible and what’s unachievable. Testing often reveals new information that informs new thinking about what is required. In just about every case, some stakeholders’ desires will conflict. That’s why requirements must be discovered, revealed, prioritized, negotiated, and decided as the project continues.

Scott suggests giving authority over requirements to one person per major area. Even then, there’s a good chance that some requirements will conflict when they involve several major areas. As much as I’m a fan of collaboration, delegation, and self-organizing teams, at a certain point somebody’s signing the cheques. For good or ill, that will be the person to decide (or to delegate the decision to some other person) on what is a real requirement and what isn’t.

There was another interesting aspect to the post. In the comments, PSJ provides an example of a common problem. He (she?) says “Certainly being given some control and ownership over the requirements document would have made my life easier in the past.”

Now, you might think that the problem here is lack of control—and I’d agree, but I see another problem: Note how easily “requirements document” and “requirements” seem to slip into one another.

With respect, I’d suggest that someone (perhaps PSJ him/herself, but certainly the person who owns the project) needs control and ownership over the requirements, not merely the requirements document. The map is not the territory.

On Indispensable People, Documentation, and Skill

Tuesday, March 17th, 2009

In a blog post on The Test Eye, Martin Jansson has some things to say about the dangers of The Indispensable Worker. The post is worth reading. I commented there, and do a somewhat better job of it here:

Your point about indispensability is well-taken. In workshops that I’ve attended, Jerry Weinberg has often pointed out the urgency of getting rid of the problem of indispensability. If someone appears to be indispensable, it’s a great risk to the organization; it either has become or will become severely maladapted to existence without that person. This isn’t to say that people won’t be missed when they go; everyone carries tacit knowledge and experience that no one else has. But whether someone will be missed or not, their departure shouldn’t destroy the organization. An “indispensable” person will disappear eventually, one way or another. So if you see the problem, get started on addressing it now.

There was a point with which I disagree, though—at least with the emphasis:

As a co-worker you avoid these traps by requiring documentation enough for someone else to perform the task or that you have at least a backup for the critical tasks.

I’d put that in the opposite order. If you’ve got a backup (in the form of a person who can do that task), then you might not need documentation at all to get the job done; and you might have a better way of performing the tasks that the job requires in the high-pressure moments. This is why commercial airlines tend to have a captain and a first officer in the cockpit, rather than a pilot and a book on how to fly an aircraft.

Note also that there are several relatively high-pressure moments on every flight. Just because something could be done by a single person doesn’t mean that it’s automatically better policy to have only one person doing it. On the contrary, in most cases.

What’s James on about THIS time?

Tuesday, March 10th, 2009

James Bach reports on a statement allegedly made by Yaron Sinai, the CEO of Elementool, and Joseph Ours comments.

In light of Joseph’s comment, I too wonder if Elementool was tested by human testers under the control of their own process. Perhaps it was tested by testers who strictly followed the steps, which Mr. Sinai apparently suggested was all that was necessary.

If the latter, then Joseph’s experience can be explained in terms of W. Ross Ashby’s general systems law, The Law of Requisite Variety. This law suggests that a system with N states cannot be controlled or understood by a system with fewer than N + 1 states. There’s a variation of this law in Karl Weick’s advice that if you want to understand something complicated, you have to complicate yourself (Sensemaking in Organizations, yet another wonderful testing book that isn’t about testing).

One key to excellence in testing is not only to understand testing itself, but also to explore and investigate widely diversified interests and skills so that new learning comes to the field.

If Elementool was indeed tested by unskilled human testers, Mr. Sinai’s dismissal of the role of interactive testing performed by humans is perhaps explicable. The problem Joseph reports would seem like a pretty difficult bug to slip past skilled testers. (Yet one never knows. Perhaps it was a bug introduced moments before a product manager, under market pressure, decided to ship the product without a review, a unit test, a more general functional test, or a targeted retest of that area. Perhaps there’s a platform-based problem, based on an environment that Joseph has and that the Elementool people don’t. Perhaps it was a known bug that only happens in the exact case outlined by Joseph, and was deemed insufficiently important to fix. But in such cases, you run the risk of being embarrassed by a tester like Joseph. Or me. On the other hand, maybe you don’t embarrass easily.)

On the page, I added a second user to my demo account. I deactivated the user by clicking the associated checkbox and pressing the Update button. Upon returning to that page, I clicked the checkbox to reactivate the user. I got this:

Every time I reproduced the error, I got a different error code. This is surprising and interesting; it was true when I reproduced Joseph’s problem too. I would have expected a consistent error code per problem. It may be that the error code refers to a specific incident logged in Elementool’s tracking system; or it may be that there’s no mapping between the error code and anything useful; or there may be some other explanation.
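That observation is itself checkable. A minimal sketch of the idea, in Python: repeat the same reproduction steps and tally the codes that come back. (The `collect_error_codes` helper and `flaky_reproduce` stand-in are hypothetical; they are not Elementool’s API, just a simulation of a product that emits a fresh code on every failure.)

```python
import itertools
from collections import Counter

def collect_error_codes(reproduce, attempts=5):
    """Run the same reproduction steps several times and tally
    the error codes that come back. A consistent product would
    yield one code; here we expect several."""
    return Counter(reproduce() for _ in range(attempts))

# Stand-in for the real reproduction steps: simulate a product
# that emits a new, apparently meaningless code on each failure.
_codes = itertools.count(1000)
def flaky_reproduce():
    return f"ERR-{next(_codes)}"

tally = collect_error_codes(flaky_reproduce, attempts=5)
# Five attempts, five distinct codes: evidence that the code may
# identify the incident (or nothing at all), not the underlying problem.
print(len(tally))  # prints 5
```

If every attempt yields a distinct code, the code probably keys an incident log rather than a defect; if one code repeats, it probably maps to the underlying problem.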

I was curious about the quote in James’ blog post, and I wanted to check out Jeff Feinman’s other articles on SDTimes’ Web site. I did a site search. Ah, here’s one:

Clicking on the underlined link calls up the URL listed below the summary. That returns a page of today’s top stories, not the article I’m looking for. The URL in my address window is now (my emphasis added there).

So I found another Jeff Feinman article instead. That one, headlined “Serena computerizes agile development” starts “March 6, 2009 — Serena Software is putting agile development into a software-as-a-service model, but the company admitted that transforming such a human-oriented process into a computerized one wasn’t an easy task.” For a second time in a few days, Mr. Feinman uses a peculiar form of expression to confuse a process and a tool that aids that process.

The article includes this little gem: “The No. 1 objection that people have against SCM tools in general is that it’s a tool,” said René Bonvanie, senior vice president of marketing for Serena. “Developers hate tools. Anything that forces process or input, they hate. So we needed to build something that was computerized but super simple to use, and very intuitively usable to people who are used to whiteboards.”

Well, at least there’s some consistency. The marketers for Elementool and Serena provide stupid, sweeping generalizations to offend members of the testing and programming communities equally. It’s odd that Mr. Feinman would publish such remarks uncritically, and that no developer, no member of the Agile community has as yet seen fit to object. QiD.

Update, 2014-09-10 This post was updated to correct a graphic problem, to get rid of some Blogger-related cruft, and to fix an inaccurate reference, pointing it properly to Weick.

IMVU: The Final Chapter

Sunday, March 8th, 2009

Perusing my Blogger page, I suddenly realize that I never posted this final wrap-up to my original observations on IMVU and its 50-deployments-a-day approach, plus the comments here, here, here, and here.

In addition to his Quality is Dead article, James has added his perspective on IMVU specifically.

Is there a problem at IMVU? As an outsider, is this any of my business anyway? I think so, at least to some degree.

  • I wouldn’t use the service myself. That’s reasonable; it’s not my thing. My stepson might be inclined to use the service at some point, and might be inclined to buy special services or other stuff from them, at which point he’d ask me to pay for it. With my credit card. And then we’d have a problem, because I don’t have sufficient trust in IMVU to give them my credit card information, based on everything I’ve seen so far.
  • IMVU is a service that seems to appeal mostly to kids. I wonder about the ethics of deploying what seems to be some pretty shoddy software to kids, and thereby teaching them that software doesn’t have to work, that their complaints can go unresolved, and that it’s routine and tolerable for their accounts to be vulnerable to hacking, as long as the programmers are deploying 50 times a day. As a person involved in developing software, that makes me sad, and I think, in a small way, it tarnishes all of our reputations.
  • In particular, it stands a good chance of tarnishing the reputation of the Agile community. The Agile Manifesto is a beautiful thing, and I believe in it. But agility, in the dictionary sense, isn’t merely about quick movement; it’s also about being in control, responding to what’s happening in your environment, and keeping your balance. If 50 deployments a day of this stuff is seen as an exemplar of Agile methods, we haven’t crossed the chasm; we’ve dived into it.
  • IMVU and its issues are one thing, but what really freaks me out about this whole business is the obliviousness to testing among the respondents to Timothy Fitz’s post, and the notion that automation makes testing passé. To me and to my colleagues—people who think seriously about testing, and who study testing—this whole 50-deployments-a-day and a million-automated-tests-a-day thing isn’t testing at all. I repeat, from an earlier post: at best, it’s checking, and it’s checking done almost entirely without inquiry. That, to me, is a corruption of the idea of testing, which is questioning a product in order to evaluate it (Bach), gathering information with the goal of informing a decision (Weinberg), or (deep breath here, but worth it) an empirical technical investigation of a product, done on behalf of stakeholders, with the intention of revealing quality-related information of the kind that they seek (Kaner).

It’s IMVU’s business; it’s their call. I hope it’s not contagious, but I fear that it might be, for a while.

Still more IMVU comment followup: The Final Chapter (so far)

Sunday, March 8th, 2009

Markus Gärtner commented, in part, Actually what seems to be missing is the pride and responsibility in the software world. From my point of view I can pile up a lot of technical debt, but deliver really fast.

I’m not so sure if pride is missing. Timothy Fitz seemed to be proud of the work that he was doing. Moreover, hubris is a form of pride, and there seems to be no shortage of that in our business.

You raise an important point on technical debt. It’s bad enough for us to be running up our credit card bills, but when we don’t know how much debt we’re running up, that’s really dangerous. I see this with every company that makes a conscious decision not to test things that might be important.

You cannot expect to improve your software process by one order of a magnitude. This is unrealistic, stupid and dangerous.

Right. Apropos of that: ever noticed the claims made by the tool vendors?

Ben Simo weighs in: I think that users have become so accustomed to bad software that they don’t notice the bugs — for long. Software users are accustomed to finding workarounds and the quickest way to make errors go away.

Yes, and I think this is amazing. Part of it has to do with the magical aura that we have around computers; any sufficiently advanced technology is indistinguishable from magic, as Arthur C. Clarke said. Magic dazzles us into non-reasoning. Here’s an example that I’ve been using for about 20 years:

You’re about to go on a trip. You take your car, which is running fine, to the mechanic for a checkup. Because he leaves work at 6:00pm, and you can’t get there until 6:30, you agree that he’ll leave your keys with the cashier at the gas bar. At 6:30 you arrive and pick up your keys. You get into the car, put the key in the ignition, and turn. Your car makes a noise (yawp!) and then suddenly goes dead. Now: is your reaction “Aw geez, I’m so lame. I’ll never understand these complicated car thingies. It must be something I did.” Or do you feel anger and resentment towards the mechanic?

It used to be that users of bad software blamed themselves, but now I think that most people simply don’t expect things to work. We’ve all become like Sam Lowry in Terry Gilliam’s Brazil (the relevant scene starts at 3:34 or so in the link above, but if you haven’t seen the movie, for heaven’s sake, rent it). To top it off, when we call Central Services, we get that wonderful message: “Thank you for calling Central Services. I’m sorry due to temporary staff shortages, Central Services cannot take service calls centrally between 23:00 and 09:00 hours. Have a nice day. This has not been a recording.”

Raoul Duke: wow. you 2 are my heroes. completely how i would feel about that kind of thing. it absolutely cracks me up that you were talking to people and asking them if they found bugs when i’m sure they just were 12 year olds looking to cyber. awesome. i want to hire you all but i’m just a nobody.

Raoul, just a reminder: the pseudonym you’ve taken is one assumed by a man who wrote vigorously, at length, and, above all, clearly.

More IMVU comment followup: Timothy Fitz’s reply

Saturday, March 7th, 2009

In response to my post on IMVU, I was delighted to receive a reply from Timothy Fitz, whose original blog entry triggered my investigation.

There are many things to like about Timothy’s reply. First of all, it’s honest and forthright. Second, he seems not to be taking personally the criticism of the product that he and his company have been working on for years. It’s a rare skill not to take things personally. So, thank you, Timothy, for that.

He begins:

I would like to clarify, we do have a Quality Assurance staff. There are numerous quality engineering tasks that only a human can do, exploratory testing and complaining when something is too complex come to mind. They’re just not in the “take code and put it into production” process; it can’t scale to require QA to look at every change so we don’t bother. When working on new features (not fixing bugs, not refactoring, not making performance enhancements, not solving scalability bottlenecks etc), we’ll have a controlled deliberate roll out plan that involves manual QE checks along the way, as well as a gradual roll-out and A/B testing.

When someone complains that something is too complex, I’ve been trained by Jerry Weinberg and his work to ask: too complex compared to what? In the same way, when someone says that it doesn’t scale to have testers look at every change, I ask why it can’t scale. Programmers make every change, don’t they? After all, “it can’t scale” is one of the great myths about the Agile Manifesto. There’s a difference between “it’s too complex” and “I don’t have a more useful model than the one I currently have”; between “it can’t scale” and “I don’t know how to solve the problem of making it scale.”

One approach to solving complexity or scaling problems is to reconsider what testing is and where it happens. Pair programming in which no tests are written is a form of testing (we often call it “continuous review”, but review is a form of testing). Behaviour-driven development, in which we check as we build that each function does, at least to some degree, what it should, is a form of testing. And continuous deployment is a form of testing, too.
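To illustrate that behaviour-level kind of checking: a check is just a small executable expectation about one function, confirmed as it’s built. A generic sketch (the `normalize_username` function is made up for illustration; it’s not from IMVU’s code base):

```python
def normalize_username(name: str) -> str:
    """Trim whitespace and lowercase a username before storing it."""
    return name.strip().lower()

# Behaviour-style checks: one explicit expectation per behaviour,
# confirmed as the function is built. Each assertion is a check --
# it confirms what we already expect, which is useful, but by
# itself it asks no new questions of the product.
assert normalize_username("  Alice ") == "alice"
assert normalize_username("BOB") == "bob"
print("checks pass")
```

Checks like these are cheap to run on every change, which is exactly why they belong in the programmers’ workflow rather than in a separate QA gate.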

One definition of testing is “the gathering of information with the intention of informing a decision” (that’s paraphrased from Perfect Software and Other Illusions About Testing, by Jerry Weinberg). The complaint that something is too complex is information. Your testers are telling you about the testability of the product, compared to their ability to test it. There are all kinds of details to the story to which we’re not privy. Maybe they believe that they have to retest everything in the product every time the programmers make a change; maybe they believe that testing means seeing the visible manifestation of every change from the user’s point of view; maybe there are insufficient hooks for the kind of automation they want to apply; maybe they are being mandated to do other things that impinge on their ability to study and grasp the issues that they’re facing; or maybe they’re the creditors on some of the programmers’ technical debt, and the number of bug reports that they have to investigate and report is taking time away from their test design and execution—that is, their test coverage.

There are constraints to every testing assignment, and as Jerry says (quoted in James Bach‘s article in The Gift of Time), it’s the first responsibility of the tester to figure out ways to get around those constraints. But that may not always be possible, given the situation.

Another form of testing is asking questions, like “if you don’t involve testers when you fix bugs, make performance enhancements, solve scalability bottlenecks, etc., how do you know that you’ve fixed, enhanced, or solved?” And others like, “What are your testers doing?” “Are they only testing new features?” “Are you aware of how useful skilled testers can be?” “Do you see any opportunities for adding efficiencies to your model of testing?”

Your point about the sheer number of bugs we have? you’re right. Our software has way more bugs than I’d like. It’s a byproduct of the choices made when the company was small: to prioritize determining what the customer actually wants at almost any cost. We would absolutely love to have a high quality client, and we’re working every day towards that goal.

Continuous Deployment let’s you write software *regression free*, it sure doesn’t gift you high quality software. As a start-up, we’re faced with hard decisions every day about where to make our product higher quality; most of the complaints you have would be immediately ignored by the majority of computer users and so we can’t in good faith prioritize them over the things that ARE causing our users trouble.

I’ll respond to the things I disagree with in a moment, but I want to draw attention to the most important aspect of Timothy’s reply: he highlights that developing software and services and products and systems is a constant set of tradeoffs, and that, just like the rest of us, he and the rest of the IMVU crew are making these decisions all the time. That’s important because, as I’d like to emphasize, my notion of IMVU’s quality doesn’t matter. “Quality is value to some person(s)”. When James and I teach Rapid Software Testing, we add something to that: “Quality is value to some person(s) who matter“. I’m an outsider. I don’t have any interest in using chat illustrated by anime-looking 3D avatars who teleport from place to place. I have no interest in handing IMVU my money for this service. I have no interest in the community this stuff supports. (So why did I even bother to look at the service? I am interested in software development and testing, and I wanted to investigate the relationship between a million test cases a day, a million dollars a month in revenue, and the system being tested thereby.)

I’m going to introduce something perhaps more controversial here. Even if I were working for IMVU, as a tester, I still wouldn’t matter. How can I, a tester, say that? It’s because my role is not to push my values down the throats of the programmers and business people that I serve. Saying that I don’t matter is a simplification; my values don’t matter as much as the business values and the customer values do. I do matter, but only precisely to the degree that I deliver those people the information that they value to inform their decisions. I can find defects in the features offered by the product; I can tell them about things that I see as driving risk; I can tell them about things that I’ve discovered that represent a threat to the value of the product to someone who does matter. And at that point, the decision is up to them.

A couple of other points:

While I agree that continuous deployment doesn’t give you high-quality software (in the sense of few bugs), I disagree that it lets you write software regression-free. It performs some tests on the software that might find regression if that regression happens to be covered by one of the tests. That’s not a bad thing in itself; that’s a good thing. The bad part is that, once again, it provides The Narcotic Comfort of the Green Bar. There’s a big difference between “our tests find no regressions” and “there are no regressions”.
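The gap between those two statements is easy to demonstrate. In this sketch (a made-up example, not IMVU’s suite), the automated check stays green while a regression ships, because the regression lies outside what the check covers:

```python
def apply_discount(price: float, percent: float) -> float:
    """Intended behaviour: reduce price by percent, never below zero.
    A regression crept in: the floor at zero has been lost."""
    return price * (1 - percent / 100)   # was: max(0.0, price * (1 - percent / 100))

def test_apply_discount():
    # The only check in the suite covers the ordinary case...
    assert apply_discount(100.0, 25.0) == 75.0

test_apply_discount()
print("green bar!")  # ...so the suite passes.

# ...while the uncovered behaviour has regressed: a coupon worth
# more than the price now yields a negative total.
print(apply_discount(10.0, 150.0))  # prints -5.0, not 0.0
```

The green bar reports honestly on the one case it covers; it says nothing at all about the case it doesn’t.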

Second, continuous deployment is Way Cool. As Elisabeth suggested, that IMVU can push out 50 deployments a day is impressive. But it reminds me of a story (I think it was Al Stevens) in which you go to the circus. Last year, they had a bear on roller skates, which impressed you. This year, the bear is on motorized roller skates. And you’re dazzled for a moment, until you consider, “Wait a second… do I really want to see a bear on motorized roller skates?” 50 deployments a day is impressive, but 50 deployments of what? For what purpose?

Could it be that the focus on deploying 50 times a day represents opportunity cost against other equally or more desirable goals? Goals like, say, addressing the problem “Our software has way more bugs than I’d like”; or addressing the complaint from the testers that the testing mission is too complex; or investigating the functionality and security problems that the customers seem to be reporting and that might represent a serious threat to the value of the product? Is 50 deployments a day providing business value that can’t be delivered any other way? Any other way? Would customers or IMVU itself suffer some loss if they had to wait 30 minutes for a code drop, instead of 15? I repeat: I don’t know in this case. I don’t know whether five deployments a day would be better, or five hundred a day, or one every five hundred days. I do know that part of testing is noticing the cost of one activity and its potential impact on the value of another.

The vast majority of developers take one look at what they think our product is and don’t bother to give it a try; I’m happy to see a thoughtful open-minded dive into IMVU from a developers perspective.

I’m a specific kind of a developer; I’m a tester. As such, it’s my particular bent to investigate things, and not to take them on faith, and to report on what I’ve found. I genuinely appreciate Timothy’s thoughtful and open-minded reply, and I thank him for triggering some of the questions and observations above.

More IMVU comment followup: Survivorship Bias

Saturday, March 7th, 2009

In the comments to our rapid test of IMVU, Elisabeth Hendrickson said, “Great list of issues. However, I suspect that these issues don’t really interfere with the core value to the target users.” That might be true. I think it might be more accurate to suggest that the issues don’t interfere with the core value to the existing users. That is, IMVU might be hitting the dartboard, but not the target. (I emphasize: I don’t know that either way.)

This brings me to a mention of one of the critical thinking errors discussed in The Black Swan: survivorship bias. Nassim Taleb retells a story originally told by Marcus Tullius Cicero.

“One Diagoras, a nonbeliever in the gods, was shown painted tablets bearing the portraits of some worshippers who prayed, then survived a subsequent shipwreck. The implication was that praying protects you from drowning. Diagoras asked, ‘Where were the pictures of the people who prayed, then drowned?’

“The drowned worshippers, being dead, would have a lot of trouble advertising their experiences from the bottom of the sea. This can fool the casual observer into believing in miracles.

“It is easy to avoid looking at the cemetery while concocting historical theories. But this is not just a problem with history. It is a problem with the way we construct samples and gather evidence in every domain. We shall call this distortion a bias, i.e. the difference between what you see and what is there.” (The emphasis there is Taleb’s.)

Taleb here is talking about survivorship bias, a form of selection bias. History tends to be a story written by the side that survived the war, and the audience tends to be on the same side. The historians tend to get their stories from the people who survived. Moreover, the people interviewed typically represent a limited subset of the people who could have been interviewed.

Testing informs an ongoing story of the product. When we consider test results, it’s easy to be lulled to sleep by the narcotic comfort of the green bar. The green bar provides us with the good news that the tests we have run are all passing. There is great value to that, and I don’t want to gainsay it. Yet confirmatory tests are less testing and more checking. If we’re going to avoid being fooled, we must also remember to consider what we might learn from the tests that we haven’t run. If we’re considering the success of our organization, as Peter Drucker suggests, we need to consider the people who aren’t our customers, in addition to the people who are.

In the next post, I’ll respond to the excellent comment I received from Timothy Fitz, who wrote the blog post that inspired my brief investigation of IMVU.

Comments on the IMVU Report

Saturday, March 7th, 2009

Well, that generated some comments. Interesting. People talk a lot about testing, but nothing gets ’em fired up like test results. I really appreciate the feedback, and I’d like to respond to it here, over a couple of posts. I don’t mind Blogger’s Compose feature, but (as far as I can tell so far) it has a pretty clumsy method for editing comments. Like, none that I can see.

Sai Venkatakrishnan says:

I am happy that they are using continuous integration and deployment as well automated test suites. It is really healthy to integrate and deploy as soon as possible. But I will never try to use this as an alternative of manual testing i.e. Exploratory Testing.

Underestimating the value of human thinking and relying completely on automation is a mistake lot of people do. Automation can give you fast feedback but it can do only what you ask it to do. But people think, adapt and act and this is really important for testing. I am not sure when we will realize that and add it our mandatory routine of application development.

I think we have a chance of that when we understand that automation is not the goal; it’s a medium by which we achieve a goal. It has an effect: it can assist, extend, enhance, and accelerate testing, but it isn’t testing. Something that has an effect is a medium, as McLuhan said. If we truly want to understand the medium of automation, we need to examine the other three effects that every medium has: every medium retrieves ideas from the past, from history, from literature, from mythology; every medium makes some previously prominent medium obsolete; and every medium, when taken beyond its limits (or “overheated”, as McLuhan said), reverses into the opposite of its extending/enhancing/accelerating effect. Cars extend our presence by getting us from place to place quickly; too many cars and we can’t get from here to there.

So automation, extending and intensifying our insight into some aspect of the product, helps for a while. And when automation overheats and reverses, we become blind, overwhelmed by the volume of what we have to analyze and maintain.

I don’t have the right answer to the question as to whether automation is overheated in a particular context. To me, it would seem responsible and competent technical work for the people involved to keep asking, and if the project community reaches consensus on the answer, then that’s the right answer for them. As a tester, part of my job is to draw attention to things that they might not have noticed; to be the headlights of the project. If what they see in the headlights is okay with them, that’s their perfectly legitimate choice.

Shrini Kulkarni asks

How does [my deciding that I was looking for bugs too early] compare with “conventional” wisdom that finding defects early in the life cycle is cheaper hence testing should be introduced early in the cycle? Are you saying that this conventional wisdom (or common sense) has changed its form?

I don’t think so. Finding bugs too early has a lot in common with creating test scripts too early; it’s dashing in and doing something before exploring the problem space properly. I made the freestyle explorer’s version of the scripted test designer’s mistake. Both mistakes incur opportunity cost; both are distractions from figuring out what’s important. Goal displacement, as you’d call it. I feel I should have got farther into the product and had a look around.

The advantage of the exploratory approach, I feel, is that I can recover quickly from this problem when I’m in control of my process. If someone or something else is controlling me, we’re dealing with a larger, more complex management system—and larger, more complex systems take more time to respond.

Kay Johansen remarks

We talked about continuous deployment today after Salt Lake Agile Roundtable. We concluded that it might work for “discretionary” software but had doubts about “critical” or business software. So IMVU may be another example of where continuous deployment “works” because the bugs are not “important” to the users.

I’m delighted to hear that other people are questioning the issue, and I’m in very strong agreement with the last sentence. I want to emphasize that my test report was not an assessment of what’s right; it was an attempt to tell a story of what I observed, and then some musings on the larger issue of what might be okay and what might not be. If the story of continuous deployment and the story of what I found are consistent with what the IMVU folks think is okay, that’s their call and they’re right to make it that way. In fact, as …

Elisabeth says

Sims 2 (I’m a recovering addict) has similar rendering weirdnesses with objects intersecting. I can attest that they didn’t interfere with game play and didn’t make the game any less fun. Actually, it made the game *more* fun when the rendering issues produced particularly amusing and anatomically impossible intersections.

Quite right. My stepson and I had lots of fun in one of the PlayStation hockey games, running the instant playback in very slow speed and seeing the glass shatter long before anyone banged into it. (I was far more interested in that than in the hockey part of the game.) Elisabeth raises a number of other important points. She goes on…

So I decided to try to find out if users enjoy IMVU. I discovered that searching on “IMVU love” didn’t bring back the – ahem – information I was looking for. So I tried “IMVU fansite”. Lots of people love this thing. People have even made fan videos.

I went looking for complaints about IMVU from users. I came up with one where the poster is complaining not about software bugs but rather that the whole thing is a waste of time, that he didn’t get as many credits as he expected, and that people are disrespectful. In other words, none of his issues had anything to do with software quality.

That could be true. On the other hand, systems of all kinds tend to have a certain kind of broad consistency. A more sophisticated look and feel might be related to a more sophisticated community. a) And maybe not. b) And either way, IMVU may be happy with the community it has.

I went for rough-and-ready metric-based data:

Results 1–10 of about 354 for “imvu sucks“. (0.09 seconds)

Results 1–10 of about 782 for “imvu rocks“. (0.11 seconds)

Clearly more detailed research is called for. 🙂 But more to come later, on the subject of “survivorship bias”.

50 Deployments A Day and The Perpetual Beta

Friday, March 6th, 2009

There was much rejoicing on Twitter this afternoon over a blog posting.

Apparently, IMVU is rolling out fifty deployments each and every day, and they’re doing so by the magic of Continuous Deployment. “The high level of our process is dead simple: Continuously integrate (commit early and often). On commit automatically run all tests. If the tests pass deploy to the cluster. If the deploy succeeds, repeat.”

Some more details:

Our tests suite takes nine minutes to run (distributed across 30-40 machines). Our code pushes take another six minutes. Since these two steps are pipelined that means at peak we’re pushing a new revision of the code to the website every nine minutes. That’s 6 deploys an hour. Even at that pace we’re often batching multiple commits into a single test/push cycle. On average we deploy new code fifty times a day.

So what magic happens in our test suite that allows us to skip having a manual Quality Assurance step in our deploy process? The magic is in the scope, scale and thoroughness. It’s a thousand test files and counting. 4.4 machine hours of automated tests to be exact. Over an hour of these tests are instances of Internet Explorer automatically clicking through use cases and asserting on behaviour, thanks to Selenium. The rest of the time is spent running unit tests that poke at classes and functions and running functional tests that make web requests and assert on results.
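
The loop Tim describes reduces to a simple gate: a commit ships only if the whole suite passes. A minimal sketch of that gate (the step functions here are stand-ins for illustration, not IMVU’s actual tooling):

```python
def pipeline(commits, run_tests, deploy):
    """Push each commit through test-then-deploy; return what shipped.

    With the two steps pipelined, throughput is bounded by the slower
    step -- a 9-minute suite means at most one deploy every 9 minutes,
    batching commits that arrive in between.
    """
    deployed = []
    for commit in commits:
        if run_tests(commit):      # the 9-minute step
            if deploy(commit):     # the 6-minute step
                deployed.append(commit)
        # a commit with a failing test never reaches the cluster
    return deployed

# Toy run: commit "b" has a failing test, so it is never deployed.
shipped = pipeline(
    commits=["a", "b", "c"],
    run_tests=lambda c: c != "b",
    deploy=lambda c: True,
)
print(shipped)  # ['a', 'c']
```

The gate, of course, is only as good as the tests behind it—which is the question the rest of this post pokes at.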

Now: to be fair to Tim, he may be making claims about the deployment process only. That’s not clear to me from his posting. I inferred that he was making claims about the overall product or service.

If so, those claims sounded very promising. I wanted to test them against experience, so I went to IMVU to see what a robust, well-tested-by-automation application looks like. I also invited James Bach over Skype, impromptu, to try testing the service at the same time, so that I’d have someone to chat to; IMVU is, essentially, a virtual world and chat room. What follows is a merging of our very rough notes, based on about 25 minutes of testing for me, 15 for James. If we had been testing in a more formal setting (that is, in-house for the company as employees or contractors), rather than for fun on a Thursday night, we would have been more detailed and specific about recording our actions. We would likely have used BB Test Assistant or some other form of screen-and-audio capture that we could annotate and narrate. Working together, we were able to confirm a general problem that I found to be the most annoying bug: music was played automatically, intermittently, and loudly, and the volume control and the mute button that appeared intended to control the music didn’t ever work.

I started the session by making a testing mistake: I went looking for bugs too early. I often make this mistake, and need to adjust my process. The problem, as Jon Bach pointed out to me several years ago, is that if you look for bugs too early, you may end up finding them, investigating them, and recording them. That takes time away from getting an overview of the product, building your model, and understanding the benefits that the product is trying to offer. I stepped into this trap right away.

It’s a particularly bad trap, in fact, when the zeal for bugs prevents you from being able to log in properly. That’s what happened. After a certain amount of rather sophomoric input constraint attacks (the email field accepted at least 1,000,000 characters of data; I didn’t attack further), every attempt to use the signup screen looked like this:

“Please correct the following errors:
Error IMVU was unable to process your request”

I couldn’t find a way to apologize and make the application start working, even though I fed it grovellingly valid data. To save troubleshooting time, I ended up setting up an account on a second machine. Here are the notes of what James and I did after that.

  • End and Shift-End keys don’t work in the “Choose Your Name” window.
  • pressing “cancel” during install results in “Download Failed” message.
  • IMVU (not responding) when trying to log in
  • clicked on the “New? click here to register” link and it froze. It turns out that it froze while attempting to load the webpage, but there was about a minute and a half delay or hang
  • unable to paste text into chat box
  • couldn’t stop the avatar from dancing (later found out that if he sat down, he’d stop)
  • in one of the rooms, I chose to move to a spot on the couch. I suddenly found my avatar arranged on the couch with a female avatar, such that our legs passed through one another in a rather painful and physically impossible kind of way.
  • music (techno/hip-hop, often in Japanese) started and stopped randomly, without warning.
  • the music was loud. The volume control and mute button on the music screen didn’t work at all. The music played at top volume at random times.
  • music appears subject to intermittent dropouts.
  • when I max out the characters in a message in the “special someone” box, I can still add additional characters one at a time, despite being told I’m at the max
  • when I tried to become a “special someone” with myself, the error message was “you are not buddies with that avatar!”
  • you can add yourself as a buddy
  • after adding yourself as a buddy, you can invite yourself for a chat. The message “inviting…” appears in the top of the room window and is still there 10 minutes later
  • after adding myself as buddy, I could add myself as a “special someone” to myself
  • in one of the rooms, there was a dropdown box labeled Media/Music. Double-clicking on one of the options, “Country 108”, gave a dialog box captioned Message, with “Loading error!Please try again!” (with the missing space after the first exclamation point) and a close button that said “Sure”.
  • with the uncontrolled music on in the room already, clicking on a working radio channel superimposes extra music over the existing music. Two tracks at the same time; din.
  • clicking on the music channel again very quickly after the first click results in a stream played at half, double, or some other multiple of normal speed.
  • when the camera is tilted low, chat bubbles (dis)appear outside of the readable bounds of the window
  • sometimes you can’t get to your name in the room window, so you can’t get to your menu of options
  • in the menu, there’s an option for pasting text into the chat window. The first time I did that, I accidentally pasted the HTML source for the home page that was hanging around in my paste buffer. That text went out immediately to the chat room. There’s no way to buffer and identify what you’re pasting, as there is in Skype or Windows Live Messenger or other chat rooms
  • the text that goes into the chat window is different from the text that appears in the bubbles in the virtual world. If it’s sufficiently long, the bubble truncates it at both ends, so you see only the middle.
  • why is the chat room text duplicated?
  • in all rooms, the rendering routines allowed limbs to penetrate furniture or other people’s limbs. Or vice versa. You can see an example:
  • my avatar has a necklace that appears to enter his back through the left shoulder blade, and then reappears at the front of the body, roughly through the heart. I know piercing is fashionable these days, but that looked painful.
  • the lyrical content sounded a little rough to ears that were brought up on the Beatles. That’s not a big deal to me personally, but there was no parental advisory. Which itself isn’t a big deal unless some parent sues.
  • chat text display isn’t apparently configurable
  • the product shows significant inconsistencies in presentation, look, and feel. Many features appear to open a browser outside of the IMVU application; the design esthetic of the Web-based service is very different from the application in which the action happens.
  • A prominent part of the screen in the window appears to be a message center. It turns out to be an adlink to, which is one of those sites that posts two annoying modal dialogs before it lets you leave. (This isn’t necessarily an IMVU thing; I don’t know the relationship between the companies.)

Most of the comments on the blog post mentioned above were positive. I’ll summarize: “D0000DE! That is sew kewl. U R l33t!” The enthusiasm suggested two things to me. The first is that nobody apparently had a look at the product that was being deployed the way James and I did—or if they did, they were far too polite to make comments or raise questions about it. The roars of approval were, to me, like the blurbs on the back of a book—you know the ones, in which you’re skeptical that the reviewers ever read the book. There used to be a feature in Spy Magazine, “Logrolling In Our Time”. Every month, they would find a smattering of book blurbs in this pattern:

“Best book I’ve read in years. Squane does it again!” Raoul Duke, on the back cover of J.D. Squane’s Frequent Manhood
“Brilliant, incisive. Duke’s a genius” J.D. Squane, on the flyleaf of Raoul Duke’s Temporarily Unsanitary

The second point derives from the first: that, among the technologists, there’s such a strong fetish for the technology—the automated deployment—that what is being deployed is incidental to the conversation. Yes, folks, you can deploy 50 times a day. If you don’t care about the quality of what you’re deploying, you can meet any other requirement, to paraphrase Jerry Weinberg. If you’re willing to settle for a system that looks like this and accept the risk of the Black Swan that manifests as a privacy or security or database-clearing problem, you really don’t need testers.

Except… maybe I’m wrong about all this stuff. Quality is value to some person.

  • Maybe I’m not the person to make the evaluation. In the screen shot above, note the conversation that I’m having with Guest_SexyRocker1. I’m asking her if she finds bugs in this application. She says No. I’ve found it impossible not to find bugs.
  • Maybe these applications aren’t supposed to have elegance or polish; maybe audio controls aren’t supposed to work; maybe it’s cool, in a virtual world, to render furniture embedded in people.
  • Maybe this is just a proof of concept. If you look really closely, you’ll notice that the IMVU logo is occasionally accompanied by “Beta”. Maybe the Perpetual Beta, a release model that Google seems to have perfected, is the new normal, allowing us to forgive all sins.
  • Maybe, to use Bob Martin’s lingo, we don’t value craftsmanship over crap.
  • Maybe we expect things to have problems that can be revealed by these terribly basic tests. And maybe, in the sense that James is talking about quality here, quality is dead.