Blog: 50 Deployments A Day and The Perpetual Beta

There was much rejoicing on Twitter this afternoon over a blog posting.

Apparently, IMVU is rolling out fifty deployments each and every day, and they’re doing so by the magic of Continuous Deployment. “The high level of our process is dead simple: Continuously integrate (commit early and often). On commit automatically run all tests. If the tests pass deploy to the cluster. If the deploy succeeds, repeat.

Some more details:

Our tests suite takes nine minutes to run (distributed across 30-40 machines). Our code pushes take another six minutes. Since these two steps are pipelined that means at peak we’re pushing a new revision of the code to the website every nine minutes. That’s 6 deploys an hour. Even at that pace we’re often batching multiple commits into a single test/push cycle. On average we deploy new code fifty times a day.

So what magic happens in our test suite that allows us to skip having a manual Quality Assurance step in our deploy process? The magic is in the scope, scale and thoroughness. It’s a thousand test files and counting. 4.4 machine hours of automated tests to be exact. Over an hour of these tests are instances of Internet Explorer automatically clicking through use cases and asserting on behaviour, thanks to Selenium. The rest of the time is spent running unit tests that poke at classes and functions and running functional tests that make web requests and assert on results.”
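The figures quoted above can be checked with a little arithmetic. What follows is only a rough Python sketch: the function names are hypothetical, and it assumes that the test and push stages overlap perfectly and that the test work divides evenly across machines, neither of which the quoted post states.

```python
import math


def pipelined_interval(stage_minutes):
    """In a pipeline, a finished result emerges once per slowest stage."""
    return max(stage_minutes)


def commit_to_live(stage_minutes):
    """A single commit still passes through every stage in sequence."""
    return sum(stage_minutes)


def machines_needed(machine_hours, wall_clock_minutes):
    """Machines required if the work splits evenly (an assumption)."""
    return math.ceil(machine_hours * 60 / wall_clock_minutes)


# Figures taken from the quoted post.
TEST_MINUTES = 9
PUSH_MINUTES = 6
TEST_MACHINE_HOURS = 4.4

stages = [TEST_MINUTES, PUSH_MINUTES]
interval = pipelined_interval(stages)
latency = commit_to_live(stages)
peak_per_hour = 60 / interval
machines = machines_needed(TEST_MACHINE_HOURS, TEST_MINUTES)

print(f"new revision every {interval} minutes at peak")
print(f"one commit takes {latency} minutes to go live")
print(f"about {peak_per_hour:.1f} deploys per hour at peak")
print(f"at least {machines} machines for a {TEST_MINUTES}-minute test run")
```

Under those assumptions, the peak rate works out to a little under seven deploys an hour, which the quote rounds down to six, and the 4.4 machine hours need roughly 30 machines to fit in nine minutes, consistent with the low end of the quoted 30-40.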

Now: to be fair to Tim, he may be making claims about the deployment process only. That’s not clear to me from his posting. I inferred that he was making claims about the overall product or service.

If so, those claims sounded very promising. I wanted to test them against experience, so I went to IMVU to see what a robust, well-tested-by-automation application looks like. I also invited James Bach over Skype, impromptu, to try testing the service at the same time, so that I’d have someone to chat to; IMVU is, essentially, a virtual world and chat room. What follows is a merging of our very rough notes, based on about 25 minutes of testing for me, 15 for James. If we had been testing in a more formal setting (that is, in-house for the company as employees or contractors), rather than for fun on a Thursday night, we would have been more detailed and specific about recording our actions. We would likely have used BB Test Assistant or some other form of screen-and-audio capture that we could annotate and narrate. Working together, we were able to confirm a general problem that I found to be the most annoying bug: music was played automatically, intermittently, and loudly, and the volume control and the mute button that appeared intended to control the music didn’t ever work.

I started the session by making a testing mistake: I went looking for bugs too early. I often make this mistake, and need to adjust my process. The problem, as Jon Bach pointed out to me several years ago, is that if you look for bugs too early, you may end up finding them, investigating them, and recording them. That takes time away from getting an overview of the product, building your model, and understanding the benefits that the product is trying to offer. I stepped into this trap right away.

It’s a particularly bad trap, in fact, when the zeal for bugs prevents you from being able to log in properly. That’s what happened. After a certain amount of rather sophomoric input constraint attacks (the email field accepted at least 1,000,000 characters of data; I didn’t attack further), every attempt to use the signup screen looked like this:

“Please correct the following errors:
Error IMVU was unable to process your request”
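For readers unfamiliar with input constraint attacks, the idea of feeding a field ever-longer data can be sketched roughly as follows. This is a minimal Python illustration, not what was actually run in the session: the function names are hypothetical, and `example_field_accepts` with its 254-character limit is an invented stand-in for a real form field.

```python
def length_probes(start=1, limit=1_000_000, factor=10):
    """Yield strings of geometrically growing length, up to `limit` characters."""
    length = start
    while length <= limit:
        yield "a" * length
        length *= factor


def first_rejected_length(accepts, probes):
    """Return the length of the first probe that `accepts` rejects, or None."""
    for probe in probes:
        if not accepts(probe):
            return len(probe)
    return None


# Hypothetical stand-in for a form field; a real session would submit
# each probe to the application and observe the response.
def example_field_accepts(value, max_length=254):
    return len(value) <= max_length


print(first_rejected_length(example_field_accepts, length_probes()))
```

A result of `None` would mean that every probe up to the limit was accepted, which is the behaviour described above for the email field.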

I couldn’t find a way to apologize and make the application start working, even though I fed it grovellingly valid data. To save troubleshooting time, I ended up setting up an account on a second machine. Here are the notes of what James and I did after that.

  • End and Shift-End keys don’t work in the “Choose Your Name” window.
  • pressing “cancel” during install results in “Download Failed” message.
  • IMVU (not responding) when trying to log in
  • clicked on the “New? click here to register” link and it froze. It turns out that it froze while attempting to load the webpage, but there was about a minute and a half delay or hang
  • unable to paste text into chat box
  • couldn’t stop the avatar from dancing (later found out that if he sat down, he’d stop)
  • in one of the rooms, I chose to move to a spot on the couch. I suddenly found my avatar arranged on the couch with a female avatar, such that our legs passed through one another in a rather painful and physically impossible kind of way.
  • music (techno/hip-hop, often in Japanese) started and stopped randomly, without warning.
  • the music was loud. The volume control and mute button on the music screen didn’t work at all. The music played at top volume at random times.
  • music appears subject to intermittent dropouts.
  • when I max out the characters in a message in the “special someone” box, I can still add additional characters one at a time, despite being told I’m at the max
  • when I tried to become a “special someone” with myself, the error message was “you are not buddies with that avatar!”
  • you can add yourself as a buddy
  • after adding yourself as a buddy, you can invite yourself for a chat. The message “inviting…” appears in the top of the room window and is still there 10 minutes later
  • after adding myself as buddy, I could add myself as a “special someone” to myself
  • in one of the rooms, there was a dropdown box labeled Media/Music. Double-clicking on one of the options, “Country 108”, gave a dialog box captioned Message, with “Loading error!Please try again!” (with the missing space after the first exclamation point) and a close button that said “Sure”.
  • with the uncontrolled music on in the room already, clicking on a working radio channel superimposes extra music over the existing music. Two tracks at the same time; din.
  • clicking on the music channel very quickly after clicking the first time results in a stream played at half-speed or double or n-tuple speed.
  • when the camera is tilted low, chat bubbles (dis)appear outside of the readable bounds of the window
  • sometimes you can’t get to your name in the room window, so you can’t get to your menu of options
  • in the menu, there’s an option for pasting text into the chat window. The first time I did that, I accidentally pasted the HTML source for the home page that was hanging around in my paste buffer. That text went out immediately to the chat room. There’s no way to buffer and identify what you’re pasting, as there is in Skype or Windows Live Messenger or other chat rooms
  • the text that goes into the chat window is different from the text that appears in the bubbles in the virtual world. If it’s sufficiently long, it throws away most of the text, and truncates at either end and you only get to see the middle.
  • why is the chat room text duplicated?
  • in all rooms, the rendering routines allowed limbs to penetrate furniture or other people’s limbs. Or vice versa. You can see an example:
  • my avatar has a necklace that appears to enter his back through the left shoulder blade, and then reappears at the front of the body, roughly through the heart. I know piercing is fashionable these days, but that looked painful.
  • the lyrical content sounded a little rough to ears that were brought up on the Beatles. That’s not a big deal to me personally, but there was no parental advisory. Which itself isn’t a big deal unless some parent sues.
  • chat text display isn’t apparently configurable
  • the product shows significant inconsistencies in presentation, look, and feel. Many features appear to open a browser outside of the IMVU application; the design esthetic of the Web-based service is very different from the application in which the action happens.
  • A prominent part of the screen in the www.imvu.com/shop window appears to be a message center. It turns out to be an adlink to myluvcrush.ca, which is one of those sites that posts two annoying modal dialogs before it lets you leave. (This isn’t necessarily an IMVU thing; I don’t know the relationship between the companies.)

Most of the comments on the blog post mentioned above were positive. I’ll summarize: “D0000DE! That is sew kewl. U R l33t!” The enthusiasm suggested two things to me. The first is that nobody apparently had a look at the product that was being deployed the way James and I did—or if they did, they were far too polite to make comments or raise questions about it. The roars of approval were, to me, like the blurbs on the back of a book—you know the ones, in which you’re skeptical that the reviewers ever read the book. There used to be a feature in Spy Magazine, “Logrolling In Our Time”. Every month, they would find a smattering of book blurbs in this pattern:

“Best book I’ve read in years. Squane does it again!” Raoul Duke, on the back cover of J.D. Squane’s Frequent Manhood
“Brilliant, incisive. Duke’s a genius” J.D. Squane, on the flyleaf of Raoul Duke’s Temporarily Unsanitary

The second point derives from the first: that, among the technologists, there’s such a strong fetish for the technology (the automated deployment) that what is being deployed is incidental to the conversation. Yes, folks, you can deploy 50 times a day. If you don’t care about the quality of what you’re deploying, you can meet any other requirement, to paraphrase Jerry Weinberg. If you’re willing to settle for a system that looks like this and accept the risk of the Black Swan that manifests as a privacy or security or database-clearing problem, you really don’t need testers.

Except… maybe I’m wrong about all this stuff. Quality is value to some person.

  • Maybe I’m not the person to make the evaluation. In the screen shot above, note the conversation that I’m having with Guest_SexyRocker1. I’m asking her if she finds bugs in this application. She says No. I’ve found it impossible not to find bugs.
  • Maybe these applications aren’t supposed to have elegance or polish; maybe audio controls aren’t supposed to work; maybe it’s cool, in a virtual world, to render furniture embedded in people.
  • Maybe this is just a proof of concept. If you look really closely, you’ll notice that the IMVU logo is occasionally accompanied by “Beta”. Maybe the Perpetual Beta, a release model that Google seems to have perfected, is the new normal, allowing us to forgive all sins.
  • Maybe, to use Bob Martin’s lingo, we don’t value craftsmanship over crap.
  • Maybe we expect things to have problems that can be revealed by these terribly basic tests. And maybe, in the sense that James is talking about quality here, quality is dead.

22 Responses to “50 Deployments A Day and The Perpetual Beta”

  1. Sai Venkatakrishnan says:

    Good post.

    I am happy that they are using continuous integration and deployment as well as automated test suites. It is really healthy to integrate and deploy as soon as possible. But I will never try to use this as an alternative to manual testing, i.e. Exploratory Testing.

    Underestimating the value of human thinking and relying completely on automation is a mistake a lot of people make. Automation can give you fast feedback but it can do only what you ask it to do. But people think, adapt and act and this is really important for testing. I am not sure when we will realize that and add it to our mandatory routine of application development.

  2. Shrini Kulkarni says:

    Michael,

    Excellent Post … with many key take aways ….

    Here is one thing that I felt rather different …

    >>>I went looking for bugs too early. I often make this mistake, and need to adjust my process. The problem, as Jon Bach pointed out to me several years ago, is that if you look for bugs too early, you may end up finding them, investigating them, and recording them.

    How does this compare with the "conventional" wisdom that finding defects early in the life cycle is cheaper, hence testing should be introduced early in the cycle?

    Are you saying that this conventional wisdom (or common sense) has changed its form?

    Shrini

  3. Kay Johansen says:

    Great post! I’m glad you and James took the time to put this together. I have a feeling I’ll be referring clients to this post for some time to come.

    We talked about continuous deployment today after Salt Lake Agile Roundtable, specifically about Flickr.com. We concluded that it might work for “discretionary” software but had doubts about “critical” or business software. So IMVU may be another example of where continuous deployment “works” because the bugs are not “important” to the users.

    I think I recall a third example of a place that used continuous deployment, interestingly that would have been 5 or more years ago. It was a resume/job posting website.

  4. Elisabeth says:

    Great list of issues.

    However, I suspect that these issues don’t really interfere with the core value to the target users.

    Reading this review suggests to me that the primary user base is, well, a little younger than both of us. They seem to like it because it’s “fun.”

    FWIW Sims 2 (I’m a recovering addict) has similar rendering weirdnesses with objects intersecting. I can attest that they didn’t interfere with game play and didn’t make the game any less fun. Actually, it made the game *more* fun when the rendering issues produced particularly amusing and anatomically impossible intersections.

    So I decided to try to find out if users enjoy IMVU. I discovered that searching on “IMVU love” didn’t bring back the – ahem – information I was looking for. So I tried “IMVU fansite”. Lots of people love this thing. People have even made fan videos.

    I went looking for complaints about IMVU from users. I came up with one where the poster is complaining not about software bugs but rather that the whole thing is a waste of time, that he didn’t get as many credits as he expected, and that people are disrespectful. In other words, none of his issues had anything to do with software quality.

    And I found the Bugs section on their forums. There are bugs there – some pretty serious from the perspective of the users. But I noted that at least one bug seemed to get fixed very quickly. That’s the great thing about continuous deploy and tons of automated tests. If something *does* go wrong, recovery is much faster and easier.

    So I don’t think it’s fair to say that the system is crap. It clearly has value to a large number of someones. It makes money. It does what the makers intended. And when things go wrong, and they can get enough information to fix the issues, they’re able to deploy fixes fast.

    Frankly, I find that rather impressive.

  5. timothyfitz says:

    I would like to clarify, we do have a Quality Assurance staff. There are numerous quality engineering tasks that only a human can do, exploratory testing and complaining when something is too complex come to mind. They’re just not in the “take code and put it into production” process; it can’t scale to require QA to look at every change so we don’t bother. When working on new features (not fixing bugs, not refactoring, not making performance enhancements, not solving scalability bottlenecks etc), we’ll have a controlled deliberate roll out plan that involves manual QE checks along the way, as well as a gradual roll-out and A/B testing.

    Your point about the sheer number of bugs we have? You’re right. Our software has way more bugs than I’d like. It’s a byproduct of the choices made when the company was small: to prioritize determining what the customer actually wants at almost any cost. We would absolutely love to have a high quality client, and we’re working every day towards that goal.

    Continuous Deployment lets you write software *regression free*, it sure doesn’t gift you high quality software. As a start-up, we’re faced with hard decisions every day about where to make our product higher quality; most of the complaints you have would be immediately ignored by the majority of computer users and so we can’t in good faith prioritize them over the things that ARE causing our users trouble.

    The vast majority of developers take one look at what they think our product is and don’t bother to give it a try; I’m happy to see a thoughtful open-minded dive into IMVU from a developer’s perspective.

  6. Markus Gärtner says:

    I read and fully support James’ point yesterday. It seems to be currently the case that the lack of quality, beta testing etc has become more or less a linear feature (see Mike Cohn’s Agile Estimation and Planning) and people accepted that fact. Bob Martin’s craftsmanship over execution (this is how he named it, when realising that the previous one violated the consistency from the Agile Manifesto) steps in here.

    Actually what seems to be missing is the pride and responsibility in the software world. From my point of view I can pile up a lot of technical debt, but deliver really fast. What I will be lacking when doing so is denoted in Alistair Cockburn’s Cooperative Game manifesto: I did not prepare well for the next round of the software game. Still the same holds related to software testing as well. Stressing out time schedules quite damages software development as a craftsmanship to be proud of. On the other hand Fred Brooks already noticed quite a similar fact nearly 20 years ago in his “No silver bullet” article. Software development never could hold up with hardware improvements. You cannot expect to improve your software process by an order of magnitude. This is unrealistic, stupid and dangerous.

    From the application you explored, I have to say that there is a large lack in the whole team. By using the wrong goals (50 deployments a day at what cost???) the team has put out quality. Explicitly or not. For me it’s obvious that none of the team members actually tried out the software they built. It was just a question of “how fast can you bring it to production?”.

    Finally, what are the options we can use to address this? Taking the courage to actually show flaws and issues in such products is the right origin to do so. To cite from Bob Martin’s blog once again: Glory and success are not a destination, they are a point of origin. (citation is from Rino Baglio) Maybe I’ll be writing on this on my blog as well, since I cannot agree with you more.

  7. Ben Simo says:

    I think that users have become so accustomed to bad software that they don’t notice the bugs — for long. Software users are accustomed to finding workarounds and the quickest way to make errors go away.

    We’ve fallen in love with numbers. So when people report idiotic metrics like percent quality and counts of the millions of automated test cases run with every build, those numbers replace reality in people’s minds about quality. We’ve replaced quality with quantity.

    We’ve been desensitized to crap.

    So maybe quality is dead. Maybe there just aren’t very many people that care.

  8. Oliver Erlewein says:

    is information and what was the result?

  9. Raoul Duke says:

    wow. you 2 are my heroes. completely how i would feel about that kind of thing. it absolutely cracks me up that you were talking to people and asking them if they found bugs when i’m sure they just were 12 year olds looking to cyber. awesome. i want to hire you all but i’m just a nobody.

  10. Adam Goucher says:

    Random thoughts…
    - What if they are deploying protocol or backend processes? I’m not sure your testing would have found issues as you appeared to be concentrating on the frontend. Now, the author does mention Selenium so it somewhat implies they deploy frontend stuff as well
    - I agree, you are quite likely not the correct audience for the application. Meagan started playing with IMVU well over a year ago which would place her at 14 originally
    - I wonder how many of us don’t fall for the trap Jon mentioned
    - This kinda harkens to James’ recent Quality Is Dead article. We’ve conditioned the average consumer to expect non-perfection. As long as the basic functionality exists and works-ish then they are happy. Well, enough to not go through the hassle of migrating yourself and the rest of your friends to a different service
    - Group Think, or perhaps Group Acceptance is a powerful creator of stickiness

  11. Anonymous says:

    Part of the reason IMVU can get away with all their bugs is because they’re in the entertainment business, so none of their users rely on them for anything really important.

  12. Anonymous says:

    After looking through all the "bugs" you found, I'd have to say that none of them detract from the benefit that the product offers.

    So essentially these aren't "bugs", they are "nice-to-have-fixed". A "bug" would be something like not being able to talk to someone or chat repeating lines of input or something like that.

    If the only bugs you have are for features which are "not crucial to the product" then I say release!!

    I think having to fix all these so called "bugs" borders on perfectionism.

  13. Michael says:

    @anonymous (too shy to sign)

    You could be absolutely right; it could be that the racket of having two audio tracks playing death metal at the same time is exactly what they intended, and that all you have to do is turn it down. And it could be that the 500 some-odd pages of forum complaints (noted in subsequent blog posts to this one) are insignificant—even the ones that refer to accounts being hacked and credits being lost. However, the general lack of care for detail suggests a strong possibility of more serious problems, evidence of which we did find later. Suggests, but doesn't confirm, so you might like to have a look at those subsequent posts and see what you think.

    —Michael B. (brave enough to sign)

  14. John Allspaw says:

    Just so I'm understanding: the gist is…you don't think an organization can deploy small changes frequently and still have quality, and to prove this you find bugs on the company's site?

  15. Michael Bolton http://www.developsense.com says:

    No, I'm not saying that. An organization can absolutely deploy small changes frequently and still have quality.

    What I am saying is that when you focus your attention on something that customers don't value too much (like, say, deploying every half hour, assuming a 24-hour day; or deploying every 10 minutes, assuming an eight-hour day), you may reduce your perspective and vigilance on things that they may value more. There may be indirect value in being able to deploy very rapidly, but that begs questions on the quality of what you're deploying.

    http://www.imvu.com/catalog/modules.php?op=modload&name=phpbb2&file=viewforum.php&f=6

    Someone named Kara sent a comment recently in which she delivered quite a diatribe—strong enough that I wasn't inclined to publish it. However, she did provide a couple of links that were interesting.

    http://www.trustlink.org/BusinessProfile.aspx?ID=206049811
    http://www.rateitall.com/i-913414-imvu.aspx

    —Michael B.

  17. John Allspaw says:

    Ok, I'm not sure that I follow the train of thought that since IMVU does make an effort to bring small changes frequently to production, they're somehow unfocused on what those changes are, or how they affect the user experience. Not being an IMVU user, I can't comment, but clearly they have seen a very large amount of growth (looks like a 2x increase in traffic and a 3x increase in users), according to public sources (http://www.quantcast.com/imvu.com). It's possible that their growth was in spite of the quality issues that you're wanting to point out.

    What I am interested in, however, is the discussion that "Continuous Deployment" (small changes made frequently) to a live web application doesn't (or can't) yield desirable results. Given that every organization is different with respect to the tools and culture that surround their development process, and given that web applications are quite different from more traditional ("shrink-wrap") software, continuous deployment isn't going to work for all forms of software development.

    But for some, it does indeed work very well, and there's nothing magic about it. It was very much one of the reasons why Flickr was able to make incremental changes while at the same time scaling our infrastructure and maintaining high levels of availability and performance. Our developers and operations teams quite frankly can't imagine working any other way, and if at any point our availability or feature-shipping schedules were at odds with the rate of change we were introducing to production, our deploying multiple times a day would have slowed. It didn't, and I would attribute our success to both the tools we used and the culture we grew.

  18. Michael Bolton http://www.developsense.com says:

    It seems to me that (via the chart you cite) IMVU's growth has been large worldwide in the last year, but pretty slow in the U.S., and the visits-per-person figure seems to have been declining marginally. Either way, they haven't gone through the kind of growth or sustained the levels of service of a company like, say, Flickr.

    'What I am interested in, however, is the discussion that "Continuous Deployment" (small changes made frequently) to a live web application doesn't (or can't) yield desirable results.'

    What, specifically, did you read here that points to that conclusion?

    In terms of the overall customer experience, I sense a difference between IMVU and, say, Flickr.

    —Michael B.

  19. John Allspaw says:

    ‘What, specifically, did you read here that points to that conclusion?’

    I guess I took your last bullet points incorrectly, then. I thought you were trying to point out that a development and deployment process such as IMVU’s couldn’t produce acceptable quality. If you’re not saying that, then I misunderstood. :)

  20. I didn’t say that they couldn’t. I said that they didn’t, to me.

  21. Sam says:

    IMVU has changed a lot now, so there aren’t all of these things anymore. Plus, IMVU is still in the beta process; they are changing things every day.

    Right. The perpetual beta.

    Thanks for the update!

  22. [...] aware that I read about some contents of the book from a different perspective. Back in 2009 Michael Bolton and James Bach reported on testing an internet application which was from their perspective more [...]
