Avoiding LGTM PR Cultures

Introduction

Making a code change with a distributed version control system (DVCS) like Git is usually done by packaging the change on a branch as a “pull request” (PR), which signals that the author would like the project to “pull” the change into it.

This was, and is, a key part of open source development, as it allows outside contributors to contribute to a project in a controlled way. Many internal software development teams also work this way, as the approach has many benefits over committing directly to a shared branch or trunk.

I’ve seen the pull request approach have a positive impact on software quality since pull requests facilitate discussion through peer reviews and allow running of automated tests against every commit and change that is proposed to be merged into the default branch.


What is an LGTM PR culture?

I’ve also seen some negative behaviours emerge when moving to pull request based development, which I’ll call an LGTM PR culture.

LGTM is a common acronym in peer reviews of pull requests meaning “Looks Good To Me”, and I’ve seen teams wave unsuitable changes through with LGTM comments without doing solid peer reviews or testing.

How do you know if you have an LGTM PR culture?

One way to “test” your peer review process is to create a PR containing a subtle bug, or something not quite right, that you know about. When it gets reviewed, do you get an LGTM? I did this recently, and whilst the PR didn’t even do what it was meant to do, I received an LGTM 😕


How can you move away from an LGTM PR culture?

It’s tempting to just tell everyone to do better peer reviews but it’s not that easy!

I’ve found there are some steps the author of a pull request can take to facilitate better pull request reviews and lead towards a better culture.

1. Make pull requests as small as possible

The smaller the pull request, the more likely you’ll get specific and broad feedback on it – and you can then iterate on that feedback. A 500 line change is daunting to review and will lead to more LGTMs. For a larger refactoring where lots of lines will change, start with a small, focussed change and get plenty of review and discussion. Once the refactoring is established with that smaller example, you can apply the feedback to a broader PR, which won’t need as much review because the new pattern was agreed previously.

2. Review your own pull request

Review your own work. This works best if you do something else and then come back to it with a fresh mind. If anything looks wrong, or you’re unsure about it, leave a comment on your own PR to encourage other reviewers to look closely at those areas too.

3. Include clear instructions for reviewing and testing your pull request

A list of test steps is good, as is asking for the type of feedback you’d like – you can explicitly ask reviewers something like “please leave a comment after your review listing what you tested and what areas of the code you reviewed.” This discourages shortcuts and LGTMs.
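For example, a pull request description along these lines (a made-up template, with hypothetical feature and commands) makes the reviewer’s job explicit:

```markdown
## What this PR does
Adds server-side validation for the signup form.

## How to test
1. Check out this branch and run the test suite.
2. Submit the signup form with an invalid email; expect an inline error.

## Feedback I'd like
- Is the validation logic in the right layer?
- Please comment with what you tested and which files you reviewed.
```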

4. Test your peer review process – see above.

Conclusion

Developing software using pull requests can mean much higher quality code and less technical debt, due to the peer review feedback that accompanies each pull request. As an author, you can take steps to ensure your pull requests are easy to review, and so encourage a culture of effective peer reviews.

AMA: Time Estimation

Paul asks…

What is your stance on time estimation (involved people, granularity/level of detail, benefit)?

My response…

I’d like to start by stating that I’m by no means an expert on this topic; so please take what you will from what I write.

Time and effort estimation for any software development activity is very difficult, so we often get our estimates very wrong. I believe this is because we try to do up-front time and effort estimation without fully understanding the domain or the extent of the problem we’re solving; we still have many unknown-unknowns.

We can still do detailed/granular planning, but we should try to delay the detailed estimation of these until we have more information.

What I prefer is detailed planning up front, which involves breaking large, lofty goals down into small goals. These small goals are broken down further into the smallest manageable unit of work that delivers something, however small that something is. It’s important to break things down to this level as it enables continuous delivery, and flexibility in scope as a project progresses.

Once these small units of work are detailed, and before trying to estimate them, I think there’s value in starting work and delivering some of them. This makes it possible to estimate the remaining work more accurately, based on real delivery experience.

As soon as you begin working on each unit you should get a feel for the size and effort that is required for each unit, and over a period of time (say a fortnight) you can start to work out how many of these units you can achieve (your velocity).
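The velocity-based forecast described above amounts to simple arithmetic. A minimal sketch (the function names and numbers are made up for illustration):

```python
# Hypothetical sketch: forecasting remaining work from observed velocity.

def velocity(units_completed_per_period):
    """Average units of work completed per period (e.g. per fortnight)."""
    return sum(units_completed_per_period) / len(units_completed_per_period)

def periods_remaining(total_units, units_done, units_completed_per_period):
    """Estimate how many more periods the remaining units will take."""
    remaining = total_units - units_done
    return remaining / velocity(units_completed_per_period)

# After three fortnights delivering 4, 5 and 6 units of a 60-unit plan:
print(periods_remaining(60, 15, [4, 5, 6]))  # 9.0 more fortnights
```

The point isn’t the arithmetic itself but that the inputs come from real delivery experience rather than up-front guesses.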

If you have a detailed plan of how many units in total you’d like to achieve, it is probably at this point that you realise what you wanted to achieve is going to take too long or cost too much. This realisation means you need to prioritise all remaining work and focus on what is high priority.

I’ve never seen a project finish with the same intentions as when it started, so as you progress you will find some items get completely de-prioritised (no longer in scope), some things become higher priority so they get delivered sooner, and some completely new ideas/pieces of functionality may be decided upon and included in your plan.

Since you understand what you’ve been able to deliver you can then have sensible conversations about what is feasible given the resources available.

Futurespectives are fun

Since my team (and every team at Automattic) is 100% distributed, it’s important that we meet in person a few times a year (somewhere in the world) to hang out, co-work, eat and plan together: we call these team meetups.

Two weeks ago I spent the week in La Jolla in beautiful Southern California working with my team. Each team member was asked to suggest activities/projects to work on for the meetup and I suggested we do a futurespective.

Most people are familiar with a retrospective as they’re very common in agile software development, but I’ve found futurespectives to be much less common.

A futurespective is an activity where a team can work together to create a shared vision for the future.

There’s not a huge amount of information online about how to facilitate a futurespective, so I went with this structure:

  1. Prime directive (5 mins)
  2. Check-in: clear the air (5 mins)
  3. Explain the purpose of the exercise: what we are aiming to get out of this (5 mins)
  4. Move to the future: Imagine a nirvana state (20 mins)
  5. Coming back: Success factors that got us there (20 mins)
  6. Now: what can we do to start achieving those success factors (20 mins)

Prime Directive

I found this prime directive online, and whilst it sounds a little cheesy, it set the tone for the exercise, which is about working together for a better future:

‘Hope and confidence come from proper involvement and a willingness to predict the unpredictable. We will fully engage on this opportunity to unite around an inclusive vision, and join hands in constructing a shared future.’ – Paulo Caroli and TC Caetano

Check in

There’s no point working on a team exercise to plan for the future if there’s something in the air, so it’s worthwhile checking in on the team and how everyone is feeling about the current state of things.

Explaining the Purpose of the Exercise

The prime directive is a good start for this, but it’s worth explaining that the team will be brainstorming and working together to produce a list of action items at the end of the exercise that will directly impact our future.

Move to the Future: Imagine a Nirvana State (20 mins)

This is where you start by setting the scene 12-18 months in the future, where a particular milestone has been successfully achieved – this might be finishing a big project you’re working on, or having launched a new product. This is the nirvana state. Ask a question that you would like answered by this exercise, for example: ‘what does testing and quality look like on this day?’

Get each person to spend 10 minutes writing sticky notes about the state of your particular question: what that state is like, but not delving into how it came to be.

An example might be: ‘everyone is confident in every launch’ or ‘everyone knows what the right thing to work on is’.

As each person finishes, we put their sticky notes on a wall and logically group them, then vote on which are most important (each person typically gets three votes and marks three notes or groups with a sharpie).

Coming back: Success factors that got us there (20 mins)

From the first exercise you should have a list of the three or four most important end-states, and now we use these to brainstorm for about 10 minutes the success factors (the hows) that got us to those end-states.

For example, a success factor for ‘everyone is confident in every launch’ could be ‘unit tests are super easy to write/run all the time (fast)’.

Once people have had time to write these up, we logically group them under our three or four headings on the wall so we can see these clearly.

Now: what can we do to start achieving those success factors (20 mins)

Our final activity is working out what we can do now to lead to these success factors which will get us to our end-goals. At this point you can either brainstorm again, or as a team start discussing what we can do.

If you need some structure you could use “Start Doing/Stop Doing/Keep Doing” to prompt for ideas; otherwise use any format you like.

The goal is that after 20 minutes you have a list of action items that you can easily assign to someone, knowing these will lead to the success factors and end goals you’ve come up with as a team.

An example would be ‘ensure that 100% of bugs are logged in one tool (GitHub)’, which can be assigned to someone.

Ensure someone is tasked with taking photos and writing up the findings (at least the action items) and circulating them to the team.

Summary

The futurespective we ran as a team was very useful, as it had enough structure to let us get through a lot of thinking in a short amount of time. We did it on the first morning of our meetup, and having this structured activity set the tone for the week: we could refer back to what we’d discussed in later activities during the week.

I thoroughly recommend this as a team planning tool.


A tale of working from trunk

Let me share a story about how we went from long lived feature/release branches to trunk based development, why it was really hard, and whether it’s something I would recommend you try.

Background

I’m familiar with three main approaches to code branching for a shared code-base:

  1. Long lived feature/release branches
  2. Short lived feature branches
  3. Trunk based development

Long lived feature/release branches

Most teams start out using long lived feature/release branches: each new project or feature branches from trunk, and once the branch is feature-ready and stable, the changes are merged into trunk and released. The benefit of this approach is that changes are contained within a branch, so there’s little risk of unfinished changes inadvertently leaking into trunk, which is what is released to production. The biggest downside, and why many teams move away from it, is the merging: each long lived feature branch ultimately needs to combine its changes with every other long lived feature branch and with trunk, and the longer a branch exists, the more it diverges and the harder this becomes. Some people call this ‘merge hell’.

Short lived feature branches

Another version of feature branching is to have short lived feature branches, which exist to introduce a single change or feature and are merged (often automatically) into trunk as soon as the change is reviewed and tested. This is typically done using a distributed version control system (such as Git) with a pull request system. Since branches are ad-hoc and short lived, you need a continuous integration system that supports running against all branches for this approach to work (ours doesn’t); otherwise you’d need to create a new build configuration every time you created a short lived feature branch.

Trunk Based Development

This is probably the simplest (and riskiest) approach: everyone works from, and commits directly to, trunk. This avoids the need for merges, but it also means trunk must be production ready at any point in time.

A story of moving from long lived feature/release branches to trunk based development

We have anywhere from 2-5 concurrent projects (each with a team of 8 or so developers) working off the same code base that is released to production anywhere from once to half a dozen times per week.

These project teams started out using long lived feature/release branches specific to each project, but increasingly found merging and divergence difficult – and issues would arise where a merge wasn’t done correctly, so a regression would inadvertently be released. The teams also found manual effort was involved in setting up our CI server to run against each new feature/release branch, and in removing it when the branch was finished.

Since we don’t use a workflow based/distributed version control system, and our CI tools don’t support running against every branch, we couldn’t move to using short lived feature branches, so we decided to move to trunk-based development.

Stage One – Trunk Based Development without a Release Branch

Initially we had pure trunk based development. Everyone committed to trunk. Our CI build ran against trunk, and each build from trunk could be promoted right through to production.

Trunk Based Development without a Release Branch(1)

Almost immediately two main problems arose with our approach:

  1. Feature leakage: people would commit code that wasn’t behind a feature toggle, which was then inadvertently released to production. This happened a number of times, no matter how many times I told people ‘use toggles!’.
  2. Hotfix changes using trunk: since we could only deploy from trunk, each hotfix had to be done via trunk, which meant it would include every change made between it and the last release (so, in the above diagram, if we wanted to hotfix revision T4 and there were another three revisions, we would have to release T7 and everything else it contained). Trying to get a suitable build was often a case of one step forward, two steps back, with other unintended changes in the mix. This was very stressful for the team and often led to temporary ‘code freezes’ whilst someone committed a hotfix into trunk and got it ready.

Stage Two – Trunk Based Development with a Release Branch

Pure trunk based development wasn’t working, so we needed some strategies to address our two biggest problems.

  1. Feature leakage: whilst this was more of a cultural/mindset change for the team learning and knowing that every commit would have to be production suitable, one great idea we did implement was TDC: test driven configuration. Since tests act as a safety net against unintended code changes (similar to double entry book-keeping), why not apply the same thinking to config? Basically we wrote unit tests against configuration settings so if a toggle was turned on without having a test that expected it to be on, it would fail the build and couldn’t be promoted to production.
  2. Hotfixing changes from trunk: whilst we wanted to develop and deploy from a constantly verified trunk, we needed a way to quickly provide a hotfix without including every other change in trunk. We decided to create a release branch – not to release a new feature per se, but purely for production releases. A release therefore involves deleting and recreating the release branch from trunk, to avoid any divergence. If a hotfix is needed, it can be applied directly to the release branch and merged into trunk (or the other way around), knowing the next release will delete the release branch and start again from trunk. This alone has made the entire release process much less stressful: if a last minute change or hotfix is required, it’s now much quicker and simpler than releasing a whole new version from trunk, although that is still an option. I would say nine out of ten of our releases are done by taking a whole new cut, and one or so out of ten via a change to the release branch.
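The test-driven configuration idea could be sketched like this (a minimal illustration; the config format and toggle names are hypothetical, and in a real build the config would be loaded from the file that actually ships to production):

```python
import json
import unittest

# Hypothetical production configuration; imagine this parsed from the
# real config file that gets deployed.
CONFIG = json.loads('{"new_checkout_enabled": false, "beta_search_enabled": false}')

class ToggleTests(unittest.TestCase):
    """Fail the build if a toggle is flipped without updating these expectations."""

    def test_new_checkout_is_off(self):
        self.assertFalse(CONFIG["new_checkout_enabled"])

    def test_beta_search_is_off(self):
        self.assertFalse(CONFIG["beta_search_enabled"])
```

If someone turns a toggle on without consciously updating the corresponding test, CI goes red and the build can’t be promoted, which is exactly the double entry book-keeping effect described above.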

Trunk Based Development with Release Branch(2)
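If you were on Git (we weren’t at the time, so treat this purely as an illustrative sketch), the delete-and-recreate release branch flow can be demonstrated in a throwaway repository:

```shell
set -e
# Demo in a throwaway repo so this is safe to run anywhere.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git checkout -q -b trunk
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "T1"
git branch release                 # initial release branch, cut from trunk
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "T2"

# Release time: delete and recreate the release branch from the tip of
# trunk, so there is never any divergence to merge back.
git branch -D release
git branch release trunk

git log -1 --format=%s release     # the release branch now points at T2
```

A hotfix would then be a commit made directly on `release` (and merged to trunk), safe in the knowledge that the next release throws the branch away anyway.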

Lessons Learned

It’s certainly been a ride, but I definitely feel more comfortable with our approach now we’ve ironed out a lot of the kinks.

So, the big question is whether I would recommend team/s to do trunk based development? Well, it depends.

I believe you should only consider working from trunk if:

  • you have very disciplined teams who see every single commit as production ready code that could be in production within an hour;
  • you have a release branch that you recreate for each release and can use for hotfixes;
  • your teams constantly check the build monitors and don’t commit on a red build (broken commits pile upon broken commits);
  • your teams put every new or non-complete feature/change behind a feature toggle that is toggled off by default, and test that it is so; and
  • you have a comprehensive regression test suite that can tell you immediately if any regressions have been introduced into a build.

Then, and only then, should you all work off trunk.

What have your experiences been with branching?

Extensive post-release testing is a sign of an unhealthy testing process

Does your organization conduct extensive post-release testing in production environments?

If you do, it probably shows you have an unhealthy testing process, and that you’ve fallen into the “let’s just test it in production” trap.

If testing in non-production environments was reflective of production behaviour, there would be no need to do production testing at all. But often testing isn’t reflective of real production behaviour, so we test in production to mitigate the risk of things going wrong.

It’s also often the case that issues found in a QA environment don’t appear in a local development environment.

But it makes much more sense to test in an environment as close to where the code was written as possible: it’s much cheaper, easier and more efficient to find and fix bugs early.

For example, say you were testing a feature and how it behaves at various times of day across numerous time zones. As you progress through test environments, this becomes increasingly difficult to test:

  • In a local development environment: you can fake the time and timezone to see how your application behaves.
  • In a CI or QA environment: you can change a single server’s time and restart your application to see how it behaves under various time scenarios: not as easy as faking the time locally, but still fairly easy to do.
  • In a pre-production environment: you’ll probably have clustered web servers, so you’ll be looking at changing something like 6 or 8 server times to test this feature. Plus it will affect anyone else using the environment.
  • In a production environment: you’ll need to wait until the actual time to test the feature, as you won’t be able to change the server times in production.
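Faking the time locally can be as simple as letting the code under test accept a clock. A sketch (the greeting function is hypothetical, standing in for any time-sensitive feature):

```python
from datetime import datetime, timezone, timedelta

def greeting(now=None):
    """Return a greeting for the current local hour.

    Accepting `now` as a parameter makes the time trivially fakeable in tests,
    with no server clocks to change.
    """
    now = now or datetime.now()
    if now.hour < 12:
        return "Good morning"
    if now.hour < 18:
        return "Good afternoon"
    return "Good evening"

# Fake any time of day, in any timezone, from a local test run:
sydney = timezone(timedelta(hours=10))
assert greeting(datetime(2015, 6, 1, 9, 0, tzinfo=sydney)) == "Good morning"
assert greeting(datetime(2015, 6, 1, 21, 0, tzinfo=sydney)) == "Good evening"
```

The same scenarios in pre-production would mean resetting clocks on every server in the cluster; in production they simply can’t be tested until the clock reaches the right time.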

Clearly it’s cheaper, easier and more efficient to test changing times in an environment closer to where the code was written.

You should aim to conduct as much testing as you can in earlier test environments, and taper this off so that by the time you get a change into production you’re confident it’s been tested comprehensively. This probably requires some change to your testing process though.

Tests Performed per Environment

How to Remedy A ‘Test in Production’ Culture

As soon as you find an issue in a later environment, ask: why wasn’t this found in an earlier environment? Ultimately ask: why can’t we reproduce this in a local development environment?

Some Hypothetical Examples

Example One: our tests fail in CI because of JavaScript errors that don’t reproduce in a local development environment. Looking into this, we realize it’s because the JavaScript is minified in CI but not locally. We make a change to let local development environments run tests in minified mode, which reproduces these issues.

Example Two: our tests failed in pre-production but not in QA, because pre-production has a regular back-up of the production database whereas QA often gets very out of date. We schedule a task to periodically restore the QA database from a production snapshot to ensure the data is reflective.

Example Three: our tests failed in production but not in pre-production, because email was only sent in production: we couldn’t test it in pre-production/QA as we didn’t want to accidentally send real emails. We configure our QA environments to send email, but only to a white-list of specified addresses we use for testing, to stop accidental emails. We can then be confident that changes to email are tested in QA.
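The whitelist approach in Example Three might look something like this (a sketch; the function, addresses and environment names are made up):

```python
# Only deliver mail to approved test addresses in non-production environments.
WHITELIST = {"qa-team@example.com", "test-inbox@example.com"}

def deliverable_recipients(recipients, environment):
    """In production, send to everyone; elsewhere, only to whitelisted addresses."""
    if environment == "production":
        return list(recipients)
    return [r for r in recipients if r in WHITELIST]

# A QA run silently drops the real customer address:
print(deliverable_recipients(
    ["customer@realcompany.com", "qa-team@example.com"], "qa"))
# ['qa-team@example.com']
```

Filtering at one choke point like this lets QA exercise the real email path end to end without any risk of mailing actual customers.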

Summary

It’s easy to fall into a trap of just testing things in production even though it’s much more difficult and risky: things often go wrong with real data, the consequences are more severe and it’s generally more difficult to comprehensively test in production as you can’t change or fake things as easily.

Instead of just accepting “we’ll test it in production”, try instead to ask, “how can we test this much earlier whilst being confident our changes are reflective of actual behaviour?”

You’ll be much less stressed, your testing will be much more efficient and effective, and you’ll have a healthier testing process.

Testing beyond requirements? How much is enough?

At the Brisbane Software Testers Meetup last week there was a group discussion about testing beyond requirements/acceptance criteria: if you’re doing so, how much is enough? Where do you draw the line? The question came from an attendee whose manager had pulled him up over a production bug that wasn’t found in testing but also wasn’t in the requirements. If it wasn’t in the requirements, how could he have tested it?

In my opinion, testing purely against requirements or acceptance criteria is never enough. Here’s why.

Imagine you have a set of perfectly formed requirements/acceptance criteria, which we’ll represent as this blue blob.

Requirements

Then you have a perfectly formed software system your team has built, represented by this yellow blob.

System

In a perfect, yet non-existent, world, the requirements/acceptance criteria are covered perfectly by the system, and the system consists only of those requirements/acceptance criteria.

Requirements - System

But in the real world there’s never a perfect overlap. There are requirements/acceptance criteria that are either missed by the system (part A) or met by the system (part B); both of these can easily be verified by requirements or acceptance criteria based testing. Most importantly, though, there are things in your system that are not specified by any requirements or acceptance criteria (part C).

Requirements - System(1)

The things in part C often consist of requirements that have been made up (assumptions), as well as implicit and unknown requirements.

The biggest flaw in testing against requirements is that you won’t discover the things in part C, because they’re not requirements! But, as the example from the testers meetup shows, even if something isn’t specified as a requirement, the business can consider it one when it affects usage.

Software development should aim to have as few assumptions, implicit and unknown requirements in a system as reasonably possible. Different businesses, systems and software have different tolerances for how much effort is spent on reducing the size of these unknowns, so there’s no one size fits all answer to how much is enough.

But there are two activities that a tester can perform and champion on a team which can drastically reduce the size of these unknown unknowns.

1 – User Story Kick-Offs: I have only worked on agile software development teams over the last few years, so all functionality I test is developed in the form of a user story. I’ve found the best way to reduce the number of unknown requirements in a system is to make sure every user story is kicked off with a BA, tester and developer (often called the Three Amigos) present, and each acceptance criterion read aloud and understood by all three. At this point, as a tester, I like to raise items that haven’t been thought of, so they can be specified as acceptance criteria rather than making it (or not making it) into the system through assumption.

2 – Exploratory Testing: As a tester on an agile team I make time to not only test the acceptance criteria and specific user stories, but to explore the system and understand how the stories fit together and to think of scenarios above and beyond what has been specified. Whilst user stories are good at capturing vertical slices of functionality, their weakness, in my opinion, is they are just a ‘slice’ of functionality and often cross-story requirements may be missed or implied. This is where exploratory testing is great for testing these assumptions and raising any issues that may arise across the system.

Summary

I don’t believe there’s a clear answer to how much testing above and beyond requirements/acceptance criteria is enough. There will always be things in a system that weren’t in the requirements, and as a team we should strive to reduce what falls into that category as much as possible, given the resources and time available. It isn’t the tester’s role to just test requirements, nor to be solely responsible or accountable for requirements that weren’t specified: the team should own this risk.

Intentionally Disposable Software


There’s a series of code retreats that take place each year where a group of programmers get together to work in groups to solve a problem (kata). They do this in iterations over and over again, and most importantly they delete their entire code at the end of every iteration (typically 45 minutes).

“It’s much easier to walk a new road when the old road is out of sight”

~ Drew Miller

Programmers don’t delete enough production code. Which is funny because I’ve met heaps of programmers, including myself, who love deleting code; it’s a strangely satisfying, cleansing ritual.

What if we could replicate what we do when doing these katas and delete the entire source code of a system every 6 months or a year and start again? Does that sound crazy? Can we do this?

We couldn’t do this with how we currently develop software. It’s because we build software systems that are way too complex, have features that no-one uses and are built to last way too long. We expect our software systems to last 10+ years, and we’re so anxious about long-lasting technology impacts we make overly cautious or conservative decisions that come back to bite us constantly in the years to come. We build technical debt into legacy systems that no one wants to work on. We obsess about re-usability, longevity and salvageability of every piece of code we write. We build massive overly complex regression test suites because we expect the system to work for so long and be maintained for so long we expect it to eventually deteriorate and we want a regression test suite as a ‘safety net’ when it inevitably does.

Maintaining legacy software is like painting the Sydney Harbour Bridge. You start painting it on one side and by the time you get to ‘finish’ it on the other side it’s taken you so long you now need to start repainting the original side again. Wouldn’t it be easier to just build a new bridge?

What we need is Intentionally Disposable Software. We need to build software only designed to last 6 months, a year max. As soon as we deploy it we start immediately on a replacement for it. We put bare minimum effort into maintenance as we’ll just replace what we have in Production as soon as we can: why wash up when you can use fresh paper plates for every meal? As soon as the replacement is ready, we deploy that and completely blow away the old software system. We rinse and repeat.

It’s somewhat similar to planned obsolescence but we don’t do it to annoy our customers and attempt to generate repeat purchases, we do it to refine our software system without any legacy.

We use analytics to tell us exactly what features of the system in production are used and most importantly, what features are little or never used. We don’t build those features into the replacement systems ever again so each system we build is leaner, more focused on the important things it is meant to do and does them better each time. We don’t waste any time on building or supporting unimportant things.

We don’t have time to build up technical debt. We aren’t anxious about choosing a wrong technology. Did we use AngularJS and now hate it? Never fear, we start work immediately on our new system replacement and use ReactJS (or whatever the latest/coolest framework is).

Developer happiness skyrockets! No legacy code! No technical debt! Everyone can work on the latest/best technology to get the job done and want to stick around in our organization to do just that. It’s like being a consultant without being a consultant. Because everyone has already implemented the same thing before, everyone is aware of the gotchas, so whilst people are constantly learning new technology, they’re efficient because they know what they’re actually trying to achieve. And because we’re building such a lean system we’re lean in our approach.

We do need to make sure we use open standards and have an easy way to export/import/migrate data in fast, clean ways – which is good.

The same applies to integration points, we need to be modular enough and use open standards and protocols exclusively to be able to drop out one system and replace it with another that integrates easily.

So what about testing?

If we expect a system to last six months to a year, we need just enough testing. We need just enough testing to make sure the system is built right (doesn’t break), but not too much requirements based testing around building the right system, because we know we won’t build the right system, we’ll be building a ‘more right’ replacement as soon as this one is built.

We need testing that focuses on intention of the system over implementation, because the implementation will constantly change each time we rewrite it. If we write our automated tests in a declarative style devoid of implementation detail we’ll be much better off.

Think


Scenario: Overseas customers must provide ID for expensive purchases
Given I am a non-Australian customer
When I order more than $100 of tea
Then I need to provide additional ID on checkout

Not

Scenario: Overseas customers must provide ID for expensive purchases
Given I go to the Beautiful Tea Home Page
When I click Sign InAnd I enter my username and password
And I click OK
...

Think of Disposable Software like building a house.

You want somewhere to live so you decide to build a house. You design and build a new ‘dream home’ with the best intentions, but soon you realize there’s a big difference between what you actually need and what you thought you needed. There are also some fundamental design flaws you didn’t even notice until it was built and you were living in it, like mold in the bathroom because it doesn’t have enough air-flow, and bedrooms that face the wrong direction and are constantly too hot to sleep in at night. Plus life has since thrown a new baby into the mix, so 12 months later you find yourself with a house full of design flaws that doesn’t meet your now much clearer, and since expanded, requirements.

So what do you do? Since you’ve invested a lot (financially and emotionally) into your house and you expected it to last ten years or more, you renovate. But the problem with renovating is that you’ve got to work around all the original design flaws, and since the house is already built it’s much more difficult and expensive to make changes to it. And since you’re living in it, any renovation comes with the risk of disruption/displacement to the occupants, including an overly sensitive newborn. Since it’s mainly custom work, you find the renovations you’re planning will cost nearly as much as the original house.

This is like legacy software. You can’t renovate it easily as it’s already in use, you’re constantly working around its design flaws so it’s much more difficult and costly to make changes to it. Plus it’s really hard to remove the unnecessary parts of your house by renovation.

But what’s the alternative? What if you built the house knowing that come 12 months’ time you could knock it down, recycle it, and build a new house knowing exactly what you want this time around? You’ll know not to face the bedrooms west. You’ll know to make sure the bathroom has plenty of air-flow. You’ll even include a bedroom suitable for the baby. But you don’t get too caught up in getting this house ‘perfect’, because come 12 months’ time you can do it all again. The house could be prefabricated so it’s much cheaper to build off-site in a construction-friendly environment, and the migration involves temporarily moving some furniture from the old structure, placing the new house in place with the furniture, and recycling the old one. You own fewer objects (clutter) because you know this happens and are prepared for it. As your kids grow up their needs change, so instead of doing ‘extensions’ and ‘renovations’ you simply redesign your new house, which will be delivered in 12 months’ time.

This is like Intentionally Disposable Software.

As Alex Bundardzic wrote almost ten years ago in his fantastic Disposable Software post:

“In reality, a good software developer should be able to… deliver a disposable, short lived product that addresses only the pressing needs of the moment, and then simply disappears. No maintenance, no enhancements, no song-and-dance, nothing.”

What do you think?

Never compare your organization’s insides with another organization’s outsides

I once heard the brilliant suggestion that you should never compare your insides with another person’s outsides; because they’re not the same thing. For example, just because someone may seem happy and drive an expensive car, that’s only the outside view of that person and it doesn’t paint the full picture of that person’s insides, which you’re dangerously trying to compare with your own inner thoughts/status.

The same applies with comparing things at your organization to things you’ve heard about other organizations. Countless times, including just this week, have I heard managers and colleagues repeat things they’ve heard, like: “Facebook don’t have testers”, “Google has 10,000+ engineers in 40 offices working on trunk” and “Flickr deploys to production 10 times a day so we can too”. These are all examples of comparing our insides to others’ outsides.

Yes, Google may have 10,000+ engineers committing to one branch, but having spoken to people who work at Google, it’s not quite as amazing as it seems. Firstly, the code-base is broken down into projects (imagine the checkout time without this), and each and every change set must be code reviewed and have automated and manual tests performed against it (which can take hours or days) before it is even committed to the trunk, let alone considered for a production release.

I didn’t realize it at the time but the keynote at GTAC last year captured this phenomenon perfectly:

Google from the outside (like a jet plane)
Google from the inside (lots of people pushing a broken-down car)

Summary

Not only is it really annoying/unhealthy for staff to constantly hear such comparisons, it can also be dangerous: doing something just because Google/Facebook/Twitter/Flickr does it, without knowing the inner workings of their organizations, will inevitably lead to failure when you try to do it without that context and experience.

So next time you are tempted to drop something you’ve heard from a conference or a blog post about how another company does something better than yours, or to justify that we can/should do it this way, remember, never compare your organization’s insides with another organization’s outsides.

Iterative vs Incremental Software Development

What’s the difference between ‘iterative’ and ‘incremental’ software development?

I know a lot of agile software development teams call their blocks of development time ‘iterations’ instead of ‘sprints’. Does that mean they’re doing iterative software development?

You’ve probably seen the Mona Lisa analogy by Jeff Paton that visually tries to show the difference between the two development approaches:

Incremental Development:

incrementing

Iterative Development:

iterating

But which is better?

Well, if for some (very likely) reason (lack of money, changed business conditions, change in management) we had to stop after iteration/increment one or two, which approach would yield a better outcome?

Mona Lisa

Incremental development gives us a painting of half a lady whereas iterative development gives us an outline of a lady, but both paintings really wouldn’t belong in The Louvre. Perhaps we could have just painted a smaller painting?

This is where I think the Mona Lisa art analogy falls apart. A work of art, like a book, but unlike a piece of software, has a pretty clear definition of done. An artist knows when their piece of art is done: not a single stroke more, not a single stroke less.

But I’ve never worked on a piece of software that was considered done: there’s always more functionality to add/remove/fix.

If we can recognize that software is never done, all we need to do is work out how to get it to where we want it to be (for now).

“We shall not cease from exploration, and the end of all our exploring will be to arrive where we started and know the place for the first time.”
~ T.S. Eliot

If we are driven by time to market, we should internally iterate just enough so we can release ‘increments’ fast and often, and iterate/release again and again.

If we are driven by user experience, we should internally iterate a lot to get things right, release increments only when necessary, and iterate again.

Both approaches are about iterating. Both are also about incrementing. The difference is how soon we release after how many times we iterate.

Compare the beginnings of the two dominant mobile operating systems. Google went for time to market with Android, they released an unpolished, yet feature rich, operating system quickly and made it better by iterating/incrementing again and again over time. Apple took the opposite approach: they released iOS with highly polished features relatively slowly (it took three major iOS releases to get MMS and copy & paste!) but focused on getting things right from the start.

Both approaches are different but neither are wrong: they highlight the differences between Apple and Google and their approach to developing software.

Summary

We can’t build anything without iterating to some degree: no code is written perfectly the second that it is typed or committed. Even if it looks like a company is incrementally building their software, they’re iteratively building it inside.

We can’t release anything without incrementing to some degree: no matter how small a release is, it’s still an incremental change over the last release. Some increments are bigger because they’ve already been internally iterated upon more, some are smaller as they’re less developed and will evolve over time.

So, we develop software iteratively and release incrementally in various sizes over time.

What is a good ratio of software developers to testers on an agile team?

The developer:tester ratio question comes up a lot and I find most, if not all, answers are “it depends”.

I won’t say “it depends” (it’s annoying). I will tell you what works for me given my extensive experience, but will provide some caveats.

I’ve worked on different agile software development teams as a tester for a number of years and I personally find a ratio of 8:1 developers to tester(s) (me) works well (that’s four dev-pairs if pair programming). Any fewer developers and I’m bored; any more and I have too much to test and cycle time is in jeopardy.

Some caveats:

  • I’m an efficient tester and the 8:1 ratio works well when there are eight equally efficient programmers on the team – if the devs are too slow, or the user stories are too big, I get bored;
  • Everyone in the team is responsible for quality; I have to make sure that happens;
  • A story must be kicked off with the tester (me) present so I can question any assumptions/anomalies in the acceptance criteria before any code is written;
  • A story is only ready for test if the developer has demonstrated the functionality to me at their workstation (bonus points in an integrated environment) – we call this a ‘shoulder check’ – much the same way as monkeys check each others shoulders for lice;
  • A story is also only ready for test if the developer has created sufficient and passing automated test coverage including unit tests, integration tests (if appropriate) and some acceptance tests; and
  • Bug fixes take priority over new development to ensure flow.

What ratio do you find works for you?

Why to avoid t-shirt sizes for user story estimation

The more I work on agile software development teams who use t-shirt sizes (S,M,L,XL etc.) to estimate user stories the more I dislike this approach. Here’s why:

  • In my opinion, the most important thing about user story sizing is relativity, and t-shirt sizes are a subjective measure of relativity: someone in the team might think a large is two times as big as a small, whereas another person might think it’s three times as big. The t-shirt analogy doesn’t help: how much bigger is a large t-shirt than a small one, really?
  • You can’t create a single measure of team velocity unless you define a scale that converts t-shirt sizes into numbers.
  • As soon as you create a scale to convert t-shirt sizes into a numeric size you’ve essentially started using story points (in a convoluted way).
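To see the point in practice, here’s a tiny sketch of what that conversion scale ends up looking like. The mapping values are invented, as any team’s would be, and the moment you write them down you’re doing story points:

```python
# Hypothetical t-shirt-to-points scale -- every team would invent
# their own, which is exactly the problem.
TSHIRT_POINTS = {"S": 1, "M": 2, "L": 5, "XL": 8}

# Stories completed this iteration, estimated in t-shirt sizes.
completed = ["S", "M", "M", "L"]

# Velocity is only measurable once sizes become numbers.
velocity = sum(TSHIRT_POINTS[size] for size in completed)
print(velocity)  # 10
```

The dictionary is the giveaway: the t-shirt labels do no work at all, and the numbers do everything.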

TL;DR: Using t-shirt sizes for user story estimation is confusing and ultimately leads the team to using story points so just skip t-shirt sizes and use relative story points instead.

Waterfall, Agile Development & Hyperbole

Hyperbole. Love it or hate it, it’s been around for centuries and is here to stay. And, as someone pointed out this week, I’m guilty as charged of using (abusing?) it on this blog. You just need to quickly flick through my recent posts to find melodramatic titles such as ‘Do you REALLY need to run your WebDriver tests in IE?‘, ‘UI automation of vendor delivered products always leads to trouble‘, and ‘Five signs you’re not agile; you’re actually mini-waterfall‘. Hyperbole supports my motto for this blog and my life: strong opinions, weakly held.

But it’s not just me who likes hyperbole mixed into their blog posts. Only this morning I read the catchily titled ‘Waterfall Is Never the Right Approach‘, quickly followed by a similarly catchily titled rebuttal: ‘Why waterfall kicks ass‘ (I personally would have capitalized ‘NEVER’ and ‘ASS’).

While I found both articles interesting, I think they both missed the key difference between waterfall and agile software development (and why waterfall rarely works in these fickle times): waterfall is sequential whereas agile development is (at least meant to be) iterative.

I personally don’t care whether you do SCRUM or XP, whether you write your requirements in Word™ or on the back of an index card, or even if you stand around in a circle talking about what card you’re working on.

What I do care about is whether you’re delivering business value frequently and adjusting to the feedback you get.

Sequential ‘big bang’ development such as waterfall, by its nature, delivers business value less frequently, and chances are when that value is realized the original problem has changed (depending on how long ago that was), because as I stated and believe, we live in fickle times.

Iterative development addresses this by developing/releasing small fully functional pieces of business value iteratively and adjusting to feedback/circumstance.

Just because an organization practices what they call ‘agile’ doesn’t mean they’re delivering business value iteratively. I’ve seen plenty of ‘agile’ projects deliver business value very infrequently: they put a sequential process into agile ‘sprints’, followed by a large period of end-to-end, business and user acceptance testing, with a ‘big bang’ go live.

Whilst I believe iterative development is the best way to work, I’m not dogmatic (enough) to believe it’s the only way to work. Whilst I believe you could build and test parts of, say, an aeroplane iteratively, I still hope there’s a sequential process with a whole heap of testing at the end on a fully complete aeroplane before I take my next flight in it.

Tips for great brown bag lunches

I’m a big fan of brown bag seminars also called brown bag lunches or just brown bags. I’ve seen them used very successfully to share knowledge and increase team bonding. Here’s some tips to make them successful for you.

Commit to a date and lock in a topic and presenter

Since a brown bag lunch is just as much about discussion as content, I find it’s good to commit to a date and lock in a topic and presenter. This puts pressure on the presenter to make time to get their content ready, and also not worry about having it ‘perfect’.

Give everyone an opportunity to present: try to avoid having the same person presenting over and over again. A good way to harvest ideas is to have a spot near your team wall (or a Trello board) where people can suggest topics they’d like to hear about or present.

Don’t limit the audience

Resist the temptation to make a brown bag lunch only for programmers, or only for business analysts etc. Even if the topic is aimed at programmers or testers, it’s good to have a goal to make your content interesting enough that it’ll appeal to the programmer or tester in anybody.

Don’t limit yourself to content that is directly aligned with your current work

Whilst content that is directly aligned to work is good as it’s a good way to get buy in, it’s also good to present content loosely related to what people are working on. For example, you could present a brown bag on distributed version control systems (such as git) to a team purely used to working with centralized version control (such as Subversion or TFS).

If you have a couple of short presentations during a single brown bag lunch you could possibly even have one that isn’t related to work. This is a little risky of course, but it can also be fun (I’m sure that everyone would love to hear about arid plants!). It’s also a good way to break any information filters we have.

Provide lunch

When I first started organizing brown bags, I couldn’t work out whether the term brown bag seminars came from people bringing along their own lunch in a brown bag or being provided lunch in a brown bag. But through experience I have found providing a good lunch is a key contributor to a successful brown bag seminar: ‘chimpanzees who share are chimpanzees who care‘. It also provides a good motivator for people to give up their lunch break and come along because who can resist a free lunch, right?

Make sure everyone knows each other

If you’ve got a new team, or people from different areas who don’t know each other, start with a quick icebreaker where you go around the room and get everyone to introduce themselves. I usually follow the format of ‘name’, ‘role’, ‘a fun fact’ and another random tidbit such as ‘my biggest fear’ or ‘what I’m looking forward to’.

Make sure everyone takes something away

I follow the icebreaker with a question to the audience: ‘what do you expect to get out of today’s session?’ I bring a bunch of Post-it notes and sharpies along and get each person to write a few things they want to get out of the session and stick them to the wall. Ten minutes before the end of the session the presenter reads out each objective and confirms each one has been met with whomever wrote it. If there’s something that wasn’t covered, it can be discussed, or it could even become the topic of a future brown bag.

I’ve seen lots of great objectives written from things like “learn more about automated mobile testing” to “have a nice lunch with my colleagues”.

Always leave plenty of time for discussion

The discussion generated by a brown bag seminar is as important as the content. Make sure you leave plenty of time to discuss what is being presented.

Summary

I thoroughly recommend brown bag lunches as an effective information sharing and team bonding technique, and if you get them right people can really enjoy them and look forward to them.

What’s your experience been with brown bag lunches? Good? Bad? Do you have any tips yourself?

Answer ‘Will it work?’ over ‘Does it work?’

Software teams must continually answer two key questions to ensure they deliver a quality product:

  1. Are we building the correct thing?
  2. Are we building the thing correctly?

In recent times, I’ve noticed a seismic shift in a tester’s role on an agile software team: from testing that the team is building the thing correctly to helping the team build the correct thing. That thing can be a user story, a product or even an entire company.

As Trish Khoo recently wrote:

“The more effort I put into testing the product conceptually at the start of the process, the less effort I had to put into manually testing the product at the end”

It’s more valuable for a tester to answer ‘will it work?‘ at the start than ‘does it work?‘ at the end. If you can determine that something isn’t the correct thing to build before development starts, you save the development, testing and rework that would otherwise go into building the wrong thing.

But how do we know it actually does work if we’re focused on will it work? How do we know that we’re building the thing correctly? The answer is automated tests.

Automated tests, written by programmers alongside testers during the engineering process, validate that the software does what it’s meant to do. Behavior-driven approaches help translate acceptance criteria directly into automated tests.

So, how can a tester be involved to make sure a team is building the correct thing?

  • get involved in writing the acceptance criteria for every story;
  • ensure a kick off for each story happens so the programmer(s) understand(s) what is expected and any edge cases or queries are discussed;
  • work with the programmer(s) to automate tests based upon the acceptance criteria;
  • ensure a handover/walk-through happens as soon as a story is finished in development to ensure that all the acceptance criteria are met and tests have been written;
  • showcase the finished product every iteration to the business.

You’ll soon find you can provide much greater value as a tester determining whether something will work and then working alongside the development team to ensure it works as it is developed.

Five remedies to make you less mini-waterfall

Yesterday I wrote about the five signs you might see if you’re practicing mini-waterfall when you think you’re agile.

In retrospect, that list seemed rather glum, so here’s five possible remedies to a mini-waterfall problem:

  1. Break user stories down into the smallest possible deliverable thing: if a story takes the whole iteration to develop then it’s too big; break it down. I’ve written previously on here about the importance of breaking things down.
  2. Embed a product owner in your team: that way stories won’t become blocked waiting for signoff, because the product owner is around to make the decision and unblock them. If you really can’t have a product owner embedded in your team, at least have an empowered proxy who can unblock requirements.
  3. Bring forward pain points that will take time at the end of the project: if promoting to prod is painful, try releasing to pre-prod each week. If UAT is going to take four weeks at the end of your project, do a day of it once a fortnight to get earlier feedback. By the time you get towards the end, your users will be so familiar with the system that they’ll be much more comfortable signing off in less time.
  4. Empower team members to change the agile process as they seem fit: they need to be responsible for their actions and be flexible to change again if needed, but don’t let your team members live in fear of change or retaliation.
  5. Release often: this will most probably be the hardest one to change as it’ll encounter the most resistance. You can start in small steps. For example, you could agree to release your application services/database more frequently than your user interface to see how it handles in production. Or you could release new functionality with a feature switch so that it’s disabled in production to rehearse and refine the release process. There’s no point in delivering working software every iteration if you’re only ever going to release it once or twice a year.
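The feature switch in the last remedy can start out as nothing more than a guarded code path. A minimal sketch, with a hypothetical flag name and checkout functions invented for illustration:

```python
# Hypothetical feature switch: the new code ships to production but
# stays dark until the flag is flipped (e.g. via config or env var).
FEATURES = {"new_checkout": False}

def legacy_checkout(cart):
    # The current production path.
    return sum(cart)

def new_checkout(cart):
    # The rewritten path we want to rehearse releasing.
    return round(sum(cart), 2)

def checkout(cart):
    # Route based on the switch; flipping it is the "release".
    if FEATURES["new_checkout"]:
        return new_checkout(cart)
    return legacy_checkout(cart)

print(checkout([1, 2]))  # 3 -- legacy path while the flag is off
```

This lets you exercise the whole release process, build pipeline to prod deploy, while the new functionality stays invisible to users until you’re ready.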

Five signs you’re not agile; you’re actually mini-waterfall

Update: I’ve added five remedies to make you less waterfall in a separate post

I’ve noticed a lot of projects call themselves agile when in fact they’re mini-waterfall, also known as scrumfall. Here’s five warning signs that you’ll see if you fall into that category:

  1. Development for your user stories seems to take almost all of the iteration, with stories only moving to ‘ready for test’ on the afternoon of the last day of your iteration
  2. You have a whole lot of user stories that are waiting for business ‘signoff’ and can’t be worked on
  3. You have a large chunk of time set aside at the end of the project for ‘user acceptance testing’
  4. Team members live in fear of changing something or moving a story card around as they’re scared of being ‘told off’
  5. You develop in iterations but only release everything big bang at the end when everything is considered ‘done’

Improving your agile flow

I’ve noticed two counterforces to flow on an agile team: rework and human multitasking. It’s common knowledge that rework is wasted effort, and human multitasking should be avoided as it reduces velocity through inefficient human context-switching, and can increase further errors through insufficient attention to tasks at hand.

But luckily there’s two simple things I have found that increase flow and reduce rework and multitasking.

User Story Kickoffs

It is essential that just before development work begins on every user story that a kickoff discussion occurs. This is a casual discussion around a computer between the business analyst, tester and any programmer who is working on the user story.

In my experience this takes about ten minutes standing around someone’s desk where we read aloud the acceptance criteria from Trello and discuss any ambiguities. We ensure that everything that is needed for the story to be considered complete and ready for testing is listed and that it’s not too large nor will take too long to complete.

We have special children’s stickers on our story wall which we put onto story cards that have been properly kicked off.

User story test handovers/shoulder checks

shoulder checks are essential

It’s also essential that as soon as development is complete that the tester and any programmers who worked on the story gather for a quick ‘shoulder check’ or test handover. This often involves letting the tester ‘play’ with the functionality on the programmer’s machine, and running through the now completed Trello acceptance criteria. Any misunderstandings or bugs can be discussed and resolved before the card becomes ready for testing.

We have special children’s stickers on our story wall which are added to a story card that has been handed over/shoulder checked. The aim is to have two stickers on every story card in ready for test.

How these two simple activities improve flow

By conducting a user story kickoff every time it means that everyone working on developing the functionality has a common understanding of what is required and therefore there is a lot less chance of developing something that is not needed or misunderstood which requires subsequent rework.

By conducting a story test handover/shoulder check every time it means that obvious bugs and misunderstandings are raised immediately, so they can be fixed quickly before the programmer(s) moves onto working on new user stories. If discovered later these cause the programmer to multitask and context-switch between working on bug fixes and new functionality.

But I’m too busy testing stories…

I used to think that, but now I’ve got a personal rule that regardless of what I am doing or working on, I will drop it to attend a story kickoff or test handover. The benefits of me conducting these activities outweigh any work that I need to resume after these activities are complete.

Bonus Time… is it essential your bugs are fixed?

The great thing about agile software development is that developing something and testing something are a lot closer together… but they’re still apart. It’s more efficient to get someone to fix a bug whilst it’s fresh in their memory, but it’s even more efficient to not fix it at all.

What I am proposing is instead of raising medium/minor bugs against a story to be tested, raise them as bugs in the backlog to be prioritized. Depending on your organization, your business may not consider these important enough to fix, and therefore this saves you both rework and context-switching so you can continue on developing new functionality.

It’s all about breaking things (down)

A lot of testers see their job as breaking things. I like to see my job as breaking things down. In my opinion, the only way to avoid breaking down is to break things down.

“The secret of getting ahead is getting started. The secret of getting started is breaking your complex overwhelming tasks into small manageable tasks, and then starting on the first one.”
~ Mark Twain

Software is designed to solve problems. The best way to solve problems is to break them down.

Agile software development is all about breaking things down:

  • A problem is too big so let’s break it down into a project
  • A project is too big so let’s break it down into iterations
  • An iteration is too big so let’s break it down into user stories
  • A user story is too big so let’s break it down into acceptance criteria
  • An acceptance criterion is too big so let’s break it down into some tests
  • Let’s write a failing test and make it pass
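That last step, writing a failing test and making it pass, looks like this in miniature. The `break_down` function and its test are invented purely for illustration:

```python
# Red: write the smallest failing test first. Running this before
# break_down exists (or while it's wrong) fails -- that's the point.
def test_break_down():
    assert break_down("a big problem") == ["a", "big", "problem"]

# Green: write just enough code to make the test pass.
def break_down(problem):
    # Break the overwhelming whole into small manageable pieces.
    return problem.split()

test_break_down()  # now passes
```

The rhythm is the same at every level of the list above: state the next small thing you want, watch it fail, make it pass, repeat.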

This technique came in particularly handy when writing my Einstein Minesweeper Robot.

So, if you come across a problem that seems too difficult to solve: break it down.

Is test management wrong?

I was somewhat confused by what was meant by the recent article entitled “Test Management is Wrong“. I couldn’t quite work out whether the author meant Test Management (the activity) is wrong, Test Managers (the people) are wrong or Test Management Tools (the things) are wrong, but here’s my view of these three things:

Test Management (the activity): now embedded in agile teams;
Test Managers (the people): on the way out; and
Test Management Tools (the things): gathering dust

Let me explain with an example. Most organizations see the benefit of agile ‘iterative’ development and have or are in the process of restructuring teams to work in this way. A typical transformation looks like this:

Agile Transformation

Instead of having three separate larger ‘analysis’, ‘development’ and ‘test’ teams, the organization may move to four smaller cross-functional teams consisting of, say, one tech lead, one analyst, one tester and four programmers.

Previously a test manager managed the testing process (and testing team), probably using a test management tool such as Quality Center.

Now, each agile team is responsible for its own quality, the tester advocates quality and encourages activities that build quality in such as accurate acceptance criteria, unit testing, automated acceptance testing, story testing and exploratory testing. These activities aren’t managed in a test management tool, but against each user story in a lightweight story management tool (such as Trello). The tester is responsible for managing his/her own testing.

Business value is defined and measured an iteration at a time by the team.

So what happens to the Analysis, Development and Test Managers in the previous structure? Depending on the size of the organization, there may be a need for a ‘center of excellence’ or ‘community of practice’ in each of the areas to ensure that new ideas and approaches are seeded across the cross-functional teams. The Test Manager may be responsible for working with each tester in the teams to ensure this happens. But depending on the organization and the testers, this might not be needed. The same goes for the Analysis Manager, and to a lesser extent, the Development Manager.

Step-by-step test cases (such as those in Quality Center) are no longer needed: each user story has acceptance criteria, and each team writes automated acceptance tests for the functionality it develops, which act as both automated regression tests and living documentation.

So to answer the author’s original question: no, I don’t think test management is wrong; we just do it in a different way now.