AMA: Iterative vs Incremental Development

Mario asks:

I have a question about your post on iterative vs. incremental software development:

https://watirmelon.blog/2015/02/02/iterative-vs-incremental-software-development/

In the incremental approach, can the few features that are implemented with all of their requirements be changed after user feedback? Or does this only happen with the iterative approach?

My response:

Thanks for your question, Mario. This can, and should, happen with both approaches, but I’d say the incremental approach is actually more likely to get customer/user feedback, as it delivers a more polished, albeit smaller, user experience and is therefore more likely to land in front of users. The painting analogy isn’t the best, as the requirements and the level of ‘done’ are pretty clear, but the general rule is to seek feedback as soon as possible, and both approaches are designed to do just that.

Avoiding LGTM PR Cultures

Introduction

Making a code change when using a distributed version control system (DVCS) like Git is usually done by packaging a change on a branch as a “pull request” (PR), which indicates the author would like the project to “pull” the change into it.

This was, and is, a key part of open source projects as it allows outside contributors to contribute to a project in a controlled way; however, many internal software development teams also work in this fashion, as this approach has many benefits over committing directly to a shared branch or trunk.

I’ve seen the pull request approach have a positive impact on software quality, since pull requests facilitate discussion through peer reviews and allow automated tests to run against every commit and change proposed to be merged into the default branch.


What is an LGTM PR culture?

I’ve also seen some negative behaviours emerge when moving to pull request based development, which I’ll call an LGTM PR culture.

LGTM is a common acronym in peer reviews of pull requests which means “Looks Good To Me”, and I’ve seen teams wave unsuitable changes through with LGTM comments instead of doing solid peer reviews and testing.

How do you know if you have an LGTM PR culture?

One way to “test” your peer review process is by creating PRs and deliberately leaving in a subtle bug, or something not quite right, that you know about. When the PR gets reviewed, do you get an LGTM? I did this recently, and even though the PR didn’t do what it was meant to do, I still received an LGTM 😕
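
As a purely hypothetical example of the kind of subtle bug you might plant, the boundary condition below is wrong, and it’s exactly the kind of thing a rubber-stamp review sails straight past:

```typescript
// Hypothetical planted bug: carts at exactly the threshold should qualify
// for free shipping, but the strict comparison excludes the boundary value.
const FREE_SHIPPING_THRESHOLD = 50;

function qualifiesForFreeShipping(cartTotal: number): boolean {
  return cartTotal > FREE_SHIPPING_THRESHOLD; // bug: should be >=
}

// A careful reviewer (or a boundary-value test) catches this; an LGTM doesn't.
console.log(qualifiesForFreeShipping(50)); // false, but should be true
```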


How can you move away from an LGTM PR culture?

It’s tempting to just tell everyone to do better peer reviews but it’s not that easy!

I’ve found there are some steps the author of a pull request can take to facilitate better pull request reviews and lead towards a better culture.

1. Make pull requests as small as possible

The smaller the pull request, the more likely you’ll get specific and broad feedback on it, and you can then iterate on that feedback. A 500 line change is daunting to review and will lead to more LGTMs. For larger refactorings, where lots of lines will change, start with a small, focussed change and gather plenty of review and discussion on it. Once the refactoring is established with that smaller example, you can apply the feedback to a broader-impact PR, which won’t need as much review because the new pattern has already been established.

2. Review your own pull request

Review your own work. This works best if you do something else and then come back to it with a fresh mind. If anything looks unclear or not quite right, leave a comment on your own PR encouraging other reviewers to look closely at those areas too.

3. Include clear instructions for reviewing and testing your pull request

A list of test steps is good, as is asking for the type of feedback you’d like – you can explicitly ask reviewers something like “please leave a comment after your review listing what you tested and which areas of the code you reviewed.” This discourages shortcuts and LGTMs.

4. Test your peer review process – see above.

Conclusion

Developing software using pull requests can mean much higher quality code and less technical debt, due to the peer review feedback that accompanies each pull request. As an author you can take steps to ensure your pull requests are easy to review, and so encourage a culture of effective peer reviews.

AMA: Time Estimation

Paul asks…

What is your stance on time estimation (involved people, granularity/level of detail, benefit)?

My response…

I’d like to start by stating that I’m by no means an expert on this topic; so please take what you will from what I write.

Time and effort estimation for any software development activity is very difficult to do, so we often get our estimates very wrong. I believe this is because we try to do up-front time and effort estimation without fully understanding the domain or the extent of the problem we’re solving; we still have many unknown-unknowns.

We can still do detailed/granular planning, but we should try to delay the detailed estimation of these until we have more information.

What I prefer is detailed planning up front, which involves breaking large, lofty goals down into small goals. These small goals are broken down further into the smallest manageable units of work that deliver something, however small that something is. It’s important to break things down to this level as it enables continuous delivery, and flexibility in scope as a project progresses.

Once these small units of work are detailed, but before trying to estimate them, I think there’s value in starting work and delivering some of them. This makes it possible to estimate the remaining work more accurately, based upon real delivery experience.

As soon as you begin working on each unit you should get a feel for the size and effort each unit requires, and over a period of time (say a fortnight) you can start to work out how many of these units you can achieve (your velocity). For example, if you delivered twelve units this fortnight, it’s reasonable to plan on roughly twelve for the next.

If you’ve got a detailed plan of how many units you’d like to achieve in total, it is probably at this point that you realise what you wanted to achieve is going to take too long or cost too much. This realisation means you need to prioritise all remaining work, and focus on what is high priority.

I’ve never seen a project finish with the same intentions as when it started, so as you progress you will find some items get completely de-prioritised (no longer in scope), some things become higher priority so they get delivered sooner, and some completely new ideas/pieces of functionality may be decided upon and included in your plan.

Since you understand what you’ve been able to deliver you can then have sensible conversations about what is feasible given the resources available.

Futurespectives are fun

Since my team (and every team at Automattic) is 100% distributed, it’s important that we meet in person a few times a year (somewhere in the world) to hang out, co-work, eat and plan together: we call these team meetups.

Two weeks ago I spent the week in La Jolla in beautiful Southern California working with my team. Each team member was asked to suggest activities/projects to work on for the meetup and I suggested we do a futurespective.

Most people are familiar with a retrospective as they’re very common in agile software development, but I’ve found futurespectives to be much less common.

A futurespective is an activity where a team can work together to create a shared vision for the future.

There’s not a huge amount of information online about how to facilitate a futurespective, so I went with this structure:

  1. Prime directive (5 mins)
  2. Check-in: clear the air (5 mins)
  3. Explain the purpose of the exercise: what we are aiming to get out of this (5 mins)
  4. Move to the future: Imagine a nirvana state (20 mins)
  5. Coming back: Success factors that got us there (20 mins)
  6. Now: what can we do to start achieving those success factors (20 mins)

Prime Directive

I found this prime directive online, and whilst it sounds a little cheesy, it set the tone for the exercise, which is about working together for a better future:

‘Hope and confidence come from proper involvement and a willingness to predict the unpredictable. We will fully engage on this opportunity to unite around an inclusive vision, and join hands in constructing a shared future.’ – Paulo Caroli and TC Caetano

Check in

There’s no point working on a team exercise to plan for the future if there’s something in the air, so it’s worthwhile just checking in on how everyone is feeling about the current state of things.

Explaining the Purpose of the Exercise

The prime directive is a good start for this, but it’s worth explaining that the team will be brainstorming and working together to produce a list of action items at the end of the exercise that will directly impact our future.

Move to the Future: Imagine a Nirvana State (20 mins)

This is where you start by setting the scene 12–18 months in the future, at a point where a particular milestone has been successfully achieved – this might be finishing a big project you’re working on, or having launched a new product. This is the nirvana state. Ask a question that you would like answered by this exercise, for example: ‘what does testing and quality look like on this day?’

Get each person to spend 10 minutes writing sticky notes about the state of your particular question: what it is like, without delving into how it got to be like this.

An example might be: ‘everyone is confident in every launch’ or ‘everyone knows what the right thing to work on is’.

As each person finishes, we put these sticky notes on a wall and logically group them, and then vote on which are most important (each person is typically given three votes and marks three notes or groups with a sharpie).

Coming back: Success factors that got us there (20 mins)

From the first exercise you should have a list of the three or four most important end-states, and now we use these to brainstorm for about 10 minutes the success factors (the hows) that got us to these end-states.

For example, a success factor for ‘everyone is confident in every launch’ could be ‘unit tests are super easy to write/run all the time (fast)’.

Once people have had time to write these up, we logically group them under our three or four headings on the wall so we can see these clearly.

Now: what can we do to start achieving those success factors (20 mins)

Our final activity is working out what we can do now that will lead to these success factors, which will in turn get us to our end-goals. At this point you can either brainstorm again, or start discussing as a team what we can do.

If you need some structure you could use “Start Doing/Stop Doing/Keep Doing” to prompt for ideas; otherwise use any format you want.

The goal here is, after 20 mins, to have a list of action items that you can easily assign to someone, knowing that these will lead to the success factors and end goals you’ve come up with as a team.

An example would be ‘ensure that 100% of bugs are logged in one tool (GitHub)’, which can be assigned to someone.

Ensure someone is tasked with taking photos, writing up the findings (at least the action items) and circulating these around.

Summary

The futurespective we ran as a team was very useful as it had enough structure to enable us to get through a lot of thinking in a short amount of time. We did this on the first morning of our meetup, and having this structured activity set the tone for the week, as we could refer back to what we’d discussed in later activities during the week.

I thoroughly recommend this as a team planning tool.

A tale of working from trunk

Let me share with you a story about how we went from long lived feature/release branches to trunk based development, why it was really hard and whether this is something I would recommend you try.

Background

I’m familiar with three main approaches to code branching for a shared code-base:

  1. Long lived feature/release branches
  2. Short lived feature branches
  3. Trunk based development

Long lived feature/release branches

Most teams start out using long lived feature/release branches. This is where each new project or feature branches from trunk, and at the point where the branch is feature ready/stable, its changes are merged into trunk and released. The benefit of this approach is that changes are contained within a branch, so there’s little risk of non-finished changes inadvertently leaking into the main trunk, which is what is used for releases to production. The biggest downside to this approach, and why many teams move away from it, is the merging that has to happen: each long lived feature branch needs to ultimately combine its changes with every other long lived feature branch and with trunk, and the longer a branch exists, the more it can diverge and the harder this becomes. Some people call this ‘merge hell’.

Short lived feature branches

Another version of feature branching is to have short lived feature branches, which exist to introduce a change or feature and are merged (often automatically) into the trunk as soon as the change is reviewed and tested. This is typically done using a distributed version control system (such as Git) and a pull request system. Since branches are ad-hoc and short lived, you need a continuous integration system that supports running against all branches for this approach to work (ours doesn’t); otherwise you’d need to create a new build config every time you created a short lived feature branch.

Trunk Based Development

This is probably the simplest (and most risky) approach, in that everyone works from and commits directly to trunk. This avoids the need for merges, but also means that trunk must be production ready at any point in time.

A story of moving from long lived feature/release branches to trunk based development

We have anywhere from 2-5 concurrent projects (each with a team of 8 or so developers) working off the same code base that is released to production anywhere from once to half a dozen times per week.

These project teams started out using long-lived feature/release branches specific to projects, but the teams increasingly found merging/divergence difficult – and issues would arise where a merge wasn’t done correctly, so a regression would be inadvertently released. The teams also found there would be manual effort involved in setting up our CI server to run against a new feature/release branch when it was created, and removing it when the feature/release branch was finished.

Since we don’t use a workflow based/distributed version control system, and our CI tools don’t support running against every branch, we couldn’t move to using short lived feature branches, so we decided to move to trunk-based development.

Stage One – Trunk Based Development without a Release Branch

Initially we had pure trunk based development. Everyone committed to trunk. Our CI build ran against trunk, and each build from trunk could be promoted right through to production.

[Diagram: trunk based development without a release branch]

Almost immediately two main problems arose with our approach:

  1. Feature leakage: people would commit code that wasn’t behind a feature toggle, and it would inadvertently be released to production. This happened a number of times, no matter how many times I told people ‘use toggles!’ (see the toggle sketch after this list).
  2. Hotfix changes using trunk: since we could only deploy from trunk, each hotfix would have to be done via trunk, and this meant the hotfix would include every change made between it and the last release (so, in the diagram above, if we wanted to hotfix revision T4 and there had been another three revisions, we would have to release T7 and everything else it contained). Trying to get a suitable build would often be a case of one step forward/two steps back, with other unintended changes in the mix. This was very stressful for the team and often led to temporary ‘code freezes’ whilst someone committed a hotfix into trunk and got it ready.
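
As a minimal sketch of what ‘use toggles!’ means in practice (the toggle names and config shape here are hypothetical, not our actual implementation), unfinished work stays dark behind a flag that defaults to off:

```typescript
// Hypothetical feature toggle config: every unfinished feature defaults to
// off, so trunk stays safe to release at any time.
const toggles: Record<string, boolean> = {
  newCheckoutFlow: false, // still in development: off in production
  savedSearches: true,    // finished and verified: on everywhere
};

function isEnabled(name: string): boolean {
  return toggles[name] ?? false; // unknown toggles are treated as off
}

// Call sites guard the unfinished code path:
if (isEnabled('newCheckoutFlow')) {
  // new, unfinished behaviour
} else {
  // existing production behaviour
}
```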

Stage Two – Trunk Based Development with a Release Branch

Pure trunk based development wasn’t working, so we needed some strategies to address our two biggest problems.

  1. Feature leakage: whilst this was more of a cultural/mindset change for the team – learning that every commit has to be production suitable – one great idea we did implement was TDC: test driven configuration (see the sketch after this list). Since tests act as a safety net against unintended code changes (similar to double entry book-keeping), why not apply the same thinking to config? Basically, we wrote unit tests against configuration settings, so if a toggle was turned on without having a test that expected it to be on, it would fail the build and couldn’t be promoted to production.
  2. Hotfixing changes from trunk: whilst we wanted to develop and deploy from a constantly verified trunk, we needed a way to quickly provide a hotfix without including every other change in trunk. We decided to create a release branch – not to develop new features on per se, but purely for production releases. A release would therefore involve deleting and recreating the release branch from trunk, to avoid any divergence. If a hotfix was needed, it could be applied directly to the release branch and the change merged into trunk (or the other way around), knowing that the next release would delete the release branch and start again from trunk. This alone has made the entire release process much less stressful: if a last minute change is needed for a release, or a hotfix is required, it’s now much quicker and simpler than releasing a whole new version from trunk, although that is still an option. I would say that nine out of ten of our releases are done by taking a whole new cut of trunk, whereas one or so out of ten is done via a change to the release branch.
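
Here’s a minimal sketch of what one of those test driven configuration tests might look like. I’m assuming a Jest-style test runner and a hypothetical toggles module (matching the toggle sketch above); the real tests depend on however your config is loaded:

```typescript
// toggles.test.ts: the build fails if someone flips a toggle without also
// updating the expectation, so no toggle reaches production unnoticed.
import { toggles } from './toggles'; // hypothetical config module

test('newCheckoutFlow stays off until launch', () => {
  expect(toggles.newCheckoutFlow).toBe(false);
});

test('savedSearches is on', () => {
  expect(toggles.savedSearches).toBe(true);
});
```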

[Diagram: trunk based development with a release branch]

Lessons Learned

It’s certainly been a ride, but I definitely feel more comfortable with our approach now we’ve ironed out a lot of the kinks.

So, the big question is whether I would recommend teams do trunk based development? Well, it depends.

I believe you should only consider working from trunk if:

  • you have very disciplined teams who see every single commit as production ready code that could be in production within an hour;
  • you have a release branch that you recreate for each release and can use for hotfixes;
  • your teams constantly check the build monitors and don’t commit on a red build – broken commits pile upon broken commits;
  • your teams put every new/non-complete feature/change behind a feature toggle that is toggled off by default, and test that it is so; and
  • you have a comprehensive regression test suite that can tell you immediately if any regressions have been introduced into every build.

Then, and only then, should you all work off trunk.

What have your experiences been with branching?

Extensive post-release testing is a sign of an unhealthy testing process

Does your organization conduct extensive post-release testing in production environments?

If you do, then it shows you probably have an unhealthy testing process, and you’ve fallen into the “let’s just test it in production” trap.

If testing in non-production environments were reflective of production behaviour, there would be no need to do production testing at all. But often testing isn’t reflective of real production behaviour, so we test in production to mitigate the risk of things going wrong.

It’s also often the case that issues found in a QA environment don’t appear in a local development environment.

But it makes much more sense to test in an environment as close to where the code was written as possible: it’s much cheaper, easier and more efficient to find and fix bugs early.

For example, say you were testing how a feature behaves at numerous times of day across numerous time zones. As you progress through different test environments, this becomes increasingly difficult to test:

  • In a local development environment: you can fake the time and timezone to see how your application behaves (see the sketch after this list).
  • In a CI or QA environment: you can change a single server’s time and restart your application to see how it behaves under various time scenarios: not as easy as faking the time locally, but still fairly easy to do.
  • In a pre-production environment: you’ll probably have clustered web servers, so you’ll be looking at changing something like 6 or 8 server times to test this feature. Plus it will affect anyone else using the environment.
  • In a production environment: you’ll need to wait until the actual time to test the feature, as you won’t be able to change the server times in production.
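
For instance, here’s a minimal sketch of faking the time in a local test run, assuming a Jest-style runner with modern fake timers (the function under test is made up for illustration):

```typescript
// Hypothetical time-dependent function under test.
function greetingFor(date: Date): string {
  return date.getHours() < 12 ? 'Good morning' : 'Good afternoon';
}

test('greets correctly just before midday', () => {
  jest.useFakeTimers();
  jest.setSystemTime(new Date('2021-06-01T11:59:00')); // pin the clock locally
  expect(greetingFor(new Date())).toBe('Good morning');
  jest.useRealTimers();
});
```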

Clearly it’s cheaper, easier and more efficient to test changing times in an environment closer to where the code was written.

You should aim to conduct as much testing as you can in earlier test environments and taper this off, so that by the time you get a change into production you’ll be confident it’s been tested comprehensively. This probably requires some change to your testing process though.

[Diagram: tests performed per environment]

How to Remedy A ‘Test in Production’ Culture

As soon as you find an issue in a later environment, ask: why wasn’t this found in an earlier environment? Ultimately ask: why can’t we reproduce this in a local development environment?

Some Hypothetical Examples

Example One: our tests fail in CI because of JavaScript errors that don’t reproduce on a local development environment. Looking into this we realize this is because the JavaScript is minified in CI but not in a local development environment. We make a change to enable local development environments to run tests in minified mode which reproduces these issues.
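
As a sketch of what that change might look like with a webpack-style bundler (the flag and plugin here are illustrative, not the team’s actual build config), local runs get an opt-in minified mode:

```typescript
// webpack.config.ts (illustrative): allow opt-in minification locally so
// local test runs can reproduce JavaScript errors previously only seen in CI.
import TerserPlugin from 'terser-webpack-plugin';

const minifyLocally = process.env.MINIFY === 'true';

export default {
  mode: minifyLocally ? 'production' : 'development',
  optimization: {
    minimize: minifyLocally,
    minimizer: [new TerserPlugin()],
  },
};
```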

Example Two: our tests failed in pre-production but didn’t fail in QA, because pre-production has a regular backup of the production database whereas QA often gets very out of date. We schedule a task to periodically restore the QA database from a production snapshot to ensure the data is reflective.

Example Three: our tests failed in production but hadn’t failed in pre-production: email wasn’t being sent in production, and we couldn’t test email in pre-production/QA as we didn’t want to accidentally send real emails. We configure our QA environments to send emails, but only to a whitelist of specified email addresses we use for testing, to stop accidental emails. We can then be confident that changes to emails are tested in QA.
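
A minimal sketch of that kind of whitelist guard (the addresses and environment check are made up; the real logic would live in whatever mail layer you use):

```typescript
// Hypothetical QA-only guard: drop any outbound address that isn't a known
// test account, so QA can exercise email without emailing real users.
const EMAIL_WHITELIST = ['qa-inbox@example.com', 'testers@example.com'];

function recipientsFor(to: string[], environment: string): string[] {
  if (environment === 'production') {
    return to; // production sends to the real recipients
  }
  return to.filter((address) => EMAIL_WHITELIST.includes(address));
}

// In QA, real customer addresses are silently filtered out:
console.log(recipientsFor(['customer@gmail.com', 'qa-inbox@example.com'], 'qa'));
// -> [ 'qa-inbox@example.com' ]
```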

Summary

It’s easy to fall into a trap of just testing things in production even though it’s much more difficult and risky: things often go wrong with real data, the consequences are more severe and it’s generally more difficult to comprehensively test in production as you can’t change or fake things as easily.

Instead of just accepting “we’ll test it in production”, try instead to ask, “how can we test this much earlier whilst being confident our changes are reflective of actual behaviour?”

You’ll be much less stressed, your testing will be much more efficient and effective, and you’ll have a healthier testing process.

Testing beyond requirements? How much is enough?

At the Brisbane Software Testers Meetup last week there was a group discussion about testing beyond requirements/acceptance criteria, and, if you do so, how much is enough? Where do you draw the line? The question came from an attendee whose manager had pulled him up over a production bug that wasn’t found in testing, but also wasn’t in the requirements. If it wasn’t in the requirements, how could he have tested it?

In my opinion, testing purely against requirements or acceptance criteria is never enough. Here’s why.

Imagine you have a set of perfectly formed requirements/acceptance criteria, which we’ll represent as this blue blob.

[Diagram: requirements (blue blob)]

Then you have a perfectly formed software system your team has built, represented by this yellow blob.

[Diagram: system (yellow blob)]

In a perfect, yet non-existent, world, all the requirements/acceptance criteria are covered perfectly by the system, and the system consists of only the requirements/acceptance criteria.

[Diagram: requirements and system overlapping perfectly]

But in the real world there’s never a perfect overlap. There are requirements/acceptance criteria that are either missed by the system (part A) or met by the system (part B). These can both be verified easily by requirements or acceptance criteria based testing. But most importantly, there are things in your system that are not specified by any requirements or acceptance criteria (part C).

[Diagram: requirements and system partially overlapping, showing parts A, B and C]

These things in part C often consist of requirements that have been made up (assumptions), as well as implicit and unknown requirements.

The biggest flaw in testing purely against requirements is that you won’t discover the things in part C, as they’re not requirements! But, as shown by the example from the testers meetup, even though something may not be specified as a requirement, the business can still consider it one when it affects usage.

Software development should aim to have as few assumptions, implicit and unknown requirements in a system as reasonably possible. Different businesses, systems and software have different tolerances for how much effort is spent on reducing the size of these unknowns, so there’s no one size fits all answer to how much is enough.

But there are two activities that a tester can perform and champion on a team which can drastically reduce the size of these unknown unknowns.

1 – User Story Kick-Offs: I have only worked on agile software development teams over the last few years, so all functionality I test is developed in the form of a user story. I have found the best way to reduce the number of unknown requirements in a system is to make sure every user story is kicked off with a BA, tester and developer all present (often called The Three Amigos), and each acceptance criterion is read aloud and understood by all three. At this point, as a tester, I like to raise things that haven’t been thought of, so that they can be specified as acceptance criteria rather than making it into the system (or not) through assumption.

2 – Exploratory Testing: As a tester on an agile team I make time to not only test the acceptance criteria and specific user stories, but to explore the system and understand how the stories fit together and to think of scenarios above and beyond what has been specified. Whilst user stories are good at capturing vertical slices of functionality, their weakness, in my opinion, is they are just a ‘slice’ of functionality and often cross-story requirements may be missed or implied. This is where exploratory testing is great for testing these assumptions and raising any issues that may arise across the system.

Summary

I don’t believe there’s a clear answer to how much testing above and beyond requirements/acceptance criteria is enough. There will always be things in a system that weren’t in the requirements, and as a team we should strive to reduce what falls into that category as much as reasonably possible given the resources and time available. It isn’t the tester’s role to just test requirements, or to be solely responsible/accountable for requirements that weren’t specified: the team should own this risk.