Project B: Risk Does Not Go Away if You Are Agile

In this project the aim was to replace the routing mechanism for a trading system. The trading platform had already been developed and was not the main subject of testing.

A risk-based approach to testing had been considered, but management decided it was not necessary. The main grounds were that all parts of the system were critical and therefore of equal risk. There was also considerable confidence in the trading platform because it had been fully tested. Finally, there was a feeling that, because agile inherently mitigates risk, no further risk analysis was required. How true is that?

I see agile mitigating risks in two areas. The first is in the business risk category, namely that of minimizing the risk of developing the wrong system, a system that does not support business operations. This is achieved by involving users directly in the agile process to incrementally specify and sign off stories. The second is in the project risk category. An agile approach is meant to reduce the risks of delivering late by decomposing the project into stories that Scrum, or similar management approaches, can manage well.

What agile does not appear to cover is product, or system, risk. I make this as an observation and not as a criticism. Product risks are risks associated with the failure of an internal component of a system. They can be closely related to architecture and design; for example, in a network system you avoid building single points of failure. A product risk register usually drives two elements in a project. The first is to reduce the impact of failure: by identifying high-risk components, developers can limit the damage a failure causes through redesign (e.g., by building multiple routes through a network). The Internet is perhaps the ultimate network design, having been designed to keep operating in a situation of global conflict. The second element is to reduce the probability of failure in the first place. High reliability is achieved by focusing testing on high-risk components, and the risk register is used to identify the test coverage required and the testing techniques, with metrics, that can provide it.
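To make the idea of a product risk register concrete, here is a minimal sketch in Python. It is an illustration only: the 1-to-5 scoring scales, the exposure formula (probability times impact), the threshold of 12, and the component scores are all assumptions chosen for the example, not figures from the project.

```python
from dataclasses import dataclass


@dataclass
class ProductRisk:
    component: str
    probability: int  # assumed scale: 1 (unlikely) to 5 (almost certain)
    impact: int       # assumed scale: 1 (minor) to 5 (severe)

    @property
    def exposure(self) -> int:
        # A common convention: exposure = probability x impact.
        return self.probability * self.impact


def prioritize(risks, threshold=12):
    """Return (risk, is_high) pairs sorted by descending exposure.

    The threshold is a hypothetical cut-off for "high risk".
    """
    ranked = sorted(risks, key=lambda r: r.exposure, reverse=True)
    return [(r, r.exposure >= threshold) for r in ranked]


# Hypothetical entries; the scores are invented for illustration.
register = [
    ProductRisk("transaction router", probability=4, impact=5),
    ProductRisk("reporting screen", probability=2, impact=2),
    ProductRisk("key server capacity", probability=3, impact=4),
]

for risk, is_high in prioritize(register):
    label = "HIGH" if is_high else "normal"
    print(f"{risk.component}: exposure {risk.exposure} ({label})")
```

Even a register this simple challenges the claim that every component is of equal risk: once probability is scored alongside impact, the components separate, and the high-exposure ones become the natural candidates for redesign and deeper test coverage.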

The lack of understanding of product risks regularly created problems that could have been forestalled. For example, transaction routers are tricky and complex, and they vary in their ability to replay or rerun transactions after a failure. It was discovered that under several circumstances trades could be lost and not rerun. The impact on testing was severe. Financial products often require test cycles spanning days or weeks. That is, testing on day one has to be successful for day two onward to proceed. If some of the day one transactions are lost due to, say, a configuration error in a router, and cannot be rerun, the tests from day two onward cannot be executed. If it is a router to a specific external system, then that system may have to be removed completely from the current cycle.

These risks could have been identified at the outset. However, without a risk workshop this information remained hidden until the risks turned into failures during test execution. Mitigating actions could have been put in place: additional tests could have been run before major test cycles to show that the router was configured correctly, and additional breakpoints could have been created. Instead, additional cycles and test runs had to be executed, consuming precious time and resources.
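A pre-cycle check of the kind described above could be as simple as sending a handful of probe trades through each route and reconciling what was sent against what arrived. The sketch below assumes that trade IDs can be collected on both sides of the router; the destination names and IDs are hypothetical.

```python
def find_lost_trades(sent, received):
    """Map each destination to the sorted IDs of trades sent but never received."""
    lost = {}
    for destination, trade_ids in sent.items():
        missing = set(trade_ids) - set(received.get(destination, ()))
        if missing:
            lost[destination] = sorted(missing)
    return lost


# Hypothetical probe run: one trade to exchange_b never arrives.
sent = {"exchange_a": ["T1", "T2", "T3"], "exchange_b": ["T4", "T5"]}
received = {"exchange_a": ["T1", "T2", "T3"], "exchange_b": ["T4"]}

lost = find_lost_trades(sent, received)
if lost:
    print("Do not start the cycle; lost trades:", lost)
else:
    print("All probe trades arrived; routing looks correct.")
```

Run before day one, a check like this surfaces a misconfigured route while it is still cheap to fix, rather than after a multi-day cycle has already been compromised.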

Another example of misunderstood risk was in the performance and capacity planning of some key servers. The Information Technology Infrastructure Library (ITIL) best practices place much emphasis on capacity planning (i.e., can your systems cope with the expected workloads, both normal and peak). This, in turn, feeds into design, development, and testing. However, no capacity planning or related risk assessment was done. Consequently, integration and acceptance testing suffered because servers were unable to cope with even moderate test workloads, let alone volume and stress tests. The situation was only resolved with a moratorium on development and testing while new servers were procured, installed, and configured, which caused further delays.
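The basic capacity question ("can the servers cope with peak workload?") can be answered with very simple arithmetic long before testing starts. The following sketch is an assumption-laden illustration: the throughput figures, the 30 percent headroom, and the function names are all hypothetical, and real capacity planning would also consider memory, I/O, and growth over time.

```python
def has_capacity(peak_tps, capacity_tps_per_server, servers, headroom_pct=30):
    """True if the servers can carry peak load while keeping headroom_pct spare.

    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    usable_x100 = capacity_tps_per_server * servers * (100 - headroom_pct)
    return peak_tps * 100 <= usable_x100


def servers_needed(peak_tps, capacity_tps_per_server, headroom_pct=30):
    """Smallest number of servers that satisfies has_capacity()."""
    usable_per_server_x100 = capacity_tps_per_server * (100 - headroom_pct)
    # Ceiling division using integers only.
    return -(-(peak_tps * 100) // usable_per_server_x100)


# Hypothetical figures: 500 transactions/s at peak, 200 per server, 2 servers.
peak, per_server, installed = 500, 200, 2

if not has_capacity(peak, per_server, installed):
    needed = servers_needed(peak, per_server)
    print(f"Under-provisioned: {installed} servers installed, {needed} needed.")
```

With these invented numbers, two servers offer 280 usable transactions per second against a peak of 500, so the shortfall is visible on paper rather than in the middle of acceptance testing.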

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd
