May-June crisis

During May and June 1995 the Apache project is on the brink of dissolution. Rob Hartill is the main, at times the sole, driving force behind the 0.6 and 0.7 releases. Each release in these series is immediately followed by bug fixes, and it seems that whenever one bug is fixed, new bugs appear at another end of the system. The problems Skolnick complains about during work on the 0.5 release remain. Why is that?

Software development aspect

I see several ways of reading the crisis story, depending on which view of software systems development is chosen.

Booch, Rumbaugh and Jacobson say a software development process "defines who is doing what when and how to reach a certain goal" [BOOCH1999](Booch et al. 1999, p. xvii). Apache's patch and vote process states what is to be done, how and when, but it says nothing about who. An element is missing from Booch et al.'s equation of process: there are no responsibility matrices. Work and responsibilities belong to the team as a collective. Can this shortcoming in the process be the reason behind the May-June crisis? The question to be answered is whether a responsibility matrix would have solved the problem at hand. The problem seems to be an organizational one rather than a shortcoming of the project's process model.

Bezroukov speaks of the problem of the lowest hanging fruit, which he describes as follows:

Those who can program naturally tend to work on programs they find personally interesting or programs that looks cool (editors, themes in Gnome), as opposed to applications considered dull. Without other incentives other than the joy of hacking and "vanity fair" a lot of worthwhile projects die because the initial author lost interest and nobody pick up the tag. [BEZROUKOV1999A](Bezroukov 1999)

This comment is consistent with the findings of Eric Monteiro in his study of the Internet Engineering Task Force's IPng working group [MONTEIRO1998](Monteiro 1998). Monteiro found that once the challenging issues had been dealt with, activity stopped even though work remained before the project could advance to the next step of the IETF work process. The outstanding issues were of a mundane character, most of them simply hard work that had to be done. Yet no one took hold of these issues to pull the project forward, leaving the IPng effort in a half-finished state for a long time.

The Apache project is a volunteer effort. There is really no way of exerting pressure to make people submit new patches and fix bugs. If volunteers don't want to contribute any more, they leave. If enough volunteers stop contributing, the collaboration breaks down, which is what happens to the Apache project during the May-June crisis. A volunteer project like Apache balances on a very fragile exchange economy. Eric Raymond [RAYMOND1999](1999) argues that participants in volunteer efforts like the Apache project must feel they are getting a sufficient return on the time and energy invested in the project. In a technical project like Apache, the return is most easily measured in terms of the technical quality of the software, but there are also social elements involved, Raymond points out. When the returns on the investment drop below a certain level, people stop contributing. With this in mind, the May-June crisis may be understood as a symptom that fixing bugs has become so difficult or tedious that contributors to the Apache project no longer feel they are getting a return on their investments.

That Hartill has been shoving "an elephant down the eye of a needle" and that Robert Thau does the Shambhala rewrite for himself are both examples of the single individual's role and importance in software development projects. These findings are consistent with the relative participation of team members commented upon in [WALZ1993](Walz et al. 1993, p. 72). Describing the group dynamics of their case study, Walz et al. comment on how a small number of individuals dominate the design process. Like Hartill and Thau in the Apache project, developer D1 of their study influenced the team effort both technically and administratively as he, like Hartill and Thau, tried to seek out and integrate new technical knowledge into the project. D1, like the Hartill/Thau duo, also influenced the team process by taking the initiative in the administration of team duties. Where D1 routinely volunteered to coordinate group activities, Hartill volunteers to pull releases 0.5 through the entire problematic 0.6 series single-handedly. Thau, for his part, volunteers for the other difficult issue, that of rewriting the code to allow easier integration of new features. The findings concerning the diagnosis of the May-June crisis are therefore highly consistent with similar studies of software development efforts.

Returning to Raymond's exchange economy argument, the May-June crisis may be explained in another way. Maybe it isn't the promise of bug-free software that attracts contributors to Apache in the first place, but the chance of enhancing Web technology. The state of the code makes adding new features too difficult, and people feel they aren't getting a sufficient return on their investments. Reading the mailing list reveals that the server functionality remains fairly stable from the introduction of Shambhala to Apache 1.0. This supports the idea that bug fixing is the only thing left on the agenda. Is the NCSA-based Apache code sufficiently stable? The sheer number of complaints about bugs in the 0.6 and 0.7 release series and the number of bug fix releases in these series speak of server code that is far from stable enough to be deployed in a production environment. Rob Hartill even discourages people from using these series in production. Now that there isn't much new to add to the server, the incentive to fix bugs is lost.

This line of argument is akin to Eric Raymond's "scratching an itch" argument. Raymond says the main motivation for contributing to a volunteer effort is to scratch an itch, i.e. to solve a problem or fill a need the contributor himself has. His argument has been criticized for ignoring the community factor of an open source project. The idealized image of hacking is that of contributing to the common good. For the Apache project, the May-June crisis shows that Raymond is right. Apart from Rob Hartill, nobody seems particularly interested in the dreary work of bug fixing.

Whether fixing bugs has become too hard or there simply is no incentive left to fix them is hard to say. It may well be a combination of the two factors. With this in mind, it is hard to see how a better defined who factor in the Apache process model would have prevented the May-June crisis. It seems more an organizational shortcoming, as there is no way of making people do anything. Would another process have solved the problem? There really is no data to sustain any view on the matter. It would obviously have been easier to force people into fixing the bugs, but there is no telling whether the bug problem would have remained. Would any other process have been able to foster the amount of innovation seen thus far in the Apache project? Again, without a comparative study there is no telling.

Of course, the whole line of reasoning above hinges on the assumption that fixing bugs is a boring activity shunned by the Apache developers. Maybe there are other reasons why bug fixes aren't submitted to the project? The May-June crisis reads almost like a tailored addendum to Naur's article on programming as theory building. The project's increasing problems with integrating new features and stabilizing the original NCSA-based code may look like a failure on the Apache developers' part to grasp the theory behind the NCSA code. It seems to be Naur's compiler story all over again. Judging from the mailing list, however, the source of the project's increasing problems is more prosaic. I think it is safe to conclude that the original NCSA code was poorly written, with little thought given to maintainability. It is, after all, Rob McCool's first attempt at programming in C.

What is interesting about the crisis is that from it emerges the next generation Apache architecture, Shambhala.

Direction, directing, drift

Central to the Apache group's existence is the Web server's development. Without a Web server to develop, they are nothing at this stage. That is why they worry about what will happen when the NCSA team releases version 1.4 of their Web server. Seen from a more corporate point of view, the Apache Web server's competitive edge is of strategic importance to the Apache group's survival. Their raison d'être is to create an improved version of the NCSA server based on available patches. If version 1.4 of the NCSA Web server provides the same functionality, or even improves on the functionality already available in the Apache Web server, there is little justification for the Apache group to continue its work. Formulating a strategy is a means of staking out the future direction, and it would be appropriate for a newly started project to stake out its direction so the participants have a common goal to work towards.

The Apache project really doesn't stake out a clear direction.

The traditional approach to strategic information systems, that is, information systems that through their competitive edge are of strategic importance to their producer, is to appraise the environment through a conscious and analytic process in which "threats and opportunities, … the strengths and weaknesses of the organization, key success factors and distinctive competencies are identified and translated into a range of … alternatives" [CIBORRA1994](Ciborra 1994, p. 7). Once the optimal strategy is found, it is laid out and implemented. This way the strategy is made explicit.

It has been pointed out that this way of thinking about strategy is flawed [CIBORRA1994](Ciborra 1994). The central axiom of traditional strategy formation is that anything can be resolved if only sufficient analytical power is applied to the problem. Just as non-canonical work practices separate doing from learning, the traditional approach to formulating strategy focuses on abstract knowledge and cranial processes instead of situated knowledge from within the organization. "Strategy formation tends to be seen by the mechanistic school as an intentional process of design, rather than one of continuous acquisition of knowledge in various forms, i.e. learning" [CIBORRA1994](Ciborra 1994, p. 9). The central critique of traditional strategy formation is that it is difficult to plan before the fact, and that competitive advantage stems from the exploitation of unique characteristics within the organization.

The realization that an organization's strategic advantage lies in exploiting its unique characteristics goes a long way towards explaining Nonaka and Takeuchi's enabling conditions of fluctuation, creative chaos and autonomy. Implemented in the organization, the traditional strategy can be understood as a battle plan. In order for it to succeed, the entire organization has to walk in step and work towards a shared common goal.

The Apache group has no explicit strategy, yet they succeed. The closest they come to an explicit strategy is the wish to develop the next generation Web server, reflected in the mailing list's name: new-httpd. There is no authority to tell anybody what to do, and there is no battle plan ordering strategically important features and bug fixes to be implemented. Everyone is free to work on whatever they would like. Instead of being a hazard, this freedom proves central in revitalizing the Apache Web server after the May-June crisis. While Rob Hartill is working hard to fix bugs in the original NCSA-based architecture, an effort that is as close as the development group comes to clear authority at this stage, the organization's freedom allows Robert Thau to implement the Shambhala architecture. With Shambhala the Apache Web server transcends its role as an improved NCSA server, becoming a fully fledged application in its own right.

Not only does Shambhala revitalize the stale development effort, it proves to be strategically important as it gives the server the flexibility required of the next generation of Web servers. Yet this happens without an explicit strategy ever being formulated. The strategy emerges from the organization's grassroots, addressing issues important to the developers. There is a crisis in which routines and organizational frameworks break down. While the old project, improving the original NCSA Web server, goes on, Robert Thau's re-implementation introduces redundancy. From the creative chaos a possible solution, Shambhala, emerges. This is possible because all developers enjoy practically unlimited autonomy within the organization.

There is another, less fortunate side-effect of this lack of direction. It is best seen in Rob Hartill's tireless effort from May through November to get the Web server into a state where it can be labeled version 1.0. Central to Hartill's efforts is the development of a server with the stability to serve Web pages without crashing. While heading the Debian GNU/Linux distribution project, Bruce Perens is said to have remarked that his job was like herding cats. The metaphor applies just as well to Hartill's effort. While he is trying to push the project forward towards a stable release, other participants seem more interested in side-projects like SSL/Apache. Yet others simply don't see the rush and are quite happy to let the bugs be fixed as people see fit. The problem of getting the code into shape for a 1.0 release is closely connected with Bezroukov's "problem of the lowest hanging fruit" [BEZROUKOV1999A](Bezroukov 1999), but it also shows the kind of drift to be expected from this kind of distributed collaborative work.