Chapter 8. Analysis

Table of Contents
Software systems development
May-June crisis
The knowledge-creating organization
Talking about software

A number of topics are raised in the vignettes. In this chapter I will try to analyze them and relate them to the theory.

Software systems development

The development process employed by the Apache project is a compromise. Early Apache developers seem to participate for one of two reasons. There are those who see the project as a way of getting the number of patches under control. For them the project's function is mostly administrative: without a centralized body, patch dependencies get out of control and it becomes impossible to fix the server's bugs. On the other hand, there are those who regard the project as a means of exploring and expanding Web technology. They want flexibility and ease of adding new features to the server.

The patch and vote system that emerges from the initial discussions aims to provide an administrative framework for tracking and integrating patches into the server while retaining speed and flexibility. The process reflects the tension between control and flexibility inherent in the project.

The patch and vote system consists of two elements: revision control and quality assurance. The necessity of revision control is undeniable; the pre-Apache NCSA patches bear witness to that. The lack of a fixed baseline against which to make patches results in a dependency chaos that eventually becomes too difficult to resolve. By rolling a number of patches into a release, the team provides a common baseline to work towards, the latest release always being the baseline against which new patches are made. Frequent releases, one of Robert Thau's main concerns, make patch dependencies manageable: the fewer patches not yet integrated with the code base, the fewer interdependencies to resolve when applying a patch.

Patches are voted for integration with a release. Before a patch may be integrated, it will have been tested by several other team members. The regular flow of events is for a developer to fix a bug or add new functionality to his own Web server. If it seems to be working, he runs diff to create a patch. The patch is then shared with the rest of the development team. Instead of developing extensive test suites, the developers run newly patched code on high-traffic Web sites. Experience from running the code results in a vote for or against integrating the patch.
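
To make the workflow concrete, the sketch below shows the kind of small, self-contained fix a developer might make to his own copy of the server before running diff. The function and the bug are hypothetical, invented for illustration; the point is that a patch is nothing more than the textual difference between the baseline file and the locally fixed file.

    #include <string.h>

    /* Hypothetical excerpt from a baseline release of the server.
     * The developer notices an off-by-one error: when len equals
     * dstsize, strcpy writes the terminating NUL one byte past the
     * end of dst. */
    static int copy_header(char *dst, const char *src, int dstsize)
    {
        int len = strlen(src);

        if (len >= dstsize)     /* baseline had: if (len > dstsize) */
            return -1;
        strcpy(dst, src);
        return len;
    }

Having verified the fix on his own server, the developer runs diff against the unmodified baseline file and mails the resulting patch to the list, where the other members apply it, run it, and vote.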

It may seem, then, that Eric Raymond is correct: is the Apache project just about sharing code and parallelizing the debugging effort? To understand the project's strength, I believe we have to look beyond the paraphernalia of the implementation effort.

Use-driven development

Models and methods are not the kind of terminology that suits the work style of the Apache project. Describing the project's working style in such terms would be to force the language of software engineering onto the software development practices of the Apache group. The patch and vote system may be construed as a process, but it is rather a technique for managing the complexity of virtual, distributed work. That the whole patch and vote process is abolished while Robert Thau completes the Shambhala core supports this. None of the Apache developers are ever explicit about the project's work style beyond the patch and vote system. The work style is implicit, and often fairly difficult to see even for the developers involved with the project. This is at odds with the literature on software systems development. Erling Andersen, for instance, says that "We have emphasized that prototyping must not be confused with an unsystematic and unplanned work style. If prototyping is to succeed, it has to be driven along explicit guidelines and by people who are experts in the area [of prototyping]." [ANDERSEN1998](Andersen 1998, p. 350, my translation). Booch, Jacobson, and Rumbaugh (1999) explicitly state that:

There is a belief held by some that professional enterprises should be organized around the skills of highly trained individuals. They know the work to be done and just do it! They hardly need guidance in policy and procedure from the organization for which they work.

This belief is mistaken in most cases, and badly mistaken in the case of software systems development. … developers need organizational guidance, which … we refer to as the "software development process" (author's note: process as in the convergence of several methodologies). [BOOCH1999](Booch et al. 1999, p. xvii)

The Apache group does not have any organizational guidance to work by. Their work style is unsystematic and unplanned. Yet they obviously succeed in their efforts; an almost 60% share of the Web server market is a good indication of success. I argue that this sheds some light on the over-emphasis on the paraphernalia of methods, models, tools and techniques in the software industry. Software engineering, where this quartet plays the lead role, belongs to the scientific management tradition where the work process is rationalized. "The basic premise of scientific management is that one can reduce the best way to do a given job to a set of instructions and give those instructions to someone who does not know how to do it independently but who will then be able to do the job by following the instructions," [ORR1996](Orr 1996, p. 107). This set of instructions is the models and methods of software engineering. Within a sociological tradition, this reduction of the job to a set of instructions is viewed as management's effort to gain control of the knowledge involved in the work process, de-skilling the labor. By de-skilling the work, management can hire less skilled and cheaper labor and thus rationalize the organization [BRAVERMAN1974](Braverman 1974).

The central element in developing Apache is use. The roles of user and developer converge. Use plays an evident role in testing and quality assurance: through a brute force approach where the software is put to use, bugs are detected and new functionality is tested. Raymond calls this parallelized debugging. Bugs are broadcast to the other developers through the mailing list. But more important is the role use plays in knowledge creation, something the May-June crisis shows.

Managing complexity

The Apache project grows out of a situation where the complexity of change has gotten out of hand. Patch interdependencies have, to a certain extent, become unmanageable. Thau and Skolnick's process discussion is to a large extent about handling the complexity of change. The two represent disparate ideas on how this complexity is to be managed.

Skolnick proposes administrative mechanisms to handle the complexity. [SKOLNICKMAR95A] The core of Skolnick's system is a unique identification number assigned to every submitted patch. The id is to be used when discussing the patch, and every day an automated e-mail is to notify the developers of changes made to the patches. Once again the id is to be used for traceability. Skolnick's plan is to manage complexity by linking the patch, discussions about the patch, and the possible bug reports the patch is to solve. His strategy, which is common in formal development environments, tries to reduce the environmental complexity of change by introducing traceability. Traceability suggests a controlled environment, where every change comes as the result of a planned course of action, be it a response to a bug report or to a feature request. Both are based on the notion of the change request, which controls what is to be changed and often how. Through strict control of the changes to be made, Skolnick's administrative process is to reduce the complexity of change.
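
Skolnick's scheme is administrative rather than code-level, but a small sketch makes the bookkeeping concrete. The record below is my own illustration; the source only specifies a unique id, discussion referencing that id, and links to the bug reports a patch addresses, so all field names are assumptions.

    /* Hypothetical record for one tracked patch under Skolnick's
     * scheme.  Every submitted patch gets a unique id; discussion
     * and the daily status mails reference that id, giving
     * traceability from a change back to its cause. */
    struct patch_record {
        int  id;            /* unique, assigned on submission        */
        char author[64];    /* who submitted the patch               */
        char baseline[32];  /* release the patch was made against    */
        int  bug_ids[8];    /* bug reports the patch claims to solve */
        int  status;        /* e.g. submitted, under test, applied   */
    };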

Thau's proposal suggests quite the opposite. [THAUMAR95B] By letting go of any administrative control on changes, he suggests controlling the complexity through scope reduction. Instead of adding constraints to the changes, Thau wants to baseline the system frequently. In this context each new release, informal or formal, is a baseline which all unresolved patches must use as their basis. The administrative complexity of Thau's suggestion is marginal; instead it distributes the complexity of change to each person submitting a patch. In theory, a new release may invalidate all previously submitted patches: as a patch is relative to its base file, once the base file has changed the patch no longer applies. In practice, not every file in the software is updated between releases, and for those that are, only patches directly dealing with code blocks that have changed require substantial work to bring them up to date. For other patches, it is just a question of inserting the changed code into the new release of the file and running diff anew.

The end result, though, is a mix of Skolnick's high ceremony approach and Thau's bare bones approach. A voting model is introduced for quality assurance, and all patches are assigned an identification number for reference. The rest of Skolnick's model is never put into effect.

Where Skolnick's proposal is centralized, Thau's is decentralized. This is the major difference in ideology between their proposals. Seen from the point of view of rationalizing and making the development effort more efficient, Thau's suggestion entails a lot of duplicated work, as patches need to be updated after each new release. By controlling the changes to be made, Skolnick's approach reduces and perhaps even eliminates this redundancy. Thau's position is that the work lost in updating patches is far less than the amount of time spent administrating a change control system. He sees control as a limiting factor on the creativity and speed of the development effort, and is willing to spend time updating patches to retain these advantages.

Thau's proposal sheds some additional light on Eric Raymond's principle "Release early. Release often." [RAYMOND1999](Raymond 1999). While Raymond lists this principle as an enabling condition for innovation, the Apache project also shows it to be an effective means of handling the complexity of change.

There is not much discussion surrounding either Skolnick's or Thau's process proposal. Instead the project participants simply start using those parts of the two systems that fit them. This way of voting with their feet can be interpreted as a variation on Levy's hands-on imperative: through their actions the project members decide which elements of a process model are required. While this might be a valid interpretation, it borrows much of Levy's romantic view of hacking, ignoring the nuts and bolts of everyday software systems development. A system is required to manage the complexity of change, which, prior to the setting up of the new-httpd mailing list, had been completely out of control.

Another aspect of the complexity issue, an aspect largely ignored by the process and methodology literature, is the complexity posed by technology. Technical complexity is not trivial. The non-forking server vignette shows that even though Apache is developed for Unix operating systems, small implementational differences between the various operating systems, along with hardware differences, impose a substantial technical complexity. The Apache project employs two strategies for resolving this complexity: one approach is analytical, the other uses brute force.

When implementing both non-forking behavior and virtual hosting in the Apache server, a group of team members discuss the various aspects of low-level network behavior for BSD sockets. Approaches are suggested, and responses given, based on experience with and knowledge of operating system specific behavior. This approach to managing technical complexity depends on in-depth knowledge of the combinations of technologies involved, knowledge that stems from experience using them. It is a social activity based on building a shared understanding of the complexities posed by the technology.
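
The crux of the non-forking discussion is how a single process can service many connections at once. Below is a minimal sketch of the select()-based multiplexing pattern such a discussion revolves around; it is my own illustration of the general technique, not Apache's actual implementation, and it glosses over exactly the platform differences that occupied the team.

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/select.h>
    #include <unistd.h>

    /* Minimal single-process accept loop over BSD sockets.  One
     * process multiplexes all connections with select() instead of
     * forking a child per request. */
    void serve(int listen_fd)
    {
        fd_set active, readable;
        int fd, maxfd = listen_fd;

        FD_ZERO(&active);
        FD_SET(listen_fd, &active);

        for (;;) {
            readable = active;
            if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
                continue;                /* interrupted; retry      */

            for (fd = 0; fd <= maxfd; fd++) {
                if (!FD_ISSET(fd, &readable))
                    continue;
                if (fd == listen_fd) {   /* new connection          */
                    int conn = accept(listen_fd, NULL, NULL);
                    if (conn >= 0) {
                        FD_SET(conn, &active);
                        if (conn > maxfd)
                            maxfd = conn;
                    }
                } else {                 /* existing connection     */
                    /* read request, write response (elided), then: */
                    close(fd);
                    FD_CLR(fd, &active);
                }
            }
        }
    }

Whether select() and accept() behave identically across the BSD socket implementations of the day is precisely the kind of question the analytical discussion tries to settle in advance.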

Once the discussion has reached a certain point, it is replaced by the brute force approach to managing technological complexity: implement the functionality, then try it out in as many different operating environments as possible. The many combinations of operating system and BSD socket implementation do prove difficult, as many of the bug reports during May-June-July deal with networking errors on certain platforms. The brute force approach is then to fix the bug and try once again. It is a process of trial and error. Trial and error is frowned upon by efficiency aficionados, as it is regarded as a failure to fully analyze the problem at hand. The question to be raised is whether a discussion could ever have exhausted every possibility and every difficulty in the combination of operating system and networking implementation. Another question is whether taking the discussion all the way and fully exploring every possibility would be more time-efficient. It is difficult to say, and it is a hotly debated issue.
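
The platform differences at stake are often tiny. One classic example from this era, sketched below as an illustration rather than a quote from the Apache source, is that the type of the length argument to accept() varied between Unixes, so code that compiled cleanly on one system would break or misbehave on another until someone actually ran it there. The macro and typedef names are hypothetical.

    #include <sys/types.h>
    #include <sys/socket.h>

    /* Illustrative portability shim.  Early Unixes disagreed on the
     * type of accept()'s third argument: plain int on some systems,
     * size_t on others, socklen_t once POSIX standardized it.  A
     * per-platform typedef, chosen by the build configuration,
     * papers over the difference. */
    #if defined(PLATFORM_HAS_SOCKLEN_T)
    typedef socklen_t net_size_t;
    #elif defined(PLATFORM_USES_SIZE_T)
    typedef size_t net_size_t;
    #else
    typedef int net_size_t;
    #endif

    int accept_connection(int listen_fd)
    {
        struct sockaddr addr;
        net_size_t len = sizeof(addr);

        /* On a platform where net_size_t does not match the system
         * prototype, this call is exactly where the build breaks;
         * the kind of bug found and fixed by trial and error. */
        return accept(listen_fd, &addr, (socklen_t *)&len);
    }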

The above discussion may give the impression that the Apache developers manage technological complexity in a well-defined manner. It is not always so. The introduction of Shambhala, for instance, sees no prior discussion; Robert Thau just goes straight ahead and implements the new architecture, a brute force approach of sorts. Discussions on the appropriateness of the solution come more as an afterthought. Even with virtual hosting, the real work starts with implementation. Unlike Shambhala, though, implementing virtual hosting is more of an iterative process where analysis and brute force follow each other in succession a number of times. Still, the two elements of managing technological complexity remain: analysis and brute force.