The User-Developer Convergence: Innovation and Software Systems Development in the Apache Project
The Apache project shows software systems development as an activity iterating between the collective and the individual. Individual initiative is paramount to maintaining the project's momentum. Yet building an understanding of the software, the theory behind it so to speak, is a social activity. Building a joint understanding of the software system and its environment, through collective reflection and the sharing of experiences and individual insights, is as important as building the system itself. With a collective mental model of the system, architectural drawings play no important role.
While the artefact of the architectural drawing is missing, this does not mean there has been no form of initial surveying prior to building the software. Apache starts out as a working piece of software, the original NCSA code. The original code proved inadequate in the long run, but it plays an important role in the initial surveying. The NCSA server plays the role of a working prototype. Fixing its bugs and adding new functionality gave the Apache developers something to talk about. While never expressed as such, the period leading up to Robert Thau's release of Shambhala can be understood as a period of collective exploration of what the Apache developers really want from their Web server. At the heart of this lie the dynamics between use and learning, and the importance of the user-developer convergence. It is the individual work of Robert Thau that saves the day, as he catches the essence of what the project members want from their Web server and implements it. The Shambhala code is Thau's personal insight and his understanding of the Web server. Of course, Thau's insights rely on the collective understanding built through working on Apache during the first six months of 1995.
The key to understanding learning and innovation in the Apache project therefore lies in understanding the dynamics of the user-developer convergence. Innovation is not a result of detached reflection and planning. It does not come from a thorough, structured analysis of the problem domain. Instead it comes as a byproduct of direct interaction between understanding, creating and using the software. Innovation in this respect becomes understanding what is needed to make use of the emerging Web technology for different purposes. Internet Service Providers need to host a number of individual Web sites on the same server. From this need arises virtual hosting. PHP arises from the need to provide server-side processing for Web pages. The user-developer convergence is learning by doing, and thereby innovating by doing.
The learning dynamics of the user-developer convergence can be explained using Nonaka and Takeuchi's theory of knowledge creation, as I have done in this thesis. Pelle Ehn makes use of Wittgenstein's language games to make a similar point in his study of the UTOPIA project (Ehn 1992), which would be an equally fruitful angle into my material. Both theories say something about how knowledge is shared. They highlight the social aspect of building a shared understanding of the problem domain. More interesting for software systems development in practice are the enabling conditions for such an exchange of ideas to take place. Nonaka and Takeuchi have identified these in their work, but for the special case of software systems development I believe the addition of use is appropriate. To be understood, software prototypes need to be used, becoming the focal point of an ongoing dialogue. With this thesis I have tried to show how successful such an approach can be.
While much of my analysis focuses on the knowledge side of software systems development, a goal of the vignettes has been to show the technical complexity involved in developing the Web server. While the Web server is a trivial piece of software in and of itself, the complexity presented by interoperability across hardware and software platforms requires a good deal of technical knowledge. Although Apache is written to run on Unix platforms based on similar APIs, there are nuances between the platforms that must be taken into consideration when writing code. Hardware presents its own challenges, as an issue like symmetric multi-processing, for instance, affects the way software libraries behave.
The patch and vote process is an administrative way of managing the complexity of change. Shambhala is a technical solution to some of the same problems, and also a way of handling the project's technical complexity. The architecture isolates the server's core functionality, providing hooks to attach additional functionality. This shows that there are limits to how complex a software system can be, despite all efforts to collaborate on building an understanding of it. It also shows that complexity needs to be handled, and it can't always be handled in the human world. Software complexity is a technical issue that requires technical know-how to solve. Solving it, on the other hand, isn't necessarily the result of reflection, but rather the contrary: the result of practical use.
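The idea of a small core that exposes hooks for additional functionality can be illustrated with a minimal sketch. This is not Shambhala's actual API: the names `Core`, `register` and `handle`, as well as the two example modules, are hypothetical and chosen only for illustration. The point is that the core knows nothing about what a module does; it only calls the registered modules in order until one of them handles the request.

```python
from typing import Callable, List, Optional

# A handler receives a request path and returns a response,
# or None to decline so that the next module gets a chance.
Handler = Callable[[str], Optional[str]]


class Core:
    """Hypothetical server core: it only dispatches to registered hooks."""

    def __init__(self) -> None:
        self._handlers: List[Handler] = []

    def register(self, handler: Handler) -> None:
        # Attach additional functionality without changing the core.
        self._handlers.append(handler)

    def handle(self, path: str) -> str:
        # Ask each module in registration order; the first answer wins.
        for handler in self._handlers:
            response = handler(path)
            if response is not None:
                return response
        return "404 Not Found"


def status_module(path: str) -> Optional[str]:
    # Hypothetical module answering only the /status path.
    if path == "/status":
        return "200 OK: server is running"
    return None


def static_module(path: str) -> Optional[str]:
    # Hypothetical module answering any path ending in .html.
    if path.endswith(".html"):
        return f"200 OK: contents of {path}"
    return None


core = Core()
core.register(status_module)
core.register(static_module)
```

In this sketch, adding a feature means writing a new module and registering it; the dispatch loop in the core stays untouched, which is one way of keeping a change local to the code it concerns.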
Hacking, as shown in the Apache project, is therefore to be understood as a social endeavour iterating between the collective and the individual, emphasising the skills and insights of individuals combined with collective reflection and the sharing of ideas and opinions in an environment with little or no difference between learning and doing.
My case study is limited in scope. Out of seven years of development, I have narrowed the case down to a period of some six months. I have further limited its scope by focusing on the development and knowledge aspects of the development effort. With a broader scope, there are other pieces of the Apache story that warrant a closer examination. In addition, the Apache story touches upon and hints at aspects of software systems development that might be well worth a closer look. In this section I will try to highlight a few of the topics for future research that appeal to me.
The Apache project shows that a traditional software engineering methodology isn't required to develop software. Nothing in the case, though, shows that methodology cannot be of use when developing software. It would be ignorant, not to say arrogant, to assume that methodology plays no role in developing software. There is a reason why software engineering has gained such widespread adoption. However, proponents of software engineering have their own ways of rationalizing its uses. Methodology can at times seem fuelled by its own inner logic. Looking at a project adhering to a formal software development process, using much of the same theory and methodology as I have applied to the Apache project, could lead to some interesting insights into both the nature of software development methodologies and how software is really developed.
Due to its widespread adoption, Apache is an important part of the Web infrastructure. There are two ways of looking at the infrastructural role Apache plays: at a micro scale and at a macro scale. At the micro scale it plays an important role in the infrastructure of a Web site. The server is the part that makes the Web pages available on the Internet. Apart from that, there are features specific to Apache. Many Web sites make use of these features. Making great changes to Apache could mean considerable restructuring of Web sites powered by Apache. These are the kinds of scaling issues Robert Thau has to take into consideration when writing the Shambhala core. There are definite elements of scaling of infrastructure in building the justification for Shambhala, elements which I have chosen not to include in this thesis but which are an important factor in understanding how software systems are being developed.
In the initial draft this thesis included a vignette titled the "AOL debacle" that takes place around Christmas 1996. At that time AOL was the largest Internet provider in the United States, providing Internet access through its own wide area computer network system. Access to the Internet was indirectly provided through Web proxies. An upgrade to their proxy software made all Web sites powered by Apache unavailable to AOL's users. AOL claimed this was an implementation error in Apache, while the Apache group claimed the contrary. The vignette shows both the technical and organizational elements of scaling infrastructure with a large inertia of installed base. A closer examination of the debacle could provide insights into the management of open, cooperative infrastructures like the Internet, which lack a central authority to enforce standards on actors within the infrastructure. There is also a technical aspect to the debacle. It shows some of the transition strategies inherent in HTTP, an interesting issue for better understanding how to design scalable IT infrastructure.
There is a conflict not only between labor and capital, but also between management planning and the role a development department plays. The conflict is apparent in large organizations where development departments are seen as instrumental in implementing the management's strategies. Deadlines are set in the context of management plans. Seen from a software development department with little or no influence on the decision process, these deadlines seem arbitrary and are often impossible to meet.
The Apache project is a simple organization with few conflicting interests. It has none of the organizational complexities of large, commercial organizations with their constant tension of intra-organizational conflicts of interest. Previous research has looked at the role software systems development plays in resolving the conflict between capital and labor, but as far as I can tell there is currently no research on developing software seen from the developers' point of view. It would be interesting to see how software is being developed in a corporate environment of constant intra-organizational tension, what role the development department plays in the power games between differing interests, and which role the developers themselves assume in such an environment.
There is no telling how the Apache project would have turned out if it had employed a more traditional software engineering approach to the development effort. As there is no data for a comparative study, it is impossible to do a qualitative analysis of their approach. It is also impossible to tell how the Apache project's approach to software systems development would have worked in another environment, say for developing mission-critical real-time applications. Empirical studies of work practices in more traditional software engineering environments could shed some further light on the role methodologies really play in developing software systems. Of special interest would be empirical studies of fairly similar projects with different approaches to software systems development. This could shed some light on the role of canonical and non-canonical practices.
Does the Apache project provide any evidence that software systems development is not an engineering discipline? It certainly shows that there is both a technical and a knowledge element involved. But what about the entire engineering metaphor? Is engineering really applying mathematics and scientific methods to construction? What is the role of construction drawings in an engineering process? Is building houses simpler than software systems development because all parties involved have an understanding, Ryle's theory, of what a house is? Some work has been done on the appropriateness of comparing software systems development with building houses, but can a study of construction builders' work practices shed further light on the appropriateness of the metaphor?