The case study

The empirical basis of this thesis is a case study of the Apache project. I look at the project from its inception in early March 1995 until the release of Apache 1.0 by Christmas the same year. As mentioned above, the project's inception comes as a response to the state of the NCSA web server. When we enter the story, the Apache developers do not see themselves as more than a group of disgruntled users taking matters into their own hands. In fact, one participant says that the biggest problem facing the group is the newly released beta version of the NCSA 1.4 web server. Will the Apache group have a right to exist, a justification for their dissident activities, if the NCSA group finally gears up and releases a new version of their software?

The Apache project has no formal organization. It is not a membership organization. No prior requirements have to be met to participate. Anyone wanting to participate may do so, although most participants do contribute code to the project. Formally, there are no leaders. Every participant has an equal say in the decision-making process. The project is made up of people who use the NCSA web server but want to improve the software. Some of them have been exchanging bug fixes and feature enhancements for a good while. All participants are volunteers, developing Apache in their spare time. Almost all participants have full-time jobs or are pursuing higher education. They are all male, they all live in the United States or Western Europe, and they all have various kinds of higher education. While the demography of the Apache team is not a part of this thesis, I will try to give an impression of it through short presentations of the most prominent members during the course of the next chapter.

It is difficult to know what to call the group of people hanging around the new-httpd mailing list at this stage. They are undeniably a community. As such I could have called them the new-httpd crowd. But the mailing list's name, new-httpd, suggests they are doing more than just improving the existing NCSA server. So they are more than just a community. They are a community set on developing a new web server, a community of developers working on the Apache web server. I have therefore chosen to call the community the Apache group, and what they do the Apache project.

Members of the Apache project are geographically distributed. During this early stage of the project the majority of participants are found in North America, but a few can be found in Western Europe. The tools used in coordinating the development effort are crude compared to the many groupware offerings available on the market today. E-mail and FTP are the primary tools used in coordinating the team. This is sufficient for their needs: to coordinate the effort the developers need a means of communication, which e-mail provides, and a means of sharing files, which FTP provides. The mailing list is hosted on Brian Behlendorf's Internet host hyperreal.com. hyperreal.com forms the group's focal point, as it also hosts the project's FTP archives and their Web pages. The FTP server is assigned an Incoming directory, where contributors upload their patches.

While the Apache project has no formal organization, it does have an institutionalized development process. The process is based on peer review, where all new code submitted has to be voted upon before it can be integrated with the code base. New code is submitted in the form of patches. Votes are based on the participants' experiences of applying patches to their own web servers. Once enough patches have been tested, a round of voting is announced on the new-httpd mailing list. The announcement includes the list of patches up for voting. Everyone can cast their votes. Each developer casts one vote for every patch on the list. Voting is handled through a numeric system: a vote of +1 is cast for a patch that should be included in the next release, a 0 signals indifference about the inclusion or exclusion of a patch, and a -1 vetoes the patch. For a patch to be integrated with the official Apache code base, it needs at least +3 vote points and no vetoes.
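The acceptance rule described above can be made concrete with a minimal sketch. It assumes that vote points are simply the sum of the votes cast and that a single -1 blocks a patch regardless of the total. The function name patch_accepted is hypothetical and chosen for illustration only; the Apache group tallied votes by hand over e-mail, not with code.

```python
from typing import Iterable


def patch_accepted(votes: Iterable[int]) -> bool:
    """Apply the rule as described in the text: at least +3 points, no vetoes.

    Assumption: vote points are the plain sum of the +1, 0 and -1 votes.
    """
    votes = list(votes)
    if any(v not in (-1, 0, 1) for v in votes):
        raise ValueError("each vote must be -1, 0 or +1")
    if -1 in votes:
        # A single veto blocks the patch, whatever the total is.
        return False
    return sum(votes) >= 3


# Three approvals and one indifferent vote: the patch goes in.
print(patch_accepted([1, 1, 1, 0]))
# Four approvals but one veto: the patch stays out.
print(patch_accepted([1, 1, 1, 1, -1]))
```

Under these assumptions, a patch with many supporters can still be held back by one dissenting developer, which is what gives the veto its weight in the process.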

In this patch and vote system there are two semi-formal roles, the version builder and the vote coordinator. While the Apache voting rules and guidelines [HARTILL1995] imply that the vote coordinator role need not be assigned to a specific person who holds the title over time:

A voting session can be initiated by anyone so long as a volunteer or volunteers can be found to:

in reality one person holds this title over time and hands it off to his successor [THAUAUG95A](Thau, posted on new-httpd mailing list August 31 1995). The vote coordinator's task is to initiate the call for votes on new patches submitted since the last release. Upon initiation the round of voting is to be given a deadline, at which the voting ceases and the poll is handed over to the version builder.

The version builder is to "apply approved patches to create the NEXT version of the system" [HARTILL1995](Hartill and Fielding 1995). The version builder is also supposed to be "making the changes called for by other approved (non-patch) action items, adding the approved action item descriptions to the change log, and incrementing the version number" [HARTILL1995](Hartill and Fielding 1995). Once this is done, the source code is rolled into a tarball and uploaded to the Apache group's FTP site.

To keep track of their releases, the Apache team makes use of a version number scheme. The scheme consists of two mandatory elements, a major version number and a minor version number. The version number has the format <major>.<minor>. The major version number indicates the software's generation. Throughout most of the period I look at in this thesis, the major version number is 0. Once the Apache group views their code as mature, they increment the major version to 1. The minor version number is used by the version builder to indicate that he has released a new version of the software with new functionality added to it. The minor version number is incremented by one when rolling a new tarball.

There is also one optional element in the version scheme, the branching revision. The branching revision is an update to a minor release. It is branching in the sense that the release is meant to address bugs specific to a previous release. Instead of requiring a major upgrade of the software, the branching release only fixes bugs existing in a previous release. The branching release therefore contains only bug fixes and no new functionality. The branching release number is appended to the minor version number using a dot as separator, resulting in something like 0.6.1, where the final digit, one, is the branching release to minor release 0.6. The branching release numbering mostly follows the same rules as the minor version, but the release manager does at times deviate from the scheme to add extra information to the release.
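The numbering scheme can be illustrated with a small sketch. It covers only the plain numeric form <major>.<minor> with an optional branching revision, and does not attempt to handle the occasional deviations mentioned above. The names Version, parse_version, next_minor and next_branch are hypothetical. Two choices in the sketch are my assumptions rather than documented practice: that the first branching release of a minor version is numbered 1, and that a new minor release starts without a branching revision.

```python
from typing import NamedTuple, Optional


class Version(NamedTuple):
    major: int
    minor: int
    branch: Optional[int] = None  # optional branching revision

    def __str__(self) -> str:
        base = f"{self.major}.{self.minor}"
        return base if self.branch is None else f"{base}.{self.branch}"


def parse_version(text: str) -> Version:
    """Parse '<major>.<minor>' or '<major>.<minor>.<branch>'."""
    parts = text.split(".")
    if len(parts) not in (2, 3) or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a plain version number: {text!r}")
    return Version(*(int(p) for p in parts))


def next_minor(v: Version) -> Version:
    # New functionality: increment the minor number by one.
    # Assumption: the new release carries no branching revision.
    return Version(v.major, v.minor + 1)


def next_branch(v: Version) -> Version:
    # Bug fixes only: add or increment the branching revision.
    # Assumption: the first branching release is numbered 1.
    return Version(v.major, v.minor, (v.branch or 0) + 1)


v = parse_version("0.6")
print(next_branch(v))  # bug-fix release on top of 0.6
print(next_minor(v))   # next release with new functionality
```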

Since February 1995 every e-mail sent to the new-httpd mailing list has been archived. These archives form the empirical basis of this thesis. Before presenting excerpts of this data, I will go into some detail about the research method I have employed.