The non-forking server

It is mid-March and Rob Hartill posts his third e-mail to new-httpd that day. After having been quiet the first half of March, traffic is once again picking up on the mailing list. A 0.1 release has been out for a while, but the team is still discussing the project's name and its ultimate goals. They have therefore spent the past days discussing non-technical issues like the project name and a mission statement.

Rob Hartill is becoming increasingly famous for his work on the Internet Movie Database, IMDb. For him, IMDb is a past-time project, but it was actually this past-time that got him his current job at the Los Alamos National Lab in New Mexico. Hartill is one of the first people to employ the emerging Web technology to make an interesting application. For him the movies are only instrumental, his real interest lies in the challenge of doing something interesting with the new medium presented by the Web. Despite having a day job, Hartill puts some 20 odd hours a week into Apache and IMDb, making him a substantial contributor to the Apache project [GOSH1998](Gosh 1998).

By mid-March there are technical discussions on the agenda again. Posting his third e-mail that day, Hartill asks about the problems involved with implementing a non-forking server. He has prototyped two different approaches himself, neither of which seems to be particularly difficult nor complex. His first approach sets up N processes to handle incoming requests. When a process accepts an incoming connection, a new process steps up to wait for the the next connection. His second approach makes use of the BSD socket library's own scheduling mechanism when several processes simultaneously accept connections to the same port. A parent process is used to avoid race conditions where several processes processes accepts the same connection. Hartill suspects there might be issues with both approaches, but he can't see any himself [HARTILLMAR95D](Hartill, posted on the new-httpd mailing list March 14 1995).

There might be, Cliff Skolnick replies [SKOLNICKMAR95C](Skolnick, posted on the new-httpd mailing list March 14 1995), a problem with the way BSD sockets implements multiple accepts to a single port on multi-processor systems. At least with Solaris 2.x and SGI, there is. There is no defined behavior for multiple accept calls on multi-processor systems. This is not entirely true, corrects Rob McCool who programmed the original NCSA Web server:

The BSD spec doesn't define it, but all BSD implementations take care of this well. This includes SunOS, HP-UX, AIX, IRIX (even MP), OSF/1, and BSDI. [MCCOOLMAR95A](McCool, posted on the new-httpd mailing list March 14 1995)

Besides, the problem isn't merely isolated to multi-processor systems. McCool is experiencing the same issues on both single- and multi-processor systems running on the Solaris operating system.

Another general problem with non-forking, replies Robert Thau [THAUMAR95F](Thau, posted on the new-httpd mailing list March 14 1995), is the danger of memory leakage. This isn't a problem with forking servers as the handling process always dies after returning its reply to the client. Any memory leakages in the handling process are cleaned up as the process dies. With non-forking the processes do not die, and memory leaks accumulate. These are issues pertaining forking servers in general.

Thau also mentions a problem specific to the Apache code [THAUMAR95F](Thau, posted on the new-httpd mailing list March 14 1995). As the original code is forking, it has no concept of a per-transaction status. Because the handling process dies after finishing off the request, there is no need to reset variables holding values specific for the request. A non-forking server will have to clear all transaction-specific variables before a new request may be handled. There is a third approach to non-forking, concludes Thau, the approach taken by Rob McCool in the NetSite server. Unfortunately, there are issues with this approach as well.

Rob McCool implemented the original NCSA Web server code back in 1994. He has since left the NCSA team, and is now employed by Netscape. At Netscape he develops the company's Web server, NetSite. McCool has kept a rather low profile on the new-httpd list, his follow-up on Thau's e-mail being his first posting that month. McCool confirms both the memory leak and Thau's objections with the NetSite's servers non-forking code. "If I remember the NCSA code right, it leaks memory and file descriptors like a sieve. strdup() is used in a couple of places, though most of the code opts for the stack-based approach of memory allocation. At the time, doing all of that made sense…" [MCCOOLMAR95A](McCool, posted on the new-httpd mailing list March 14 1995), continuing with a description of the pros and cons with the mechanism employed by the NetSite server.

NetSite uses what McCool calls a mob process approach to non-forking. Robert Thau describes the mob process approach as "a bunch of processes all doing accepts on the same socket" [THAUMAR95F](Thau, posted on the new-httpd mailing list March 14 1995). Long requests tend to starve the resources, McCool says confirming Thau's objections to the mob process approach. As processes are not freed fast enough to handle incoming requests, the server may become unresponsive under heavy load. The mob process approach is to start all available Web server processes upon startup. This is done to avoid the shared memory requirements associated with coordinating growth and reduction of a process set. Requests that can't be handled when there are no available processes, are queued by the operating system. However, there is a limit to the number of connections queued before the operating system starts rejecting incoming connections. This, McCool concludes [MCCOOLMAR95A](McCool, posted on the new-httpd mailing list March 14 1995), is better than an uncontrolled spawning of new processes.

A few days later Brandon Long replies. Brandon Long has taken over McCool's old position as Web server developer with the NCSA team. Long, like McCool, has kept a low profile prior to this discussion. Now, however, he joins in to explain the non-forking approach taken by the next release of the NCSA server [LONGMAR95A](Long, posted on the new-httpd mailing list March 16 1995).

Like the mob process mechanism, Brandon Long has opted for forking a number of child processes at startup. Unlike the NetSite design, he has chosen to let the parent process act as scheduler. It hands off incoming connections to one of the child processes, before returning to accept new incoming connections itself. This approach is prone to the same limitation as NetSite's approach, in that long requests starve the resources by not freeing processes fast enough. To overcome this, Long lets the Web server revert to a fork-and-die state under heavy load. The Web server continues to operate in this state until one or more of the initial processes are available again. It is a slight improvement of the original NCSA code.

Over the next few days a short discussion ensues over the NCSA 1.4 non-forking. Rob Hartill suggests serialization before a call to accept [HARTILLMAR95E](Hartill, posted on the new-httpd mailing list March 16 1995), but Rob McCool argues that this will indeed make the parent process a bottleneck as handling one request will require three process switches [MCCOOLMAR95B](McCool, posted on the new-httpd mailing list March 17 1995). Hartill replies that his suggested system is only required for multi-processor systems, as it has the potential of staying one step ahead of each connection [HARTILLMAR95G](Hartill, posted on the new-httpd mailing list March 17 1995). With that the thread of discussion dies without having reached a conclusion. No patch to implement non-forking is submitted, and the issue stays dormant for another two weeks.

Late March and early April the NCSA team releases two beta versions of their upcoming 1.4 series of Web servers. Rob Hartill is not particularly impressed with the series' non-forking mechanism. He says it is "fundamentally flawed" [HARTILLAPR95A](Hartill, posted on the new-httpd mailing list April 9 1995). Through stress testing, Hartill has discovered a concurrency issue with 1.4's non-forking code. Despite Hartill's objections, Robert Thau goes through with one of his many side projects, and merges the NCSA 1.4 non-forking mechanism with the Apache code.

Thau restarts the non-forking discussion again in mid-April by suggesting to release a beta of Apache with the NCSA 1.4 code. His only objection to a new Apache release with the non-forking code is that it will be using NCSA 1.4 code before the NCSA team has released their 1.4 server publicly [THAUAPR95A](Thau, posted on the new-httpd mailing list April 12 1995). Apart from certain reservations regarding legalese, the NCSA team does not mind this [LONGAPR95A](Long, posted on the new-httpd mailing list April 12 1995) [FRANKAPR95A](Frank, posted on the new-httpd mailing list April 12 1995). Brian Behlendorf, on the other hand, is not too keen on releasing a non-forking Apache yet [BEHLENDORFAPR95A](Behlendorf, posted on the new-httpd mailing list April 12 1995).

Returning to the issues Hartill has expressed about the NCSA's non-forking code, Behlendorf would like to wait with releasing a non-forking Apache. Instead of serializing the processes, which is the approach taken by Brandon Long in the NCSA code, Behlendorf would rather do multiple accepts on a single port. He believes this to be a better approach if the multi-processor issues can be resolved. Brandon Long is apparently not too happy with his non-forking code. He joins the discussion to back Behlendorf's argument [LONGAPR95A](Long, posted on the new-httpd mailing list April 12 1995).

Once again the discussion sputters and dies without there being reached a conclusion. When Apache release 0.6 comes up for votes, there is no mention of non-forking code.

Commentary

The vignette shows the number of concerns involved with implementing non-forking behavior in the Apache server. It also shows the complexity with which the issues intertwine, the sheer amount of caveats to take into account, and the depth and breadth of technical knowledge required to understand the problem complex. In addition to directly related technical concerns, the developers discuss indirectly related topics like interprocess communication, licensing and the morality of using other people's code, to mention a few subtopics. The main topic of discussion is the best non-forking architecture for multiple software and hardware platforms. The challenge is to come up with the most effective architecture for non-forking that works equally well on single- and multi-processor systems.

Berkley sockets, BSD sockets, or simply sockets is a TCP/IP implementation originally written for BSD Unix in 1983. Sockets, provides both kernel level support by implementing the TCP/IP stack and an abstract programmers' interface (abbrev. API) to program networked software. Its story is not unlike that of Apache. Without going into details, the code was made publicly available on the Internet leading to widespread use by most Unix versions. Many Unix versions started with some version of the Berkley networking code, including the API. These are often called Berkley-derived implementations. As such the API's function calls are a de facto standard when developing networked application for Unix [STEVENS1988](Stevens 1998). The problem is, as shown in the vignette, slight differences in behavior between the various derived sockets implementations.

The issue of forking vs. non-forking is not an issue particular to Apache, nor to Web servers for that matter. It is an issue related with Berkley sockets and the sequential nature of single process applications. "To allow for many processes within a single host to use TCP communication facilities simultaneously, the TCP provides a set of addresses or ports within each host" [RFC793](Postel 1981). The server listens to a specific port. The server process is ready to accept incoming connections with a call to the socket API's accept(2) function. The accept is blocking, first returning the incoming socket's file descriptor when a peer connects to the port. As single-threaded applications are sequential, the server can't return to accepting new connections until it is done processing the connection just having been made. As long as the accepting process is handling a connection, no new connections can be accepted and the server may seem to be non-responsive. For low-traffic, quick request–response cycles, such a mechanism is acceptable, but it does not scale at all under heavy load.

In multi-tasking operating systems like Unix, concurrency is used to resolve the problem. By spawning a child process as a new connection is accepted, control of the newly accepted connection can be handed to the child process. The parent process immediately returns to accepting new connections as soon as it has handed control to its child. The child processes the connection, and dies when the connection terminates. Spawning new processes is handled by the Unix system call fork(2). Handling multiple requests this way is consequently called forking.

The main drawback with forking is the overhead involved. The system call creates an exact copy of the original process. The contents of the parent's memory area is copied into a new memory area. While some Unix version have optimized the forking process, the overhead involved is still substantial. Forking is therefore a suboptimal solution when it comes to performance. Forking a new process for each incoming connection may become a bottle-neck under heavy system load and/or in situations where great many connections are made in a short time-span. This is the background for discussing non-forking with Apache.

To avoid the overhead associated with on-demand forking, the non-forking server will fork a number of child processes upon startup. These processes are pooled. There are several strategies to manage the pool's processes. The best strategy for pooling is the real topic of the non-forking discussions on new-httpd. The limiting factor with choosing the right design for non-forking is that Apache is supposed to work on a number of Unix versions, and on single- as well as multi-processor systems. It is the combination of multi-processor systems and sockets derivatives that are giving the Apache team a challenge. As Rob McCool notes, the behavior of serialized accepts on multi-processor systems has no specified behavior for various Unix versions' sockets derivatives.

Concurrency is another problem with developing for multi-processor systems. Single-processor systems maintain an illusion of concurrency by splitting processor time between the processes. Processing is still sequential, although performed in small time slices. Synchronization is therefore possible by defining volatile sections. A volatile section is a code segment that can't be shifted out of the processor until it is done processing. It is a good way of avoiding race conditions in concurrent code on single-processor systems. On systems with multiple processors, several processes actually do perform in parallel. Volatile sections can't ensure against race conditions, as code that leading to the race condition may actually be running in parallel on another processor. This increases the complexity of concurrency and synchronization on multi-processor systems.

It is this synchronization that seems to be leading to the problems Rob Hartill notes with Brandon Long's original non-forking code in NCSA 1.4.