Caching Servers
For the caching servers in the front layer of your server architecture, you need to have a clear understanding of TCP connection scalability issues.
The first thing you may notice as the load on your system increases is that the cache server process runs out of file handles (unless its start script raises the limit). This is because the operating system uses one file handle for each connection, and on many systems the default number of handles a single user process is allowed to open is 1024. This problem can be temporarily fixed with the ulimit -n command (on Linux and FreeBSD). To fix it more permanently you need to edit /etc/sysctl.conf (on Linux and FreeBSD) or /etc/system (on Solaris). You can set the maximum number of file handles up to several hundred thousand, so there is no real limitation here.
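As a rough illustration of the per-process limit described above, the following Python sketch reads the file handle limit of the current process and tries to raise the soft limit. It uses the standard resource module (Unix only); the function names describe_fd_limits and raise_soft_fd_limit are made up for this example, and a real cache server would normally have its limit set by its start script rather than from inside the process.

```python
import resource


def describe_fd_limits():
    """Return the (soft, hard) limits on open file handles for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_fd_limit(target):
    """Try to raise the soft limit to `target` and return the resulting soft limit.

    Hypothetical helper: an unprivileged process can only raise its soft
    limit as far as the hard limit, so the request is capped there.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Only ever raise the limit; never lower it.
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft/hard limits:", describe_fd_limits())
    print("soft limit after request:", raise_soft_fd_limit(65536))
```

The change only applies to the calling process and its children, which is the same scope as a ulimit call in a start script.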
The operating system sets up TCP connections between a local port and an anonymous port on the requesting host:
cache01:2323 -> otherhost:1237
Port numbers are defined in the TCP protocol as unsigned 16-bit numbers, which gives a maximum port number of 65535. The local port number can, however, be re-used for connections to different hosts:
cache01:2323 -> otherhost:1237
cache01:2323 -> yetanotherhost:4545
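The reason a local port can be re-used is that a TCP connection is identified by the full combination of local address, local port, remote address and remote port. The following Python sketch is a toy model of that rule, not how a kernel stores its connection table; the Connection and ConnectionTable names are invented for the illustration.

```python
from typing import NamedTuple


class Connection(NamedTuple):
    """The four values that together identify one TCP connection."""
    local_host: str
    local_port: int
    remote_host: str
    remote_port: int


class ConnectionTable:
    """Toy connection table: accepts a connection only if its 4-tuple is new."""

    def __init__(self):
        self._open = set()

    def open(self, conn):
        if conn in self._open:
            return False  # this exact 4-tuple is already in use
        self._open.add(conn)
        return True


table = ConnectionTable()
# Same local port, different remote hosts: both are accepted.
first = table.open(Connection("cache01", 2323, "otherhost", 1237))
second = table.open(Connection("cache01", 2323, "yetanotherhost", 4545))
# An identical 4-tuple is rejected.
duplicate = table.open(Connection("cache01", 2323, "otherhost", 1237))
print(first, second, duplicate)
```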
This means that the maximum theoretical number of connections a cache server can handle is:
(65535 - reserved-ports) * incoming-ip-addresses
where reserved-ports is the number of ports reserved for system services by the operating system (usually 1024).
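To get a feel for the numbers, the formula can be evaluated directly. This is a minimal Python sketch; the function name max_theoretical_connections is made up for the example, and the default of 1024 reserved ports is the usual value mentioned above, not a measured one.

```python
MAX_PORT = 65535  # highest port number in a 16-bit port field


def max_theoretical_connections(incoming_ip_addresses, reserved_ports=1024):
    """Evaluate (65535 - reserved-ports) * incoming-ip-addresses."""
    return (MAX_PORT - reserved_ports) * incoming_ip_addresses


if __name__ == "__main__":
    # One incoming address, for example a single non-transparent load balancer.
    print(max_theoretical_connections(1))      # 64511
    # A thousand distinct client addresses seen by the cache.
    print(max_theoretical_connections(1000))   # 64511000
```

The result is a theoretical ceiling only; file handle limits and kernel resources will usually be reached first.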
For this to work well, the load balancer in front of the cache must be transparent: that is, it must supply the IP address of the request source and not its own IP address.
For example, if three users are visiting your web site:
user1:2213 -> load-balancer:80 -> cache01:80
user2:1212 -> load-balancer:80 -> cache01:80
user3:5333 -> load-balancer:80 -> cache01:80
then ideally, cache01 should see the IP addresses of the requesting clients (user1, user2 and user3) rather than the IP address of the load balancer. Your cache server will then be able to handle as many TCP connections as your load balancer can pass on (given that your operating system kernel manages to allocate and recycle TCP connections fast enough).
If this is not possible, then an alternative (but less satisfactory) solution is to increase the maximum number of possible connections by adding additional interfaces (and corresponding IP addresses) to the load balancer and/or the cache server.
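The effect of adding addresses can be estimated with a small calculation. The sketch below assumes that each pair of (load balancer address, cache server address) provides its own pool of source ports, which follows from TCP identifying a connection by both addresses and both ports. The function name and the default of 1024 reserved ports are assumptions made for this example.

```python
def connections_via_opaque_balancer(balancer_ips, cache_ips, reserved_ports=1024):
    """Estimate the connection ceiling when the cache only sees balancer addresses.

    Assumption: every (balancer address, cache address) pair contributes
    one independent pool of usable source ports.
    """
    ports_per_pair = 65535 - reserved_ports
    return ports_per_pair * balancer_ips * cache_ips


if __name__ == "__main__":
    # One address on each side: a single pool of source ports.
    print(connections_via_opaque_balancer(1, 1))   # 64511
    # Four balancer addresses and two cache addresses: eight pools.
    print(connections_via_opaque_balancer(4, 2))   # 516088
```

Even with several extra addresses the ceiling grows only linearly, which is why a transparent load balancer is the preferred arrangement.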