![]() |
![]() |
Home / Documentation / 1.0 / mod_perl 1.0 User Guide / | ![]() |
![]() |
![]() |
||||
![]() |
![]() |
|||
![]() |
![]() |
|||
![]() |
||||
![]() |
![]() ![]() ![]() |
![]() |
||
![]() |
Choosing the Right Strategy | ![]() |
||
![]() |
||||
![]() |
![]() |
![]() |
||
![]() |
||||
![]() ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
||
![]() |
![]() |
![]() |
![]() |
![]() |
||
![]() |
![]() |
![]() |
![]() |
![]() |
||
![]() |
![]() |
![]() |
![]() |
![]() |
||
![]() |
![]() |
![]() |
![]() |
![]() |
||
![]() |
![]() |
![]() |
![]() |
![]() |
||
![]() |
||
![]() |
||
![]() |
||
![]() |
||
![]() |
||
![]() |
||
|
||
![]() |
This document discusses various mod_perl setup strategies used to get the best performance and scalability of the services.
There is no such thing as the right strategy in the web server business, although there are many wrong ones. Never believe a person who says: "Do it this way, this is the best!". As the old saying goes: "Trust but verify". There are too many technologies out there to choose from, and it would take an enormous investment of time and money to try to validate each one before deciding which is the best choice for your situation.
With this in mind, I will present some ways of using standalone mod_perl, and some combinations of mod_perl and other technologies. I'll describe how these things work together, offer my opinions on the pros and cons of each, the relative degree of difficulty in installing and maintaining them, and some hints on approaches that should be used and things to avoid.
To be clear, I will not address all technologies and tools, but limit this discussion to those complementing mod_perl.
Please let me stress it again: do not blindly copy someone's setup and hope for a good result. Choose what is best for your situation -- it might take some effort to find out what that is.
In this chapter we will discuss
There are several different ways to build, configure and deploy your mod_perl enabled server. Some of them are:
Having one binary and one configuration file (one big binary for mod_perl).
Having two binaries and two configuration files (one big binary for mod_perl and one small binary for static objects like images.)
Having one DSO-style binary and two configuration files, with mod_perl available as a loadable object.
Any of the above plus a reverse proxy server in http accelerator mode.
If you are a newbie, I would recommend that you start with the first option and work on getting your feet wet with Apache and mod_perl. Later, you can decide whether to move to the second one which allows better tuning at the expense of more complicated administration, or to the third option -- the more state-of-the-art-yet-suspiciously-new DSO system, or to the fourth option which gives you even more power.
The first option will kill your production site if you serve a lot of static data from large (4 to 15MB) webserver processes. On the other hand, while testing you will have no other server interaction to mask or add to your errors.
This option allows you to tune the two servers individually, for maximum performance.
However, you need to choose between running the two servers on multiple ports, multiple IPs, etc., and you have the burden of administering more than one server. You have to deal with proxying or fancy site design to keep the two servers in synchronization.
With DSO, modules can be added and removed without recompiling the server, and their code is even shared among multiple servers.
You can compile just once and yet have more than one binary, by using different configuration files to load different sets of modules. The different Apache servers loaded in this way can run simultaneously to give a setup such as described in the second option above.
On the down side, you are playing at the bleeding edge.
You are dealing with a new solution that has weak documentation and is still subject to change. It is still somewhat platform specific. Your mileage may vary.
The DSO module (mod_so
) adds size and complexity to your binaries.
Refer to the section "Pros and Cons of Building mod_perl as DSO for more information.
Build details: Build mod_perl as DSO inside Apache source tree via APACI
The fourth option (proxy in http accelerator mode), once correctly configured and tuned, improves the performance of any of the above three options by caching and buffering page results.
The next part of this chapter discusses the pros and the cons of each of these presented configurations. Real World Scenarios Implementation describes the implementation techniques of these schemes.
We will look at the following installations:
The first approach is to implement a straightforward mod_perl server. Just take your plain Apache server and add mod_perl, like you add any other Apache module. You continue to run it at the port it was using before. You probably want to try this before you proceed to more sophisticated and complex techniques.
The advantages:
Simplicity. You just follow the installation instructions, configure it, restart the server and you are done.
No network changes. You do not have to worry about using additional ports as we will see later.
Speed. You get a very fast server and you see an enormous speedup from the first moment you start to use it.
The disadvantages:
The process size of a mod_perl-enabled Apache server is huge (maybe 4MB at startup and growing to 10MB and more, depending on how you use it) compared to typical plain Apache. Of course if memory sharing is in place, RAM requirements will be smaller.
You probably have a few tens of child processes. The additional memory requirements add up in direct relation to the number of child processes. Your memory demands are growing by an order of magnitude, but this is the price you pay for the additional performance boost of mod_perl. With memory prices so cheap nowadays, the additional cost is low -- especially when you consider the dramatic performance boost mod_perl gives to your services with every 100MB of RAM you add.
While you will be happy to have these monster processes serving your scripts with monster speed, you should be very worried about having them serve static objects such as images and html files. Each static request served by a mod_perl-enabled server means another large process running, competing for system resources such as memory and CPU cycles. The real overhead depends on the static object request rate. Remember that if your mod_perl code produces HTML code which includes images, each one will turn into another static object request. Having another plain webserver to serve the static objects solves this unpleasant obstacle. Having a proxy server as a front end, caching the static objects and freeing the mod_perl processes from this burden is another solution. We will discuss both below.
Another drawback of this approach is that when serving output to a client with a slow connection, the huge mod_perl-enabled server process (with all of its system resources) will be tied up until the response is completely written to the client. While it might take a few milliseconds for your script to complete the request, there is a chance it will be still busy for some number of seconds or even minutes if the request is from a slow connection client. As in the previous drawback, a proxy solution can solve this problem. More on proxies later.
Proxying dynamic content is not going to help much if all the clients are on a fast local net (for example, if you are administering an Intranet.) On the contrary, it can decrease performance. Still, remember that some of your Intranet users might work from home through slow modem links.
If you are new to mod_perl, this is probably the best way to get yourself started.
And of course, if your site is serving only mod_perl scripts (close to zero static objects, like images), this might be the perfect choice for you!
For implementation notes, see the "One Plain and One mod_perl enabled Apache Servers" section in the implementations chapter.
As I have mentioned before, when running scripts under mod_perl you will notice that the httpd processes consume a huge amount of virtual memory -- from 5MB to 15MB and even more. That is the price you pay for the enormous speed improvements under mod_perl. (Again -- shared memory keeps the real memory that is being used much smaller :)
Using these large processes to serve static objects like images and html documents is overkill. A better approach is to run two servers: a very light, plain Apache server to serve static objects and a heavier mod_perl-enabled Apache server to serve requests for dynamic (generated) objects (aka CGI).
From here on, I will refer to these two servers as httpd_docs (vanilla Apache) and httpd_perl (mod_perl enabled Apache).
The advantages:
The heavy mod_perl processes serve only dynamic requests, which allows the deployment of fewer of these large servers.
MaxClients
, MaxRequestsPerChild
and related parameters can now
be optimally tuned for both httpd_docs
and httpd_perl
servers,
something we could not do before. This allows us to fine tune the
memory usage and get better server performance.
Now we can run many lightweight httpd_docs
servers and just a few
heavy httpd_perl
servers.
An important note: When a user browses static pages and the base
URL in the Location window points to the static server, for example
http://www.example.com/index.html
-- all relative URLs
(e.g. <A HREF="/main/download.html">
) are being served by
the light plain Apache server. But this is not the case with
dynamically generated pages. For example when the base URL in the
Location window points to the dynamic server --
(e.g. http://www.example.com:8080/perl/index.pl
) all relative URLs
in the dynamically generated HTML will be served by the heavy mod_perl
processes. You must use fully qualified URLs and not relative ones!
http://www.example.com/icons/arrow.gif
is a full URL, while
/icons/arrow.gif
is a relative one. Using <BASE
HREF="http://www.example.com/">
in the generated HTML is another
way to handle this problem. Also, the httpd_perl
server could
rewrite the requests back to httpd_docs
(much slower) and you still
need the attention of the heavy servers. This is not an issue if you
hide the internal port implementations, so the client sees only one
server running on port 80
. (See Publishing Port Numbers other than 80)
The disadvantages:
An administration overhead.
The need for two configuration files.
The need for two sets of controlling scripts (startup/shutdown) and watchdogs.
If you are processing log files, now you probably will have to merge the two separate log files into one before processing them.
Just as in the one server approach, we still have the problem of a mod_perl process spending its precious time serving slow clients, when the processing portion of the request was completed a long time ago. Deploying a proxy solves this, and will be covered in the next section.
As with the single server approach, this is not a major disadvantage if you are on a fast network (i.e. Intranet). It is likely that you do not want a buffering server in this case.
Before you go on with this solution you really want to look at the Adding a Proxy Server in http Accelerator Mode section.
For implementation notes see the "One Plain and One mod_perl enabled Apache Servers" section in implementations chapter.