MOD_PERL_TUNING(1)  User Contributed Perl Documentation  MOD_PERL_TUNING(1)

NAME
       mod_perl_tuning - mod_perl performance tuning

DESCRIPTION
       Described here are examples and hints on how to configure
       a mod_perl enabled Apache server, concentrating on tips
       for configuration for high-speed performance.  The primary
       way to achieve maximal performance is to reduce the
       resources consumed by the mod_perl enabled HTTPD
       processes.

       This document assumes familiarity with Apache
       configuration directives, some familiarity with the
       mod_perl configuration directives, and that you have
       already built and installed a mod_perl enabled Apache
       server.  Please also read the mod_perl documentation that
       comes with mod_perl for programming tips.  Some
       configurations below use features from mod_perl version
       1.03 which were not present in earlier versions.

       These performance tuning hints are collected from my
       experiences in setting up and running servers for handling
       large promotional sites, such as The Weather Channel's
       "Blimp Site-ings" game, the MSIE 4.0 "Subscribe to Win"
       game, and the MSN Million Dollar Madness game.

BASIC CONFIGURATION
       The basic configuration for mod_perl is as follows.  In
       the httpd.conf file, I add configuration parameters to
       make the "http://www.domain.com/programs" URL be the base
       location for all mod_perl programs.  Thus, access to
       "http://www.domain.com/programs/printenv" will run the
       printenv script, as we'll see below.  Also, any *.perl
       file will be interpreted as a mod_perl program just as if
       it were in the programs directory, and any *.rperl file
       will be too, but without HTTP headers being sent
       automatically; your program must send them explicitly.
       If you don't want these last two, just leave them out of
       your configuration.

       In the configuration files, I use /var/www as the
       "ServerRoot" directory, and /var/www/docs as the
       "DocumentRoot".  You will need to change these to match
       your particular setup.  The network address below in the
       access to perl-status should also be changed to match
       yours.

       Additions to httpd.conf:

	# put mod_perl programs here
	# startup.perl loads all functions that we want to use within mod_perl
	PerlRequire /var/www/perllib/startup.perl
	<Directory /var/www/docs/programs>
	  AllowOverride None
	  Options ExecCGI
	  SetHandler perl-script
	  PerlHandler Apache::Registry
	  PerlSendHeader On
	</Directory>

	# like above, but no PerlSendHeader
	<Directory /var/www/docs/rprograms>
	  AllowOverride None
	  Options ExecCGI
	  SetHandler perl-script
	  PerlHandler Apache::Registry
	  PerlSendHeader Off
	</Directory>

	# allow arbitrary *.perl files to be scattered throughout the site.
	<Files *.perl>
	  SetHandler perl-script
	  PerlHandler Apache::Registry
	  PerlSendHeader On
	  Options +ExecCGI
	</Files>

	# like *.perl, but do not send HTTP headers
	<Files *.rperl>
	  SetHandler perl-script
	  PerlHandler Apache::Registry
	  PerlSendHeader Off
	  Options +ExecCGI
	</Files>

	<Location /perl-status>
	  SetHandler perl-script
	  PerlHandler Apache::Status
	  order deny,allow
	  deny from all
	  allow from 204.117.82.
	</Location>

       Now, you'll notice that I use a "PerlRequire" directive to
       load in the file startup.perl.  In that file, I include
       all of the "use" statements that occur in any of my
       mod_perl programs (either from the programs directory, or
       the *.perl files).  Here is an example:

	#! /usr/local/bin/perl
	use strict;

	# load up necessary perl function modules to be able to call from Perl-SSI
	# files.  These objects are reloaded upon server restart (SIGHUP or SIGUSR1)
	# if PerlFreshRestart is "On" in httpd.conf (as of mod_perl 1.03).

	# only library-type routines should go in this directory.

	use lib "/var/www/perllib";

	# make sure we are in a sane environment.
	$ENV{GATEWAY_INTERFACE} =~ /^CGI-Perl/ or die "GATEWAY_INTERFACE not Perl!";

	use Apache::Registry ();       # for things in the "/programs" URL

	# pull in things we will use in most requests so it is read and compiled
	# exactly once
	use CGI (); CGI->compile(':all');
	use CGI::Carp ();
	use DBI ();
	use DBD::mysql ();

	1;

       What this does is pull in all of the code used by the
       programs (but does not "import" any of the module methods)
       into the main HTTPD process, which then creates the child
       processes with the code already in place.  You can also
       put any new modules you like into the /var/www/perllib
       directory and simply "use" them in your programs.  There
       is no need to put "use lib "/var/www/perllib";" in all of
       your programs.  You do, however, still need to "use" the
       modules in your programs.  Perl is smart enough to know it
       doesn't need to recompile the code, but it does need to
       "import" the module methods into your program's
       namespace.

       If you only have a few modules to load, you can use the
       PerlModule directive to pre-load them with the same
       effect.
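
       As a rough sketch, the httpd.conf equivalent of pre-loading
       the four modules from the sample startup.perl might look
       like this (substitute the modules you actually use):

	# pre-load modules in the parent HTTPD without a startup file
	PerlModule CGI
	PerlModule CGI::Carp
	PerlModule DBI
	PerlModule DBD::mysql

       Note that this does not run the "CGI->compile(':all')" call
       from the sample startup.perl; that step still needs a
       startup file.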

       The biggest benefit here is that the child process never
       needs to recompile the code, so it is faster to start, and
       the child process actually shares the same physical copy
       of the code in memory due to the way the virtual memory
       system in modern operating systems works.

       You will want to replace the "use" lines above with
       modules you actually need.

       Simple Test Program

       Here's a sample script called printenv that you can stick
       in the programs directory to test the functionality of the
       configuration.

	#! /usr/local/bin/perl
	use strict;
	# print the environment in a mod_perl program under Apache::Registry

	print "Content-type: text/html\n\n";

	print "<HEAD><TITLE>Apache::Registry Environment</TITLE></HEAD>\n";

	print "<BODY><PRE>\n";
	print map { "$_ = $ENV{$_}\n" } sort keys %ENV;
	print "</PRE></BODY>\n";

       When you run this, check the value of the
       GATEWAY_INTERFACE variable to see that you are indeed
       running mod_perl; as the check in startup.perl implies,
       it should begin with "CGI-Perl".

REDUCING MEMORY USE
       As a side effect of using mod_perl, your HTTPD processes
       will be larger than without it.  There is just no way
       around it, as you have this extra code to support your
       added functionality.

       On a very busy site, the number of HTTPD processes can
       grow to be quite large.  For example, on one large site,
       the typical HTTPD was about 5 MB in size.  With 30 of
       these, all of RAM was exhausted, and we started to go to
       swap.  With 60 of these, swapping turned into thrashing,
       and the whole machine slowed to a crawl.

       To reduce thrashing, you must limit the maximum number of
       HTTPD processes to a number that is just larger than what
       will fit into RAM (in this case, 45).  The drawback is
       that when the server is serving 45 requests, new requests
       will queue up and wait; however, if you let the maximum
       number of processes grow, the new requests will start to
       get served right away, but they will take much longer to
       complete.
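
       In httpd.conf, this cap is set with Apache's "MaxClients"
       directive.  A minimal sketch using the figure from the
       example above (your own limit depends on your process size
       and available RAM):

	# never run more than 45 HTTPD child processes at once
	MaxClients 45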

       One way to reduce the amount of real memory taken up by
       each process is to pre-load commonly used modules into the
       primary HTTPD process so that the code is shared by all
       processes.  This is accomplished by inserting the "use Foo
       ();" lines into the startup.perl file for any "use Foo;"
       statement in any commonly used Registry program.  The idea
       is that the operating system's VM subsystem will share the
       data across the processes.

       You can also pre-load Apache::Registry programs using the
       "Apache::RegistryLoader" module so that the code for these
       programs is shared by all HTTPD processes as well.
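
       A minimal sketch of such a pre-load, placed in startup.perl
       before the final "1;" and assuming the printenv test script
       and the paths from the sample configuration above:

	use Apache::RegistryLoader ();

	# compile the script in the parent HTTPD so every child shares it;
	# the arguments are the request URI and the file it maps to
	my $loader = Apache::RegistryLoader->new;
	$loader->handler("/programs/printenv",
	                 "/var/www/docs/programs/printenv");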

       NOTE: When you pre-load modules in the startup script, you
       may need to kill and restart HTTPD for changes to take
       effect.  A simple "kill -HUP" or "kill -USR1" will not
       reload that code unless you have set the
       "PerlFreshRestart" configuration parameter in httpd.conf
       to be "On".
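
       For example, to have restarts reload the pre-loaded code,
       the httpd.conf line would be:

	# re-read startup.perl and pre-loaded modules on HUP/USR1
	PerlFreshRestart On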

REDUCING THE NUMBER OF LARGE PROCESSES
       Unfortunately, simply reducing the size of each HTTPD
       process is not enough on a very busy site.  You also need
       to reduce the quantity of these processes.  This reduces
       memory consumption even more, and results in fewer
       processes fighting for the attention of the CPU.  If you
       can reduce the quantity of processes to fit into RAM, your
       response time improves even more.

       The idea of the techniques outlined below is to offload
       the normal document delivery (such as static HTML and GIF
       files) from the mod_perl HTTPD, and let it only handle the
       mod_perl requests.  This way, your large mod_perl HTTPD
       processes are not tied up delivering simple content when a
       smaller process could perform the same job more
       efficiently.

       In the techniques below where there are two HTTPD
       configurations, the same httpd executable can be used for
       both configurations; there is no need to build HTTPD both
       with and without mod_perl compiled into it.  With Apache
       1.3 this can be done with the DSO configuration: just
       configure one httpd invocation to dynamically load
       mod_perl and the other not to do so.

       These approaches work best when most of the requests are
       for static content rather than mod_perl programs.  Log
       file analysis becomes a bit of a challenge when you have
       multiple servers running on the same host, since you must
       log to different files.

       TWO MACHINES

       The simplest way is to put all static content on one
       machine, and all mod_perl programs on another.  The only
       trick is to make sure all links are properly coded to
       refer to the proper host.  The static content will be
       served up by lots of small HTTPD processes (configured not
       to use mod_perl), and the relatively few mod_perl requests
       can be handled by the smaller number of large HTTPD
       processes on the other machine.

       The drawback is that you must maintain two machines, and
       this can get expensive.  For extremely large projects,
       this is the best way to go.

       TWO IP ADDRESSES

       Similar to above, but one HTTPD runs bound to one IP
       address, while the other runs bound to another IP address.
       The only difference is that one machine runs both servers.
       Total memory usage is reduced because the majority of
       files are served by the smaller HTTPD processes, so there
       are fewer large mod_perl HTTPD processes sitting around.

       This is accomplished using the httpd.conf directive
       "BindAddress" to make each HTTPD respond only to one IP
       address on this host.  One will have mod_perl enabled, and
       the other will not.
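
       A sketch of the relevant lines, one per configuration file.
       The file names and the addresses (taken from the
       documentation range 192.0.2.0/24) are placeholders, not
       part of the original setup:

	# httpd-static.conf: small HTTPD, no mod_perl directives
	BindAddress 192.0.2.10

	# httpd-perl.conf: large HTTPD with the mod_perl configuration
	BindAddress 192.0.2.11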

       TWO PORT NUMBERS

       If you cannot get two IP addresses, you can also split the
       HTTPD processes as above by putting one on the standard
       port 80, and the other on some other port, such as 8042.
       The only configuration changes will be the "Port" and log
       file directives in the httpd.conf file (and also one of
       them does not have any mod_perl directives).
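
       A sketch of the differing lines, assuming the hypothetical
       file names and log paths shown here.  Two servers started
       from the same ServerRoot also need separate "PidFile"
       settings so they do not overwrite each other's process ID
       file:

	# httpd-static.conf: small HTTPD on the standard port
	Port 80
	PidFile logs/httpd-static.pid
	TransferLog logs/static-access_log
	ErrorLog logs/static-error_log

	# httpd-perl.conf: mod_perl HTTPD on the alternate port
	Port 8042
	PidFile logs/httpd-perl.pid
	TransferLog logs/perl-access_log
	ErrorLog logs/perl-error_log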

       The major flaw with this scheme is that some firewalls
       will not allow access to the server running on the
       alternate port, so some people will not be able to access
       all of your pages.

       If you use this approach or the one above with dual IP
       addresses, you probably do not want to have the *.perl and
       *.rperl sections from the sample configuration above, as
       this would require that your primary HTTPD server be
       mod_perl enabled as well.

       Thanks to Gerd Knops for this idea.

       USING ProxyPass WITH TWO SERVERS

       To overcome the limitation of the alternate port above,
       you can use dual Apache HTTPD servers with just a slight
       difference in configuration.  Essentially, you set up two
       servers just as you would with the two-port,
       same-IP-address method above.  However, in your primary
       HTTPD configuration you add a line like this:

	ProxyPass /programs http://localhost:8042/programs

       Here, your mod_perl enabled HTTPD is running on port 8042
       and has only the directory programs within its
       DocumentRoot.  This assumes that you included the
       mod_proxy module in your server when it was built.
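
       On Apache 1.3 versions whose mod_proxy provides the
       "ProxyPassReverse" directive, adding it alongside ProxyPass
       rewrites the Location header of redirects issued by the
       back-end server, so clients are not sent to port 8042
       directly.  A sketch for the primary httpd.conf:

	ProxyPass        /programs http://localhost:8042/programs
	ProxyPassReverse /programs http://localhost:8042/programs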

       Now, when you access
       http://www.domain.com/programs/printenv it will internally
       be passed through to your HTTPD running on port 8042 as
       the URL http://localhost:8042/programs/printenv and the
       result relayed back transparently.  To the client, it all
       seems as if it is just one server running.  This can also
       be used on the dual-host version to hide the second server
       from view if desired.

       Thanks to Bowen Dwelle for this idea.

       SQUID ACCELERATOR

       Another approach to reducing the number of large HTTPD
       processes on one machine is to use an accelerator such as
       Squid (which can be found at http://squid.nlanr.net/Squid/
       on the web) between the clients and your large mod_perl
       HTTPD processes.  The idea here is that Squid will handle
       the static objects from its cache while the HTTPD
       processes will handle mostly just the mod_perl requests
       once the cache is primed.  This reduces the number of
       HTTPD processes and thus reduces the amount of memory
       used.

       To set this up, just install the current version of Squid
       (at this writing, this is version 1.1.22) and use the
       RunAccel script to start it.  You will need to reconfigure
       your HTTPD to use an alternate port, such as 8042, rather
       than its default port 80.  To do this, you can either
       change the httpd.conf line "Port" or add a "Listen"
       directive to match the port specified in the squid.conf
       file.  Your URLs do not need to change.  The benefit of
       using the "Listen" directive is that redirected URLs will
       still use the default port 80.  Redirects to your
       alternate port would reveal your real server location to
       the outside world and let clients bypass the accelerator.
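
       A sketch of the "Listen" variant for httpd.conf, using the
       example port 8042:

	# port used when the server builds URLs that refer to itself
	Port 80
	# port the server actually accepts connections on
	Listen 8042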

       In the squid.conf file, you will probably want to add
       "programs" and "perl" to the "cache_stoplist" parameter so
       that these are always passed through to the HTTPD server
       under the assumption that they always produce different
       results.
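
       For example, the squid.conf entry might read as follows.
       This is a sketch only: keep whatever entries your
       squid.conf already lists, and check the comments in your
       version's squid.conf for the exact syntax:

	cache_stoplist programs perl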

       This is very similar to the two-port ProxyPass version
       above, but the Squid cache may be more flexible to
       fine-tune for dynamic documents that do not change on
       every view.  The Squid proxy server also seems to be more
       stable and robust than the Apache 1.2.4 proxy module.

       One drawback to using this accelerator is that the log
       files will always report access from IP address 127.0.0.1,
       which is the local host loopback address.  Also, any
       access permissions or other user tracking that requires
       the remote IP address will always see the local address.
       The following code uses a feature of recent mod_perl
       versions (tested with mod_perl 1.16 and Apache 1.3.3) to
       trick Apache into logging the real client address and
       giving that information to mod_perl programs for their
       purposes.

       First, in your startup.perl file add the following code:

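	use Apache::Constants qw(OK);

	# replace the connection's remote address (the accelerator's) with
	# the right-most address listed in the X-Forwarded-For header
	sub My::SquidRemoteAddr ($) {
	  my $r = shift;

	  if (my ($ip) = $r->header_in('X-Forwarded-For') =~ /([^,\s]+)$/) {
	    $r->connection->remote_ip($ip);
	  }

	  return OK;
	}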

       Next, add this to your httpd.conf file:

	PerlPostReadRequestHandler My::SquidRemoteAddr

       This will cause every request to have its "remote_ip"
       address overridden by the value set in the
       "X-Forwarded-For" header added by Squid.  Note that if you
       have multiple proxies between the client and the server,
       you want the IP address of the last machine before your
       accelerator.  This will be the right-most address in the
       X-Forwarded-For header (assuming the other proxies append
       their addresses to this same header, like Squid does).

       If you use Apache with mod_proxy at your frontend, you can
       use Ask Bjørn Hansen's mod_proxy_add_forward module from
       ftp://ftp.netcetera.dk/pub/apache/ to make it insert the
       "X-Forwarded-For" header.

SUMMARY
       To gain maximal performance of mod_perl on a busy site,
       one must reduce the amount of resources used by the HTTPD
       to fit within what the machine has available.  The best
       way to do this is to reduce memory usage.  If your
       mod_perl requests are fewer than your static page
       requests, then splitting the servers into mod_perl and
       non-mod_perl versions further allows you to tune the
       amount of resources used by each type of request.  Using
       the "ProxyPass" directive allows these multiple servers to
       appear as one to the users.  Using the Squid accelerator
       also achieves this effect, but Squid takes care of
       deciding when to access the large server automatically.

       If all of your requests require processing by mod_perl,
       then the only thing you can really do is throw a lot of
       memory at your machine and try to tweak the Perl code to
       be as small and lean as possible, and to share the virtual
       memory pages by pre-loading the code.

AUTHOR
       This document is written by Vivek Khera.  If you need to
       contact me, just send email to the mod_perl mailing list.

       This document is copyright (c) 1997-1998 by Vivek Khera.

       If you have contributions for this document, please post
       them to the mailing list.  Perl POD format is best, but
       plain text will do, too.

       If you need assistance, contact the mod_perl mailing list
       at modperl@perl.apache.org first (send 'subscribe' to
       modperl-request@apache.org to subscribe).  There are lots
       of people there that can help.  Also, check the web pages
       http://perl.apache.org/ and http://www.apache.org/ for
       explanations of the configuration options.

       $Revision: 1.14 $ $Date: 2002/03/25 02:57:59 $

2002-03-25		  mod_perl-1.27	       MOD_PERL_TUNING(1)