I learned about Analog through a friend whose web site is hosted at Cornerhost. I loved the taglines after "You'll find three things here:"
Cornerhost uses a GNU GPL-licensed utility called "Analog", a well-designed and very customizable tool which reads log files in Apache's standard format and turns the data into an HTML report. Point your browser at the resulting page for an easy, clear way to analyze your web server's access logs.
You could set up a cron job that runs Analog nightly to keep the HTML page up to date. To edit the crontab, use the following command:
poseidon~ user$ sudo crontab -e
This will edit the root crontab file. If you haven't set up any other cron jobs for the root user, you will see an empty file.
Add a line to the bottom which reads:
0 2 * * * /usr/local/bin/analog -g/usr/local/bin/analog.cfg
This will run the analog utility at 2:00 AM every day. The five fields before the command are minute, hour, day of month, month, and day of week. Replace the path in the above line with the location where you installed analog. For more information about crontabs, try
man crontab
In the crontab entry above, the "-g" option tells analog where to find its configuration file. There are tons of configurable options which you can set in that file, but you can figure those out on your own.
For my use, and since I've been learning Perl and PHP, I decided to write a CGI script which both calls Analog and redirects the browser to the newly generated HTML page. I wanted to do this because it would be my first attempt at making a call to the Unix (Oops - Mac OS X) shell from within a CGI script. I learned a few things during the process.
Most importantly, I learned about "tainted" data and what that "-T" option _really_ means on the Perl shebang line:
#!/usr/bin/perl -wT
Here is an explanation of the -w and -T options from one of the many excellent resources at O'Reilly:
The next safety net is the -w option. This turns on warnings that Perl
will then give you if it finds any of a number of problems in your code.
Each of these warnings is a potential bug in your program and should be
investigated. In modern versions of Perl (since 5.6.0) the -w option has
been replaced by the use warnings pragma, which is more flexible than the
command-line option so you shouldn't use -w in new code.

The final safety net is the -T option. This option puts Perl into "taint
mode." In this mode, Perl inherently distrusts any data that it receives
from outside the program's source -- for example, data passed in on the
command line, read from a file, or taken from CGI parameters.

Tainted data cannot be used in an expression that interacts with the
outside world -- for example, you can't use it in a call to system or as
the name of a file to open. The full list of restrictions is given in the
perlsec manual page.

In order to use this data in any of these potentially dangerous
operations you need to untaint it. You do this by checking it against a
regular expression. A detailed discussion of taint mode would fill an
article all by itself so I won't go into any more details here, but using
taint mode is a very good habit to get into -- particularly if you are
writing programs (like CGI programs) that take unknown input from users.

Actually there's one other option that belongs in this set and that's -d.
This option puts you into the Perl debugger. This is also a subject
that's too big for this article, but I recommend you look at "perldoc
perldebug" or Richard Foley's Perl Debugger Pocket Reference.
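To make the untainting step from the excerpt concrete, here is a minimal sketch of checking a value against a regular expression. The helper name untaint_report_name and the pattern it allows are assumptions made up for illustration; they are not part of the script below or of Analog. Run it with "perl -T" to see taint mode in effect; the function behaves the same way without it.

```perl
use strict;
use warnings;

# Hypothetical helper: accept only a short name made of letters, digits,
# underscores, and hyphens. Returns the captured (untainted) value, or
# nothing if the input does not match.
sub untaint_report_name {
    my ($input) = @_;
    return unless defined $input;
    # Under -T, text captured by a successful match ($1) is untainted.
    if ($input =~ /\A([A-Za-z0-9_-]{1,32})\z/) {
        return $1;
    }
    return;
}

# Command-line arguments are tainted under -T, just like CGI parameters.
my $raw  = @ARGV ? $ARGV[0] : 'index';
my $name = untaint_report_name($raw);
print defined($name) ? "ok: $name\n" : "rejected\n";
```

The pattern is deliberately narrow: it lists what is allowed rather than trying to strip out what is dangerous, so anything like "../etc/passwd" is simply rejected.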
Anyway, here is the Perl script I came up with:
#!/usr/bin/perl -wT
# analog.pl
# perl script which calls analog to generate html access report
# 10-05-06

use strict;
use CGI;
# include debugging perl module
use CGI::Carp qw(warningsToBrowser fatalsToBrowser);

# setup variables
my $cgi = CGI->new;

# The inherited PATH is considered "tainted" data, and as such is insecure.
# Set it explicitly and remove other risky environment variables.
$ENV{'PATH'} = '/bin:/usr/bin';
delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};
my $path = $ENV{'PATH'};    # $path now NOT tainted

my @analog   = ("/usr/local/bin/analog", "-g/usr/local/bin/analog.cfg");
my $serverIP = '192.168.1.2';         # Note this will only work from the LAN
my $report   = '/analog/index.html';  # www must have write access

# Call the analog utility which generates the http access report.
# analog needs to be set for execute permissions (755 or a+x)
# Two perl routines are available for running an external program.
# exec() replaces the running perl script with the program and never returns.
# system() runs the program and returns flow back
# to the perl script after the program has finished.
# Called with a list, as here, neither one goes through the shell.
# Since this will run with setuid as www, be sure both the analog
# report file and enclosing folder are a+w enabled.
system @analog;

# Write the redirect to load the report
#
print $cgi->redirect( "http://$serverIP$report" );

# The first way I did this was writing a complete HTML page with page
# title, etc and a message to the user indicating the redirection.
# I changed it to the above which seems much simpler. I've included
# the alternate method below for reference:
#print $cgi->header;
#print $cgi->start_html(-title=>"HTTP Access Log", -author=>"Charles Estabrooks", -meta=>"refresh" );
#print '
#LoadingOrpheus Analog Report. Please wait a moment...
#';
#print $cgi->end_html;
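The script above calls system and ignores what comes back. As a rough sketch of how the result could be checked, the example below wraps system in a hypothetical helper named run_command (my own name, not something from CGI.pm or Analog) and decodes $? the way the perlfunc entry for system describes. It uses the running Perl interpreter ($^X) as a stand-in for analog so it can be tried anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command given as a list (so no shell is
# involved) and translate $? into a single number.
#   -1        the program could not be started
#   128 + N   the program was killed by signal N (shell-style convention)
#   0..255    the program's own exit code
sub run_command {
    my @cmd = @_;
    system(@cmd);
    return -1 if $? == -1;
    my $signal = $? & 127;
    return 128 + $signal if $signal;
    return $? >> 8;
}

# Demonstration with the running Perl interpreter standing in for analog.
my $code = run_command($^X, '-e', 'exit 3');
print "child exited with code $code\n";
```

In a CGI script like the one above, a nonzero result could be turned into a die message (which CGI::Carp would show in the browser) instead of redirecting to a report that was never regenerated.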