Method for Incremental and Scheduled Statistics for Analog
Especially suited to massive virtual hosting environments.
Also covers customizing the results with Report Magic.
Copyright (c) 2000 by Jaume Teixi.
You are free to distribute this software under the terms of the GNU General Public License.
Please consider this method a work in progress; it has not yet been tested on systems other than:
-Debian GNU/Linux 2.2 with: Analog 4.01, Report Magic 1.41, Apache 1.3.12
-Unix like system with Apache
-Analog and Report Magic
-Perl-5 or a perl interpreter
We run a virtual hosting environment with many domains that generate a lot of web traffic. Apache serves that traffic and records it in a separate log file for each domain.
We keep a separate Analog configuration file for each domain. Since one of Analog's main strengths is customization (the other is speed :), each domain has its own analysis options to suit the customer's needs, such as focusing on one part of the website or omitting the customer's own accesses to the domain.
Because we want the reports to look good, we have also set up a Report Magic configuration file for each domain.
Of course, we want to roll over the log files once they have been analyzed.
And we want the analysis of web traffic to run by itself, without taking up our daily work time ;-)
So this method consists of the following. A cron task runs on the last day of each month and:
-runs Analog on the log files of each domain, adding the analysis to the previously cached Analog reports
-moves the analyzed log files to an archive folder
-restarts Apache so that it starts up with new log files
-runs Report Magic to apply some cosmetics to the reports
-notifies the webmaster and the customer of the domain that the new report has been generated
We are running Apache with a lot of virtual domains.
Each customer domain logs to a file such as /var/www/logs/customerdomain1.com.log, and all logs are rotated each week.
Our own company domains log to files such as /var/www/logs/ourcompanydomains/companydomain1.com.log
We have also set up Apache with an alias rule for each virtual host that says:
Alias /stats /var/www/reports/customerdomain1.com
See the Apache documentation on how to handle these things.
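Because each customer log is named after its domain, the list of domains to process can be derived from the log directory itself. The following is only a sketch under that assumption; the function name list_domains and the LOG_DIR variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print one customer domain per line, derived from
# the log file names. LOG_DIR defaults to the path used in this document.
LOG_DIR=${LOG_DIR:-/var/www/logs}

list_domains() {
    for log in "$LOG_DIR"/*.log; do
        # Skip the unexpanded pattern when no log files exist.
        [ -f "$log" ] || continue
        # /var/www/logs/customerdomain1.com.log -> customerdomain1.com
        basename "$log" .log
    done
}
```

Only files directly inside the log directory are matched, so the ourcompanydomains subdirectory is left out and can be handled separately.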
5. SCHEDULING CRON
Just edit your /etc/crontab and add the following:
15 4 28 * * root /etc/rmagic/missa > /var/log/missa.log 2>&1 &
15 4 29 * * root /etc/rmagic/missa > /var/log/missa.log 2>&1 &
15 4 30 * * root /etc/rmagic/missa > /var/log/missa.log 2>&1 &
15 4 31 * * root /etc/rmagic/missa > /var/log/missa.log 2>&1 &
This simply runs missa on days 28 to 31 at 4:15 am and logs the results to /var/log/missa.log
6. MISSA ORGANIZATION
Create the /etc/rmagic/missa file:
This is the main file. It first checks whether today is the last day of the month; if it is, it starts our automated Analog and Report Magic runs, keeps the processed log data in an Analog cache file, and moves the processed logs away so that they are not processed again.
It lists which missa files to process:
/etc/rmagic/missa_clients processes our clients' domains
/etc/rmagic/missa_ours processes our own domains (probably on a different path or machine)
/etc/rmagic/missa_total runs over the Analog caches in order to get global statistics
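The main file is not reproduced here, so the following is only a sketch of how it could look. It assumes GNU date (for the -d option); the function names and the MISSA_DIR variable are made up for illustration. The last-day test works by asking whether the following day is the 1st.

```shell
#!/bin/sh
# Hypothetical sketch of /etc/rmagic/missa; not the original script.

# Directory holding the missa part files (path taken from the text).
MISSA_DIR=${MISSA_DIR:-/etc/rmagic}

# Succeeds when the given day (YYYY-MM-DD, default: today) is the last
# day of its month, i.e. when the following day is the 1st.
# Assumes GNU date, which understands -d "DATE +1 day".
is_last_day_of_month() {
    day=${1:-$(date +%Y-%m-%d)}
    [ "$(date -d "$day +1 day" +%d)" = "01" ]
}

# Run each part in order; a missing part is reported and skipped.
run_missa_parts() {
    for part in missa_clients missa_ours missa_total; do
        if [ -r "$MISSA_DIR/$part" ]; then
            sh "$MISSA_DIR/$part"
        else
            echo "missa: $MISSA_DIR/$part not found, skipping" >&2
        fi
    done
}

# cron starts this script on days 28 to 31; only the real last day
# of the month gets past this check.
if is_last_day_of_month; then
    run_missa_parts
fi
```

With this check in place, the four crontab entries can stay as they are: on the other days the script exits without doing anything.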
7. ANALOG SETUP
We have set up a separate Analog configuration file for each virtual domain: analog_customerdomain1.com, analog_customerdomain2.com, etc., where we can specify a specific report for each customer.
Important Analog customization:
We run Analog taking the previous statistics from customerdomain1.com.cache, and we process the logs matching customerdomain1.com.log*, which covers the current log customerdomain1.com.log as well as rotated logs such as customerdomain1.com.log.10.gz
Analog writes its output in computer-readable format to customerdomain1.com.dat and caches this information in customerdomain1.com.cache.new
Read the Analog docs in order to get more info on it.
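The configuration lines themselves are not shown above, so here is a sketch of what they might look like, wrapped in a small shell function that prints them for one domain. It assumes the Analog commands LOGFILE, UNCOMPRESS, CACHEFILE, CACHEOUTFILE, OUTPUT and OUTFILE; please check them against the documentation of your Analog version. The function name print_analog_config is made up, and the directories are the ones used elsewhere in this document.

```shell
#!/bin/sh
# Hypothetical helper: print an Analog configuration fragment for one
# domain. This is an illustration, not the author's original file.
print_analog_config() {
    domain=$1
    cat <<EOF
# Logs to analyze: the current log plus any rotated copies
LOGFILE /var/www/logs/$domain.log*
# Read rotated, gzipped logs through gzip
UNCOMPRESS *.gz "gzip -cd"
# Statistics cached by the previous run
CACHEFILE /var/www/reports/cache/$domain.cache
# Updated cache written by this run
CACHEOUTFILE /var/www/reports/cache/$domain.cache.new
# Computer-readable output for Report Magic
OUTPUT COMPUTER
OUTFILE /var/www/reports/output/$domain.dat
EOF
}
```

For example, print_analog_config customerdomain1.com > /etc/rmagic/analog_customerdomain1.com would create a starting point to which the customer's own options can be added. On the very first run there is no customerdomain1.com.cache yet, so check how your Analog version reacts to a missing CACHEFILE.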
8. REPORT MAGIC SETUP
We also have a Report Magic config file for each virtual domain: rmagic_customerdomain1.com, rmagic_customerdomain2.com, and so on.
Important part for the Report Magic - Missa customization:
File_In = /var/www/reports/output/customerdomain1.com.dat
File_Out = /var/www/reports/customerdomain1.com/
So Report Magic reads the Analog report from customerdomain1.com.dat, applies its HTML cosmetics and writes everything to /var/www/reports/customerdomain1.com/
The Report Magic documentation will help you handle these things.
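Since Report Magic depends on the file that Analog has just written, a small check before running it can save a confusing error message. This is only a sketch: the function name rmagic_ready and the REPORTS_DIR variable are made up, and the paths follow the File_In and File_Out values shown above.

```shell
#!/bin/sh
# Hypothetical pre-flight check, run before Report Magic for one domain.
# REPORTS_DIR defaults to the directory used in this document.
REPORTS_DIR=${REPORTS_DIR:-/var/www/reports}

rmagic_ready() {
    domain=$1
    file_in=$REPORTS_DIR/output/$domain.dat
    dir_out=$REPORTS_DIR/$domain

    # Analog must have produced a non-empty computer-readable report.
    if [ ! -s "$file_in" ]; then
        echo "rmagic_ready: $file_in is missing or empty" >&2
        return 1
    fi
    # Create the output directory if it does not exist yet.
    mkdir -p "$dir_out"
}
```

The function returns non-zero when the Analog output is missing, so it can guard the Report Magic call for that domain.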
9. THE MISSA PROCESS
You need to create /etc/rmagic/missa_clients. This file contains 6 lines for each virtual host:
a: run analog for this virtual host with its own customized report:
analog +G +g/etc/rmagic/analog_customerdomain1.com
b: move the processed logs elsewhere:
mv /var/www/logs/customerdomain1.com.log* /var/oldlgs/
c: gracefully restart apache so that it starts up with clean log files
d: rename *.cache.new to just *.cache, because it holds the historic reports for next month:
mv /var/www/reports/cache/customerdomain1.com.cache.new /var/www/reports/cache/customerdomain1.com.cache
e: notify the webmaster (I guess if you're reading this: you) and (your) customer through missa_clients_email:
perl -s /etc/rmagic/missa_clients_email -Email="firstname.lastname@example.org" -Webmaster="email@example.com" -Servername="customerdomain2.com"
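With many virtual hosts it is easier to generate these lines than to type them. The following sketch prints the six lines for one domain. The commands for steps a, b, d and e are the ones shown above. The Apache restart command and the Report Magic call are not given in this text, so "apachectl graceful" and "rmagic.pl" are assumptions to adjust for your system. The Report Magic line is placed after the cache rename, following the order described at the start of this document. The function name print_missa_lines is made up.

```shell
#!/bin/sh
# Hypothetical generator for the per-domain lines of missa_clients.
# APACHE_RESTART and RMAGIC are assumptions: the text does not give the
# exact restart or Report Magic commands.
APACHE_RESTART=${APACHE_RESTART:-"apachectl graceful"}
RMAGIC=${RMAGIC:-"rmagic.pl"}

# Usage: print_missa_lines DOMAIN CUSTOMER_EMAIL WEBMASTER_EMAIL
print_missa_lines() {
    domain=$1
    customer=$2
    webmaster=$3
    # Order: analog, archive logs, restart, rename cache, rmagic, notify.
    cat <<EOF
analog +G +g/etc/rmagic/analog_$domain
mv /var/www/logs/$domain.log* /var/oldlgs/
$APACHE_RESTART
mv /var/www/reports/cache/$domain.cache.new /var/www/reports/cache/$domain.cache
$RMAGIC /etc/rmagic/rmagic_$domain
perl -s /etc/rmagic/missa_clients_email -Email="$customer" -Webmaster="$webmaster" -Servername="$domain"
EOF
}
```

Calling the function once per domain and appending its output to /etc/rmagic/missa_clients builds the whole file.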
Create /etc/rmagic/missa_ours: it is the same as missa_clients but with the specific parts for our company domains.
Create /etc/rmagic/missa_total. This file just runs analog_global and rmagic_global, which process all the cached Analog reports for all customers' virtual hosts. It then runs analog_total and rmagic_total, which process the cached Analog reports from customers plus our company's cached reports. And of course it notifies us by email about this.
Important parts from analog_global:
File_In = /var/www/reports/output/global.dat
File_Out = /var/www/reports/global/
Important parts from analog_total:
File_In = /var/www/reports/output/total.dat
File_Out = /var/www/reports/total/
10. FINAL CONSIDERATIONS:
Since missa_total runs analog_global and analog_total, the total report contains all requests to our customers' domains and to our own domains; that is, analog_global (all customer domains) plus missa_ours (which runs Analog over our own domains). This produces some wrong figures, for example "Number of Hosts": Analog cannot tell whether a host has requested both a customer domain and one of our company domains, so it counts 2 hosts when there is really 1. For bytes and requests, however, we get a good global summary.
As stated above, this is a work in progress, and you will probably find some easy improvements to it, so please send them to me.
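The "Number of Hosts" problem mentioned above is easy to see with a toy example. The host addresses below are made up for illustration. A host that visits both a customer domain and a company domain is counted once in each report, so adding the two counts gives a larger number than the real number of distinct hosts.

```shell
#!/bin/sh
# Made-up host lists for illustration only.
customer_hosts='10.0.0.1
10.0.0.2
10.0.0.3'
company_hosts='10.0.0.3
10.0.0.4'

# Count distinct lines on standard input.
count_unique() { sort -u | wc -l | tr -d ' '; }

customers=$(printf '%s\n' "$customer_hosts" | count_unique)
company=$(printf '%s\n' "$company_hosts" | count_unique)

# Adding the per-report counts double-counts 10.0.0.3.
summed=$((customers + company))
# Counting over the merged list gives the true figure.
actual=$(printf '%s\n%s\n' "$customer_hosts" "$company_hosts" | count_unique)

echo "sum of per-report host counts: $summed"   # 5
echo "distinct hosts overall:        $actual"   # 4
```

Bytes and requests do not have this problem because they are plain sums.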