_____________________________

MISSA  ver. 0.1
Method for Incremental and Scheduled Statistics for Analog

Especially suited to massive virtual hosting environments.
Also covers customizing the results with Report Magic.
_____________________________





Original post, Dec 1st 2000: http://www.mail-archive.com/analog-help@lists.isite.net/msg07116.html

NEWS Fri Sep 7 11:06:50 CEST 2001
New version under development, will be released as a Debian package.
Stay tuned...




0. COPYRIGHT
Copyright (c) 2000 by Jaume Teixi.
You are free to distribute this software under the terms of the GNU General Public License.
 

1. WARNING
Please consider this method a work in progress. It has not yet been tested on any system other than:
 -Debian GNU/Linux 2.2 with: Analog 4.01, Report Magic 1.41, Apache 1.3.12
 

2. REQUIREMENTS
 -Unix-like system with Apache
 -Analog and Report Magic
 -Perl 5 (or a compatible Perl interpreter)
 -Cron
 

3. PURPOSE
We are in a virtual hosting environment with many domains that generate a lot of web traffic. Apache serves that traffic and records it in a separate log file for each domain.
We have a separate Analog configuration file for each domain. Customization is one of Analog's main strengths (the other is speed :), so each domain gets its own analysis options to match the customer's needs: focusing on some part of the website, omitting the customer's own accesses to the domain, etc.
As we want some cosmetics on the reports, we have also set up a Report Magic configuration file for each domain.
Of course we want to roll over the log files once they have been analyzed.

But we now want the analysis of web traffic to run by itself, without wasting our daily work time ;-)

So this method consists of the following:
A cron task is executed on the last day of each month. It will:
 -run Analog on the log files of each domain, adding the analysis to the previously cached Analog reports
 -move the analyzed log files to an archive folder
 -restart Apache so that it starts up with new log files
 -run Report Magic to apply some cosmetics to the reports
 -notify the webmaster and the customer of the domain that the new report has been generated
 

4. ENVIRONMENT
We are running Apache with a lot of virtual domains.
All customer domains write their logs to /var/www/logs/customerdomain1.com.log (one file per domain), and all logs are rotated each week.
Our own company domains write their logs to /var/www/logs/ourcompanydomains/companydomain1.com.log

We have also set up Apache with an alias rule for each virtual host, such as:
 Alias /stats  /var/www/reports/customerdomain1.com

Read the Apache documentation on how to handle these things.
 

5. SCHEDULING CRON
Just edit your /etc/crontab and add the following:

15 4    28 * *  root    /etc/rmagic/missa > /var/log/missa.log 2>&1 &
15 4    29 * *  root    /etc/rmagic/missa > /var/log/missa.log 2>&1 &
15 4    30 * *  root    /etc/rmagic/missa > /var/log/missa.log 2>&1 &
15 4    31 * *  root    /etc/rmagic/missa > /var/log/missa.log 2>&1 &

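This simply runs missa on days 28 to 31 of each month at 4:15 am and logs the results to /var/log/missa.log. The missa script itself checks whether the day really is the last of the month (see section 6).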
 

6. MISSA ORGANIZATION
Create the /etc/rmagic/missa file:
This is the main file. It first checks whether today is the last day of the month; if it is, it starts running our automated Analog and Report Magic files, keeping the processed log data in an Analog cache file and moving the processed logs away so that they are not processed again.

/etc/rmagic/missa lists which missa files to process:
/etc/rmagic/missa_clients processes our clients' domains
/etc/rmagic/missa_ours processes our own domains (probably on a different path or machine)
/etc/rmagic/missa_total processes the Analog cache files in order to get global statistics
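
The last-day-of-month check described above could look like the following. This is only a minimal sketch, not the original missa script: the function name is_last_day_of_month is made up here, and the -d option assumes GNU date (as found on Debian).

```shell
#!/bin/sh
# Hypothetical helper, not part of the original missa script.
# Succeeds (exit status 0) only when the given date is the last day
# of its month. With no argument it checks today.
# Assumption: GNU date, whose -d option understands "+1 day".
is_last_day_of_month() {
    day="${1:-today}"
    # The day after the last day of any month is always the 1st.
    [ "$(date -d "$day +1 day" +%d)" = "01" ]
}

# Possible usage at the top of /etc/rmagic/missa:
#   is_last_day_of_month || exit 0
```

With such a guard, the cron runs on days 28 to 30 end quietly in months that have more days, and only the run on the real last day does any work.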
 

7. ANALOG SETUP
We have set up a separate Analog configuration file for each virtual domain: analog_customerdomain1.com, analog_customerdomain2.com, etc. In each one we can specify the particular reports for that customer.

Important Analog customization for Missa:
 ...
 REFREPEXCLUDE http://www.customerdomain1.com/*
 LOGFILE /var/www/logs/customerdomain1.com.log*
 CACHEFILE /var/www/reports/cache/customerdomain1.com.cache
 OUTPUT  COMPUTER
 OUTFILE /var/www/reports/output/customerdomain1.com.dat
 CACHEOUTFILE /var/www/reports/cache/customerdomain1.com.cache.new
 HOSTNAME "customerdomain1.com"
 HOSTURL http://www.customerdomain1.com
 ...
So Analog will run taking the previous statistics from customerdomain1.com.cache, and will process the logs matching customerdomain1.com.log*, which covers the current log customerdomain1.com.log as well as rotated logs such as customerdomain1.com.log.10.gz
Analog will write its output in computer-readable format to customerdomain1.com.dat and will cache this information in customerdomain1.com.cache.new

Read the Analog docs for more information.
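
With many domains, the MISSA-specific directives above differ only in the domain name, so they could be generated rather than typed by hand. A minimal sketch, assuming a hypothetical helper named missa_analog_fragment; it prints only the directives listed above, and the rest of each configuration file (the parts elided with ...) still has to come from elsewhere:

```shell
#!/bin/sh
# Hypothetical helper: print the MISSA-related Analog directives
# for one domain, using the paths from this document.
missa_analog_fragment() {
    domain="$1"
    cat <<EOF
REFREPEXCLUDE http://www.$domain/*
LOGFILE /var/www/logs/$domain.log*
CACHEFILE /var/www/reports/cache/$domain.cache
OUTPUT COMPUTER
OUTFILE /var/www/reports/output/$domain.dat
CACHEOUTFILE /var/www/reports/cache/$domain.cache.new
HOSTNAME "$domain"
HOSTURL http://www.$domain
EOF
}

# Example: append the fragment to a per-domain file.
#   missa_analog_fragment customerdomain1.com >> /etc/rmagic/analog_customerdomain1.com
```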
 

8. REPORT MAGIC SETUP
We also have a Report Magic config file for each virtual domain: rmagic_customerdomain1.com, rmagic_customerdomain2.com, and so on.

Important part of the Report Magic customization for Missa:
 ...
 [statistics]
 File_In = /var/www/reports/output/customerdomain1.com.dat
 ...
 [reports]
 File_Out = /var/www/reports/customerdomain1.com/
 ...
So Report Magic reads the Analog report from customerdomain1.com.dat, applies its HTML cosmetics and writes everything to /var/www/reports/customerdomain1.com/

The Report Magic documentation will help you handle these things.
 

9. THE MISSA PROCESS
You need to create /etc/rmagic/missa_clients. This file will contain 6 lines for each virtual host:

 a: run analog for this virtual host with its own customized report:
 analog +G +g/etc/rmagic/analog_customerdomain1.com

 b: move the processed logs somewhere else:
 mv /var/www/logs/customerdomain1.com.log* /var/oldlgs/

 c: gracefully restart apache so that it starts again with clean log files:
 apachectl graceful

 d: rename *.cache.new to *.cache, because it will hold the historic reports for next month:
 mv /var/www/reports/cache/customerdomain1.com.cache.new /var/www/reports/cache/customerdomain1.com.cache

 e: notify the webmaster (if you're reading this, I guess that's you) and your customer through the missa_clients_email perl script:
 perl -s /etc/rmagic/missa_clients_email -Email="info@customerdomain1.com" -Webmaster="webmaster@ourhostingserver.com" -Servername="customerdomain1.com"
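
The steps a to e above could also be wrapped in a shell function, so that each virtual host needs one line instead of a copy of every command. This is only a sketch under some assumptions: run and process_domain are made-up names, the DRY_RUN switch is an addition for testing, and it assumes that analog returns a non-zero exit status when it fails. It covers only the five commands listed above; the Report Magic run is left out because its command line is not shown here.

```shell
#!/bin/sh
# Sketch only: steps a to e for one virtual host, wrapped in a function.
# run() and process_domain() are hypothetical names, not from the original.

# With DRY_RUN=1 commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

process_domain() {
    domain="$1"   # e.g. customerdomain1.com
    email="$2"    # customer address to notify
    cache=/var/www/reports/cache/$domain.cache

    # a: analyze; stop here on failure so unanalyzed logs are not moved
    run analog +G +g/etc/rmagic/analog_"$domain" || return 1
    # b: archive the processed logs (the shell expands the glob)
    run mv /var/www/logs/"$domain".log* /var/oldlgs/ || return 1
    # c: let Apache reopen fresh log files
    run apachectl graceful || return 1
    # d: the new cache becomes the history for next month
    run mv "$cache.new" "$cache" || return 1
    # e: notify webmaster and customer
    run perl -s /etc/rmagic/missa_clients_email \
        -Email="$email" \
        -Webmaster="webmaster@ourhostingserver.com" \
        -Servername="$domain"
}

# Example (prints the five commands without running them):
#   DRY_RUN=1 process_domain customerdomain1.com info@customerdomain1.com
```

Returning early when a step fails means a broken Analog run leaves the logs and the old cache in place, so the month can be processed again by hand.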

Create /etc/rmagic/missa_ours: it is the same as missa_clients but with the specific parts for our company domains.

Create /etc/rmagic/missa_total. This file just runs analog_global and rmagic_global, which process the Analog cached reports of all customers' virtual hosts. It then runs analog_total and rmagic_total, which process the customers' Analog cached reports plus our company's cached reports. And of course it notifies us by email about this.

Important parts from analog_global:
 ...
 REFREPEXCLUDE http://www.ourhostingserver.com/*
 LOGFILE /tmp/nothing_logged.log
 CACHEFILE /var/www/reports/cache/*
 OUTPUT  COMPUTER
 OUTFILE /var/www/reports/output/global.dat
 ...
and rmagic_global:
 ...
 [statistics]
 File_In = /var/www/reports/output/global.dat
 ...
 [reports]
 File_Out = /var/www/reports/global/
 ...

Important parts from analog_total:
 ...
 REFREPEXCLUDE http://www.ourhostingserver.com/*
 LOGFILE /tmp/nothing_logged.log
 CACHEFILE /var/www/reports/cache/*
 CACHEFILE /var/www/reports/cache/ourcompanydomains/*
 OUTPUT  COMPUTER
 OUTFILE /var/www/reports/output/total.dat
 ...
and rmagic_total:
 ...
 [statistics]
 File_In = /var/www/reports/output/total.dat
 ...
 [reports]
 File_Out = /var/www/reports/total/
 ...

10. FINAL CONSIDERATIONS:
Since missa_total runs both analog_global and analog_total, the total report will contain all requests for our customers' domains and for our own domains: it is analog_global (all customer domains) plus what missa_ours covers (Analog run over our own domains). But this will produce some wrong figures, for example "Number of Hosts": Analog cannot tell that the same host has requested both a customer domain and one of our company domains, so it counts that host twice when it is really one. For bytes and requests, however, we get a good global summary.
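
The host double counting can be illustrated with made-up data. The sketch below is not how Analog works internally; it only compares adding up two separate distinct-host counts with counting distinct hosts over the combined list. The addresses and the helper count_distinct are invented for the example.

```shell
#!/bin/sh
# Hypothetical data: hosts seen by customer domains and by company domains.
# 10.0.0.2 has visited both.
customer_hosts="10.0.0.1
10.0.0.2"
company_hosts="10.0.0.2
10.0.0.3"

# Count distinct lines read from standard input.
count_distinct() {
    sort -u | grep -c ''
}

c=$(printf '%s\n' "$customer_hosts" | count_distinct)
o=$(printf '%s\n' "$company_hosts" | count_distinct)
summed=$((c + o))
distinct=$(printf '%s\n%s\n' "$customer_hosts" "$company_hosts" | count_distinct)

echo "sum of per-report host counts: $summed"
echo "distinct hosts overall: $distinct"
```

Here the summed figure is 4 while only 3 different hosts exist, because 10.0.0.2 appears in both lists. Bytes and requests do not have this problem, since they can simply be added.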

As stated above, this is a work in progress and you will probably find some easy improvements to it, so please send them to me.
Thanks.
 
 

_____________________________
 
 
 

jaume@teixi.net