POPFile snapshot_stats Utility

The Snapshot Stats utility captures "snapshots" of POPFile's accuracy statistics. Run periodically by the task scheduler (or a cron job), it updates an Excel-compatible CSV file containing your accuracy history.

The data items captured on each run correspond to the columns in the header row of the sample CSV file shown later on this page.

This version has been tested in a Windows environment with POPFile versions 0.19.x, 0.20.x, and 0.21.0. It is not compatible with earlier versions of POPFile. The author believes that the utility is platform independent and will work properly on non-Windows POPFile installs, but has not tested it on those platforms.

POPFile is an automatic email classification tool authored by John Graham-Cumming available from SourceForge.

Instructions for use

  1. Download the version of the script that matches your POPFile version to your POPFile install directory, normally c:\Program Files\Popfile.

  2. Open your task scheduler and add a scheduled task.

You're done. The task scheduler will run the snapshot_stats script at the time(s) you scheduled, and the script will update the snapshot_stats.csv file in your POPFile folder. You can periodically open that file with Excel to view your historical stats and analyze them in various ways.

Sample CSV File

The following is a sample of the CSV file created by snapshot_stats when run against the author's POPFile installation on May 25, 2003. Note that this example shows only one set of statistics for each bucket. In the real world, an entire history of snapshots (taken at some recurring interval, such as daily) would be captured in the file.

BucketName,BucketColor,UnixTimestamp,Timestamp,BucketUniqueWords,BucketWordCount,BucketMailsClassified,BucketFalsePositives,BucketFalseNegatives,GlobalWordCount,GlobalDownloads,GlobalMessages,GlobalErrors,LastResetDate
normal,blue,1053935655,Mon May 26 00:54:15 2003,8647,45189,125,2,0,69939,34152,132,2,Sun May 25 00:45:53 2003
spam,red,1053935655,Mon May 26 00:54:15 2003,6790,24750,7,0,2,69939,34152,132,2,Sun May 25 00:45:53 2003
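
As a rough illustration of how this file could be analyzed outside Excel, the sketch below parses the two sample rows with Python's standard csv module and derives an accuracy percentage per snapshot. The function name and the formula (messages minus errors, divided by messages) are assumptions made for this example; they are not taken from the snapshot_stats script.

```python
import csv
import io

# The two rows from the sample above; in practice, read snapshot_stats.csv instead.
SAMPLE = """\
BucketName,BucketColor,UnixTimestamp,Timestamp,BucketUniqueWords,BucketWordCount,BucketMailsClassified,BucketFalsePositives,BucketFalseNegatives,GlobalWordCount,GlobalDownloads,GlobalMessages,GlobalErrors,LastResetDate
normal,blue,1053935655,Mon May 26 00:54:15 2003,8647,45189,125,2,0,69939,34152,132,2,Sun May 25 00:45:53 2003
spam,red,1053935655,Mon May 26 00:54:15 2003,6790,24750,7,0,2,69939,34152,132,2,Sun May 25 00:45:53 2003
"""


def accuracy_by_snapshot(csv_text):
    """Return {UnixTimestamp: accuracy percentage} from the Global* columns."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        messages = int(row["GlobalMessages"])
        errors = int(row["GlobalErrors"])
        if messages == 0:
            continue  # nothing classified yet, so no accuracy to report
        # Assumed formula (not from the utility): share of messages without an error.
        # Every bucket row of one snapshot repeats the same global values,
        # so later rows simply overwrite the entry with an identical number.
        result[int(row["UnixTimestamp"])] = 100.0 * (messages - errors) / messages
    return result


for stamp, pct in accuracy_by_snapshot(SAMPLE).items():
    print(stamp, f"{pct:.2f}%")  # prints: 1053935655 98.48%
```

With a real history file, each distinct UnixTimestamp value would yield one entry, giving a simple accuracy-over-time series.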

Commandline Options

The script accepts commandline options to optionally override the separator character or quotes used in producing the CSV file.

Usage Examples

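Changing the default comma separator to a semicolon. (On Unix-like shells, quote or escape the semicolon, for example ';', because the shell otherwise treats it as a command separator.)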

perl snapshot_stats.pl -csv_separator ;

Changing the default quote character to a double-quote mark (most shells will require you to escape it as shown in the example).

perl snapshot_stats.pl -csv_quote \"

Changing the default comma separator to a colon and the default quote character to a single quote.

perl snapshot_stats.pl -csv_separator : -csv_quote \'
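
If you process the file with your own tools, the reader has to be told about the same separator and quote character. The following is a minimal Python sketch of that idea. The read_snapshots function and the shortened sample data are hypothetical, and the exact way snapshot_stats applies the quote character to fields is assumed here, not confirmed.

```python
import csv
import io


def read_snapshots(text, separator=",", quote=None):
    """Parse snapshot CSV text into a list of dicts keyed by the header row."""
    if quote is None:
        # No quote character in use: treat every character literally.
        reader = csv.DictReader(
            io.StringIO(text), delimiter=separator, quoting=csv.QUOTE_NONE
        )
    else:
        reader = csv.DictReader(
            io.StringIO(text), delimiter=separator, quotechar=quote
        )
    return list(reader)


# Hypothetical, shortened sample using ':' as separator and ' as quote.
# Quoting matters here because the Timestamp field itself contains colons.
HYPOTHETICAL = (
    "'BucketName':'Timestamp':'GlobalMessages'\n"
    "'normal':'Mon May 26 00:54:15 2003':'132'\n"
)

rows = read_snapshots(HYPOTHETICAL, separator=":", quote="'")
print(rows[0]["Timestamp"])
```

The sample also hints at why a colon separator is paired with a quote character in the example above: without quoting, the colons inside the timestamp would be read as field boundaries.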

A Note about Changing The Separator Character

Important Note: If you have already begun using snapshot_stats and want to change the separator character, you must either delete the snapshot_stats.csv file in your POPFile directory or manually edit it with a text editor so that the existing separator characters match the new separator character. If you fail to do so, you will end up with a snapshot_stats.csv file that has mixed separators in its rows.
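
As an alternative to editing the file by hand, the conversion could be scripted. The sketch below is one possible approach in Python, not part of snapshot_stats. The convert_separator function is hypothetical, and it assumes the existing file was written without a custom quote character. Back up the file before trying anything like this.

```python
import csv
import io


def convert_separator(text, old_sep, new_sep):
    """Re-emit CSV text with new_sep in place of old_sep between fields."""
    out = io.StringIO()
    reader = csv.reader(io.StringIO(text), delimiter=old_sep)
    # Fields that happen to contain new_sep are wrapped in double quotes
    # by the writer's default minimal quoting.
    writer = csv.writer(out, delimiter=new_sep, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Shortened, illustrative data: two columns from the sample header.
old_text = "BucketName,BucketColor\nnormal,blue\nspam,red\n"
print(convert_separator(old_text, ",", ";"), end="")
```

Parsing the rows instead of doing a plain search-and-replace avoids altering separator characters that appear inside a field's value.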

FAQS

Copying

Copyright (C) 2003 - 2007 Scott W. Leighton

Licensed under the terms of the GNU General Public License.

Contributed to the POPFile project under the terms of the POPFile License Agreement.

