March 06, 2007
Log File Monitoring
Real time log monitoring on Windows
Let's say you have a Windows server (NT, 2000, XP, etc.) application that generates log files, and you want to monitor those logs for particular messages, especially those pertaining to errors. By application we mean any kind of program that runs as a service on Windows: Siebel, PeopleSoft, databases such as Oracle or MSSQL, web servers, or all manner of custom-built integration systems that may support an enterprise. Many applications create log files precisely so they can be monitored, but how would you actually implement monitoring of the log directory in an automated fashion? That's always up to the administrator; Windows has no built-in automated way to do it.
The most common method is to use a programming language to open the directory you're monitoring, open each file, read through it, and compare the file contents to the error strings you're looking for. Here is an example I wrote in Perl, from my Open Source page. A good manual way to do this on Windows 2000 is to search the contents of a directory, and search within the files for the phrase; on XP it can be done too, but it requires indexing the folder in question. However, this kind of searching generally falls into the "after the fact" category of log file checking, similar to web log checking. What if you want to monitor a log file all the time, and be alerted when an event happens, with no manual steps?
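The Perl script itself isn't reproduced here, but the approach it describes can be sketched in a few lines of Python (a hypothetical equivalent, not the original script): open the directory, read each file, and collect the lines that contain the phrase.

```python
import os

def search_logs(log_dir, phrase):
    """Scan every file in log_dir and report lines containing phrase.

    Returns a list of (filename, line number, line) tuples.
    """
    hits = []
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "r", errors="replace") as f:
            for lineno, line in enumerate(f, 1):
                if phrase in line:
                    hits.append((name, lineno, line.rstrip("\n")))
    return hits
```

Note that this re-reads every file in the directory on every call — which is exactly the performance problem discussed next.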
Using a scripting or programming language, you could write a program that opens the files and searches them periodically; on UNIX you would probably use grep or the other built-in utilities. You would also want to do something with the results: output them to a file, send an email, or whatever your script is written to do. Then you have to worry about scheduling the script, running it in the background, deciding how often, and dealing with the results. If you have multiple servers and multiple directories, you may have to run it many times on each machine. What if you want to incorporate multiple phrases or errors in the search string? While different scripting approaches can handle all of this, a key problem always crops up: performance. Opening a directory and searching through every file for errors is expensive, and depending on how you implement it, trying to get real-time performance can be very expensive.
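One common way to blunt that cost — sketched here in Python as an illustration of the technique, not of how any particular tool implements it — is to remember the byte offset already scanned in each file, so each pass reads only what was appended since the last one:

```python
import os

offsets = {}  # path -> byte offset already scanned

def scan_new_lines(path, phrase):
    """Read only the bytes appended to path since the last scan."""
    matches = []
    pos = offsets.get(path, 0)
    if os.path.getsize(path) < pos:
        pos = 0  # file was truncated or rotated; start over
    with open(path, "rb") as f:
        f.seek(pos)
        for raw in f:
            line = raw.decode("utf-8", errors="replace")
            if phrase in line:
                matches.append(line.rstrip("\r\n"))
        offsets[path] = f.tell()
    return matches
```

Even with this optimization, a home-grown script still polls on a schedule; it doesn't react the instant a file changes.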
VA2 was written to handle this. It uses Windows API calls to reduce overhead, and it does not open whole directories and scan everything every time it searches for a log file error. Furthermore, it provides several key features if you're looking to monitor log files for any application:
- Runs as a real-time process. As soon as a log file is updated it is checked, reducing total CPU load
- Perl regular expressions. Without writing scripts and doing all the dirty work of starting them, scheduling them, and handling multiple directories, you can use simple or complicated Perl regular expressions to find errors in your log files
- Event handling. When a log string matches a VA2 Error Definition, an event is generated. VA2's built-in mechanisms can respond to different events in different ways, for example emailing them or starting or stopping servers. If you need to monitor a log directory for errors, this is a more flexible way of handling log file errors than writing an application that, for example, emails every single error.
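As a rough illustration of the Error Definition idea, a matcher can pair each regular expression with an event type and subtype, so that responding to events is decoupled from detecting them. The names and structure below are hypothetical, chosen to mirror the concepts — they are not VA2's actual API:

```python
import re

# Hypothetical error definitions: (compiled regex, event type, event subtype).
# The strings echo the Siebel examples from this post for illustration only.
DEFINITIONS = [
    (re.compile(r"Failed at invoking service"), "SIEBEL", "SVC_INVOKE_FAIL"),
    (re.compile(r"Process Exited with error", re.IGNORECASE), "SIEBEL", "PROC_EXIT"),
]

def match_line(line):
    """Return a generated 'event' dict for the first matching definition, else None."""
    for pattern, etype, subtype in DEFINITIONS:
        if pattern.search(line):
            return {"level": 0, "type": etype, "subtype": subtype, "line": line}
    return None
```

A separate dispatcher could then decide, per event type, whether to email, restart a server, or just record the event — which is the decoupling described above.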
Let's compare the steps needed to monitor a log file using a home-built script versus VA2. We'll use Perl as the example scripting language:
- Download a Perl distribution and install it (or try writing a file-parsing system with VBScript...it can be done; if anybody has an example, send it my way and I'll post it)
- Learn enough of the language in question to open files, search for strings, do something with the output
- Use some type of scheduling application to kick off the scripts to monitor the log files at regular intervals; note that this will be CPU-intensive
- Do something with the output, for example send the results to a file or email. Sending email takes extra coding, and be careful not to hardcode email addresses in the system or you might suddenly get several thousand alerts. If you create an output file that has to be manually examined, the whole effort may not be worth it, as it doesn't automate much.
- Maintain and enhance scripts, handle multiple directories, schedules etc.
With VA2, there are some necessary steps:
- Download and install VA2, including RDBMS support (you can use MSDE, MSSQL, Oracle, DB2, or MySQL). MSDE is free and has been heavily tested with VA2.
- Install the VA2 Central Service and Local Service Monitor according to the documentation. http://recursivetechnology.com/documentation/VA2Documentation.html
- The VA2 LSM is installed on every machine that you want to monitor. Create an error definition and associate it with the directory you want to monitor
This example shows that you will be looking for the string "Failed at invoking service" under the application appserver:PREPROD1. If that error string ever appears in the directory you are monitoring, an event will be generated with a level of 0, the type listed in the Event Type field, and the subtype listed in the Event Sub Type field.
You may ask: OK, it is searching for the "Failed at invoking service" string, but in what directory? The answer is to look at the appserver:PREPROD1 software element, which tells you what directory is being searched. You can also create new software elements to point at any directory you need monitored.
In this case, the e:\sblppr1752\siebsrvr\log\ directory is being monitored. Notice that you can also monitor an NT service with the click of a button; that is also a feature of VA2. With VA2 you can monitor any Windows NT log file in real time and instantly react to search strings by generating events.
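VA2 reacts as files change by using Windows API calls. As a rough, portable approximation of the idea — checking only files that have actually been updated, rather than rescanning everything — a monitor can track modification times. The function below is an illustrative sketch, not VA2 code:

```python
import os

mtimes = {}  # path -> last seen modification time

def changed_files(log_dir):
    """Return files in log_dir modified (or created) since the last call."""
    changed = []
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue
        mtime = os.path.getmtime(path)
        if mtimes.get(path) != mtime:
            mtimes[path] = mtime  # remember, so unchanged files are skipped
            changed.append(path)
    return changed
```

A real-time monitor would pair this (or a true change-notification API) with the incremental file reading shown earlier, so each update triggers a check of only the new log lines.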
Here is a screenshot of the events once generated:
Although there are some setup steps with VA2, there is also a cost to writing custom scripts. VA2 has the ability to email events when they happen and to run Reaction scripts, either on a remote machine or a central server. If you're using Siebel, there is a built-in interface to send Siebel Server commands when an event is detected.
Event handling diagram: (more information available at http://recursivetechnology.com/documentation/Tutorial_event_routing.html)
Here is a review of the main benefits of monitoring log files with VA2:
- No need to learn new programming languages and extensive architecture to monitor log files
- Monitor multiple directories on multiple machines with the same architecture
- Respond in real time to log file events, increasing CPU efficiency
- De-couple the monitoring of log files and the response to those events
- React to events with email notifications, which also include schedules for who to email at which time.
- React to events (from log files or other sources) and take corrective actions, reactions can be programmed to happen on any machine monitored by VA2.
The most difficult part of log file monitoring is knowing what to monitor. Application design can help with that, by defining at what points, and why, you may want to monitor log files. Siebel is a good example; VA2 was initially built to monitor Siebel. The log files Siebel produces are fairly standardized, but there still aren't definitive cases where you always want to monitor Siebel log files. The closest is "Process Exited with error", but even that message isn't always one you care about. Often the customization of Siebel, for example for integration processes, results in applications that kick out customized errors. With good application design, you can define the errors that matter in critical situations and use log file monitoring to check for them.
If you do have a situation where you definitely want to monitor a log file for known strings, it's likely that you'll want infrastructure like VA2 assisting with the monitoring and handling the results. VA2 is not limited to monitoring Siebel log files; the same infrastructure can be used to monitor multiple custom applications.