Writing a Simple Filter With Perl
One handy thing you can do with Perl is write a filter. We often tail log files but only care about a few fields in each line. The full logs are still there if we need more detail, but for the task at hand most of it is noise. Say your logs look like this:
10.1.165.45 [30/Jul/2005:10:06:08 -0500] /docspath/nat/part164.html 200
10.78.9.213 [30/Jul/2005:10:06:21 -0500] /docspath/perl/man/latex_and_perl.html 304
10.78.9.213 [30/Jul/2005:10:06:21 -0500] /docspath/perl/man/images/perl-title.png 304
10.78.9.213 [30/Jul/2005:10:06:21 -0500] /docspath/perl/man/images/redballdot.png 304
10.121.2.33 [30/Jul/2005:10:06:47 -0500] /docspath/perl/index.html 200
10.220.1.136 [30/Jul/2005:10:06:49 -0500] /docspath/nat/missing.html 404
10.121.2.33 [30/Jul/2005:10:06:53 -0500] /docspath/perl/themes/thm/style/style.css 200
10.121.2.33 [30/Jul/2005:10:06:55 -0500] /docspath/perl/perlimages/perllogo.jpg 200
10.121.2.33 [30/Jul/2005:10:06:55 -0500] /docspath/perl/themes/thm/images/bg.gif 200
10.121.2.33 [30/Jul/2005:10:06:59 -0500] /docspath/perl/favicon.ico 200
We already know the time and date, since we are tailing the log in real time. With a simple filter like this one:
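#!/usr/bin/perl
use strict;
use warnings;
# Read lines from STDIN (or from files named on the command line).
while (<>) {
    s{/docspath/}{}g;     # drop the common path prefix
    s/\[.*?\]\s*//;       # drop the bracketed timestamp and the space after it
    print;
}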
we can tail the log and strip out the extra path information and the timestamp (this assumes logfilt.pl is executable and in a directory on your PATH):
tail -f access_log | logfilt.pl
The above log would appear like this:
10.1.165.45 nat/part164.html 200
10.78.9.213 perl/man/latex_and_perl.html 304
10.78.9.213 perl/man/images/perl-title.png 304
10.78.9.213 perl/man/images/redballdot.png 304
10.121.2.33 perl/index.html 200
10.220.1.136 nat/missing.html 404
10.121.2.33 perl/themes/thm/style/style.css 200
10.121.2.33 perl/perlimages/perllogo.jpg 200
10.121.2.33 perl/themes/thm/images/bg.gif 200
10.121.2.33 perl/favicon.ico 200
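To see what each substitution does without tailing a live log, the same two steps can be applied to a single sample line. This is only a sketch: the helper name strip_line is made up for illustration, and the timestamp pattern is a small variation that matches non-greedily (.*?) and also removes the whitespace after the closing bracket, so only one space is left between the IP and the path.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: apply the filter's two substitutions to one log line.
sub strip_line {
    my ($line) = @_;
    $line =~ s{/docspath/}{}g;    # drop the common path prefix
    $line =~ s/\[.*?\]\s*//;      # drop the first [...] group and the whitespace after it
    return $line;
}

my $sample = "10.220.1.136 [30/Jul/2005:10:06:49 -0500] /docspath/nat/missing.html 404\n";
print strip_line($sample);        # prints: 10.220.1.136 nat/missing.html 404
```

Using s{...}{...} instead of s/.../.../ for the path avoids having to escape every slash.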
You could also add conditionals to the filter and print extra information when needed. For instance, on a 404 you could print the referrer, if your log format records it. Here is a version of the script that shortens the paths further and pads the IP address so the columns line up as you tail:
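#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
    chomp;                        # strip the newline; we add it back when printing
    s{/docspath/}{}g;             # drop the common path prefix
    s/\[.*?\]\s*//;               # drop the bracketed timestamp
    s{usrman/images}{u/img}g;     # abbreviate long path components
    s/image/mg/g;
    s/theme/tm/g;
    my ($ip, $path, $status) = split ' ', $_;      # IP, path, and result code
    next unless defined $status;                   # skip lines without three fields
    printf "%15s %s %s\n", $ip, $path, $status;    # right-align the IP in 15 columns
}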
First, chomp removes the trailing newline so we can add \n ourselves where we want it. We then strip and abbreviate the paths. Finally, we split the line into the IP, path, and result code and print them, right-aligning the IP address in a 15-character field (the longest a dotted IPv4 address can be) so the columns line up. Here is how it looks with the padding:
   10.78.252.23 nat/art355.html 200
   10.78.252.23 nat/tms/nattm/style/style.css 200
   10.78.252.23 nat/mgs/translogo.gif 200
   10.78.252.23 nat/mgs/print.gif 200
217.112.204.144 nat/art28.html 200
    10.86.88.22 nat/allarts.html 200
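The conditional idea mentioned earlier can be sketched in the same style. The sample log format here has no referrer field, so this version only appends a note to 404 lines. The helper name format_line and the "<-- not found" marker are invented for illustration, and the input is assumed to be a line that has already had its path prefix and timestamp removed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: format one already-filtered line ("IP path status").
sub format_line {
    my ($line) = @_;
    chomp $line;
    my ($ip, $path, $status) = split ' ', $line;
    return '' unless defined $status;                 # ignore lines without three fields
    my $out = sprintf "%15s %s %s", $ip, $path, $status;
    $out .= "  <-- not found" if $status eq '404';    # conditional: add a note on 404s
    return "$out\n";
}

# Two sample lines taken from the filtered output shown earlier.
print format_line($_) for (
    "10.121.2.33 perl/index.html 200",
    "10.220.1.136 nat/missing.html 404",
);
```

In a real filter, the body of format_line would sit inside the while (<>) loop, after the substitutions.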