

INFN-PADOVA wiki



Logwatch

Page: Site.Logwatch - Last Modified : Mon, 30 Mar 09

Logwatch sensor for DGAS

To verify that the DGAS service on the HLR is working, we created a logwatch sensor on the CE prod-ce-02. The sensor checks whether glite-dgas-pushd is sending records correctly to prod-hlr-01 by searching the pushd log for EXITSTATUS=255. This is the configuration we set up:

prod-ce-02

  • Install logwatch-7.3.6-1, perl-Email-Valid and perl-Net-SSH
  • Create an SSH key pair with an empty passphrase; the public key will later be copied to prod-hlr-01:
ssh-keygen -t rsa
  • Create the following configuration files:
    • /etc/logwatch/conf/services/dgas.conf (to define general information on the service to be monitored)
##########################################################################
# /etc/logwatch/conf/services/dgas.conf
##########################################################################

Title = "dgas"

# Which logfile group...
LogFile = dgas

# vi: shiftwidth=3 tabstop=3 et
    • /etc/logwatch/conf/logfiles/dgas.conf (to define which log files are parsed)
##########################################################################
# /etc/logwatch/conf/logfiles/dgas.conf
##########################################################################

# What actual file?  Defaults to LogPath if not absolute path....
LogFile = /opt/glite/var/log/dgas_ce_pushd.log
Archive = /opt/glite/var/log/dgas_ce_pushd.log.*

# Use the following date filter
*ApplyDgasDate

# vi: shiftwidth=3 tabstop=3 et
  • Create the following scripts:
    • /etc/logwatch/scripts/shared/applydgasdate (to match dates in the format used by dgas_ce_pushd.log)
##########################################################################
# /etc/logwatch/scripts/shared/applydgasdate
##########################################################################

use Logwatch ':dates';

my $Debug = $ENV{'LOGWATCH_DEBUG'} || 0;

$SearchDate = TimeFilter('%b %e %H:%M:%S');

# The date might be "Dec 09", but it needs to be "Dec  9"...
$SearchDate =~ s/ 0/  /;

if ( $Debug > 5 ) {
   print STDERR "DEBUG: Inside ApplyDgasDate...\n";
   print STDERR "DEBUG: Looking For: " . $SearchDate . "\n";
}

# Read exactly one line per iteration, so that no log line is skipped
while (defined($ThisLine = <STDIN>)) {
   if ($ThisLine =~ m/^$SearchDate: /o) {
      print $ThisLine;
   } elsif ($ThisLine =~ m/(Mon|Tue|Wed|Thu|Fri|Sat|Sun) $SearchDate \d{4}: /o) {
      print $ThisLine;
   }
}

# vi: shiftwidth=3 syntax=perl tabstop=3 et
    • /etc/logwatch/scripts/services/dgas (to parse the log and match EXITSTATUS=255)
##########################################################################
# /etc/logwatch/scripts/services/dgas
##########################################################################

use strict;
use Logwatch ':all';
use Net::SSH qw(ssh_cmd); 
use Email::Valid;

my $conn = 'root@prod-hlr-01.pd.infn.it';
my $command = "service glite-dgas-hlrd restart";
my $count = 0;

my $sendmail = "/usr/sbin/sendmail -t";
my $from = "From: root\n";
my $to = "To: grid-services-pd\@infn.it\n"; 
my $subject = "Subject: DGAS pushd check\n";
my $content;

while (defined(my $ThisLine = <STDIN>)) {
   if ($ThisLine =~ /EXITSTATUS=255/) {
      #print "$ThisLine\n";   # debug
      $count++;
   }
}

if ($count > 3) {
   ssh_cmd($conn, $command);
   $content = "DGAS pushd failed $count times since the last check!\nThe service glite-dgas-hlrd has been restarted on HLR.";
   print "\nPushd faults: " . $count . "\n";
   # mail
   open(SENDMAIL, "|$sendmail") or die "Cannot open $sendmail: $!";
   print SENDMAIL $from;
   print SENDMAIL $to;
   print SENDMAIL $subject;
   print SENDMAIL "Content-type: text/plain\n\n";
   print SENDMAIL $content;
   close(SENDMAIL); 
}

exit(0);

# vi: shiftwidth=3 tabstop=3 syntax=perl et
  • Create the following cron job:
    • /etc/cron.d/dgas-pushd-check (runs the check every 30 minutes)
*/30 * * * * root (date; rm -rf /var/cache/logwatch/*; /usr/sbin/logwatch --service dgas --range 'since 30 minutes ago for those minutes' --print) >> /var/log/dgas-pushd-check.log 2>&1
  • Periodically check the log /var/log/dgas-pushd-check.log to see whether everything is OK:
    • if you see only a date, the check found nothing to do
    • if you see a message like the following one, the check has restarted the glite-dgas-hlrd service on prod-hlr-01:
 Thu Mar 12 14:00:02 CET 2009

 ################### Logwatch 7.3.6 (05/19/07) #################### 
        Processing Initiated: Thu Mar 12 14:01:12 2009
        Date Range Processed: since 30 minutes ago for those minutes
                              ( 2009-Mar-12 13h 31m / 2009-Mar-12 14h 01m )
                              Period is minute.
      Detail Level of Output: 0
              Type of Output: unformatted
           Logfiles for Host: prod-ce-02.pd.infn.it
  ################################################################## 

 --------------------- dgas Begin ------------------------ 

 Pushd faults: 4

 ---------------------- dgas End ------------------------- 


 ###################### Logwatch End ######################### 
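
The manual inspection of /var/log/dgas-pushd-check.log described above can be partly automated. The following is only a minimal sketch and not part of the original setup: the function name count_restarts is hypothetical, and it assumes that every run which restarted the service leaves one (possibly indented) "Pushd faults: N" line in the log, as in the sample output.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original setup): count how many
# logwatch runs recorded in the check log reported pushd faults, i.e. how
# many times glite-dgas-hlrd was restarted.
# Assumption: each such run leaves one "Pushd faults: N" line in the log.
count_restarts() {
    # $1: path to the check log (default: the file written by the cron job)
    log="${1:-/var/log/dgas-pushd-check.log}"
    # grep -c prints the number of matching lines; it exits with status 1
    # when there are none, so "|| true" keeps the function's status at 0.
    grep -c '^[[:space:]]*Pushd faults: [0-9]' "$log" || true
}
```

Calling count_restarts with no argument reads the default log path; a result of 0 means the log contains only dates, so no restart was triggered.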

prod-hlr-01

  • Append the SSH public key of prod-ce-02 to /root/.ssh/authorized_keys2 so that prod-ce-02 can run commands on this host over SSH:
scp root@prod-ce-02:/root/.ssh/id_rsa.pub ~/prod-ce-02_id_rsa.pub
cat ~/prod-ce-02_id_rsa.pub >> /root/.ssh/authorized_keys2
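
The cat ... >> step above appends the key again every time it is repeated. As a hedged alternative, the sketch below appends a key only when that exact line is not yet present. The function name append_key_once is hypothetical and not part of the original setup, and the chmod 600 reflects the assumption that sshd runs with its default StrictModes setting, which refuses key files writable by group or others.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original setup): append a public key
# to an authorized_keys file only if that exact line is not already there.
append_key_once() {
    pubkey_file="$1"     # e.g. ~/prod-ce-02_id_rsa.pub
    auth_file="$2"       # e.g. /root/.ssh/authorized_keys2
    # Create the file if needed and make it readable/writable by owner only.
    touch "$auth_file" && chmod 600 "$auth_file" || return 1
    # -q: quiet, -x: whole-line match, -F: fixed strings,
    # -f: take the pattern(s) from the public key file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

With the file names used on this page, the call would be: append_key_once ~/prod-ce-02_id_rsa.pub /root/.ssh/authorized_keys2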
