Automated NetWorker Checklist Dashboard
This script gives you a periodic status report of your backup
infrastructure so that you can monitor it proactively. It gathers
information from multiple servers, formats everything into a
single email, and sends it to the required recipients. The
script bundle contains two scripts.
Script | Description |
---|---|
health_check.pl | The collector script. Copy it to each backup server from which information has to be collected. |
health_check_master.pl | The master script. It starts the collector script on each server and formats all the outputs into an email that is sent to the intended recipients. |
This script was developed and tested on:
- Platform : Linux and UNIX
- Application : DELL EMC NetWorker 8.x
Prerequisites
There are a few prerequisites that you need to meet before using
this script.
- Set up passwordless ssh for the user that will run the master script. In case you are not aware of the procedure, follow the procedure as described here.
- sendmail is installed and configured on the server running the master script.
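Both prerequisites can be checked before the first run. The following is a minimal sketch, not part of the original bundle: the helper names are hypothetical, and it assumes an OpenSSH client, whose `BatchMode=yes` option makes ssh fail instead of prompting for a password. The sendmail path is the one the master script uses.

```perl
use strict;
use warnings;

# Build the ssh test command as a list so that no shell quoting is involved.
# BatchMode=yes disables password prompts; ConnectTimeout bounds the wait.
sub ssh_check_command {
    my ($username, $server) = @_;
    return ('ssh', '-o', 'BatchMode=yes', '-o', 'ConnectTimeout=10',
            "$username\@$server", 'true');
}

# True when the key-based login works and the remote 'true' command succeeds.
sub passwordless_ssh_ok {
    my ($username, $server) = @_;
    return system(ssh_check_command($username, $server)) == 0;
}

# True when an executable file exists at the given path.
sub sendmail_ok {
    my ($path) = @_;
    return (-x $path) ? 1 : 0;
}

print "sendmail: ", (sendmail_ok('/usr/sbin/sendmail') ? "found" : "missing"), "\n";
```

To test a server, call `passwordless_ssh_ok('user', 'bkp01')`, where `user` and `bkp01` are placeholder values for your own user name and backup server.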
Input
The input to this script is provided via an external file,
backup_server_list.txt. Make sure to keep the file in the same
location as the script. If you run the script from crontab, set the
$server_details_file_name variable in the script to the complete
path of this file to avoid errors during execution.
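As an illustration of the path issue under cron, the sketch below resolves the list file relative to the script's own directory using the core FindBin module, so the working directory no longer matters. It assumes the file holds one server name per line, which is how the master script consumes it. The `read_server_list` helper is hypothetical, and skipping blank lines and `#` comments is an added assumption; the original script passes every line to ssh.

```perl
use strict;
use warnings;
use FindBin qw($Bin);   # core module: directory that contains the running script
use File::Spec;

# Hypothetical helper: return the server names listed in a file, one per line.
# Blank lines and lines starting with '#' are skipped.
sub read_server_list {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Could not open file '$path': $!";
    my @servers;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;              # trim whitespace and the newline
        next if $line eq '' or $line =~ /^#/;
        push @servers, $line;
    }
    close($fh);
    return @servers;
}

# Resolve the list next to the script instead of the current directory.
my $server_details_file_name = File::Spec->catfile($Bin, 'backup_server_list.txt');

if (-e $server_details_file_name) {
    print "$_\n" for read_server_list($server_details_file_name);
} else {
    print "No server list found at $server_details_file_name\n";
}
```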
Along with the external file, you may also want to assign appropriate values to the variables listed below in the master script.
Variable | Value description |
---|---|
$username | User name for which the ssh key has been stored and that can be used to log in to the destination server without a password. |
$server_details_file_name | Complete path to the backup_server_list.txt file on the local system. |
$email_from | Sender's email address. |
$email_to | Recipient's email address. |
$email_subject | Subject for the email that would contain the consolidated report. |
Sample email report
The resulting email will contain the following fields
Fields | Description |
---|---|
Backup Server nsr Filesystem status | Current utilization of the NSR related mountpoints |
Long savegroup | Backups running for more than 24 hours |
Any star volume | Volumes that are never used in the current NetWorker datazone |
Any drive in service mode | Devices currently in service mode |
Any drive in disabled mode | Devices currently in disabled mode |
Volume going to default pool | Any volumes using the Default pool |
Cannot write Issue | Errors due to backup devices |
Media Critical | Critical media alerts |
Connectivity issues | RPC-related entries found in the daemon log since yesterday |
Savegroup failures | Number of failed backup jobs |
Savegroup skipped | Number of save groups that are skipped for any reason |
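To make the "Long savegroup" field concrete: the collector counts savegrp processes whose start-time column in the `ps -eaf` output is not a clock time. The sketch below mirrors that filter in plain Perl. The helper name and the sample lines are made up for illustration. Note that the test approximates "more than 24 hours": it flags processes whose start time is shown as a date rather than as HH:MM.

```perl
use strict;
use warnings;

# Return the ps lines for savegrp processes whose STIME column (field 5)
# does not look like a clock time, as the awk filter in health_check.pl does.
sub long_running_savegroups {
    my @ps_lines = @_;
    my @long;
    for my $line (@ps_lines) {
        next unless $line =~ /savegrp/;
        my @fields = split ' ', $line;
        next unless defined $fields[4];
        push @long, $line unless $fields[4] =~ /[0-9]:[0-9]/;
    }
    return @long;
}

# Made-up sample lines in the usual `ps -ef` column layout.
my @sample = (
    "root  4312     1  0 Mar03 ?     00:12:41 /usr/sbin/savegrp Daily_FS",
    "root  9921     1  0 09:15 ?     00:00:07 /usr/sbin/savegrp Hourly_DB",
    "root  9950  9800  0 10:02 pts/0 00:00:00 grep savegrp",
);
my @long = long_running_savegroups(@sample);
print scalar(@long), " long-running savegroup(s)\n";
```

Because the `grep savegrp` process itself always carries a clock-time start, the filter also keeps it out of the count.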
health_check_master.pl
#! /usr/bin/perl -w
# Details regarding this script can be found on http://www.crazyrov.com/prod/dp/dp_script_scd.php
my $today_date = `date +'%d-%m-%Y'`;
my @months = qw( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec );
my @days = qw(Sun Mon Tue Wed Thu Fri Sat);
(my $sec,my $min,my $hour,my $mday,my $mon,my $year,my $wday,my $yday,my $isdst) = localtime();
$year += 1900; # localtime() returns the year as an offset from 1900
my $username = "user";
my $server_details_file_name = "/tmp/backup_server_list.txt";
my $email_holder = "/tmp/email_holder.txt";
my $email_from="sender\@crazyrov.net";
my $email_to="recipient\@crazyrov.net";
my $email_subject=" CUSTOMER_NAME Healthcheck Dashboard";
my @server_lists;
print "------------------ Script Initiated ------------------\n";
@server_lists = `cat $server_details_file_name`; # use the configured path instead of a hard-coded one
email_header($email_holder, $email_subject, $email_from, $email_to);
open(my $fh, '>>', $email_holder) or die "Could not open file '$email_holder' $!";
for my $server (@server_lists)
{
chomp($server);
print `date`, " Gathering data for $server\n";
my @temp = `ssh $username\@$server /tmp/health_check.pl`;
print $fh "<tr style=\"font-size: small; white-space:nowrap;background-color: #FCF3CF\">";
print $fh @temp;
print $fh "</tr>";
print `date`, " Completed\n\n";
}
close($fh);
email_footer($email_holder);
print "\n------------------ Sending email now ------------------\n";
` cat $email_holder | /usr/sbin/sendmail -t `;
print "------------------ Script Completed ------------------\n";
# ----------------------------- Subroutines --------------------------------------------------
# Add the email header and the static items.
sub email_header{
my $holder = $_[0];
open(my $fh, '>', $holder) or die "Could not open file '$holder' $!";
print $fh "From: $_[2]
To: $_[3]
Subject: $_[1]
Content-Type: text/html
MIME-Version: 1.0\n
<!doctype html>
<html>
<head>
<meta charset=\"utf-8\">
<title>CUSTOMER_NAME Healthcheck Dashboard by Rovin D'Souza</title>
</head>
<body style=\"font-family:Segoe, 'Segoe UI', 'DejaVu Sans', 'Trebuchet MS', Verdana, sans-serif \">
<div id=\"main_heading\" align=\"center\" style=\"margin-top: 20px;margin-bottom:20px;font-size:larger;background:#00628B;height:60px;\">
<b><p><span style=\"color:#FFFFFF;height: 100%\">".$email_subject."</span></p></b>
</div>
<div class=\"content\" align=\"center\" style=\"margin-left:2%;margin-right:2%\">
<table style=\"font-family:Segoe, 'Segoe UI', 'DejaVu Sans', 'Trebuchet MS', Verdana, sans-serif \" width=\"100%\" border=\"0\">
<tbody style=\"height:20px\">
<tr>
<td> </td>
</tr>
</tbody>
</table>
<table id=\"main_table\" style=\"font-family:Segoe, 'Segoe UI', 'DejaVu Sans', 'Trebuchet MS', Verdana, sans-serif \" width=\"100%\">
<caption id=\"table_caption\" style=\"text-align:left; background:#FFFFFF\">
<b><p><span style=\"text-align:left; background:#FFFFFF\">".$mday."-".$months[$mon]."-".$year."</span></p></b>
</caption>
<tbody id=\"main_table_body\">
<tr id = \"main_table_header\" style=\"background-color:#81A594;\">
<th scope=\"col\"><span>Backup Server</span></th>
<th scope=\"col\"><span>Filesystem</span></th>
<th scope=\"col\"><span>Long savegroup</span></th>
<th scope=\"col\"><span>Any star volume</span></th>
<th scope=\"col\"><span>Any drive in service mode</span></th>
<th scope=\"col\"><span>Any drive in disabled mode</span></th>
<th scope=\"col\"><span>Volume going to default pool</span></th>
<th scope=\"col\"><span>Cannot write Issue</span></th>
<th scope=\"col\"><span>Media Critical</span></th>
<th scope=\"col\"><span>Connectivity issues</span></th>
<th scope=\"col\"><span>Savegroup failures</span></th>
<th scope=\"col\"><span>Savegroup skipped</span></th>
</tr>";
close ($fh);
}
sub email_footer{
my $holder = $_[0];
open(my $fh, '>>', $holder) or die "Could not open file '$holder' $!";
print $fh "
</tbody>
</table>
</div>
<table width=\"100%\" border=\"0\">
<tbody style=\"height:50px\">
<tr>
<td> </td>
</tr>
</tbody>
</table>
</body>
</html> ";
close ($fh);
}
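The master script writes whatever the ssh command returns into the report, so an unreachable server produces an empty row. One possible way to detect this is sketched below, assuming a Unix-like system. The `collect_output` helper is hypothetical and not part of the original script. It runs a command given as a list, captures its output, and reports whether the command exited successfully.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command given as a list (no shell involved),
# capture its standard output, and report whether it exited successfully.
sub collect_output {
    my @cmd = @_;
    open(my $pipe, '-|', @cmd) or return (0);
    my @lines = <$pipe>;
    my $ok = close($pipe) ? 1 : 0;   # close() is false when the exit status is non-zero
    return ($ok, @lines);
}

# Stand-in command so that the sketch runs anywhere; in the master script the
# list would be ('ssh', "$username\@$server", '/tmp/health_check.pl').
my ($ok, @cells) = collect_output($^X, '-e', 'print "<th>demo</th>\n"');
if ($ok) {
    print @cells;
} else {
    # The report table has 12 columns, so one cell spans the whole row.
    print "<th colspan=\"12\">Collection failed</th>\n";
}
```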
health_check.pl
#! /usr/bin/perl -w
# Details regarding this script can be found on http://www.crazyrov.com/prod/dp/dp_script_scd.php
chomp(my $system_hostname = `hostname`); # strip the trailing newline before it is printed in the table cell
my $today_date = `date +'%d-%m-%Y'`;
my $nsr_capacity = "";
my $long_running_groups_count = "";
my $new_tapes = 0;
my @new_tape_list;
my @tapes_service;
my @tapes_disabled;
my @tapes_default;
my @cannot_write;
my @tape_waiting;
my @connectivity_issues;
my @groups_failed;
my @groups_overrun;
my @jb_names;
print (" <th style=\"background-color: #F8C471 \"><span>", uc $system_hostname , "</span></th>\n");
# Size of the nsr mount point
$nsr_capacity = `df -Ph | awk '\$NF == "/nsr" {print \$5}'`;
#$nsr_capacity = `cat command_1_output.txt | awk ' \$NF == "/nsr" {print \$5} '`;
#print ("nsr is at ", $nsr_capacity,"\n");
print ("<th><span style=\"font-weight: normal\">", $nsr_capacity , "</span></th>\n");
# Long running groups
$long_running_groups_count = `ps -eaf|grep savegrp |awk '\$5 !~ /[0-9]:[0-9]/ {print \$0}' | wc -l`;
#$long_running_groups_count = `cat command_2_output.txt | wc -l`;
#print ("Long running groups are ", $long_running_groups_count,"\n\n");
if($long_running_groups_count > 0){
print ("<th><span style=\"font-weight: normal\">", $long_running_groups_count , "</span></th>\n");
} else {
print ("<th><span style=\"font-weight: normal\"> None </span></th>\n");
}
# Unlabeled tapes
#nsrjb -j libraryname | grep -i "*"
@jb_names =`printf '. type: nsr jukebox; enabled: Yes\n show name\n print \n' | /usr/sbin/nsradmin -i- | grep -i name | awk -F' ' '{print \$2}'`;
foreach my $jb (@jb_names){
chomp($jb);
chop($jb);
# push() returns the new element count, so its result must not be assigned back to the array
push(@new_tape_list, `/usr/sbin/nsrjb -j $jb | grep "*" | awk '{print \$3}' | grep -v in`);
$new_tapes = scalar @new_tape_list;
}
#@new_tape_list = `cat command_3_output.txt `;
#push(@new_tape_list , `cat command_3_output.txt `);
#$new_tapes = $new_tapes + scalar @new_tape_list;
#print ("The number of unlabeled/unrecognized tapes at ", scalar @new_tape_list, " \n\n");
if(scalar @new_tape_list > 0){
print ("<th><span style=\"font-weight: normal\">", scalar @new_tape_list , "</span></th>\n");
} else {
print ("<th><span style=\"font-weight: normal\"> None </span></th>\n");
}
# Drives in service mode or disabled; if so, how many
@tapes_service = `/usr/sbin/nsrmm | grep -i service | awk '{print \$7}'`;
#@tapes_service = `cat command_4_output.txt | grep -i service | awk '{print \$7}'`;
#print ("There are ",scalar @tapes_service," in service mode\n");
if(scalar @tapes_service > 0){
print ("<th><span style=\"font-weight: normal\">", scalar @tapes_service , "</span></th>\n");
} else {
print ("<th><span style=\"font-weight: normal\"> None </span></th>\n");
}
@tapes_disabled = `/usr/sbin/nsrmm | grep -i disabled | awk '{print \$7}'`;
#@tapes_disabled = `cat command_4_output.txt | grep -i disabled | awk '{print \$7}'`;
#print ("There are ",scalar @tapes_disabled," in disabled mode\n\n");
if(scalar @tapes_disabled > 0){
print ("<th><span style=\"font-weight: normal\">", scalar @tapes_disabled , "</span></th>\n");
} else {
print ("<th><span style=\"font-weight: normal\"> None </span></th>\n");
}
# detect any backups using the default pool, waiting for default pool
@tapes_default =`grep -i waiting /nsr/logs/daemon.raw| grep -i default`;
#print ("There are ",scalar @tapes_default," waits for default tapes\n\n");
if(scalar @tapes_default > 0){
print ("<th><span style=\"font-weight: normal\">", scalar @tapes_default , "</span></th>\n");
} else {
print ("<th><span style=\"font-weight: normal\"> None </span></th>\n");
}
# Cannot write Issue
@cannot_write = `grep -i "cannot write" /nsr/logs/daemon.raw`;
if(scalar @cannot_write > 0){
print ("<th><span style=\"font-weight: normal\">", scalar @cannot_write , "</span></th>\n");
} else {
print ("<th><span style=\"font-weight: normal\"> None </span></th>\n");
}
# Critical media alerts
@tape_waiting = `/usr/sbin/nsr_render_log -S yesterday /nsr/logs/daemon.raw |grep -i critical | grep 'media request' | tail | awk -F "'" '{print \$2}'`;
#@tape_waiting = `printf ". type: nsr\n show alert message \n print \n" | /usr/sbin/nsradmin -s $system_hostname -i- | grep -i waiting`;
if(scalar @tape_waiting > 0){
print ("<th><span style=\"font-weight: normal\"> Critical Media alerts</span></th>\n");
} else {
print ("<th><span style=\"font-weight: normal\"> No Alerts </span></th>\n");
}
# Connectivity issues
# grep -i error /nsr/logs/daemon.log| grep -i RPC
@connectivity_issues = `/usr/sbin/nsr_render_log -S yesterday /nsr/logs/daemon.raw | grep -i RPC`;
if(scalar @connectivity_issues > 0){
print ("<th><span style=\"font-weight: normal\"> RPC errors found </span></th>\n");
} else {
print ("<th><span style=\"font-weight: normal\"> No Errors </span></th>\n");
}
# Savegroup failures
# ls -lrt /nsr/logs/savegroups/ |grep -i failed|tail
@groups_failed = `/usr/sbin/nsr_render_log -S yesterday /nsr/logs/daemon.raw | grep -i savegroup | grep -i "failure alert" | awk -F ":" '{print \$4}'`;
#@groups_failed = `cat command_5_output.txt`;
my $i=0;
for my $group (@groups_failed){
$groups_failed[$i] = (split(" ",((split(",", $group))[0])))[0];
$i++;
}
#print ( "Groups that have failed in the past 24hrs is \n",@groups_failed, "\n\n");
if(scalar @groups_failed > 0){
print ("<th><span style=\"font-weight: normal\">", scalar @groups_failed," Failures </span></th>\n");
} else {
print ("<th><span style=\"font-weight: normal\"> No Failures </span></th>\n");
}
# Savegroup skipped
# grep -i "savegrp is already running" /nsr/logs/daemon*.log|grep $(date -d "yesterday" '+%D')
@groups_overrun = `/usr/sbin/nsr_render_log -S yesterday /nsr/logs/daemon.raw | grep -i "savegrp is already running" | awk -F ":" '{print \$4}'`;
#@groups_overrun = `cat command_6_output.txt | awk -F ":" '{print \$4}'`;
$i=0;
for my $group (@groups_overrun){
$groups_overrun[$i] = (split(" ",((split(",", $group))[0])))[1];
$i++;
}
#print ( "Groups that have overrun and skipped in the past 24hrs is \n ",@groups_overrun, "\n\n");
if(scalar @groups_overrun > 0){
print ("<th><span style=\"font-weight: normal\">", scalar @groups_overrun," over-run and skipped </span></th>\n");
} else {
print ("<th><span style=\"font-weight: normal\"> No over-runs </span></th>\n");
}
Support/Contact
Email your feedback to support@crazyrov.com.