NetAdminTools.com
 
Monitoring and Automatic Recovery of Services with Monit
Topic: Monitoring   Date: 2006-04-27


Subject

Monit is a small, easy-to-configure monitoring system for *nix systems that attempts to restart services that have failed. Grab the tarball, then extract, configure, make, and make install:

[usr-1@srv-1 ~]$ tar -xzf mon*4.7*.gz
[usr-1@srv-1 ~]$ cd mon*7
[usr-1@srv-1 monit-4.7]$ ./configure
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
.
.
.
monit has been configured with the following options:
Architecture: LINUX
SSL support: enabled
SSL include directory: /usr/include
SSL library directory: /usr/lib
resource monitoring: enabled
resource code: sysdep_LINUX.c
Compiler flags: -g -O2 -Wall -D _REENTRANT -I/usr/include
Linker flags: -lpthread -lcrypt -lresolv -lnsl  
-L/usr/lib -lssl -lcrypto
pid file location: /var/run
[usr-1@srv-1 monit-4.7]$
[usr-1@srv-1 monit-4.7]$ make
bison -y -dt p.y
/bin/mv -f y.tab.h tokens.h
flex -i l.l
gcc -c -DLINUX -I. -I./device -I./http -I./process -I./protocols 
.
.
.
protocols/rdate.o protocols/rsync.o protocols/smtp.o protocols/ssh.o 
protocols/tns.o device/sysdep_LINUX.o process/sysdep_LINUX.o 
y.tab.o lex.yy.o   -lfl -lpthread -lcrypt -lresolv -lnsl  -L/usr/lib 
-lssl -lcrypto -o monit 
[usr-1@srv-1 monit-4.7]$ 
[usr-1@srv-1 monit-4.7]$ su
Password: 
[root@srv-1 monit-4.7]# make install
/usr/bin/install -c  -m 755 -d /usr/local/bin || exit 1
/usr/bin/install -c  -m 755 -d /usr/local/man/man1 || exit 1
/usr/bin/install -c  -m 555 -s monit /usr/local/bin || exit 1
/usr/bin/install -c  -m 444 monit.1 /usr/local/man/man1/monit.1 || exit 1
[root@srv-1 monit-4.7]#
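
As a quick check after the install, it can help to confirm that the shell actually finds the new binary, since the output above shows it going to /usr/local/bin, which is not always on root's PATH. The helper below is only a sketch; find_tool is a made-up name for illustration, not something monit ships.

```shell
#!/bin/sh
# find_tool NAME: print the location of NAME if the shell can find it,
# otherwise print a hint to stderr and return 1.
# (find_tool is a hypothetical helper, not part of monit.)
find_tool() {
    if found=$(command -v "$1" 2>/dev/null); then
        printf '%s\n' "$found"
    else
        printf '%s not found on PATH (%s)\n' "$1" "$PATH" >&2
        return 1
    fi
}

# After the install above this is expected to print /usr/local/bin/monit,
# provided /usr/local/bin is on the PATH.
find_tool monit || echo "monit is not on PATH; try /usr/local/bin/monit"
```

If the binary is found, monit -V prints the version that was built.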

The configuration file is stored in /etc/monitrc. The top part of the file sets the polling interval, logging options, and web interface options. After that, add a section for each service to check and recover. Here is a sample config file that checks sshd:

 
[root@srv-1 usr-1]# cat /etc/monitrc
set daemon 120 # Poll at 2-minute intervals
set logfile syslog facility log_daemon
set alert root@localhost
set httpd port 2812 and use address localhost
    allow localhost   # Allow localhost to connect
    allow admin:monit # Allow Basic Auth
check process sshd with pidfile /var/run/sshd.pid
    start program  "/etc/init.d/sshd start"
    stop program  "/etc/init.d/sshd stop"
    if failed port 22 protocol ssh then restart
    if 5 restarts within 5 cycles then timeout
[root@srv-1 usr-1]#
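
Two things are worth checking before starting the daemon. Monit refuses to use a control file that is accessible by anyone other than its owner, and monit -t runs a syntax check on the file without starting anything. A minimal pre-flight sketch follows, assuming GNU stat; owner_only is a hypothetical helper name.

```shell
#!/bin/sh
# owner_only FILE: succeed if FILE grants no permissions to group or others.
# Relies on GNU stat (-c '%a'), which is an assumption about the platform.
# (owner_only is a hypothetical helper, not part of monit.)
owner_only() {
    mode=$(stat -c '%a' "$1") || return 2
    case "$mode" in
        *00|0) return 0 ;;   # e.g. 600 or 700
        *)     return 1 ;;
    esac
}

# Pre-flight for the control file shown above; adjust the path as needed.
rc=/etc/monitrc
if [ -f "$rc" ]; then
    owner_only "$rc" || echo "tighten permissions first: chmod 600 $rc"
    # monit -t parses the control file and reports syntax errors
    command -v monit >/dev/null 2>&1 && monit -t -c "$rc"
fi
```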

Let's start the monit daemon:

[root@srv-1 usr-1]# monit
Starting monit daemon with http interface at [localhost:2812]
[root@srv-1 usr-1]# 
[root@srv-1 usr-1]# tail /var/log/messages
Apr 27 08:36:20 srv-1 monit[3258]: Starting monit daemon with http interface 
at [localhost:2812] 
Apr 27 08:36:20 srv-1 monit[3260]: Starting monit HTTP server 
at [localhost:2812] 
Apr 27 08:36:20 srv-1 monit[3260]: monit HTTP server started 
Apr 27 08:36:20 srv-1 monit[3260]: Monit started 
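
Besides reading syslog, one way to confirm the daemon is still around later is to look at its pid file. The sketch below assumes the pid file is /var/run/monit.pid (the configure output above only names the /var/run directory, so the file name is an assumption) and that it is run as root; pid_alive is a hypothetical helper.

```shell
#!/bin/sh
# pid_alive PIDFILE: succeed if PIDFILE holds the pid of a process that
# this user is able to signal. Signal 0 only tests for existence, but it
# fails with a permission error for other users' processes, so run this
# as root when checking a root-owned daemon.
# (pid_alive is a hypothetical helper, not part of monit.)
pid_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a number
    esac
    kill -0 "$pid" 2>/dev/null
}

# Assumed pid file name; if the daemon is up, ask it for service status.
if pid_alive /var/run/monit.pid; then
    monit status
fi
```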

The login for the web console, as we set in the monitrc, is admin with a password of monit.

[Screenshot: login prompt]

[Screenshot: monit administration web console]

For a test, let's stop sshd and try to connect from another host:

[root@srv-1 usr-1]# /etc/init.d/sshd stop
Stopping sshd:                                             [  OK  ]
[root@srv-1 usr-1]# 
srv-5:~ usr4$ ssh usr-1@10.50.100.1
ssh: connect to host 10.50.100.1 port 22: Connection refused

Wait for the next polling cycle (up to 120 seconds with the settings above) and try to reconnect:

srv-5:~ usr4$ ssh usr-1@10.50.100.1
Last login: Thu Apr 27 08:37:52 2006 from 10.50.100.200
[usr-1@srv-1 ~]$
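
The wait-and-retry step can be scripted instead of done by hand. The following is a rough sketch of a generic retry loop; wait_until is a hypothetical helper, and the ssh usage in the comments assumes key-based login is already set up so that BatchMode can succeed without a password prompt.

```shell
#!/bin/sh
# wait_until TIMEOUT INTERVAL COMMAND...: run COMMAND every INTERVAL
# seconds until it succeeds or roughly TIMEOUT seconds have been spent
# sleeping. Returns 0 on success, 1 on timeout.
# (wait_until is a hypothetical helper, not part of monit.)
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    waited=0
    while ! "$@"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Usage from the second host, allowing a little more than one
# 120-second polling cycle for monit to notice and restart sshd:
#   wait_until 150 10 ssh -o BatchMode=yes -o ConnectTimeout=5 \
#       usr-1@10.50.100.1 true && echo "sshd is back"
```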

We are back in! The logs show that monit did what it was supposed to do:

 
Apr 27 08:52:25 srv-1 monit[3260]: 'sshd' process is not running
Apr 27 08:52:25 srv-1 monit[3260]: 'sshd' trying to restart
Apr 27 08:52:25 srv-1 monit[3260]: 'sshd' start: /etc/init.d/sshd
Apr 27 08:52:25 srv-1 sshd:  succeeded
Apr 27 08:54:25 srv-1 monit[3260]: 'sshd' process is running with pid 4113
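
To pull just these lines out of a busy /var/log/messages, a small filter is enough. This sketch assumes the log format shown above (monit[PID]: followed by the quoted service name); monit_events and count_restarts are hypothetical helper names.

```shell
#!/bin/sh
# monit_events SERVICE: read syslog lines on stdin and print only the
# messages monit logged about SERVICE (monit quotes the name: 'sshd').
# SERVICE is used inside a regular expression, so plain names work best.
monit_events() {
    grep "monit\[[0-9]*]: '$1' "
}

# count_restarts SERVICE: count monit's restart attempts for SERVICE
# among the syslog lines on stdin.
count_restarts() {
    monit_events "$1" | grep -c 'trying to restart'
}

# Usage:
#   monit_events sshd < /var/log/messages
#   count_restarts sshd < /var/log/messages
```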

Rock!


Microsoft, Windows, Windows XP, Windows 2003, Windows 2000, and NT are either trademarks or registered trademarks of Microsoft Corporation. NetAdminTools.com is not affiliated with Microsoft Corporation. Linux is a registered trademark of Linus Torvalds, and refers to the Linux kernel. The operating system of most distributions that contain the Linux kernel is GNU/Linux. All logos and trademarks in this site are property of their respective owner. Copyright 1997-2010 NetAdminTools.com