Exim, Perl and SNMP! Oh my!
Working with Net-SNMP Extend.
Ian Norton, Shadowcat Systems Ltd.
What's this talk about?
● Using Net-SNMP extend
● Monitoring arbitrary things
● Using Exim as an example
● Cleverness with OpenNMS
Email systems are evil.
Exim is less evil :)
Exim
● Open source
● Mail Transport Agent (MTA)
● Mail Delivery Agent (MDA)
● Cambridge University
In a previous life....
University infrastructure
[Diagram: Internet, mail hubs, Exchange, Unix mail, Dept 1, Dept 2]
Standard service monitoring
● Connect TCP port 25
● Check banner
● How long did it take?
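A minimal sketch of this kind of check, assuming a plain TCP connection is all the monitor makes. The sub names `banner_ok` and `check_smtp` are hypothetical, not from the talk. Note that it only proves the listener answers with a greeting; it says nothing about whether messages are actually being accepted.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Time::HiRes qw(time);

# True if the greeting line looks like an SMTP "service ready" (220) banner.
sub banner_ok {
    my ($banner) = @_;
    return ( defined($banner) && $banner =~ /^220[ -]/ ) ? 1 : 0;
}

# Connect, read the first line, and time the exchange.
# Returns (ok, seconds_taken, banner).
# Timeout only covers the connect; the read itself is not time-limited here.
sub check_smtp {
    my ( $host, $port, $timeout ) = @_;
    my $start = time();
    my $sock  = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,
    ) or return ( 0, time() - $start, undef );
    my $banner = <$sock>;
    close($sock);
    return ( banner_ok($banner), time() - $start, $banner );
}
```

It could be called as `my ($ok, $secs, $banner) = check_smtp('127.0.0.1', 25, 3);`.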
And so it begins...
● Messages delayed
● 4xx temporary failures
● Badly behaved senders don't retry
Everything is fine!
The monitoring says so!
idn facepalms
Everything is not fine
● One hub has an issue
● Spam Assassin is dead
● No problem identified
● The monitoring is wrong
Service was affected. Messages were delayed.
#fail
So what's the problem?
● Spam Assassin process went away
● Exim unable to scan messages
● Temporary reject (safe thing to do)
How do we test?
Testing
● Spam Assassin provides spamc
● Create a batch SMTP file:
    helo testing.example.com
    MAIL FROM: "Person, A" <[email protected]>
    RCPT TO: [email protected]
    DATA
    Subject: Testing viagra
    To: [email protected]
    From: "Person, A" <[email protected]>

    Test SMTP session
    .
    QUIT
Testing
● spamc -c -B < bsmtp-file
● Exit status zero or one
Enter Perl
● List of files to test & expected return
● Test files and check return
    use English qw(-no_match_vars);    # provides $CHILD_ERROR

    # Files to test along with expected exit value
    my $files = {
        "01spamassassinnormalmessage.txt" => 0,
        "02spamassassinspammymessage.txt" => 1,
    };

    my $fail = 0;

    # Scan each of the files in the hash
    foreach my $file (keys(%$files)) {

        # Check the file
        system("/usr/bin/spamc -c -B < $file > /dev/null");

        # Bitshift $? to get the exit code
        my $exitval = $CHILD_ERROR >> 8;

        # Compare the exit value to that expected
        if($exitval != $files->{$file}) {
            $fail = 1;
            last;
        }
    }
[Callouts on the code: the hash keys are the files to test; the hash values are the expected exit values]
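The bitshift works because the status Perl leaves in `$?` packs several things into one number. A small sketch of the decoding, with a hypothetical helper name `decode_status`; it assumes the command actually started (if it could not be run at all, `$?` is -1 and the fields below are meaningless).

```perl
use strict;
use warnings;

# Split a wait status (the value Perl leaves in $?) into its parts.
sub decode_status {
    my ($status) = @_;
    return {
        exit_code => $status >> 8,                 # high byte: the exit value
        signal    => $status & 127,                # low 7 bits: killing signal, if any
        core_dump => ( $status & 128 ) ? 1 : 0,    # bit 7: core dump flag
    };
}

# Demonstration using the running perl itself as the child process.
system( $^X, '-e', 'exit 1' );
my $result = decode_status($?);
print "exit code: $result->{exit_code}\n";
```

Checking the signal as well as the exit code distinguishes "spamc said spam" from "spamc was killed".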
Write status to a file
    open(my $fh, '>', '/tmp/sastatus') or die('Cannot open file');
    print($fh $fail);
    close($fh);
So we know #fail.
Now what?!
SNMP
● Add SNMP!
● snmpd.conf
    extend sa-status /bin/cat /tmp/sastatus
SNMP
● snmpwalk
    $ snmpwalk -v2c -c public 127.0.0.1 .1.3.6.1.4.1.8072.1.3.2
    NET-SNMP-EXTEND-MIB::nsExtendNumEntries.0 = INTEGER: 1
    NET-SNMP-EXTEND-MIB::nsExtendCommand."sa-status" = STRING: /bin/cat
    NET-SNMP-EXTEND-MIB::nsExtendArgs."sa-status" = STRING: /tmp/sastatus
    NET-SNMP-EXTEND-MIB::nsExtendInput."sa-status" = STRING:
    NET-SNMP-EXTEND-MIB::nsExtendCacheTime."sa-status" = INTEGER: 5
    NET-SNMP-EXTEND-MIB::nsExtendExecType."sa-status" = INTEGER: exec(1)
    NET-SNMP-EXTEND-MIB::nsExtendRunType."sa-status" = INTEGER: run-on-read(1)
    NET-SNMP-EXTEND-MIB::nsExtendStorage."sa-status" = INTEGER: permanent(4)
    NET-SNMP-EXTEND-MIB::nsExtendStatus."sa-status" = INTEGER: active(1)
    NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."sa-status" = STRING: 1
    NET-SNMP-EXTEND-MIB::nsExtendOutputFull."sa-status" = STRING: 1
    NET-SNMP-EXTEND-MIB::nsExtendOutNumLines."sa-status" = INTEGER: 1
    NET-SNMP-EXTEND-MIB::nsExtendResult."sa-status" = INTEGER: 0
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."sa-status".1 = STRING: 1
Now we can monitor!
OpenNMS
● poller-configuration.xml
    <service name="SAStatus" interval="300000" user-defined="false" status="on">
      <parameter key="retry" value="2"/>
      <parameter key="timeout" value="3000"/>
      <parameter key="port" value="161"/>
      <parameter key="oid" value=".1.3.6.1.4.1.8072.1.3.2.4.1.2.9.115.97.45.115.116.97.116.117.115.1"/>
      <parameter key="operator" value="="/>
      <parameter key="operand" value="0"/>
    </service>

    <monitor service="SAStatus" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor" />
[Callout on the oid value: command output]
[Callout on the oid value: external command number]
Erk.
#fail.
+++OUT OF CHEESE+++
[Callout on the oid value: length of identifier "sa-status"]
Thanks to roskens on #opennms for pointing out that this was wrong.
Cheers! :)
[Callout on the oid value: "sa-status"]
[Callout on the oid value: line number of command output]
[All callouts on the oid value together: command output; length of identifier "sa-status"; "sa-status"; line number of command output]
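The oid value in the poller configuration is made up of the nsExtendOutLine column, the length of the extend name, the name's character codes, and the output line number. A short sketch that builds it rather than typing it by hand; the sub name `extend_outline_oid` is hypothetical.

```perl
use strict;
use warnings;

# Column OID of NET-SNMP-EXTEND-MIB::nsExtendOutLine
my $NS_EXTEND_OUT_LINE = '.1.3.6.1.4.1.8072.1.3.2.4.1.2';

# Build the numeric OID for one output line of a named extend entry:
# column, length of the name, one sub-identifier per character, line number.
sub extend_outline_oid {
    my ( $name, $line ) = @_;
    return join( '.',
        $NS_EXTEND_OUT_LINE, length($name),
        unpack( 'C*', $name ), $line );
}

print extend_outline_oid( 'sa-status', 1 ), "\n";
# prints .1.3.6.1.4.1.8072.1.3.2.4.1.2.9.115.97.45.115.116.97.116.117.115.1
```

The 45 in the middle is the hyphen in "sa-status", and the leading 9 is the length of the name, not a command number.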
Provisioning
● Foreign source
● Node
That works!
But it sucks!
Issues
● Stale file?
● Not SNMP tables - no way to instance map
● Works for simple data
● Can have multiple lines
● Assumes ordering with multiple lines
Issues
● Want to add a service
● Want to remove a service
● Want to re-order my file (OCD attack)
● ...
Mail queue size
● Want to track destination domains
● Which are queueing
● Which are local, remote, etc
Our single file approach does not scale.
Add cloud more files!
Output to two files
● Keys
● Values
● The data can change
● OpenNMS will map together for us
● Add a timestamp
Output to two files
    use IO::File;
    use Carp;
    use English qw(-no_match_vars);    # provides $OS_ERROR

    # Open the files to work with
    my $file = $path . $counter . '_';
    my $keys_file = IO::File->new( $file . 'keys', 'w' ) or croak($OS_ERROR);
    my $stats_file = IO::File->new( $file . 'stats', 'w' ) or croak($OS_ERROR);

    my $timestamp = time();

    # Output the current timestamp to both files
    $keys_file->print("$timestamp\n");
    $stats_file->print("$timestamp\n");

    # Output the keys and stats to the relevant files.
    foreach my $key ( sort( keys( %{$stats} ) ) ) {
        $keys_file->print("$key\n");
        $stats_file->print( $stats->{$key} . "\n" );
    }

    # Close the IO::File objects
    $keys_file->close();
    $stats_file->close();
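The timestamp on the first line of each file gives a reader a way to tell whether the two files belong to the same run. A sketch of the reading side, assuming the layout written above (timestamp first, then one entry per line); the sub names `read_stamped_file` and `pair_keys_and_stats` are hypothetical, and in the talk's setup it is OpenNMS, not Perl, that does this mapping.

```perl
use strict;
use warnings;

# Read a file laid out as: timestamp on the first line, then one entry per line.
sub read_stamped_file {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die("Cannot open $path: $!\n");
    chomp( my @lines = <$fh> );
    close($fh);
    die("$path is empty\n") unless @lines;
    my $timestamp = shift(@lines);
    return ( $timestamp, \@lines );
}

# Pair keys with values by line position, refusing mismatched files.
sub pair_keys_and_stats {
    my ( $keys_path, $stats_path ) = @_;
    my ( $keys_time,  $keys )  = read_stamped_file($keys_path);
    my ( $stats_time, $stats ) = read_stamped_file($stats_path);

    die("Timestamps differ\n")  if $keys_time ne $stats_time;
    die("Line counts differ\n") if @{$keys} != @{$stats};

    my %paired;
    @paired{ @{$keys} } = @{$stats};    # hash slice: key N gets value N
    return \%paired;
}
```

It could be called as `my $queue = pair_keys_and_stats('/tmp/mailq_keys', '/tmp/mailq_stats');`. Pairing by position is only safe because both files are written in the same loop.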
Generate the data
● Exim has exipick
● --flatq option is designed for parsing
● Options for extra data
Generate the data
    # Set the exipick binary & CLI options
    my $exipick_cmd = "/usr/sbin/exipick";
    my $exipick_opt = "--flatq --show-vars message_size,deliver_freeze";

    # Initialise interesting domain counters and the two catchall domains.
    my %domains = (
        'exchange.example.com' => 0,
        'dept1.example.com' => 0,
        '*.example.com' => 0,    # Catchall for other example.com
        'internal' => 0,         # Counter for all internal
        'external' => 0,         # Catchall for everything else
    );

    # Run the exipick command and process the output.
    open(my $exipick, '-|', "$exipick_cmd $exipick_opt") or die("Cannot run exipick: $!");

    # Loop through the command output
    while(<$exipick>) {
        chomp;

        my $line = $_;

        # Split the show-vars output with spaces rather than semicolons
        $line =~ s/;/ /g;

        # Send the exipick line to be processed.
        process_line($line);
    }

    close($exipick);
Generate the data
    sub process_line {
        my $line = shift;

        # Sample exipick output line with email address changed to anonymise:
        # 4d message_size='9265' deliver_freeze='' 1E9hWO-0001AH-4A <> [email protected]

        # Split the output line using a regexp
        if($line =~ m/.* message_size='(\d*)' deliver_freeze='(.*)' (.*-.*-.*) <(.*)> (.*)/){
            my $msg_size = $1;
            my $msg_frozen = $2;
            my $msg_id = $3;
            my $msg_sender = $4;
            my $msg_recipient = $5;

            # Send the recipient email address to process domain for a breakdown.
            process_domain($msg_recipient);
        }
    }
Generate the data
    sub process_domain {
        my $email = shift;

        # Get the domain from the email address.
        my $domain = $email;
        $domain =~ s/^.*@//;

        # Is this an internal domain?
        if($domain =~ m/example\.com$/) {
            $domains{'internal'}++;
        }

        # If this domain exists, increment the counter for that domain.
        if(defined($domains{$domain})) {
            $domains{$domain}++;
        }

        # Increment the catchall counter for *.example.com addresses
        elsif ($domain =~ m/example\.com$/) {
            $domains{'*.example.com'}++;
        }

        # Increment the external counter for everything else
        else {
            $domains{'external'}++;
        }
    }
Generate the data
● Generates this data:
    $VAR1 = {
        'dept1.example.com' => 34,
        'exchange.example.com' => 26,
        'external' => 43,
        'internal' => 110,
        '*.example.com' => 50
    };
● Which writes our two files
    /tmp/mailq_keys
    /tmp/mailq_stats
SNMP
● snmpd.conf
    extend mailq-keys /bin/cat /tmp/mailq_keys
    extend mailq-stats /bin/cat /tmp/mailq_stats
SNMP
● snmpwalk
    $ snmpwalk -v2c -c public 127.0.0.1 NET-SNMP-EXTEND-MIB::nsExtendOutLine
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-keys".1 = STRING: 1534221421134720
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-keys".2 = STRING: *.example.com
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-keys".3 = STRING: exchange.example.com
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-keys".4 = STRING: dept1.example.com
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-keys".5 = STRING: internal
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-keys".6 = STRING: external
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-stats".1 = STRING: 1534221421134720
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-stats".2 = STRING: 50
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-stats".3 = STRING: 26
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-stats".4 = STRING: 34
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-stats".5 = STRING: 110
    NET-SNMP-EXTEND-MIB::nsExtendOutLine."mailq-stats".6 = STRING: 43
Now we can monitor!
OpenNMS
    <resourceType name="eximMailQueueInst" label="Exim mail queue">
      <persistenceSelectorStrategy class="org.opennms.netmgt.collectd.PersistRegexSelectorStrategy">
        <parameter key="match-expression" value="not(#eximMailQueueKey matches '^\d+$')" />
      </persistenceSelectorStrategy>

      <storageStrategy class="org.opennms.netmgt.dao.support.SiblingColumnStorageStrategy">
        <parameter key="sibling-column-name" value="eximMailQueueKey" />
        <parameter key="replace-all" value="s/\s/_/" />
      </storageStrategy>
    </resourceType>
OpenNMS
    <group name="eximmailq" ifType="ignore">
      <mibObj oid=".1.3.6.1.4.1.8072.1.3.2.4.1.2.10.109.97.105.108.113.45.107.101.121.115"
              instance="eximMailQueueInst" alias="eximMailQueueKey" type="string" />
      <mibObj oid=".1.3.6.1.4.1.8072.1.3.2.4.1.2.11.109.97.105.108.113.45.115.116.97.116.115"
              instance="eximMailQueueInst" alias="eximMailQueueStat" type="gauge" />
    </group>

    <systemDef name="Exim monitoring">
      <sysoidMask>.1.3.6.1.4.1.8072.3.2.10</sysoidMask>
      <collect>
        <includeGroup>eximmailq</includeGroup>
      </collect>
    </systemDef>
Graphs!
    # Reports list for Exim
    reports=exim.eximMailQueueKey, exim.eximMailQueueStat

    report.exim.eximMailQueueStat.name=Messages
    report.exim.eximMailQueueStat.columns=eximMailQueueStat
    report.exim.eximMailQueueStat.propertiesValues=eximMailQueueKey
    report.exim.eximMailQueueStat.type=eximMailQueueInst
    report.exim.eximMailQueueStat.command=--title="{eximMailQueueKey}" \
     DEF:messages={rrd1}:eximMailQueueStat:AVERAGE \
     LINE2:messages#0000ff:"Messages" \
     GPRINT:messages:AVERAGE:" Avg \\: %2.0lf %s" \
     GPRINT:messages:MIN:" Min \\: %2.0lf %s" \
     GPRINT:messages:MAX:" Max \\: %2.0lf %s\\n"
OpenNMS
● Now we get thresholding
● And all the fun of the NMS!
Further expansion
● Antivirus
● Total number of messages
● Log file parsing
● Message transit times
● RBL monitoring
Questions?
Thanks!
Slides available at: http://agaton.scsys.co.uk/~iann/talks/
idn on irc.perl.org and irc.freenode.net