Xymon Mailing List Archive search

SMTP checks

12 messages in this thread

list Ian Diddams · Tue, 28 Sep 2021 13:52:12 +0000 (UTC) ·
xymon server 4.3.28, centos 7.9
mail server centos 6 (about to be replaced anyway)

I'm in a bit of a quandary...
we have an smtp check of a server ("oscar").

Literally for "years" (3+) the xymon in house smtp check has been more than happy.

Then since Sept 11th [1] its been flipping constantly beyween green and happy to red 
eg
Service listening but unavailable (connect timeout)


Seconds: 0.001842000
But if I run a telnet test from the xymon server itself it always connects immediately, and responds instantaneously.

eg

$ telnet oscar 25
Trying <redacted>...Connected to oscar.
Escape character is '^]'.
220 a.b.co.uk ESMTP Postfix
mail
501 5.5.4 Syntax: MAIL FROM:<address>
quit
221 2.0.0 Bye
Connection closed by foreign host.
$


All xymon logs show are emails being sent because of the alerts, ad falling statuses that are a result of the constant flipping from red to green. maillog has nothing untoward, ditto messages.

Anyone any ideas what on earth is suddenly occurring that may explain this recent timeout issue?? any expereince of similar?
didds


[1] no changes made around that time etc.? Networks confirm nothing shoing in their logs etc.
list Dennis Riley · Tue, 28 Sep 2021 11:13:25 -0400 ·
Could some other machine have been brought up using the same IP? Or
"Oscar" have more than one IP associated with it?

-----Original Message-----
From: Xymon [mailto:xymon-bounces at xymon.com] On Behalf Of Ian Diddams via
Xymon
Sent: Tuesday, September 28, 2021 9:52 AM
To: xymon <xymon at xymon.com>
Subject: [Xymon] SMTP checks
list Jeremy Laidman · Wed, 29 Sep 2021 17:54:17 +1000 ·
Ian

[disclaimer: I don't use the smtp check]

I'm wondering if the "connect timeout" relates not to the TCP connection
but the SMTP conversation. I would do a packet capture during the
interaction between xymonnet and the mail server, and see if it closely
matches your telnet interaction, or deviates.

The smtp check is defined in protocols.cfg

[smtp]
   send "mail\r\nquit\r\n"
   expect "220"
   options banner
   port 25

This expects a 220 in the response after the mail\r\nquit\r\n is sent:

'*It will then expect a response beginning with "220".*' (my emphasis on
the word "then")

Your interaction shows 221 instead of 220, after the mail/quit strings. But
if this is the problem, then protocols.cfg has been wrong for many, many
years. I suspect (contrary to what I infer from the man page) that "expect"
looks in the strings for the whole interaction, not just the ones after the
"send" string is sent. In which case, the protocols.cfg entry for smtp
should work fine.

However, you might like to test removing the "expect" or altering to look
for 221 instead of 220.

Cheers
Jeremy


On Wed, 29 Sept 2021 at 00:22, Ian Diddams via Xymon <xymon at xymon.com>
wrote:
---------- Forwarded message ----------
From: Ian Diddams <user-7fbf34ed5219@xymon.invalid>
To: xymon <xymon at xymon.com>
Cc:
Bcc:
Date: Tue, 28 Sep 2021 13:52:12 +0000 (UTC)
Subject: SMTP checks
quoted from Ian Diddams
xymon server 4.3.28, centos 7.9
mail server centos 6 (about to be replaced anyway)

I'm in a bit of a quandary...

we have an smtp check of a server ("oscar").

Literally for "years" (3+) the xymon in house smtp check has been more
than happy.

Then since Sept 11th [1] its been flipping constantly beyween green and
happy to red
eg

Service listening but unavailable (connect timeout)


Seconds: 0.001842000


But if I run a telnet test from the xymon server itself it always connects
immediately, and responds instantaneously.

eg

$ telnet oscar 25

Trying <redacted>...
Connected to oscar.
quoted from Ian Diddams
Escape character is '^]'.
220 a.b.co.uk ESMTP Postfix
mail
501 5.5.4 Syntax: MAIL FROM:<address>
quit
221 2.0.0 Bye
Connection closed by foreign host.
$


All xymon logs show are emails being sent because of the alerts, ad
falling statuses that are a result of the constant flipping from red to
green. maillog has nothing untoward, ditto messages.

Anyone any ideas what on earth is suddenly occurring that may explain this
recent timeout issue?  any expereince of similar?

didds


[1] no changes made around that time etc.  Networks confirm nothing shoing
in their logs etc.


---------- Forwarded message ----------
From: Ian Diddams via Xymon <xymon at xymon.com>
To: xymon <xymon at xymon.com>
Cc:
Bcc:
Date: Tue, 28 Sep 2021 13:52:12 +0000 (UTC)
Subject: [Xymon] SMTP checks

list Jeremy Laidman · Wed, 29 Sep 2021 17:56:35 +1000 ·
Oh, by the way, the latest Xymon uses "ehlo xymonnet\r\nquit\r\n" as the
send string (with the same expect string). I doubt that would make a
difference. But you might be running a buggy xymonnet.
quoted from Jeremy Laidman

On Wed, 29 Sept 2021 at 17:54, Jeremy Laidman <user-0608abae5e7c@xymon.invalid> wrote:
Ian

[disclaimer: I don't use the smtp check]

I'm wondering if the "connect timeout" relates not to the TCP connection
but the SMTP conversation. I would do a packet capture during the
interaction between xymonnet and the mail server, and see if it closely
matches your telnet interaction, or deviates.

The smtp check is defined in protocols.cfg

[smtp]
   send "mail\r\nquit\r\n"
   expect "220"
   options banner
   port 25

This expects a 220 in the response after the mail\r\nquit\r\n is sent:

'*It will then expect a response beginning with "220".*' (my emphasis on
the word "then")

Your interaction shows 221 instead of 220, after the mail/quit strings.
But if this is the problem, then protocols.cfg has been wrong for many,
many years. I suspect (contrary to what I infer from the man page) that
"expect" looks in the strings for the whole interaction, not just the ones
after the "send" string is sent. In which case, the protocols.cfg entry for
smtp should work fine.

However, you might like to test removing the "expect" or altering to look
for 221 instead of 220.

Cheers
Jeremy


On Wed, 29 Sept 2021 at 00:22, Ian Diddams via Xymon <xymon at xymon.com>
wrote:
---------- Forwarded message ----------
From: Ian Diddams <user-7fbf34ed5219@xymon.invalid>
To: xymon <xymon at xymon.com>
Cc:
Bcc:
Date: Tue, 28 Sep 2021 13:52:12 +0000 (UTC)
Subject: SMTP checks
xymon server 4.3.28, centos 7.9
mail server centos 6 (about to be replaced anyway)

I'm in a bit of a quandary...

we have an smtp check of a server ("oscar").

Literally for "years" (3+) the xymon in house smtp check has been more
than happy.

Then since Sept 11th [1] its been flipping constantly beyween green and
happy to red
eg

Service listening but unavailable (connect timeout)


Seconds: 0.001842000


But if I run a telnet test from the xymon server itself it always
connects immediately, and responds instantaneously.

eg

$ telnet oscar 25
Trying <redacted>...
Connected to oscar.
Escape character is '^]'.
220 a.b.co.uk ESMTP Postfix
mail
501 5.5.4 Syntax: MAIL FROM:<address>
quit
221 2.0.0 Bye
Connection closed by foreign host.
$


All xymon logs show are emails being sent because of the alerts, ad
falling statuses that are a result of the constant flipping from red to
green. maillog has nothing untoward, ditto messages.

Anyone any ideas what on earth is suddenly occurring that may explain
this recent timeout issue?  any expereince of similar?

didds


[1] no changes made around that time etc.  Networks confirm nothing
shoing in their logs etc.


---------- Forwarded message ----------
From: Ian Diddams via Xymon <xymon at xymon.com>
To: xymon <xymon at xymon.com>
Cc:
Bcc:
Date: Tue, 28 Sep 2021 13:52:12 +0000 (UTC)
Subject: [Xymon] SMTP checks

list Ian Diddams · Wed, 29 Sep 2021 09:58:45 +0000 (UTC) ·
secondary question...

what triggers the smtp check?

Ive had instances where the test goes red, and then just a few seconds later back to green.? So this isnt a standard "every 5 minutes" clearly ?

...
Wed Sep 29 08:44:39 2021 green 0:02:13 
Wed Sep 29 08:44:15 2021 red 0:00:24 
Wed Sep 29 07:44:36 2021 green 0:59:39 
...

anybody know?
cheers
ian
list Ian Diddams · Wed, 29 Sep 2021 10:18:19 +0000 (UTC) ·
 I also can see from the maillog that when the smtp check occurs 

Sep 29 11:05:01 oscar postfix/smtpd[27269]: connect from unknown[x.x.x.x]
Sep 29 11:05:01 oscar postfix/smtpd[27269]: improper command pipelining after MAIL from unknown [x.x.x.x]Sep 29 11:05:01 oscar postfix/smtpd[27269]: disconnect from unknow [x.x.x.x]

so obviously the smtp check is sending something postfix doesnt like/handle.? That isnt a bhiggy in itself - bgut it nojw makes me wonder out of curiosity exactly what the logic of? the smtp check is..

I also tried Jeremy's example of "ehlo xymonnet\r\nquit\r\n" in protocols.cfg - that also has the same warning/error in maillog 
?improper command pipelining after EHLO from unknown[x.x.x.x]

all jolly wierd,....
ian


    On Wednesday, 29 September 2021, 10:59:09 BST, Ian Diddams via Xymon <xymon at xymon.com> wrote:    secondary question...
quoted from Ian Diddams

what triggers the smtp check?

Ive had instances where the test goes red, and then just a few seconds later back to green.? So this isnt a standard "every 5 minutes" clearly ?

...
Wed Sep 29 08:44:39 2021 green 0:02:13 Wed Sep 29 08:44:15 2021 red 0:00:24 Wed Sep 29 07:44:36 2021 green 0:59:39 ...

anybody know?
cheers
ian
list Lars Kollstedt · Wed, 29 Sep 2021 13:38:49 +0200 ·
On Mittwoch, 29. September 2021 12:18:19 CEST Ian Diddams via Xymon wrote:
 improper command pipelining after EHLO from unknown[x.x.x.x]
Hello Ian,

the following works for me without this log message:

[smtp]
   send "ehlo xymonnet\r\n"
   expect "220"
   send "quit\r\n"
   options banner
   port 25

But I have the sends and expects commented out for smtps (with authentication) 
due to TLS 1.3 issues at the moment.

Kind regards,
	Lars

-- 
Lars Kollstedt

Telefon: +49 6151 16-71027
E-Mail:  user-0f90394071da@xymon.invalid

man-da.de GmbH
Dolivostra?e 11
64293 Darmstadt

Sitz der Gesellschaft: Darmstadt
Registergericht: Amtsgericht Darmstadt
Handelsregisternummer: HRB 9484
Gesch?ftsf?hrer: Andreas Ebert
list Jeremy Laidman · Wed, 29 Sep 2021 22:31:03 +1000 ·
Lars, I could be wrong, but I think I remember someone posting that the
send/expect doesn't support multiples for a single protocol. So your
send/expect/send sequence might just be a send/expect sequence.

If I append a second expect and send after the first instances in the
[smtp] section (of protocols.cfg) and then run "xymonnet --dump=before" I
see this:

10007 2021-09-29 22:05:48  Name      : smtp
10007 2021-09-29 22:05:48    Sendtext: extra-send\r\n
10007 2021-09-29 22:05:48    Sendlen : 12
10007 2021-09-29 22:05:48    Exp.text: extra-expect
10007 2021-09-29 22:05:48    Exp.len : 12
10007 2021-09-29 22:05:48    Exp.ofs : 0
10007 2021-09-29 22:05:48    Flags   : 1
10007 2021-09-29 22:05:48    Port    : 25

It has ignored the original instances, and only sees the ones I added
after. So it looks like xymonnet only takes the last send, and last expect,
that are defined for a protocol.

If this is the case, your [smtp] definition is essentially "send quit\r\n"
and "expect 220".

In fact, running xymonnet via strace, with your configuration, shows this:

connect(3, {sa_family=AF_INET, sin_port=htons(25),
sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in
progress)
write(3, "quit\r\n", 6)                 = 6
read(3, "220 <servername> ESMTP Postfix\r\n221 2.0.0 Bye\r\n", 4095) = 60
close(3)

The "ehlo string" was completely ignored, and was not sent to the mail
server.

Also, note that the quit was sent BEFORE the banner was read. This suggests
to me that the "expect" looks at the banner that is sent before the "send"
command is sent. In fact, the only way I could make my test go green was to
either remove "--checkresponse" or to match the start of the banner, which
is "220". I can match stuff after the 220 (eg expect "220 <servername>
ESMTP") but whatever is in the expect string must be at the start of the
banner.

J
quoted from Lars Kollstedt

On Wed, 29 Sept 2021 at 21:39, Lars Kollstedt <user-0f90394071da@xymon.invalid> wrote:
On Mittwoch, 29. September 2021 12:18:19 CEST Ian Diddams via Xymon wrote:
 improper command pipelining after EHLO from unknown[x.x.x.x]
Hello Ian,

the following works for me without this log message:

[smtp]
   send "ehlo xymonnet\r\n"
   expect "220"
   send "quit\r\n"
   options banner
   port 25

But I have the sends and expects commented out for smtps (with
authentication)
due to TLS 1.3 issues at the moment.

Kind regards,
        Lars

--
Lars Kollstedt

Telefon: +49 6151 16-71027
E-Mail:  user-0f90394071da@xymon.invalid

man-da.de GmbH
Dolivostra?e 11
64293 Darmstadt

Sitz der Gesellschaft: Darmstadt
Registergericht: Amtsgericht Darmstadt
Handelsregisternummer: HRB 9484
Gesch?ftsf?hrer: Andreas Ebert

list Jeremy Laidman · Wed, 29 Sep 2021 22:42:18 +1000 ·
Ian

I believe the postfix warning is because there's a "mail" command before a
"helo" or "ehlo" command. Pipelining is when Postfix checks to see that the
mail client is waiting for the response to one command, before sending the
next command, rather than just sending a "pipeline" of commands without
checking the results of each. Spamming robots do this kind of thing. You
can either modify the postfix configuration (see reject_unauth_pipelining),
or adjust the [smtp] protocol to not start sending a mail message. I'd
suggest sending "ehlo xymonnet\r\nquit\r\n".

What triggers the smtp check is:
a) you have "smtp" on the host's line in hosts.cfg, and
b) you have xymonnet being launched every 5 minutes, which reads hosts.cfg,
sees "smtp" and looks up the [smtp] section in protocols.cfg

When there's a test failure, xymonnet-again runs every minute and does a
rapid re-check of any failed tests. So a test that goes red can go green
again a minute later. In fact, xymonnet and xymonnet-again don't seem to be
synchronised at all, so I guess it's possible for xymonnet to run
immediately after xymonnet has finished.

J

On Wed, 29 Sept 2021 at 20:18, Ian Diddams via Xymon <xymon at xymon.com>
quoted from Jeremy Laidman
wrote:
---------- Forwarded message ----------
From: Ian Diddams <user-7fbf34ed5219@xymon.invalid>
To: xymon <xymon at xymon.com>
Cc:
Bcc:

Date: Wed, 29 Sep 2021 10:18:19 +0000 (UTC)
Subject: Re: [Xymon] SMTP checks
quoted from Ian Diddams
I also can see from the maillog that when the smtp check occurs


Sep 29 11:05:01 oscar postfix/smtpd[27269]: connect from unknown[x.x.x.x]
Sep 29 11:05:01 oscar postfix/smtpd[27269]: improper command pipelining

after MAIL from unknown [x.x.x.x]
Sep 29 11:05:01 oscar postfix/smtpd[27269]: disconnect from unknow
quoted from Ian Diddams
[x.x.x.x]

so obviously the smtp check is sending something postfix doesnt
like/handle.  That isnt a bhiggy in itself - bgut it nojw makes me wonder
out of curiosity exactly what the logic of  the smtp check is..

I also tried Jeremy's example of "ehlo xymonnet\r\nquit\r\n" in
protocols.cfg - that also has the same warning/error in maillog

 improper command pipelining after EHLO from unknown[x.x.x.x]

all jolly wierd,....

ian


On Wednesday, 29 September 2021, 10:59:09 BST, Ian Diddams via Xymon <
xymon at xymon.com> wrote:


secondary question...

what triggers the smtp check?

Ive had instances where the test goes red, and then just a few seconds
later back to green.  So this isnt a standard "every 5 minutes" clearly ?

...
Wed Sep 29 08:44:39 2021 green 0:02:13
Wed Sep 29 08:44:15 2021 red 0:00:24
Wed Sep 29 07:44:36 2021 green 0:59:39
...

anybody know?

cheers

ian


---------- Forwarded message ----------
From: Ian Diddams via Xymon <xymon at xymon.com>
To: xymon <xymon at xymon.com>
Cc:
Bcc:

Date: Wed, 29 Sep 2021 10:18:19 +0000 (UTC)
Subject: Re: [Xymon] SMTP checks

list R P Herrold · Wed, 29 Sep 2021 12:36:54 -0400 (EDT) ·
quoted from Jeremy Laidman
On Wed, 29 Sep 2021, Jeremy Laidman wrote:
The smtp check is defined in protocols.cfg

[smtp]
   send "mail\r\nquit\r\n"
   expect "220"
   options banner
   port 25

This expects a 220 in the response after the mail\r\nquit\r\n is sent:
The RFC requires waiting until "220 " .... note the trailing 
spacce

-- Russ herrold
list Lars Kollstedt · Fri, 01 Oct 2021 10:57:39 +0200 ·
quoted from Jeremy Laidman
On Mittwoch, 29. September 2021 14:31:03 CEST Jeremy Laidman wrote:
read(3, "220 <servername> ESMTP Postfix\r\n221 2.0.0 Bye\r\n", 4095) = 60
close(3)

The "ehlo string" was completely ignored, and was not sent to the mail
server.
Hi Jeremy,

The message above is also what I'm seeing in the Xymon message, and you are right. There is of course no output from the EHLO response.

This change avoids the log message on the mailserver, but the xymon server only sends quit, and looks at the initial greeting banner. Which is still better than nothing. But doesn?t he with the other configurations? And probably I could leave the ehlo simply away.

The piplining log messages are thrown on the mailserver because the xymon server sends multiple commands on the same time. But if you are sending only ehlo the Mailserver will complain about inpropperly closed sessions.
quoted from Jeremy Laidman

Kind regards
	Lars

-- 
Lars Kollstedt

Telefon: +49 6151 16-71027
E-Mail:  user-0f90394071da@xymon.invalid

man-da.de GmbH
Dolivostra?e 11
64293 Darmstadt

Sitz der Gesellschaft: Darmstadt
Registergericht: Amtsgericht Darmstadt
Handelsregisternummer: HRB 9484
Gesch?ftsf?hrer: Andreas Ebert
list Ian Diddams · Fri, 1 Oct 2021 12:12:22 +0000 (UTC) ·
Mystery solved!

There was a malformed subnet entry in postfix main.cf which when sorted cleared all the alerts up!
cheers all