David Cook
2004-01-02 18:02:23 UTC
Aloha fellow xinetd users...
We've identified a potential problem in the current version of xinetd (2.3.12)
that may be a defect in the code itself, or possibly a new form of DoS attack.
We see the following on our console:
Jan 1 17:14:31 puna xinetd[16732]: [ID 385394 daemon.error] Deactivating service ftp due to excessive incoming connections. Restarting in 15 seconds.
Jan 1 17:14:46 puna xinetd[16732]: [ID 385394 daemon.error] Activating service ftp
Jan 1 17:15:41 puna xinetd[16732]: [ID 254256 daemon.error] service pop3, accept: Too many open files (errno = 24)
Jan 1 17:15:45 last message repeated 52347 times
If you look at the times... you can see that a user hitting our FTP caused
it to deactivate at 17:14:31 and then it came back up (as it should) 15
seconds later. I did a netstat and confirmed that indeed, a user from France
had been bombarding our FTP ports - so that's correct.
However, approximately one minute later you can see that xinetd reports that
the pop3 accept failed with too many open files (errno = 24). Looking at our
netstat, we do NOT see any flood of popper hits associated with this.
When we get the ACCEPT error, xinetd enters an infinite loop in which it pounds
syslogd with that error message (note that it repeated 52,347 times). BTW, we're
running Solaris 8 (5.8). Both syslogd and xinetd start running up huge
CPU loads. Killing xinetd clears the problem, and restarting it once the load
drops makes things OK again.
We first noted this happening about two months ago, and at that time it appeared
to recur roughly every 25 days. This week, though, we got it several times,
including times when the server was at a lull (like Jan 1) - ALL of
them were accompanied by someone hitting the FTP port and causing the
deactivation/reactivation.
In looking at the xinetd source code - the file 'connection.c' has the
following chunk (starting at line 50):
   if( SC_WAITS( scp ) ) {
      cp->co_descriptor = SVC_FD( sp );
   } else {
      cp->co_descriptor = accept( SVC_FD( sp ), &(cp->co_remote_address.sa),
                                  &sin_len ) ;
      M_SET( cp->co_flags, COF_NEW_DESCRIPTOR ) ;
   }

   if ( cp->co_descriptor == -1 )
   {
      msg( LOG_ERR, func, "service %s, accept: %m", SVC_ID( sp ) ) ;
      return( FAILED ) ;
   }
Note the 'accept:' in the 'msg' - this is the only place we could find in
the source where that string appears - so we're 99.999% certain this is
the line issuing the error.
Error number 24 is EMFILE ("The per-process descriptor table is full"),
indicating that no more descriptors were available when the accept() call
was invoked.
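For reference, a process can check what that per-process limit actually is
with getrlimit(). The following is just a stand-alone illustration, not part
of xinetd:

   #include <stdio.h>
   #include <sys/resource.h>

   /* Print the soft and hard limits on open descriptors (RLIMIT_NOFILE).
    * Once a process has rlim_cur descriptors open, accept() starts
    * failing with EMFILE (errno 24). */
   int main(void)
   {
      struct rlimit rl;

      if (getrlimit(RLIMIT_NOFILE, &rl) == -1) {
         perror("getrlimit");
         return 1;
      }
      printf("soft limit: %lu, hard limit: %lu\n",
             (unsigned long) rl.rlim_cur, (unsigned long) rl.rlim_max);
      return 0;
   }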
If you follow how this routine is called, you will find it is reached through
a series of routines that start in the infinite loop in main(). What happens
in this case is that the return(FAILED) sends control straight back around
that loop to try again, so xinetd is effectively busy-looping.
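To make the failure mode concrete, here is a much simplified model of what we
believe is happening (this is a sketch, not xinetd's actual code): when
accept() fails with EMFILE it does not dequeue the pending connection, so the
listening socket stays readable and the very next select() returns at once.

   #include <errno.h>
   #include <string.h>
   #include <sys/select.h>
   #include <sys/socket.h>
   #include <syslog.h>
   #include <unistd.h>

   /* Simplified model of the spin: if accept() fails with EMFILE the
    * pending connection stays in the listen queue, so select() reports
    * the listening fd readable again right away and every iteration
    * logs the same error -- hence the 52,347 repeats above. */
   void serve_forever(int listen_fd)
   {
      for (;;) {
         fd_set readable;
         int conn;

         FD_ZERO(&readable);
         FD_SET(listen_fd, &readable);

         if (select(listen_fd + 1, &readable, NULL, NULL, NULL) <= 0)
            continue;

         conn = accept(listen_fd, NULL, NULL);
         if (conn == -1) {
            syslog(LOG_ERR, "accept: %s", strerror(errno));
            continue;          /* nothing was consumed: the loop spins */
         }
         /* ... hand the connection off to the service ... */
         close(conn);
      }
   }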
We have placed a TEMPORARY fix - to avoid the load going high while we're not
monitoring - by changing the if that detects the error to this:
   if ( cp->co_descriptor == -1 )
   {
      if ( errno == 24 ) {   /* 24 == EMFILE: no descriptors left */
         msg( LOG_ERR, func, "XINETD STOPPED DUE TO NO MORE DESCRIPTORS" ) ;
         exit( 1 ) ;
      }
      msg( LOG_ERR, func, "service %s, accept: %m", SVC_ID( sp ) ) ;
      return( FAILED ) ;
   }
You can see here that we are looking specifically for error number 24, and
if we see it we issue a message to the log and exit - hopefully that SHOULD
stop xinetd and break the tight loop. Since I put this minor change in last
night we haven't had the occurrence yet, so I have yet to prove that it
works. (If it does work, I intend to place a sleep(10) at the start
of xinetd's main and have that error path also invoke another copy
of xinetd before exiting - that should keep it up and running.)
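For what it's worth, that respawn idea could look roughly like the sketch
below. This is purely illustrative - the binary path and argument handling
are placeholders, and it is not something xinetd does today:

   #include <stdlib.h>
   #include <sys/types.h>
   #include <unistd.h>

   /* Hypothetical "respawn before exiting" helper: fork a fresh copy of
    * the daemon (which, with a sleep(10) at the top of main(), waits for
    * the load to drop before taking over), then let the old, descriptor-
    * starved process exit and release its descriptors. */
   static void respawn_and_exit(char *const argv[])
   {
      pid_t pid = fork();

      if (pid == 0) {
         execv("/usr/sbin/xinetd", argv);   /* path is a placeholder */
         _exit(127);                        /* exec failed */
      }
      exit(1);                              /* old instance goes away */
   }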
Anyone else experiencing this problem and is there any known fix?
Mahalo