I've been trying to load test my mongrel2-based comet handler, and I ran into the hot_max/hot_dividend knobs. I read the original blog post where superpoll was introduced, but I don't think I get it. How is something determined to be "hot", and how can a connection become "not hot"?

What I'm running into is that I've set max_fd to 1024 (that's my user's ulimit), but I can only connect with 253 simultaneous clients without messing with hot_dividend. I can lower the dividend to 2, which allows me to connect with ~512 clients, but lowering it to 1 makes mongrel2 fail by running out of memory.

I don't understand what I'm doing though; are all new connections considered to be "hot" until they've had a while to be inactive? Could I get the full 1024 if I just spawned connections more slowly? I tried starting connections with a quarter-second pause between each, but even with the time it took to hit 253 connections, all of them were still hot.

On Mon, Jan 10, 2011 at 04:56:12PM -0600, tsuraan wrote:
> I don't understand what I'm doing though; are all new connections
> considered to be "hot" until they've had a while to be inactive?
> Could I get the full 1024 if I just spawned connections more slowly?
> I tried starting connections with a quarter-second pause between each,
> but even with the time it took to hit 253 connections, all of them
> were still hot.

Alright, I figured this out and it's idiotic. The problem only comes up if you're hitting a URL that points at a 0mq handler. The 0mq handler requests (because of how 0mq 2.0.x is designed) have to get shoved into the poll set, since they can't go into epoll. That means if you don't have your handler running then it'll queue these up and then you get the "too many files" error. It's not really too many files though, it's more too many pending requests to your handlers.

Causes of this are that in mqsend and mqrecv I'm doing an mqwait rather than trying to do a non-blocking recv/send first and then waiting if I get nothing. I'll look at doing that today and that should get rid of a lot of the problem.

The only remaining issue is what to do when it really does fill up like this?

--
Zed A. Shaw
http://zedshaw.com/

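The change described in the message above (try a non-blocking recv/send first, and only wait if nothing comes back) can be sketched roughly as follows. This is an illustration with plain stdlib sockets and select, not Mongrel2's mqsend/mqrecv/mqwait code and not the 0mq API; the function name recv_nonblocking_first and its timeout parameter are made up for the example.

```python
import select
import socket


def recv_nonblocking_first(sock, nbytes, timeout=1.0):
    """Try an immediate read; only wait on the socket if nothing is ready.

    Returns the bytes read, or None if the timeout expires with no data.
    (Hypothetical helper, not part of Mongrel2 or 0mq.)
    """
    sock.setblocking(False)
    try:
        # Fast path: data may already be buffered, so no wait is needed
        # and the socket never has to enter the poll set.
        return sock.recv(nbytes)
    except BlockingIOError:
        pass
    # Slow path: nothing was ready, so register interest and wait.
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(nbytes)
```

The point of the ordering is that a socket which already has data never occupies a slot in the wait list; only the ones that genuinely have nothing to read do.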
> Causes of this are that in mqsend and mqrecv I'm doing an mqwait rather
> than trying to do a non-blocking recv/send first and then waiting if I
> get nothing. I'll look at doing that today and that should get rid of a
> lot of the problem.
>
> The only remaining issue is what to do when it really does fill up like
> this?

Can you detect that the queue is full and just start replying to new requests with a 5xx error? 503 looks appropriate.

On Fri, Jan 14, 2011 at 04:50:29PM -0600, tsuraan wrote:
> > Causes of this are that in mqsend and mqrecv I'm doing an mqwait rather
> > than trying to do a non-blocking recv/send first and then waiting if I
> > get nothing. I'll look at doing that today and that should get rid of a
> > lot of the problem.
> >
> > The only remaining issue is what to do when it really does fill up like
> > this?
>
> Can you detect that the queue is full and just start replying to new
> requests with a 5xx error? 503 looks appropriate.

No idea, I'd have to look at how you'd set 0MQ to have a limit and then check if it's at the limit. I think it's time for some research.

--
Zed A. Shaw
http://zedshaw.com/

On 01/15/2011 03:03 AM, Zed A. Shaw wrote:
> On Fri, Jan 14, 2011 at 04:50:29PM -0600, tsuraan wrote:
>>> Causes of this are that in mqsend and mqrecv I'm doing an mqwait rather
>>> than trying to do a non-blocking recv/send first and then waiting if I
>>> get nothing. I'll look at doing that today and that should get rid of a
>>> lot of the problem.
>>>
>>> The only remaining issue is what to do when it really does fill up like
>>> this?
>> Can you detect that the queue is full and just start replying to new
>> requests with a 5xx error? 503 looks appropriate.
> No idea, I'd have to look at how you'd set 0MQ to have a limit and then
> check if it's at the limit. I think it's time for some research

Look for the "high water mark" stuff: it's the queue size beyond which all writes block. In theory the limit should be changeable with the ZMQ_HWM option (including setting it to zero for no limit, but that probably opens some very interesting DoS possibilities), but I haven't been able to find how you can check if you are at or near it.

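One way to picture the "reply 503 when the queue is full" idea from this exchange: give the pending queue a fixed high-water mark and refuse new work as soon as an enqueue would exceed it. The sketch below uses Python's stdlib queue.Queue as a stand-in for the 0MQ socket. It does not use ZMQ_HWM or any real 0MQ call, and HWM, dispatch and the returned strings are hypothetical names chosen for illustration.

```python
import queue

HWM = 3  # hypothetical high-water mark; stands in for a ZMQ_HWM-style limit


def dispatch(pending, request):
    """Queue a request for the handlers, or refuse it when the queue is full.

    Returns "queued" on success and "503 Service Unavailable" when the
    bounded queue is already at its limit. (Illustrative only.)
    """
    try:
        # put_nowait never blocks: it raises queue.Full at the limit,
        # which is the signal to fail fast instead of piling up work.
        pending.put_nowait(request)
    except queue.Full:
        return "503 Service Unavailable"
    return "queued"


def run_demo(count):
    """Push `count` requests into an undrained queue and collect replies."""
    pending = queue.Queue(maxsize=HWM)
    return [dispatch(pending, f"req-{i}") for i in range(count)]
```

With no handler draining the queue, run_demo(5) accepts the first HWM requests and refuses the rest, which is the behaviour being asked for. Whether 0MQ 2.x exposes enough state to do this is exactly the open question in the thread.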
On 2011-01-16 14:52, Sabin Iacob wrote:
> On 01/15/2011 03:03 AM, Zed A. Shaw wrote:
>> On Fri, Jan 14, 2011 at 04:50:29PM -0600, tsuraan wrote:
>>>> Causes of this are that in mqsend and mqrecv I'm doing an mqwait rather
>>>> than trying to do a non-blocking recv/send first and then waiting if I
>>>> get nothing. I'll look at doing that today and that should get rid of a
>>>> lot of the problem.
>>>>
>>>> The only remaining issue is what to do when it really does fill up like
>>>> this?
>>> Can you detect that the queue is full and just start replying to new
>>> requests with a 5xx error? 503 looks appropriate.
>> No idea, I'd have to look at how you'd set 0MQ to have a limit and then
>> check if it's at the limit. I think it's time for some research
>
> look for the "high water mark" stuff: it's the queue size beyond which
> all writes block; in theory the limit should be changeable with the
> ZMQ_HWM option (including setting it to zero for no limit, but that
> probably opens some very interesting DoS possibilities), but I haven't
> been able to find how you can check if you are at/near it.

This is where we should maybe have a strategy to handle these DoS cases, possibly one which could be configured. The zeromq system for the backend introduces a queue, so, if I am not totally wrong, you get:

    clients <-> connections pool <-> Mongrel2 queue/dispatch <-> 0mq queue/dispatch <-> handlers

So, 2 queues can be saturated. In fact you can saturate them independently. At first I thought the 0mq queue could never hold more messages than the number of connections in the pool, but in fact it can, as clients can fire a lot of small requests and "disconnect". If you wait just long enough for the message to be sent over to the 0mq queue and then disconnect, you are left with a message to be processed, and when Mongrel2 gets the answer, nobody is there to receive it. Rinse and repeat: the 0mq queue is saturated because your handlers cannot keep up with the load, while Mongrel2 stays happy.

Note that I am only looking at the downstream queue; I suppose one could saturate the upstream side with slow-reading clients.

This is just my thinking at the moment from what I am discovering while playing with Mongrel2/zeromq and my handlers. Note that my interpretation of the saturation points is most likely totally wrong, because I haven't performed any rigorous testing. Just take it as a case to think about.

loïc

On Mon, Jan 17, 2011 at 08:56:15AM +0100, Loic d'Anterroches wrote:
> This is just my thinking at the moment from what I am discovering while
> playing with Mongrel2/zeromq and my handlers. Note that my
> interpretation of the saturation points are most likely totally wrong,
> because I haven't performed any rigorous testing. Just take it as a case
> to think about.

Nope, this is actually a problem with all the backend hosting methods to various degrees. It's fairly trivial to make a server do work and then disconnect before the response is sent so that a client doesn't bother processing it, even if they run fastcgi, scgi, or proxy HTTP. There's just nothing that can be done to prevent that without some way of identifying clients and blocking them, but then HTTP is stateless so they can just run curl or httperf and avoid that.

This is even worse on servers that allow pipelined requests. All you have to do is find a part of the application that takes a long time. Once you've got that, send one request to the long running url, then a whole bunch of requests pipelined behind it. Disconnect and now the server is stuck there trying to handle the request queue you've dumped. If you're really good you can get all the requests into the backend before you close so that the server can't drop them.

The only real solution these days is to use something like mod_* where the app runs inside the web server process and it can have full knowledge of the queue and avoid sending at all. Even then there's still processing going on with a bomb like this, so you're trading the ability to scale and distribute load for a small savings against DDoS.

The advantage that 0mq has over, say, fastcgi is that the send/recv of these messages happens in a separate thread (or more) so it doesn't block the main server. Sure you have a ton of dead messages being processed, but they're very quick and handling them doesn't eat much of the Mongrel2 server.

You can also throw more handlers at the problem by just starting more, rather than with fastcgi where there's a massive reconfig/restart you need to do.

Really the only way to deal with huge amounts of DoS attacks is to buy more machines and work at the TCP/IP layer to block obvious attacker IP addresses. That's a pain in the ass though without some good gear.

--
Zed A. Shaw
http://zedshaw.com/

On 2011-01-14 23:50, tsuraan wrote:
>> Causes of this are that in mqsend and mqrecv I'm doing an mqwait rather
>> than trying to do a non-blocking recv/send first and then waiting if I
>> get nothing. I'll look at doing that today and that should get rid of a
>> lot of the problem.
>>
>> The only remaining issue is what to do when it really does fill up like
>> this?
>
> Can you detect that the queue is full and just start replying to new
> requests with a 5xx error? 503 looks appropriate.

You could even go with "504 Gateway Timeout".

loïc

On Mon, Jan 10, 2011 at 04:56:12PM -0600, tsuraan wrote:
> I don't understand what I'm doing though; are all new connections
> considered to be "hot" until they've had a while to be inactive?
> Could I get the full 1024 if I just spawned connections more slowly?
> I tried starting connections with a quarter-second pause between each,
> but even with the time it took to hit 253 connections, all of them
> were still hot.

It's simply the number of connections that are in the poll set before it starts trying to offload them into the bigger epoll list. Let me work up a sample of doing what you're doing. Just to make sure, you're testing that it can actually handle the full 1024 connections at the same time, right?

--
Zed A. Shaw
http://zedshaw.com/

> It's simply the number of connections that are in the poll set before it
> starts trying to offload them into the bigger epoll list. Let me work
> up a sample of doing what you're doing. Just to make sure, you're
> testing that it can actually handle the full 1024 connections at the
> same time right?

Yeah. All I'm doing is running mongrel2 (you don't even need a backend running) and connecting with a bunch of curl sessions:

    for i in `seq 1 400`; do (curl http://localhost:8080/comet/$i &); done

After curl #253, mongrel2 starts giving this error:

    [ERROR] (src/superpoll.c:137: errno: Resource temporarily unavailable) Too many open files requested: 256 is greater than hot 256 max.
    [ERROR] (src/task/fd.c:212: errno: None) Error adding fd -1 to task wait list.

I'm not using the 1.5 release you just made; it's a somewhat old fossil checkout. I'll check it out on the actual release next.

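For anyone who wants to repeat this test without forking hundreds of curl processes, the same load can be approximated from a single script. This is only a sketch: the host, port and /comet/ path come from the curl command in the message above, while open_comet_clients and close_all are hypothetical helper names, and the request is a bare-bones GET rather than exactly what curl sends.

```python
import socket


def open_comet_clients(host, port, count, prefix="/comet/"):
    """Open `count` connections, send one GET on each, and keep them open.

    Returns the list of connected sockets so the caller decides when to
    close them. (Hypothetical helper for reproducing the curl loop.)
    """
    clients = []
    for i in range(1, count + 1):
        sock = socket.create_connection((host, port), timeout=5)
        request = (
            f"GET {prefix}{i} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        clients.append(sock)  # holding the reference keeps the fd open
    return clients


def close_all(clients):
    """Close every socket returned by open_comet_clients."""
    for sock in clients:
        sock.close()
```

Calling open_comet_clients("localhost", 8080, 400) and then watching the mongrel2 error log should exercise the same path as the shell loop. Note that the script's own ulimit -n has to be high enough to hold that many sockets.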
On Mon, Jan 10, 2011 at 05:20:12PM -0600, tsuraan wrote:
> > It's simply the number of connections that are in the poll set before it
> > starts trying to offload them into the bigger epoll list. Let me work
> > up a sample of doing what you're doing. Just to make sure, you're
> > testing that it can actually handle the full 1024 connections at the
> > same time right?
>
> I'm not using the 1.5 release you just made; it's a somewhat old
> fossil checkout. I'll check it out on the actual release next.

Try the 1.5, and is this on OSX or a BSD that defaults to 256? I got an OSX box but want to make sure I test it on the same platform.

--
Zed A. Shaw
http://zedshaw.com/

> Try the 1.5, and is this on OSX or a BSD that defaults to 256? I got an
> OSX box but want to make sure I test it on the same platform.

I got m2-1.5 running on my mac mini (core2, OSX 10.5) and I can't get it to happen there either. I thought maybe it was a hardlimit/softlimit thing, since OSX defaults to 256/unlimited for nfiles, but I set the limit to 1024/1024 and it still happily handles around a thousand connections.

I don't have access to any BSD machine, so I can't test there, but it's at least happening in debian/arm and gentoo/amd64 under linux kernels. Do you have any clues on what's going on?

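When comparing platforms like this, it helps to record both the soft and the hard open-files limit of the process that actually runs the test, since ulimit -n only reports the soft one by default. A small sketch using Python's stdlib resource module (Unix only); nofile_limits and describe are hypothetical helper names.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit the way ulimit would, mapping infinity to a word."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def report():
    """Format both limits as 'soft/hard', e.g. the 256/unlimited case."""
    soft, hard = nofile_limits()
    return f"{describe(soft)}/{describe(hard)}"
```

Printing report() on each test machine gives the same soft/hard pair discussed in the message above, which makes it easier to rule the rlimit in or out as the difference between Linux and OSX.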
Not sure what your availability is like, but I think you could set up a free tier ec2 account and install a freebsd ami.

http://aws.amazon.com/free/
http://www.daemonology.net/freebsd-on-ec2/

On Thu, Jan 13, 2011 at 12:14 PM, tsuraan <tsuraan@gmail.com> wrote:
> > Try the 1.5, and is this on OSX or a BSD that defaults to 256? I got an
> > OSX box but want to make sure I test it on the same platform.
>
> I got m2-1.5 running on my mac mini (core2, OSX 10.5) and I can't get
> it to happen there either. I thought maybe it was a
> hardlimit/softlimit thing, since OSX defaults to 256/unlimited for
> nfiles, but I set the limit to 1024/1024 and it still happily handles
> around a thousand connections. I don't have access to any BSD
> machine, so I can't test there, but it's at least happening in
> debian/arm and gentoo/amd64 under linux kernels. Do you have any
> clues on what's going on?

> Not sure what your availability is like, but I think you could setup a free
> tier ec2 account and install a freebsd ami.
> http://aws.amazon.com/free/
> http://www.daemonology.net/freebsd-on-ec2/

Ok, so I got mongrel2 running on a FreeBSD AMI (the Makefile requires a -lpthread and -L/usr/local/lib under LIBS and -I/usr/local/include under CFLAGS), and it also doesn't have the problem. So, both Linux machines I've tried do have the issue, but neither OSX nor FreeBSD has it.

I still have no idea what the problem is, or how a socket is supposed to migrate from hot to not hot, but I guess I'll start digging into superpoll.c to see if I can make any sense of it.

On Thu, Jan 13, 2011 at 01:42:29PM -0600, tsuraan wrote:
> > Not sure what your availability is like, but I think you could setup a free
> > tier ec2 account and install a freebsd ami.
> > http://aws.amazon.com/free/
> > http://www.daemonology.net/freebsd-on-ec2/
>
> Ok, so I got mongrel2 running on a FreeBSD AMI (the Makefile requires
> a -lpthread and -L/usr/local/lib under LIBS and -I/usr/local/include
> under CFLAGS), and it also doesn't have the problem. So, both Linux
> machines I've tried do have the issue, but neither OSX nor FreeBSD
> have it. I still have no idea what the problem is, or how a socket is
> supposed to migrate from hot to not hot, but I guess I'll start
> digging into superpoll.c to see if I can make any sense of it.

Bizarreness, alright thanks for checking it out. Keep digging and I'll take a look too. It's probably some different errno that linux has vs. the BSDs.

--
Zed A. Shaw
http://zedshaw.com/

>> Ok, so I got mongrel2 running on a FreeBSD AMI (the Makefile requires
>> a -lpthread and -L/usr/local/lib under LIBS and -I/usr/local/include
>> under CFLAGS), and it also doesn't have the problem. So, both Linux
>> machines I've tried do have the issue, but neither OSX nor FreeBSD
>> have it. I still have no idea what the problem is, or how a socket is
>> supposed to migrate from hot to not hot, but I guess I'll start
>> digging into superpoll.c to see if I can make any sense of it.
>
> Bizarreness, alright thanks for checking it out. Keep digging and I'll
> take a look too. It's probably some different errno that linux has vs.
> the BSDs.

Well, one obvious difference between linux and everything else is that in superpoll.c, HAS_EPOLL (which basically governs all of superpoll's behaviour) is defined as 1 if it's a linux system, and 0 otherwise. On non-linux, max_hot is total_open_fd, whereas on linux it tries to do the hot_dividend hot/not-hot split.

Still haven't tracked down the bug, but that difference was a bit obvious once I noticed it :)

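The platform difference described above can be put into numbers. The sketch below is a reading of this thread, not a transcription of superpoll.c: it assumes the hot limit on Linux is max_fd divided by hot_dividend with a default dividend of 4 (which matches the "hot 256 max" error for max_fd=1024 and the roughly 512 connections seen with a dividend of 2), and it assumes the remainder goes to the idle (epoll) side. poll_budget and its parameter names are hypothetical.

```python
def poll_budget(max_fd, hot_dividend=4, has_epoll=True):
    """Return (max_hot, max_idle) for a given fd limit.

    Assumed model, inferred from the numbers reported in this thread:
      - without epoll, every fd is "hot" (max_hot == max_fd)
      - with epoll, max_hot == max_fd // hot_dividend and the rest is idle
    """
    if hot_dividend < 1:
        raise ValueError("hot_dividend must be at least 1")
    if not has_epoll:
        # Non-Linux case from the message above: no hot/idle split at all.
        return max_fd, 0
    max_hot = max_fd // hot_dividend
    return max_hot, max_fd - max_hot


if __name__ == "__main__":
    for dividend in (4, 2, 1):
        print(dividend, poll_budget(1024, dividend))
    print("no epoll", poll_budget(1024, has_epoll=False))
```

Under this model a dividend of 1 leaves nothing for the idle side, which would be consistent with the startup failure reported for that setting, though the actual cause has not been confirmed in the thread. It also shows why a handler that keeps every connection in the poll set tops out a few descriptors short of max_hot on Linux, while OSX and FreeBSD get the whole limit.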
> Not sure what your availability is like, but I think you could setup a free
> tier ec2 account and install a freebsd ami.
> http://aws.amazon.com/free/
> http://www.daemonology.net/freebsd-on-ec2/

I very much like this idea. Now if I can just figure out how to log into the FreeBSD image that I supposedly made... I'm an AWS newbie, but I've been meaning to learn.

> Try the 1.5, and is this on OSX or a BSD that defaults to 256? I got an
> OSX box but want to make sure I test it on the same platform.

After booting one of my SheevaPlugs, I got mongrel2-1.5 running there and get the same error. That machine is debian lenny ARM with zeromq from sid; the kernel version is 2.6.32-3-kirkwood. The exact message is:

    [ERROR] (src/superpoll.c:137: errno: Resource temporarily unavailable) Too many open files requested: 256 is greater than hot 256 max.
    [ERROR] (src/task/fd.c:212: errno: None) Error adding fd -1 to task wait list.

ulimit -n gives 1024, and my full config file is:

---
servers = [Server(
    uuid="f400bf85-4538-4f7a-8908-67e313d515c2",
    chroot="./",
    access_log="./logs/access.log",
    error_log="./logs/error.log",
    pid_file="./some.pid",
    default_host="(.+)",
    port=8080,
    name="myserver",
    hosts = [
        Host(name="(.+)", routes={
            '/' : Proxy(addr='localhost', port=3000),
            '/comet' : Handler(send_spec="tcp://127.0.0.1:9998",
                               send_ident="sender",
                               recv_spec="tcp://127.0.0.1:9999",
                               recv_ident="sender")
        })])]

settings={'upload.temp_store' : './tmp/foo.XXXXXXXX',
          'superpoll.max_fd':1024,
          'superpoll.hot_dividend' : 4}
---

As always I'd be happy to provide any other information that would be useful.

> Try the 1.5, and is this on OSX or a BSD that defaults to 256? I got an
> OSX box but want to make sure I test it on the same platform.

It's linux, 2.6.35 (gentoo). I tried it with the mongrel2-1.5 release, and the problem is still there. In my .cfg file, I have this line:

    settings={'upload.temp_store' : './tmp/foo.XXXXXXXX', 'superpoll.max_fd':1024}

I can add a 'superpoll.hot_dividend' : 2 to that dictionary and then go up to nearly 512 curl runs, but setting it to 1 gives me this:

    [INFO] (src/server.c:182) Starting server on port 8080
    [INFO] (src/task/fd.c:140) MAX limits.fdtask_stack=102400
    [ERROR] (src/superpoll.c:338: errno: Resource temporarily unavailable) Out of memory.
    [ERROR] (src/superpoll.c:99: errno: None) Failed to configure epoll. Disabling.

and then mongrel2 dies. I think that if that worked it would effectively disable epoll anyhow, so that's probably not what I want.

>> Try the 1.5, and is this on OSX or a BSD that defaults to 256? I got an
>> OSX box but want to make sure I test it on the same platform.

On OSX (10.5.8, PPC) this doesn't appear to happen. Not sure if that helps, but there's a data point. Strangely, ulimit -n on that machine gives me 256, but mongrel2 was happy with over 500 connections. I have no idea how that works, but I'm not really an OSX guy...