Re: [mongrel2] how to safely shut down a handler?
- From:
- Ryan Kelly
- Date:
- 2011-01-25 @ 11:03
On Tue, 2011-01-25 at 02:28 -0800, Zed A. Shaw wrote:
> On Tue, Jan 25, 2011 at 09:01:54PM +1100, Ryan Kelly wrote:
> > I guess my understanding of 0mq isn't quite up to speed yet. How does
> > this backlog mechanism mesh with the manpage for zmq_close, which states
> > that "any outstanding messages physically received from the network but
> > not yet received by the application with zmq_recv() shall also be
> > dropped"?
>
> That's the receive side, what you're using is the send log. You can
> also give the receive side a name and it'll have a similar backlog and
> some other stuff, but I haven't delved into that much.
I think that it's more the recv side than the send side that is the
cause of my trouble - see next response. I will try using a proper
ident on the recv side as well and see if it makes a difference.
> > Just to clarify, do you mean give both handler processes the same ident,
> > or give each handler process a different-but-fixed ident?
>
> Nope you need two idents for this test, but they should be consistent.
> It's the way the mongrel2 handlers are configured anyway. In Tir I just
> read the ident and junk out of the .sqlite based on the route. Makes it
> very easy to config.
>
> > Trying to give both handlers the same ident gives me an assertion error
> > down in the bowels of zmq. I'll double-check my build and try again
> > tomorrow.
>
> But, you should be able to start X handlers with the same ident.
> They'll get the messages round-robin then, but I do it all the time.
>
> > But this case is slightly different, in that I am not *restarting* any
> > handlers. The test begins with two handlers running initially and
> > load-balancing requests between them. Then after serving some requests
> > one of the handlers goes away forever.
>
> So, thinking about that for a second, you're describing this:
>
> 1. I have two handlers A and B.
> 2. Both A and B begin handling requests, with the idents A and B.
> 3. I then kill B and never restart it.
> 4. Why is the message B tried to send not being sent by A?
>
> Simple: B and A don't know about each other so if you never restart B
> it'll never send its messages. It needs to be running to send them.
Hmmm...I don't *think* that's the scenario I'm describing, but it could
be that I don't have the mental model of mongrel2/0mq sorted out quite
right...
I'm not worried about messages *sent* by B being lost, of course that
will happen when B goes offline. What I'm worried about is that B seems
to take down an unrelated request when it dies.
Here's my mental picture of things:
  1. I have Mongrel2 configured with a single Handler, sending reqs
     out on socket X and receiving response data on socket Y.
  2. I have two handler processes A and B, with idents A and B.
  3. Both A and B start handling requests, receiving them on socket
     X and sending responses on socket Y.
  4. This works nicely for a while, with reqs distributed round-robin.
  5. I ask B to shut itself down, and never restart it.
  6. A request hangs forever, apparently lost when B closed its socket.
Zooming in, here's the race condition that I think is happening, based
on debugging the thing with print statements and reading up on 0mq:
  5.1 I ask B to shut down. It breaks out of the recv() loop and
      prepares to close its socket.
  5.2 Meanwhile, a new request R arrives. Mongrel2 sends R out on
      socket X, and 0mq round-robin delivers it to B.
  5.3 B closes its connection to socket X. The unreceived request R
      is dropped on the floor.
Since this is a clean shutdown, B could quite happily respond to R if it
knew about it, but I can't find a way for B to say "give me any requests
you have queued, but don't send me any more".
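[Editor's sketch] The race in steps 5.1 to 5.3 can be modelled without 0mq at all. The toy simulation below uses only the Python standard library; the class names, the `simulate` function and the `drain` flag are all invented for illustration. In particular, `drain=True` models the behaviour being asked for ("give me what is queued, then stop"), and is not an existing 0mq or mongrel2 option.

```python
from collections import deque


class ToyHandler:
    """Stand-in for a handler process with a local receive queue."""

    def __init__(self, name):
        self.name = name
        self.inbox = deque()   # delivered by the transport, not yet recv()'d
        self.served = []
        self.open = True

    def recv_one(self):
        """Application-level recv(): serve one queued request, if any."""
        if self.inbox:
            self.served.append(self.inbox.popleft())

    def close(self, drain=False):
        """Close the socket. Without drain, queued requests are dropped."""
        if drain:
            while self.inbox:
                self.recv_one()
        dropped = list(self.inbox)
        self.inbox.clear()
        self.open = False
        return dropped


class ToyRoundRobin:
    """Stand-in for the sending socket: round-robins across open handlers."""

    def __init__(self, handlers):
        self.handlers = handlers
        self.next_index = 0

    def send(self, request):
        live = [h for h in self.handlers if h.open]
        target = live[self.next_index % len(live)]
        self.next_index += 1
        target.inbox.append(request)
        return target.name


def simulate(drain):
    a, b = ToyHandler("A"), ToyHandler("B")
    sender = ToyRoundRobin([a, b])
    sender.send("R1")          # first request goes to A
    a.recv_one()
    # Step 5.1: B has left its recv() loop but has not closed yet.
    # Step 5.2: a new request arrives and is delivered to B.
    delivered_to = sender.send("R2")
    # Step 5.3: B closes its socket.
    dropped = b.close(drain=drain)
    return delivered_to, dropped, b.served
```

With `simulate(False)` the request that landed in B's queue comes back in the dropped list and is never served, which matches the hang in step 6. With `simulate(True)` B serves it before closing. The model ignores the network entirely, so it only illustrates the ordering problem, not how 0mq actually buffers messages.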
> > By way of explanation, I'm trying to simulate on-demand scaling of the
> > number of handler processes - as load goes up, start more handlers; as
> > load goes down, kill some off.
>
> In order for that to work, you'll need to make sure all of the handlers
> that are running have the exact same send_idents, and you'll have to get
> into the 0mq docs on how to adjust the queue and delivery options.
> Right now the scenario you're emulating is more of what I describe
> above, where you're effectively taking an entire handler offline then
> wondering why it's not sending messages.
>
> Really, it sounds to me like you need to do this with a real application
> that's using the existing Python code I have and a simple handler.
> Fire up say 3 of them and simulate the same thing, then figure out what
> options get you close to what you want. Rather than trying to both
> figure out how to make 0mq do what you want and write a wsgi handler.
I did try using the bundled "mongrel2" python module, but the Connection
class doesn't have any methods for closing down the connection. So I
pulled the relevant bits out directly into my test code.
I will try again with your prebuilt python code, having the handler
process simply die rather than attempting a clean shutdown.
> > Ideally, the handlers that die off will do so cleanly and not take any
> > queued requests along with them. It won't be the end of the world if a
> > few requests time out, as long as they don't get stuck forever. So I'll
> > also take a closer look at timing out requests via the control port.
>
> I still need to do a timeout mechanism. Play with the control port to
> cook up whatever you want for timeout, and when you have something (or
> an idea) that isn't totally crazy then let me know and I'll just put it
> in mongrel2.
Will do.
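[Editor's sketch] One possible shape for the timeout bookkeeping discussed above, using only the Python standard library. The class, its method names and the `kill` callback are hypothetical; the callback stands in for whatever control-port request ends up closing a connection, which this thread does not specify. Where the bookkeeping would live (inside mongrel2 or in an external watchdog) is also left open.

```python
import time


class RequestTimeouts:
    """Track in-flight requests and report the ones that have waited too long."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock      # injectable so the logic can be tested
        self.started = {}       # conn_id -> time the request was forwarded

    def request_sent(self, conn_id):
        """Record that a request for this connection went out to a handler."""
        self.started[conn_id] = self.clock()

    def response_seen(self, conn_id):
        """Record that a handler answered, so the connection is no longer at risk."""
        self.started.pop(conn_id, None)

    def expire(self, kill):
        """Call kill(conn_id) for every request older than the timeout."""
        now = self.clock()
        expired = [conn_id for conn_id, sent_at in self.started.items()
                   if now - sent_at >= self.timeout_seconds]
        for conn_id in expired:
            kill(conn_id)       # hypothetical hook, e.g. a control-port call
            del self.started[conn_id]
        return expired
```

Calling `expire` periodically would turn a request that is stuck forever into one that merely times out, which is the fallback described above.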
Thanks,
Ryan
--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed. Please visit
ryan@rfk.id.au | http://www.rfk.id.au/ramblings/gpg/ for details
Re: [mongrel2] how to safely shut down a handler?
- From:
- Zed A. Shaw
- Date:
- 2011-01-25 @ 19:28
I rewrote it so you have to do the handler restarts manually, making it
easier to debug and figure out what's going on. I then wrote a correct
HTTP reply method and a little time+counter so you can see what messages
are getting sent.
Here's my code:
http://zedshaw.com/m2test-handler-close-zed.tar.gz
Run it like this:
1. Start mongrel2 in one window.
2. Start testhandler.py in a window.
3. Run testrunner.py in a 3rd window.
4. Go back to the testhandler.py window and restart it, message
completes.
5. Do this a whole bunch manually and see that it works reliably.
Now, look at like line 35. That's the secret sauce. If you want to get
messages with reliable delivery you need to set the ident on the side
you need it. In your case you want send and receive durability so ident
both sides.
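[Editor's sketch] The tarball's contents are not reproduced in this thread, so the following is only a guess at what "ident both sides" could look like with pyzmq, assuming the usual mongrel2 handler layout of a PULL socket for requests and a PUB socket for responses. The function names, identity strings and addresses are placeholders. The length and zero-byte checks follow the limits in the zmq_setsockopt documentation for ZMQ_IDENTITY. This applies to the 0mq 2.x releases discussed here; other versions may treat identities differently.

```python
def as_identity(name):
    """Encode a handler name as a 0mq identity, checking the documented limits."""
    ident = name.encode("ascii")
    if not 1 <= len(ident) <= 255:
        raise ValueError("identity must be 1 to 255 bytes long")
    if ident[0] == 0:
        raise ValueError("identities starting with a zero byte are reserved by 0mq")
    return ident


def open_handler_sockets(recv_name, send_name, pull_addr, pub_addr):
    """Open the handler's two sockets with a fixed identity on each side.

    Assumes pyzmq is installed; it is imported here so that the helper
    above can be used without it.
    """
    import zmq

    ctx = zmq.Context.instance()

    # Receive side: requests coming from mongrel2.
    reqs = ctx.socket(zmq.PULL)
    reqs.setsockopt(zmq.IDENTITY, as_identity(recv_name))
    reqs.connect(pull_addr)

    # Send side: responses going back to mongrel2.
    resp = ctx.socket(zmq.PUB)
    resp.setsockopt(zmq.IDENTITY, as_identity(send_name))
    resp.connect(pub_addr)

    return reqs, resp
```

A handler might call `open_handler_sockets("handler-a-recv", "handler-a-send", "tcp://127.0.0.1:9997", "tcp://127.0.0.1:9996")` once at start-up. The identity must be set before `connect` for it to take effect, and it should stay the same across restarts of that handler.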
However, there's kind of a weird thing: you have to restart mongrel2 if
you *remove* the recv ident on a handler. No idea why, it's some 0mq
limitation. So, comment out line 36 so you're not setting the recv
ident. Go back and restart mongrel2.
Do the above test again, and by about the 3rd cycle you'll lose a
message and everything gets stuck.
A lot of your code for starting mongrel2 and firing up processes just
confounded the test, since you can actually make it happen with manual
restarts. Now you can probably take this code and put your process
starting stuff back in to do your test, but I'd suggest creating a 3rd
script that does the process things. Have testrunner.py just send HTTP,
testhandler.py be the handler, and testprocess.py be the thing screwing
with processes.
Hope that helps.
--
Zed A. Shaw
http://zedshaw.com/
Re: [mongrel2] how to safely shut down a handler?
- From:
- Zed A. Shaw
- Date:
- 2011-01-25 @ 18:14
> Here's my mental picture of things:
>
> 1. I have Mongrel2 configured with a single Handler, sending reqs
> out on socket X and receiving response data on socket Y.
> 2. I have two handler processes A and B, with idents A and B.
Stop right there. If you have two handlers with two different idents
using the same route -> 0mq socket from mongrel2 then this is why they
aren't resending messages.
I think I'm going to have to write code for you, since English I think
is failing. Give me a little bit.
--
Zed A. Shaw
http://zedshaw.com/
Re: [mongrel2] how to safely shut down a handler?
- From:
- Ryan Kelly
- Date:
- 2011-01-25 @ 22:23
On Tue, 2011-01-25 at 10:14 -0800, Zed A. Shaw wrote:
> > Here's my mental picture of things:
> >
> > 1. I have Mongrel2 configured with a single Handler, sending reqs
> > out on socket X and receiving response data on socket Y.
> > 2. I have two handler processes A and B, with idents A and B.
>
> Stop right there. If you have two handlers with two different idents
> using the same route -> 0mq socket from mongrel2 then this is why they
> aren't resending messages.
>
> I think I'm going to have to write code for you, since English I think
> is failing.
No, it's me that's failing - looking back at the messages, I appear to
have sent an old version of the test code that shows a different issue.
No wonder we seem to be talking about different things! Sorry.
Thankfully all is not lost, as it's been a most illuminating discussion
from my end.
I'm going to reboot my test code using the pre-built mongrel2 module and
taking all your advice in this thread on board.
Thanks,
Ryan
--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed. Please visit
ryan@rfk.id.au | http://www.rfk.id.au/ramblings/gpg/ for details
Re: [mongrel2] how to safely shut down a handler?
- From:
- Ryan Kelly
- Date:
- 2011-01-25 @ 22:54
could not decode message
Re: [mongrel2] how to safely shut down a handler?
- From:
- Ryan Kelly
- Date:
- 2011-01-25 @ 20:51
On Tue, 2011-01-25 at 10:14 -0800, Zed A. Shaw wrote:
> > Here's my mental picture of things:
> >
> > 1. I have Mongrel2 configured with a single Handler, sending reqs
> > out on socket X and receiving response data on socket Y.
> > 2. I have two handler processes A and B, with idents A and B.
>
> Stop right there. If you have two handlers with two different idents
> using the same route -> 0mq socket from mongrel2 then this is why they
> aren't resending messages.
Fair enough. But can I run two handlers simultaneously with the same
ident? It always crashes with an assertion error for me.
Using your modified code:
1. Start mongrel2 in one window.
2. Start testhandler.py in a new window.
3. Start another testhandler.py in a new window
=> mongrel2 crashes with "Assertion failed: !engine (session.cpp:287)"
I guess this is the "segfault Mongrel2 from your handler
by not using the identities correctly" that Loic mentioned.
(BTW, I'm using the 1.5 release of mongrel2 and have tried this with
both 0mq 2.0.10 and 0mq 2.1.0)
Ryan
--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed. Please visit
ryan@rfk.id.au | http://www.rfk.id.au/ramblings/gpg/ for details
Re: [mongrel2] how to safely shut down a handler?
- From:
- Zed A. Shaw
- Date:
- 2011-01-25 @ 23:54
On Wed, Jan 26, 2011 at 07:51:28AM +1100, Ryan Kelly wrote:
> On Tue, 2011-01-25 at 10:14 -0800, Zed A. Shaw wrote:
> Fair enough. But can I run two handlers simultaneously with the same
> ident? It always crashes with an assertion error for me.
>
> Using your modified code:
>
> 1. Start mongrel2 in one window.
> 2. Start testhandler.py in a new window.
> 3. Start another testhandler.py in a new window
> => mongrel2 crashes with "Assertion failed: !engine (session.cpp:287)"
>
> I guess this is the "segfault Mongrel2 from your handler
> by not using the identities correctly" that Loic mentioned.
Well that's special. WTF. Let me check this out some more, 'cause
that's annoying as hell.
--
Zed A. Shaw
http://zedshaw.com/
Re: [mongrel2] how to safely shut down a handler?
- From:
- Loic d'Anterroches
- Date:
- 2011-01-25 @ 18:31
On 2011-01-25 19:14, Zed A. Shaw wrote:
>> Here's my mental picture of things:
>>
>> 1. I have Mongrel2 configured with a single Handler, sending reqs
>> out on socket X and receiving response data on socket Y.
>> 2. I have two handler processes A and B, with idents A and B.
>
> Stop right there. If you have two handlers with two different idents
> using the same route -> 0mq socket from mongrel2 then this is why they
> aren't resending messages.
>
> I think I'm going to have to write code for you, since English I think
> is failing. Give me a little bit.
I started to really understand what was going on with these identities
when reading the guide on "Transient vs. Durable Sockets". It explains
very well all the machinery behind it and how it affects the message
delivery. It even shows how you can segfault Mongrel2 from your handler
by not using the identities correctly.
http://zguide.zeromq.org/chapter:all#toc37
loïc