librelist archives

« back to archive

implementing smarter timeouts

implementing smarter timeouts

From:
Ryan Kelly
Date:
2011-02-11 @ 00:02
Hi All,


  I've been playing around with timing out requests via the control
port, and have implemented a "reaper" device in my m2wsgi python module
if anyone is interested:

    https://github.com/rfk/m2wsgi/blob/master/m2wsgi/device/reaper.py


  Problem is, the control port only provides a single metric for
deciding whether to time out a connection:  the time since last "ping".

  I'd like to change the output of "status net" to give more
comprehensive information about each connection.  Currently it looks
like this:

    m2 [test]> status net
    {u'total': 1, u'2': [25, 13]}


  I want to make it output something like this:


    m2 [test]> status net
    {u'total': 1, u'2': {
       "fd": 25,
       "last_ping": 13,
       "last_read": 12,
       "last_write": 5,
       "bytes_read": 123,
       "bytes_written": 7654
    }}


  Then we can do trickier things like timing out only inactive
connections, rather than those that have just been around for a long
time.

  Two possible sticking points:

    * are we willing to pay the (hopefully minimal) overhead of
maintaining this extra information about each connection?

    * is it appropriate/desirable to change the "status net" command's
output in a backwards-incompatible way like this?


  Thoughts?  Hopefully I'll get a chance to hack on this over the
weekend.


    Cheers,

       Ryan


-- 
Ryan Kelly
http://www.rfk.id.au  |  This message is digitally signed. Please visit
ryan@rfk.id.au        |  http://www.rfk.id.au/ramblings/gpg/ for details

Re: [mongrel2] implementing smarter timeouts

From:
Loic d'Anterroches
Date:
2011-02-11 @ 08:28
Hello,

>   Two possible sticking points:
> 
>     * are we willing to pay the (hopefully minimal) overhead of
> maintaining this extra information about each connection?

I cannot answer as I cannot really evaluate the real overhead.

>     * is it appropriate/desirable to change the "status net" command's
> output in a backwards-incompatible way like this?

I suppose at the moment, we are all very aware that Mongrel2 is a moving
target which requires a lot of changes on a regular basis in our code.
This is good because this means that we can adapt it to mach our real
needs and do not bother about backward compatibility. I would personnaly
prefer to change as early as possible to do the right thing instead of
the dragging around the weight of the compatibility. Starting the 1.0
release this would make sense, before...

loïc

Re: [mongrel2] implementing smarter timeouts

From:
Ryan Kelly
Date:
2011-02-11 @ 22:48
>  I'd like to change the output of "status net" to give more
>  comprehensive information about each connection. 

I've got a basic version up and running but won't commit it without some
more consideration.  Patch available on my local fossil if anyone wants
to try it out:

   http://www.rfk.id.au:44445/


On Fri, 2011-02-11 at 09:28 +0100, Loic d'Anterroches wrote:
>
> >   Two possible sticking points:
> > 
> >     * are we willing to pay the (hopefully minimal) overhead of
> > maintaining this extra information about each connection?
> 
> I cannot answer as I cannot really evaluate the real overhead.


As a start, tracking this extra information adds 4 32-bit ints to the
Registration struct.  Mongrel2 allocates 64*1204 of these structs
statically, meaning my patch adds an extra 1 MB to the memory usage.

I'm not really set up to do any performance tests (I'd be interested in
doing a "mongrel2 vs nginx" performance shootout, which I imagine would
turn into a "mongrel2 is slightly slower than nginx for now but it comes
with extra awesome" blog post, but just haven't found the time...)

> >     * is it appropriate/desirable to change the "status net" command's
> > output in a backwards-incompatible way like this?
> 
> I suppose at the moment, we are all very aware that Mongrel2 is a moving
> target which requires a lot of changes on a regular basis in our code.
> This is good because this means that we can adapt it to mach our real
> needs and do not bother about backward compatibility. I would personnaly
> prefer to change as early as possible to do the right thing instead of
> the dragging around the weight of the compatibility. Starting the 1.0
> release this would make sense, before...


Yeah, I kind of got that feeling.  I suppose a slightly-less-breaking
change would be to dump out the extra info as more items in the list,
e.g.

    m2 [test]> status net
    {"2": [25, 13, 78, 23, 18, 348],"total": 1}


Pretty ugly though.



   Ryan

-- 
Ryan Kelly
http://www.rfk.id.au  |  This message is digitally signed. Please visit
ryan@rfk.id.au        |  http://www.rfk.id.au/ramblings/gpg/ for details

Re: [mongrel2] implementing smarter timeouts

From:
Zed A. Shaw
Date:
2011-02-12 @ 20:02
On Sat, Feb 12, 2011 at 09:48:39AM +1100, Ryan Kelly wrote:
> 
> >  I'd like to change the output of "status net" to give more
> >  comprehensive information about each connection. 
> 
> I've got a basic version up and running but won't commit it without some
> more consideration.  Patch available on my local fossil if anyone wants
> to try it out:

And I've tweaked it a bit so that it uses the ticker in src/main.c to
keep an internal concept of time.  That avoids having to call time() on
every read/write just to keep the stat.  Instead there's a global
variable of THE_CURRENT_TIME_IS and you can just use that in register.c.

Last change is look at the ZERO_OR_DELTA macro.

-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] implementing smarter timeouts

From:
Ryan Kelly
Date:
2011-02-13 @ 00:35
On Sat, 2011-02-12 at 12:02 -0800, Zed A. Shaw wrote:
> On Sat, Feb 12, 2011 at 09:48:39AM +1100, Ryan Kelly wrote:
> > 
> > >  I'd like to change the output of "status net" to give more
> > >  comprehensive information about each connection. 
> > 
> > I've got a basic version up and running but won't commit it without some
> > more consideration.  Patch available on my local fossil if anyone wants
> > to try it out:
> 
> And I've tweaked it a bit so that it uses the ticker in src/main.c to
> keep an internal concept of time.  That avoids having to call time() on
> every read/write just to keep the stat.  Instead there's a global
> variable of THE_CURRENT_TIME_IS and you can just use that in register.c.

Nice.

The other thing I'm thinking about is whether to send a 408-Timeout
response, a 504-Gateway-Timeout, or just kill their connection.  I
suppose these simple rules are probably enough:

  * bytes_read == 0  =>  408 Timeout
  * bytes_written == 0  =>  504 Gateway Timeout
  * anything else  =>  Violent Close

Do other connection types have equivalent timeout responses?


  Ryan 

-- 
Ryan Kelly
http://www.rfk.id.au  |  This message is digitally signed. Please visit
ryan@rfk.id.au        |  http://www.rfk.id.au/ramblings/gpg/ for details

Re: [mongrel2] implementing smarter timeouts

From:
Zed A. Shaw
Date:
2011-02-13 @ 01:01
On Sun, Feb 13, 2011 at 11:35:33AM +1100, Ryan Kelly wrote:
> The other thing I'm thinking about is whether to send a 408-Timeout
> response, a 504-Gateway-Timeout, or just kill their connection.  I
> suppose these simple rules are probably enough:
> 
>   * bytes_read == 0  =>  408 Timeout
>   * bytes_written == 0  =>  504 Gateway Timeout
>   * anything else  =>  Violent Close

That could work.  I'm a big proponent of the Mushroom Theory myself.
Keep them in the dark and feed them crap.  So I'd go with simply just
doing 408 timeout and leave it at that.

But, try it your idea and see how it works in practice.  This is for
your external killer right?

-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] implementing smarter timeouts

From:
Ryan Kelly
Date:
2011-02-13 @ 09:11
On Sat, 2011-02-12 at 17:01 -0800, Zed A. Shaw wrote:
> On Sun, Feb 13, 2011 at 11:35:33AM +1100, Ryan Kelly wrote:
> > The other thing I'm thinking about is whether to send a 408-Timeout
> > response, a 504-Gateway-Timeout, or just kill their connection.  I
> > suppose these simple rules are probably enough:
> > 
> >   * bytes_read == 0  =>  408 Timeout
> >   * bytes_written == 0  =>  504 Gateway Timeout
> >   * anything else  =>  Violent Close
> 
> That could work.  I'm a big proponent of the Mushroom Theory myself.
> Keep them in the dark and feed them crap.  So I'd go with simply just
> doing 408 timeout and leave it at that.
> 
> But, try it your idea and see how it works in practice.  This is for
> your external killer right?

Yes, for the reaper device in m2wsgi.

Also, it occurs to me that the control port doesn't allow an external
program to send response data out to a client.  It might have to grow a
"send <conn-id> <data>" command to allow for anything but a violent
close.


  Ryan


-- 
Ryan Kelly
http://www.rfk.id.au  |  This message is digitally signed. Please visit
ryan@rfk.id.au        |  http://www.rfk.id.au/ramblings/gpg/ for details

Re: [mongrel2] implementing smarter timeouts

From:
Zed A. Shaw
Date:
2011-02-14 @ 02:09
On Sun, Feb 13, 2011 at 08:11:48PM +1100, Ryan Kelly wrote:
> Also, it occurs to me that the control port doesn't allow an external
> program to send response data out to a client.  It might have to grow a
> "send <conn-id> <data>" command to allow for anything but a violent
> close.

No need.  You take the conn_id from the listing and just send to it
using your handler's send 0mq and the python library like normal.  It
actually doesn't care what one you use, but probably go with the handler
send socket your m2wsgi is managing.

-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] implementing smarter timeouts

From:
Zed A. Shaw
Date:
2011-02-12 @ 19:39
On Sat, Feb 12, 2011 at 09:48:39AM +1100, Ryan Kelly wrote:
> 
> >  I'd like to change the output of "status net" to give more
> >  comprehensive information about each connection. 
> 
> I've got a basic version up and running but won't commit it without some
> more consideration.  Patch available on my local fossil if anyone wants
> to try it out:
> 
>    http://www.rfk.id.au:44445/

Ok this is good stuff.  I've brought it in to the main line so fossil
sync and you should be good.

This will make it easier to do an internal simplistic timeout as well.

-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] implementing smarter timeouts

From:
Zed A. Shaw
Date:
2011-02-12 @ 18:15
On Sat, Feb 12, 2011 at 09:48:39AM +1100, Ryan Kelly wrote:
> As a start, tracking this extra information adds 4 32-bit ints to the
> Registration struct.  Mongrel2 allocates 64*1204 of these structs
> statically, meaning my patch adds an extra 1 MB to the memory usage.

That's not so bad, I'll take a look and see what you did.

-- 
Zed A. Shaw
http://zedshaw.com/