Outgoing streaming/flow control

From:
Brian Downing
Date:
2011-08-09 @ 06:12
I'm thinking of using mongrel2 for an embedded application, mostly
because the handler format is very attractive and easy to support.
However, something I need to do is stream large (many gigabytes)
amounts of generated data; considerably more data than my device has RAM.
Currently, there appears to be no form of flow control available, so when
handling a large request where bandwidth is limited on the other side, the
mongrel2 process's memory usage bloats up immensely until it is killed.

The only way I could see to fix this within the current 0mq handler
scheme would be to add flow control messages, similar to the current
JSON disconnect message.  Preferably these would report the server's
buffer size for the connection once it's past a certain threshold, so the
handler can decide to choke off output until the other side catches up.
Are there any plans for something like this in the works, or any better
idea on how to fix this problem?  (Obviously I could proxy the HTTP or
otherwise handle the socket directly, but I'm trying to avoid that.)

-bcd

Re: [mongrel2] Outgoing streaming/flow control

From:
Zed A. Shaw
Date:
2011-08-12 @ 16:45
On Tue, Aug 09, 2011 at 01:12:02AM -0500, Brian Downing wrote:
> I'm thinking of using mongrel2 for an embedded application, mostly
> because the handler format is very attractive and easy to support.

Very cool.

> However, something I need to do is stream large (many gigabytes)
> amounts of generated data; considerably more data than my device has RAM.
> Currently, there appears to be no form of flow control available, so when
> handling a large request where bandwidth is limited on the other side, the
> mongrel2 process's memory usage bloats up immensely until it is killed.

Do you need to stream large gigabytes TO the mongrel2 server or FROM?

In other words, are you trying to abuse HTTP to create a bidirectional
chat protocol that streams out gigabytes of data perpetually in both
directions?

Or, do you want to have someone connect and then you stream out huge
amounts of data?

OR, do you want to have someone send you huge amounts of data streamed
piece by piece?

-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] Outgoing streaming/flow control

From:
Brian Downing
Date:
2011-08-12 @ 18:05
On Fri, Aug 12, 2011 at 09:45:51AM -0700, Zed A. Shaw wrote:
> > However, something I need to do is stream large (many gigabytes)
> > amounts of generated data; considerably more data than my device has RAM.
> > Currently, there appears to be no form of flow control available, so when
> > handling a large request where bandwidth is limited on the other side, the
> > mongrel2 process's memory usage bloats up immensely until it is killed.
> 
> Do you need to stream large gigabytes TO the mongrel2 server or FROM?
> 
> In other words, are you trying to abuse HTTP to create a bidirectional
> chat protocol that streams out gigabytes of data perpetually in both
> directions?

Nope.

> Or, do you want to have someone connect and then you stream out huge
> amounts of data?

I want to stream FROM mongrel2.  Typically this will be a simple GET
request, so there's no incoming body from the client at all.  However,
I will be generating the response to send to the client on the fly,
and I will be able to do that much faster than a WiFi network or most
Internet connections can transmit.  I want to be able to send at full
available bandwidth to the client without filling memory.

With current mongrel2, the handler generating the data has no way of
knowing the socket to the HTTP client is full and just keeps happily
spewing out to the PUB socket, and the mongrel2 process's outgoing io
buffer bloats up until it is oomkilled.  If I were handling the socket
myself it is obviously trivial to cease filling the write buffer when
socket writes start returning EAGAIN.
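
To illustrate the direct-socket case described above, here is a minimal
Python sketch.  It is not mongrel2 code: the function name `pump` and the
chunk generator are hypothetical, and it assumes a non-blocking stream
socket.  Data is pulled from the generator only while the socket accepts
it, so nothing piles up in memory once a write would block (EAGAIN).

```python
import socket


def pump(sock, chunks, pending=b""):
    """Send generated chunks until the socket would block.

    Returns (bytes_sent, leftover).  `leftover` is the unsent tail that
    must be retried once the socket becomes writable again.
    """
    sent_total = 0
    while True:
        if not pending:
            pending = next(chunks, b"")   # generate data only on demand
            if not pending:
                return sent_total, b""    # generator exhausted
        try:
            n = sock.send(pending)
        except BlockingIOError:           # EAGAIN/EWOULDBLOCK: buffer is full
            return sent_total, pending
        sent_total += n
        pending = pending[n:]


if __name__ == "__main__":
    a, b = socket.socketpair()            # stand-in for the client connection
    a.setblocking(False)
    data = (b"x" * 65536 for _ in range(10000))
    sent, leftover = pump(a, data)
    print("sent", sent, "bytes; holding", len(leftover), "bytes until writable")
    a.close()
    b.close()
```

A real sender would wait for writability (for example with `select`) and
then call `pump` again with the leftover bytes.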

This is one scenario.  In another I am streaming data, again FROM
mongrel2, to more than one connected client at once.  Unlike the above
scenario, the data are being generated live.  In most cases they will
be generated slower than the bandwidth available to the client, but if
this is not the case and a client gets too far behind I want to close
the connection to that client rather than have it get data that are too
out-of-date.  (This is not a real-time application; Internet latencies
are fine, but I don't want to keep getting further and further behind
if the client can't keep up.)

> OR, do you want to have someone send you huge amounts of data streamed
> piece by piece?

Nope, nothing like that.

-bcd

Re: [mongrel2] Outgoing streaming/flow control

From:
Zed A. Shaw
Date:
2011-08-13 @ 16:25
On Fri, Aug 12, 2011 at 01:05:15PM -0500, Brian Downing wrote:
> I want to stream FROM mongrel2.  Typically this will be a simple GET
> request, so there's no incoming body from the client at all.

Ok, so it's easy to do with a handler, but...

> With current mongrel2, the handler generating the data has no way of
> knowing the socket to the HTTP client is full and just keeps happily
> spewing out to the PUB socket, and the mongrel2 process's outgoing io
> buffer bloats up until it is oomkilled.  If I were handling the socket
> myself it is obviously trivial to cease filling the write buffer when
> socket writes start returning EAGAIN.

Yeah, that's a problem, because you aren't really in control of the
socket directly so you have very little idea what's going on.  If I were
to tackle this I'd be looking at the control port's status information
(which you can work with using a simple JSON protocol).  It keeps track of
average bytes transferred and how long the connection's been active.  You
could use that to throttle your handler, but it'd get complicated.

To be honest, it sounds to me like for this application Mongrel2 might
not work for you.  You *really* want to have full control of the socket
in this case and without direct access to the socket you'd be screwed.
You'll always be a 2nd class citizen.  If you can think of some data you
could use to make this work I'd be happy to entertain adding it, or work
up something that does it off the control port.

One final thing to try is to set the client's timeout such that if it
can't keep up with your transfer rate requirements then mongrel2 will
kill it.  Look at the limits.min_ping, limits.min_read,
limits.min_write settings in "Tweakable Expert Settings":

http://mongrel2.org/static/mongrel2-manual.html#x1-400003.10

I think if you set those to something reasonable, and set the
limits.tick_timer low enough, then Mongrel2 will monitor those
connections for you and throw them out.

If you want better control than that, then hit me up with ideas.

> This is one scenario.  In another I am streaming data, again FROM
> mongrel2, to more than one connected client at once.  Unlike the above
> scenario, the data are being generated live.  In most cases they will
> be generated slower than the bandwidth available to the client, but if
> this is not the case and a client gets too far behind I want to close
> the connection to that client rather than have it get data that are too
> out-of-date.  (This is not a real-time application; Internet latencies
> are fine, but I don't want to keep getting further and further behind
> if the client can't keep up.)

This one shouldn't be hard.  Assuming you know that the client is going
too slow, you just send it a close (which is a 0 length message to that
client).  Also, you know you can send one message to target up to 128
clients at a time, right?  That includes closing them, so you can easily
handle large amounts of streaming.
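
As a rough sketch of what a multi-target close could look like from a
Python handler: the helper names below are hypothetical, and the wire
format ("UUID LEN:ID ID ..., BODY", with LEN the length of the
space-separated ID list) is an assumption about the handler response
format rather than something spelled out in this thread.  Only the
formatting is shown; publishing the bytes on the handler's response
socket is left out.

```python
MAX_IDENTS = 128  # per-message target limit mentioned in this thread


def build_response(sender_uuid, conn_ids, body=b""):
    """Format one handler response aimed at several connection IDs.

    Assumed wire format: "UUID LEN:ID ID ..., BODY".
    """
    if not 0 < len(conn_ids) <= MAX_IDENTS:
        raise ValueError("need between 1 and %d connection IDs" % MAX_IDENTS)
    idents = " ".join(str(c) for c in conn_ids).encode("ascii")
    header = sender_uuid.encode("ascii") + b" %d:%s, " % (len(idents), idents)
    return header + body


def close_message(sender_uuid, slow_conn_ids):
    """A close is a response with a zero-length body."""
    return build_response(sender_uuid, slow_conn_ids, b"")


if __name__ == "__main__":
    # "server-uuid" and the connection IDs are placeholder values.
    print(close_message("server-uuid", [3, 7, 12]))
```

With more than 128 slow clients, the ID list would have to be split
across several messages.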

Let me know if you want to play with the control port idea as well.
Basically, fire up "m2sh control" and try this:

status what=net

If you have a few connections going at the time, then you can see what
data it tracks.  This control port is fully accessible from a
programming language, since it's just a simple tnetstring protocol.
With that data you could possibly make a little thing that watches the
connections and signals your handlers to back off when it sees they're
getting overloaded.
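
For anyone who wants to try this from code, here is a minimal tnetstring
encoder in Python.  The encoder follows the tnetstring format itself
(length, colon, payload, one-character type tag).  The shape of the
control request, a [command, arguments] pair, is an assumption, and the
function names are hypothetical.  Sending the bytes to the control port
needs a ZeroMQ socket and is not shown.

```python
def tnet_dump(value):
    """Encode a Python value as a tnetstring: LENGTH:PAYLOAD plus a type tag."""
    if value is None:
        payload, tag = b"", b"~"
    elif isinstance(value, bool):      # test before int: bool subclasses int
        payload, tag = (b"true" if value else b"false"), b"!"
    elif isinstance(value, int):
        payload, tag = str(value).encode("ascii"), b"#"
    elif isinstance(value, bytes):
        payload, tag = value, b","
    elif isinstance(value, str):
        payload, tag = value.encode("utf-8"), b","
    elif isinstance(value, (list, tuple)):
        payload, tag = b"".join(tnet_dump(v) for v in value), b"]"
    elif isinstance(value, dict):
        payload = b"".join(tnet_dump(k) + tnet_dump(v) for k, v in value.items())
        tag = b"}"
    else:
        raise TypeError("cannot encode %r" % type(value))
    return str(len(payload)).encode("ascii") + b":" + payload + tag


def status_request(what="net"):
    # Assumed request shape: a [command, arguments] pair.
    return tnet_dump(["status", {"what": what}])


if __name__ == "__main__":
    print(status_request())  # b'26:6:status,13:4:what,3:net,}]'
```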

-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] Outgoing streaming/flow control

From:
michael j pan
Date:
2011-08-12 @ 18:18
> I want to stream FROM mongrel2.  Typically this will be a simple GET
> request, so there's no incoming body from the client at all.  However,
> I will be generating the response to send to the client on the fly,
> and I will be able to do that much faster than a WiFi network or most
> Internet connections can transmit.  I want to be able to send at full
> available bandwidth to the client without filling memory.

We have a similar requirement and here's how we solve it.  Our current
implementation is not using mongrel2, but mongrel2 is only the
mediator.

- client requests data
- mediator forwards the request to the handler
- the handler replies to the mediator with the socket info to which it will
push the data (e.g. a zmq endpoint tcp://1.1.1.1:5000)
- the client connects to that endpoint using zmq sub
- the handler binds to that endpoint, and publishes all the data there

HTH
Mike

Re: [mongrel2] Outgoing streaming/flow control

From:
Brian Downing
Date:
2011-08-12 @ 18:25
On Sat, Aug 13, 2011 at 02:18:12AM +0800, michael j pan wrote:
> We have a similar requirement and here's how we solve it.  Our current
> implementation is not using mongrel2, but mongrel2 is only the
> mediator.
> 
> - client requests data
> - mediator forwards the request to the handler
> - the handler replies to the mediator with the socket info to which it will
> push the data (e.g. a zmq endpoint tcp://1.1.1.1:5000)
> - the client connects to that endpoint using zmq sub
> - the handler binds to that endpoint, and publishes all the data there

Just to check to see if I am understanding this correctly - the data never
goes out over HTTP, and instead the client makes a second zmq connection
directly to the handler?  Unfortunately in my situation having it go out
as part of the HTTP connection is a hard requirement - it should look like
a normal GET request to the client, which will typically be a web browser.

-bcd

Re: [mongrel2] Outgoing streaming/flow control

From:
michael j pan
Date:
2011-08-12 @ 18:35
On Sat, Aug 13, 2011 at 02:25, Brian Downing <bdowning@lavos.net> wrote:
> On Sat, Aug 13, 2011 at 02:18:12AM +0800, michael j pan wrote:
>> We have a similar requirement and here's how we solve it.  Our current
>> implementation is not using mongrel2, but mongrel2 is only the
>> mediator.
>>
>> - client requests data
>> - mediator forwards the request to the handler
>> - the handler replies to the mediator with the socket info to which it will
>> push the data (e.g. a zmq endpoint tcp://1.1.1.1:5000)
>> - the client connects to that endpoint using zmq sub
>> - the handler binds to that endpoint, and publishes all the data there
>
> Just to check to see if I am understanding this correctly - the data never
> goes out over HTTP, and instead the client makes a second zmq connection
> directly to the handler?  Unfortunately in my situation having it go out
> as part of the HTTP connection is a hard requirement - it should look like
> a normal GET request to the client, which will typically be a web browser.
>

Yup, you understand our scenario correctly.  In your case though, you
could have the mediator tell the client(s) an HTTP endpoint (as opposed
to a ZMQ one).  The client would then make an HTTP GET request to that
endpoint.  Though in this case, one might ask what's the point of
mongrel2 for your use case...

Mike

Re: [mongrel2] Outgoing streaming/flow control

From:
Jim Fulton
Date:
2011-08-09 @ 19:46
On Tue, Aug 9, 2011 at 2:12 AM, Brian Downing <bdowning@lavos.net> wrote:
> I'm thinking of using mongrel2 for an embedded application, mostly
> because the handler format is very attractive and easy to support.
> However, something I need to do is stream large (many gigabytes)
> amounts of generated data; considerably more data than my device has RAM.
> Currently, there appears to be no form of flow control available, so when
> handling a large request where bandwidth is limited on the other side, the
> mongrel2 process's memory usage bloats up immensely until it is killed.
>
> The only way I could see to fix this within the current 0mq handler
> scheme would be to add flow control messages, similar to the current
> JSON disconnect message.  Preferably these would report the server's
> buffer size for the connection once it's past a certain threshold, so the
> handler can decide to choke off output until the other side catches up.
> Are there any plans for something like this in the works, or any better
> idea on how to fix this problem?  (Obviously I could proxy the HTTP or
> otherwise handle the socket directly, but I'm trying to avoid that.)

If mongrel2 used push-pull sockets rather than pub-sub sockets for
getting responses from handlers, then 0mq's built-in flow control
could be used.

The mongrel2 documentation threatens to make the socket types
configurable in a later version. I suspect the simplest way to fix this
would be to allow push-pull sockets to be used for returning responses
and to allow configuration of their high-water marks.  This would be
far less intrusive at the application level than some sort of
application-level flow control.
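
The effect of a high-water mark can be pictured with an in-process
analogy in Python.  This is not ZeroMQ code: `queue.Queue` with a
`maxsize` merely stands in for a push-pull pair with a configured
high-water mark, and `run_pipeline` is a hypothetical name.  The producer
blocks whenever the queue is full, so the consumer sets the pace and the
buffered data stays bounded.

```python
import queue
import threading


def run_pipeline(items, hwm=4):
    """Push `items` through a queue that holds at most `hwm` entries."""
    q = queue.Queue(maxsize=hwm)   # maxsize plays the role of the high-water mark
    received = []
    max_depth = 0

    def producer():
        for item in items:
            q.put(item)            # blocks while the queue is full
        q.put(None)                # sentinel: no more data

    t = threading.Thread(target=producer)
    t.start()
    while True:
        max_depth = max(max_depth, q.qsize())
        item = q.get()             # the consumer decides how fast data moves
        if item is None:
            break
        received.append(item)
    t.join()
    return received, max_depth


if __name__ == "__main__":
    received, depth = run_pipeline(range(1000), hwm=4)
    print(len(received), "items delivered; queue depth never exceeded", depth)
```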

(I'm assuming that mongrel2 only pulls data off its incoming
sockets as fast as it could send it to HTTP clients.)

Then again, I'm new to both mongrel2 and 0mq, so I may not know what
I'm talking about. :)

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton

Re: [mongrel2] Outgoing streaming/flow control

From:
Jason Miller
Date:
2011-08-09 @ 17:23
On 01:12 Tue 09 Aug     , Brian Downing wrote:
> The only way I could see to fix this within the current 0mq handler
> scheme would be to add flow control messages, similar to the current
> JSON disconnect message.  Preferably these would report the server's
> buffer size for the connection once it's past a certain threshold, so the
> handler can decide to choke off output until the other side catches up.
> Are there any plans for something like this in the works, or any better
> idea on how to fix this problem?  (Obviously I could proxy the HTTP or
> otherwise handle the socket directly, but I'm trying to avoid that.)
> 
> -bcd
> 
The simplest implementation would be for mongrel2 to just send a message
when the data has gone out.  Then you can control the amount of
outstanding data by:
 1) The size of messages you send
 2) The number of outstanding messages at a time

Something like that ought to solve your problem.  I think whether or not
mongrel sends these messages ought to be an option rather than just
sending them to every handler.  Thoughts?
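
On the handler side, the idea could look roughly like the Python sketch
below.  The "data has gone out" message is only a proposal in this
thread, so the acknowledgement is hypothetical, as are the class and
method names.  Memory use is bounded by the window size times the
largest chunk size, which are the two knobs listed above.

```python
class WindowedSender:
    """Keep at most `window` messages unacknowledged at any time."""

    def __init__(self, chunks, send, window=4):
        self.chunks = iter(chunks)   # lazily generated response pieces
        self.send = send             # callable that publishes one message
        self.window = window
        self.outstanding = 0
        self.done = False

    def fill(self):
        """Send until the window is full or the data runs out."""
        while not self.done and self.outstanding < self.window:
            chunk = next(self.chunks, None)
            if chunk is None:
                self.done = True
                break
            self.send(chunk)
            self.outstanding += 1

    def on_ack(self):
        """Call when the (hypothetical) 'data has gone out' message arrives."""
        if self.outstanding > 0:
            self.outstanding -= 1
        self.fill()


if __name__ == "__main__":
    sent = []
    sender = WindowedSender((b"chunk %d" % i for i in range(10)), sent.append, window=3)
    sender.fill()
    print(len(sent), "messages in flight before the first acknowledgement")
```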

-Jason

Re: [mongrel2] Outgoing streaming/flow control

From:
Brian Downing
Date:
2011-08-09 @ 18:11
On Tue, Aug 09, 2011 at 10:23:09AM -0700, Jason Miller wrote:
> The simplest implementation would be for mongrel2 to just send a message
> when the data has gone out.  Then you can control the amount of
> outstanding data by:
>  1) The size of messages you send
>  2) The number of outstanding messages at a time
> 
> Something like that ought to solve your problem.  I think whether or not
> mongrel sends these messages ought to be an option rather than just
> sending them to every handler.  Thoughts?

That would work for me, though I'm not sure that'd actually be simplest
to implement since I doubt the socket buffering code keeps the message
boundaries from the handler intact (though I admit I have not looked
very deeply at the code).

-bcd