How exactly (if at all) will Mongrel2 handle streaming data, such as file uploads or large PUT or POST requests? Will the entity body first be completely downloaded into a temporary file and the filename sent to the handler, or will the handler be able to incrementally read the data as it comes in? One of the major points at which Node.js excels is its ability to efficiently handle streaming data, but Mongrel2's architecture seems to be focused more on synchronous applications.
On Tue, Jul 13, 2010 at 08:34:01AM -0700, Alexander Kern wrote:
> How exactly (if at all) will Mongrel2 handle streaming data, such as
> file uploads or large PUT or POST requests? Will the entity body first
> be completely downloaded into a temporary file and the filename sent
> to the handler, or will the handler be able to incrementally read the
> data as it comes in? One of the major points at which Node.js excels
> is it's ability to efficiently handle streaming data, but Mongrel2's
> architecture seems to be focused more on synchronous applications.

So, I've been cooking up schemes in my mind, but right now I'm
contemplating kind of a "split" design on it with the goal of making it
easy to tell the browser the request was done without blocking the
browser while you actually handle the upload contents.

What I've got so far, and tell me what you think, is having it do:

1. Mongrel2 sees the "MOBY" request, which means a big body.
2. It notifies the backend handler right away that one is coming in,
   with all the relevant headers, but not the actual body yet.
3. The backend will reply with the response that should go out when the
   upload is done after doing the usual checks.
4. Mongrel2 will then deal with the browser's request by streaming the
   actual contents to a temp file somewhere.
5. When the upload is complete, Mongrel2 shoots the response it has
   saved from #3 above to the browser, and then...
6. Sends a new message to another "upload handler" telling it the
   upload was done, and where to get the tmpfile.
7. The upload handler then does whatever needs to be done. Video
   transcoding, S3 push, notify other handlers, whatever.

It may even be possible to have the handler from #3 indicate what the
final "upload route target" should be, but probably as a later feature.

What do you think?

--
Zed A. Shaw
http://zedshaw.com/
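To make the ordering of steps 1 through 7 concrete, here is a minimal Python sketch of the split flow. Everything in it is an assumption made for illustration: the names `check_headers`, `run_split_upload` and `Decision`, the lowercase header keys, and the 50 MB cutoff are hypothetical and are not part of Mongrel2 or its handler API. It only simulates the sequence: decide from headers alone, spool the body to a temp file, then hand the path to whatever processes the upload.

```python
import tempfile
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # hypothetical config cutoff


@dataclass
class Decision:
    accept: bool
    status: int   # sent after the upload on accept, right away on reject
    reason: str


def check_headers(headers: dict) -> Decision:
    """Steps 2-3: the first handler sees headers only, never the body."""
    length = headers.get("content-length", "")
    if not length.isdigit():
        return Decision(False, 411, "Length Required")
    if int(length) > MAX_UPLOAD_BYTES:
        return Decision(False, 413, "Request Entity Too Large")
    return Decision(True, 200, "OK")


def run_split_upload(
    headers: dict, body: Iterable[bytes]
) -> Tuple[Decision, Optional[str]]:
    """Steps 4-6: spool the body to disk, then report where it landed."""
    decision = check_headers(headers)
    if not decision.accept:
        return decision, None          # nothing is written for a rejected upload
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for piece in body:             # body arrives in pieces, never all in RAM
            tmp.write(piece)
    return decision, tmp.name          # the upload handler gets this path (step 7)
```

A caller would pass the returned path to the second handler; on rejection the path is `None` and only the saved response goes back to the browser.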
Interesting idea. Some comments: > 1. Mongrel2 sees the "MOBY" request, which means a big body. Where do you draw the line between the size of the bodies? Maybe a specific SQLite setting that could be set as the buffer size (16kb or something small). On top of that, two ways of accessing the same information could be annoying: what if a form can be submitted with or without a file attached? This would complicate handler code. > 3. The backend will reply with the response that should go out when > the upload > is done after doing the usual checks. What if the checks involve processing of the body itself? For example, the browser (or that god-awful thing called Flash) could send an invalid or unknown (think application/octet-stream, which Flash stupidly sends uploads with) Content-Type because of its inability to guess the MIME from the file extension. If you send one of these to something like an image upload service it *should* reply with a 415 Unsupported Media Type (or 400 if you're lazy). This check can't be done unless you are able to check the file itself. > 4. Mongrel2 will then deal with the browser's request by streaming the > actual contents to a temp file somewhere. Where would this tempfile be, and how would it be accessed? One of the benefits of using something like 0mq is that the handler can be located on the network rather than locally on the filesystem. Would you give the handler a URL that it could stream the data from (using some streaming protocol or raw TCP)?
On Tue, Jul 13, 2010 at 10:12:11AM -0700, Alexander Kern wrote:
> Interesting idea. Some comments:
>
> >1. Mongrel2 sees the "MOBY" request, which means a big body.
> Where do you draw the line between the size of the bodies? Maybe a
> specific SQLite setting that could be set as the buffer size (16kb
> or something small). On top of that, two ways of accessing the same
> information could be annoying: what if a form can be submitted with
> or without a file attached? This would complicate handler code.

Yep, there'd be a cutoff somewhere in the config, so if you set it high
enough you wouldn't deal with the uploads. Also, the first handler
could easily give a response of "screw it, just hand it to me" to make
a better decision.

As for complicating the handler, yep it does do that, but since
handlers are fairly easy to write, and you'd write two, it's hopefully
not too hard and makes your app deal with file uploads way better.

> >3. The backend will reply with the response that should go out
> >when the upload is done after doing the usual checks.
> What if the checks involve processing of the body itself? For
> example, the browser (or that god-awful thing called Flash) could
> send an invalid or unknown (think application/octet-stream, which
> Flash stupidly sends uploads with) Content-Type because of its
> inability to guess the MIME from the file extension. If you send one
> of these to something like an image upload service it *should* reply
> with a 415 Unsupported Media Type (or 400 if you're lazy). This
> check can't be done unless you are able to check the file itself.

So in this case you'd need a way for the handler that deals with the
actual file to give the response, not the first? That's doable, and
actually it might be simple to just have handlers only send responses
if they're supposed to. Remember, Mongrel2 doesn't care who sends what
to a connected browser, so both, none, or one of the handlers you've
got can give responses.

In this case, just don't have the first handler say anything if the
request is alright. Then the second handler does its thing with the
file and sends the 415 response if needed. That also simplifies the
design a bunch.

> >4. Mongrel2 will then deal with the browser's request by streaming
> >the actual contents to a temp file somewhere.
> Where would this tempfile be, and how would it be accessed? One of
> the benefits of using something like 0mq is that the handler can be
> located on the network rather than locally on the filesystem. Would
> you give the handler a URL that it could stream the data from (using
> some streaming protocol or raw TCP)?

There's three proposals I've thought of for this:

1. On a file on disk, it's up to your upload processing handler to
   figure it out from there.
2. Out of an HTTP directory, so something on the network can grab it.
3. Off a raw socket so you can just connect FTP style and pull the
   whole thing down.

Of the three I like #1 since it means you can implement #2 and #3 if
you need and it works for everyone in the simplest case, while letting
people get more complex if they need. #3 is teh suck to me since that
means defining a new protocol for very little benefit.

--
Zed A. Shaw
http://zedshaw.com/
> Yep, there'd be a cutoff somewhere in the config, so if you set it > high > enough you wouldn't deal with the uploads. Also, the first handler > could easily give a response of "screw it, just hand it to me" to > make a > better decision. I think the reverse would also be useful, something like "never send me the entity body". Certain resources will almost always receive POSTs or PUTs with a binary attached (image upload services especially). It'd be beneficial to the handler of such a service to have only one way of accessing the body, even if the image is a 2kb PNG or a 10mb JPG. Just a thought. > In this case, just don't have the first handler say anything if the > request is alright. Then the second handler does its thing with the > file and sends the 415 response if needed. That also simplifies the > design a bunch. This would work, but why have 2 handlers then? Couldn't you just skip the first handler and let the second one take care of business? > 1. On a file on disk, it's up to your upload processing handler to > figure it out from there. Definitely do this. > 2. Out of an HTTP directory, so something on the network can grab it. Wouldn't this defeat one of the major benefits of Mongrel2? This would force the client/handler/whoever to parse HTTP using whatever (possibly slow) parser they have instead of the ragel one. > 3. Off a raw socket so you can just connect FTP style and pull the > whole > thing down. Blech, seems hacky and unfinished, I agree. Could the handler be configured instead to receive something like chunked HTTP, with it receiving an initial header message, then blocking until it receives subsequent body messages?
On Tue, Jul 13, 2010 at 10:44:30AM -0700, Alexander Kern wrote:
> > In this case, just don't have the first handler say anything if the
> > request is alright. Then the second handler does its thing with the
> > file and sends the 415 response if needed. That also simplifies the
> > design a bunch.
> This would work, but why have 2 handlers then? Couldn't you just skip
> the first handler and let the second one take care of business?

Because you don't want Mongrel2 to process any files that it shouldn't.
A *very* common scenario is that the front web server in a cluster
completes a whole request, then hits the backend only to find out that
the URL is wrong, or that request isn't valid, or the login is wrong,
or it's too big, etc. Something that's easily checked from just headers
on an initial hit to a handler.

What this cuts down on is, if the base HTTP request indicates that the
upload should not happen, then Mongrel2 can cut the browser off right
away and shoot a response without having to complete the upload.

But, nothing prevents you from having the same handler deal with it.
It's just routing after all, and since the message formats are
universal you just have to deal with both requests and do your thing.

> > 1. On a file on disk, it's up to your upload processing handler to
> > figure it out from there.
> Definitely do this.
>
> > 2. Out of an HTTP directory, so something on the network can grab it.
> Wouldn't this defeat one of the major benefits of Mongrel2? This would
> force the client/handler/whoever to parse HTTP using whatever
> (possibly slow) parser they have instead of the ragel one.

Yep, thus why I'm not so interested in this. I think the majority of
folks who do any serious uploads are most likely going to be pushing
the uploaded file to some "S3 like thing" and dealing with it there.
#1 makes this and anything else possible. Like I said, you can
implement #2 if you have #1, but the inverse is harder.

> Could the handler be configured instead to receive something like
> chunked HTTP, with it receiving an initial header message, then
> blocking until it receives subsequent body messages?

Yep, except 99% of all HTTP libraries suck and couldn't handle this
type of streaming. It'd be possible to implement it easily though using
the basic primitives.

--
Zed A. Shaw
http://zedshaw.com/
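One way to picture "an initial header message, then subsequent body messages" built from basic primitives is the sketch below. The framing is invented for this example (a one-byte tag of `H`, `B` or `E` in front of each message, with JSON headers) and is not Mongrel2's backend protocol; the function names are hypothetical too. Plain Python iterables stand in for the message transport.

```python
import json
from typing import Iterable, Iterator, Tuple


def to_messages(headers: dict, body: bytes, size: int) -> Iterator[bytes]:
    """Split one request into a header message, body messages, and an end marker."""
    yield b"H" + json.dumps(headers).encode("utf-8")
    for start in range(0, len(body), size):
        yield b"B" + body[start:start + size]
    yield b"E"


def from_messages(messages: Iterable[bytes]) -> Tuple[dict, bytes]:
    """Consume messages in order; a real handler could act on each body part."""
    headers: dict = {}
    parts = []
    for msg in messages:
        kind, payload = msg[:1], msg[1:]
        if kind == b"H":
            headers = json.loads(payload.decode("utf-8"))
        elif kind == b"B":
            parts.append(payload)      # incremental processing would go here
        elif kind == b"E":
            break                      # sender says the body is complete
        else:
            raise ValueError("unknown message tag: %r" % kind)
    return headers, b"".join(parts)
```

Each message stays atomic; streaming comes from sending several of them, which is why every application would be free to pick a different framing.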
On Jul 13, 2010, at 14:05 , Zed A. Shaw wrote: > On Tue, Jul 13, 2010 at 10:44:30AM -0700, Alexander Kern wrote: >>> In this case, just don't have the first handler say anything if the >>> request is alright. Then the second handler does its thing with the >>> file and sends the 415 response if needed. That also simplifies the >>> design a bunch. >> This would work, but why have 2 handlers then? Couldn't you just skip >> the first handler and let the second one take care of business? > > Because you don't want Mongrel2 to process any files that it shouldn't. > A *very* common scenario is that the front web server in a cluster > completes a whole request, then hits the backend only to find out that > the URL is wrong, or that request isn't valid, or the login is wrong, > or it's too big, etc. Something that's easily checked from just headers > on an initial hit to a handler. > > What this cuts down on is, if the base HTTP request indicates that the > upload should not happen, then Mongrel2 can cut the browser off right > away and shoot a response without having to complete the upload. This is brilliant; I haven't ever seen a web server that would do something other than blindly wait for the request to finish before processing and sending a response. In order to still handle HTTP correctly, it seems like you'd need to introduce another mongrel2 state, like, "we're going to tell the client to piss off, but they're still sending garbage, so throw away what they send and then respond." Then again, given the internal FSM, that should be straightforward. But yeah, overall, I like the dual-handler design for uploads with a "fast fail" if one of the handlers rejects the request outright; makes a hell of a lot of sense to me. best, - Fred. http://weblog.fredalger.net/ @_phred
On Tue, Jul 13, 2010 at 03:52:28PM -0400, Fred Alger wrote: > On Jul 13, 2010, at 14:05 , Zed A. Shaw wrote: > > What this cuts down on is, if the base HTTP request indicates that > > the upload should not happen, then Mongrel2 can cut the browser off > > right away and shoot a response without having to complete the > > upload. > > This is brilliant; I haven't ever seen a web server that would do > something other than blindly wait for the request to finish before > processing and sending a response. In order to still handle HTTP > correctly, it seems like you'd need to introduce another mongrel2 > state, like, "we're going to tell the client to piss off, but they're > still sending garbage, so throw away what they send and then respond." > Then again, given the internal FSM, that should be straightforward. I think actually this would be a new state of "MobyRequest", not to be confused with the musician. :-) It'd be similar to Proxying, but instead it's negotiating the upload of a giant request body. > But yeah, overall, I like the dual-handler design for uploads with a > "fast fail" if one of the handlers rejects the request outright; makes > a hell of a lot of sense to me. About the only thing that'd have to be worked out is if this is kosher with the protocol. I think the server is allowed to close the socket violently and send a reply, but not sure if browsers will like that or get it. -- Zed A. Shaw http://zedshaw.com/
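The extra state Fred describes, next to the proposed "MobyRequest" state, can be pictured as a small transition table. This is a sketch under assumptions: the state and event names are made up here and do not come from Mongrel2's actual state machine, and the DISCARDING path models Fred's "throw away what they send and then respond" variant rather than closing the socket early.

```python
from enum import Enum, auto
from typing import Iterable


class State(Enum):
    IDLE = auto()
    MOBY_REQUEST = auto()   # headers seen, waiting for the handler's verdict
    UPLOADING = auto()      # body is being written to a temp file
    DISCARDING = auto()     # body is read and thrown away
    RESPONDING = auto()


# (current state, event) -> next state; anything else is an error.
TRANSITIONS = {
    (State.IDLE, "big_body_headers"): State.MOBY_REQUEST,
    (State.MOBY_REQUEST, "handler_accepts"): State.UPLOADING,
    (State.MOBY_REQUEST, "handler_rejects"): State.DISCARDING,
    (State.UPLOADING, "body_complete"): State.RESPONDING,
    (State.DISCARDING, "body_complete"): State.RESPONDING,
    (State.RESPONDING, "response_sent"): State.IDLE,
}


def step(state: State, event: str) -> State:
    """Apply one event, refusing transitions the table does not list."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in {state.name}") from None


def run(events: Iterable[str], state: State = State.IDLE) -> State:
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Whether a browser tolerates the alternative (closing the socket before the body has been read) is the open question raised above; the table only covers the read-and-discard path.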
> Because you don't want Mongrel2 to process any files that it > shouldn't. > A *very* common scenario is that the front web server in a cluster > completes a whole request, then hits the backend only to find out that > the URL is wrong, or that request isn't valid, or the login is wrong, > or it's too big, etc. Something that's easily checked from just > headers > on an initial hit to a handler. > > What this cuts down on is, if the base HTTP request indicates that the > upload should not happen, then Mongrel2 can cut the browser off right > away and shoot a response without having to complete the upload. > > But, nothing prevents you from having the same handler deal with it. > It's just routing after all, and since the message formats are the > universal you just have to deal with both requests and do your thing. I love the way WebMachine deals with this. Basically it maps the HTTP protocol to a set of callback functions and has a decision engine behind them. Only after parsing the headers does it even touch the body (and yes, it does support streaming). I think we're thinking of the same thing, basically letting the handler (or multiple handlers) parse stuff in any order they want, sending a response once they have enough information. (This is the perfect use case for 100 Continue, by the way. If only browsers actually *used* it...) > Yep, except 99% of all HTTP libraries suck and couldn't handle this > type > of streaming. It'd be possible to implement it easily though using > the > basic primitives. I completely agree. True HTTP support in general sucks. When it exists, the interface usually sucks or is too low level to make code expressive. I'm writing a Ruby/Node library right now that deals with just this (since Node's is too low level and Ruby's just sucks). Mongrel2 will require some changes in application deployment, so why not encourage users to use streaming HTTP libraries? :)
On Tue, Jul 13, 2010 at 11:24:47AM -0700, Alexander Kern wrote:
> Mongrel2 will require some changes in application deployment, so why
> not encourage users to use streaming HTTP libraries? :)

Because nobody would write them. HTTP chunked encoding is a bizarre,
often abused corner of the standard, and it's horribly inefficient.
It's way better to just let whoever needs and wants this use the basics
to implement it than to try to do it myself and spend the next year
convincing people to do it my way.

--
Zed A. Shaw
http://zedshaw.com/
"Zed A. Shaw" <zedshaw@zedshaw.com> wrote:
> On Tue, Jul 13, 2010 at 11:24:47AM -0700, Alexander Kern wrote:
> > Mongrel2 will require some changes in application deployment, so why
> > not encourage users to use streaming HTTP libraries? :)
>
> Because nobody would write them. HTTP chunked encoding is a bizarre
> often abused corner of the standard, and it's horribly innefficient.
> It's way better to just let whoever needs and want this use the basics
> to get implement it than trying to do it myself and spend the next year
> convincing people to do it my way.

While definitely a corner case and rarely seen, chunked encoding can be
useful and more efficient if used in a pipeline.

Mobile devices can stream compressed voice data to a server even while
that stream is active (somebody is speaking into it). A chunk-aware
server can then start processing that data before the speaker has even
finished speaking and return a result sooner after the last phrase is
spoken. Since processing audio data can be expensive, it makes even
more sense to process it incrementally as the client uploads it.

Another potentially useful case is if I run out of space on my local
machine and need to back up to a storage provider:

  tar zcf - pr0n/ | curl -T- http://example.com/my_faxes.tar.gz

I've been meaning to teach curl to calculate and write Content-MD5:
trailers, too, so the data can be streamed once and checksummed
on-the-fly for the server to verify.

I find the above examples quite useful in cases where writing large
amounts of data to the local filesystem isn't possible. The extra
memory bandwidth needed on the server to decode chunks in userspace
shouldn't be much compared to filesystem I/O on the client.

--
Eric Wong
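As a rough illustration of processing a chunked upload incrementally and checksumming it on the fly, the sketch below decodes chunks from a byte stream and feeds each one to an MD5 digest as it arrives. The function names are hypothetical, trailers are not parsed, and an in-memory `io.BytesIO` would stand in for the socket; the base64 form of the digest follows the Content-MD5 header convention.

```python
import base64
import hashlib
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO) -> Iterator[bytes]:
    """Yield the payload of each chunk in an HTTP chunked-encoded body."""
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the last chunk")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            return                      # last chunk; trailers are ignored here
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk")
        if stream.read(2) != b"\r\n":
            raise ValueError("chunk is missing its trailing CRLF")
        yield data


def content_md5(stream: BinaryIO) -> str:
    """Checksum the decoded body one chunk at a time, never holding it whole."""
    digest = hashlib.md5()
    for chunk in iter_chunks(stream):
        digest.update(chunk)            # other per-chunk work could happen here
    return base64.b64encode(digest.digest()).decode("ascii")
```

The server would compare the returned value with the trailer the client sends once the body has been streamed.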
On Tue, Jul 13, 2010 at 01:56:14PM -0700, Eric Wong wrote:
> "Zed A. Shaw" <zedshaw@zedshaw.com> wrote:
> > On Tue, Jul 13, 2010 at 11:24:47AM -0700, Alexander Kern wrote:
> > > Mongrel2 will require some changes in application deployment, so why
> > > not encourage users to use streaming HTTP libraries? :)
> >
> > Because nobody would write them. HTTP chunked encoding is a bizarre
> > often abused corner of the standard, and it's horribly innefficient.
> > It's way better to just let whoever needs and want this use the basics
> > to get implement it than trying to do it myself and spend the next year
> > convincing people to do it my way.
>
> While definitely a corner case and rarely seen, chunked encoding is
> can be useful and more efficient if used in a pipeline.

For everything you said, you could replace chunked encoding with faster
and more reliable 0MQ messages, or "plain old sockets". It almost
always breaks down that if you're trying to "stream" chunks over to a
server from a client, and you use chunked encoding, then you don't grok
sockets. They *already* stream. They're sockets. That's what they do.
Stream. No chunks needed.

For example, sending a chunked encoding from a client is retarded
because it's only gotta deal with one buffer. There's no "memory
limit", it's a single buffer. You call malloc, and then read/write from
it a bunch. Why chunked encoding ever comes into play is beyond me.
That's like adding 500 bytes of overhead per message to TCP/IP just so
you can feel safer like when you wear belts and suspenders.

If however you're using chunked encoding like some ghetto RPC, then use
0MQ instead. Or RabbitMQ, or nearly anything else. Trying to do
bidirectional chunked encoding as a message protocol is just baffling.

Anyway, sorry about the rant, just every time I see someone claiming to
need client side chunked encoding I call bullshit. No offense
personally intended.

when it's all working then try it out. I'm sure it's actually not hard
to implement, just something that every person is going to want to do
totally differently.

--
Zed A. Shaw
http://zedshaw.com/
"Zed A. Shaw" <zedshaw@zedshaw.com> wrote: > On Tue, Jul 13, 2010 at 01:56:14PM -0700, Eric Wong wrote: > > "Zed A. Shaw" <zedshaw@zedshaw.com> wrote: > > > On Tue, Jul 13, 2010 at 11:24:47AM -0700, Alexander Kern wrote: > > > > Mongrel2 will require some changes in application deployment, so why > > > > not encourage users to use streaming HTTP libraries? :) > > > > > > Because nobody would write them. HTTP chunked encoding is a bizarre > > > often abused corner of the standard, and it's horribly innefficient. > > > It's way better to just let whoever needs and want this use the basics > > > to get implement it than trying to do it myself and spend the next year > > > convincing people to do it my way. > > > > While definitely a corner case and rarely seen, chunked encoding is > > can be useful and more efficient if used in a pipeline. > > For everything you said, you could replace chunked encoding with faster > and more reliable 0MQ messages, or "plain old sockets". It almost > always breaks down that if you're trying to "stream" chunks over to a > server from a client, and you use chunked encoding, then you don't grok > sockets. The *already* stream. They're sockets. That's what they do. > Stream. No chunks needed. Of course plain old sockets will always be faster than chunking. But HTTP overhead isn't that much with large bodies/chunks. HTTP is already ubiquitous and trying to get client developers to adopt/learn new stuff isn't easy. I don't do much client-side development, but the popular libcurl already supports HTTP chunking and I suspect other client libraries do, too, especially when it comes to mobile devices. > For example, sending a chunked encoding from a client is retarded > because it's only gotta deal with one buffer. There's no "memory > limit", it's a single buffer. You call malloc, and then read/write from > it a bunch. Why chunked encoding ever comes into play is beyond me. 
> That's like adding 500 bytes of overhead per message to TCP/IP just so > you can feel safer like when you wear belts and suspenders. Assuming a 4K chunk, I only count 8 bytes of overhead per chunk: "1000\r\n", payload, "\r\n" I haven't studied 0MQ, but since it can use TCP (and most likely, must use TCP when dealing with remote/mobile clients) it would have to deal with message boundaries to split them into messages, too. I'm fine with paying an extra few bytes to avoid introducing the maintenance overhead of more protocols. > If however you're using chunked encoding like some ghetto RPC, then use > 0MQ instead. Or RabbitMQ, or nearly anything else. Trying to do > bidirectional chunked encoding as a message protocol is just baffling. It's a bit weird, yes, but it was fun to try once upon a time and I could've used it to get around firewalls. > Anyway, sorry about the rant, just every time I see someone claiming to > need client side chunked encoding I call bullshit. No offense > personally intended. None taken. > when it's all working then try it out. I'm sure it's actually not hard > to implement, just something that every person is going to want to do > totally differently. Not hard for people on this mailing list, sure, but I've seen plenty of "programmers" struggle to even make HTTP GET requests with whatever libraries they're using. Introducing them to new libraries/protocols would take quite a lot of effort. -- Eric Wong
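The 8-byte figure can be checked with a few lines of Python. The helper names are made up for this sketch; the framing itself is the standard chunk layout Eric quotes (hexadecimal size, CRLF, payload, CRLF).

```python
def encode_chunk(payload: bytes) -> bytes:
    """Frame one payload as an HTTP chunk: hex size, CRLF, payload, CRLF."""
    return b"%x\r\n" % len(payload) + payload + b"\r\n"


def chunk_overhead(payload_size: int) -> int:
    """Bytes added by the framing for a payload of the given size."""
    return len(encode_chunk(bytes(payload_size))) - payload_size
```

For a 4096-byte payload the size line is `1000\r\n` (6 bytes) and the trailing CRLF adds 2, so `chunk_overhead(4096)` is 8, roughly 0.2% of the chunk. The cost grows in relative terms as chunks shrink: a 1-byte payload carries 5 bytes of framing.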
On Wed, Jul 14, 2010 at 02:05:16AM +0000, Eric Wong wrote:
> Of course plain old sockets will always be faster than chunking. But
> HTTP overhead isn't that much with large bodies/chunks. HTTP is already
> ubiquitous and trying to get client developers to adopt/learn new stuff
> isn't easy.

First off, HTTP is not ubiquitous. It's not the only major protocol on
the internet, and that is still no reason to use it for everything. If
you mean "everywhere" as in there's a library in every language, so are
sockets, and bastardizing HTTP to be some kind of lame socket protocol
just because the library is there is backwards.

Also, you realize you're advocating basing a protocol on the HTTP
libraries that are in most languages, which are total crap, and which
are then on top of TCP anyway.

> I don't do much client-side development, but the popular libcurl already
> supports HTTP chunking and I suspect other client libraries do, too,
> especially when it comes to mobile devices.

Nope, they don't, not as HTTP requests with chunked encodings in them.
Hell, the majority of them can't even get mime encoding for file uploads
right. I mean seriously man, how are they going to get chunked encoding
right?

> Assuming a 4K chunk, I only count 8 bytes of overhead per chunk:
>
> "1000\r\n", payload, "\r\n"

Alright, but what's that get you? What's its purpose again? So far
you've advocated it for:

1. Constrained RAM: Nope, every socket library there is can allow me to
   use a buffer of even 1 character in size, so that's not accurate.
2. Constrained Disk: Again, sockets allow for arbitrary sized storage
   as the buffer and do not require the entire dataset to be in RAM or
   on Disk to use them.
3. To send chunks: A tautology, you need to use "chunked" encoding so
   you can send chunks of data, on a socket which can already do that.
4. To avoid additional protocols: So rather than write a clean protocol
   for an odd purpose, you would rather stack that protocol on HTTP
   which is then on top of sockets? What's next, SSL inside SOAP inside
   HTTP inside SSL inside sockets?
5. For developer simplicity: First, it's a total myth that sockets are
   hard. Second, how is X on top of HTTP on top of sockets easier than
   just sockets? All the *exact* same errors from sockets are there,
   plus any other layers.
6. To send through port 80: First off, Mongrel2 shows that the port
   doesn't matter. The parser handles two different protocols just fine
   on the same port. Secondly, this is a security hack and solid proof
   that this use case is just that, a hack.

Pretty much none of the reasoning stands, and so far your argument, and
that of other people's, is some weird idea that HTTP is simpler for
developers, so let's do everything through HTTP.

This belief that programmers are too stupid to understand basic sockets,
but that they'll understand a protocol written on top of sockets is just
maddening.

> I'm fine with paying an extra few bytes to avoid introducing the
> maintenance overhead of more protocols.

If you are putting chunks of video inside chunked encoding, you have
just invented a new protocol inside another protocol being used in an
odd way. There is *no* way that's easier to maintain, debug, or operate
with.

The real solution is not, "Coders get HTTP, so put everything inside it
no matter what." It is actually, "Create your protocol that works best
for you, then write a good library they can just use." That's basically
what you are really trying to get with this neckbeard feature of
protocols inside HTTP. If the problem is developer usability, then the
solution is not using something familiar, but to give them a *usable*
way to access your protocol.

For example, this is why in one day someone was able to craft a C++
library for doing mongrel handlers, and someone else helped them.
Because *I* crafted a protocol that was easy to understand, and then
wrote a nice clean simple library for others to work with.

That's the real solution, not this HTTP cargo culting.

--
Zed A. Shaw
http://zedshaw.com/
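Points 1 through 3 above, that a plain socket streams with whatever buffer size you choose, can be sketched as follows. This is an illustration only: the function names are made up, a local `socket.socketpair()` stands in for a real client and server, and the receiver runs in a thread so the sender never blocks on a full socket buffer.

```python
import io
import socket
import threading
from typing import BinaryIO


def send_stream(sock: socket.socket, source: BinaryIO, bufsize: int) -> None:
    """Push a source of any length through the socket, bufsize bytes at a time."""
    while True:
        piece = source.read(bufsize)
        if not piece:
            break
        sock.sendall(piece)
    sock.shutdown(socket.SHUT_WR)       # closing the write side marks the end


def recv_stream(sock: socket.socket, sink: BinaryIO, bufsize: int = 4096) -> None:
    """Drain the socket into a sink until the peer stops sending."""
    while True:
        piece = sock.recv(bufsize)
        if not piece:
            break
        sink.write(piece)


def demo(data: bytes, bufsize: int = 16) -> bytes:
    """Round-trip data through a socket pair using a small send buffer."""
    left, right = socket.socketpair()
    sink = io.BytesIO()
    reader = threading.Thread(target=recv_stream, args=(right, sink))
    reader.start()
    try:
        send_stream(left, io.BytesIO(data), bufsize)
        reader.join()
    finally:
        left.close()
        right.close()
    return sink.getvalue()
```

Here the end of the data is signalled by shutting down the write side. A protocol that keeps the connection open afterwards would need its own length prefix or end marker, which is the message-boundary cost Eric mentions.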
To add an interesting side point. It's interesting how more and more protocols and functionality have been layered over HTTP. Firewalls used to be relatively simple in that you could block most ports except those necessary, and you'd block a huge swatch of attacks. Now those ports barely even matter. Everything is tunneled over HTTP, so we need fancy application firewalls that use DPI to figure out what the heck is being tunneled. Add SSL in to the mix, and you can see how this rationale has gotten us into a bit of a quagmire. -Tim On Jul 14, 2010, at 2:50 AM, Zed A. Shaw wrote: > On Wed, Jul 14, 2010 at 02:05:16AM +0000, Eric Wong wrote: >> Of course plain old sockets will always be faster than chunking. But >> HTTP overhead isn't that much with large bodies/chunks. HTTP is already >> ubiquitous and trying to get client developers to adopt/learn new stuff >> isn't easy. > > First off, HTTP is not ubiquitous. It's not the only major protocol on > the internet, and that is still no reason to use it for everything. If > you mean "everywhere" as in there's a library in every language, so is > sockets, and bastardizing HTTP to be some kind of lame socket protocol > just because the library is there is backwards. > > Also, you realize you're advocating basing a protocol on the HTTP > libraries that are in most languages, which are total crap, and which > are then on top of TCP anyway. > > >> I don't do much client-side development, but the popular libcurl already >> supports HTTP chunking and I suspect other client libraries do, too, >> especially when it comes to mobile devices. > > Nope, they don't, not as HTTP requests with chunked encodings in them. > Hell, the majority of them can't even get mime encoding for file uploads > right. I mean seriously man, how are they going to get chunked encoding > right? > >> Assuming a 4K chunk, I only count 8 bytes of overhead per chunk: >> >> "1000\r\n", payload, "\r\n" > > Alright, but what's that get you? 
What's it's purpose again? So far > you've advocated it for: > > 1. Constrained RAM: Nope, every socket library there is can allow me to > use a buffer of even 1 character in size, so that's not accurate. > 2. Constrained Disk: Again, sockets allow for arbitrary sized storage > as the buffer and do not require the entire dataset to be in RAM or on > Disk to use them. > 3. To send chunks: A tautology, you need to use "chunked" encoding so > you can send chunks of data, on a socket which can already do that. > 4. To avoid additional protocols: So rather than write a clean protocol > for an odd purpose, you would rather stack that protocol on HTTP which > is then on top of sockets? What's next, SSL inside SOAP inside HTTP > inside SSL inside sockets? > 5. For developer simplicity: First, it's a total myth that sockets are > hard. Second, how is X on top of HTTP on top of sockets easier than just > sockets? All the *exact* same errors from sockets are there, plus any > other layers. > 6. To send through port 80: First off, Mongrel2 shows that the port > doesn't matter. The parser handles two different protocols just fine on > the same port. Secondly, this is a security hack and solid proof that > this use case is just that, a hack. > > Pretty much none of the reasoning stands, and so far your argument, and > that of other people's, is some weird idea that HTTP is simpler for > developers, so let's do everything through HTTP. > > This belief that programmers are too stupid to understand basic sockets, > but that they'll understand a protocol written on top of sockets is just > maddening. > >> I'm fine with paying an extra few bytes to avoid introducing the >> maintenance overhead of more protocols. > > If you are putting chunks of video inside chunked encoding, you have > just invented a new protocol inside another protocol being used in an > odd way. There is *no* way that's easier to maintain, debug, or operate > with. 
> > The real solution is not, "Coders get HTTP, so put everything inside it no > matter what." It is actually, "Create your protocol that works best for > you, then write a good library they can just use." That's basically > what you are really trying to get with this neckbeard feature of > protocols inside HTTP. If the problem is developer usability, then the > solution is not using something familiar, but to give them a *usable* > way to access your protocol. > > For example, this is why in one day someone was able to craft a C++ > library for doing mongrel handlers, and someon else helped them. > Because *I* crafted a protocol that was easy to understand, and then > wrote a nice clean simple library for others to work with. > > That's the real solution, not this HTTP cargo culting. > > -- > Zed A. Shaw > http://zedshaw.com/
On Wed, Jul 14, 2010 at 09:05:07AM -0400, Timothy M Rodriguez wrote:
> To add an interesting side point. It's interesting how more and more
> protocols and functionality have been layered over HTTP. Firewalls
> used to be relatively simple in that you could block most ports except
> those necessary, and you'd block a huge swatch of attacks. Now those
> ports barely even matter. Everything is tunneled over HTTP, so we
> need fancy application firewalls that use DPI to figure out what the
> heck is being tunneled. Add SSL in to the mix, and you can see how
> this rationale has gotten us into a bit of a quagmire.

True, but I think a counter to that is that you still need a cooperating
server on the other side that understands the layering. I've found that
it's only companies who want to block traffic going out that have this
problem, or governments. Traffic coming in still has to be HTTP since
it would need a cooperating server internally to function.

--
Zed A. Shaw
http://zedshaw.com/
Good point. It doesn't matter as much on ingress. On Jul 14, 2010, at 1:39 PM, Zed A. Shaw wrote: > On Wed, Jul 14, 2010 at 09:05:07AM -0400, Timothy M Rodriguez wrote: >> To add an interesting side point. It's interesting how more and more >> protocols and functionality have been layered over HTTP. Firewalls >> used to be relatively simple in that you could block most ports except >> those necessary, and you'd block a huge swatch of attacks. Now those >> ports barely even matter. Everything is tunneled over HTTP, so we >> need fancy application firewalls that use DPI to figure out what the >> heck is being tunneled. Add SSL in to the mix, and you can see how >> this rationale has gotten us into a bit of a quagmire. > > True, but I think a counter to that is that you still need a cooperating > server on the other side that understands the layering. I've found that > it's only companies who want to block traffic going out that have this > problem, or governments. Traffic coming in still have to HTTP since it > would need a cooperating server internally to function. > > -- > Zed A. Shaw > http://zedshaw.com/
I was wondering the same thing myself. Since ZMQ messages are atomic,
you'd need to send multiple messages to do streaming (which the backend
protocol doesn't seem to support yet). You could use the raw TCP
backend though.

On Tue, Jul 13, 2010 at 8:34 AM, Alexander Kern <alex@kernul.com> wrote:
> How exactly (if at all) will Mongrel2 handle streaming data, such as
> file uploads or large PUT or POST requests? Will the entity body first
> be completely downloaded into a temporary file and the filename sent
> to the handler, or will the handler be able to incrementally read the
> data as it comes in? One of the major points at which Node.js excels
> is it's ability to efficiently handle streaming data, but Mongrel2's
> architecture seems to be focused more on synchronous applications.
>

--
Andrew Cholakian
http://www.andrewvc.com