librelist archives

« back to archive

Mongrel2 not serving files bigger than 2GB

Mongrel2 not serving files bigger than 2GB

From:
Dalton Barreto
Date:
2011-05-27 @ 02:09
Hello,

Just a few days ago I needed to serve a 3.5GB file out of mongrel2,
and was surprised to receive an empty response.

In the logs I see:

[ERROR] (src/dir.c:94: errno: Value too large for defined data type)
File stat failed: <here is the file path>

So I googled about this and found out an option we can pass to gcc
when compiling code that will handler 2GB+ files:
-D_FILE_OFFSET_BITS=64

I created a branch[1] with this correction, but since I don't have
access to *BSDs, OSX and other platforms, it would be great if anyone
tried to compile and run the tests just to check thats all ok.

Please review the patch and let me know if this -D_FILE_OFFSET_BITS=64
option is ok on all platforms.

The patch is here:

https://github.com/zedshaw/mongrel2/commit/e3b526e523edb8f3cb98a3f3e149ebf2e406101b

Thanks.

[1] https://github.com/zedshaw/mongrel2/tree/staticfiles-bigger-2gb

-- 
Dalton Barreto
http://daltonmatos.wordpress.com
http://wsgid.com

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Zed A. Shaw
Date:
2011-05-27 @ 04:56
On Thu, May 26, 2011 at 11:09:04PM -0300, Dalton Barreto wrote:
> Hello,
> 
> Just a few days ago I needed to serve a 3.5GB file out of mongrel2,
> and was surprised to receive an empty response.
> 
> In the logs I see:
> 
> [ERROR] (src/dir.c:94: errno: Value too large for defined data type)
> File stat failed: <here is the file path>

Jeez, well at least our practice of checking every return value
protected against it.

The use of stat to find file sizes has hit us in another instance so I
wonder if we can just ditch it and do something else.

Anyway, I'll grab your stuff and apply it if someone hasn't yet.

-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Alex Gartrell
Date:
2011-05-27 @ 03:22
I think we will need to do more to make this work correctly.  Specifically,
all of the steam_file functions take int len's instead of off_t/size_t
len's, so we're going to overflow with files greater than 4 gbs anyway.  If
you wanna go throught and change all this stuff you can, or I can do it;
just let me know.

Alex

On Thu, May 26, 2011 at 10:09 PM, Dalton Barreto <daltonmatos@gmail.com>wrote:

> Hello,
>
> Just a few days ago I needed to serve a 3.5GB file out of mongrel2,
> and was surprised to receive an empty response.
>
> In the logs I see:
>
> [ERROR] (src/dir.c:94: errno: Value too large for defined data type)
> File stat failed: <here is the file path>
>
> So I googled about this and found out an option we can pass to gcc
> when compiling code that will handler 2GB+ files:
> -D_FILE_OFFSET_BITS=64
>
> I created a branch[1] with this correction, but since I don't have
> access to *BSDs, OSX and other platforms, it would be great if anyone
> tried to compile and run the tests just to check thats all ok.
>
> Please review the patch and let me know if this -D_FILE_OFFSET_BITS=64
> option is ok on all platforms.
>
> The patch is here:
>
> 
https://github.com/zedshaw/mongrel2/commit/e3b526e523edb8f3cb98a3f3e149ebf2e406101b
>
> Thanks.
>
> [1] https://github.com/zedshaw/mongrel2/tree/staticfiles-bigger-2gb
>
> --
> Dalton Barreto
> http://daltonmatos.wordpress.com
> http://wsgid.com
>

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Dalton Barreto
Date:
2011-05-27 @ 23:14
2011/5/27 Alex Gartrell <agartrell@cmu.edu>:
> I think we will need to do more to make this work correctly.  Specifically,
> all of the steam_file functions take int len's instead of off_t/size_t
> len's, so we're going to overflow with files greater than 4 gbs anyway.  If
> you wanna go throught and change all this stuff you can, or I can do it;
> just let me know.
> Alex

Great Alex, I just didn't see this. If you want, push this fixes
directly to the branch I already created. Even not merging all this
for the current release, at least we will already have a something
good when it''s time for the merge.

Thanks for you help.

-- 
Dalton Barreto
http://daltonmatos.wordpress.com
http://wsgid.com

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Dalton Barreto
Date:
2011-05-28 @ 17:09
2011/5/27 Dalton Barreto <daltonmatos@gmail.com>:
> 2011/5/27 Alex Gartrell <agartrell@cmu.edu>:
>> I think we will need to do more to make this work correctly.  Specifically,
>> all of the steam_file functions take int len's instead of off_t/size_t
>> len's, so we're going to overflow with files greater than 4 gbs anyway.  If
>> you wanna go throught and change all this stuff you can, or I can do it;
>> just let me know.
>> Alex
>
> Great Alex, I just didn't see this. If you want, push this fixes
> directly to the branch I already created. Even not merging all this
> for the current release, at least we will already have a something
> good when it''s time for the merge.
>

Hey Alex,

Take a look at the conversation at issue 5 [1]. I will try to fix this
problem without changing the len parameter  of IOBuf_stream_*
functions, as Zed suggested.

If you are willing to help, let me know.

Thanks.
-- 
Dalton Barreto
http://daltonmatos.wordpress.com
http://wsgid.com

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Dalton Barreto
Date:
2011-05-28 @ 21:03
2011/5/28 Dalton Barreto <daltonmatos@gmail.com>:
>
> Hey Alex,
>
> Take a look at the conversation at issue 5 [1]. I will try to fix this
> problem without changing the len parameter  of IOBuf_stream_*
> functions, as Zed suggested.
>

Actually, now looking at the code I realized that we still need to
change the len parameter. =)
At least from IOBuf_stream_file() and IOBuf->stream_file callback.



-- 
Dalton Barreto
http://daltonmatos.wordpress.com
http://wsgid.com

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Dalton Barreto
Date:
2011-05-28 @ 21:14
2011/5/28 Dalton Barreto <daltonmatos@gmail.com>:
> 2011/5/28 Dalton Barreto <daltonmatos@gmail.com>:
>>
>> Hey Alex,
>>
>> Take a look at the conversation at issue 5 [1]. I will try to fix this
>> problem without changing the len parameter  of IOBuf_stream_*
>> functions, as Zed suggested.
>>
>
> Actually, now looking at the code I realized that we still need to
> change the len parameter. =)
> At least from IOBuf_stream_file() and IOBuf->stream_file callback.
>

I've pushed some code to the branch. What's missing is the main logic
to send chunks of 2GB.
I'll take a look at this later.

Thanks.

-- 
Dalton Barreto
http://daltonmatos.wordpress.com
http://wsgid.com

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Dalton Barreto
Date:
2011-05-29 @ 23:35
2011/5/28 Dalton Barreto <daltonmatos@gmail.com>:
> 2011/5/28 Dalton Barreto <daltonmatos@gmail.com>:
>> 2011/5/28 Dalton Barreto <daltonmatos@gmail.com>:
>>>
>>> Hey Alex,
>>>
>>> Take a look at the conversation at issue 5 [1]. I will try to fix this
>>> problem without changing the len parameter  of IOBuf_stream_*
>>> functions, as Zed suggested.
>>>
>>
>> Actually, now looking at the code I realized that we still need to
>> change the len parameter. =)
>> At least from IOBuf_stream_file() and IOBuf->stream_file callback.
>>
>
> I've pushed some code to the branch. What's missing is the main logic
> to send chunks of 2GB.
> I'll take a look at this later.
>

Alright, so we have now a functioning version of chunked responses. I
had to disable the timeout logic because Registration struct has a
"uint32_t bytes_written", that was obviously overflowing.

To skip the overflow problems I changed from int to long long int, but
I don't know if this is the best to do. Any suggestions?

Other than that, the implementation works pretty well. Even curl
reporting at the end:

curl: (18) transfer closed with outstanding read data remaining

the MD5 sums of the two files match.

Please review the patches, make any suggestions you have and I will
make all modifications needed before merging into master branch.

Thanks!


-- 
Dalton Barreto
http://daltonmatos.wordpress.com
http://wsgid.com

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Alex Gartrell
Date:
2011-05-30 @ 14:53
This breaks on OS X (and I'd assume the BSD's, but I don't have them up and
running at the moment).

src/dir.c: In function ‘Dir_find_file’:
src/dir.c:114: error: ‘O_LARGEFILE’ undeclared (first use in this function)
src/dir.c:114: error: (Each undeclared identifier is reported only once
src/dir.c:114: error: for each function it appears in.)

I was also able to break it with a 3.7 gb file of all zeros on OS X

~/sandbox/bigfiles/files $ curl localhost:6767/bigfile > bigfile2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time
 Current
                                 Dload  Upload   Total   Spent    Left
 Speed
 26 3814M   26 1024M    0     0  45.5M      0  0:01:23  0:00:22  0:01:01
41.1M
curl: (56) Received problem 3 in the chunky parser

(it sent the first chunk and then croaked)

[ERROR] (src/bsd_specific.c:64: errno: Socket is not connected) OS X
sendfile wrapper failed
[ERROR] (src/dir.c:185: errno: None) Failed sending chunk content:
IOBuf_stream_file returned -1

Sorry I don't have  more time to look at it right now, I'll try to take a
look later on.

Alex



On Sun, May 29, 2011 at 7:35 PM, Dalton Barreto <daltonmatos@gmail.com>wrote:

> 2011/5/28 Dalton Barreto <daltonmatos@gmail.com>:
> > 2011/5/28 Dalton Barreto <daltonmatos@gmail.com>:
> >> 2011/5/28 Dalton Barreto <daltonmatos@gmail.com>:
> >>>
> >>> Hey Alex,
> >>>
> >>> Take a look at the conversation at issue 5 [1]. I will try to fix this
> >>> problem without changing the len parameter  of IOBuf_stream_*
> >>> functions, as Zed suggested.
> >>>
> >>
> >> Actually, now looking at the code I realized that we still need to
> >> change the len parameter. =)
> >> At least from IOBuf_stream_file() and IOBuf->stream_file callback.
> >>
> >
> > I've pushed some code to the branch. What's missing is the main logic
> > to send chunks of 2GB.
> > I'll take a look at this later.
> >
>
> Alright, so we have now a functioning version of chunked responses. I
> had to disable the timeout logic because Registration struct has a
> "uint32_t bytes_written", that was obviously overflowing.
>
> To skip the overflow problems I changed from int to long long int, but
> I don't know if this is the best to do. Any suggestions?
>
> Other than that, the implementation works pretty well. Even curl
> reporting at the end:
>
> curl: (18) transfer closed with outstanding read data remaining
>
> the MD5 sums of the two files match.
>
> Please review the patches, make any suggestions you have and I will
> make all modifications needed before merging into master branch.
>
> Thanks!
>
>
> --
> Dalton Barreto
> http://daltonmatos.wordpress.com
> http://wsgid.com
>

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Dalton Barreto
Date:
2011-05-31 @ 01:19
2011/5/30 Alex Gartrell <agartrell@cmu.edu>:
> This breaks on OS X (and I'd assume the BSD's, but I don't have them up and
> running at the moment).
> src/dir.c: In function ‘Dir_find_file’:
> src/dir.c:114: error: ‘O_LARGEFILE’ undeclared (first use in this function)
> src/dir.c:114: error: (Each undeclared identifier is reported only once
> src/dir.c:114: error: for each function it appears in.)

Indeed. Actually I just compiled without O_LARGEFILE and it worked. I
just kept -D_FILE_OFFSET_BITS=64.

> I was also able to break it with a 3.7 gb file of all zeros on OS X
> ~/sandbox/bigfiles/files $ curl localhost:6767/bigfile > bigfile2
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time
>  Current
>                                  Dload  Upload   Total   Spent    Left
>  Speed
>  26 3814M   26 1024M    0     0  45.5M      0  0:01:23  0:00:22  0:01:01
> 41.1M
> curl: (56) Received problem 3 in the chunky parser
> (it sent the first chunk and then croaked)
> [ERROR] (src/bsd_specific.c:64: errno: Socket is not connected) OS X
> sendfile wrapper failed
> [ERROR] (src/dir.c:185: errno: None) Failed sending chunk content:
> IOBuf_stream_file returned -1

This is strange. I tested here with a 3GB file and it streamed just
fine. Maybe it's something to do with BSD's sendfile()?

> Sorry I don't have  more time to look at it right now, I'll try to take a
> look later on.
> Alex
>

No problem Alex. Thanks for your time looking at this. I'll keep
working. If you have any suggestions just let me know.

Thanks a lot.

-- 
Dalton Barreto
http://daltonmatos.wordpress.com
http://wsgid.com

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Dalton Barreto
Date:
2011-05-31 @ 22:40
> 2011/5/30 Alex Gartrell <agartrell@cmu.edu>:
> I was also able to break it with a 3.7 gb file of all zeros on OS X
> ~/sandbox/bigfiles/files $ curl localhost:6767/bigfile > bigfile2
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time
>  Current
>                                  Dload  Upload   Total   Spent    Left
>  Speed
>  26 3814M   26 1024M    0     0  45.5M      0  0:01:23  0:00:22  0:01:01
> 41.1M
> curl: (56) Received problem 3 in the chunky parser
> (it sent the first chunk and then croaked)

Hey Zed and Alex,

I've being thinking about this implementation and I may be (probably
am) in the wrong path. The HTTP chunked response is better suited when
we *do not* know the content length before hand, and this is *not* the
case here as we are serving a static file with a known size.

So, what do you think about removing this (currently broken) HTTP
chunked response logic and just write the chunks directly to the
connection socket? In fact, I think that this implementation is more
correct than what I'm doing right now.

Tell me what you think and I will change the code, it will be a much
simpler implementation and it will be easier to merge back into the
master branch.

Thanks!

-- 
Dalton Barreto
http://daltonmatos.wordpress.com
http://wsgid.com

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Zed A. Shaw
Date:
2011-05-27 @ 04:57
On Thu, May 26, 2011 at 11:22:14PM -0400, Alex Gartrell wrote:
> I think we will need to do more to make this work correctly.  Specifically,
> all of the steam_file functions take int len's instead of off_t/size_t
> len's, so we're going to overflow with files greater than 4 gbs anyway.  If
> you wanna go throught and change all this stuff you can, or I can do it;
> just let me know.

Oh yeah there's going to be a bunch of those looking at this now.  Let's
put this in the list of ticket and then hit it on the next release
instead of now.  I want to spend more time testing it and doing it
right.

-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Dalton Barreto
Date:
2011-05-27 @ 23:15
2011/5/27 Zed A. Shaw <zedshaw@zedshaw.com>:
> On Thu, May 26, 2011 at 11:22:14PM -0400, Alex Gartrell wrote:
>> I think we will need to do more to make this work correctly.  Specifically,
>> all of the steam_file functions take int len's instead of off_t/size_t
>> len's, so we're going to overflow with files greater than 4 gbs anyway.  If
>> you wanna go throught and change all this stuff you can, or I can do it;
>> just let me know.
>
> Oh yeah there's going to be a bunch of those looking at this now.  Let's
> put this in the list of ticket and then hit it on the next release
> instead of now.  I want to spend more time testing it and doing it
> right.

Alright, issue created: https://github.com/zedshaw/mongrel2/issues/5

Thanks.

-- 
Dalton Barreto
http://daltonmatos.wordpress.com
http://wsgid.com

Re: [mongrel2] Mongrel2 not serving files bigger than 2GB

From:
Zed A. Shaw
Date:
2011-05-27 @ 23:58
On Fri, May 27, 2011 at 08:15:29PM -0300, Dalton Barreto wrote:
> Alright, issue created: https://github.com/zedshaw/mongrel2/issues/5

I absolutely hate github's issues system. How in the hell did they screw
these up so bad.  It's like they don't actually code for a living.

For example, I just now replied to that ticket because the emails they
send out look exactly like librelist emails, so I thought it was a
message you sent to the mailing list.  They couldn't be bothered to do
[github/mongrel2] instead?  Oh no, that's right, because they're all
that matters.

Then I'm looking for the close button, and I see this massive green OPEN
at the top right, a ton of blank space, and nothing.  Instead I gotta
scroll through the whole list of comments and then ohhhh there it is,
a gray button inside a gray panel with only a curvy button bevel to
visually differentiate it from all the other gray shit on their page.

WTF.

-- 
Zed A. Shaw
http://zedshaw.com/