librelist archives

Any way to stream file uploads?

From:
Michael Fogleman
Date:
2011-09-09 @ 17:01
request.files['file'] gives you a FileStorage instance, which has a
file-like object that you can read from. But that file is basically
the entire upload already in memory or in a temporary file. Is there
any way in Flask to process the stream as it is being uploaded?

Re: [flask] Any way to stream file uploads?

From:
Armin Ronacher
Date:
2011-09-09 @ 17:11
Hi,

On 9/9/11 7:01 PM, Michael Fogleman wrote:
> request.files['file'] gives you a FileStorage instance, which has a
> file-like object that you can read from. But that file is basically
> the entire upload already in memory or in a temporary file. Is there
> any way in Flask to process the stream as it is being uploaded?
Form parsing in Flask (or rather Werkzeug) works by consuming
wsgi.input via werkzeug.formparser.parse_multipart, which is invoked by
werkzeug.formparser.parse_form_data.  Hooking into that in general is
nontrivial, but it was not necessary for any task I could come up with,
since you can just wrap wsgi.input.

Werkzeug guarantees that it only ever calls the .read() method with an
explicit size on the input stream, so it is easy to wrap:


from flask import Flask, request

app = Flask(__name__)


class StreamWrapper(object):
    """Wraps wsgi.input so every chunk read from it can be inspected."""

    def __init__(self, stream):
        self._stream = stream

    def read(self, size):
        rv = self._stream.read(size)
        # do something with rv
        return rv


@app.route('/upload', methods=['GET', 'POST'])
def upload_files():
    request.environ['wsgi.input'] = \
        StreamWrapper(request.environ['wsgi.input'])
    # From this point on, accessing request.files reads via your
    # StreamWrapper.  Be careful not to access request.files at
    # any point earlier.  request.shallow can be set to True to make
    # sure this does not happen by accident.
    ...

Werkzeug calls .read() on your stream in buffer_size steps, which is
currently more or less hardcoded to 10KB.  If you would like this to be
more pluggable, please file a ticket in the Werkzeug issue tracker.
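
To make the "do something with rv" step concrete, here is a minimal
sketch of a wrapper that hashes and counts the bytes as they pass
through.  The class name HashingStreamWrapper and the attributes sha256
and bytes_read are made up for illustration, and an io.BytesIO object
stands in for wsgi.input so the snippet runs without a server:

```python
import hashlib
import io


class HashingStreamWrapper(object):
    """Hypothetical wrapper: hashes and counts bytes as they are read."""

    def __init__(self, stream):
        self._stream = stream
        self.sha256 = hashlib.sha256()
        self.bytes_read = 0

    def read(self, size=-1):
        rv = self._stream.read(size)
        # Process the chunk before handing it on to the caller.
        self.sha256.update(rv)
        self.bytes_read += len(rv)
        return rv


# Stand-in for the request body; a real upload would come from wsgi.input.
payload = b"x" * 25000
wrapped = HashingStreamWrapper(io.BytesIO(payload))

# Read in 10KB steps, mimicking the buffer size mentioned above.
chunks = []
while True:
    chunk = wrapped.read(10 * 1024)
    if not chunk:
        break
    chunks.append(chunk)
```

Once the stream has been read to the end, wrapped.sha256.hexdigest() and
wrapped.bytes_read describe the whole body, without the wrapper itself
ever holding more than one chunk.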


Regards,
Armin

Re: [flask] Any way to stream file uploads?

From:
Michael Fogleman
Date:
2011-09-09 @ 17:29
Cool, but it looks like it's trying to use readline:

  File "C:\Python26\lib\site-packages\werkzeug-0.6.2-py2.6.egg\werkzeug\formparser.py", line 208, in parse_multipart
    file = LimitedStream(file, content_length)
  File "C:\Python26\lib\site-packages\werkzeug-0.6.2-py2.6.egg\werkzeug\wsgi.py", line 662, in __init__
    self._readline = stream.readline
AttributeError: 'StreamWrapper' object has no attribute 'readline'
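
Going by the traceback, LimitedStream in this Werkzeug version (0.6.2)
looks up stream.readline when it is constructed, so a wrapper probably
has to expose that method as well.  Below is a minimal sketch of one way
to do it, assuming it is enough to forward readline and to fall back to
the underlying stream for any other attribute.  The _observe helper and
the seen counter are hypothetical names, and io.BytesIO stands in for
wsgi.input:

```python
import io


class StreamWrapper(object):
    """Wraps a stream and observes data returned by read() and readline()."""

    def __init__(self, stream):
        self._stream = stream
        self.seen = 0  # hypothetical counter: total bytes observed so far

    def _observe(self, data):
        # do something with data; here we only count it
        self.seen += len(data)
        return data

    def read(self, size=-1):
        return self._observe(self._stream.read(size))

    def readline(self, size=-1):
        return self._observe(self._stream.readline(size))

    def __getattr__(self, name):
        # Only invoked when normal lookup fails: delegate anything else
        # to the wrapped stream.
        if name == '_stream':
            raise AttributeError(name)
        return getattr(self._stream, name)


wrapped = StreamWrapper(io.BytesIO(b"first line\r\nrest of body"))
line = wrapped.readline()
rest = wrapped.read(1024)
```

Whether the size argument to readline is honoured depends on the
underlying wsgi.input object, so this may need adjusting for a
particular server.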
