librelist archives

« back to archive

Possible blueprint setup race condition

Possible blueprint setup race condition

From:
Clemens Kolbitsch
Date:
2014-04-01 @ 02:23
Hi guys,

I've been struggling with a weird situation (I have to assume a bug) and
cannot find any references to this on the Internet. If you could chime in
with some advice, that'd be wonderful :)

First off, please don't hate me - we are using python-flask=0.8-1, because
our servers run on Ubuntu LTS-12.04.01 and that's just the version of flask
comes with it :-/

Here is the situation: We host a web-service through nginx, serving
requests to uwsgi-plugin-python (version 1.0.3+dfsg-1ubuntu0.1) to our nice
Python (2.7.3) flask app. Since we had garbage-collection issues with many
small objects on the heap in the past, we decided to respawn the uwsgi
service every few k requests and use multiple processes to avoid a
down-time in the respawn phase (and every process has multiple threads of
course).

Now, it seems like each time the Python process respawns, that flask sets
up our blueprints and then starts accepting requests, but *sometimes*,
responds with a 404 to clients early during this phase. This happens very
rarely, but since there is considerable load on the server, it actually
starts adding up so that we need to something about it.

I have spent quite some time debugging the blueprints, and my findings - so
far - show that at app creation time, only one thread is running in Python
*after* the blueprints have been registered. However, when I add code to
log details about the blueprints / URL-rules when the 404 happens, then the
blueprints are not all/fully loaded yet.

It seems to me like new threads are created and start accepting connections
before all blueprints/rules have been fully registered. Here is some code
to show what I mean:

def create_app(...):
    app = flask.Flask(__name__)
    [...]

    @app.after_request
    def after_request(response):
        if response.status_code == 404:
            # DBG2
            print 'dbg', response.status_code, request.base_url
            print threading.active_count(), "active threads"
            print app.url_map.__dict__.get('foo/mytesturl')
        return response

    app.register_blueprint(mymodel, url_prefix="/foo")
    # DBG1
    print 'done loading', app.url_map.__dict__
    print threading.active_count(), "active threads"
    return app

While the DBG1 code shows everything correctly, the DBG2 code shows a
sub-set of my URL-handlers (which is why the 404 gets executed).

BTW, at DBG1 point, the logs usually show "1 active threads", while at DBG2
many more threads are loaded, making me assume there is some sort of
lazy/on-demand loading of the rules. Is this possible?

Again, the code works, it's just that there is this race where I get 404s.

Is this a known bug in python-flask=0.8-1? If so, it'd be worth
investigating on a newer version, or is there a specific way to tell flask
to fully load all blueprints/rules before accepting connections?

Sorry for the overly long email, I wanted to get all the details right :)

Thanks!
-Clemens


-- 
Clemens Kolbitsch
Security Researcher
kolbitsch@lastline.com
Mobile +1 (206) 356-7745
Land +1 (805) 456-7076

Lastline, Inc.
6950 Hollister Avenue, Suite 101
Goleta, CA 93117

www.lastline.com

Re: [flask] Possible blueprint setup race condition

From:
Shawn Milochik
Date:
2014-04-01 @ 02:32
How about running gunicorn behind nginx (I use this setup). You can have
multiple workers, and use a command line flag to have individual workers
restart after a given number of requests if desired. That's also a command
line flag.

Re: [flask] Possible blueprint setup race condition

From:
Shawn Milochik
Date:
2014-04-01 @ 02:29
Use virtualenv and install the current version of Flask and see if that
fixes it.

Nobody's going to hate you, but you aren't trapped with the version
provided by apt-get.

http://milocast.com/virtualenv.html
On Mar 31, 2014 10:26 PM, "Clemens Kolbitsch" <kolbitsch@lastline.com>
wrote:

> Hi guys,
>
> I've been struggling with a weird situation (I have to assume a bug) and
> cannot find any references to this on the Internet. If you could chime in
> with some advice, that'd be wonderful :)
>
> First off, please don't hate me - we are using python-flask=0.8-1, because
> our servers run on Ubuntu LTS-12.04.01 and that's just the version of flask
> comes with it :-/
>
> Here is the situation: We host a web-service through nginx, serving
> requests to uwsgi-plugin-python (version 1.0.3+dfsg-1ubuntu0.1) to our nice
> Python (2.7.3) flask app. Since we had garbage-collection issues with many
> small objects on the heap in the past, we decided to respawn the uwsgi
> service every few k requests and use multiple processes to avoid a
> down-time in the respawn phase (and every process has multiple threads of
> course).
>
> Now, it seems like each time the Python process respawns, that flask sets
> up our blueprints and then starts accepting requests, but *sometimes*,
> responds with a 404 to clients early during this phase. This happens very
> rarely, but since there is considerable load on the server, it actually
> starts adding up so that we need to something about it.
>
> I have spent quite some time debugging the blueprints, and my findings -
> so far - show that at app creation time, only one thread is running in
> Python *after* the blueprints have been registered. However, when I add
> code to log details about the blueprints / URL-rules when the 404 happens,
> then the blueprints are not all/fully loaded yet.
>
> It seems to me like new threads are created and start accepting
> connections before all blueprints/rules have been fully registered. Here is
> some code to show what I mean:
>
> def create_app(...):
>     app = flask.Flask(__name__)
>     [...]
>
>     @app.after_request
>     def after_request(response):
>         if response.status_code == 404:
>             # DBG2
>             print 'dbg', response.status_code, request.base_url
>             print threading.active_count(), "active threads"
>             print app.url_map.__dict__.get('foo/mytesturl')
>         return response
>
>     app.register_blueprint(mymodel, url_prefix="/foo")
>     # DBG1
>     print 'done loading', app.url_map.__dict__
>     print threading.active_count(), "active threads"
>     return app
>
> While the DBG1 code shows everything correctly, the DBG2 code shows a
> sub-set of my URL-handlers (which is why the 404 gets executed).
>
> BTW, at DBG1 point, the logs usually show "1 active threads", while at
> DBG2 many more threads are loaded, making me assume there is some sort of
> lazy/on-demand loading of the rules. Is this possible?
>
> Again, the code works, it's just that there is this race where I get 404s.
>
> Is this a known bug in python-flask=0.8-1? If so, it'd be worth
> investigating on a newer version, or is there a specific way to tell flask
> to fully load all blueprints/rules before accepting connections?
>
> Sorry for the overly long email, I wanted to get all the details right :)
>
> Thanks!
> -Clemens
>
>
> --
> Clemens Kolbitsch
> Security Researcher
> kolbitsch@lastline.com
> Mobile +1 (206) 356-7745
> Land +1 (805) 456-7076
>
> Lastline, Inc.
> 6950 Hollister Avenue, Suite 101
> Goleta, CA 93117
>
> www.lastline.com
>

Re: [flask] Possible blueprint setup race condition

From:
Clemens Kolbitsch
Date:
2014-04-01 @ 02:40
Hi Shawn,

Thanks for picking up this thread :) I guess I should have tried this right
away. I upgraded flask to 0.10.1-2 and the exactly same problem re-occurs

We use uwsgi because we deploy stuff through puppet and have various other
things around uwsgi that would make it a lot of work to switch. Further,
I'm not sure this is due to uwsgi - we need multi-threading (to optimize to
DB connection-pools) and this really seems like a flask multi-thread issue

Any other ideas?

Thanks!
-Clemens


On Mon, Mar 31, 2014 at 7:29 PM, Shawn Milochik <shawn.milo@gmail.com>wrote:

> Use virtualenv and install the current version of Flask and see if that
> fixes it.
>
> Nobody's going to hate you, but you aren't trapped with the version
> provided by apt-get.
>
> http://milocast.com/virtualenv.html
> On Mar 31, 2014 10:26 PM, "Clemens Kolbitsch" <kolbitsch@lastline.com>
> wrote:
>
>> Hi guys,
>>
>> I've been struggling with a weird situation (I have to assume a bug) and
>> cannot find any references to this on the Internet. If you could chime in
>> with some advice, that'd be wonderful :)
>>
>> First off, please don't hate me - we are using python-flask=0.8-1,
>> because our servers run on Ubuntu LTS-12.04.01 and that's just the version
>> of flask comes with it :-/
>>
>> Here is the situation: We host a web-service through nginx, serving
>> requests to uwsgi-plugin-python (version 1.0.3+dfsg-1ubuntu0.1) to our nice
>> Python (2.7.3) flask app. Since we had garbage-collection issues with many
>> small objects on the heap in the past, we decided to respawn the uwsgi
>> service every few k requests and use multiple processes to avoid a
>> down-time in the respawn phase (and every process has multiple threads of
>> course).
>>
>> Now, it seems like each time the Python process respawns, that flask sets
>> up our blueprints and then starts accepting requests, but *sometimes*,
>> responds with a 404 to clients early during this phase. This happens very
>> rarely, but since there is considerable load on the server, it actually
>> starts adding up so that we need to something about it.
>>
>> I have spent quite some time debugging the blueprints, and my findings -
>> so far - show that at app creation time, only one thread is running in
>> Python *after* the blueprints have been registered. However, when I add
>> code to log details about the blueprints / URL-rules when the 404 happens,
>> then the blueprints are not all/fully loaded yet.
>>
>> It seems to me like new threads are created and start accepting
>> connections before all blueprints/rules have been fully registered. Here is
>> some code to show what I mean:
>>
>> def create_app(...):
>>     app = flask.Flask(__name__)
>>     [...]
>>
>>     @app.after_request
>>     def after_request(response):
>>         if response.status_code == 404:
>>             # DBG2
>>             print 'dbg', response.status_code, request.base_url
>>             print threading.active_count(), "active threads"
>>             print app.url_map.__dict__.get('foo/mytesturl')
>>         return response
>>
>>     app.register_blueprint(mymodel, url_prefix="/foo")
>>     # DBG1
>>     print 'done loading', app.url_map.__dict__
>>     print threading.active_count(), "active threads"
>>     return app
>>
>> While the DBG1 code shows everything correctly, the DBG2 code shows a
>> sub-set of my URL-handlers (which is why the 404 gets executed).
>>
>> BTW, at DBG1 point, the logs usually show "1 active threads", while at
>> DBG2 many more threads are loaded, making me assume there is some sort of
>> lazy/on-demand loading of the rules. Is this possible?
>>
>> Again, the code works, it's just that there is this race where I get 404s.
>>
>> Is this a known bug in python-flask=0.8-1? If so, it'd be worth
>> investigating on a newer version, or is there a specific way to tell flask
>> to fully load all blueprints/rules before accepting connections?
>>
>> Sorry for the overly long email, I wanted to get all the details right :)
>>
>> Thanks!
>> -Clemens
>>
>>
>> --
>> Clemens Kolbitsch
>> Security Researcher
>> kolbitsch@lastline.com
>> Mobile +1 (206) 356-7745
>> Land +1 (805) 456-7076
>>
>> Lastline, Inc.
>> 6950 Hollister Avenue, Suite 101
>> Goleta, CA 93117
>>
>> www.lastline.com
>>
>


-- 
Clemens Kolbitsch
Security Researcher
kolbitsch@lastline.com
Mobile +1 (206) 356-7745
Land +1 (805) 456-7076

Lastline, Inc.
6950 Hollister Avenue, Suite 101
Goleta, CA 93117

www.lastline.com

Re: [flask] Possible blueprint setup race condition

From:
Shawn Milochik
Date:
2014-04-01 @ 02:49
Clemens,

Sorry, I don't have any insight into your specific issue. I was really
hoping the upgrade would give you more information.  I don't expect that
switching to gunicorn would fix it, although it might mitigate it because
you'd have separate workers, any of which could be handling active requests
while the others were restarting.

Hopefully someone else on the list can help. Otherwise, I'd continue down
the path you're on -- use Python's fantastic introspection to home in on
the problem via logging until you can either fix it or file a specific bug
report.

Shawn

Re: [flask] Possible blueprint setup race condition

From:
Clemens Kolbitsch
Date:
2014-04-01 @ 03:01
ok, thanks for your advice and for confirming that I'm not doing anything
obviously stupid :)


On Mon, Mar 31, 2014 at 7:49 PM, Shawn Milochik <shawn.milo@gmail.com>wrote:

> Clemens,
>
> Sorry, I don't have any insight into your specific issue. I was really
> hoping the upgrade would give you more information.  I don't expect that
> switching to gunicorn would fix it, although it might mitigate it because
> you'd have separate workers, any of which could be handling active requests
> while the others were restarting.
>
> Hopefully someone else on the list can help. Otherwise, I'd continue down
> the path you're on -- use Python's fantastic introspection to home in on
> the problem via logging until you can either fix it or file a specific bug
> report.
>
> Shawn
>



-- 
Clemens Kolbitsch
Security Researcher
kolbitsch@lastline.com
Mobile +1 (206) 356-7745
Land +1 (805) 456-7076

Lastline, Inc.
6950 Hollister Avenue, Suite 101
Goleta, CA 93117

www.lastline.com

Re: [flask] Possible blueprint setup race condition

From:
Emanuil Tolev
Date:
2014-04-01 @ 08:31
This is a strange bug somewhere (no idea whether in Flask or rest of
stack). We use blueprints a lot too, have 7-8 loading at init in several of
our apps. Never experienced routes missing. We do use gunicorn to deploy
though, in the default pre-fork sync worker setup, which I believe spawns
new processes, not threads. We do also kill workers every X requests on 1
of the apps.

Have you tried x-posting to the uwsgi mailing list and/or
uswsgi-plugin-python issue tracker too? Lots of non-trivial Flask apps use
blueprints and you've tested on a recent version. At least some of the
uwsgi user base has got to be on Flask too.

I wish I could help you more but "new threads are created and start
accepting connections before all blueprints/rules have been fully
registered" to me just sounds like the problem could come from any part of
the stack.

Greetings,
Emanuil


On 1 April 2014 04:01, Clemens Kolbitsch <kolbitsch@lastline.com> wrote:

> ok, thanks for your advice and for confirming that I'm not doing anything
> obviously stupid :)
>
>
> On Mon, Mar 31, 2014 at 7:49 PM, Shawn Milochik <shawn.milo@gmail.com>wrote:
>
>> Clemens,
>>
>> Sorry, I don't have any insight into your specific issue. I was really
>> hoping the upgrade would give you more information.  I don't expect that
>> switching to gunicorn would fix it, although it might mitigate it because
>> you'd have separate workers, any of which could be handling active requests
>> while the others were restarting.
>>
>> Hopefully someone else on the list can help. Otherwise, I'd continue down
>> the path you're on -- use Python's fantastic introspection to home in on
>> the problem via logging until you can either fix it or file a specific bug
>> report.
>>
>> Shawn
>>
>
>
>
> --
> Clemens Kolbitsch
> Security Researcher
> kolbitsch@lastline.com
> Mobile +1 (206) 356-7745
> Land +1 (805) 456-7076
>
> Lastline, Inc.
> 6950 Hollister Avenue, Suite 101
> Goleta, CA 93117
>
> www.lastline.com
>

unsubscribe

From:
Date:
2014-04-01 @ 03:04