librelist archives

Flask Cache Busting Static Content (incl. Images)

From:
Fotis Gimian
Date:
2014-05-08 @ 13:26
Hey there guys, hope you're doing well.

I was wondering if a current mechanism exists to perform cache busting of
static files such as images, media files and CSS / JS?

My requirements for cache busting are:

   - It should use an MD5 checksum of the file contents to create a unique
   link.
   - The methodology used should not be a GET parameter as some caches
   don't honor this.
   - It would cover all types of static content (including binary files
   like images).

So far I'm aware of these solutions:

*Flask-Assets / Webassets*: Although this is an excellent solution, it does
not cover images or other binary data.  See
https://github.com/miracle2k/webassets/issues/117 for more information.

*Snippet by Armin*: As per Armin's excellent presentation
https://speakerdeck.com/mitsuhiko/advanced-flask-patterns-1, the following
code snippet is suggested.

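from hashlib import md5

import pkg_resources
from flask import Flask

app = Flask(__name__)

# Hash the installed package version; 'Package-Name' is a placeholder.
# md5() needs bytes on Python 3, hence the encode().
ASSET_REVISION = md5(str(pkg_resources.get_distribution(
    'Package-Name').version).encode('utf-8')).hexdigest()[:14]


@app.url_defaults
def static_cache_buster(endpoint, values):
    # Append ?_v=<revision> to every URL built for the static endpoint.
    if endpoint == 'static':
        values['_v'] = ASSET_REVISION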

This leads us in the right direction, but it does not use the MD5 checksum
of the file contents, and it uses a GET parameter instead of a new
filename.  Updating this solution to use an MD5 checksum would be very
inefficient, since the function runs every time a URL for a static file is
generated.

For production, Rails seems to allow you to run a task which calculates the
MD5 checksum of all assets and copies new versions of the files to the
public / static directory with the appropriate filename (e.g.
app/assets/images/cat.jpg becomes
public/assets/cat-5b381365ec9d97fecec03df560d6b1bc.jpg).  A series of
mappings is then stored in a manifest file (JSON in the example below),
which is read when the app starts up.

e.g.

{
    "files": {
        "cat-5b381365ec9d97fecec03df560d6b1bc.jpg": {
            "logical_path":"cat.jpg",
            "mtime":"2013-10-05T04:18:39+10:00",
            "size":1718186,
            "digest":"5b381365ec9d97fecec03df560d6b1bc"
        },
        "application-10faafa06109fa14582542ac1852f5c5.js": {
            "logical_path":"application.js",
            "mtime":"2014-05-08T22:52:46+10:00",
            "size":112729,
            "digest":"10faafa06109fa14582542ac1852f5c5"
        },
        "application-cf0b4d12cded06d61176668723302161.css": {
            "logical_path":"application.css",
            "mtime":"2014-05-08T22:49:41+10:00",
            "size":811,
            "digest":"cf0b4d12cded06d61176668723302161"
        }
    },
    "assets": {
        "cat.jpg": "cat-5b381365ec9d97fecec03df560d6b1bc.jpg",
        "application.js":"application-10faafa06109fa14582542ac1852f5c5.js",
        "application.css":"application-cf0b4d12cded06d61176668723302161.css"
    }
}
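
A manifest of this shape could be produced in Python along the following
lines.  This is only a sketch under stated assumptions: build_manifest and
the manifest.json filename are hypothetical, the source and public
directories are assumed to be separate, and the mtime is written in UTC
rather than local time.

```python
import hashlib
import json
import os
import shutil
from datetime import datetime, timezone


def build_manifest(source_dir, public_dir, manifest_name='manifest.json'):
    """Copy every file under source_dir into public_dir under a
    digest-stamped name and write a manifest with 'files' and 'assets'."""
    manifest = {'files': {}, 'assets': {}}
    os.makedirs(public_dir, exist_ok=True)
    for root, _dirs, names in os.walk(source_dir):
        for name in sorted(names):
            src = os.path.join(root, name)
            # Logical path always uses forward slashes, like a URL.
            logical = os.path.relpath(src, source_dir).replace(os.sep, '/')
            with open(src, 'rb') as fh:
                digest = hashlib.md5(fh.read()).hexdigest()
            stem, ext = os.path.splitext(logical)
            hashed = '%s-%s%s' % (stem, digest, ext)
            dest = os.path.join(public_dir, *hashed.split('/'))
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.copy2(src, dest)
            stat = os.stat(src)
            mtime = datetime.fromtimestamp(stat.st_mtime, timezone.utc)
            manifest['files'][hashed] = {
                'logical_path': logical,
                'mtime': mtime.isoformat(),
                'size': stat.st_size,
                'digest': digest,
            }
            manifest['assets'][logical] = hashed
    with open(os.path.join(public_dir, manifest_name), 'w') as fh:
        json.dump(manifest, fh, indent=4)
    return manifest
```

At startup the app would load the manifest once and look up
manifest['assets']['cat.jpg'] whenever it needs the public filename.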

This also leads me to my next question.  Do we currently have a way to
easily collect files from multiple source asset / static directories into
one master public directory for serving with Nginx in production
(similar to Django's collectstatic command or the Rails assets:precompile
rake job)?
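
To make the question concrete, here is a rough sketch of what such a
collection step might look like with only the standard library.  The
function name collect_static is hypothetical, and the rule that the first
directory containing a given relative path wins is an assumption of this
sketch, not an existing Flask feature.

```python
import os
import shutil


def collect_static(source_dirs, dest_dir):
    """Copy files from several static directories into dest_dir.

    Earlier directories take precedence: if the same relative path exists
    in more than one source, the first one found is kept.  Returns a dict
    mapping each relative path to the source file it was copied from.
    """
    collected = {}
    for source_dir in source_dirs:
        for root, _dirs, names in os.walk(source_dir):
            for name in names:
                src = os.path.join(root, name)
                rel = os.path.relpath(src, source_dir)
                if rel in collected:
                    continue  # an earlier directory already provided it
                dest = os.path.join(dest_dir, rel)
                os.makedirs(os.path.dirname(dest), exist_ok=True)
                shutil.copy2(src, dest)
                collected[rel] = src
    return collected
```

Nginx could then be pointed at dest_dir alone, regardless of how many
source directories the application and its extensions ship.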

I would love to hear how my fellow Flask developers are tackling these
things at the current time :)

Thanks heaps for all your time
Fotis

Re: [flask] Flask Cache Busting Static Content (incl. Images)

From:
Matthias Urlichs
Date:
2014-05-08 @ 14:31
Hi,

Fotis Gimian:
> I was wondering if a current mechanism exists to perform cache busting of
> static files such as images, media files and CSS / JS?

I calculate the hash when I upload the file (or initially fill the site).
Each file gets mirrored by a database-backed object that knows the file's
storage location and physical path, which is calculated from the object ID
and the hash (hashed by object ID, so that I won't end up with a million-
file directory).
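
As an illustration only (this is not the code from the gist below), a
physical path derived from an object ID and a content hash might be built
like this.  The function storage_path, the bucket layout and the fanout of
1000 are all assumptions made for the sketch.

```python
import os


def storage_path(root, object_id, digest, ext='', fanout=1000):
    """Build a path like <root>/<bucket>/<object_id>-<digest><ext>.

    The bucket is the object ID modulo `fanout`, zero-padded, so files are
    spread over at most `fanout` subdirectories.
    """
    width = len(str(fanout - 1))
    bucket = '%0*d' % (width, object_id % fanout)
    filename = '%d-%s%s' % (object_id, digest, ext)
    return os.path.join(root, bucket, filename)
```

For example, storage_path('/srv/files', 1234567, 'abc', '.jpg') places the
file in the '567' bucket under /srv/files.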

The guts of that are here: https://gist.github.com/smurfix/827026360f7ee3245b86

There's another object that links the website and the generic URL
(http://www.example/static/js/jquery.js) to the concrete current revision
of a file. This way, url_for("asset","js/jQuery.min.js")
can emit "/static/js/jquery.js" which then gets redirected, or I can tell
the storage driver to refer directly to the path, or I can tell it to emit
a link to a cloud-based URL which mirrors the directory.

All of this is of course cached, and I plan to push the change notices
through RabbitMQ (instead of processing them locally with Blinker) so that
I can run the site consistently on more than one backend, and process
"here's a new file, please send it to the cloud server" as (a)synchronously
as I want.

-- 
-- Matthias Urlichs

Re: [flask] Flask Cache Busting Static Content (incl. Images)

From:
Fotis Gimian
Date:
2014-05-09 @ 12:04
Hey there Matthias, very cool solution! :)

Thanks a lot for your reply and the code snippet!
Fotis

