librelist archives

tnetstrings speed test

From:
Zed A. Shaw
Date:
2011-03-21 @ 00:19
I got the naive tnetstrings working well and now I have a simple speed
test harness going:

http://codepad.org/Uj42SuMo

Preliminary roundtrip tests with 300k runs give about:

JSON:
    real    3m24.780s
    user    3m24.520s
    sys     0m0.020s

tnetstrings:
    real    1m5.692s
    user    1m5.650s
    sys     0m0.010s

SimpleJSON:
    real    0m57.427s
    user    0m57.350s
    sys     0m0.030s

So, the stock JSON in python sort of sucks, but apparently simplejson
includes a little _speedups.c module that triples its performance.
That's very interesting to find out (and sort of annoying).

The naive tnetstrings then is nearly as fast as simplejson with its C
library (about 14% slower in the numbers above), and 3x the speed of
stock JSON.  Now the question is whether that naive implementation can
be sped up any, or whether it is only possible to speed it up with C the
way SimpleJSON did.

As for the code size, tnetstrings wins by a whole hell of a lot being
about 1/10th the size.

Finally, the real test will be how it fares in C.  The whole purpose is
really to have a fast thing that Mongrel2 can use for "internal" stuff
like proxy handlers, control port stuff, etc.

-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] tnetstrings speed test

From:
Jason Miller
Date:
2011-03-22 @ 22:24
On 17:19 Sun 20 Mar     , Zed A. Shaw wrote:
> I got the naive tnetstrings working well and now I have a simple speed
> test harness going:
I said this on IRC, but posting it here for the record.

I've implemented a simple tnetstrings for common lisp here:
https://github.com/jasom/tnetstring

Note that for deeply nested structures, writing out tnetstrings is going
to be slower than JSON, since tnetstrings needs to render all child
structures before it can write out the length of the parent structure.
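
A minimal Python 3 sketch of that point, not code from this thread: a
naive encoder where every container must fully render its children
before it knows its own length prefix. The function name `dump` and the
choice of `bytes` for strings are assumptions made for illustration.

```python
def dump(value):
    """Naively encode a value as a tnetstring (bytes).

    Children are rendered first, because the parent's length prefix
    cannot be written until the size of its whole payload is known.
    """
    if value is None:
        payload, tag = b"", b"~"
    elif isinstance(value, bool):  # must come before int: bool is an int subclass
        payload, tag = (b"true" if value else b"false"), b"!"
    elif isinstance(value, int):
        payload, tag = str(value).encode("ascii"), b"#"
    elif isinstance(value, bytes):
        payload, tag = value, b","
    elif isinstance(value, list):
        # Every child becomes its own little bytes object before the join.
        payload, tag = b"".join(dump(item) for item in value), b"]"
    elif isinstance(value, dict):
        payload = b"".join(dump(k) + dump(v) for k, v in value.items())
        tag = b"}"
    else:
        raise TypeError("cannot encode %r" % type(value))
    return str(len(payload)).encode("ascii") + b":" + payload + tag


if __name__ == "__main__":
    print(dump({b"headers": {b"cookies": [b"a=1", b"b=2"]}}))
```

Each nesting level copies everything beneath it once more, which is
where the extra cost for deep structures comes from.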

On the other hand, decoding deeply nested tnetstrings is a big win vs
JSON, since you don't have to search for (possibly quoted) delimiters.

I don't think this will affect mongrel2 much though, since I think the
maximum depth mongrel2 is going to encode should be 3:

  headers dictionary:
    cookies dictionary:
       list of cookies

-Jason

P.S.
If you want to see the asymmetry in relative performance on
encode/decode try out something like:

"243:238:233:228:223:218:213:208:203:198:193:188:183:178:173:168:163:158:153:148:143:138:133:128:123:118:113:108:103:99:95:91:87:83:79:75:71:67:63:59:55:51:47:43:39:35:31:27:23:19:15:11:hello-there,]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]"

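The test string above can be rebuilt programmatically, which makes it
easy to vary the depth when benchmarking. This is a small Python 3
sketch (3.5 or later, for bytes formatting); the helper name `nest` is
made up for illustration and is not from any tnetstring library.

```python
def nest(payload, depth):
    """Wrap a byte string in `depth` levels of single-element lists."""
    # Innermost element: a plain string, tagged with ','.
    encoded = b"%d:%s," % (len(payload), payload)
    # Each wrap needs the full length of what it contains, so the
    # string has to be built from the inside out.
    for _ in range(depth):
        encoded = b"%d:%s]" % (len(encoded), encoded)
    return encoded


if __name__ == "__main__":
    example = nest(b"hello-there", 51)
    print(example.decode("ascii"))
```

With a depth of 51 this should reproduce the string in the P.S., whose
length prefixes run from 11 on the inside to 243 on the outside.
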
Re: [mongrel2] tnetstrings speed test

From:
Ryan Kelly
Date:
2011-03-22 @ 22:55
On Tue, 2011-03-22 at 15:24 -0700, Jason Miller wrote:
> On 17:19 Sun 20 Mar     , Zed A. Shaw wrote:
> > I got the naive tnetstrings working well and now I have a simple speed
> > test harness going:
> I said this on IRC, but posting it here for the record.
> 
> I've implemented a simple tnetstrings for common lisp here:
> https://github.com/jasom/tnetstring
> 
> Note that for deeply-nested structures, writing out tnetstrings is going
> to be slower than json, since tnetstrings needs to render all children
> structures before it can write-out the length of the parent structure.

Yeah, that's a pain.  The pure-python version has the same problem: it
has to generate lots of little string objects to figure out their
lengths, then join them all together.

A trick that seems to work nicely in the C-extension version is to
render everything in reverse, i.e. you output the typecode tag, then
recursively render the nested structures in reverse, then write the
length.  This lets you write everything into one big char* buffer, then
do a single in-place reverse to fix it up at the end.
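
The reverse-rendering idea can be sketched in Python 3 with a
`bytearray` standing in for the char* buffer. This is only an
illustration of the technique as described here, not a port of the
actual C extension; the names `_render_reversed` and `dumps_reversed`
are invented for the example.

```python
def _render_reversed(value, out):
    """Append the tnetstring for `value` to bytearray `out`, back to front."""
    if isinstance(value, (list, dict)):
        # The type tag is the last byte of the output, so it goes in first.
        out.extend(b"]" if isinstance(value, list) else b"}")
        mark = len(out)
        if isinstance(value, list):
            for item in reversed(value):
                _render_reversed(item, out)
        else:
            for key, item in reversed(list(value.items())):
                _render_reversed(item, out)  # value before key: we run backwards
                _render_reversed(key, out)
    else:
        if value is None:
            tag, payload = b"~", b""
        elif isinstance(value, bool):  # check before int: bool is an int subclass
            tag, payload = b"!", (b"true" if value else b"false")
        elif isinstance(value, int):
            tag, payload = b"#", str(value).encode("ascii")
        elif isinstance(value, bytes):
            tag, payload = b",", value
        else:
            raise TypeError("cannot encode %r" % type(value))
        out.extend(tag)
        mark = len(out)
        out.extend(payload[::-1])
    # Only now is the payload size known; write ':' and the reversed digits.
    size = len(out) - mark
    out.extend(b":")
    out.extend(str(size).encode("ascii")[::-1])


def dumps_reversed(value):
    out = bytearray()
    _render_reversed(value, out)
    out.reverse()  # the single in-place reverse that fixes everything up
    return bytes(out)


if __name__ == "__main__":
    print(dumps_reversed({b"a": [1, b"x"]}))
```

Nothing is joined or copied per nesting level; the length of each
structure falls out of how far the buffer grew while rendering it.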

Not sure how feasible this would be in higher-level languages.

Interestingly, the cjson module also renders JSON by producing lots of
little strings and joining them all together, even though it doesn't
have to.  Because of this, _tnetstring absolutely *smokes* cjson
rendering your deeply nested list example - almost 85% speedup on my
machine.


  Cheers,

     Ryan

> On the other-hand decoding deeply-nested tnetstrings is a big win vs
> json since you don't have to search for (possibly quoted) delimiters.
> 
> I don't think this will affect  mongrel2 much though, since I think the
> maximum depth mongrel2 is going to encode should be 3:
> 
>   headers dictionary:
>     cookies dictionary:
>        list of cookies
> 
> -Jason
> 
> P.S.
> If you want to see the asymmetry in relative performance on
> encode/decode try out something like:
> 
> "243:238:233:228:223:218:213:208:203:198:193:188:183:178:173:168:163:158:153:148:143:138:133:128:123:118:113:108:103:99:95:91:87:83:79:75:71:67:63:59:55:51:47:43:39:35:31:27:23:19:15:11:hello-there,]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]"
> 
> 

-- 
Ryan Kelly
http://www.rfk.id.au  |  This message is digitally signed. Please visit
ryan@rfk.id.au        |  http://www.rfk.id.au/ramblings/gpg/ for details

Re: [mongrel2] tnetstrings speed test

From:
Tordek
Date:
2011-03-23 @ 05:46
http://dpaste.de/dFOH/

Here's a simple PHP one. It's 10 times slower than the native JSON
parser, which I assume is done in C.

Re: [mongrel2] tnetstrings speed test

From:
Loic d'Anterroches
Date:
2011-03-23 @ 07:52

On 2011-03-23 06:46, Tordek wrote:
> http://dpaste.de/dFOH/
> 
> Here's a simple PHP one. It's 10 times slower than the native JSON
> parser, which I assume is done in C.

Great! I will work on it to find the hot spots and see if we can do
better. Anyway, I will create a PHP C extension for tnetstrings.

Yes, json_encode and json_decode are coded in C. Also, note that
json_encode is not binary safe in PHP, whereas tnetstrings are.

loïc

Re: [mongrel2] tnetstrings speed test

From:
Ryan Kelly
Date:
2011-03-23 @ 08:11
On Wed, 2011-03-23 at 08:52 +0100, Loic d'Anterroches wrote:
> 
> On 2011-03-23 06:46, Tordek wrote:
> > http://dpaste.de/dFOH/
> > 
> > Here's a simple PHP one. It's 10 times slower than the native JSON
> > parser, which I assume is done in C.
> 
> Great! I will work on it to find the hot spots and see if we can do
> better. Anyway, I will create a PHP C extension for tnetstrings.

I've put my python C-extension up on github, feel free to cannibalise
any of it if you want.  The stuff in tns_core.c should be fairly
re-usable:

  https://github.com/rfk/tnetstring/blob/master/tnetstring/tns_core.c


 Cheers,

    Ryan

-- 
Ryan Kelly
http://www.rfk.id.au  |  This message is digitally signed. Please visit
ryan@rfk.id.au        |  http://www.rfk.id.au/ramblings/gpg/ for details

Re: [mongrel2] tnetstrings speed test

From:
Zed A. Shaw
Date:
2011-03-25 @ 20:33
On Wed, Mar 23, 2011 at 07:11:28PM +1100, Ryan Kelly wrote:
> I've put my python C-extension up on github, feel free to cannibalise
> any of it if you want.  The stuff in tns_core.c should be fairly
> re-usable:

Damn, I gotta read the mailing list more often.  I'll check this out
more and maybe ditch what I've got.



-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] tnetstrings speed test

From:
Loic d'Anterroches
Date:
2011-03-23 @ 08:18

On 2011-03-23 09:11, Ryan Kelly wrote:
> On Wed, 2011-03-23 at 08:52 +0100, Loic d'Anterroches wrote:
>>
>> On 2011-03-23 06:46, Tordek wrote:
>>> http://dpaste.de/dFOH/
>>>
>>> Here's a simple PHP one. It's 10 times slower than the native JSON
>>> parser, which I assume is done in C.
>>
>> Great! I will work on it to find the hot spots and see if we can do
>> better. Anyway, I will create a PHP C extension for tnetstrings.
> 
> I've put my python C-extension up on github, feel free to cannibalise
> any of it if you want.  The stuff in tns_core.c should be fairly
> re-usable:
> 
>   https://github.com/rfk/tnetstring/blob/master/tnetstring/tns_core.c

Thanks a lot! And your code is a pleasure to read. Nicely indented, with
nice spacing and vertical rhythm.

loïc

Re: [mongrel2] tnetstrings speed test

From:
Tordek
Date:
2011-03-21 @ 18:03
On 20/03/11 21:19, Zed A. Shaw wrote:
> The naive tnetstrings then is at least as fast as simplejson with its C
> library, and 3x the speed of stock JSON.  Now the question is if that
> naive implementation can be sped up any, or is it only possible to speed
> it up with C the way SimpleJSON did.

With some trivial optimizations (still naïve; just faster naïve) 
I've gotten ~10% speedup that now reliably beats simplejson on my 
machine:

simplejson:  14.1191418171
Zed's:  15.4922268391
Tordek's:  14.0526180267

http://codepad.org/4pnvwqCd

Each is tested with:

print "Tordek's: ", timeit.timeit("thrash_tnetstrings()",
    "from tnetstr import thrash_tnetstrings", number=100000)

or similar.
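
For anyone repeating this on Python 3, a comparable harness might look
like the sketch below. The sample payload and the names `SAMPLE`,
`thrash_json` and `time_roundtrip` are assumptions for illustration;
the real test functions are in the codepad links. Only the stdlib json
module is timed here, but any roundtrip function can be passed in.

```python
import json
import timeit

# Hypothetical payload, loosely shaped like nested request headers.
SAMPLE = {"headers": {"cookies": ["a=1", "b=2"]}, "count": 12345, "ok": True}


def thrash_json():
    """One encode/decode roundtrip through the stdlib json module."""
    return json.loads(json.dumps(SAMPLE))


def time_roundtrip(func, number=100000):
    """Return the total seconds taken to call `func` `number` times."""
    return timeit.timeit(func, number=number)


if __name__ == "__main__":
    print("json: ", time_roundtrip(thrash_json, number=1000))
```

Passing the callable directly avoids the setup-string import used in
the Python 2 one-liner above.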

-- 
Guillermo O. «Tordek» Freschi. Programador, Escritor, Genio Maligno.
http://tordek.com.ar :: http://twitter.com/tordek
http://www.arcanopedia.com.ar - Juegos de Rol en Argentina

Re: [mongrel2] tnetstrings speed test

From:
Zed A. Shaw
Date:
2011-03-21 @ 22:33
On Mon, Mar 21, 2011 at 03:03:29PM -0300, Tordek wrote:
> On 20/03/11 21:19, Zed A. Shaw wrote:
> > The naive tnetstrings then is at least as fast as simplejson with its C
> > library, and 3x the speed of stock JSON.  Now the question is if that
> > naive implementation can be sped up any, or is it only possible to speed
> > it up with C the way SimpleJSON did.
> 
> With some trivial optimizations (still naïve; just faster naïve) 
> I've gotten ~10% speedup that now reliably beats simplejson on my 
> machine:

Cool, I'll incorporate this when I hack on it tonight.

-- 
Zed A. Shaw
http://zedshaw.com/

Re: [mongrel2] tnetstrings speed test

From:
Tordek
Date:
2011-03-21 @ 23:36
On 21/03/11 19:33, Zed A. Shaw wrote:
> On Mon, Mar 21, 2011 at 03:03:29PM -0300, Tordek wrote:
>>  On 20/03/11 21:19, Zed A. Shaw wrote:
>>  >  The naive tnetstrings then is at least as fast as simplejson with its C
>>  >  library, and 3x the speed of stock JSON.  Now the question is if that
>>  >  naive implementation can be sped up any, or is it only possible to speed
>>  >  it up with C the way SimpleJSON did.
>>
>>  With some trivial optimizations (still naïve; just faster naïve)
>>  I've gotten ~10% speedup that now reliably beats simplejson on my
>>  machine:
>
> Cool, I'll incorporate this when I hack on it tonight.
>

And here's Darren Rush's idea, which skips an assignment; more
repetition, but another 10% or so:

def parse_tnetstring(data):
     payload, payload_type, remain = parse_payload(data)

     if payload_type == ',':
         return payload, remain
     elif payload_type == '#':
         return int(payload), remain
     elif payload_type == '!':
         return payload == 'true', remain
     elif payload_type == '~':
         assert len(payload) == 0, "Payload must be 0 length for null."
         return None, remain
     elif payload_type == '}':
         return parse_dict(payload), remain
     elif payload_type == ']':
         return parse_list(payload), remain
     else:
         assert False, "Invalid payload type: %r" % payload_type
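
The function above calls parse_payload, parse_dict and parse_list,
which are not shown in this message. The sketch below is a guess at how
they could look, written for Python 3 with bytes and with ValueError in
place of assert; it is not the code from the codepad links. A compact
copy of parse_tnetstring is included so the block runs on its own.

```python
def parse_payload(data):
    """Split b'<len>:<payload><type><rest>' into (payload, type, rest)."""
    length, sep, rest = data.partition(b":")
    if not sep:
        raise ValueError("missing ':' after length prefix")
    size = int(length.decode("ascii"))
    # The length prefix says exactly where the payload ends: no scanning
    # for delimiters, just slicing.
    payload, payload_type, remain = rest[:size], rest[size:size + 1], rest[size + 1:]
    if len(payload) != size or not payload_type:
        raise ValueError("data shorter than declared length")
    return payload, payload_type, remain


def parse_list(data):
    items = []
    while data:
        item, data = parse_tnetstring(data)
        items.append(item)
    return items


def parse_dict(data):
    result = {}
    while data:
        key, data = parse_tnetstring(data)
        value, data = parse_tnetstring(data)
        result[key] = value
    return result


def parse_tnetstring(data):
    payload, payload_type, remain = parse_payload(data)
    if payload_type == b",":
        return payload, remain
    if payload_type == b"#":
        return int(payload.decode("ascii")), remain
    if payload_type == b"!":
        return payload == b"true", remain
    if payload_type == b"~":
        if payload:
            raise ValueError("payload must be 0 length for null")
        return None, remain
    if payload_type == b"}":
        return parse_dict(payload), remain
    if payload_type == b"]":
        return parse_list(payload), remain
    raise ValueError("invalid payload type: %r" % payload_type)


if __name__ == "__main__":
    print(parse_tnetstring(b"15:1:a,8:1:1#1:x,]}"))
```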


-- 
Guillermo O. «Tordek» Freschi. Programador, Escritor, Genio Maligno.
http://tordek.com.ar :: http://twitter.com/tordek
http://www.arcanopedia.com.ar - Juegos de Rol en Argentina

Re: [mongrel2] tnetstrings speed test

From:
Bobby Powers
Date:
2011-03-22 @ 03:18
I get mostly similar results.  Runs of 50k (I'm impatient), like:
[bpowers@vyse tns]$ time python zed.py tns 50000
(using 'user' time)
[bpowers@vyse tns]$ python --version
Python 2.7.1

Zed: 0m8.486s
Tordek: 0m7.708s
Darren: 0m7.544s

But, json got a lot faster:
json: 0m7.510s
simplejson: 0m7.320s
(simplejson v2.1.3)

Wow!  json tripled in speed, and is now almost as fast as simplejson!
This perplexed me until I found this gem in the python 2.7 release
notes:
* Updated module: The json module was upgraded to version 2.0.9 of the
simplejson package, which includes a C extension that makes encoding
and decoding faster. (Contributed by Bob Ippolito; issue 4136.)
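
One way to check whether a given interpreter is actually using the C
accelerator is to look at the module-level fallbacks in CPython's json
package. This is a sketch that relies on the internal names
`c_scanstring` and `c_make_encoder`, which are implementation details
of CPython rather than documented API, so treat the result as a hint.
The function name is made up for the example.

```python
import json.decoder
import json.encoder


def json_has_c_speedups():
    """Best-effort check for CPython's optional _json C accelerator."""
    # CPython sets these names to None when the _json module is missing.
    scanstring = getattr(json.decoder, "c_scanstring", None)
    make_encoder = getattr(json.encoder, "c_make_encoder", None)
    return scanstring is not None and make_encoder is not None


if __name__ == "__main__":
    print("json C speedups available:", json_has_c_speedups())
```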


yours,
Bobby

On Mon, Mar 21, 2011 at 4:36 PM, Tordek <kedrot@gmail.com> wrote:
>
> On 21/03/11 19:33, Zed A. Shaw wrote:
> > On Mon, Mar 21, 2011 at 03:03:29PM -0300, Tordek wrote:
> >>  On 20/03/11 21:19, Zed A. Shaw wrote:
> >>  >  The naive tnetstrings then is at least as fast as simplejson with its C
> >>  >  library, and 3x the speed of stock JSON.  Now the question is if that
> >>  >  naive implementation can be sped up any, or is it only possible to speed
> >>  >  it up with C the way SimpleJSON did.
> >>
> >>  With some trivial optimizations (still naïve; just faster naïve)
> >>  I've gotten ~10% speedup that now reliably beats simplejson on my
> >>  machine:
> >
> > Cool, I'll incorporate this when I hack on it tonight.
> >
>
> And here's Darren Rush's idea, which skips an assignment; more
> repetition, but another 10% or so:
>
> def parse_tnetstring(data):
>     payload, payload_type, remain = parse_payload(data)
>
>     if payload_type == ',':
>         return payload, remain
>     elif payload_type == '#':
>         return int(payload), remain
>     elif payload_type == '!':
>         return payload == 'true', remain
>     elif payload_type == '~':
>         assert len(payload) == 0, "Payload must be 0 length for null."
>         return None, remain
>     elif payload_type == '}':
>         return parse_dict(payload), remain
>     elif payload_type == ']':
>         return parse_list(payload), remain
>     else:
>         assert False, "Invalid payload type: %r" % payload_type
>
>
> --
> Guillermo O. «Tordek» Freschi. Programador, Escritor, Genio Maligno.
> http://tordek.com.ar :: http://twitter.com/tordek
> http://www.arcanopedia.com.ar - Juegos de Rol en Argentina