- Potential Morton code optimization opportunity? by Ethan Hereth
- Re: [p4est] Potential Morton code optimization opportunity? by Tobin Isaac

- From:
- Ethan Hereth
- Date:
- 2014-10-28 @ 21:36

Hey Carsten et al., I came across this blog post the other day presenting a couple ways of significantly optimizing the Morton encoding logic. He terms the two optimizations the "Magic bits" and "Look up table" methods. The numbers seem to speak for themselves in terms of the speedups gained from these optimization. I.e. he gets order of magnitude or more speedups with these methods. I dug around in the code to see if this is applicable to p4est but I'll be honest with you all, all those &, |, <<, >>, etc. make my head spin. My direct background is not in computer science and I have a really hard time figuring out what is going on. I really appreciate this code, that doesn't mean I really understand it!! It doesn't seem to me that these methods are directly applicable to what p4est does but I think it's possible that it could be? I figured that instead of trying to figure it out for myself, which I don't really have time for anyway, I'd pass on the information and see if somebody more qualified than me thinks it's useful! The link is here: http://www.forceflow.be/2013/10/07/morton-encodingdecoding-through-bit-interleaving-implementations/ Is this something that p4est could benefit from? It looks like to me that them method p4est uses currently is the 'For loop based' method which is by far the slowest of the methods presented in the blog. Please forgive me the noise if this is not at all relevant to p4est, but I figured it couldn't hurt to point it out just in case! Cheers! Ethan Alan

- From:
- Tobin Isaac
- Date:
- 2014-10-28 @ 22:29

Hi Ethan Alan, Interesting post. I was aware of the magic bits approach, but I hadn't considered the utility of a lookup table. In p4est, we almost always deal with the "decoded" coordinates, and rarely with the "encoded" Morton id. There are a few places where we need to get the next/previous quadrant wrt Morton id, but these do not appear to be in performance-critical loops. What _is_ performance critical to p4est is comparing decoded coordinates without encoding, which is in p4est_quadrant_compare(). This uses one simple lookup table and a handful of bitwise operations (its a little more complicated because we allow negative coordinates). While this could be improved using intrinsics to get at the assembly (at one point I think I remember Hari Sundar saying this is done in dendro), it wouldn't be portable, and I don't expect it to make a big difference. Still, maybe an interesting project for a student? Toby On Tue, Oct 28, 2014 at 05:36:04PM -0400, Ethan Hereth wrote: > Hey Carsten et al., > > I came across this blog post the other day presenting a couple ways of > significantly optimizing the Morton encoding logic. He terms the two > optimizations the "Magic bits" and "Look up table" methods. The numbers > seem to speak for themselves in terms of the speedups gained from these > optimization. I.e. he gets order of magnitude or more speedups with these > methods. > > I dug around in the code to see if this is applicable to p4est but I'll be > honest with you all, all those &, |, <<, >>, etc. make my head spin. My > direct background is not in computer science and I have a really hard time > figuring out what is going on. I really appreciate this code, that doesn't > mean I really understand it!! > > It doesn't seem to me that these methods are directly applicable to what > p4est does but I think it's possible that it could be? I figured that > instead of trying to figure it out for myself, which I don't really have > time for anyway, I'd pass on the information and see if somebody more > qualified than me thinks it's useful! > > The link is here: > http://www.forceflow.be/2013/10/07/morton-encodingdecoding-through-bit-interleaving-implementations/ > > Is this something that p4est could benefit from? It looks like to me that > them method p4est uses currently is the 'For loop based' method which is by > far the slowest of the methods presented in the blog. > > Please forgive me the noise if this is not at all relevant to p4est, but I > figured it couldn't hurt to point it out just in case! > > Cheers! > > Ethan Alan