cubed sphere mesh with p6est

From:
Mueller, Andreas Fornatl, Gy, National Research Council Research Associate
Date:
2015-07-01 @ 00:51
Dear p4est developers,

Yesterday I tried to run a scalability study on the BG/Q Mira. My binaries
work perfectly on 1024 nodes on Mira. However, when I tried to run this
simulation (same binary, same settings, same input file) on 28322 nodes
(1812608 MPI processes), p4est failed. Below is the beginning of the
log output. I’m using a cubed sphere mesh with 5437824 columns (3 columns
per MPI process). Any suggestions for what I could do to make this work?

Best,
Andreas

[libsc] This is libsc 1.1.23-fbcc
[libsc] CPP                      mpixlc_r -E
[libsc] CPPFLAGS                 -I/soft/libraries/alcf/current/xl/ZLIB/include
[libsc] CC                       mpixlc_r
[libsc] CFLAGS                   -g
[libsc] LDFLAGS                  -L/soft/libraries/alcf/current/xl/ZLIB/lib
[libsc] LIBS                       -lz   
[p4est] This is p4est 0.3.5.612-07bb-dirty
[p4est] CPP                      mpixlc_r -E
[p4est] CPPFLAGS                 -I/soft/libraries/alcf/current/xl/ZLIB/include
[p4est] CC                       mpixlc_r
[p4est] CFLAGS                   -g
[p4est] LDFLAGS                  -L/soft/libraries/alcf/current/xl/ZLIB/lib
[p4est] LIBS                       -lz   
[p4est] Into p4est_new with min quadrants 0 level 0 uniform 1
[p4est]  New p4est with 6 trees on 1 processors
[p4est]  Initial level 0 potential global quadrants 6 per tree 1
[p4est] Done p4est_new with 6 total quadrants
[p4est] Into p4est_ghost_new CORNER
[p4est] Done p4est_ghost_new
[p4est] Into p4est_lnodes_new, degree 119
[p4est] Done p4est_lnodes_new with 84968 global nodes
[p4est] Into p6est_new with min layers 0 z-level 3
[p4est]  Into p4est_new with min quadrants 0 level 0 uniform 1
[p4est]   New p4est with 84966 trees on 1812608 processors
[p4est]   Initial level 0 potential global quadrants 84966 per tree 1
[p4est]  Done p4est_new with 84966 total quadrants
[p4est] Done p6est_new with 424830 total layers in 84966 total columns
[p4est] Into p6est_refine_columns with 424830 total layers in 84966 total columns
[p4est]  Into p4est_refine with 84966 total quadrants, allowed level 29
[p4est]  Done p4est_refine with 339864 total quadrants
[p4est] Done p6est_refine_columns with 1699320 total layers in 339864 total columns
[p4est] Into p6est_refine_columns with 1699320 total layers in 339864 total columns
[p4est]  Into p4est_refine with 339864 total quadrants, allowed level 29
[p4est]  Done p4est_refine with 1359456 total quadrants
[p4est] Done p6est_refine_columns with 6797280 total layers in 1359456 total columns
[p4est] Into p6est_refine_columns with 6797280 total layers in 1359456 total columns
[p4est]  Into p4est_refine with 1359456 total quadrants, allowed level 29
[p4est]  Done p4est_refine with 5437824 total quadrants
[p4est] Done p6est_refine_columns with 27189120 total layers in 5437824 total columns
[p4est] Into p6est_refine_layers with 27189120 total layers in 5437824 total columns, allowed level -1
[p4est] Done p6est_refine_layers with 54378240 total layers  in 5437824 total columns
[p4est] Into p6est_parition with 54378240 total layers in 5437824 total columns
[p4est]  Into p4est_partition with 5437824 total quadrants
[p4est]   Into p4est_partition_given with 5437824 total quadrants
[libsc 163279] Abort: Allocation
[libsc 5263] Abort: Allocation
[libsc 168463] Abort: Allocation
[libsc 197135] Abort: Allocation
[libsc 98767] Abort: Allocation
[libsc 163279] Abort: src/sc.c:286
[libsc 198351] Abort: Allocation
[libsc 5263] Abort: src/sc.c:286
[libsc 175695] Abort: Allocation
[libsc 26767] Abort: Allocation
[libsc 317647] Abort: Allocation
[libsc 168463] Abort: src/sc.c:286
[libsc 219919] Abort: Allocation
[libsc 124111] Abort: Allocation
[libsc 197135] Abort: src/sc.c:286
[libsc 247695] Abort: Allocation
[libsc 331343] Abort: Allocation
[libsc 287759] Abort: Allocation
[libsc 29391] Abort: Allocation
[libsc 141327] Abort: Allocation
[libsc 276175] Abort: Allocation
[libsc 5519] Abort: Allocation
[libsc 52047] Abort: Allocation
[libsc 253007] Abort: Allocation
[libsc 210063] Abort: Allocation
[libsc 349199] Abort: Allocation
[libsc 98767] Abort: src/sc.c:286
[libsc 184207] Abort: Allocation
[libsc 126607] Abort: Allocation
[libsc 326031] Abort: Allocation
[libsc 655] Abort: Allocation
[libsc 163599] Abort: Allocation
[libsc 20815] Abort: Allocation
[libsc 219727] Abort: Allocation
[libsc 223887] Abort: Allocation
[libsc 338383] Abort: Allocation
[libsc 163279] Abort: Obtained 12 stack frames
[libsc 93455] Abort: Allocation
[libsc 21519] Abort: Allocation
[libsc 449103] Abort: Allocation
[libsc 317135] Abort: Allocation
[libsc 44111] Abort: Allocation
[libsc 235407] Abort: Allocation
[libsc 357007] Abort: Allocation
[libsc 194191] Abort: Allocation
[libsc 77263] Abort: Allocation
[libsc 225039] Abort: Allocation
[libsc 104207] Abort: Allocation
[libsc 172367] Abort: Allocation
[libsc 122767] Abort: Allocation
[libsc 198351] Abort: src/sc.c:286
[libsc 163983] Abort: Allocation
[libsc 129999] Abort: Allocation
[libsc 201615] Abort: Allocation
[libsc 112143] Abort: Allocation
[libsc 233743] Abort: Allocation
[libsc 285263] Abort: Allocation
[libsc 342095] Abort: Allocation
[libsc 26127] Abort: Allocation
[libsc 22479] Abort: Allocation
[libsc 127311] Abort: Allocation
[libsc 35663] Abort: Allocation
[libsc 115471] Abort: Allocation
[libsc 461967] Abort: Allocation
[libsc 5263] Abort: Obtained 12 stack frames
[libsc 373903] Abort: Allocation
[libsc 422095] Abort: Allocation
[libsc 533135] Abort: Allocation
[libsc 576143] Abort: Allocation
[libsc 363983] Abort: Allocation
[libsc 380239] Abort: Allocation
[libsc 436751] Abort: Allocation
[libsc 367439] Abort: Allocation
[libsc 175695] Abort: src/sc.c:286
[libsc 77647] Abort: Allocation
[libsc 107471] Abort: Allocation
[libsc 160911] Abort: Allocation
[libsc 94095] Abort: Allocation
[libsc 160015] Abort: Allocation
[libsc 180303] Abort: Allocation
[libsc 394895] Abort: Allocation
[libsc 255247] Abort: Allocation
[libsc 566927] Abort: Allocation
[libsc 309391] Abort: Allocation
[libsc 109327] Abort: Allocation
[libsc 117007] Abort: Allocation
[libsc 26767] Abort: src/sc.c:286
...

Re: [p4est] cubed sphere mesh with p6est

From:
Tobin Isaac
Date:
2015-07-01 @ 02:48
Hi Andreas,

Both the p6est_t object and the p4est_t object hold O(P) storage on
every process: it's kept minimal, but it's still there.  The workspace
of the partitioning algorithm holds some more.  Let's count:

p6est_t:
  1 gloidx   = 8 bytes

p4est_t:
 +1 gloidx   = 8 bytes
 +1 quadrant = 20 bytes

p4est_partition_ext():
 +1 locidx   = 4 bytes

p4est_partition_given():
 +3 gloidx   = 24 bytes
 +2 locidx   = 8  bytes
 +2 char     = 2  bytes
-----------------------
             = 74 bytes / process / process
             * 1812608 processes
             * 64 processes / node
-----------------------
             = 8.58 Gbytes / node
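
The tally above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the dictionary keys and variable names are labels made up for this note, not p4est identifiers. The byte sizes are the ones quoted in the tally, and the 64 ranks per node follow from the run description (1812608 processes on 28322 nodes).

```python
# Entries that every process stores for each of the P processes,
# with the byte sizes quoted in the tally above.
# (Keys are labels for this sketch, not p4est identifiers.)
BYTES_PER_ENTRY = {
    "p6est_t: 1 gloidx": 8,
    "p4est_t: 1 gloidx": 8,
    "p4est_t: 1 quadrant": 20,
    "p4est_partition_ext: 1 locidx": 4,
    "p4est_partition_given: 3 gloidx": 24,
    "p4est_partition_given: 2 locidx": 8,
    "p4est_partition_given: 2 char": 2,
}

NUM_PROCESSES = 1812608      # MPI ranks in the failing run
PROCESSES_PER_NODE = 64      # 1812608 ranks / 28322 nodes

bytes_per_entry = sum(BYTES_PER_ENTRY.values())
bytes_per_process = bytes_per_entry * NUM_PROCESSES
bytes_per_node = bytes_per_process * PROCESSES_PER_NODE

print(f"{bytes_per_entry} bytes per process entry")
print(f"{bytes_per_process / 1e6:.1f} MB per process")
print(f"{bytes_per_node / 1e9:.2f} GB per node")
```

The total is in decimal gigabytes (10^9 bytes).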

Congratulations, more than half of your RAM is used up.

We were already seeing this on the scaling curves from juqueen, and
Carsten and I have been talking about what needs to change for O(10^6)
processes.  Those conversations were about API changes for the next
version of p4est, however, and I know you're looking for an immediate
solution.

The fastest solution might be to use MPI_Win_allocate_shared() so that
the 64 processes on each node share one copy of these arrays, reducing
the redundancy by a factor of 64.  The ALCF website says that, although
MPI-3 isn't supported on BG/Q, the RMA features are supported through
PAMI, so I think this might work.  I'll contact you with more details.
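
To put the factor of 64 in numbers, here is a rough estimate. It assumes that all 74 bytes per process of O(P) data counted in this message could live in one node-wide shared allocation. That is an assumption: this message does not say which arrays would be shared, and the partitioning workspace may have to stay private. The variable names are hypothetical.

```python
BYTES_PER_ENTRY = 74         # per-process total from the count in this message
NUM_PROCESSES = 1812608      # MPI ranks in the failing run
PROCESSES_PER_NODE = 64

# Today: every rank keeps a private copy, so a node holds 64 copies.
replicated = BYTES_PER_ENTRY * NUM_PROCESSES * PROCESSES_PER_NODE
# Assumed alternative: one copy per node, shared by all ranks on it.
shared = BYTES_PER_ENTRY * NUM_PROCESSES

print(f"replicated: {replicated / 1e9:.2f} GB per node")
print(f"shared:     {shared / 1e6:.1f} MB per node")
print(f"reduction:  {replicated // shared}x")
```

Under that assumption the per-node footprint of these arrays drops from several gigabytes to roughly 134 MB.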

  Toby
