p4est and MPI IO

From:
Ethan Hereth
Date:
2015-03-20 @ 21:02
Hey all,

This is a vague question I know, but I'm at a bit of a loss here.

I can build p4est and run all of its tests successfully, including
p?est_test_loadsave.

I am attempting to implement a 'restart' functionality in my software that
uses p4est. When I call p4est_save from my software it gets into the
MPI_File_open function and gets stuck. The file that I'm requesting does
get created, but the function never returns.

Clearly, my code compiles and links; I can verify that my software and
p4est link against the same version of MPI.

Is there any extra configuration that I need to do to get MPI IO working on
my end? I see that in test_loadsave2.c there are calls to sc_init and
p4est_init. Are these required?
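
For what it's worth, my save/restart path boils down to roughly the
following (a simplified sketch rather than my actual code; the file name
and data size are placeholders):

#include <p4est.h>

/* Simplified restart sketch.  The connectivity returned by p4est_load
 * must stay alive as long as the forest and be destroyed after it. */

static void
write_restart (p4est_t * p4est)
{
  /* save_data = 1 also stores the per-quadrant user data */
  p4est_save ("restart.p4p", p4est, 1);
}

static p4est_t *
read_restart (MPI_Comm mpicomm, size_t data_size,
              p4est_connectivity_t ** conn_out)
{
  /* load_data = 1 expects user data of size data_size in the file */
  return p4est_load ("restart.p4p", mpicomm, data_size, 1,
                     NULL, conn_out);
}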

Any help would be appreciated.

Thanks, and have a good weekend.

Ethan Alan

Re: [p4est] p4est and MPI IO

From:
Carsten Burstedde
Date:
2015-03-20 @ 21:31
Ethan,

> This is a vague question I know, but I'm at a bit of a loss here.
> 
> I can build p4est and run all of its tests successfully, including
> p?est_test_loadsave.

it's been a long time since that last failed for me, but it did happen.
It hasn't happened anymore since I started compiling mpich myself on my
computers.  What's your MPI implementation, and what command line are
you using?

If it still crashes with mpich, try to break the program down and maybe
write a file with MPI yourself to check the implementation.  We haven't
found a bug in p4est_load/save in a while, but never say never...
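
Something as small as the program below should already tell you whether
collective MPI IO works on that file system (an untested sketch; the file
name is arbitrary, compile it with your mpicc and run it with a few ranks):

#include <mpi.h>
#include <stdio.h>

int
main (int argc, char **argv)
{
  MPI_File            fh;
  int                 rank, mpiret;
  char                buf[64];

  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);

  /* collective open for writing; this is the call p4est_save hangs in */
  mpiret = MPI_File_open (MPI_COMM_WORLD, "mpiio_test.bin",
                          MPI_MODE_WRONLY | MPI_MODE_CREATE,
                          MPI_INFO_NULL, &fh);
  if (mpiret != MPI_SUCCESS) {
    fprintf (stderr, "Rank %d: MPI_File_open failed\n", rank);
    MPI_Abort (MPI_COMM_WORLD, 1);
  }

  /* each rank writes a fixed-size record at its own offset */
  snprintf (buf, sizeof (buf), "hello from rank %d\n", rank);
  MPI_File_write_at (fh, (MPI_Offset) (rank * sizeof (buf)), buf,
                     (int) sizeof (buf), MPI_CHAR, MPI_STATUS_IGNORE);

  MPI_File_close (&fh);
  MPI_Finalize ();
  return 0;
}

If that already hangs in MPI_File_open, the problem is in the MPI
implementation or the file system rather than in p4est.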

Carsten

> I am attempting to implement a 'restart' functionality in my software that
> uses p4est. When I call p4est_save from my software it gets into the
> MPI_File_open function and gets stuck. The file that I'm requesting does
> get created, but the function never returns.
> 
> Clearly, my code compiles and links; I can verify that my software and
> p4est link against the same version of MPI.
> 
> Is there any extra configuration that I need to do to get MPI IO working on
> my end? I see that in test_loadsave2.c there are calls to sc_init and
> p4est_init. Are these required?
> 
> Any help would be appreciated.
> 
> Thanks, and have a good weekend.

Re: [p4est] p4est and MPI IO

From:
Ethan Hereth
Date:
2015-03-23 @ 14:24
Good morning Carsten,

On Fri, Mar 20, 2015 at 5:31 PM, Carsten Burstedde
<burstedde@ins.uni-bonn.de> wrote:

> Ethan,
>
> > This is a vague question I know, but I'm at a bit of a loss here.
> >
> > I can build p4est and run all of its tests successfully, including
> > p?est_test_loadsave.
>
> it's been a long time since that last failed for me, but it did happen.
> It hasn't happened anymore since I started compiling mpich myself on my
> computers.  What's your MPI implementation, and what command line are
> you using?
>
> If it still crashes with mpich, try to break the program down and maybe
> write a file with MPI yourself to check the implementation.  We haven't
> found a bug in p4est_load/save in a while, but never say never...


> Carsten
>

This turned out to be a strange, tricky situation, and the two problems
were in fact unrelated. The p?est_test_loadsave tests were failing because
I had recently installed a Python distribution (Anaconda) that, for some
reason, ships its own suite of MPI tools (mpicc, mpiexec, mpirun, etc.) and
had put them first in my PATH. Those do not work as drop-in replacements
for the real MPI installation! That problem is now fixed.

My restart problem, i.e. the fact that I was not able to run p4est_save()
in my code, turned out to be a file system problem. It took our IT guy and
me quite some time to track down; the file system I was using to run the
test appears to be broken (apparently something to do with file locking),
and p4est_save() now works as expected on a different file system. I'd
never have found that on my own...

Thank you very much for your time and help; I think this one is resolved.

Ethan Alan


> > I am attempting to implement a 'restart' functionality in my software that
> > uses p4est. When I call p4est_save from my software it gets into the
> > MPI_File_open function and gets stuck. The file that I'm requesting does
> > get created, but the function never returns.
> >
> > Clearly, my code compiles and links; I can verify that my software and
> > p4est link against the same version of MPI.
> >
> > Is there any extra configuration that I need to do to get MPI IO working on
> > my end? I see that in test_loadsave2.c there are calls to sc_init and
> > p4est_init. Are these required?
> >
> > Any help would be appreciated.
> >
> > Thanks, and have a good weekend.
>

Re: p4est and MPI IO

From:
Ethan Hereth
Date:
2015-03-20 @ 21:11
It turns out that I lied (by accident, I promise!).

It seems that 'make check' actually does fail sometimes. I've never seen
this before and I assume that it has something to do with my system. I'm
not sure what setting has changed recently to cause this. I'll have to look
into it.

Here's the test-suite.log in case it gives anybody an idea as to what might
be going on.


================================================
   p4est 1.1.105-b240-dirty: ./test-suite.log
================================================

2 of 48 tests failed.

.. contents:: :depth: 2


FAIL: test/p4est_test_loadsave (exit: 1)
========================================

[libsc] This is libsc 1.1.54-7459
[libsc] CPP                      /usr/local/openmpi-1.6.1-gcc/bin/mpicc -E
[libsc] CPPFLAGS
[libsc] CC                       /usr/local/openmpi-1.6.1-gcc/bin/mpicc
[libsc] CFLAGS                   -O3 -Wall
[libsc] LDFLAGS
[libsc] LIBS                     -llapack -lblas -llua -lz -L/usr/local/lib
[p4est] This is p4est 1.1.105-b240-dirty
[p4est] CPP                      /usr/local/openmpi-1.6.1-gcc/bin/mpicc -E
[p4est] CPPFLAGS
[p4est] CC                       /usr/local/openmpi-1.6.1-gcc/bin/mpicc
[p4est] CFLAGS                   -O3 -Wall
[p4est] LDFLAGS
[p4est] LIBS                     -llapack -lblas -llua -lz -L/usr/local/lib
[p4est] Using file names p4est.p4c and p4est.p4p
[p4est] Into p4est_new with min quadrants 0 level 0 uniform 0
[p4est]  New p4est with 6 trees on 1 processors
[p4est]  Initial level 0 potential global quadrants 6 per tree 1
[p4est] Done p4est_new with 6 total quadrants
[p4est] Into p4est_refine with 6 total quadrants, allowed level 29
[p4est] Done p4est_refine with 116163 total quadrants
[p4est] Into p4est_inflate
[p4est] Done p4est_inflate
[p4est] Into p4est_save p4est.p4p
[p4est] Done p4est_save
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 116163 total quadrants
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 116163 total quadrants
[p4est] Into p4est_inflate
[p4est] Done p4est_inflate
[p4est] Into p4est_partition with 116163 total quadrants
[p4est] Done p4est_partition no shipping
[p4est] Into p4est_balance CORNER with 116163 total quadrants
[p4est] Done p4est_balance with 426900 total quadrants
[p4est] Into p4est_save p4est.p4p
[p4est] Done p4est_save
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 426900 total quadrants
[p4est] Into p4est_save p4est.p4p
[p4est] Done p4est_save
[p4est] Into p4est_load p4est.p4p
[libsc] This is libsc 1.1.54-7459
[libsc] CPP                      /usr/local/openmpi-1.6.1-gcc/bin/mpicc -E
[libsc] CPPFLAGS
[libsc] CC                       /usr/local/openmpi-1.6.1-gcc/bin/mpicc
[libsc] CFLAGS                   -O3 -Wall
[libsc] LDFLAGS
[libsc] LIBS                     -llapack -lblas -llua -lz -L/usr/local/lib
[p4est] This is p4est 1.1.105-b240-dirty
[p4est] CPP                      /usr/local/openmpi-1.6.1-gcc/bin/mpicc -E
[p4est] CPPFLAGS
[p4est] CC                       /usr/local/openmpi-1.6.1-gcc/bin/mpicc
[p4est] CFLAGS                   -O3 -Wall
[p4est] LDFLAGS
[p4est] LIBS                     -llapack -lblas -llua -lz -L/usr/local/lib
[p4est] Using file names p4est.p4c and p4est.p4p
[p4est] Into p4est_new with min quadrants 0 level 0 uniform 0
[p4est]  New p4est with 6 trees on 1 processors
[p4est]  Initial level 0 potential global quadrants 6 per tree 1
[p4est] Done p4est_new with 6 total quadrants
[p4est] Into p4est_refine with 6 total quadrants, allowed level 29
[p4est] Done p4est_refine with 116163 total quadrants
[p4est] Into p4est_inflate
[p4est] Done p4est_inflate
[p4est] Into p4est_save p4est.p4p
[libsc 0] Abort: read quadrant with data
[libsc 0] Abort: ../src/p4est.c:3601
[libsc 0] Abort: Obtained 9 stack frames
[libsc 0] Stack 0: libsc-1.1.54-7459.so(sc_abort+0x2f) [0x7fa2f49d3e6f]
[libsc 0] Stack 1: libsc-1.1.54-7459.so(+0x9015) [0x7fa2f49d4015]
[libsc 0] Stack 2: libp4est-1.1.105-b240-dirty.so(p4est_source_ext+0x793) [0x7fa2f4c10153]
[libsc 0] Stack 3: libp4est-1.1.105-b240-dirty.so(p4est_load_ext+0xa9) [0x7fa2f4c103f9]
[libsc 0] Stack 4: libp4est-1.1.105-b240-dirty.so(p4est_load+0x18) [0x7fa2f4c104a8]
[libsc 0] Stack 5: lt-p4est_test_loadsave() [0x401f7f]
[libsc 0] Stack 6: lt-p4est_test_loadsave(main+0xf3) [0x4017d3]
[libsc 0] Stack 7: libc.so.6(__libc_start_main+0xed) [0x7fa2f2a9423d]
[libsc 0] Stack 8: lt-p4est_test_loadsave() [0x4018d5]
[p4est] Done p4est_save
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 116163 total quadrants
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 116163 total quadrants
[p4est] Into p4est_inflate
[p4est] Done p4est_inflate
[p4est] Into p4est_partition with 116163 total quadrants
[p4est] Done p4est_partition no shipping
[p4est] Into p4est_balance CORNER with 116163 total quadrants
[p4est] Done p4est_balance with 426900 total quadrants
[p4est] Into p4est_save p4est.p4p
[p4est] Done p4est_save
[p4est] Into p4est_load p4est.p4p
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 426900 total quadrants
[p4est] Into p4est_save p4est.p4p
[p4est] Done p4est_save
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 426900 total quadrants
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 426900 total quadrants
[p4est] Mean (sigma) for conn load:               0.000266075 (0 = 0%)
[p4est] Mean (sigma) for p4est save 1:            0.107831 (0 = 0%)
[p4est] Mean (sigma) for p4est load 1a:           0.089169 (0 = 0%)
[p4est] Mean (sigma) for p4est load 1b:           0.133394 (0 = 0%)
[p4est] Mean (sigma) for p4est elements:          426900 (0 = 0%)
[p4est] Mean (sigma) for p4est save 2:            0.138642 (0 = 0%)
[p4est] Mean (sigma) for p4est load 2:            0.196907 (0 = 0%)
[p4est] Mean (sigma) for p4est save 3:            0.141814 (0 = 0%)
[p4est] Mean (sigma) for p4est load 3:            0.339907 (0 = 0%)
[p4est] Mean (sigma) for p4est load 4:            0.211425 (0 = 0%)
[p4est] Summary = [ 0.000266075 0.107831 0.089169 0.133394 426900 0.138642 0.196907 0.141814 0.339907 0.211425 ];
[p4est] Maximum = [ 0.000266075 0.107831 0.089169 0.133394 426900 0.138642 0.196907 0.141814 0.339907 0.211425 ];
[p4est] Using file names p4est.p4c and p4est.p4p
[p4est] Into p4est_new with min quadrants 0 level 0 uniform 0
[p4est]  New p4est with 6 trees on 1 processors
[p4est]  Initial level 0 potential global quadrants 6 per tree 1
[p4est] Done p4est_new with 6 total quadrants
[p4est] Into p4est_refine with 6 total quadrants, allowed level 29
[p4est] Done p4est_refine with 116163 total quadrants
[p4est] Into p4est_inflate
[p4est] Done p4est_inflate
[p4est] Into p4est_save p4est.p4p
[p4est] Done p4est_save
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 116163 total quadrants
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 116163 total quadrants
[p4est] Into p4est_inflate
[p4est] Done p4est_inflate
[p4est] Into p4est_partition with 116163 total quadrants
[p4est] Done p4est_partition no shipping
[p4est] Into p4est_balance CORNER with 116163 total quadrants
[p4est] Done p4est_balance with 426900 total quadrants
[p4est] Into p4est_save p4est.p4p
[p4est] Done p4est_save
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 426900 total quadrants
[p4est] Into p4est_save p4est.p4p
[p4est] Done p4est_save
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 426900 total quadrants
[p4est] Into p4est_load p4est.p4p
[p4est]  Into p4est_inflate
[p4est]  Done p4est_inflate
[p4est] Done p4est_load with 426900 total quadrants
[p4est] Mean (sigma) for conn load:               0.0015099 (0 = 0%)
[p4est] Mean (sigma) for p4est save 1:            0.0784469 (0 = 0%)
[p4est] Mean (sigma) for p4est load 1a:           0.0767281 (0 = 0%)
[p4est] Mean (sigma) for p4est load 1b:           0.0665359 (0 = 0%)
[p4est] Mean (sigma) for p4est elements:          426900 (0 = 0%)
[p4est] Mean (sigma) for p4est save 2:            0.110731 (0 = 0%)
[p4est] Mean (sigma) for p4est load 2:            0.191438 (0 = 0%)
[p4est] Mean (sigma) for p4est save 3:            0.143216 (0 = 0%)
[p4est] Mean (sigma) for p4est load 3:            0.189403 (0 = 0%)
[p4est] Mean (sigma) for p4est load 4:            0.163675 (0 = 0%)
[p4est] Summary = [ 0.0015099 0.0784469 0.0767281 0.0665359 426900 0.110731 0.191438 0.143216 0.189403 0.163675 ];
[p4est] Maximum = [ 0.0015099 0.0784469 0.0767281 0.0665359 426900 0.110731 0.191438 0.143216 0.189403 0.163675 ];

FAIL: test/p8est_test_loadsave (exit: 1)
========================================

[libsc] This is libsc 1.1.54-7459
[libsc] CPP                      /usr/local/openmpi-1.6.1-gcc/bin/mpicc -E
[libsc] CPPFLAGS
[libsc] CC                       /usr/local/openmpi-1.6.1-gcc/bin/mpicc
[libsc] CFLAGS                   -O3 -Wall
[libsc] LDFLAGS
[libsc] LIBS                     -llapack -lblas -llua -lz -L/usr/local/lib
[p4est] This is p4est 1.1.105-b240-dirty
[p4est] CPP                      /usr/local/openmpi-1.6.1-gcc/bin/mpicc -E
[p4est] CPPFLAGS
[p4est] CC                       /usr/local/openmpi-1.6.1-gcc/bin/mpicc
[p4est] CFLAGS                   -O3 -Wall
[p4est] LDFLAGS
[p4est] LIBS                     -llapack -lblas -llua -lz -L/usr/local/lib
[p4est] Using file names p8est.p8c and p8est.p8p
[p4est] Into p8est_new with min quadrants 0 level 0 uniform 0
[p4est]  New p8est with 6 trees on 1 processors
[p4est]  Initial level 0 potential global quadrants 6 per tree 1
[p4est] Done p8est_new with 6 total quadrants
[p4est] Into p8est_refine with 6 total quadrants, allowed level 18
[p4est] Done p8est_refine with 21538 total quadrants
[p4est] Into p8est_inflate
[p4est] Done p8est_inflate
[p4est] Into p8est_save p8est.p8p
[p4est] Done p8est_save
[p4est] Into p8est_load p8est.p8p
[p4est]  Into p8est_inflate
[p4est]  Done p8est_inflate
[p4est] Done p8est_load with 21538 total quadrants
[p4est] Into p8est_load p8est.p8p
[p4est]  Into p8est_inflate
[p4est]  Done p8est_inflate
[p4est] Done p8est_load with 21538 total quadrants
[p4est] Into p8est_inflate
[p4est] Done p8est_inflate
[p4est] Into p8est_partition with 21538 total quadrants
[p4est] Done p8est_partition no shipping
[p4est] Into p8est_balance CORNER with 21538 total quadrants
[p4est] Done p8est_balance with 142127 total quadrants
[libsc] This is libsc 1.1.54-7459
[libsc] CPP                      /usr/local/openmpi-1.6.1-gcc/bin/mpicc -E
[libsc] CPPFLAGS
[libsc] CC                       /usr/local/openmpi-1.6.1-gcc/bin/mpicc
[libsc] CFLAGS                   -O3 -Wall
[libsc] LDFLAGS
[libsc] LIBS                     -llapack -lblas -llua -lz -L/usr/local/lib
[p4est] This is p4est 1.1.105-b240-dirty
[p4est] CPP                      /usr/local/openmpi-1.6.1-gcc/bin/mpicc -E
[p4est] CPPFLAGS
[p4est] CC                       /usr/local/openmpi-1.6.1-gcc/bin/mpicc
[p4est] CFLAGS                   -O3 -Wall
[p4est] LDFLAGS
[p4est] LIBS                     -llapack -lblas -llua -lz -L/usr/local/lib
[p4est] Using file names p8est.p8c and p8est.p8p
[p4est] Into p8est_new with min quadrants 0 level 0 uniform 0
[p4est]  New p8est with 6 trees on 1 processors
[p4est]  Initial level 0 potential global quadrants 6 per tree 1
[p4est] Done p8est_new with 6 total quadrants
[p4est] Into p8est_refine with 6 total quadrants, allowed level 18
[p4est] Into p8est_save p8est.p8p
[p4est] Done p8est_refine with 21538 total quadrants
[p4est] Into p8est_inflate
[p4est] Done p8est_inflate
[p4est] Into p8est_save p8est.p8p
[p4est] Done p8est_save
[p4est] Into p8est_load p8est.p8p
[libsc 0] Abort: invalid format
[libsc 0] Abort: ../src/p4est.c:3514
[libsc 0] Abort: Obtained 9 stack frames
[libsc 0] Stack 0: libsc-1.1.54-7459.so(sc_abort+0x2f) [0x7fbb5cb98e6f]
[libsc 0] Stack 1: libsc-1.1.54-7459.so(+0x9015) [0x7fbb5cb99015]
[libsc 0] Stack 2: libp4est-1.1.105-b240-dirty.so(p8est_source_ext+0x8cb) [0x7fbb5ce1505b]
[libsc 0] Stack 3: libp4est-1.1.105-b240-dirty.so(p8est_load_ext+0xa9) [0x7fbb5ce151c9]
[libsc 0] Stack 4: libp4est-1.1.105-b240-dirty.so(p8est_load+0x18) [0x7fbb5ce15278]
[libsc 0] Stack 5: lt-p8est_test_loadsave() [0x401ee4]
[libsc 0] Stack 6: lt-p8est_test_loadsave(main+0xf3) [0x4017e3]
[libsc 0] Stack 7: libc.so.6(__libc_start_main+0xed) [0x7fbb5ac5923d]
[libsc 0] Stack 8: lt-p8est_test_loadsave() [0x4018e5]
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[p4est] Done p8est_save
[p4est] Into p8est_load p8est.p8p
[libsc 0] Abort: invalid format
[libsc 0] Abort: ../src/p4est.c:3514
[libsc 0] Abort: Obtained 9 stack frames
[libsc 0] Stack 0: libsc-1.1.54-7459.so(sc_abort+0x2f) [0x7f7efb784e6f]
[libsc 0] Stack 1: libsc-1.1.54-7459.so(+0x9015) [0x7f7efb785015]
[libsc 0] Stack 2: libp4est-1.1.105-b240-dirty.so(p8est_source_ext+0x8cb) [0x7f7efba0105b]
[libsc 0] Stack 3: libp4est-1.1.105-b240-dirty.so(p8est_load_ext+0xa9) [0x7f7efba011c9]
[libsc 0] Stack 4: libp4est-1.1.105-b240-dirty.so(p8est_load+0x18) [0x7f7efba01278]
[libsc 0] Stack 5: lt-p8est_test_loadsave() [0x401d61]
[libsc 0] Stack 6: lt-p8est_test_loadsave(main+0xf3) [0x4017e3]
[libsc 0] Stack 7: libc.so.6(__libc_start_main+0xed) [0x7f7ef984523d]
[libsc 0] Stack 8: lt-p8est_test_loadsave() [0x4018e5]
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------


Ethan Alan

On Fri, Mar 20, 2015 at 5:02 PM, Ethan Hereth <advocateddrummer@gmail.com>
wrote:

> Hey all,
>
> This is a vague question I know, but I'm at a bit of a loss here.
>
> I can build p4est and run all of its tests successfully, including
> p?est_test_loadsave.
>
> I am attempting to implement a 'restart' functionality in my software that
> uses p4est. When I call p4est_save from my software it gets into the
> MPI_File_open function and gets stuck. The file that I'm requesting does
> get created, but the function never returns.
>
> Clearly, my code compiles and links; I can verify that my software and
> p4est link against the same version of MPI.
>
> Is there any extra configuration that I need to do to get MPI IO working
> on my end? I see that in test_loadsave2.c there are calls to sc_init and
> p4est_init. Are these required?
>
> Any help would be appreciated.
>
> Thanks, and have a good weekend.
>
> Ethan Alan
>