Gentlefolk,

I am having some problems with the mongrel2 server core dumping because of a problem inside libzmq.so.0 on a FreeBSD 7.3 host.

The stack trace I am getting from gdb appears as follows:

%gdb -d /u0/local/src/mongrel2-1.7.5/src /usr/local/bin/mongrel2 mongrel2.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...
Core was generated by `mongrel2'.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/local/lib/libzmq.so.0...done.
Loaded symbols for /usr/local/lib/libzmq.so.0
Reading symbols from /usr/local/lib/libsqlite3.so.8...done.
Loaded symbols for /usr/local/lib/libsqlite3.so.8
Reading symbols from /lib/libthr.so.3...done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x2820353b in thr_kill () from /lib/libc.so.7
[New Thread 0x28401370 (LWP 103387)]
[New Thread 0x28401260 (LWP 101320)]
[New Thread 0x28401040 (LWP 104193)]
(gdb) where
#0 0x2820353b in thr_kill () from /lib/libc.so.7
#1 0x281b5856 in pthread_kill () from /lib/libthr.so.3
#2 0x281b33b3 in raise () from /lib/libthr.so.3
#3 0x2829756a in abort () from /lib/libc.so.7
#4 0x2810c49f in zmq::kqueue_t::kevent_add () from /usr/local/lib/libzmq.so.0
#5 0x2810c4d6 in zmq::kqueue_t::set_pollout () from /usr/local/lib/libzmq.so.0
#6 0x2810ac87 in zmq::io_object_t::set_pollout () from /usr/local/lib/libzmq.so.0
#7 0x2812a4a9 in zmq::zmq_engine_t::revive () from /usr/local/lib/libzmq.so.0
#8 0x281199e0 in zmq::session_t::revive () from /usr/local/lib/libzmq.so.0
#9 0x2811297c in zmq::reader_t::process_revive () from /usr/local/lib/libzmq.so.0
#10 0x2811091a in zmq::object_t::process_command () from /usr/local/lib/libzmq.so.0
#11 0x2810b0df in zmq::io_thread_t::in_event () from /usr/local/lib/libzmq.so.0
#12 0x2810c936 in zmq::kqueue_t::loop () from /usr/local/lib/libzmq.so.0
#13 0x2810ca8d in zmq::kqueue_t::worker_routine () from /usr/local/lib/libzmq.so.0
#14 0x281234e1 in zmq::thread_t::thread_routine () from /usr/local/lib/libzmq.so.0
#15 0x281af73f in pthread_getprio () from /lib/libthr.so.3
#16 0x00000000 in ?? ()

The logs show nothing special happening at the time.

The last query logged was yesterday in the late afternoon and the server was noticed down and was restarted at 08:53 this morning.

The host has zmq-2.0.10 installed at the moment. Are my troubles because of using this version ?

I have just checked and see that the FreeBSD ports tree has zmq 2.1.7 as the current version. If I am to upgrade, then should I be using zmq 2.1.4 as the manual specifies or can I go to 2.1.7 without further issues showing up ?

Thanks in advance,

-- Dave
On Fri, Jul 8, 2011 at 11:11 AM, Dave Dodd <dave@ci.com.au> wrote:
> The host has zmq-2.0.10 installed at the moment. Are my troubles because of
> using this version ?
>
> I have just checked and see that the FreeBSD ports tree has zmq 2.1.7 as the
> current version. If I am to upgrade, then should I be using zmq 2.1.4 as the
> manual specifies or can I go to 2.1.7 without further issues showing up ?
>
> Thanks in advance,
>
> -- Dave
>

There were issues with zmq 2.1.6, but 2.1.7 should work fine.
> How are you even getting 2.1.6? I thought they took it down.
It's probably on their archived releases page still.
On Sat, Jul 9, 2011 at 6:04 AM, Nathan Duran <principal@khiltd.com> wrote:
>> How are you even getting 2.1.6? I thought they took it down.
>
> It's probably on their archived releases page still.
>

He's not, he was using 2.0.10. I just mentioned 2.1.6 because I knew there were issues.
On Fri, Jul 08, 2011 at 11:16:17AM +1000, Josh Simmons wrote:
>
> There were issues with zmq 2.1.6, but 2.1.7 should work fine.

I have upgraded to 2.1.7 and I will now wait to see if the problem returns.

-- Dave
Despite the upgrade to using zmq 2.1.7, the server has crashed again.

There was no activity logged for many hours before the server crashed.

The stack trace does look subtly different...

gdb /usr/local/bin/mongrel2 mongrel2.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...
Core was generated by `mongrel2'.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/local/lib/libzmq.so.1...done.
Loaded symbols for /usr/local/lib/libzmq.so.1
Reading symbols from /usr/local/lib/libsqlite3.so.8...done.
Loaded symbols for /usr/local/lib/libsqlite3.so.8
Reading symbols from /lib/libthr.so.3...done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x281f653b in thr_kill () from /lib/libc.so.7
[New Thread 0x284016a0 (LWP 100836)]
[New Thread 0x28401370 (LWP 100382)]
[New Thread 0x28401260 (LWP 100303)]
[New Thread 0x28401040 (LWP 101100)]
(gdb) where
#0 0x281f653b in thr_kill () from /lib/libc.so.7
#1 0x281a8856 in pthread_kill () from /lib/libthr.so.3
#2 0x281a63b3 in raise () from /lib/libthr.so.3
#3 0x2828a56a in abort () from /lib/libc.so.7
#4 0x28102d30 in zmq::kqueue_t::kevent_add () from /usr/local/lib/libzmq.so.1
#5 0x28102d5a in zmq::kqueue_t::set_pollout () from /usr/local/lib/libzmq.so.1
#6 0x281014b7 in zmq::io_object_t::set_pollout () from /usr/local/lib/libzmq.so.1
#7 0x2811ea19 in zmq::zmq_engine_t::activate_out () from /usr/local/lib/libzmq.so.1
#8 0x2810ef32 in zmq::session_t::activated () from /usr/local/lib/libzmq.so.1
#9 0x2810913c in zmq::reader_t::process_activate_reader () from /usr/local/lib/libzmq.so.1
#10 0x2810672f in zmq::object_t::process_command () from /usr/local/lib/libzmq.so.1
#11 0x281019df in zmq::io_thread_t::in_event () from /usr/local/lib/libzmq.so.1
#12 0x28102ec1 in zmq::kqueue_t::loop () from /usr/local/lib/libzmq.so.1
#13 0x28118150 in thread_routine () from /usr/local/lib/libzmq.so.1
#14 0x281a273f in pthread_getprio () from /lib/libthr.so.3
#15 0x00000000 in ?? ()

I am about to turn on a watchdog process to restart mongrel2 if it dies but I would like to resolve the actual cause of the problem before we deploy for production use.

Any thoughts folks ?

--Dave

On Fri, Jul 08, 2011 at 03:20:24PM +1000, Dave Dodd wrote:
> On Fri, Jul 08, 2011 at 11:16:17AM +1000, Josh Simmons wrote:
> >
> > There were issues with zmq 2.1.6, but 2.1.7 should work fine.
>
> I have upgraded to 2.1.7 and I will now wait to see if the problem returns.
>
> -- Dave

--
David Dodd
Corinthian Engineering Pty Limited
Suite 54, Jones Bay Wharf
26-32 Pirrama Road
Pyrmont NSW 2009
Australia
Telephone +612 9552 5500
Fax +612 9552 5549
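
For anyone wanting a stopgap of the kind Dave describes, a watchdog can be as small as the sketch below. It is only an illustration, not part of mongrel2: the `supervise` function, its parameters, and the restart limit are all made up for this example, and it assumes the supervised command stays in the foreground. If the server daemonizes itself, the child returns immediately and a pid-file check would be needed instead.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=5, delay=1.0):
    """Run cmd and restart it whenever it exits abnormally.

    Hypothetical helper for illustration only. Returns the list of
    exit codes observed, one per run. On POSIX a negative code means
    the child was killed by a signal (-6 is SIGABRT, the signal seen
    in the core dumps above).
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        code = subprocess.call(cmd)  # blocks until the child exits
        exit_codes.append(code)
        if code == 0 or attempt == max_restarts:
            break  # clean shutdown, or restart budget used up
        print("child exited with %d, restarting" % code, file=sys.stderr)
        time.sleep(delay)  # brief pause so a crash loop does not spin
    return exit_codes
```

It would be called with the same argument list used to start the server by hand, for example `supervise(["mongrel2", "config.sqlite", "UUID"])` with the real UUID substituted. The restart cap is a deliberate choice: a server that aborts repeatedly should surface the failure rather than loop forever.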
On Wed, Jul 13, 2011 at 10:21:40AM +1000, Dave Dodd wrote:
> Despite the upgrade to using zmq 2.1.7, the server has crashed again.

I've done a bunch of fixing in the reload code and stuff. Any chance you can try the latest on the git develop branch? I could throw up a .tar.gz for you to try.

>
> I am about to turn on a watchdog process to restart mongrel2 if it dies but
> I would like to resolve the actual cause of the problem before we deploy for
> production use.
>

What I'd do is run this under valgrind like this:

valgrind --log-file=valgrind.log mongrel2 config.sqlite UUID

And you can get the UUID with:

m2sh servers

Let it run, then when it crashes send me the valgrind.log and I can find out what's going on real easily.

--
Zed A. Shaw
http://zedshaw.com/
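
Before mailing a log like that, it can help to check that valgrind actually recorded something. The sketch below is an assumption-laden convenience, not anything shipped with mongrel2 or valgrind: the `error_summaries` name is invented here, and it relies only on the `==PID== ERROR SUMMARY: N errors from M contexts` line that valgrind writes at the end of a run.

```python
import re

# Matches valgrind's end-of-run summary line, e.g.
# "==4242== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)"
# Commas are tolerated in the counts in case large numbers are grouped.
SUMMARY = re.compile(
    r"^==(\d+)== ERROR SUMMARY: ([\d,]+) errors from ([\d,]+) contexts"
)


def error_summaries(log_text):
    """Return a (pid, errors, contexts) tuple for each summary line found."""
    results = []
    for line in log_text.splitlines():
        match = SUMMARY.match(line)
        if match:
            results.append(
                tuple(int(group.replace(",", "")) for group in match.groups())
            )
    return results
```

Reading the file with `open("valgrind.log").read()` and passing the text in gives one tuple per process that finished writing its summary. An empty list suggests the process died before valgrind could write one, which is itself worth mentioning when sending the log.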
On Fri, Jul 08, 2011 at 03:20:24PM +1000, Dave Dodd wrote:
> On Fri, Jul 08, 2011 at 11:16:17AM +1000, Josh Simmons wrote:
> >
> > There were issues with zmq 2.1.6, but 2.1.7 should work fine.
>
> I have upgraded to 2.1.7 and I will now wait to see if the problem returns.

How are you even getting 2.1.6? I thought they took it down.

--
Zed A. Shaw
http://zedshaw.com/