librelist archives

« back to archive

Concurrency issue in TestKgioUnixConnect?

Concurrency issue in TestKgioUnixConnect?

From:
Jérémy Bobbio
Date:
2013-09-01 @ 14:03
Hi!

With the last upload of kgio to Debian, it failed to complete its test
suite on ia64 and mipsel:


https://buildd.debian.org/status/fetch.php?pkg=ruby-kgio&arch=ia64&ver=2.8.0-1&stamp=1376819434

https://buildd.debian.org/status/fetch.php?pkg=ruby-kgio&arch=mipsel&ver=2.8.0-1&stamp=1376855122

Is there a concurrency issue in TestKgioUnixConnect?

Looking at the code, I see there's an underlying TOCTOU issue with how
the temporary socket path name is determined, but I'm not sure if that's
what we are seeing there…

Any opinions?

-- 
Lunar                                .''`. 
lunar@debian.org                    : :Ⓐ  :  # apt-get install anarchism
                                    `. `'` 
                                      `-   

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Hleb Valoshka
Date:
2013-09-01 @ 16:57
On 9/1/13, Jérémy Bobbio <lunar@debian.org> wrote:

> Is there a concurrency issue in TestKgioUnixConnect?
> Looking at the code, I see there's an underlying TOCTOU issue with how
> the temporary socket path name is determined, but I'm not sure if that's
> what we are seeing there…
> Any opinions?

I think you are right.

I've just pushed trivial patch into team's repository to prevent this issue:

http://anonscm.debian.org/gitweb/?p=pkg-ruby-extras/ruby-kgio.git;a=blob;f=debian/patches/0002-Change-prefix-of-temporary-sockets-to-prevent-races.patch

(But I want to delay package upload until ruby 2.0 go into repository.)

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Eric Wong
Date:
2013-09-01 @ 19:58
Hleb Valoshka <375gnu@gmail.com> wrote:
> On 9/1/13, Jérémy Bobbio <lunar@debian.org> wrote:
> 
> > Is there a concurrency issue in TestKgioUnixConnect?
> > Looking at the code, I see there's an underlying TOCTOU issue with how
> > the temporary socket path name is determined, but I'm not sure if that's
> > what we are seeing there…
> > Any opinions?
> 
> I think you are right.
> 
> I've just pushed trivial patch into team's repository to prevent this issue:
> 
http://anonscm.debian.org/gitweb/?p=pkg-ruby-extras/ruby-kgio.git;a=blob;f=debian/patches/0002-Change-prefix-of-temporary-sockets-to-prevent-races.patch

Btw, do you want me to take that patch into the upstream repo?

It could be improved, too, to use a tmpdir for each test invocation.
That would allow multiple instances of the kgio test to run safely at
once (e.g. for different versions of ruby/rubinius).

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Hleb Valoshka
Date:
2013-09-02 @ 12:57
On 9/1/13, Eric Wong <normalperson@yhbt.net> wrote:
>> 
http://anonscm.debian.org/gitweb/?p=pkg-ruby-extras/ruby-kgio.git;a=blob;f=debian/patches/0002-Change-prefix-of-temporary-sockets-to-prevent-races.patch
> Btw, do you want me to take that patch into the upstream repo?

As you wish, your tests are also affected by this issue, cause they
run in parallel too.

> It could be improved, too, to use a tmpdir for each test invocation.
> That would allow multiple instances of the kgio test to run safely at
> once (e.g. for different versions of ruby/rubinius).

Hmm, interesting idea, currently in Debian tests with different
versions of ruby are run one-by-one, but who knows how such process
may be implemented in other OSes.

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Hleb Valoshka
Date:
2013-09-05 @ 07:58
On 9/2/13, Hleb Valoshka <375gnu@gmail.com> wrote:
> As you wish, your tests are also affected by this issue, cause they
> run in parallel too.

BTW, I was wrong, you run tests one by one, so they are no affected.

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Eric Wong
Date:
2013-09-05 @ 08:14
Hleb Valoshka <375gnu@gmail.com> wrote:
> On 9/2/13, Hleb Valoshka <375gnu@gmail.com> wrote:
> > As you wish, your tests are also affected by this issue, cause they
> > run in parallel too.
> 
> BTW, I was wrong, you run tests one by one, so they are no affected.

"make -j" runs them in parallel, but in completely isolated processes
(not loading unrelated test files, either).  I haven't checked, does the
Debian ruby build scripts use threads or load all the tests into the
same process?

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Hleb Valoshka
Date:
2013-09-05 @ 12:35
On 9/5/13, Eric Wong <normalperson@yhbt.net> wrote:
> "make -j" runs them in parallel, but in completely isolated processes
> (not loading unrelated test files, either).  I haven't checked, does the
> Debian ruby build scripts use threads or load all the tests into the
> same process?

gem2deb logic is very simple:

if exists ruby-tests.rake then run rake ruby-tests.rake
else if exists ruby-tests.rb then run ruby ruby-tests.rb
else if exists ruby-test-files.yaml then
  YAML.load('ruby-test-files.yaml').each {|f| require f}
end

So in 1st and 2nd cases this is the maintainer's job to deside how to
run tests but in 3rd everything depends on test/unit from ruby
distribution.

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Eric Wong
Date:
2013-09-02 @ 21:24
Hleb Valoshka <375gnu@gmail.com> wrote:
> On 9/1/13, Eric Wong <normalperson@yhbt.net> wrote:
> >> 
http://anonscm.debian.org/gitweb/?p=pkg-ruby-extras/ruby-kgio.git;a=blob;f=debian/patches/0002-Change-prefix-of-temporary-sockets-to-prevent-races.patch
> > Btw, do you want me to take that patch into the upstream repo?
> 
> As you wish, your tests are also affected by this issue, cause they
> run in parallel too.

I just signed-off and pushed the following out:

Hleb Valoshka (2):
      Change prefix of temporary sockets to prevent races
      Don't dump 20M in case of failure

I've never been hit by the issue even though I've always known it's a
potential issue on very busy systems.  It looks like Tempfile embeds the
PID in the temporary name, so maybe the buildds are recycling PIDs very
quickly?

I haven't looked closely at your fakeroot workaround.

> > It could be improved, too, to use a tmpdir for each test invocation.
> > That would allow multiple instances of the kgio test to run safely at
> > once (e.g. for different versions of ruby/rubinius).
> 
> Hmm, interesting idea, currently in Debian tests with different
> versions of ruby are run one-by-one, but who knows how such process
> may be implemented in other OSes.

I used to test some software in parallel under Ruby
1.8/Rubinius/1.9.1/1.9.2/1.9.3; but eventually stopped caring about
1.8, 1.9.{1,2} and Rubinius itself was taking too long to build.

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Hleb Valoshka
Date:
2013-09-03 @ 08:50
On 9/3/13, Eric Wong <normalperson@yhbt.net> wrote:
> I just signed-off and pushed the following out:
>
> Hleb Valoshka (2):
>       Change prefix of temporary sockets to prevent races
>       Don't dump 20M in case of failure
>
> I've never been hit by the issue even though I've always known it's a
> potential issue on very busy systems.  It looks like Tempfile embeds the

I was able to reproduce it only one or two times in virtualized GNU/kFreeBSD.

> PID in the temporary name, so maybe the buildds are recycling PIDs very
> quickly?

I think the problem is caused by parallel tests invocation in ruby 1.9
(ruby 1.8 is
never affected by this issue). Tempfile builds file name using prefix,
PID and random
value, in rare cases this random value isn't random enough.

> I haven't looked closely at your fakeroot workaround.

It's a Debian specific patch.

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Eric Wong
Date:
2013-09-03 @ 20:05
Hleb Valoshka <375gnu@gmail.com> wrote:
> On 9/3/13, Eric Wong <normalperson@yhbt.net> wrote:
> > I haven't looked closely at your fakeroot workaround.
> 
> It's a Debian specific patch.

Its problematic when run as root, too (which may affect some
folks).  I'll push out the following:

Subject: [PATCH] test_tryopen: skip EACCES test when euid == 0

This fails when the test is run as root (which may be the case
of some Ruby installations) or fakeroot (which is the case of
Debian build systems).
---
 test/test_tryopen.rb | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/test/test_tryopen.rb b/test/test_tryopen.rb
index 8a8278c..abcbd37 100644
--- a/test/test_tryopen.rb
+++ b/test/test_tryopen.rb
@@ -30,7 +30,12 @@ def test_tryopen_EACCES
     tmp = Tempfile.new "tryopen"
     File.chmod 0000, tmp.path
     tmp = Kgio::File.tryopen(tmp.path)
-    assert_equal(:EACCES, tmp)
+    if Process.euid == 0
+      assert_kind_of Kgio::File, tmp
+      warn "cannot test EACCES when euid == 0"
+    else
+      assert_equal(:EACCES, tmp)
+    end
   end
 
   def test_tryopen_readwrite
-- 
Eric Wong

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Hleb Valoshka
Date:
2013-09-05 @ 20:53
On 9/3/13, Hleb Valoshka <375gnu@gmail.com> wrote:
> I was able to reproduce it only one or two times in virtualized
> GNU/kFreeBSD.

Problem is more interesting. I decided to check your latest changes
for test_poll.rb under kfreebsd-i386 and i'd got failure in
test_unix_connect.rb with teardown of test_unix_socket_new_invalid:

  1) Error:
test_unix_socket_new_invalid(TestKgioUnixConnect):
Errno::ENOENT: No such file or directory -
/tmp/kgio_unix_120130906-13330-m89438/kgio_unix_120130906-13330-ch4t73
    test/test_unix_connect.rb:33:in `unlink'
    test/test_unix_connect.rb:33:in `teardown'

The same failure as on ia64, mipsel and x32 Jeremy pointed in the
first message. And it definitely not a race condition because for the
next time i run it as make test/test_unix_connect.rb

So I decided to add some debug output:


  def setup
...
puts "\n\nsetup"
system 'ls -lR /tmp/kgio*'
  end

  def teardown
puts "\nteardown"
system 'ls -lR /tmp/kgio*'
...
end

  def test_unix_socket_new_invalid
puts "\ntest in"
system 'ls -lR /tmp/kgio*'
    assert_raises(ArgumentError) { Kgio::UNIXSocket.new('*' * 1024 * 1024) }
puts "\ntest out"
system 'ls -lR /tmp/kgio*'
  end

And voila:

# Running tests:
...
setup
/tmp/kgio_unix_120130906-13330-m89438:
total 0
srwxr-xr-x 1 root root 0 Sep  6 01:23 kgio_unix_120130906-13330-ch4t73

test in
/tmp/kgio_unix_120130906-13330-m89438:
total 0
srwxr-xr-x 1 root root 0 Sep  6 01:23 kgio_unix_120130906-13330-ch4t73

test out
/tmp/kgio_unix_120130906-13330-m89438:
total 0

teardown
/tmp/kgio_unix_120130906-13330-m89438:
total 0
E
...

The socket has disappeared somewhere. Interesting, isn't it? The more
interesting that kfreebsd on amd64 doesn't have this issue. Both are
up-to-date Debian Sid.

Any ideas?

> I think the problem is caused by parallel tests invocation in ruby 1.9

It seems that I was wrong: it's possible to invoke tests in parallel,
and there is an explicit option for this `--jobs'. But it doesn't
mean, that tests are invoked in parallel by default.

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Eric Wong
Date:
2013-09-05 @ 23:11
Hleb Valoshka <375gnu@gmail.com> wrote:
> On 9/3/13, Hleb Valoshka <375gnu@gmail.com> wrote:
> > I was able to reproduce it only one or two times in virtualized
> > GNU/kFreeBSD.
> 
> The socket has disappeared somewhere. Interesting, isn't it? The more
> interesting that kfreebsd on amd64 doesn't have this issue. Both are
> up-to-date Debian Sid.
> 
> Any ideas?

Weird, not right now.  Does ktrace/ktruss reveal anything?
And you can't get this on GNU/Linux?

Re: [kgio] Concurrency issue in TestKgioUnixConnect?

From:
Hleb Valoshka
Date:
2013-09-06 @ 08:56
On 9/6/13, Eric Wong <normalperson@yhbt.net> wrote:
>> The socket has disappeared somewhere. Interesting, isn't it? The more
>> interesting that kfreebsd on amd64 doesn't have this issue. Both are
>> up-to-date Debian Sid.
>> Any ideas?
> Weird, not right now.  Does ktrace/ktruss reveal anything?

I'll try to use them (I never used them before).

> And you can't get this on GNU/Linux?

No, at least on amd64. Buildd logs show the same error on x32, so I'll
try to install it into chroot and test (or add debug output and ask DD
to upload package into experimental to check it on mips and itanium)