librelist archives

System calls in Unix

From:
Eric Wong
Date:
2011-09-14 @ 22:34
As mentioned before, syscalls are the interface through which user space
interacts with kernel space.  When a user space application makes a syscall,
it is telling the kernel to execute code on its behalf.

Ruby itself provides a global "syscall" method on many platforms.  It is
useful for learning and experimentation, but not recommended for general
use as it is fragile and non-portable.  There is usually no need to use
this method as many useful syscalls are already wrapped by Ruby
methods.

For a user space application to make a system call, architecture- and
OS-dependent code must be invoked.  At the lowest (user space) levels
this is implemented in non-portable assembly code.

Fortunately, most system calls are already provided as wrappers by the
system C library (libc) so they appear to user space as portable C
functions.  Ruby itself wraps these C functions as Ruby methods.  Even
non-C Ruby implementations are likely to call these C functions in libc
(rather than implement the non-portable assembly themselves).

Thus:

  IO.pipe in Ruby is a wrapper for the pipe(3) C function which
  in turn wraps the pipe(2) system call.

You may not find the pipe(3) manpage because pipe(3) is a very thin
wrapper for the pipe(2) syscall and the pipe(2) manpage is equivalent.
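
A minimal sketch of that layering as seen from Ruby.  IO.pipe hands back
two IO objects; in MRI the write and read below should end up as write(2)
and read(2) calls on the underlying descriptors.  The message text is made
up for illustration.

```ruby
# IO.pipe wraps the pipe() function from libc, which wraps pipe(2).
reader, writer = IO.pipe

writer.write("hello from user space")
writer.close            # closing the write end lets the reader see EOF

message = reader.read   # reads until EOF
reader.close

puts message
```

Running this under a syscall tracer such as strace(1) on Linux is an easy
way to see which system calls a given Ruby method actually makes.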


License: GPLv3 (or later, at the discretion of Eric Wong)
         http://www.gnu.org/licenses/gpl-3.0.txt
-- 
Eric Wong

/Avoiding/ system calls

From:
Eric Wong
Date:
2011-09-14 @ 23:49
Eric Wong <normalperson@yhbt.net> wrote:
> As mentioned before, syscalls are the interface user space interacts
> with kernel space.  When a user space application makes a syscall,
> it is telling the kernel to execute code on its behalf.

The mode switch from user space to kernel space has more overhead
and is slower than a normal library function call[1].

Thus user space can (and will often attempt to) aggregate several user
space calls into fewer system calls to avoid this switching overhead.
This is a common concept found in user space code and Ruby itself is no
exception.  This aggregation often does not happen transparently, so it
should be understood and explained to avoid confusion.


I/O Buffering
-------------

As a Ruby programmer, you'll notice the IO class (and subclasses like
File) will buffer data you write and you need to call IO#flush or set
"IO#sync = true" to ensure other processes can read it.

If you're a C programmer, you'll know the stdio library can do the same
type of buffering in user space.   In fact, MRI 1.8 used the stdio
library internally for its user space buffering needs.
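
The user space buffering described above can be observed from Ruby.  This
is only a sketch, assuming a Unix-like system and a regular file (the
"buffer-demo" name is arbitrary): the file size seen through the kernel
only changes once Ruby's buffer is flushed, or once sync mode is on.

```ruby
require "tempfile"

sizes = {}

Tempfile.create("buffer-demo") do |tmp|
  File.open(tmp.path, "w") do |f|
    f.write("hello")                        # stays in Ruby's user space buffer
    sizes[:before_flush] = File.size(tmp.path)

    f.flush                                 # buffer is handed to the kernel
    sizes[:after_flush] = File.size(tmp.path)

    f.sync = true                           # later writes skip the buffer
    f.write(" world")
    sizes[:with_sync] = File.size(tmp.path)
  end
end

p sizes
```

Note that IO#flush only moves data from user space to the kernel; it does
not force the kernel to write anything to the storage device.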

Kernel space may also implement its own buffering to avoid overhead when
interacting with the storage and network layers.  This buffering can be
influenced in some cases from Ruby.  We'll cover this later.


Memory Allocation
-----------------

While Ruby programmers do not often worry about memory allocation,
sometimes the following question comes up:

  Why did my Ruby process stay so big even after I've cleared all
  references to big objects?  I'm /sure/ GC has run several times
  and freed my big objects and I'm not leaking memory.

A C programmer might ask the same question:

  I free()-ed a lot of memory, why is my process still so big?

Memory allocation to user space from the kernel is cheaper in large
chunks, thus user space avoids interaction with the kernel by doing
more work itself.

User space libraries/runtimes implement a memory allocator (e.g.:
malloc(3) in libc) which takes large chunks of kernel memory[2] and
divides them up into smaller pieces for user space applications to use.

Thus, several user space memory allocations may occur before user space
needs to ask the kernel for more memory.  Conversely, if you got a large chunk
of memory from the kernel and are only using a small part of it, the
whole chunk remains allocated.

Releasing memory back to the kernel also has a cost.  User space memory
allocators may hold onto that memory (privately) in the hope it can be
reused within the same process and not give it back to the kernel
for use in other processes.


[1] - why this is expensive is outside the scope of this list, so
      explaining this is left as an exercise for the reader :)

[2] - via brk(2), sbrk(2), mmap(2) or various non-portable methods.


License: GPLv3 (or later, at the discretion of Eric Wong)
         http://www.gnu.org/licenses/gpl-3.0.txt
-- 
Eric Wong

Re: [usp.ruby] /Avoiding/ system calls

From:
Christian Pedaschus
Date:
2011-09-15 @ 14:25
On 09/15/2011 01:49 AM, Eric Wong wrote:
> Eric Wong <normalperson@yhbt.net> wrote:
> Thus if you got a large chunk
> of memory from the kernel and are only using a small part of that, that
> large chunk of memory remains allocated.
> 
> Releasing memory back to the kernel also has a cost.  User space memory
> allocators may hold onto that memory (privately) in the hope it can be
> reused within the same process and not give it back to the kernel
> for use in other processes.

So what can we do to free the memory anyway?

Hit that problem a few times and used restartable worker processes to
get around it, but would love to know how to do it the real way :)

Chris

Re: [usp.ruby] /Avoiding/ system calls

From:
Eric Wong
Date:
2011-09-15 @ 16:53
Christian Pedaschus <chris@s-4-u.net> wrote:
> On 09/15/2011 01:49 AM, Eric Wong wrote:
> > Eric Wong <normalperson@yhbt.net> wrote:
> > Thus if you got a large chunk
> > of memory from the kernel and are only using a small part of that, that
> > large chunk of memory remains allocated.
> > 
> > Releasing memory back to the kernel also has a cost.  User space memory
> > allocators may hold onto that memory (privately) in the hope it can be
> > reused within the same process and not give it back to the kernel
> > for use in other processes.
> 
> So what can we do to free the memory anyway?

Not standardized nor portable, but you can use mallopt(3) to set trim
thresholds and when to start using mmap (which is easier to release).
With recent glibc, you can use malloc_trim(3) to force trimming, too.

There's a "mall" RubyGem I wrote which allows access to these functions.
http://bogomips.org/mall/  I've never actually used it for real
projects[1], and don't recall testing it outside of glibc, either :x



[1] I incrementally process everything I can to avoid working with large
     strings and large amounts of objects all at once.
-- 
Eric Wong

Re: [usp.ruby] /Avoiding/ system calls

From:
Christian Pedaschus
Date:
2011-09-16 @ 16:25
On 09/15/2011 06:53 PM, Eric Wong wrote:
> Christian Pedaschus <chris@s-4-u.net> wrote:
>> So what can we do to free the memory anyway?
> 
> There's a "mall" RubyGem I wrote which allows access to these functions.
> http://bogomips.org/mall/  I've never actually used it for real
> projects[1], and don't recall testing it outside of glibc, either :x

Now that's 'the' answer :)
Thank you, Eric

regards, Chris



Re: [usp.ruby] /Avoiding/ system calls

From:
Eric Wong
Date:
2011-09-15 @ 17:04
Eric Wong <normalperson@yhbt.net> wrote:
> Not standardized nor portable, but you can use mallopt(3) to set trim
> thresholds and when to start using mmap (which is easier to release).
> With recent glibc, you can use malloc_trim(3) to force trimming, too.

With glibc, you can also set these thresholds without calling
mallopt(3) by setting MALLOC_MMAP_THRESHOLD_ and other MALLOC_*_ vars in
the environment before starting your process.  The GNU mallopt(3)
manpage should document all of them (but I always have glibc sources
handy on all my dev machines anyways).
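
As a rough sketch of the environment approach from Ruby: the variable has
to be in place before the target process starts, so a parent can pass it
when spawning a child.  The 131072 value is purely illustrative, and
allocators other than glibc malloc will just ignore the variable.

```ruby
require "rbconfig"

# Hypothetical threshold: ask glibc malloc to use mmap(2) for requests
# of 128 KiB and up.  Pick a value that suits your workload.
env = { "MALLOC_MMAP_THRESHOLD_" => "131072" }

# The child only echoes the variable back to show that it was set
# before the child's allocator had a chance to initialize.
child_code = 'print ENV["MALLOC_MMAP_THRESHOLD_"]'
seen_by_child = IO.popen(env, [RbConfig.ruby, "-e", child_code], &:read)

puts "child started with MALLOC_MMAP_THRESHOLD_=#{seen_by_child}"
```

Setting ENV inside an already running Ruby process is too late for that
process itself, since its allocator has already read its settings.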

Re: [usp.ruby] /Avoiding/ system calls

From:
Gary Wright
Date:
2011-09-15 @ 15:08
On Sep 15, 2011, at 10:25 AM, Christian Pedaschus wrote:

> On 09/15/2011 01:49 AM, Eric Wong wrote:
>> Eric Wong <normalperson@yhbt.net> wrote:
>> Thus if you got a large chunk
>> of memory from the kernel and are only using a small part of that, that
>> large chunk of memory remains allocated.
>> 
>> Releasing memory back to the kernel also has a cost.  User space memory
>> allocators may hold onto that memory (privately) in the hope it can be
>> reused within the same process and not give it back to the kernel
>> for use in other processes.
> 
> So what can we do to free the memory anyway?
> 
> Hit that problem a few times and used restartable worker processes to
> get away with it, but would admire to know howto do it the real way :)


I think this is exactly what most long running systems do.  Otherwise a
process ends up holding on to the high-water mark of its memory usage.

The kernel thinks the process still wants to use the memory and the
process doesn't ever inform the kernel differently (via brk, sbrk, or munmap).

Gary Wright

Re: [usp.ruby] /Avoiding/ system calls

From:
Gary Wright
Date:
2011-09-15 @ 15:05
On Sep 14, 2011, at 7:49 PM, Eric Wong wrote:
> Releasing memory back to the kernel also has a cost.  User space memory
> allocators may hold onto that memory (privately) in the hope it can be
> reused within the same process and not give it back to the kernel
> for use in other processes.

I'm not aware of any common language runtimes that actually release
memory back to the kernel. One huge problem is that memory usage
becomes fragmented and so there is no contiguous block of memory that
can be handed back to the kernel even if there is very little memory
actually "in use".

Runtimes that have the freedom to relocate 'objects' in memory might
be able to coalesce the free space but this sort of relocation is not
common as I understand it.

BTW the applicable system call for this discussion is brk(2) which also
has a C library variation called sbrk(3).  These interfaces simply allow
a process to add memory to the end of the block of memory currently
allocated to the process (or take it away).  So the block of memory
used by the process remains contiguous.

An alternative approach could use mmap(2) to allocate non-contiguous blocks
of memory but that doesn't help with the fragmentation issue I mentioned above.

Gary Wright

Re: [usp.ruby] /Avoiding/ system calls

From:
Evan Phoenix
Date:
2011-09-15 @ 16:28
 See below. 

-- 
Evan Phoenix // evan@phx.io


On Thursday, September 15, 2011 at 8:05 AM, Gary Wright wrote:

> 
> On Sep 14, 2011, at 7:49 PM, Eric Wong wrote:
> > Releasing memory back to the kernel also has a cost. User space memory
> > allocators may hold onto that memory (privately) in the hope it can be
> > reused within the same process and not give it back to the kernel
> > for use in other processes.
> 
> I'm not aware of any common language runtimes that actually release
> memory back to the kernel. One huge problem is that memory usage
> becomes fragmented and so there is no contiguous block of memory that
> can be handed back to the kernel even if there is very little memory
> actually "in use".
Rubinius releases memory back to the kernel. It can do this because its
garbage collector compacts objects to eliminate fragmentation.
> 
> Runtimes that have the freedom to relocate 'objects' in memory might
> be able to coalesce the free space but this sort of relocation is not
> common as I understand it.
It's a common feature of accurate garbage collectors. But you're
correct, many runtimes use conservative GCs because they don't require as
much work to wire into a system, and conservative GCs generally cannot
compact memory.
> 
> BTW the applicable system call for this discussion is brk(2) which also
> has a C library variation called sbrk(3). These interfaces simply allow
> a process to add memory to the end of the block of memory currently
> allocated to the process (or take it away). So the block of memory
> used by the process remains contiguous.
> 
> An alternative approach could use mmap(2) to allocate non-contiguous blocks
> of memory but that doesn't help with the fragmentation issue I mentioned above.
> 
> Gary Wright

Re: [usp.ruby] /Avoiding/ system calls

From:
Eric Wong
Date:
2011-09-15 @ 17:00
Gary Wright <at2002@me.com> wrote:
> On Sep 14, 2011, at 7:49 PM, Eric Wong wrote:
> > Releasing memory back to the kernel also has a cost.  User space memory
> > allocators may hold onto that memory (privately) in the hope it can be
> > reused within the same process and not give it back to the kernel
> > for use in other processes.
> 
> I'm not aware of any common language runtimes that actually release
> memory back to the kernel. One huge problem is that memory usage
> becomes fragmented and so there is no contiguous block of memory that
> can be handed back to the kernel even if there is very little memory
> actually "in use".

Yes, fragmentation is a problem, but if you manage to get around or
avoid it, memory may get released depending on the allocator
implementation.

For MRI, this depends on the malloc implementation you're using.  It's
possible with glibc/dlmalloc.  It was not possible with tcmalloc (used
by REE) last I checked.

For Rubinius, what Evan said :>

Perl can have the same issue(s) depending on which allocator it's built
to use.

> BTW the applicable system call for this discussion is brk(2) which also
> has a C library variation called sbrk(3).  These interfaces simply allow
> a process to add memory to the end of the block of memory currently
> allocated to the process (or take it away).  So the block of memory
> used by the process remains contiguous.

Yep.  Though neither brk(2) nor sbrk(3) is standardized by the latest
POSIX anymore.

> An alternative approach could use mmap(2) to allocate non-contiguous
> blocks of memory but that doesn't help with the fragmentation issue I
> mentioned above.

Yup, some allocators like glibc (ptmalloc) can use mmap(2) for larger
chunks.  It's easier to manage releasing memory this way.

With glibc, you can set M_MMAP_THRESHOLD via mallopt(3) to control when
it starts using mmap(2).  Current versions use a sliding window for the
mmap threshold.  I don't know these settings off the top of my head for
other allocators.

I don't think anonymous mappings with mmap(2) (MAP_ANON/MAP_ANONYMOUS)
are standardized in POSIX, yet, but practically all implementations
support it.

Re: [usp.ruby] /Avoiding/ system calls

From:
Gary Wright
Date:
2011-09-15 @ 17:53
On Sep 15, 2011, at 1:00 PM, Eric Wong wrote:
> Gary Wright <at2002@me.com> wrote:
>> 
>> I'm not aware of any common language runtimes that actually release
>> memory back to the kernel. One huge problem is that memory usage
>> becomes fragmented and so there is no contiguous block of memory that
>> can be handed back to the kernel even if there is very little memory
>> actually "in use".
> 
> Yes, fragmentation is a problem, but if you manage to get around or
> avoid it, memory may get released depending on the allocator
> implementation.
> 
> For MRI, this depends on the malloc implementation you're using.  It's
> possible with glibc/dlmalloc.  It was not possible with tcmalloc (used
> by REE) last I checked.
> 
> For Rubinius, what Evan said :>

Rubinius is the exception here (+1 for Rubinius) but I think the 'take away'
for a typical Ruby programmer is that they should be aware that most Ruby
implementations will *not* release memory back to the kernel until they die.

For example, if you slurp a huge file into memory and then discard the data:

   bigstring = File.read "reallybigfile"
   result = compute_something(bigstring)
   bigstring = nil

Ruby's garbage collector will eventually free the memory referenced briefly by
bigstring so that the memory can be used by Ruby, but that memory will
never be released back to the kernel until the Ruby process exits.

If you have multiple instances of your program running, perhaps processing
requests submitted via http, you could find yourself quickly using up all the
memory in your system from the kernel's point of view.

Gary Wright

Re: [usp.ruby] /Avoiding/ system calls

From:
Arvind Laxminaryan
Date:
2011-09-15 @ 19:37
On Thu, Sep 15, 2011 at 11:23 PM, Gary Wright <at2002@me.com> wrote:

>   bigstring = File.read "reallybigfile"
>   result = compute_something(bigstring)
>   bigstring = nil
>
> Ruby's garbage collector will eventually free the memory referenced
> briefly by bigstring so that the memory can be used by Ruby, but that
> memory will never be released back to the kernel until the Ruby
> process exits.
>

Will the memory be released back to kernel if the big file is read and
results computed by forking a child process? Since the child process dies
after executing the last line of ruby code passed as a block to fork.

-- 
Arvind

Re: [usp.ruby] /Avoiding/ system calls

From:
Gary Wright
Date:
2011-09-15 @ 19:52
On Sep 15, 2011, at 3:37 PM, Arvind Laxminaryan wrote:
> On Thu, Sep 15, 2011 at 11:23 PM, Gary Wright <at2002@me.com> wrote:
>>   bigstring = File.read "reallybigfile"
>>   result = compute_something(bigstring)
>>   bigstring = nil
>>
>> Ruby's garbage collector will eventually free the memory referenced
>> briefly by bigstring so that the memory can be used by Ruby, but that
>> memory will never be released back to the kernel until the Ruby
>> process exits.
>
> Will the memory be released back to kernel if the big file is read and
> results computed by forking a child process? Since the child process dies
> after executing the last line of ruby code passed as a block to fork.


Your question is a bit ambiguous regarding the event ordering.  The file 
read has to occur in the child process, not the parent.  When the child is
done, it exits returning any memory used back to the kernel.  Meanwhile 
the parent can handle/dispatch new jobs to new children.
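
A minimal sketch of that ordering, assuming a platform with fork(2).  The
in_child helper and this compute_something body are hypothetical, and a
generated string stands in for the contents of "reallybigfile".  The big
allocation happens only in the child; the parent receives just the small
result over a pipe.

```ruby
# Hypothetical stand-in for the expensive work in the example above.
def compute_something(data)
  data.bytesize
end

# Run the block in a child process and return its marshalled result.
def in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield))
    writer.close
    exit!(0)              # skip at_exit handlers inherited from the parent
  end
  writer.close
  payload = reader.read   # read before waiting so a big result cannot stall the child
  reader.close
  Process.wait(pid)       # child is gone; the kernel has its memory back
  Marshal.load(payload)
end

result = in_child do
  bigstring = "x" * (8 * 1024 * 1024)   # stands in for File.read "reallybigfile"
  compute_something(bigstring)
end

puts result
```

The result has to be something Marshal can serialize, and a real
dispatcher would also want to check the child's exit status.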

Gary Wright

Re: [usp.ruby] /Avoiding/ system calls

From:
Qian Ye
Date:
2011-09-17 @ 02:15
so, a possible code optimization for memory use here is that, if you have
some memory-consuming temporary work to do, you'd better do it in a child
process, and send the result back, right?


On Fri, Sep 16, 2011 at 3:52 AM, Gary Wright <at2002@me.com> wrote:

>
> On Sep 15, 2011, at 3:37 PM, Arvind Laxminaryan wrote:
>
>> On Thu, Sep 15, 2011 at 11:23 PM, Gary Wright <at2002@me.com> wrote:
>>
>>>   bigstring = File.read "reallybigfile"
>>>   result = compute_something(bigstring)
>>>   bigstring = nil
>>>
>>> Ruby's garbage collector will eventually free the memory referenced
>>> briefly by bigstring so that the memory can be used by Ruby, but that
>>> memory will never be released back to the kernel until the Ruby
>>> process exits.
>>
>> Will the memory be released back to kernel if the big file is read and
>> results computed by forking a child process? Since the child process dies
>> after executing the last line of ruby code passed as a block to fork.
>
>
> Your question is a bit ambiguous regarding the event ordering.  The file
> read has to occur in the child process, not the parent.  When the child is
> done, it exits returning any memory used back to the kernel.  Meanwhile the
> parent can handle/dispatch new jobs to new children.
>
> Gary Wright
>



-- 
With Regards!

Ye, Qian

Re: [usp.ruby] /Avoiding/ system calls

From:
James Gray
Date:
2011-09-17 @ 02:21
On Fri, Sep 16, 2011 at 9:15 PM, Qian Ye <yeqian.zju@gmail.com> wrote:
> so, a possible code optimization for memory use here is that, if you get
> some memory consuming temporary to do, you'd better do it in a child
> process, and send the result back, right?

I think that's the ideal approach if you want to keep a long running
process in the best condition.  I always say exit() is the ultimate
GC.

James Edward Gray II

Re: [usp.ruby] /Avoiding/ system calls

From:
Sasada Koichi
Date:
2011-09-17 @ 02:56
Hi James and all,

(11/09/16 19:21), James Gray wrote:
> On Fri, Sep 16, 2011 at 9:15 PM, Qian Ye <yeqian.zju@gmail.com> wrote:
>> so, a possible code optimization for memory use here is that, if you get
>> some memory consuming temporary to do, you'd better do it in a child
>> process, and send the result back, right?
> 
> I think that's the ideal approach if you want to keep a long running
> process in the best condition.  I always say exit() is the ultimate
> GC.

How about using mmap/munmap directly?
I'm not sure whether it is portable or not.

Re: [usp.ruby] /Avoiding/ system calls

From:
James Gray
Date:
2011-09-17 @ 03:50
On Fri, Sep 16, 2011 at 9:56 PM, SASADA Koichi <ko1@atdot.net> wrote:
> Hi James and all,
>
> (11/09/16 19:21), James Gray wrote:
>> On Fri, Sep 16, 2011 at 9:15 PM, Qian Ye <yeqian.zju@gmail.com> wrote:
>>> so, a possible code optimization for memory use here is that, if you get
>>> some memory consuming temporary to do, you'd better do it in a child
>>> process, and send the result back, right?
>>
>> I think that's the ideal approach if you want to keep a long running
>> process in the best condition.  I always say exit() is the ultimate
>> GC.
>
> How about to use mmap/munmap directly?
> I'm not sure it is portable or not.

I don't have any experience with these techniques, so I too am anxious
to hear what others say about them.

James Edward Gray II

Re: [usp.ruby] /Avoiding/ system calls

From:
Eric Wong
Date:
2011-09-17 @ 04:10
SASADA Koichi <ko1@atdot.net> wrote:
> (11/09/16 19:21), James Gray wrote:
> > On Fri, Sep 16, 2011 at 9:15 PM, Qian Ye <yeqian.zju@gmail.com> wrote:
> >> so, a possible code optimization for memory use here is that, if you get
> >> some memory consuming temporary to do, you'd better do it in a child
> >> process, and send the result back, right?
> > 
> > I think that's the ideal approach if you want to keep a long running
> > process in the best condition.  I always say exit() is the ultimate
> > GC.
> 
> How about to use mmap/munmap directly?
> I'm not sure it is portable or not.

Yes mmap/munmap work great for large allocations, but they're not
portable (especially MAP_ANON/MAP_ANONYMOUS).  I think I saw a proposed
patch for REE to make object heaps use mmap/munmap, but not Strings.

Over-using mmap/munmap hurts, too.  I remember glibc implemented the
malloc() sliding window to favor brk() over mmap() because mmap()/munmap()
constantly forces the kernel to zero out memory before handing it
to user space.

Re: [usp.ruby] /Avoiding/ system calls

From:
Sasada Koichi
Date:
2011-09-17 @ 04:24
(11/09/16 21:10), Eric Wong wrote:
> Over-using mmap/munmap hurts, too.  I remember glibc implemented the
> malloc() sliding window to favor brk() over mmap() because mmap()/munmap()
> constantly forces the kernel to zero out memory before handing it
> to user space.

Is that due to consuming virtual memory pages (and management structures
such as page tables/page directories)?

Re: [usp.ruby] /Avoiding/ system calls

From:
Eric Wong
Date:
2011-09-17 @ 05:16
SASADA Koichi <ko1@atdot.net> wrote:
> (11/09/16 21:10), Eric Wong wrote:
> > Over-using mmap/munmap hurts, too.  I remember glibc implemented the
> > malloc() sliding window to favor brk() over mmap() because mmap()/munmap()
> > constantly forces the kernel to zero out memory before handing it
> > to user space.
> 
> Due to consuming virtual memory page (and management area such as page
> table/page directory table)?

I'm not sure if this is an advantage over brk(), but I am not an MM hacker.

malloc/malloc.c in glibc[1] cites zeroing memory as the performance issue.

Memory taken from brk() can be zeroed once when the kernel gives it to
user space, but that memory can be reused in user space indefinitely for
the life of the process.


For non-C programmers reading this list:
   malloc() does not need to zero memory (only calloc() does)


However, if munmap() gives memory back to the kernel, the kernel must
zero any memory it later hands back to user space[2].   Repeatedly zeroing
memory is the performance problem.


[1] - git://sourceware.org/git/glibc.git -  "Update in 2006" section
[2] - otherwise private data like passwords/private keys can get
      leaked to processes that shouldn't have them.

-- 
Eric Wong

Re: [usp.ruby] /Avoiding/ system calls

From:
Eric Wong
Date:
2011-09-17 @ 03:12
James Gray <james@graysoftinc.com> wrote:
> On Fri, Sep 16, 2011 at 9:15 PM, Qian Ye <yeqian.zju@gmail.com> wrote:
> > so, a possible code optimization for memory use here is that, if you get
> > some memory consuming temporary to do, you'd better do it in a child
> > process, and send the result back, right?
> 
> I think that's the ideal approach if you want to keep a long running
> process in the best condition. 

Not ideal to other processes or the rest of the system.  I always favor
incremental processing in smaller chunks since:

1) it's less strain on the kernel allocator (especially for *other*
   processes in the system)

2) it works on datasets larger than RAM without swapping
   Regardless of language/runtime you use, swap thrashing is a
   painful experience (even with fast SSDs).

   (Not all swapping is bad, though; things that sit mostly idle
   are perfect for sitting in swap).

You'll probably get improved memory locality and better performance
there, too.

Of course incremental processing requires more effort to implement,
so it depends on the projected longevity of your code.
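
As an illustration of the incremental style (a sketch, not a benchmark):
the hypothetical incremental_sha256 helper below digests a file while
holding at most one chunk of it in memory.  The 16 KiB chunk size and the
sample data are arbitrary choices.

```ruby
require "digest"
require "tempfile"

# Digest a file without ever holding more than chunk_size bytes of it.
def incremental_sha256(path, chunk_size: 16 * 1024)
  digest = Digest::SHA256.new
  File.open(path, "rb") do |f|
    buf = String.new                        # reused for every read
    digest.update(buf) while f.read(chunk_size, buf)
  end
  digest.hexdigest
end

streamed = slurped = nil

Tempfile.create("incremental-demo") do |tmp|
  tmp.write("line of sample data\n" * 10_000)
  tmp.flush

  streamed = incremental_sha256(tmp.path)
  # Slurping the whole file, shown only to check the answer matches.
  slurped = Digest::SHA256.hexdigest(File.binread(tmp.path))
end

puts streamed == slurped
```

Reusing one buffer for every read also keeps the garbage collector from
having to clean up a fresh String per chunk.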

-- 
Eric Wong

Re: [usp.ruby] /Avoiding/ system calls

From:
James Gray
Date:
2011-09-17 @ 03:48
On Fri, Sep 16, 2011 at 10:12 PM, Eric Wong <normalperson@yhbt.net> wrote:
> James Gray <james@graysoftinc.com> wrote:
>> On Fri, Sep 16, 2011 at 9:15 PM, Qian Ye <yeqian.zju@gmail.com> wrote:
>> > so, a possible code optimization for memory use here is that, if you get
>> > some memory consuming temporary to do, you'd better do it in a child
>> > process, and send the result back, right?
>>
>> I think that's the ideal approach if you want to keep a long running
>> process in the best condition.
>
> Not ideal to other processes or the rest of the system.  I always favor
> incremental processing in smaller chunks since:
>
> 1) it's less strain on the kernel allocator (especially for *other*
>   processes in the system)
>
> 2) it works on datasets larger than RAM without swapping
>   Regardless of language/runtime you use, swap thrashing is a
>   painful experience (even with fast SSDs).
>
>   (Not all swapping is bad, though; things that sit mostly idle
>   are perfect for sitting in swap).
>
> You'll probably get improved memory locality and better performance
> there, too.
>
> Of course incremental processing requires more effort to implement,
> so it depends on the projected longevity of your code.

I'm sorry, I should have been more clear.  I agree with Eric 100%.

I just meant in cases where I have no choice but to do big work or I
am processing jobs I don't control that might become big work, I
prefer to do it in a child process.

James Edward Gray II

Re: [usp.ruby] /Avoiding/ system calls

From:
hemant
Date:
2011-09-17 @ 03:54
Hi,

On Sat, Sep 17, 2011 at 9:18 AM, James Gray <james@graysoftinc.com> wrote:
> On Fri, Sep 16, 2011 at 10:12 PM, Eric Wong <normalperson@yhbt.net> wrote:
>> James Gray <james@graysoftinc.com> wrote:
>>> On Fri, Sep 16, 2011 at 9:15 PM, Qian Ye <yeqian.zju@gmail.com> wrote:
>>> > so, a possible code optimization for memory use here is that, if you get
>>> > some memory consuming temporary to do, you'd better do it in a child
>>> > process, and send the result back, right?
>>>
>>> I think that's the ideal approach if you want to keep a long running
>>> process in the best condition.
>>
>> Not ideal to other processes or the rest of the system.  I always favor
>> incremental processing in smaller chunks since:
>>
>> 1) it's less strain on the kernel allocator (especially for *other*
>>   processes in the system)
>>
>> 2) it works on datasets larger than RAM without swapping
>>   Regardless of language/runtime you use, swap thrashing is a
>>   painful experience (even with fast SSDs).
>>
>>   (Not all swapping is bad, though; things that sit mostly idle
>>   are perfect for sitting in swap).

writev & readv (vectored I/O) can help in reducing system calls too:

http://en.wikipedia.org/wiki/Vectored_I/O

Re: [usp.ruby] /Avoiding/ system calls

From:
Eric Wong
Date:
2011-09-17 @ 04:13
hemant <gethemant@gmail.com> wrote:
> writev & readv or Vectored IO can help in reducing system calls too,
> 
> http://en.wikipedia.org/wiki/Vectored_I/O

Yep, the io-extra RubyGem provides writev() (but not readv(), too much
work to get right).

io.c in ruby/trunk has a TODO item for using writev() internally
if anybody wants to implement it :)
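
For readers on newer Rubies: IO#write accepts several strings in one call
as of Ruby 2.5.  Whether MRI turns that into a single writev(2) depends on
the platform and the state of the IO, so treat that part as an assumption
and verify it with a syscall tracer.  The strings below are made up.

```ruby
reader, writer = IO.pipe

# Three separate strings handed to one IO#write call (needs Ruby >= 2.5).
header  = "HTTP/1.1 200 OK\r\n"
headers = "Content-Length: 2\r\n\r\n"
body    = "ok"

written = writer.write(header, headers, body)
writer.close

response = reader.read
reader.close

puts "wrote #{written} bytes"
```

The point is to avoid both one write(2) per string and the extra copy
needed to concatenate the strings in user space first.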