librelist archives

A few attic ideas

From:
Date:
2014-04-30 @ 10:10
Hello,

I have a few ideas I'd like to put up for discussion. I don't want them to
be understood as feature requests (yet) as I'm not sure how useful they 
would be without understanding much of the inner workings of attic.

1) different compression or no compression
 - I'm experimenting a bit, trying to get attic to work on a QNAP. No huge
successes yet (due to lack of time), but once (if) it works, the
compression (zlib, I think) will probably be the bottleneck on the little
ARM CPU in this machine (or maybe RAM, dunno yet). How much of a design
change would it be to a) skip compression or b) switch it to something
lighter like LZ4? This might also open up the possibility of using LZMA on
beefier CPUs.
2) multithreading
- this is for machines more on the high end. If you're waiting for a 
backup to finish, sometimes you'd like to dedicate more cores to the
process. attic seems to be single-threaded. Though this seems to be a 
problem on initial backups, as updating a repo seems to tax IO more than 
the CPU.
3) control cache path. 
- At the moment, cache seems to be hardcoded to $HOME/.cache/attic. Would 
it be reasonable to have this configurable? My backup scripts set HOME to 
/var/cache/attic as a workaround :)
4) donations
- I'm missing a PayPal button of some sort. Attic is useful enough, I
think :)

Best Regards
 Heiko

Re: [attic] A few attic ideas

From:
Jonas Borgström
Date:
2014-05-01 @ 12:07
On 2014-04-30 12:10, heiko.helmle@horiba.com wrote:
> Hello,
> 
> I have a few ideas I'd like to put up for discussion. I don't want them
> to be understood as feature requests (yet) as I'm not sure how useful
> they would be without understanding much of the inner workings of attic.
> 
> 1) different compression or no compression
>  - I'm experimenting a bit, trying to get attic to work on a QNAP. No
> huge successes yet (due to lack of time), but once (if) it works, the
> compression (zlib, I think) will probably be the bottleneck on the little
> ARM CPU in this machine (or maybe RAM, dunno yet). How much of a design
> change would it be to a) skip compression or b) switch it to something
> lighter like LZ4? This might also open up the possibility of using LZMA
> on beefier CPUs.

Deflate/zlib has served us well so far; it compresses well and is
fast enough not to become a bottleneck. But since LZMA is now included
with Python 3.3+, we could add support for that in the future.

But since most embedded systems are very low on RAM, that will probably
be your main concern...

I back up all my stuff to a Synology NAS (ds214+) and it works very well...
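
The trade-off discussed above (no compression, zlib, or LZMA) can be
measured on a given machine before changing anything. The following is
only a rough sketch using Python's standard zlib and lzma modules; the
function name compare_codecs and the sample data are made up for
illustration and are not part of attic. LZ4 is left out because it is not
in the standard library.

```python
import lzma
import time
import zlib


def compare_codecs(data: bytes) -> dict:
    """Return compressed size and elapsed seconds for each codec on `data`."""
    codecs = {
        "none": lambda b: b,    # option (a): skip compression entirely
        "zlib": zlib.compress,  # Deflate at the module's default level
        "lzma": lzma.compress,  # in the standard library since Python 3.3
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = {"size": len(packed), "seconds": elapsed}
    return results


if __name__ == "__main__":
    # Hypothetical, highly repetitive sample; real backup data will differ.
    sample = b"attic backup chunk " * 50_000
    for name, stats in compare_codecs(sample).items():
        print(f"{name:5s} {stats['size']:>9d} bytes  {stats['seconds']:.4f} s")
```

Timings on a small ARM NAS will look very different from those on a
desktop CPU, so the numbers are only meaningful on the target machine.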

> 2) multithreading
> - this is for machines more on the high end. If you're waiting for a
> backup to finish, sometimes you'd like to dedicate more cores to the
> process. attic seems to be single-threaded. Though this seems to be a
> problem on initial backups, as updating a repo seems to tax IO more than
> the CPU.

Yeah, I've been thinking a bit about this. It would be great to be able
to make use of more than one core. Unfortunately multi-threaded programming
is a lot more difficult and error-prone than single-threaded programming.
So in order to justify all the added code complexity and risk, the
performance gains must be significant. But this is definitely an area to
look into in the future.

> 3) control cache path.
> - At the moment, cache seems to be hardcoded to $HOME/.cache/attic.
> Would it be reasonable to have this configurable? My backup scripts set
> HOME to /var/cache/attic as a workaround :)

You can specify an alternate cache dir location using an environment
variable like this:

ATTIC_CACHE_DIR=/var/cache/attic attic create ...

I think this is currently undocumented but I'll add an entry to the FAQ
about this.

(Same goes for ATTIC_KEYS_DIR)
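
For scripts that wrap attic, the lookup order described above (environment
variable first, then the default under $HOME/.cache/attic) could be
mirrored roughly like this. It is a sketch only: resolve_cache_dir is a
hypothetical helper, not attic's actual code, and the default path is
simply the one mentioned earlier in this thread.

```python
import os


def resolve_cache_dir(environ=os.environ) -> str:
    """Return the cache directory, preferring ATTIC_CACHE_DIR when it is set."""
    # Assumed default, taken from the $HOME/.cache/attic path quoted above.
    default = os.path.join(os.path.expanduser("~"), ".cache", "attic")
    return environ.get("ATTIC_CACHE_DIR", default)


if __name__ == "__main__":
    print(resolve_cache_dir())
    print(resolve_cache_dir({"ATTIC_CACHE_DIR": "/var/cache/attic"}))
```

Passing the environment as a parameter keeps the helper easy to test
without touching the real process environment.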

> 4) donations
> - I'm missing a PayPal button of some sort. Attic is useful enough, I
> think :)

Thanks! I'll have to read up on how stuff like gittip.com works. But in
the meantime my email address is also my PayPal account if anyone's
interested ;)

Thanks for your feedback!

/ Jonas

Re: [attic] A few attic ideas

From:
Dan Christensen
Date:
2014-05-01 @ 13:12
Jonas Borgström <jonas@borgstrom.se> writes:

> On 2014-04-30 12:10, heiko.helmle@horiba.com wrote:
> 
>> 2) multithreading
>> - this is for machines more on the high end. If you're waiting for a
>> backup to finish, sometimes you'd like to dedicate more cores to the
>> process. attic seems to be single-threaded. Though this seems to be a
>> problem on initial backups, as updating a repo seems to tax IO more than
>> the CPU.
>
> Yeah, I've been thinking a bit about this. It would be great to be able
> to make use of more than one core. Unfortunately multi-threaded
> programming is a lot more difficult and error-prone than single-threaded
> programming. So in order to justify all the added code complexity and
> risk, the performance gains must be significant. But this is definitely
> an area to look into in the future.

For multi-threaded compression, I wonder if the methods used for pigz
would be helpful:

  http://www.zlib.net/pigz/

But maybe compression isn't the bottleneck for attic?  I wonder if
there's an easy way to at least partially use more than one core,
e.g. moving compression (or some other task) to another thread?
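
To make the idea of moving compression to other threads concrete, here is
a minimal sketch that compresses independent chunks on a thread pool. It is
an assumption-laden illustration, not pigz's actual method and not attic's
code: compress_chunks and the sample chunks are invented, and whether it
helps at all depends on the interpreter releasing its global lock during
compression and on where the real bottleneck is.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_chunks(chunks, workers=2):
    """Compress each chunk independently on a worker thread, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order the chunks were submitted.
        return list(pool.map(zlib.compress, chunks))


if __name__ == "__main__":
    # Hypothetical chunks; a real backup would feed in file data instead.
    chunks = [bytes([i]) * 1_000_000 for i in range(8)]
    packed = compress_chunks(chunks, workers=4)
    assert [zlib.decompress(p) for p in packed] == chunks
    print(sum(len(p) for p in packed), "bytes after compression")
```

Because every chunk is compressed on its own, each result can be
decompressed independently, at the cost of not sharing a dictionary
between chunks.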

But I also agree that the cost in complexity might be too high.
Attic is already very fast, and reliability and maintainability are
the most important goals.

And there are other ways speed could be improved without multi-threading,
e.g. by improving the cache rebuilding...  :-)

Dan

Re: [attic] A few attic ideas

From:
Jonas Borgström
Date:
2014-05-02 @ 13:52
On 2014-05-01 15:12, Dan Christensen wrote:
> Jonas Borgström <jonas@borgstrom.se> writes:
> 
>> On 2014-04-30 12:10, heiko.helmle@horiba.com wrote:
>>
>>> 2) multithreading
>>> - this is for machines more on the high end. If you're waiting for a
>>> backup to finish, sometimes you'd like to dedicate more cores to the
>>> process. attic seems to be single-threaded. Though this seems to be a
>>> problem on initial backups, as updating a repo seems to tax IO more than
>>> the CPU.
>>
>> Yeah, I've been thinking a bit about this. It would be great to be able
>> to make use of more than one core. Unfortunately multi-threaded
>> programming is a lot more difficult and error-prone than single-threaded
>> programming. So in order to justify all the added code complexity and
>> risk, the performance gains must be significant. But this is definitely
>> an area to look into in the future.
> 
> For multi-threaded compression, I wonder if the methods used for pigz
> would be helpful:
> 
>   http://www.zlib.net/pigz/
> 
> But maybe compression isn't the bottleneck for attic?  I wonder if
> there's an easy way to at least partially use more than one core,
> e.g. moving compression (or some other task) to another thread?
> 
> But I also agree that the cost in complexity might be too high.
> Attic is already very fast, and reliability and maintainability are
> the most important goals.

Agreed, the first step should always be benchmarking to make sure
we're optimizing a real bottleneck and not just introducing complexity
for no good reason...
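
As a rough illustration of that kind of benchmarking, the sketch below
times two CPU-bound stages separately over the same chunks. The stage
split (SHA-256 hashing versus zlib compression) and the name time_stages
are assumptions made for the example and may not match attic's real
pipeline; Python's cProfile module would give a fuller picture of an
actual run.

```python
import hashlib
import time
import zlib


def time_stages(chunks):
    """Return total seconds spent hashing and compressing the given chunks."""
    totals = {"hash": 0.0, "compress": 0.0}
    for chunk in chunks:
        start = time.perf_counter()
        hashlib.sha256(chunk).digest()
        totals["hash"] += time.perf_counter() - start

        start = time.perf_counter()
        zlib.compress(chunk)
        totals["compress"] += time.perf_counter() - start
    return totals


if __name__ == "__main__":
    # Hypothetical input: 16 chunks of 1 MiB each.
    chunks = [bytes(range(256)) * 4096 for _ in range(16)]
    for stage, seconds in time_stages(chunks).items():
        print(f"{stage:8s} {seconds:.4f} s")
```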


> And there are other ways speed could be improved without multi-threading,
> e.g. by improving the cache rebuilding...  :-)

:)

/ Jonas