librelist archives

a few questions on attic

From:
Christian Neukirchen
Date:
2014-03-16 @ 16:43
Hi,

I just learned about Attic yesterday and I like very much what I see.
After reading the manual and inspecting the source, I have a few
questions:

- When I use encryption, what exactly is encrypted?  Just the file
  contents or the metadata (file names and attributes) and internal data
  (number of backups and their name and time) as well?  Perhaps this
  could be clarified in the manual.

- Are there plans to support more encryption algorithms?  (Supporting
  public key cryptography (even if it's just shelling out to GPG) could
  be very appealing, since it lets machines back up without requiring the
  key to decrypt the data again---but that depends how the internal
  protocol works.)

- AFAICS only the mtime of files is stored.  Any reason why ctime and
  atime are not backed up?

Finally a suggestion: It would be useful if one saw during the backup
which files contain new data (-v seems to show all files).

Thanks!
-- 
Christian Neukirchen  <chneukirchen@gmail.com>  http://chneukirchen.org

Re: [attic] a few questions on attic

From:
Jonas Borgström
Date:
2014-03-16 @ 21:03
On 2014-03-16 17:43, Christian Neukirchen wrote:
> Hi,
> 
> I just learned about Attic yesterday and I like very much what I see.
> After reading the manual and inspecting the source, I have a few
> questions:
> 
> - When I use encryption, what exactly is encrypted?  Just the file
>   contents or the metadata (file names and attributes) and internal data
>   (number of backups and their name and time) as well?  Perhaps this
>   could be clarified in the manual.

Everything is encrypted before leaving the client. The deduplication
algorithm seed is also derived from the passphrase/key file. This means
that it is not possible to detect whether a specific file is backed up by
looking at the sizes of the deduplicated file chunks.
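
To illustrate the point about the seed, here is a minimal sketch of a
content-defined chunker whose cut points depend on a key. It is not
Attic's actual chunker: the gear-style rolling hash, the HMAC-based table
derivation and the names seeded_table and chunk_sizes are all assumptions
made for this example.

```python
import hashlib
import hmac


def seeded_table(key):
    """Derive 256 pseudo-random 32-bit values from the key (assumed scheme)."""
    return [
        int.from_bytes(hmac.new(key, bytes([i]), hashlib.sha256).digest()[:4], "big")
        for i in range(256)
    ]


def chunk_sizes(data, key, bits=10):
    """Split data at content-defined cut points and return the chunk sizes.

    A cut is made whenever the top `bits` bits of the rolling hash are zero,
    so the average chunk is roughly 2**bits bytes long.
    """
    table = seeded_table(key)
    sizes = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Gear-style rolling hash: older bytes are shifted out of the 32-bit state.
        h = ((h << 1) + table[byte]) & 0xFFFFFFFF
        if h >> (32 - bits) == 0:
            sizes.append(i + 1 - start)
            start = i + 1
            h = 0
    if start < len(data):
        sizes.append(len(data) - start)  # trailing chunk without a cut point
    return sizes
```

With the same key the sizes are reproducible, which is what makes
deduplication work. With a different key the same file is cut at
different places, so an observer who does not know the key cannot match
chunk sizes against a known file.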

That said, if someone is able to observe the filesystem of a remote
repository over a period of time it would be possible to figure out the
number of archives that are created and/or deleted and the approximate size.

There's a ticket about improving the documentation here:

https://github.com/jborg/attic/issues/29

> - Are there plans to support more encryption algorithms?  (Supporting
>   public key cryptography (even if it's just shelling out to GPG) could
>   be very appealing, since it lets machines back up without requiring the
>   key to decrypt the data again---but that depends how the internal
>   protocol works.)

No immediate plans for that. I experimented a bit with RSA encryption
before settling on the current system. But I found it wasn't worth the
extra complexity and overhead.
"Encrypt only"-keys sound good on paper but in real life they won't
help much since if somebody is able to access the "encrypt only"-key
they are likely also able to access the filesystem itself, so there's no
need to decrypt any backups.

> - AFAICS only the mtime of files is stored.  Any reason why ctime and
>   atime are not backed up?

ctime is not backed up since there's no platform independent way to
restore it.

atime is not backed up since it is changed every time a file is read (by
attic for example). Since it always changes it would also make
deduplication of file metadata almost impossible.
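
The ctime limitation can be seen with a small sketch, assuming Python's
os.utime as the restore mechanism (it accepts only an atime and an mtime).
The helper name restore_times is made up for this example.

```python
import os


def restore_times(path, atime, mtime):
    """Set atime and mtime on path, then report what the filesystem stores.

    os.utime has no parameter for ctime. On Unix the call itself changes the
    inode, so ctime ends up at roughly "now" instead of a restored value.
    """
    os.utime(path, (atime, mtime))
    st = os.stat(path)
    return st.st_atime, st.st_mtime, st.st_ctime
```

Calling restore_times(path, 1000000000, 1000000000) on a scratch file
returns the requested atime and mtime, while the third value stays at the
current time.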

> 
> Finally a suggestion: It would be useful if one saw during the backup
> which files contain new data (-v seems to show all files).

Yeah, that could be useful. Any suggestion for how that would look?

/ Jonas

Re: [attic] a few questions on attic

From:
Christian Neukirchen
Date:
2014-03-16 @ 22:51
"Jonas Borgström" <jonas@borgstrom.se> writes:

> On 2014-03-16 17:43, Christian Neukirchen wrote:
>> Hi,
>> 
>> I just learned about Attic yesterday and I like very much what I see.
>> After reading the manual and inspecting the source, I have a few
>> questions:
>> 
>> - When I use encryption, what exactly is encrypted?  Just the file
>>   contents or the metadata (file names and attributes) and internal data
>>   (number of backups and their name and time) as well?  Perhaps this
>>   could be clarified in the manual.
>
> Everything is encrypted before leaving the client.

Ok, great!

>> - Are there plans to support more encryption algorithms?  (Supporting
>>   public key cryptography (even if it's just shelling out to GPG) could
>>   be very appealing, since it lets machines back up without requiring the
>>   key to decrypt the data again---but that depends how the internal
>>   protocol works.)
>
> No immediate plans for that. I experimented a bit with RSA encryption
> before settling on the current system. But I found it wasn't worth the
> extra complexity and overhead.
> "Encrypt only"-keys sound good on paper but in real life they won't
> help much since if somebody is able to access the "encrypt only"-key
> they are likely also able to access the filesystem itself, so there's no
> need to decrypt any backups.

Yes, but an attacker who obtains that key still cannot decrypt data that
other servers stored there.

>> - AFAICS only the mtime of files is stored.  Any reason why ctime and
>>   atime are not backed up?
>
> ctime is not backed up since there's no platform independent way to
> restore it.
>
> atime is not backed up since it is changed every time a file is read (by
> attic for example). Since it always changes it would also make
> deduplication of file metadata almost impossible.

Good points.

>> Finally a suggestion: It would be useful if one saw during the backup
>> which files contain new data (-v seems to show all files).
>
> Yeah, that could be useful. Any suggestion for how that would look?

Well, it could just be a plain list of files, or perhaps also show how
many bytes/percent of the file contents have been deduplicated...

Thanks!
-- 
Christian Neukirchen  <chneukirchen@gmail.com>  http://chneukirchen.org

Re: [attic] a few questions on attic

From:
Petros Moisiadis
Date:
2014-03-17 @ 07:49
On 03/16/2014 11:03 PM, Jonas Borgström wrote:
> On 2014-03-16 17:43, Christian Neukirchen wrote:
>
>> Finally a suggestion: It would be useful if one saw during the backup
>> which files contain new data (-v seems to show all files).
> Yeah, that could be useful. Any suggestion for how that would look?
>
> / Jonas
>

What about:

Changed: /path/to/some/file [original size: 50KB, compressed size: 20KB,
unique data: 2KB]
New: /path/to/another/file [original size: 1MB, compressed size: 600KB,
unique data: 30KB]

'compressed size' could be omitted if it is not known per file (but
done on the whole stream).
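
A rough sketch of how such a line could be computed, assuming fixed-size
chunks, per-chunk zlib compression and a set of already stored chunk
digests. None of this is taken from Attic; file_report, CHUNK_SIZE and
known_chunks are names invented for the example, and "unique data" is
counted before compression.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # fixed-size chunks keep the sketch short


def file_report(path, data, known_chunks, is_new):
    """Return one report line; known_chunks (a set of digests) is updated in place."""
    compressed = 0
    unique = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        compressed += len(zlib.compress(chunk))
        digest = hashlib.sha256(chunk).digest()
        if digest not in known_chunks:
            # Only chunks never seen before count as unique data.
            known_chunks.add(digest)
            unique += len(chunk)
    label = "New" if is_new else "Changed"
    return "%s: %s [original size: %d B, compressed size: %d B, unique data: %d B]" % (
        label, path, len(data), compressed, unique)
```

Sizes are printed in bytes here; a real implementation would probably
scale them to KB/MB as in the proposal.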

Re: [attic] a few questions on attic

From:
Christian Neukirchen
Date:
2014-03-17 @ 11:32
Petros Moisiadis <ernest0x@yahoo.gr> writes:

> On 03/16/2014 11:03 PM, Jonas Borgström wrote:
>> On 2014-03-16 17:43, Christian Neukirchen wrote:
>>
>>> Finally a suggestion: It would be useful if one saw during the backup
>>> which files contain new data (-v seems to show all files).
>> Yeah, that could be useful. Any suggestion for how that would look?
>>
>> / Jonas
>>
>
> What about:
>
> Changed: /path/to/some/file [original size: 50KB, compressed size: 20KB,
> unique data: 2KB]
> New: /path/to/another/file [original size: 1MB, compressed size: 600KB,
> unique data: 30KB]

duplicity uses a shorter, more scannable format:

A /tmp/addedfile
M /tmp/changedfile
D /tmp/deletedfile

(I guess "D" is not detected by Attic.)

-- 
Christian Neukirchen  <chneukirchen@gmail.com>  http://chneukirchen.org

Re: [attic] a few questions on attic

From:
Jonas Borgström
Date:
2014-03-19 @ 20:06
On 2014-03-17 12:32, Christian Neukirchen wrote:
> Petros Moisiadis <ernest0x@yahoo.gr> writes:
> 
>> On 03/16/2014 11:03 PM, Jonas Borgström wrote:
>>> On 2014-03-16 17:43, Christian Neukirchen wrote:
>>>
>>>> Finally a suggestion: It would be useful if one saw during the backup
>>>> which files contain new data (-v seems to show all files).
>>> Yeah, that could be useful. Any suggestion for how that would look?
>>>
>>> / Jonas
>>>
>>
>> What about:
>>
>> Changed: /path/to/some/file [original size: 50KB, compressed size: 20KB,
>> unique data: 2KB]
>> New: /path/to/another/file [original size: 1MB, compressed size: 600KB,
>> unique data: 30KB]
> 
> duplicity uses a shorter, more scannable format:
> 
> A /tmp/addedfile
> M /tmp/changedfile
> D /tmp/deletedfile
> 
> (I guess "D" is not detected by Attic.)

Yeah, that's right. We can actually only determine if regular files have
been modified. We have no way of knowing if a directory or symlink is
new or not.
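
As a sketch of what that means for a duplicity-style listing: assuming a
cache from the previous run that maps each path to (size, mtime), only
"A" and "M" can be reported, and only for regular files. The cache layout
and the name status_lines are assumptions for this example, not Attic's
actual file cache.

```python
import os
import stat


def status_lines(paths, previous):
    """Return "A path" / "M path" lines; previous is updated in place.

    previous maps path -> (size, mtime_ns) as recorded by the last run.
    """
    lines = []
    for path in paths:
        st = os.lstat(path)
        if not stat.S_ISREG(st.st_mode):
            continue  # directories, symlinks etc.: no way to tell new from old
        current = (st.st_size, st.st_mtime_ns)
        if path not in previous:
            lines.append("A " + path)
        elif previous[path] != current:
            lines.append("M " + path)
        previous[path] = current
    return lines
```

Deleted files never show up because the loop only visits paths that
still exist, which matches the guess about "D" above.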

I've created a ticket for this here:

https://github.com/jborg/attic/issues/55

I also added a format suggestion of my own.

/ Jonas