librelist archives

Mount repo?

From:
Kenneth Jernigan
Date:
2014-03-24 @ 03:04
Hello All -
I'm beginning to evaluate attic to see whether it's more efficient for my
PC than obnam, which I've been using for quite a few months now.  It
seems to have a similar feature set for my use (local PC backup), but
appears to be lighter weight and quicker.  I've set up a new repo and
configured a script to back up automatically every hour.  The hourly
backups have a 12 hour and 30 day pruning period.  All seems well.

However, when data has been lost and I don't know when, with obnam I have
mounted all generations (archives) from the repository and found the
missing data using the file browser.  This feature seems to be absent
from attic.  All I can find is the mount command, which requires a
specific archive (the equivalent of an obnam generation).  Are there
currently any plans to add the ability to mount the entire repository and
browse the different archives maintained in the repo?

Thanks
Ken

Re: [attic] Mount repo?

From:
Jonas Borgström
Date:
2014-03-24 @ 11:15
On 2014-03-24 04:04, Kenneth Jernigan wrote:
> Hello All -
> 
> I'm beginning to evaluate attic to see if it's more efficient for my
> PC's use over obnam, which I've been using for quite a few months now.
>  It seems to have a similar feature set for my use (local PC backup),
> but appears to be lighter weight and small quicker.  I've setup a new
> repo and configured a script to automatically backup hourly.  The hourly
> backups have a 12 hour and 30 day pruning period.   All seems well.
> 
> However, when data has been lost and I don't know when, with obnam I
> have mounted all generations (archives) from the repository and using
> the file browser found the missing data.  This feature seems absent on
> attic.  All I can find is the mount command, which requires a specific
> archive (obnam equivalent of generations).  Are there currently any
> plans to add the ability to mount the entire repository and be able to
> browse the different archives maintained in the repo?

Hi Kenneth,

Yeah, that's a known limitation of the current implementation.

The main reason why this is not already implemented is that Attic needs
to read the archive metadata to be able to make it mountable. So if a
repository contains many archives it would be both time consuming and
consume a lot of memory to process it all at mount time.

One possible solution is to initially only present one top level folder
containing the archive names and then fetch and process metadata for
individual archives on demand when they are first accessed.

I've created a ticket for it here and I hope to be able to include it in
Attic 0.12, unless it turns out to be more difficult than I expected.

https://github.com/jborg/attic/issues/59

/ Jonas


Re: [attic] Mount repo?

From:
Kenneth Jernigan
Date:
2014-04-01 @ 02:45
The following was done with the latest Git repo cloned.

I mounted the repository (14 archives), which is 113GB in size (according
to 'du -sh ./data.attic').  With Nautilus set to display icons only, the
repo mount was very quick to show up.  I then did a listing with
attributes, and it took >5 minutes to complete the listing with dates and
times for the 14 archives in my repo (I know this time depends on
processor speed and RAM performance).  I suspect this is because it was
creating a local user's .cache (since my backup script is run from cronie
and therefore uses root's .cache folder as the primary one).  The
follow-up mount was practically instant, quicker than a typical archive
mount.  During the initial 5 minute mount, RAM usage would hit 350 MiB
with 49-50% CPU usage.  That maximum CPU usage is in line with a single
process using 100% of one core on my dual core PC.  A subsequent mount of
the repo shows only 14.7 MiB of memory use immediately after execution.
Once I access the root folder, however, the CPU hits 50% and memory grows
to around 100 MiB (again taking several minutes, with memory spikes up to
350 MiB).  As I browsed more archives, attic's memory usage climbed:
after accessing all 14 archives via Nautilus, it was at 295 MiB.  The
amount seems to increase with each archive accessed.

However, I'm not sure the RAM usage is much of an issue.  Using obnam, I
noticed that accessing about 6 different "archives" caused memory usage to
grow to 466 MiB.  This is the same base data that attic has.  Obnam uses
162 GB for that same dataset (with 41 historical generations vs attic's
14).

My main concern is access speed: waiting several minutes, with one
processor busy, for only 14 archives to become available.  My second
concern is the potential RAM growth.  I'm still studying how the archive
is assembled and how the cache is utilized.  I'm thinking there ought to
be some way to use the cache files to reduce the RAM and CPU usage....

Anyhow, thanks again for making this happen!
Ken


Re: [attic] Mount repo?

From:
Jonas Borgström
Date:
2014-04-01 @ 10:59
On 01/04/14 04:45, Kenneth Jernigan wrote:
> The following is performed with the latest GIT repo cloned.
> 
> I mounted the repository (14 archives) with a size of 113GB (according
> to 'du -sh ./data.attic'). With Nautilus set to display icons only, the
> repo mount was very quick to show up.  I then did a list with attributes
> and it took >5 minutes to complete the listing with dates and times on
> 14 archives in my repo (I know this time is dependent upon processor
> speed and RAM performance).  I suspect this is because it was creating a
> local users .cache (since my backup script is run from cronie and
> therefore uses root for the primary .cache folder).

The $HOME/.cache/attic directory is only used when modifying a
repository (create, delete, prune).

> The followup mount
> was practically instant, occuring quicker than a typical archive mount.
>  During the initial 5 min mount, the RAM would hit 350MiB with 49-50%
> cpu usage.  The max CPU usage is in line with a single process utilizing
> 100% of one core in my dual core PC.  Subsequent mount of the repo shows
> only 14.7 MiB of memory utilization immediately after execution.  Once I
> access the root folder, however, the CPU hits 50% and the memory grew
> until around 100MiB (again taking several minutes including memory
> spikes up to 350 MiB).  However, as I browse more archives, the attic
> memory usage climbed.  After accessing all 14 archives via Nautilus, my
> memory usage was at 295 MiB.  This amount seems to increase with each
> archive accessed. 
> 
> However, I'm not sure the ram usage is much of an issue.  Using obnam, I
> noticed that accessing about 6 different "archives" caused memory usage
> to grow to 466 MiB.  This is the same base data as attic has.  Obnam, on
> that same dataset uses 162 GB data (with 41 historical generations vs
> attic's 14).
> 
> My only concern would be the access speed.  Waiting several minutes for
> 1 processor to provide the archives (only 14 of them) available for
> access is my biggest concern. My second concern would be the potential
> RAM growth.   I'm still trying to studying how the archive is assembled
> and how the cache is utilized.  I'm thinking there ought to be some way
> to utilize the cache files to reduce the RAM and CPU usage....

This is how "attic mount" of a full repository works:

Initially only the root inode is created and dummy inodes for each
archive are added. This is fast since no archive metadata needs to be
accessed, only the archive manifest.

As soon as the user (or their file manager) calls stat() or opendir() on
one of the dummy archive inodes the entire archive inode hierarchy is
created by downloading all of the archive metadata. This uses a lot of
disk io, memory and cpu.
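
The lazy approach described above can be sketched in a few lines of Python. This is only an illustration, not Attic's actual code: the class name, the `load_metadata` callable, and the archive names are all made up for the example, and a real FUSE layer would deal in inodes rather than lists of paths.

```python
class LazyRepositoryView:
    """Top-level directory whose archive subtrees are built on first access."""

    def __init__(self, archive_names, load_metadata):
        # archive_names is cheap to obtain (manifest only).
        # load_metadata is a hypothetical callable returning an archive's items.
        self._load_metadata = load_metadata
        self._trees = {name: None for name in archive_names}  # None = placeholder
        self.loads = 0  # counts expensive metadata loads, for illustration

    def listdir_root(self):
        # Listing the root touches no archive metadata.
        return sorted(self._trees)

    def open_archive(self, name):
        # Corresponds to the first stat()/opendir() on a placeholder entry.
        if self._trees[name] is None:
            self._trees[name] = list(self._load_metadata(name))
            self.loads += 1
        return self._trees[name]


def fake_loader(name):
    # Stand-in for downloading and parsing real archive metadata.
    return [f"{name}/etc/hosts", f"{name}/home/ken/notes.txt"]


view = LazyRepositoryView(["2014-03-30", "2014-03-31"], fake_loader)
print(view.listdir_root())  # no metadata loaded yet
print(view.open_archive("2014-03-31"))  # triggers exactly one load
```

A file manager that stats every top-level entry would call `open_archive` for each name, which is the all-at-once behaviour discussed in this thread.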

The two main culprits in your use case (as I see it) are:

1. The Nautilus list view accesses all archives at once, making a bad
situation a lot worse since it forces Attic to download and process all
archive metadata at once. If a user only accesses the archives he/she is
interested in, everything will be a lot quicker.

2. The Attic metadata is a sequence of files, not a hierarchy. This means
that "attic mount" needs to process all of the metadata in order to
build the inode hierarchy required by fuse. This is most likely why attic
is slower than obnam at exporting a repository as a fuse filesystem (and
why attic is faster at creating and restoring backups).
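
To illustrate point 2, here is a rough Python sketch of turning a flat sequence of items into a directory tree. The `(path, is_dir)` item shape is an assumption made for the example and is not Attic's real metadata format; the point is only that no directory listing is complete until every item has been scanned.

```python
def build_tree(items):
    """Build nested dicts (directories) with None leaves (files).

    items: iterable of (path, is_dir) pairs, a hypothetical item shape.
    """
    root = {}
    for path, is_dir in items:
        parts = path.strip("/").split("/")
        node = root
        for part in parts[:-1]:
            child = node.get(part)
            if child is None:  # parent directory not seen yet: create it
                child = node[part] = {}
            node = child
        leaf = parts[-1]
        if is_dir:
            if node.get(leaf) is None:
                node[leaf] = {}
        else:
            node.setdefault(leaf, None)
    return root


items = [
    ("home", True),
    ("home/ken", True),
    ("home/ken/notes.txt", False),
    ("etc/hosts", False),
]
tree = build_tree(items)
print(tree)
```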

One way to make the metadata processing faster is to cache the result
allowing Attic to read the archive hierarchy data from a cache instead
of re-downloading and re-parsing the metadata every time.
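
One way such a cache could look, as a hedged sketch only: the JSON file layout, the `cached_tree` helper, and the archive ID used below are invented for the example, and it assumes an archive's contents never change once written, so the ID is a safe cache key.

```python
import json
import os
import tempfile


def cached_tree(cache_dir, archive_id, build):
    """Return the parsed hierarchy for archive_id, building it at most once."""
    path = os.path.join(cache_dir, archive_id + ".json")
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)  # cheap: no download, no re-parse
    tree = build()  # expensive: fetch and process all metadata
    with open(path, "w") as fh:
        json.dump(tree, fh)
    return tree


calls = []


def expensive_build():
    # Stand-in for downloading and parsing archive metadata.
    calls.append(1)
    return {"etc": {"hosts": None}}


cache_dir = tempfile.mkdtemp()
first = cached_tree(cache_dir, "archive-id-1", expensive_build)
second = cached_tree(cache_dir, "archive-id-1", expensive_build)
print(len(calls))  # the builder ran only for the first call
```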

/ Jonas


Re: [attic] Mount repo?

From:
Kenneth Jernigan
Date:
2014-03-25 @ 02:57
I feel that the best method is to list all the archives at the top level.
I believe the main goal of attic is to provide a backup system, not a
revision system.  I agree that if the goal was to provide a revision
system, then having a version/archive of each file as a "sub"-folder to
the file would be best (I'm thinking of git, subversion, etc).  However,
as a backup system, it seems the goal is to have versions in time of an
entire folder structure.

That said, if it were possible to show either a single archive or
multiple archives at the top level, I feel that would give the best of
both the current design and a multiple-archive design!

Thanks all.  Watching this conversation today has been very encouraging.
It's always good to see others' opinions on topics.
Ken


Re: [attic] Mount repo?

From:
Jonas Borgström
Date:
2014-03-28 @ 22:38
On 2014-03-25 03:57, Kenneth Jernigan wrote:
> I feel that the best method is to list all the archives at the top
> level.  I believe the main goal of attic is to provide a backup system,
> not a revision system.  I agree that if the goal was to provide a
> revision system, then having a version/archive of each file as a
> "sub"-folder to the file would be best (I'm thinking of git, subversion,
> etc).  However, as a backup system, it seems the goal is to have
> versions in time of an entire folder structure.
> 
> That said, however, if it would be possible to show either a single
> archive or multiple archives at the top level, I feel that the best of
> the current design and a multiple archival design would be best!
> 
> Thanks all.  Watching this conversation today has been very encouraging.
>  It's always good to see others opinions on topics.

Just a short update, I've now pushed some code that adds support for
mounting an entire repository. I'm still not very happy with the
performance and memory usage when mounting repositories with many large
archives.
Archive metadata is downloaded and processed when an archive is first
accessed. This works fairly well most of the time but some graphical
file managers seem to trigger this for all archives at once as soon as
the repository top directory is browsed.

/ Jonas

Re: [attic] Mount repo?

From:
Dan Christensen
Date:
2014-03-24 @ 14:05
Jonas Borgström <jonas@borgstrom.se> writes:

> The main reason why this is not already implemented is that Attic needs
> to read the archive metadata to be able to make it mountable. So if a
> repository contains many archives it would be both time consuming and
> consume a lot of memory to process it all at mount time.
>
> One possible solution is to initially only present one top level folder
> containing the archive names and then fetch and process metadata for
> individual archives on demand when they are first accessed.

The main use case will be to compare several versions of one file, so
rather than have the archives at the top of the tree, it might be useful
to (optionally) have the archives at the leaves.  E.g. I would mount the
repo, and then in place of a file /path/to/file.ext, there would be a
directory /path/to/file.ext/ whose contents are the various versions of
that file, maybe named something like

  /path/to/file.ext/file-date-time.ext  [or .../file-archivename.ext ?]

The date and time could be the modification timestamp or the archive
timestamp.  If different archives have exactly matching versions of a
file, they could (optionally?) be shown as just one version in the
directory.  This suggests that modification time stamp would make the
most sense in the file name.
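
A small Python sketch of this "archives at the leaves" idea, under stated assumptions: the input shape (each archive mapping a path to an `(mtime, content_id)` pair), the helper names, and the timestamp format are all hypothetical. Identical versions found in several archives collapse into one entry, and each entry is named after its modification time.

```python
import posixpath


def collect_versions(archives):
    """Map each path to its distinct versions across all archives.

    archives: {archive_name: {path: (mtime, content_id)}}  (hypothetical shape)
    """
    versions = {}
    for name in sorted(archives):
        for path, version in archives[name].items():
            seen = versions.setdefault(path, [])
            if version not in seen:  # identical versions collapse into one
                seen.append(version)
    return versions


def version_name(path, mtime):
    # "file.ext" plus "20140301-1000" becomes "file-20140301-1000.ext"
    stem, ext = posixpath.splitext(posixpath.basename(path))
    return f"{stem}-{mtime}{ext}"


archives = {
    "a1": {"/path/to/file.ext": ("20140301-1000", "c1")},
    "a2": {"/path/to/file.ext": ("20140301-1000", "c1")},  # unchanged since a1
    "a3": {"/path/to/file.ext": ("20140320-0930", "c2")},
}
versions = collect_versions(archives)
names = [version_name(p, m) for p, vs in versions.items() for m, _ in vs]
print(names)
```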

I haven't used it, but I've read that Apple's Time Machine lets you
visually flip through the various versions of a given file, so this
would be an approximation to that behavior.

If file.ext is a file in some archives and a directory in others,
I think it will still work.  Inside file.ext will be some versions
of the file as well as some subdirectories.

One thing that doesn't work is when there are different versions
of directories (e.g. different permissions).  I think the documentation
will just have to explain that this mode of mounting is for browsing
files, not for restoring directory trees.

As for efficiency, I don't suppose it's possible to quickly parse just
part of the metadata for each archive?  But even if it is slow, I think
it would be quite useful.  One thing that might help with the efficiency
(and reduce clutter) would be if some range of archives could be selected:

  --prefix foo:  just archives matching foo*
  --after date:  just archives after date
  --before date: just archives before date

Or specific archives could be named:

  --archive backup1 --archive backup2
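
The selection options proposed above are suggestions in this thread, not existing attic flags. A Python sketch of the filtering they imply, with an invented `select_archives` helper and made-up archive names and dates:

```python
from datetime import datetime


def select_archives(archives, prefix=None, after=None, before=None):
    """Return the names of archives matching every filter that is given.

    archives: iterable of (name, datetime) pairs (hypothetical shape).
    """
    selected = []
    for name, timestamp in archives:
        if prefix is not None and not name.startswith(prefix):
            continue
        if after is not None and timestamp <= after:
            continue
        if before is not None and timestamp >= before:
            continue
        selected.append(name)
    return selected


archives = [
    ("foo-1", datetime(2014, 3, 1)),
    ("foo-2", datetime(2014, 3, 20)),
    ("bar-1", datetime(2014, 3, 21)),
]
print(select_archives(archives, prefix="foo"))
print(select_archives(archives, after=datetime(2014, 3, 10)))
```

Filtering this way only needs archive names and timestamps, so it could run before any per-archive metadata is fetched.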

I can also imagine that having the archives at the top of the
tree would sometimes be useful, so maybe a command-line switch
could choose between the two behaviors.

Dan

Re: [attic] Mount repo?

From:
Kenneth Jernigan
Date:
2014-04-01 @ 12:48
Thanks for the reply.  I do see that accessing only a single folder via a
terminal does leave a lower memory footprint.  Nautilus now either opens
or stats all archive sub-directories when I open the root folder.  I
suspect this is because I have recently accessed those archive
sub-directories.
Ken

Re: [attic] Mount repo?

From:
Petros Moisiadis
Date:
2014-03-24 @ 19:11
On 03/24/14 16:05, Dan Christensen wrote:
> Jonas Borgström <jonas@borgstrom.se> writes:
>
>> The main reason why this is not already implemented is that Attic needs
>> to read the archive metadata to be able to make it mountable. So if a
>> repository contains many archives it would be both time consuming and
>> consume a lot of memory to process it all at mount time.
>>
>> One possible solution is to initially only present one top level folder
>> containing the archive names and then fetch and process metadata for
>> individual archives on demand when they are first accessed.
> The main use case will be to compare several versions of one file, so
> rather than have the archives at the top of the tree, it might be useful
> to (optionally) have the archives at the leaves.  E.g. I would mount the
> repo, and then in place of a file /path/to/file.ext, there would be a
> directory /path/to/file.ext/ whose contents are the various versions of
> that file, maybe named something like
>
>   /path/to/file.ext/file-date-time.ext  [or .../file-archivename.ext ?]
>
> The date and time could be the modification timestamp or the archive
> timestamp.  If different archives have exactly matching versions of a
> file, they could (optionally?) be shown as just one version in the
> directory.  This suggests that modification time stamp would make the
> most sense in the file name.
>
> I haven't used it, but I've read that Apple's Time Machine let's you
> visually flip through the various versions of a given file, so this
> would be an approximation to that behavior.
>
> If file.ext is a file in some archives and a directory in others,
> I think it will still work.  Inside file.ext will be some versions
> of the file as well as some subdirectories.
>
> One thing that doesn't work is when there are different versions
> of directories (e.g. different permissions).  I think the documentation
> will just have to explain that this mode of mounting is for browsing
> files, not for restoring directory trees.
>
> As for efficiency, I don't suppose it's possible to quickly parse just
> part of the meta data for each archive?  But even if it is slow, I think
> it would be quite useful.  One thing that might help with the efficiency
> (and reduce clutter) would be if some range of archives could be selected:
>
>   --prefix foo:  just archives matching foo*
>   --after date:  just archives after date
>   --before date: just archives before date
>
> Or specific archives could be named:
>
>   --archive backup1 --archive backup2
>
> I can also imagine that having the archives at the top of the
> tree would sometimes be useful, so maybe a command-line switch
> could choose between the two behaviors.
>
> Dan

I think that the 'mount' command should not do anything special, but
behave in expected, standard ways:
- mount an archive at a given mount path if the source argument is an
archive specification
- mount all archives of a repository on different directories under the
same base mount path if the source argument is a repository specification

If we need a specific representation for comparing, then this should be
the task of a special attic command (e.g. 'attic diff /path/to/file
REPOSITORY') or a wrapper script.
For example, you can use simple one-line bash loops like these:

  # for diffing between all files (the first archive has no predecessor,
  # so it is skipped)
  $ archive_prev=; for archive in /mnt/attic_repo/*; do
      [ -n "${archive_prev}" ] && diff -ru "${archive_prev}" "${archive}"
      archive_prev="${archive}"; done

  # for diffing between successive backups of a specific file
  $ archive_prev=; for archive in /mnt/attic_repo/*; do
      [ -n "${archive_prev}" ] && diff -ru "${archive_prev}/path/to/somefile" "${archive}/path/to/somefile"
      archive_prev="${archive}"; done
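
The same pairing of successive archives can be sketched in Python for cases where a wrapper script is preferred over a shell loop. This is an illustration only: `changed_between_successive` is an invented helper, the temporary directory stands in for a mounted repository, and it assumes archive names sort chronologically (as the shell glob above also does).

```python
import filecmp
import os
import tempfile


def changed_between_successive(mount_root, relpath):
    """Yield (older, newer) archive names where relpath differs between them."""
    names = sorted(os.listdir(mount_root))
    for older, newer in zip(names, names[1:]):
        a = os.path.join(mount_root, older, relpath)
        b = os.path.join(mount_root, newer, relpath)
        if (os.path.isfile(a) and os.path.isfile(b)
                and not filecmp.cmp(a, b, shallow=False)):
            yield older, newer


# Fake "mounted repository" with three archives, for demonstration only.
root = tempfile.mkdtemp()
for name, text in [("2014-03-01", "v1"), ("2014-03-02", "v1"), ("2014-03-03", "v2")]:
    os.makedirs(os.path.join(root, name))
    with open(os.path.join(root, name, "somefile"), "w") as fh:
        fh.write(text)

changes = list(changed_between_successive(root, "somefile"))
print(changes)
```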


Re: [attic] Mount repo?

From:
Dan Christensen
Date:
2014-03-24 @ 20:32
Petros Moisiadis <ernest0x@yahoo.gr> writes:

> I think that the 'mount' command should not do anything special, but
> behave in expected, standard ways:
> - mount an archive at a given mount path if the source argument is an
>   archive specification
> - mount all archives of a repository on different directories under
>   the same base mount path if the source argument is a repository
>   specification

I agree that this should be supported, but I really think that a common
use case for mounting multiple archives will be to drill down to a
particular file and compare versions.  Your suggestion of using diff is
fine for text files, but if I want to browse various edits to an image,
or view various versions of a pdf file, or of an html file, etc, it will
be more convenient to see the versions in one place.  And we shouldn't
expect users to be proficient with shell scripting.  For example, they
might mount the repo and then use nautilus or another graphical file
browser to try to find a version that contains something they want to
retrieve.

Dan

Re: [attic] Mount repo?

From:
Jonas Borgström
Date:
2014-03-24 @ 21:10
On 2014-03-24 15:05, Dan Christensen wrote:
> Jonas Borgström <jonas@borgstrom.se> writes:
> 
>> The main reason why this is not already implemented is that Attic needs
>> to read the archive metadata to be able to make it mountable. So if a
>> repository contains many archives it would be both time consuming and
>> consume a lot of memory to process it all at mount time.
>>
>> One possible solution is to initially only present one top level folder
>> containing the archive names and then fetch and process metadata for
>> individual archives on demand when they are first accessed.
> 
> The main use case will be to compare several versions of one file, so
> rather than have the archives at the top of the tree, it might be useful
> to (optionally) have the archives at the leaves.  E.g. I would mount the
> repo, and then in place of a file /path/to/file.ext, there would be a
> directory /path/to/file.ext/ whose contents are the various versions of
> that file, maybe named something like
> 
>   /path/to/file.ext/file-date-time.ext  [or .../file-archivename.ext ?]
> 
> The date and time could be the modification timestamp or the archive
> timestamp.  If different archives have exactly matching versions of a
> file, they could (optionally?) be shown as just one version in the
> directory.  This suggests that modification time stamp would make the
> most sense in the file name.
> 
> I haven't used it, but I've read that Apple's Time Machine let's you
> visually flip through the various versions of a given file, so this
> would be an approximation to that behavior.
> 
> If file.ext is a file in some archives and a directory in others,
> I think it will still work.  Inside file.ext will be some versions
> of the file as well as some subdirectories.
> 
> One thing that doesn't work is when there are different versions
> of directories (e.g. different permissions).  I think the documentation
> will just have to explain that this mode of mounting is for browsing
> files, not for restoring directory trees.

I agree with you that Apple Time Machine has a nice user interface that
lets you easily flip between different backup dates for the selected folder.
IIRC it does not indicate if files have changed or not between different
backup dates. You just get N different views for each folder if you have
N backups (archives).

But I'm not sure "attic mount" would be the best place to provide such
an interface. I think most users would expect it to behave like it does
today (an exact representation of the filesystem that was originally
backed up).
If we start to change the directory structure and/or add additional
magic files or folders it would no longer be possible to restore more
than one file using commands such as "cp -r" or to directly access more
complex things like git/subversion checkouts.

Under the hood Apple Time Machine uses a filesystem structure that is
very similar to my "archives at the top" proposal. That structure is
later used by a separate GUI program to let the user visually flip
through different snapshots of his/her folders.

If we would like to implement a time machine like UI I think using an
embedded http server with a slick html5 UI would be a better fit than
using a fuse filesystem.

/ Jonas

Re: [attic] Mount repo?

From:
Dan Christensen
Date:
2014-03-24 @ 23:10
Jonas Borgström <jonas@borgstrom.se> writes:

> But I'm not sure "attic mount" would be the best place to provide such
> an interface. I think most users would expect it to behave like it does
> today (an exact representation of the filesystem that was originally
> backed up).

I think that would be the best default, but still think it could be very
nice to have an option that puts the various versions of a file in the
same place.

> If we would like to implement a time machine like UI I think using an
> embedded http server with a slick html5 UI would be a better fit than
> using a fuse filesystem.

The advantage of a fuse view is that you could use standard tools.  If
the original file is html, you can do "google-chrome *" to open all
versions.  If the original file is a jpg, you can do "eog *" to view all
versions.  If the original is text, you can do "grep foo *" to find a
version containing the paragraph you deleted.  If the original is a
video, you can do "vlc *" to view all versions.  If the original is a
zip file, you can use nautilus to drill into the various versions.  Etc.

Believe it or not, back in the 80s a system I used did backups of
changed files every 10 minutes, and stored the backups in a tree like
I'm describing.  It was really convenient.

Dan