librelist archives

« back to archive

Random access to archive metadata/more efficient mount?

Random access to archive metadata/more efficient mount?

From:
Jeremy Maitin-Shepard
Date:
2014-04-30 @ 23:46
Although I haven't examined the source code very thoroughly, I
understand that archives are represented as a list of files, and FUSE
mounting currently involves (at initialization) reading all of the
archive metadata and constructing a complete tree structure of the
filesystem, which seems to be a significant source of inefficiency for
large archives.

Is it possible to access the list of files non-sequentially?  Provided
that the files are listed in an appropriate traversal order,
non-sequential access would allow a binary search to be used to look
up a file or directory by its path.  The FUSE filesystem could just
read the contents of directories as they are accessed.

I believe the traversal order required is given by the following pseudocode:

def traverse(root):
  entries = []
  subdirs = []
  for e in list_entries_to_be_backed_up(root):
    entries.add(e)
    if is_dir(e): subdirs.add(e)
  entries.sort()
  subdirs.sort()
  for e in entries:  add_archive_entry(e)
  for d in subdirs: traverse(d)

This differs from the current order, which interleaves subdirectory
contents with root directory files, but is there anything that depends
on the existing order?  The sorting would not likely impose a
significant cost.

For a binary search, it would not be necessary to be able to skip
forward/backward an exact number of items in the list of files, just
to be able to find the next/previous entry at a given byte offset
(like UTF-8).

Re: Random access to archive metadata/more efficient mount?

From:
Cyril Roussillon
Date:
2014-05-07 @ 20:25
Just a remark, I think Jeremy is right and it is actually possible to
get random access (I mean hierarchical access) with only slight
modification of how the metadata are stored.

It would just require to write dir entries after their content, to write
the entry size after the entries, and to add a recursive directory size
after the dir entry. Then it allows to navigate in the hierarchy
starting at the end (it's necessary to do it reverse like this because
the sizes can only be written after the content).

The traversal order would be :

def traverse(root):
  files = []
  subdirs = []
  for e in list_entries_to_be_backed_up(root):
    if is_dir(e): subdirs.add(e)
    else: files.add(e)
  for f in files:  add_archive_entry(f)
  for d in subdirs: traverse(d)
  add_archive_entry(root) 


For instance consider the following hierarchy :
root
  dir1
    file11
    dir11
    dir12
  dir2
    file21
    dir21
      file211
    dir22
  dir3
    file31
    dir31
    dir32

With 4-bytes numbers it would be stored as :

file11[6][10]dir11[5][9]dir12[5][9]dir1[4][48]
file21[6][10]file211[7][11]dir21[5][24]dir22[5][9]dir2[4][63]
file31[6][10]dir31[5][9]dir32[5][9]dir3[4][48]
root[4][111]

Now if you want to access root/dir2/dir21/file211 : go the the end, drop
111, read 4, is "root" what you want ? yes so enter by reading the
previous value 48, read 4, is "dir3" what you want ? no so go 48 bytes
back (here if dir3 is huge you can save reading and processing a lot of
metadata ; as you know all the chunks ids that make the metadata and the
chunks size from the cache, you can go directly to the right chunk),
read 63, read 4, is "dir2" what you want ? yes so enter by reading the
previous value 9, read 5, is "dir22" what you want ? no so go 9 bytes
back, read 24, read 5, is "dir21" what you want ? yes so enter by
reading the previous value 11, read 7, is "file211" what you want ? yes
you're done !

If you want to list root/dir2 : start the same, once you have found
"dir2", read 9, read 5, print "dir22", go 9 bytes back, read 24, read 5,
print dir21, go 24 bytes back, read 10, read 6, print file21, go back 10
bytes backward, now detect that you are 63 bytes back compared to when
you started listing the directory so stop.

You can also by ignoring the recursive directory sizes linearly list all
files, but you can also list all files recursively the same way and
avoid storing the full path of entries but just their name.


Anyway as it is required to process the entire archive to correctly
implement hard links, it was more for the challenge, and there is no
guarantee that it would be faster once the fuse hierarchy is completely
built.

Just out of curiosity, how are hard links implemented in the repository
and in the fuse mount ?





-- 
Cyril Roussillon
http://crteknologies.fr/

Re: [attic] Re: Random access to archive metadata/more efficient mount?

From:
Jeremy Maitin-Shepard
Date:
2014-05-08 @ 20:42
One possibility would be to have an incremental mount mode that does not
correctly handle hard links (they would just appear as distinct files with
the same data), but also the existing mount mode that does.  The added
complexity of two fuse filesystems to maintain might not be worth it,
though.


On Wed, May 7, 2014 at 1:25 PM, Cyril Roussillon <cyril42e@gmail.com> wrote:

> Just a remark, I think Jeremy is right and it is actually possible to
> get random access (I mean hierarchical access) with only slight
> modification of how the metadata are stored.
>
> It would just require to write dir entries after their content, to write
> the entry size after the entries, and to add a recursive directory size
> after the dir entry. Then it allows to navigate in the hierarchy
> starting at the end (it's necessary to do it reverse like this because
> the sizes can only be written after the content).
>
> The traversal order would be :
>
> def traverse(root):
>   files = []
>   subdirs = []
>   for e in list_entries_to_be_backed_up(root):
>     if is_dir(e): subdirs.add(e)
>     else: files.add(e)
>   for f in files:  add_archive_entry(f)
>   for d in subdirs: traverse(d)
>   add_archive_entry(root)
>
>
> For instance consider the following hierarchy :
> root
>   dir1
>     file11
>     dir11
>     dir12
>   dir2
>     file21
>     dir21
>       file211
>     dir22
>   dir3
>     file31
>     dir31
>     dir32
>
> With 4-bytes numbers it would be stored as :
>
> file11[6][10]dir11[5][9]dir12[5][9]dir1[4][48]
> file21[6][10]file211[7][11]dir21[5][24]dir22[5][9]dir2[4][63]
> file31[6][10]dir31[5][9]dir32[5][9]dir3[4][48]
> root[4][111]
>
> Now if you want to access root/dir2/dir21/file211 : go the the end, drop
> 111, read 4, is "root" what you want ? yes so enter by reading the
> previous value 48, read 4, is "dir3" what you want ? no so go 48 bytes
> back (here if dir3 is huge you can save reading and processing a lot of
> metadata ; as you know all the chunks ids that make the metadata and the
> chunks size from the cache, you can go directly to the right chunk),
> read 63, read 4, is "dir2" what you want ? yes so enter by reading the
> previous value 9, read 5, is "dir22" what you want ? no so go 9 bytes
> back, read 24, read 5, is "dir21" what you want ? yes so enter by
> reading the previous value 11, read 7, is "file211" what you want ? yes
> you're done !
>
> If you want to list root/dir2 : start the same, once you have found
> "dir2", read 9, read 5, print "dir22", go 9 bytes back, read 24, read 5,
> print dir21, go 24 bytes back, read 10, read 6, print file21, go back 10
> bytes backward, now detect that you are 63 bytes back compared to when
> you started listing the directory so stop.
>
> You can also by ignoring the recursive directory sizes linearly list all
> files, but you can also list all files recursively the same way and
> avoid storing the full path of entries but just their name.
>
>
> Anyway as it is required to process the entire archive to correctly
> implement hard links, it was more for the challenge, and there is no
> guarantee that it would be faster once the fuse hierarchy is completely
> built.
>
> Just out of curiosity, how are hard links implemented in the repository
> and in the fuse mount ?
>
>
>
>
>
> --
> Cyril Roussillon
> http://crteknologies.fr/
>
>
>

Re: [attic] Random access to archive metadata/more efficient mount?

From:
Jonas Borgström
Date:
2014-05-02 @ 13:46
On 2014-05-01 01:46, Jeremy Maitin-Shepard wrote:
> Although I haven't examined the source code very thoroughly, I
> understand that archives are represented as a list of files, and FUSE
> mounting currently involves (at initialization) reading all of the
> archive metadata and constructing a complete tree structure of the
> filesystem, which seems to be a significant source of inefficiency for
> large archives.
> 
> Is it possible to access the list of files non-sequentially?  Provided
> that the files are listed in an appropriate traversal order,
> non-sequential access would allow a binary search to be used to look
> up a file or directory by its path.  The FUSE filesystem could just
> read the contents of directories as they are accessed.

Unfortunately it's not really possible to access the archive metadata
non-sequentially.
Even if that would be possible correctly implementing hardlinked files
(nlinks > 1) would only be possible by processing the entire archive at
once.

I think the best thing we could do to improve performance short term is
to cache fuse hierarchy locally. This way the expensive processing only
has to be done once per archive and not every time the repository is
(re-)mounted.

/ Jonas