librelist archives

Informing Git Status

From:
Albert Krawczyk
Date:
2012-01-03 @ 08:08
Hi

I originally posted this on the libgit2 issue list on GitHub. Sorry about
that; it's been one of those days.


Recently I started working on some code that uses the Windows NTFS USN
(Change) Journal to work out which files have changed in a directory. This
means not all files in a directory need to be checked by a status operation.

I was hoping somebody could tell me which file I should be hacking up. For
example if I could provide a list of paths that have changed in a working
copy, where would I feed that list into libgit2 so that it doesn't have to
scan the entire directory for changes?

My code is currently in C#, so it would be useless for wider distribution,
but I was hoping to hack on libgit2 myself to see how hard it would be to
make the 'status' operation 'informed' as opposed to a 'foreach file in
whole repository' thing.

Hope this makes sense... it's been a long day :)

Albert 
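
A minimal sketch of the "informed status" idea from the message above, written in Python for brevity rather than as libgit2 code. It assumes a hypothetical snapshot (a dict mapping relative paths to size and mtime recorded at the last status) standing in for the index, and compares a full tree walk with a check limited to a supplied list of candidate paths. The names `stat_entry`, `full_status` and `informed_status` are illustrative only and are not libgit2 functions.

```python
import os
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical stand-in for an index: relative path -> (size, mtime_ns)
# as recorded the last time status was computed.
Snapshot = Dict[str, Tuple[int, int]]


def stat_entry(root: str, rel_path: str) -> Optional[Tuple[int, int]]:
    """Return (size, mtime_ns) for a file, or None if it does not exist."""
    try:
        st = os.stat(os.path.join(root, rel_path))
    except FileNotFoundError:
        return None
    return (st.st_size, st.st_mtime_ns)


def full_status(root: str, snapshot: Snapshot) -> Set[str]:
    """Baseline: walk the whole tree and compare every file to the snapshot."""
    seen: Set[str] = set()
    changed: Set[str] = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            seen.add(rel)
            if snapshot.get(rel) != stat_entry(root, rel):
                changed.add(rel)
    # Files recorded in the snapshot but no longer on disk count as changed.
    changed.update(set(snapshot) - seen)
    return changed


def informed_status(root: str, snapshot: Snapshot,
                    candidates: Iterable[str]) -> Set[str]:
    """Informed variant: only re-check the paths reported as touched."""
    return {rel for rel in set(candidates)
            if snapshot.get(rel) != stat_entry(root, rel)}
```

The informed variant is only as good as the candidate list: a path that changed but was never reported will be missed, so a real implementation would presumably need a way to fall back to the full walk.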

Re: [libgit2] Informing Git Status

From:
Vicent Marti
Date:
2012-01-04 @ 00:41
Hey Albert,

as you probably already know, this Windows-only "optimization" using
the USN journal is rather tricky -- especially when you consider that
access to the journal requires admin permissions.

The status code you're looking for can be found in `status.c` and
`status.h`, and I'm not going to discourage you from writing a
"notification" API for the status, especially if it's a clean one,
because we could reuse it with other (saner) FS change APIs such as
inotify on Unix systems.

If you're up for the task, open an issue and I can help you with any
problems you run into.

Cheers,
Vicent

On Tue, Jan 3, 2012 at 9:08 AM, Albert Krawczyk
<pro-logic@optusnet.com.au> wrote:
> Hi
>
> I originally posted this on the libgit2 issue list on github. Sorry about
> that it's been one of those days.
>
>
> Recently I started working on some code that uses the Windows NTFS USN
> (Change) Journal to work out what files have changed in a directory, this
> means not all files in a directory need to be checked by a status operation.
>
> I was hoping somebody could tell me which file I should be hacking up. For
> example if I could provide a list of paths that have changed in a working
> copy, where would I feed that list into libgit2 so that it doesn't have to
> scan the entire directory for changes?
>
> My code is currently in C# so it would be useless for wider distribution,
> but I was hoping to hack myself up libgit2 to see how hard it would be to
> make the 'status' operation be 'informed' as opposed to a 'foreach file in
> whole repository' thing.
>
> Hope this makes sense... it's been a long day :)
>
> Albert
>

Re: [libgit2] Informing Git Status

From:
Albert Krawczyk
Date:
2012-01-04 @ 05:45
Hey Vicent,

> this Windows-only "optimization" using the USN journal is rather tricky

Yeah, it is. From my current C# implementation, it seems that the USN
journal only becomes faster once a repository has more than 5000 files in
it (I suspect in a pure C implementation this would be a smaller number).

> The status code you're looking for can be found on `status.c` and
> `status.h`
Thanks!

> and I'm not going to discourage you from writing a "notification" API for
> the status
That's my job :)

> , especially if it's a clean one, because we could reuse it with other
> (saner) FS change APIs such as inotify on Unix systems
> If you're up for the task, open an issue and I can guide you with any
> issues you come up with.

I'm not up for the job at all; I can barely comprehend C, let alone write
it. This is more of an 'if I have some spare time, it's an excuse to learn
about C' kind of thing.

I do know some stuff about the USN Journal from using it, though, so we can
talk about a decent API design (especially if somebody knows about inotify).
That way, if at some point in the future somebody decides to do this, the API
design can take all the change journals into account.

Out of curiosity, does git have such a notification API that can be used?

Albert

Re: [libgit2] Informing Git Status

From:
Carlos Martín Nieto
Date:
2012-01-04 @ 18:13
On Wed, Jan 04, 2012 at 04:45:49PM +1100, Albert Krawczyk wrote:
> I do know some stuff about the USN Journal though from using it, so we can
> talk about a decent API design (especially if somebody knows about inotify)
> so if at some point in future somebody decides to do this the API can work
> with all the change journals taken into the design.
> 
> Out of curiosity does git have such a notification API that can be used? 

git is a program that gets called every time you want to do a git
operation. I'm not sure how USN works, but for inotify, you subscribe
to changes in a set of files. This requires the application to run
continuously, or it won't get any notifications (as there is no
application to send the notifications to). Git doesn't run most of the
time and two 'git status' commands have no relation to each other from
the point of view of the system. For git to be able to use such a
notification system, it would need a daemon (similar to ssh-agent or
gpg-agent) that runs in the background and stores the notification for
when git needs them again. The notifications would make it very
complicated.

libgit2 could keep track, as long as the application runs (which a GUI
probably would), but it would also be complicated, as there are bound
to be a lot of corner cases with no clear solution.

   cmn
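
As a rough illustration of the "keep track while the application runs" idea in the message above, here is a small Python sketch of an in-process accumulator. It is not libgit2 or git code; the class `ChangeTracker` and its methods are hypothetical. A watcher thread (fed by inotify or any other source) would call `record`, and each status call would `drain` the set of paths touched since the previous status. The overflow flag models one assumed corner case: if the notification source reports that events were lost, the next status has to fall back to a full scan.

```python
import threading
from typing import Set, Tuple


class ChangeTracker:
    """Accumulates touched paths between two status calls (hypothetical helper)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._touched: Set[str] = set()
        self._overflowed = False

    def record(self, path: str) -> None:
        """Called by the watcher for every change notification."""
        with self._lock:
            self._touched.add(path)

    def mark_overflow(self) -> None:
        """Called when the watcher knows it may have missed events."""
        with self._lock:
            self._overflowed = True

    def drain(self) -> Tuple[Set[str], bool]:
        """Return (touched paths, need_full_scan) and reset the tracker."""
        with self._lock:
            paths, self._touched = self._touched, set()
            overflowed, self._overflowed = self._overflowed, False
        return paths, overflowed
```

A status call would then re-check only the drained paths unless the second value is true, in which case it would scan the whole working directory as it does today.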

Re: [libgit2] Informing Git Status

From:
Albert Krawczyk
Date:
2012-01-04 @ 22:54
>
> I'm not sure how USN works, but for inotify, you subscribe
> to changes in a set of files. This requires the application to run
> continuously, or it won't get any notifications (as there is no
> application to send the notifications to).
>

Right. USN, on the other hand, is essentially a gigantic log with each file
modification on a partition being logged in it. To see what's changed you
have to provide it the last 'change id' (each file change has a unique
sequential id) and it will provide you a list of all changes (on the entire
drive) that happened since that change id. It's then up to the user to
filter those changes. The user can get the list of files that were
'touched' in a directory. It would then be up to the git status code to see
if there were any actual changes to the file.

In practice it's a fairly fast process; each change record is about 80-90
bytes long.
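
The query model described above can be sketched with a toy in-memory journal in Python. This is not the Windows USN API; `ToyJournal`, `ChangeRecord` and `touched_under` are hypothetical names used only to show the shape of the flow: every change on the "volume" gets a sequential id, the caller asks for everything after the last id it saw, and then filters the result down to one directory.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class ChangeRecord:
    change_id: int  # unique, sequential id of the change
    path: str       # absolute path of the file that was touched


class ToyJournal:
    """In-memory stand-in for a volume-wide change log."""

    def __init__(self) -> None:
        self._records: List[ChangeRecord] = []
        self._next_id = 1

    def log(self, path: str) -> int:
        """Append a change record and return its id."""
        rec = ChangeRecord(self._next_id, path)
        self._records.append(rec)
        self._next_id += 1
        return rec.change_id

    def read_since(self, last_id: int) -> List[ChangeRecord]:
        """Return every record on the whole volume newer than last_id."""
        return [r for r in self._records if r.change_id > last_id]


def touched_under(journal: ToyJournal, last_id: int,
                  directory: str) -> Tuple[Set[str], int]:
    """Filter volume-wide changes down to one directory.

    Returns the touched paths relative to `directory` and the id to use
    as `last_id` on the next call.
    """
    prefix = directory.rstrip("/") + "/"
    records = journal.read_since(last_id)
    paths = {r.path[len(prefix):] for r in records
             if r.path.startswith(prefix)}
    new_last = records[-1].change_id if records else last_id
    return paths, new_last
```

The returned paths are only "touched" candidates; as the message says, the status code would still have to decide whether each file actually changed.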

Re: [libgit2] Informing Git Status

From:
Matthieu Moy
Date:
2012-01-04 @ 23:12
"Carlos Martín Nieto" <carlos@cmartin.tk> writes:

> libgit2 could keep track, as long as the application runs (which a GUI
> probably would), but it would also be complicated, as there are bound
> to be a lot of corner cases with no clear solution.

You can also have a daemon running and keeping track of modifications
compared to last run of "status". Mercurial does something like this
with its inotify extension. AFAIK, no such thing has ever been tried for
Git.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

Re: [libgit2] Informing Git Status

From:
Vicent Marti
Date:
2012-01-04 @ 14:34
On Wed, Jan 4, 2012 at 6:45 AM, Albert Krawczyk
<pro-logic@optusnet.com.au> wrote:
> Out of curiosity does git have such a notification API that can be used?

No, it does not, and unfortunately there doesn't seem to be any
interest in implementing it.