Status of push

From:
Josh Bleecher Snyder
Date:
2012-03-09 @ 23:57
Hi all,

I'm sure you're sick of this question, but I was hoping to get a quick
status update on pushing to a remote. I'd love to use libgit2 for my
project, and this is the only missing piece for me right now...

Thanks!

-josh

Re: [libgit2] Status of push

From:
Vicent Marti
Date:
2012-03-10 @ 01:44
Hey Josh,

unfortunately we're still delayed on push. We're hoping that this
year's Summer of Code student will finish it -- it's quite a bit of
work, because it has pack-objects as a prerequisite.

You're of course welcome to help ;)

Cheers,
Vicent


Re: [libgit2] Status of push

From:
Josh Bleecher Snyder
Date:
2012-03-10 @ 02:03
Hi Vicent,

Thanks for the update. I'd be interested in helping, but I would be
coming in completely cold, both to this project and generally to git
internals.

Are there any relevant entry-level chunks I could tackle to get up to
speed? Alternatively, is there a write-up anywhere of what needs to
get done that I could use to orient myself?

Feel free to reply by simply filing issues and pointing me at them. :)

Cheers,
Josh



Re: [libgit2] Status of push

From:
Matias Piipari
Date:
2012-03-10 @ 02:17
I was going to ask exactly the same question. Would be very willing, but
probably not very able, to help by contributing any entry level parts of
remote support.

Matias


Re: [libgit2] Status of push

From:
Sean M. Collins
Date:
2012-03-10 @ 05:10
Hi,

On Fri, Mar 09, 2012 at 06:03:00PM -0800, Josh Bleecher Snyder wrote:
> Are there any relevant entry-level chunks I could tackle to get up to
> speed? Alternatively, is there a write-up anywhere of what needs to
> get done that I could use to orient myself?

These links helped me:

* Git Concepts (http://schacon.github.com/git/user-manual.html#git-concepts)
* Git magic - Secrets Revealed 
(http://www-cs-students.stanford.edu/~blynn/gitmagic/ch08.html)
* Hacking Git (http://schacon.github.com/git/user-manual.html#hacking-git)

I highly recommend what the "Hacking Git" link suggests: Check out the
first version of Git and read through some of the code. You'll discover
that the push / pull functionality for git started out as a script
that would rsync the objects directory 
(https://github.com/gitster/git/blob/839a7a06f35bf8cd563a41d6db97f453ab108129/git-pull-script)

Re: [libgit2] Status of push

From:
Jonathan Nieder
Date:
2012-03-10 @ 05:27
Sean M. Collins wrote:

> I highly recommend what the "Hacking Git" link suggests: Check out the
> first version of Git and read through some of the code. You'll discover
> that the push / pull functionality for git started out as a script
> that would rsync the objects directory

Documentation/technical/pack-protocol.txt and early historical
versions of pack-objects.c and unpack-objects.c found by

  git log -- pack-objects.c unpack-objects.c

might be useful as well.

Re: [libgit2] Status of push

From:
Sean M. Collins
Date:
2012-03-10 @ 05:39
On Fri, Mar 09, 2012 at 11:27:57PM -0600, Jonathan Nieder wrote:
> Documentation/technical/pack-protocol.txt and early historical
> versions of pack-objects.c and unpack-objects.c found by
> 
>   git log -- pack-objects.c unpack-objects.c
> 
> might be useful as well.

Indeed. Commit c323ac7 in git has a pretty good commit message 
from Linus that describes the overall idea.

-- 
Sean M. Collins

Re: [libgit2] Status of push

From:
Matias Piipari
Date:
2012-03-10 @ 13:26
Thanks Sean and Josh

Thanks for the pointers, most helpful! I've poked a bit around in git and
libgit2 source. Do I understand the high level of the problem with libgit2
correctly: the equivalent of pack-objects, which creates the packaged
representations, is missing but unpack-objects is already available?
Similarly, delta computing from git's delta.c that pack-objects requires,
is missing, but applying a delta is available?

Matias

(Please be gentle: this is all new to me.)



Re: [libgit2] Status of push

From:
Vicent Marti
Date:
2012-03-11 @ 00:16
On Sat, Mar 10, 2012 at 2:26 PM, Matias Piipari
<matias.piipari@gmail.com> wrote:
> Do I understand the high level of the problem with libgit2
> correctly: the equivalent of pack-objects, which creates the packaged
> representations, is missing but unpack-objects is already available?
> Similarly, delta computing from git's delta.c that pack-objects requires, is
> missing, but applying a delta is available?

Yep, that's exactly it! The code to actually build packfiles is
currently missing from the library, and it'd be a great starting point
to get Push up to speed.
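
As a rough illustration of the "applying a delta" half that is already available, the sketch below follows the delta layout used by git's patch-delta.c: two variable-length sizes (base and result), then a stream of copy-from-base and insert-literal commands. It is not libgit2's code; read_varint and apply_delta are hypothetical names used only for this example.

```python
def read_varint(delta, pos):
    """Read a little-endian base-128 integer; returns (value, next_pos)."""
    value, shift = 0, 0
    while True:
        byte = delta[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def apply_delta(base, delta):
    """Rebuild the target object from a base object and a delta stream."""
    src_size, pos = read_varint(delta, 0)
    dst_size, pos = read_varint(delta, pos)
    if src_size != len(base):
        raise ValueError("delta does not match this base object")

    out = bytearray()
    while pos < len(delta):
        cmd = delta[pos]
        pos += 1
        if cmd & 0x80:
            # Copy command: the low bits say which offset/size bytes follow.
            offset = size = 0
            for i in range(4):
                if cmd & (1 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):
                if cmd & (0x10 << i):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            if size == 0:
                size = 0x10000  # a zero size means 64 KiB
            if offset + size > len(base):
                raise ValueError("copy runs past the end of the base")
            out += base[offset:offset + size]
        elif cmd:
            # Insert command: the next `cmd` bytes are literal data.
            out += delta[pos:pos + cmd]
            pos += cmd
        else:
            raise ValueError("opcode 0 is reserved")

    if len(out) != dst_size:
        raise ValueError("result size does not match the delta header")
    return bytes(out)


# Copy "hello " from the base, then insert "there".
delta = bytes([11, 11, 0x90, 6, 5]) + b"there"
print(apply_delta(b"hello world", delta))  # b'hello there'
```

Computing such a delta (choosing the copy ranges) is the part that is still missing, and it is the harder direction.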

> (Please be gentle: this is all new to me.)

I have unlimited patience for people willing to throw us a hand.

Cheers,
Vicent

Re: [libgit2] Status of push

From:
Josh Bleecher Snyder
Date:
2012-03-12 @ 20:27
>> Do I understand the high level of the problem with libgit2
>> correctly: the equivalent of pack-objects, which creates the packaged
>> representations, is missing but unpack-objects is already available?
>> Similarly, delta computing from git's delta.c that pack-objects requires, is
>> missing, but applying a delta is available?
>
> Yep, that's exactly it! The code to actually build packfiles is
> currently missing from the library, and it'd be a great starting point
> to get Push up to speed.

I've just worked through all the resources -- thanks to everyone who
sent them along, very helpful.

It looks like the patch-delta stuff is, strictly speaking, optional
when writing a pack-objects implementation. For a first pass, all
objects could be put into the pack file directly (without any of them
in a delta representation), and the delta calculations could be added
as an optimization. Does that sound right? (I'm also surprised that
there aren't multiple delta strategies aimed at different file types.)

Matias, where are you with this? I don't want to duplicate work or
step on your toes. I could get started on this this week, most
likely...

Duke, same question. Let us know if/when you write any tests. Failing
tests are a wonderful place to start. :)

Perhaps we should start a libgit2 fork for coordinating work on push?
Or is there a better way to coordinate?

-josh

Re: [libgit2] Status of push

From:
Matias Piipari
Date:
2012-03-12 @ 21:31
Hi Josh

Re: packing full objects, it was my understanding from the documentation too
that one could simply create packfiles consisting of the types other than
the two delta representations to get started. That part seems fairly
straightforward.

The delta computing part looks more involved. diff-delta.c in git.git says
it's 'greatly inspired by parts of LibXDiff' (an LGPL'd library available at
http://www.xmailserver.org/xdiff-lib.html). I'm not sure from this wording
whether one could start by lifting the code from LibXDiff and use it to
generate something that libgit2 is able to unpack successfully; I didn't
really have the time to try this out. I can try, though, if there's some
reason to suggest it would work, and if it would be legal to lift LGPL code
from LibXDiff into libgit2, which is GPL with a linking exception. The other
problem with the delta representations is of course choosing the right base
object against which to compute each delta. For this I have some ideas for
simple heuristics, from reading the source and the material people have
pointed out, and would be happy to contribute some time and code if the
delta computation itself can be sorted out. Re-implementing the delta code
from scratch does not sound appealing.

Also came to my mind that alternative delta formats would be something
interesting to look into as a little extension (not for a general purpose
normal git repo of course), especially as the git packfile entry type field
has two unused bit combinations :-) Have been experimenting a bit with
binary delta compression with open-vcdiff and hacking together something
based on it that creates tight binary deltas with rather a small memory
footprint.

Re: duplicating work & my toes, very happy to take your lead Josh on a
joint fork. Let me know if you have further ideas of how to coordinate
this, again very open to suggestions and can be contacted via mail or IM at
this address too. I'm a noob with the delta compression algorithmic work
and busy at least until after NSConference next week, so more than happy to
be told what to do rather than try to start this from a blank slate. Can
contribute some help and feedback with designing an API and of course some
failing tests too. Also can put together Objective-C bindings to the object
packing functionality into Objective-Git, although I suppose this last
point is outside of the remit of this list.

Matias


Re: [libgit2] Status of push

From:
Josh Bleecher Snyder
Date:
2012-03-13 @ 03:19
> Re: packing full objects, it was my understanding from the documentation too
> that one could simply create packfiles consisting of the types other than
> the two delta representations to get started. That part seems fairly
> straightforward.

Yep. Let's start there.
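
A first pass along these lines could look roughly like the sketch below. It assumes the version 2 pack layout from git's Documentation/technical/pack-format.txt: a 12-byte header ("PACK", version, object count), one zlib-compressed entry per object, and a SHA-1 trailer over everything before it. The names build_pack and encode_entry_header are hypothetical, not a proposed libgit2 API, and no delta entries are produced.

```python
import hashlib
import struct
import zlib

# Undeltified entry types from the pack format.
OBJ_COMMIT, OBJ_TREE, OBJ_BLOB, OBJ_TAG = 1, 2, 3, 4


def encode_entry_header(obj_type, size):
    """Encode the type and uncompressed size that precede each entry."""
    # First byte: 3 type bits plus the low 4 bits of the size.
    byte = (obj_type << 4) | (size & 0x0F)
    size >>= 4
    out = bytearray()
    while size:
        out.append(byte | 0x80)  # MSB set: more size bytes follow
        byte = size & 0x7F       # next 7 bits of the size
        size >>= 7
    out.append(byte)
    return bytes(out)


def build_pack(objects):
    """Build a pack from (type, raw_data) pairs, storing every object whole."""
    pack = bytearray(b"PACK" + struct.pack(">II", 2, len(objects)))
    for obj_type, data in objects:
        pack += encode_entry_header(obj_type, len(data))
        pack += zlib.compress(data)
    pack += hashlib.sha1(pack).digest()  # trailer: checksum of all of the above
    return bytes(pack)


pack = build_pack([(OBJ_BLOB, b"hello\n")])
print(len(pack), pack[:4])
```

Object order here is simply the order given by the caller, which keeps the ordering decision outside the writer.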


> The delta computing part looks more involved.
> [...]
> Re-implementing the delta itself from scratch does not sound
> appealing.

Indeed! Particularly when you read this:

http://schacon.github.com/git/technical/pack-heuristics.txt

Ugh.

Some of that also (in theory) applies to just building the pack-file,
but I think we can leave fine-tuning the object ordering as a todo in
the short term. Thinking: (1) The initial purpose is push, not gc, so
performant random object access isn't as critical -- the remote is
just going to unpack our packfile anyway. (2) Picking the ideal object
ordering should be pretty pluggable. We should just make sure we have
enough information at hand to make the right decisions later, and put
in hooks as we go.


> Re: duplicating work & my toes, very happy to take your lead Josh on a
> joint fork.

Ok, let's give that a try; and if it's not working, we'll retrench.
I've made a fork at https://github.com/josharian/libgit2, with the
primary branch being packfile (branched off development). I've added
you (@mz2, right?) as a collaborator.


> Let me know if you have further ideas of how to coordinate this, again
> very open to suggestions and can be contacted via mail or IM at this address
> too.

Unless/until the others object, let's keep our communication on-list
-- that'll give everyone a chance to chime in and keep us on the right
track. :) If you do need to get ahold of me directly, mail and IM both
work for this address.


> I'm a noob with the delta compression algorithmic work and busy at
> least until after NSConference next week, so more than happy to be told what
> to do rather than try to start this from a blank slate. Can contribute some
> help and feedback with designing an API and of course some failing tests
> too.

Agreed, first step is putting together an API, and some very basic
failing tests. I have more background in Objective-C and Python than
in C, so feedback will be most welcome. I'll spend a bit of time
looking through the rest of libgit2's API and plan to send along an
API proposal for feedback this week.


> Also can put together Objective-C bindings to the object packing
> functionality into Objective-Git, although I suppose this last point is
> outside of the remit of this list.

Let's tackle that one later. :)

-josh





Re: [libgit2] Status of push

From:
Vicent Marti
Date:
2012-03-13 @ 14:25
On Tue, Mar 13, 2012 at 4:19 AM, Josh Bleecher Snyder
<josharian@gmail.com> wrote:
> Some of that also (in theory) applies to just building the pack-file,
> but I think we can leave fine-tuning the object ordering as a todo in
> the short term. Thinking: (1) The initial purpose is push, not gc, so
> performant random object access isn't as critical -- the remote is
> just going to unpack our packfile anyway. (2) Picking the ideal object
> ordering should be pretty pluggable. We should just make sure we have
> enough information at hand to make the right decisions later, and put
> in hooks as we go.

I very much like this approach! Make sure to design the packing API so
that the packing ordering heuristics can be eventually plugged in
without rewriting much/any code. They are going to be critical,
eventually.
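
One way to keep the ordering pluggable is to pass it in as a function, as in the sketch below. Everything here is hypothetical: PackCandidate, plan_pack and both orderings are illustration only, and type_path_size_order only loosely echoes the type/name/size sorting discussed in pack-heuristics.txt; it is not git's actual heuristic.

```python
from typing import Callable, Iterable, List, NamedTuple


class PackCandidate(NamedTuple):
    """What the packer knows about an object before writing it (assumed fields)."""
    oid: str
    obj_type: int
    path: str
    size: int


Ordering = Callable[[List[PackCandidate]], List[PackCandidate]]


def insertion_order(candidates: List[PackCandidate]) -> List[PackCandidate]:
    """Default for a first pass: write objects in the order they were added."""
    return list(candidates)


def type_path_size_order(candidates: List[PackCandidate]) -> List[PackCandidate]:
    """Group by type, then path, with larger objects first within a group."""
    return sorted(candidates, key=lambda c: (c.obj_type, c.path, -c.size))


def plan_pack(candidates: Iterable[PackCandidate],
              ordering: Ordering = insertion_order) -> List[str]:
    """Return object ids in the order they should be written to the pack."""
    return [c.oid for c in ordering(list(candidates))]


candidates = [
    PackCandidate("b", 3, "src/a.c", 10),
    PackCandidate("a", 1, "", 200),
    PackCandidate("c", 3, "src/a.c", 50),
]
print(plan_pack(candidates))                        # ['b', 'a', 'c']
print(plan_pack(candidates, type_path_size_order))  # ['a', 'c', 'b']
```

Recording path and size on each candidate up front is what leaves room for smarter heuristics later without changing the writer.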

>
>> Re: duplicating work & my toes, very happy to take your lead Josh on a
>> joint fork.
>
> Ok, let's give that a try; and if it's not working, we'll retrench.
> I've made a fork at https://github.com/josharian/libgit2, with the
> primary branch being packfile (branched off development). I've added
> you (@mz2, right?) as a collaborator.

Brilliant! I'll keep an eye on this and see if I find any issues as you work.


>> Let me know if you have further ideas of how to coordinate this, again
>> very open to suggestions and can be contacted via mail or IM at this address
>> too.

As a reminder, using the code from core Git as an inspiration is
always a good idea. There are no licensing problems, because we've
explicitly asked for permission -- just be careful when copying &
pasting given that the core Git code is not reentrant at all, and
we're working on a library here.

Anyway, can't wait to see what you guys come up with.

Cheers,
Vicent

Re: [libgit2] Status of push

From:
Jonathan "Duke" Leto
Date:
2012-03-11 @ 17:16
Howdy fellow libgit2-hackers,

> Yep, that's exactly it! The code to actually build packfiles is
> currently missing from the library, and it'd be a great starting point
> to get Push up to speed.

This discussion really helped me understand the current status of libgit2 push.
I am very interested in helping test the creation of packfiles.

> I have unlimited patience for people willing to throw us a hand.

Is it still the case that all new tests should be written using clay and that
we are still transitioning the old test suite? I have been out of the loop for
a bit.

Duke

-- 
Jonathan "Duke" Leto <jonathan@leto.net>
Leto Labs LLC
209.691.DUKE // http://labs.leto.net
NOTE: Personal email is only checked twice a day at 10am/2pm PST,
please call/text for time-sensitive matters.

Re: [libgit2] Status of push

From:
Vicent Marti
Date:
2012-03-11 @ 17:46
On Sun, Mar 11, 2012 at 6:16 PM, Jonathan "Duke" Leto <jonathan@leto.net> wrote:
> Is it still the case that all new tests should be written using clay and that
> we are still transitioning the old test suite? I have been out of the loop for
> a bit.

Yep. We haven't written any new tests for the old suite in a while.

Re: [libgit2] Status of push

From:
Matias Piipari
Date:
2012-03-11 @ 19:56
Cool!

I've been reading about the packfile format from the resources you guys
pointed out and this book (http://book.git-scm.com/7_the_packfile.html),
and dicked around with a hex editor with some pack files from repos on my
disk. I think I got the idea of how to put together the header to the
packfile and all the undeltified pack entry types. A question based on
reading:

The diagram in the Git book (link above) has an example of working out the
uncompressed data size for an entry, and it shows 0010010 0000 as the bits
that encode the length information. I can see how that would be the case
from the example given and the rules listed on the page and in
Documentation/technical/pack-format.txt. However, the book suggests this
means a 144 byte length. If I interpret this sequence of bits as a
big-endian integer, I get 288, not 144. Shifting by one bit to right then
would obviously give 144 as noted in the book. I'm wondering if I'm
misunderstanding something very simple here?




Re: [libgit2] Status of push

From:
Vicent Marti
Date:
2012-03-12 @ 18:43
On Sun, Mar 11, 2012 at 8:56 PM, Matias Piipari
<matias.piipari@gmail.com> wrote:
> The diagram in the Git book (link above) has an example of working out the
> uncompressed data size for an entry, and it shows 0010010 0000 as the bits
> that encode the length information. I can see how that would be the case
> from the example given and the rules listed on the page and in
> Documentation/technical/pack-format.txt. However, the book suggests this
> means a 144 byte length. If I interpret this sequence of bits as a
> big-endian integer, I get 288, not 144. Shifting by one bit to right then
> would obviously give 144 as noted in the book. I'm wondering if I'm
> misunderstanding something very simple here?

I've done the math several times, and I'm also getting 288. It's
probably a mistake Scott made on the picture. ^^
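
The arithmetic can be checked with a short sketch of the entry-header decoding described in Documentation/technical/pack-format.txt. The two example bytes are an assumption: they are built from the bit groups quoted above (low four bits 0000 in the first byte, then the seven bits 0010010 in the second) with a commit type picked arbitrarily, not copied from the book. decode_entry_header is a hypothetical name, not a libgit2 function.

```python
def decode_entry_header(data, offset=0):
    """Decode a pack entry header; returns (type, size, next_offset)."""
    byte = data[offset]
    offset += 1
    obj_type = (byte >> 4) & 0x07  # bits 6-4 of the first byte
    size = byte & 0x0F             # bits 3-0: lowest 4 bits of the size
    shift = 4
    while byte & 0x80:             # MSB set: another size byte follows
        byte = data[offset]
        offset += 1
        size |= (byte & 0x7F) << shift  # each extra byte adds 7 higher bits
        shift += 7
    return obj_type, size, offset


# First byte:  1 001 0000  (continue, type 1, low size bits 0000)
# Second byte: 0 0010010   (stop, next seven size bits)
header = bytes([0b10010000, 0b00010010])
print(decode_entry_header(header))  # (1, 288, 2)
```

The seven bits from the second byte sit above the four bits from the first, so the size is 0b0010010 << 4 = 18 * 16 = 288, which matches the figure above rather than 144.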