librelist archives

Fwd: [Abstrakti - Contact] Libgit2 & LibGit2Sharp

From: Emeric Fermas
Date: 2012-03-04 @ 13:34

Hello guys,

Abstrakti recently released Castellum, a Windows tool based on
libgit2sharp/libgit2. This tool automates the migration from
SourceSafe to Git.
Ricardo, the author (cc'ed on this email), agreed to share with us some
issues he encountered.

Cheers,
Em.




---------- Forwarded message ----------
From: Ricardo Drizin <drizin@gmail.com>
Date: Fri, Mar 2, 2012 at 9:08 PM
Subject: Re: [Abstrakti - Contact] Libgit2 & LibGit2Sharp
To: emeric.fermas@gmail.com


Hello Emeric,

Sure, I can share what I've found.

I apologize for not having contributed fixes for those bugs myself, but
I'm not a good C programmer, and my workarounds were so ugly that it
would be a shame :-)
Also, I apologize if my sentence about the bugs sounded rude - of
course that was not my intention.

The first problem is about adding a file with special characters in
its path to the index.
For example, when adding C:\SQL\Funções\Script.sql:
The library finds the file, so it creates the blob with the file
contents, but the file path is added to the index with the wrong
encoding.
(As far as I remember, the blob was being created reading the source
filepath as UTF8, which is ok, but when writing to the index it should
be ISO8859-1 (Latin1)).
The result is that, even after adding the file (with a weird path
encoding), the pending add (unversioned file) is still always shown.
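
To make the mismatch concrete, here is a minimal sketch of how the same
path component differs between the two encodings. The helper name
latin1_to_utf8 is hypothetical and is not part of libgit2; the sketch
only shows that the byte sequences differ, it does not claim which one
the index is supposed to store.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not a libgit2 function: convert a NUL-terminated
 * ISO8859-1 string to UTF-8. The caller must provide room for
 * 2 * strlen(in) + 1 bytes in out. Returns the number of bytes written,
 * not counting the terminating NUL.
 */
static size_t latin1_to_utf8(const char *in, char *out)
{
    const unsigned char *p = (const unsigned char *)in;
    unsigned char *q = (unsigned char *)out;

    for (; *p; p++) {
        if (*p < 0x80) {
            /* ASCII bytes are identical in both encodings. */
            *q++ = *p;
        } else {
            /* 0x80..0xFF map to code points U+0080..U+00FF,
             * which take two bytes in UTF-8. */
            *q++ = (unsigned char)(0xC0 | (*p >> 6));
            *q++ = (unsigned char)(0x80 | (*p & 0x3F));
        }
    }
    *q = '\0';
    return (size_t)(q - (unsigned char *)out);
}
```

For "Funções", the Latin1 form is 7 bytes (ç = 0xE7, õ = 0xF5) and the
UTF-8 form is 9 bytes (ç = 0xC3 0xA7, õ = 0xC3 0xB5), so a byte-wise
comparison of the two paths can never match.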

My ugly workaround was to send the library both the UTF8 and the
ISO8859-1 encodings of the path:
static int index_add(git_index *index, const char *path_UTF8,
        const char *path_ISO88591, int stage, int replace)
{
    git_index_entry *entry = NULL;
    int ret;

    //ret = index_entry_init(&entry, index, path, stage);
    ret = index_entry_init(&entry, index, path_UTF8, stage); //Ricardo Drizin
    if (ret)
        goto err;

    strcpy(entry->path, path_ISO88591); //Ricardo Drizin
    ...
(git_index_add and git_index_append were also changed)
...

and the conversion was done in C# (libgit2sharp):
            int res = NativeMethods.git_index_add(handle, relativePath, relativePath);

...
        [DllImport(libgit2)]
        public static extern int git_index_add(
            IndexSafeHandle index,
            //[MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(Utf8Marshaler))] string path,
            [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(Utf8Marshaler))] string path_UTF8,
            [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(DefaultEncodingMarshaler))] string path_ISO88591,
            int stage = 0);

..
DefaultEncodingMarshaler is similar to Utf8Marshaler, but uses
Encoding.Default (which on my system was Latin1 = ISO8859-1).



The second bug I found was that p_mktemps allows creating only 26
concurrent temporary files (a to z), so it was throwing errors in some
cases. I rewrote it to use _tempnam, which does not have this limit of
26. (Please note that I'm running this on Windows.)
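
As an illustration of why a wider suffix removes the limit, here is a
minimal sketch of a name generator that is not restricted to a single
letter. This is not what _tempnam or libgit2 does; the function
temp_name, the sequence-number parameter and the 6-character suffix
width are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

#define SUFFIX_LEN 6

/*
 * Hypothetical helper: write "<prefix>" followed by seq rendered as
 * SUFFIX_LEN base-36 digits into out. Returns 0 on success, or -1 if
 * out_size is too small to hold the result.
 */
static int temp_name(char *out, size_t out_size, const char *prefix,
                     unsigned long seq)
{
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    size_t plen = strlen(prefix);
    int i;

    if (out_size < plen + SUFFIX_LEN + 1)
        return -1;

    memcpy(out, prefix, plen);
    /* Fill the suffix from the least significant digit backwards. */
    for (i = SUFFIX_LEN - 1; i >= 0; i--) {
        out[plen + i] = digits[seq % 36];
        seq /= 36;
    }
    out[plen + SUFFIX_LEN] = '\0';
    return 0;
}
```

With six base-36 digits there are 36^6 (about 2.1 billion) distinct
names per prefix instead of 26. A generated name is only a candidate:
the caller would still need to create the file exclusively and move on
to the next sequence number if the name is already taken.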

The third problem I found was that the function "loose_backend__write"
does not overwrite a file when the file already exists (which is ok),
but it does not delete the temporary file in this case, so the number
of temporary files just keeps growing.
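
The missing cleanup can be sketched as follows. This is not the actual
body of loose_backend__write; file_exists and finalize_temp are
hypothetical helpers written with standard C I/O only, to show the
"discard the temporary file when the target is already there" step.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if path can be opened for reading. */
static int file_exists(const char *path)
{
    FILE *f = fopen(path, "rb");

    if (!f)
        return 0;
    fclose(f);
    return 1;
}

/*
 * Hypothetical helper: move temp_path into place as final_path.
 * If final_path already exists it is left untouched and the temporary
 * file is deleted instead of being leaked.
 * Returns 0 on success, -1 on failure.
 */
static int finalize_temp(const char *temp_path, const char *final_path)
{
    if (file_exists(final_path))
        return remove(temp_path) == 0 ? 0 : -1;

    return rename(temp_path, final_path) == 0 ? 0 : -1;
}
```

Discarding the temporary file is safe for loose objects because they
are named after the hash of their content, so an existing file with
the same name already holds the same data. The existence check followed
by rename is not atomic; real code would have to account for another
writer creating the target in between.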

Both the second and third problems (above) happened when I tried to
add to the index some files which were already there, and then tried a
commit (which in this case was an empty commit... and I noticed that
on every empty commit the number of temporary files was growing).


The fourth and last problem I found was when committing buffers
(git_filebuf_commit_at):
At random times, the p_rename (from file->path_lock to
file->path_original) fails.
I'm not sure if this is related to the Windows Indexing Service, or
maybe the antivirus (I'm using avast), but it happens randomly in
almost every long-running operation I have tried. (Remember, I'm
migrating large SourceSafe databases to Git.)
My workaround was the ugliest possible:
When the rename fails, I wait 100ms and try renaming again. If it fails
again, I wait 1000ms and try again. If it fails again, I wait 2000ms
and try one last time.
I know this is extremely ugly, but it seemed to fix the problem - I
could run a dozen large migrations and all commits worked fine.
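
The retry schedule described above can be sketched like this. It is a
reconstruction from the description, not the actual patch: the function
rename_with_retry and the rename_fn/sleep_fn callbacks are hypothetical,
and the callbacks exist only so the example stays portable (on Windows
one would presumably pass wrappers around p_rename and Sleep).
flaky_rename and record_sleep are demonstration stand-ins.

```c
#include <stddef.h>

typedef int (*rename_fn)(const char *from, const char *to);
typedef void (*sleep_fn)(unsigned int ms);

/*
 * Hypothetical helper: try the rename once, then retry after waiting
 * 100ms, 1000ms and 2000ms. Returns 0 as soon as one attempt succeeds,
 * or -1 if all four attempts fail.
 */
static int rename_with_retry(const char *from, const char *to,
                             rename_fn do_rename, sleep_fn do_sleep)
{
    static const unsigned int delays_ms[] = { 100, 1000, 2000 };
    size_t i;

    if (do_rename(from, to) == 0)
        return 0;

    for (i = 0; i < sizeof(delays_ms) / sizeof(delays_ms[0]); i++) {
        do_sleep(delays_ms[i]);
        if (do_rename(from, to) == 0)
            return 0;
    }
    return -1;
}

/* Demonstration stand-ins: a rename that fails a set number of times,
 * and a "sleep" that only records how long it was asked to wait. */
static int failures_left;
static unsigned int total_slept_ms;

static int flaky_rename(const char *from, const char *to)
{
    (void)from;
    (void)to;
    if (failures_left > 0) {
        failures_left--;
        return -1;
    }
    return 0;
}

static void record_sleep(unsigned int ms)
{
    total_slept_ms += ms;
}
```

With failures_left set to 2, the call succeeds on the third attempt
after 1100ms of recorded waiting; if the rename keeps failing, the
function gives up after four attempts and 3100ms in total.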


Let me know if you want my sources. I'll be glad to share with you and
with the community.

Regards,
Ricardo Drizin


On Wed, Feb 29, 2012 at 8:14 AM, <drizin@gmail.com> wrote:
>
> Name: emeric.fermas@gmail.com
> Message:
> Hello, It is stated in the product release post of Castellum that "We
> have been suffering for a few weeks hacking into low-level C programming,
> digging into libgit2 and libgit2Sharp plumbings, and fixing a few bugs in
> those libraries.". I happen to contribute to both of those libraries
> (github.com/nulltoken) and would be very thankful if you could share those
> bugs with us. This might give us a chance to fix them in the source,
> rather than compelling you to work around them. Thanks in advance, Cheers,
> Em.

Re: [libgit2] Fwd: [Abstrakti - Contact] Libgit2 & LibGit2Sharp

From: Vicent Marti
Date: 2012-03-04 @ 14:51

This is brilliant, Em.

Thanks for taking the time to reach out to the Castellum guys. I'm
guessing they're not CC'ed on this list, but thanks to them regardless
for the extensive bug report.

Most of these issues we're well aware of -- I think the issue with
renaming has been fixed in the development branch. I've just opened an
issue with these details so the other Windows guys can chime in.

Let's keep this discussion in https://github.com/libgit2/libgit2/issues/584

Cheers,
Vicent
