Skip to content

Enable thread affinity for life of Repository #203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

dahlbyk
Copy link
Member

@dahlbyk dahlbyk commented Aug 18, 2012

See libgit2/libgit2#624 and libgit2/libgit2#624 (comment)

One caveat: Thread.BeginThreadAffinity() is decorated with SecurityCriticalAttribute:

Requires full trust for the immediate caller. This member cannot be used by partially trusted or transparent code.

@nulltoken
Copy link
Member

@dahlbyk Hmm. Are you sure about this? Is it "safe" or recommended to do it at the repository level?

Although I've never used this pattern, I was rather expecting this Begin/End to embrace each interop call and keep it for the smallest possible duration. Similarly to a lock pattern. Google seems to agree with this.

If you agree with the former statement, in order to reduce the code noise, how about something like the following. It's not a very elegant approach, but I haven't been able to make lambdas play nicely with out parameters.

---
 LibGit2Sharp/Core/NativeMethods.cs |   19 +++++++++++++++++++
 LibGit2Sharp/Repository.cs         |    7 ++++++-
 2 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/LibGit2Sharp/Core/NativeMethods.cs b/LibGit2Sharp/Core/NativeMethods.cs
index 274f2d7..812a9b7 100644
--- a/LibGit2Sharp/Core/NativeMethods.cs
+++ b/LibGit2Sharp/Core/NativeMethods.cs
@@ -4,6 +4,7 @@
 using System.Reflection;
 using System.Runtime.CompilerServices;
 using System.Runtime.InteropServices;
+using System.Threading;
 using LibGit2Sharp.Core.Handles;

 namespace LibGit2Sharp.Core
@@ -788,5 +789,23 @@ public static bool RepositoryStateChecker(RepositorySafeHandle repositoryPtr, Fu

         [DllImport(libgit2)]
         public static extern void git_treebuilder_free(IntPtr bld);
+
+        public static IDisposable ThreadAffinity()
+        {
+            return new DisposableThreadAffinityWrapper();
+        }
+
+        internal class DisposableThreadAffinityWrapper : IDisposable
+        {
+            public DisposableThreadAffinityWrapper()
+            {
+                Thread.BeginThreadAffinity();
+            }
+
+            public void Dispose()
+            {
+                Thread.EndThreadAffinity();
+            }
+        }
     }
 }
diff --git a/LibGit2Sharp/Repository.cs b/LibGit2Sharp/Repository.cs
index 395f5be..4b69aa7 100644
--- a/LibGit2Sharp/Repository.cs
+++ b/LibGit2Sharp/Repository.cs
@@ -45,7 +45,12 @@ public Repository(string path, RepositoryOptions options = null)
         {
             Ensure.ArgumentNotNullOrEmptyString(path, "path");

-            int result = NativeMethods.git_repository_open(out handle, path);
+            int result;
+            using (NativeMethods.ThreadAffinity())
+            {
+                result = NativeMethods.git_repository_open(out handle, path);
+            }
+

             if (result == (int)GitErrorCode.NotFound)
             {
-- 
1.7.8.msysgit.0

This would definitely be a sweet spot for some AOP... Let's hope it's a temporary fix (@carlosmn?).

While we're here, do you think that #172 was also related to some threading issue ("Switching to IntPtr fixes the issue. No idea why it works sometimes...")? How about trying to put back the marshaller directive on the callbacks and check if the thread affinity was the root cause?

As most of the issues are being raised from a Posh-Git context, would it be possible to "script" some scenario which would put the interop layer under some stress? I'd be interesting to make sure that we're actually fixing something ;-)

@dahlbyk
Copy link
Member Author

dahlbyk commented Aug 18, 2012

Are you sure about this?

No, it was just a starting point. I guess it's not clear to me the scope in which thread affinity is important, so I cast a wide net. :) I can try specifically in the scope of the status call (along with rolling back #172) and see if that helps.

I like the IDisposable pattern, but agree it's a bit awkward - maybe a bit of partial application can clean it up. Will see what I can come up with.

@dahlbyk
Copy link
Member Author

dahlbyk commented Aug 18, 2012

As most of the issues are being raised from a Posh-Git context, would it be possible to "script" some scenario which would put the interop layer under some stress?

If I could repro reliably, yes, but I haven't been able to yet.

@carlosmn
Copy link
Member

@nulltoken yeah, this is temporary until we move the control structures to be global rather than thread-local.

While it's not necessarily needed for every call in the repo, we can't simply wrap the native calls in it because all calls that read from a packfile need to happen on the same thread, and we may switch in between them. The idea in the library was this one anyway, anything to do with a repo happens in the same thread. It turns out that's not realistic for most languages/systems, so we'll have to move away from it.

@travisbot
Copy link

This pull request passes (merged eabdbc0 into 64c052e).

@dahlbyk
Copy link
Member Author

dahlbyk commented Aug 18, 2012

So it sounds like repository-wide thread affinity is the way to go? I wonder how that will impact performance...

The update I just pushed to this PR only uses affinity while iterating over status, which is the only place posh-git uses libgit2. I'll try it out locally for a while (and you can too: https://github.com/dahlbyk/posh-git/tree/libgit2) and see the failures persist.

@nulltoken
Copy link
Member

So it sounds like repository-wide thread affinity is the way to go?

Yes, every native method call and their related Ensure.Success() call should be wrapped in a NativeMethods.ThreadAffinity() call (as I think that the error mechanism also relies on thread local storage).

I'll try it out locally for a while (and you can too: https://github.com/dahlbyk/posh-git/tree/libgit2) and see the failures persist.

In progress ;-)

@nulltoken
Copy link
Member

@dahlbyk From what I understand we'll still have to stick with Thread.BeginThreadAffinity() even when libgit2/libgit2#884 is merged. Indeed the error message is stored in thread local storage libgit2 side.

Currently, we're applying the following pattern

int result;
using (NativeMethods.ThreadAffinity())
{
    result = NativeMethods.git_repository_open(out handle, path);
    Ensure.Success(result); // The error message is fetched here is result < 0
}

I was thinking if we could change it to something like this:

ErrorStruct result; // Contains an int and a string fields
result = Proxy.git_repository_open(out handle, path);

Ensure.Success(result);    // Only throws exception

[...]

internal class Proxy 
{
    public ErrorStruct git_repository_open(out RepositoryHandle handle, string path)
    {
        int result;
        using (NativeMethods.ThreadAffinity())
        {
            result = NativeMethods.git_repository_open(out handle, path);
            return BuildErrorStruct(result);     // The error message is fetched here is result < 0
        }
    }
}

Cons:

  • The Proxy class may be a little tedious to write
  • Some method do not return an error code, the Proxy wrapper would only be a pass-through
  • Not very smart code

Pros:

  • All the Thread related boilerplate would live at the same place and not be scattered all over the library.
  • It would keep the calling code as clean as possible
  • Easy to maintain. The pattern is quite easy to apply when a new native method has to be bound

@dahlbyk
Copy link
Member Author

dahlbyk commented Aug 19, 2012

The idea in the library was this one anyway, anything to do with a repo happens in the same thread.

By repository-wide, I meant between Repository construction and destruction (as originally implemented, unless there's more to it). Just using affinity around a native call and its error handling still allows threads to switch through the lifespan of the repo handle.

@nulltoken
Copy link
Member

Just using affinity around a native call and its error handling still allows threads to switch through the lifespan of the repo handle.

I think this is the correct way of handling this. We should only ensure that there's no thread switch between a function call, its potential callbacks and the retrieval of the error message.

@dahlbyk
Copy link
Member Author

dahlbyk commented Aug 20, 2012

One option would be to skip a separate proxy class and just add "improved" methods to NativeMethods in cases where we need to apply thread affinity. The real native method could be private, only exposing the API that should be used. We could also use that opportunity to clean up out params and such, e.g.:

diff --git a/LibGit2Sharp/Core/NativeMethods.cs b/LibGit2Sharp/Core/NativeMethods.cs
index 887ed10..cc1bece 100644
--- a/LibGit2Sharp/Core/NativeMethods.cs
+++ b/LibGit2Sharp/Core/NativeMethods.cs
@@ -695,7 +695,7 @@ public static void git_status_foreach(RepositorySafeHandle repo, status_callback
         }

         [DllImport(libgit2)]
-        public static extern int git_tag_create(
+        private static extern int git_tag_create(
             out GitOid oid,
             RepositorySafeHandle repo,
             [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(Utf8Marshaler))] string name,
@@ -705,6 +705,20 @@ public static void git_status_foreach(RepositorySafeHandle repo, status_callback
             [MarshalAs(UnmanagedType.Bool)]
             bool force);

+        public static GitOid git_tag_create(
+            RepositorySafeHandle repo,
+            [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(Utf8Marshaler))] string name,
+            GitObjectSafeHandle target,
+            SignatureSafeHandle signature,
+            [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(Utf8Marshaler))] string message,
+            [MarshalAs(UnmanagedType.Bool)]
+            bool force)
+        {
+            GitOid oid;
+            Ensure.Success(git_tag_create(out oid, repo, name, target, signature, message, force));
+            return oid;
+        }
+
         [DllImport(libgit2)]
         public static extern int git_tag_create_lightweight(
             out GitOid oid,
diff --git a/LibGit2Sharp/TagCollection.cs b/LibGit2Sharp/TagCollection.cs
index 89d83fb..a6f0d94 100644
--- a/LibGit2Sharp/TagCollection.cs
+++ b/LibGit2Sharp/TagCollection.cs
@@ -89,16 +89,12 @@ public virtual Tag Add(string name, string target, Signature tagger, string mess

             string prettifiedMessage = ObjectDatabase.PrettifyMessage(message);

-            int res;
             using (var objectPtr = new ObjectSafeWrapper(objectToTag.Id, repo))
             using (SignatureSafeHandle taggerHandle = tagger.BuildHandle())
             {
-                GitOid oid;
-                res = NativeMethods.git_tag_create(out oid, repo.Handle, name, objectPtr.ObjectPtr, taggerHandle, prettifiedMessage, allowOverwrite);
+                NativeMethods.git_tag_create(repo.Handle, name, objectPtr.ObjectPtr, taggerHandle, prettifiedMessage, allowOverwrite);
             }

-            Ensure.Success(res);
-
             return this[name];
         }

@nulltoken
Copy link
Member

One option would be to skip a separate proxy class and just add "improved" methods to NativeMethods in cases where we need to apply thread affinity.

I think I prefer to have a second class to host the proxyfying.

We could also use that opportunity to clean up out params and such

Why not. I'm not completely sure if embedding the call to Ensure.Success() in the proxy can be carried away with all the code paths. If GIT_ENOTFOUND is the only special case, we could indeed go this way.

@yorah
Copy link
Contributor

yorah commented Aug 22, 2012

@carlosmn I think I need your advice to check if I understand the current issue correctly.

From what I understand, the libgit2/libgit2#884 makes it OK to call the repo from different threads.
However, on the client side, we need to make sure that when we call giterr_last() after calling a function that returned an error code, we stay on the same thread than this original function call.

Before proceeding to fix this, do you think there is something else that I'm missing?

@carlosmn
Copy link
Member

It looks like you got it. I would qualify the first sentence by saying that you can call it from an arbitrary thread ("different threads" could be interpreted as many threads at the same time, which you shouldn't do).

@dahlbyk
Copy link
Member Author

dahlbyk commented Aug 24, 2012

Thanks for clarifying, @carlosmn.

If you're still game, knock yourself out @yorah. :)

@yorah
Copy link
Contributor

yorah commented Aug 27, 2012

Thanks @carlosmn.

@dahlbyk I'm on it :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants