Make better use of OSF capabilities #106
…rojects

It essentially copies and adjusts https://github.com/datalad/git-remote-rclone in that it uses a local repo mirror to push and fetch refs to and from, and uploads a compressed archive to `.git/` of an OSF project that is identified by a URL of type `osf://<projectid>`.

Because request latency is high, the entire repo is represented as two files:

- a small text file listing the refs in the repo
- a 7z archive containing all of the actual content

Here is what it can do:

```
% mkdir newrepo
% cd newrepo
% git init
Initialized empty Git repository in /tmp/newrepo/.git/
% touch some
% git add some
% git commit -m initial
[master (root-commit) c552b2b] initial
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 some
% git remote add osf osf://vtha6
% git push --set-upstream osf master
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Writing objects: 100% (3/3), done.
Building bitmaps: 100% (1/1), done.
Total 3 (delta 0), reused 0 (delta 0)
Computing commit graph generation numbers: 100% (1/1), done.
Upload repository archive
To osf://vtha6
 * [new branch]      master -> master
Branch 'master' set up to track remote branch 'master' from 'osf'.
% cd ..
% git clone osf://vtha6 newrepoclone
Cloning into 'newrepoclone'...
fatal: bad revision 'HEAD'
100%|██████████████████████████████████████████████████| 83.0/83.0 [00:00<00:00, 519kbytes/s]
Downloading repository archive
100%|███████████████████████████████████████████████| 7.99k/7.99k [00:00<00:00, 1.04Mbytes/s]
Extracting repository archive
% git -C newrepoclone log -1 --oneline |cat
c552b2b initial
% git -C newrepo log -1 --oneline |cat
c552b2b initial
```

TODO:

- there is substantial code overlap with https://github.com/datalad/git-remote-rclone that should be refactored, ideally
- there is also some overlap with the special remote implementation
- a `clone` yields an immediate `fatal: bad revision 'HEAD'` output that seems to come before any of this code is executed; no idea where this is coming from
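To illustrate why a small refs listing next to the archive is useful, here is a minimal sketch of comparing two such listings to decide whether the archive needs to be transferred at all. The line format (`<sha> <refname>`) and the function names `parse_refs` and `archive_is_stale` are assumptions made for illustration; they are not taken from this PR's code.

```python
def parse_refs(text):
    """Parse lines of '<sha> <refname>' into a dict mapping refname -> sha.

    The line format is an assumption for this sketch.
    """
    refs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        sha, refname = line.split(maxsplit=1)
        refs[refname] = sha
    return refs


def archive_is_stale(local_refs_text, remote_refs_text):
    """Return True if the two ref listings differ, i.e. a transfer is needed."""
    return parse_refs(local_refs_text) != parse_refs(remote_refs_text)
```

With a check like this, only the small text file has to be fetched to find out whether the (much larger) archive is up to date, which matters when every request is slow.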
and generate a default one, based on dataset ID and root path name
Will auto-add 'DataLad dataset', and the dataset ID to improve searchability on OSF.
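A minimal sketch of the tag handling described above, assuming user-supplied tags are kept and the two automatic tags are appended only if missing. The function name `build_osf_tags` and its signature are hypothetical and not the actual implementation.

```python
def build_osf_tags(dataset_id, user_tags=None):
    """Return user tags plus 'DataLad dataset' and the dataset ID, without duplicates.

    Hypothetical helper for illustration only.
    """
    tags = list(user_tags or [])
    for auto_tag in ("DataLad dataset", dataset_id):
        if auto_tag not in tags:
            tags.append(auto_tag)
    return tags
```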
A project is no more than a node of a particular category. With child-nodes and a more comprehensive use of OSF capabilities the discrepancy between terminologies becomes more and more problematic. This change replaces 'project' with 'node' in all internal places, and the name of the 'project' parameter of the special remote. This is also a sensible move as the created "projects" are actually nodes of type "data" by default (not "project"), and the category is configurable.
This enables storing the entire VCS. This can be combined with any mode of the git-annex special remote, and a publication dependency is set up automatically.
Also improve error handling when no node ID can be determined at all.
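As a rough sketch of what failing early with a clear message could look like, the following extracts the node ID from a URL of type `osf://<nodeid>` and raises if none can be determined. The function name `node_id_from_url` and the error wording are assumptions, not the code from this PR.

```python
from urllib.parse import urlparse


def node_id_from_url(url):
    """Extract the node ID from an 'osf://<nodeid>' URL.

    Raises ValueError with an explicit message when no node ID can be
    determined. Hypothetical helper for illustration only.
    """
    parsed = urlparse(url)
    if parsed.scheme != "osf":
        raise ValueError(f"Not an OSF URL (expected osf://<nodeid>): {url!r}")
    node_id = parsed.netloc
    if not node_id:
        raise ValueError(f"Could not determine an OSF node ID from: {url!r}")
    return node_id
```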
Both tarfile and zipfile in the stdlib support lzma compression. Use them instead of an external dependency on 7z.
This is achieved by replacing the `repo.7z` at the remote end with an LZMA-compressed `repo.zip`. This has two advantages:

- we no longer require users to install third-party tools, but stay within the capabilities of the standard lib
- OSF is capable of inspecting ZIP files, so users have the ability to explore their content, instead of seeing only an opaque blob.
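For reference, writing and reading an LZMA-compressed ZIP with only the standard library can be sketched as follows. The helper names `write_repo_zip` and `read_repo_zip` are made up for this example, and the sketch assumes the archive path lies outside the directory being archived.

```python
import zipfile
from pathlib import Path


def write_repo_zip(repo_dir, archive_path):
    """Pack all files under repo_dir into an LZMA-compressed ZIP archive.

    Assumes archive_path is not located inside repo_dir.
    """
    repo_dir = Path(repo_dir)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_LZMA) as zf:
        for path in sorted(repo_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the repo root, with forward slashes
                zf.write(path, arcname=path.relative_to(repo_dir).as_posix())
    return archive_path


def read_repo_zip(archive_path, target_dir):
    """Extract an archive written by write_repo_zip into target_dir."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target_dir)
```

Note that empty directories are not stored by this sketch, so anything that relies on them would have to be recreated after extraction.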
So it seems that the code has some kind of line-ending issues (oh how I love windows). In the logs I see:
Somehow a carriage return makes it into the stream that fast-export sends to fast-import(?). I failed to discover how and why. I am giving up here. If someone stumbles upon this and can figure out why this is happening on Windows, please share! Thx.
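In case it helps whoever picks this up: a small diagnostic sketch for locating carriage returns in a byte stream that is read in chunks. It is not a fix, the function name `find_carriage_returns` is made up, and it cannot tell a stray CR apart from one that legitimately belongs to file content in the stream.

```python
def find_carriage_returns(chunks):
    """Yield the absolute byte offset of every CR (0x0D) in an iterable of bytes chunks."""
    offset = 0
    for chunk in chunks:
        start = 0
        while True:
            idx = chunk.find(b"\r", start)
            if idx == -1:
                break
            yield offset + idx
            start = idx + 1
        offset += len(chunk)
```

Reading the stream in binary mode is the important part here, since text mode may translate line endings on Windows before they can be inspected.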
Can we get this feature for Linux/OSX anyhow? :-) Or is it a strong blocker if the Windows tests are not passing?
I have no idea what is going on. This needs a Git-Windows person to figure it out.
OK, so the remaining test failures are not unique to this PR.
FTR: I will overhaul the documentation today
Let me know if you need/could use help @adswa
thx much @sappelhoff, I'm still typing, but will create a PR later and appreciate feedback and ideas & commits with improvements
This PR enables all kinds of things. Here is a demo of the most important aspect
and importantly enables this
None of this here is polished or optimized, but it works in principle!
- `create-sibling-osf` to use OSF as annex and Git remote #101
- `datalad clone` it again from an OSF URL #9
- `id` for all datasets)

I would very much appreciate help writing tests for this new functionality and also for adjusting the docs.