Commit ffd67dd

works pretty well
1 parent 245b33a commit ffd67dd

9 files changed

+519
-0
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
/.git_sym

.gitmodules

Whitespace-only changes.

README.md

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
1+
# Why?
2+
The purpose is to separate big-file caching from revision-control. There are several alternatives:
3+
4+
* https://github.com/jedbrown/git-fat
5+
* https://github.com/schacon/git-media
6+
* http://git-annex.branchable.com/
7+
* https://github.com/github/git-lfs
8+
9+
But all those impose the penalty of checksums on the large files. We assert that the large files can be uniquely derived from URLs, versioned in S3 or by filename, etc. We store only symlinks in the git repo.
10+
11+
## Installing
12+
```
13+
ln -sf `pwd`/git_sym.py ~/bin/git-sym
14+
```
15+
16+
## Running
17+
You can test it right here.
18+
```
19+
touch ~/foo
20+
git-sym update links/foo
21+
# or
22+
# python git_sym.py update links/foo
23+
cat links/foo
24+
```
25+
And this should fail:
26+
```
27+
rm -f ~/git_sym_cache/foo ~/foo
28+
git-sym update
29+
```
30+
31+
## Adding your own large files
32+
```
33+
git-sym add large1 large2 large3
34+
git commit -m 'adding links'
35+
git-sym show
36+
```
37+
**git-sym** will choose unique filenames based on checksums. But `git-sym add` is strictly for convenience.
38+
You are free to use your own filenames. Anything symlinked via `GIT_ROOT/.git_sym` will be update-able.
39+
40+
Next, you might want to make those files available to others.
41+
You can then move those files out of GIT_SYM_CACHE_DIR and into Amazon S3, or an ftp site, or wherever.
42+
Just add rules to your `git_sym.makefile`.
43+
44+
## Other useful commands
45+
```
46+
git-sym show -h
47+
git-sym missing -h
48+
git-sym -h
49+
```
50+
51+
# Details
52+
## Typical usage
53+
You will store relative symlinks in your repo. They will point to a unique filename inside `ROOT/.git_sym/`, where ROOT is `../../` etc.
54+
55+
`git-sym update` will search your repo for symlinks (unless you specify them on the command-line). For each, it will execute `ROOT/git_sym.makefile` in your `GIT_SYM_CACHE_DIR` (`~/git_sym_cache` by default). The makefile targets will be the basenames of the symlinks.
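
The search step can be sketched in Python as follows. This is only an illustration, not the code in `git_sym.py`: the function name `find_git_sym_links` is hypothetical, and it assumes that the makefile target for a link is the basename of the path the link points to.

```python
import os


def find_git_sym_links(root):
    """Return sorted (repo-relative link path, makefile target) pairs.

    Assumption: a link is managed by git-sym when its target path has a
    `.git_sym` component, and its makefile target is the target's basename.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        # os.walk lists symlinks to directories under dirnames and all other
        # symlinks (including dangling ones) under filenames, so check both.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)
            if ".git_sym" in os.path.normpath(target).split(os.sep):
                found.append((os.path.relpath(path, root),
                              os.path.basename(target)))
    return sorted(found)
```

Dangling links are reported too, which matters here because a link is expected to be broken until its file has been fetched into the cache.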
56+
57+
If all those files are properly retrieved, then symlinks will be created with the same filenames inside `.git/git_sym`. `ROOT/.git_sym` will point at that. And all other symlinks will point *thru* `ROOT/.git_sym`. Thus, there are three (3) levels of indirection.
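
To make the three levels concrete, here is a minimal sketch that builds the chain for a single file. The helper name `build_link_chain` and the `links/` directory are illustrative assumptions; the real tool may create the links differently.

```python
import os
import tempfile


def build_link_chain(root, cache_dir, name):
    """Create the three symlink levels for one cached file (illustrative)."""
    git_sym_dir = os.path.join(root, ".git", "git_sym")
    os.makedirs(git_sym_dir, exist_ok=True)
    os.makedirs(os.path.join(root, "links"), exist_ok=True)

    # Level 3: .git/git_sym/NAME -> the real file in the cache.
    os.symlink(os.path.join(os.path.abspath(cache_dir), name),
               os.path.join(git_sym_dir, name))
    # Level 2: ROOT/.git_sym -> .git/git_sym. This is the only link that
    # depends on where the git directory actually lives.
    os.symlink(os.path.join(".git", "git_sym"), os.path.join(root, ".git_sym"))
    # Level 1: the versioned, relative symlink, pointing through ROOT/.git_sym.
    link = os.path.join(root, "links", name)
    os.symlink(os.path.join("..", ".git_sym", name), link)
    return link


# Demo in a throwaway directory: the link resolves to the cached file.
with tempfile.TemporaryDirectory() as tmp:
    root, cache = os.path.join(tmp, "repo"), os.path.join(tmp, "cache")
    os.makedirs(root)
    os.makedirs(cache)
    with open(os.path.join(cache, "foo"), "w") as f:
        f.write("big data\n")
    link = build_link_chain(root, cache, "foo")
    print(os.readlink(link), "->", os.path.realpath(link))
```

Only the level-1 link is committed, and it never changes when the cache or the git directory moves.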
58+
59+
## Makefile
60+
Someday, we will offer a plugin architecture. But for now, using a makefile is really very simple. Just create a rule for each unique filename. (You *are* using unique filenames, right?) You can run `wget`, `curl`, `ftp`, `rcp`, `rsync`, `aws-s3-get`, or whatever you want. The retrieval mechanism is decoupled from caching.
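
If you have many files, you could generate the rules instead of writing them by hand. The sketch below is one hypothetical way to do that: `makefile_rules` is not part of **git-sym**, the `example.com` URL is a placeholder, and `curl` is just one possible retrieval command.

```python
def makefile_rules(sources):
    """Render one make rule per cached filename from a {filename: url} dict."""
    lines = []
    for name, url in sorted(sources.items()):
        lines.append("{}:".format(name))
        # Recipe lines must start with a tab. `$@` is the rule's target name,
        # and `$$` keeps any literal `$` in the URL away from make.
        lines.append("\tcurl -fsSL -o $@ '{}'".format(url.replace("$", "$$")))
    return "\n".join(lines) + "\n"


# Placeholder mapping; write the result to git_sym.makefile if it looks right.
print(makefile_rules({"foo": "https://example.com/big/foo"}), end="")
```

Because `git-sym update` runs the makefile inside the cache directory, writing to `$@` puts the download straight into the cache.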
61+
62+
You should try to ensure that you have a rule for every current symlink. Old rules for symlinks no longer in your repo are fine; they are simply ignored.
63+
64+
To test your rules:
65+
```
66+
export GIT_SYM_CACHE_DIR=~/mytest
67+
git-sym missing # should report something
68+
git-sym update
69+
git-sym missing
70+
```
71+
72+
## Other notes
73+
### Cache
74+
**git-sym** sets the mode to read-only for the cached files. These files should never change. You might want to name them after their own checksums. `git-sym add` can help you with that.
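
As a rough sketch of that idea, the snippet below copies a file into a cache directory under a checksum-based name and then clears its write bits. The naming scheme (`<sha1>.<basename>`) and the helper names are assumptions made for illustration; `git-sym add` may choose names differently.

```python
import hashlib
import os
import shutil
import stat


def checksum_name(path, chunk_size=1 << 20):
    """Return '<sha1 hex>.<basename>' for a file (hypothetical naming scheme)."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large files never have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return "{}.{}".format(digest.hexdigest(), os.path.basename(path))


def add_to_cache(path, cache_dir):
    """Copy a file into the cache under its checksum name, then make it read-only."""
    os.makedirs(cache_dir, exist_ok=True)
    dest = os.path.join(cache_dir, checksum_name(path))
    if not os.path.exists(dest):
        shutil.copy2(path, dest)
    # Clear the write bits for user, group, and others.
    mode = os.stat(dest).st_mode
    os.chmod(dest, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
    return dest
```

The checksum is computed once, when the file is added, so later `update` runs do not pay for it.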
75+
### Submodules
76+
If your module can be used as a *submodule*, we cannot point at `.git/git_sym/` directly because for submodules `.git/` is not inside the tree. (The relative symlinks are constant, so they need to work no matter where `.git/` sits.) That is why we have *three* levels of indirection, in case you were wondering. (This is also why **git-annex** *fails* for submodules.)
77+
78+
This is also why we write `ROOT/.git_sym`; it might be a different directory than `.git`.
79+
80+
For submodule support, you will also need this:
81+
```
82+
git config --global alias.gsexec '!exec '
83+
```
84+
We use that to learn the actual location of the `.git/` directory. If it fails, we try the current directory, and if `.git` is not a directory there, we attempt to find it in `../.git/modules/REPO`, where REPO is the name of the root directory. (This can fail in many ways. The alias never fails.)
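
The fallback search (not the alias itself) might look roughly like this. `guess_git_dir` is a hypothetical name, and the sketch assumes REPO is the basename of the root directory.

```python
import os


def guess_git_dir(root):
    """Fallback lookup for the git directory; returns None when nothing fits."""
    root = os.path.abspath(root)
    # Ordinary clone: .git is a real directory inside the tree.
    candidate = os.path.join(root, ".git")
    if os.path.isdir(candidate):
        return candidate
    # Submodule: look in the parent repo's .git/modules/REPO instead.
    # Assumption: REPO is the basename of the root directory.
    repo = os.path.basename(root)
    candidate = os.path.join(os.path.dirname(root), ".git", "modules", repo)
    if os.path.isdir(candidate):
        return candidate
    return None
```

This guess breaks as soon as the submodule is nested more deeply or registered under a different name, which is why the alias is the preferred route.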
85+
86+
We expect you to forget that, so we add that alias to your local repo for you. Believe us: It's a Good Thing.
87+
88+
### .gitignore
89+
Since the intermediate symlink is also in the repo, but points to a changing target, it needs to be listed in `.gitignore`. (That anticipates both accidental `git add` and `git clean`.) We expect you to forget that important rule, so **git-sym** will detect its absence and add it to `.git/info/exclude` instead. No worries.
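
A minimal sketch of that check, assuming the entry is written exactly as `/.git_sym` (as in this repo's own `.gitignore`); the function name `ensure_ignored` is hypothetical:

```python
import os


def ensure_ignored(root, git_dir, entry="/.git_sym"):
    """Append `entry` to GIT_DIR/info/exclude unless it is already ignored.

    Returns True when the exclude file was modified.
    """
    def listed(path):
        if not os.path.isfile(path):
            return False
        with open(path) as f:
            return any(line.strip() == entry for line in f)

    exclude = os.path.join(git_dir, "info", "exclude")
    if listed(os.path.join(root, ".gitignore")) or listed(exclude):
        return False

    os.makedirs(os.path.dirname(exclude), exist_ok=True)
    existing = ""
    if os.path.isfile(exclude):
        with open(exclude) as f:
            existing = f.read()
    with open(exclude, "a") as f:
        # Start on a fresh line if the file does not end with a newline.
        if existing and not existing.endswith("\n"):
            f.write("\n")
        f.write(entry + "\n")
    return True
```

Writing to `info/exclude` keeps the fix local to your clone, so it never shows up as an uncommitted change.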
90+
91+
### Complicated symlinks?
92+
We require a flat directory structure within `.git/git_sym`. If you need more files than your filesystem
93+
can handle, you're Doing It Wrong. Git will slow down anyway.
94+
95+
However, we support symlinked *directories*, which can then be an entire tree in GIT_SYM_CACHE_DIR. That should
96+
satisfy all reasonable use-cases.
97+
98+
# TODO
99+
* git-sym fix -- also fix broken links from moved cache, and missing links in GIT_SYM_DIR
100+
* Try `.gitattributes` instead of `.gitignore`, to avoid problems with `git clean`.
101+
* Add `git-submodule` support, to run `git-sym update` automatically.

experimental.ini

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
1+
[symlinks]
2+
val = links/

git_sym.makefile

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
1+
foo:
2+
	cp -f ~/foo $@
3+
bar:
4+
	cp -f ~/foo $@
