Shepherd is a utility for applying code changes across many repositories.
- Powerful: You can write migration scripts using your favorite Unix commands, tools like jscodeshift, or scripts in your preferred programming language.
- Easy: With just a few commands, you can checkout dozens of repositories, apply changes, commit those changes, and open pull requests with detailed messages.
- Flexible: Ships with support for Git/GitHub, but can easily be extended to work with other version control products like Bitbucket, GitLab, or SVN.
For more high-level context, this blog post covers the basics.
Install the Shepherd CLI:
npm install -g @nerdwallet/shepherd
Shepherd will now be available as the shepherd command in your shell:
shepherd --help
Usage: shepherd [options] [command]
...
Take a look at the tutorial for a detailed walkthrough of what Shepherd does and how it works, or read on for a briefer, higher-level overview.
Moving away from monorepos and monolithic applications has generally been a good thing for developers because it allows them to move quickly and independently from each other. However, it's easy to run into problems, especially if your code relies on shared libraries. Specifically, making a change to shared code and then trying to roll that shared code out to all consumers of that code becomes difficult:
- The person updating that library must communicate the change to consumers of the library
- The consumer must understand the change and how they have to update their own code
- The consumer must make the necessary changes in their own code
- The consumer must test, merge, and deploy those changes
Shepherd aims to help shift responsibility for the first three steps to the person actually making the change to the library. Since they have the best understanding of their change, they can write a code migration to automate that change and then use Shepherd to automate the process of applying that change to all relevant repos. Then the owners of the affected repos (who have the best understanding of their own code) can review and merge the changes. This process is especially efficient for teams who rely on continuous integration: automated tests can help repository owners have confidence that the code changes are working as expected.
A migration is declaratively specified with a shepherd.yml file called a spec. Here's an example of a migration spec that renames .eslintrc to .eslintrc.json in all NerdWallet repositories that have been modified in 2018:
id: 2018.07.16-eslintrc-json
title: Rename all .eslintrc files to .eslintrc.json
adapter:
  type: github
  search_type: code
  search_query: org:NerdWallet path:/ filename:.eslintrc
hooks:
  should_migrate:
    - ls .eslintrc # Check that this file actually exists in the repo
    - git log -1 --format=%cd | grep 2018 --silent # Only migrate things that have seen commits in 2018
  post_checkout: npm install
  apply: mv .eslintrc .eslintrc.json
  pr_message: echo 'Hey! This PR renames `.eslintrc` to `.eslintrc.json`'
Let's go through this line-by-line:
- id specifies a unique identifier for this migration. It will be used as a branch name for this migration, and will be used internally by Shepherd to track state about the migration.
- title specifies a human-readable title for the migration that will be used as the commit message.
- adapter specifies what version control adapter should be used for performing operations on repos, as well as extra options for that adapter. Currently Shepherd only has a GitHub adapter, but you could create a Bitbucket or GitLab adapter if you don't use GitHub. Note that search_query is specific to the GitHub adapter: it uses GitHub's code search qualifiers to identify repositories that are candidates for a migration. If a repository contains a file matching the search, it will be considered a candidate for this migration. As an alternative to search_query, the GitHub adapter can be configured with org: YOURGITHUBORGANIZATION. When using org, every repo in the organization that is visible will be considered a candidate for this migration.
- search_type (optional) specifies the search type, either 'code' or 'repositories'. If 'repositories' is specified, Shepherd performs a GitHub repository search. Defaults to code search if not specified.
The options under hooks specify the meat of a migration. They tell Shepherd how to determine if a repo should be migrated, how to actually perform the migration, how to generate a pull request message for each repository, and more. Each hook consists of one or more standard executables that Shepherd will execute in sequence.
- should_migrate is a sequence of commands to execute to determine if a repo actually requires a migration. If any of them exit with a non-zero value, that signifies to Shepherd that the repo should not be migrated. For instance, the second step in the above should_migrate hook would fail if the repo was last modified in 2017, since grep would exit with a non-zero value.
- post_checkout is a sequence of commands to be executed once a repo has been checked out and passed any should_migrate checks. This is a convenient place to do anything that will only need to be done once per repo, such as installing any dependencies.
- apply is a sequence of commands that will actually execute the migration. This example is very simple: we're just using mv to rename a file. However, this hook could contain arbitrarily many, potentially complex commands, depending on the requirements of your particular migration.
- pr_message is a sequence of commands that will be used to generate a pull request message for a repository. In the simplest case, this can just be a static message, but you could also programmatically generate a message that calls out particular things that might need human attention. Anything written to stdout will be used for the message. If multiple commands are specified, the output from each one will be concatenated together.
should_migrate and post_checkout are optional; apply and pr_message are required.
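As a rough illustration of how the should_migrate checks from the example could be packaged into a script, here is a minimal sketch. It assumes a hypothetical file should_migrate.sh kept alongside the spec; the function name and its arguments are illustrative and not part of Shepherd.

```shell
#!/bin/sh
# Hypothetical helper for a should_migrate hook (not shipped with Shepherd).
# Hooks run with the working directory set to the target repository,
# so the relative path .eslintrc refers to the repo being checked.

# Usage: should_migrate "<last commit date>" "<year>"
# Returns 0 (migrate) only if .eslintrc exists and the date mentions the year.
should_migrate() {
  commit_date="$1"   # e.g. the output of: git log -1 --format=%cd
  year="$2"

  # No .eslintrc means there is nothing to rename, so skip this repo.
  [ -f .eslintrc ] || return 1

  # grep -q exits non-zero when the year does not appear in the date.
  printf '%s\n' "$commit_date" | grep -q "$year"
}
```

A hook could source this file and then call something like should_migrate "$(git log -1 --format=%cd)" 2018, which mirrors the two checks in the spec above.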
Each of these commands will be executed with the working directory set to the target repository. Shepherd exposes some context to each command via specific environment variables. Some additional environment variables are exposed when using the git or github adapters.
- SHEPHERD_REPO_DIR is the absolute path to the repository being operated on. This will be the working directory when commands are executed.
- SHEPHERD_DATA_DIR is the absolute path to a special directory that can be used to persist state between steps. This would be useful if, for instance, a jscodeshift codemod in your apply hook generates a list of files that need human attention and you want to use that list in your pr_message hook.
- SHEPHERD_BASE_BRANCH is the name of the branch Shepherd will set up a pull request against. This will often, but not always, be master. Only available for apply and later steps.
- SHEPHERD_MIGRATION_DIR is the absolute path to the directory containing your migration's shepherd.yml file. This is useful if you want to include a script with your migration spec and need to reference that command in a hook. For instance, if I have a script pr.sh that will generate a PR message, my pr_message hook might look something like this: pr_message: $SHEPHERD_MIGRATION_DIR/pr.sh
- SHEPHERD_GIT_REVISION (git and github adapters) is the current revision of the repository being operated on.
- SHEPHERD_GITHUB_REPO_OWNER (github adapter) is the owner of the repository being operated on. For example, if operating on the repository https://github.com/NerdWalletOSS/shepherd, this would be NerdWalletOSS.
- SHEPHERD_GITHUB_REPO_NAME (github adapter) is the name of the repository being operated on. For example, if operating on the repository https://github.com/NerdWalletOSS/shepherd, this would be shepherd.
Commands follow standard Unix conventions: an exit code of 0 indicates a command succeeded, a non-zero exit code indicates failure.
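To show how these environment variables might fit together, here is a minimal sketch of what a pr.sh script could contain. The function names and the needs-review.txt file name are assumptions made for illustration; only the SHEPHERD_* variables come from Shepherd itself.

```shell
#!/bin/sh
# Hypothetical helpers for the apply and pr_message hooks.
# The file name needs-review.txt is an assumption, not a Shepherd convention.

# Called from an apply step: remember a file that needs manual review.
# SHEPHERD_DATA_DIR persists between steps, so pr_message can read it later.
record_for_review() {
  printf '%s\n' "$1" >> "$SHEPHERD_DATA_DIR/needs-review.txt"
}

# Called from the pr_message hook: whatever is written to stdout
# becomes the pull request message.
pr_message() {
  echo "This PR renames .eslintrc to .eslintrc.json in $SHEPHERD_GITHUB_REPO_OWNER/$SHEPHERD_GITHUB_REPO_NAME."
  list="$SHEPHERD_DATA_DIR/needs-review.txt"
  # Only add the review section if the apply step recorded anything.
  if [ -s "$list" ]; then
    echo
    echo "Files that need human attention:"
    sed 's/^/- /' "$list"
  fi
}
```

Keeping the review list in SHEPHERD_DATA_DIR rather than in the repository means it never ends up in the commit.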
Shepherd is run as follows:
shepherd <command> <migration> [options]
<migration> is the path to your migration directory containing a shepherd.yml file.
There are a number of commands that must be run to execute a migration:
- checkout: Determines which repositories are candidates for migration and clones or updates the repositories on your machine. Clones are "shallow", containing no git history. Uses should_migrate to decide if a repository should be kept after it's checked out.
- apply: Performs the migration using the apply hook discussed above.
- commit: Makes a commit with any changes that were made during the apply step, including adding newly-created files. The migration's title will be prepended with [shepherd] and used as the commit message.
- push: Pushes all commits to their respective repositories.
- pr-preview: Prints the pull request message that would be used for each repository without actually creating a PR; uses the pr_message hook.
- pr: Creates a PR for each repo with the message generated from the pr_message hook.
- version: Prints Shepherd version
By default, checkout will use the adapter to figure out which repositories to check out, and the remaining commands will operate on all checked-out repos. To only checkout a specific repo or to operate on only a subset of the checked-out repos, you can use the --repos flag, which specifies a comma-separated list of repos:
shepherd checkout path/to/migration --repos facebook/react,google/protobuf
Run shepherd --help to see all available commands and descriptions for each one.
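The commands above are meant to be run in order. One way to script that sequence is sketched below. The run_migration function and the SHEPHERD_BIN override are hypothetical conveniences, not part of the Shepherd CLI. The sketch deliberately stops at pr-preview so that the messages can be reviewed before shepherd pr is run by hand.

```shell
#!/bin/sh
# Hypothetical wrapper: run the Shepherd steps for one migration in order.
# SHEPHERD_BIN is an assumed override (for example SHEPHERD_BIN=echo gives
# a dry run that only prints the commands); it defaults to shepherd.

# Usage: run_migration <migration-dir> [extra options passed to every step]
run_migration() {
  migration="$1"
  shift
  for cmd in checkout apply commit push pr-preview; do
    # Stop at the first failing step (non-zero exit code).
    "${SHEPHERD_BIN:-shepherd}" "$cmd" "$migration" "$@" || return 1
  done
}
```

For example, run_migration path/to/migration --repos facebook/react would pass the same --repos filter to every step, after which shepherd pr path/to/migration --repos facebook/react would open the pull request.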
Run yarn to install dependencies.
Shepherd is written in TypeScript, which requires compilation to JavaScript. When developing Shepherd, it's recommended to run yarn build:watch in a separate terminal. This will incrementally compile the source code as you edit it. You can then invoke the Shepherd CLI by referencing the path to the compiled cli.js file:
cd ../my-other-project
../shepherd/lib/cli.js checkout path/to/migration
Shepherd currently has minimal test coverage, but we're aiming to improve that with each new PR. Tests are written with Jest and should live in a *.test.ts file alongside the file under test. To run the test suite, run yarn test.
We use ESLint to ensure a consistent coding style and to help prevent certain classes of problems. Run yarn lint to run the linter, and yarn fix-lint to automatically fix applicable problems.
