Commit 2608473
gustavoavena authored and facebook-github-bot committed
Create gitexport CLI
Summary:
Creating an initial version of the gitexport CLI (for more context on why we need it, see T160586594).

This tool is supposed to take a repo and a list of paths as input and it should export all the history of those paths in a git repo.

## What does it do now?

Currently, this binary doesn't do anything useful. It just gets the history of a single path to be exported and prints the changeset ids and commit messages (for manual debugging).

The main point of this diff is to **set most of the structure/flow of the tool to get some early feedback** before I start implementing anything more complex.

Most of the functions don't have an actual implementation, but just do something simple (e.g. returning the first element of a vector) so it typechecks.

## What's my current plan?

1) Get the history of all the given paths. (This is mostly done in this diff already)
2) Merge the changesets into a single, topologically sorted, list of changesets
3) Strip irrelevant changes from every commit (T161205476).
4) Create a CommitGraph from this list (T161204758).
5) Export that CommitGraph to a new, temporary, Mononoke repo (T160787114).
6) Use existing tools to export that temporary repo to a git repo (T160787114).

The tricky bits are steps 2, 3 and 4, which is where I expect to spend most of my time.

First, I'm not sure if I even need to create a CommitGraph at all to be able to export the processed changesets to a new repo.

If I do need to, I'm not sure (a) if I should strip the irrelevant file changes before or after creating the graph and (b) how to create a new repo and populate it with the commits from the graph I created.

(b) is more of an implementation detail, so I'm not worrying about it now...

The main unknowns for me are #2 and #4. Basically, how can I create a proper commit graph from a set of commits that are not direct descendants of each other.

Assuming a linear history, I don't think it would be very complicated, but we also have to support branching, so I'm not sure how to do this efficiently...

## Examples

Let me put a simple example below. Commits with uppercase letters are relevant (i.e. should be exported) and commits with lowercase letters are not.

```
A -> b -> C -> D -> e
|-> f -> G
```

In this case, I want to have the following commit graph in the end:

```
A' -> C' -> D'
|-> G'
```

where X' is X stripped of irrelevant changes

## RFC

- This is my first Rust diff ever, so please LMK what horrible things I'm doing, bc I'm very likely doing a few 😂
- Does the plan I described above make sense?
- Any suggestions/ideas on how to efficiently stitch the changesets together would be appreciated! But I'll probably set up some time to discuss this problem specifically once I spend more time thinking about it...

## Next steps

- Implement steps #5 and #6 (T160787114) to get the entire E2E solution working with the simplest case (i.e. one path with linear history). This is basically exporting the commit graph to a git repo (maybe through a temporary Mononoke repo).
- Update integration test case to actually run and test the tool with the simple case.
- Figure out how to properly create a commit graph from a list of changeset lists.
- Add test cases for multiple paths and edge cases, like having multiple branches.

Reviewed By: RajivTS

Differential Revision: D48226070

fbshipit-source-id: eed970a8e4697ab10682e3b93863e6d621adaacc
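One possible way to approach the open question in the Examples section (building a graph from commits that are not direct descendants of each other) is to walk parent edges from each relevant commit and stop at the first relevant ancestor on every path. The sketch below is only an illustration on plain string ids, not Mononoke's `CommitGraph` API. The function names `contract_graph` and `example_graph` are hypothetical, and because the flattened diagram does not pin down where the `f` branch forks, the example assumes it forks from `b`.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// For every relevant commit, find its nearest relevant ancestors by walking
/// parent edges and skipping over irrelevant commits.
fn contract_graph<'a>(
    parents: &HashMap<&'a str, Vec<&'a str>>,
    relevant: &HashSet<&'a str>,
) -> HashMap<&'a str, BTreeSet<&'a str>> {
    let mut contracted = HashMap::new();
    for &commit in relevant {
        let mut nearest = BTreeSet::new();
        let mut seen = HashSet::new();
        // Start the walk from the direct parents of the relevant commit.
        let mut stack: Vec<&'a str> = parents.get(commit).cloned().unwrap_or_default();
        while let Some(current) = stack.pop() {
            if !seen.insert(current) {
                continue; // already visited through another path
            }
            if relevant.contains(current) {
                // Stop here: older commits are reachable through `current`.
                nearest.insert(current);
            } else if let Some(grandparents) = parents.get(current) {
                // Irrelevant commit: keep walking towards the roots.
                stack.extend(grandparents.iter().copied());
            }
        }
        contracted.insert(commit, nearest);
    }
    contracted
}

/// The example from the commit message, with the fork point assumed to be `b`.
fn example_graph() -> HashMap<&'static str, BTreeSet<&'static str>> {
    let parents = HashMap::from([
        ("A", vec![]),
        ("b", vec!["A"]),
        ("C", vec!["b"]),
        ("D", vec!["C"]),
        ("e", vec!["D"]),
        ("f", vec!["b"]), // assumption: the branch forks from `b`
        ("G", vec!["f"]),
    ]);
    let relevant = HashSet::from(["A", "C", "D", "G"]);
    contract_graph(&parents, &relevant)
}
```

Under that assumption, `example_graph()` gives `C` and `G` the parent `A`, and `D` the parent `C`, which matches the shape of the target graph. With merge commits, this walk can return a parent that is itself an ancestor of another returned parent, so a real implementation would likely need an extra pruning pass.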
1 parent a06514c commit 2608473

4 files changed, +215 -0 lines changed

eden/mononoke/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -282,6 +282,7 @@ members = [
   "git/git_types",
   "git/git_types/if",
   "git/git_types/if/types",
+  "git/gitexport",
   "git/gitimport",
   "git/import_direct",
   "git/import_tools",
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# @generated by autocargo
+
+[package]
+name = "gitexport_tools"
+version = "0.1.0"
+authors = ["Facebook"]
+edition = "2021"
+license = "GPLv2+"
+
+[lib]
+path = "src/gitexport_tools/lib.rs"
+
+[[bin]]
+name = "gitexport"
+path = "src/main.rs"
+
+[dependencies]
+anyhow = "1.0.71"
+bookmarks_types = { version = "0.1.0", path = "../../bookmarks/bookmarks_types" }
+clap = { version = "4.3.5", features = ["derive", "env", "string", "unicode", "wrap_help"] }
+commit_graph = { version = "0.1.0", path = "../../repo_attributes/commit_graph/commit_graph" }
+fbinit = { version = "0.1.2", git = "https://github.com/facebookexperimental/rust-shed.git", branch = "main" }
+futures = { version = "0.3.28", features = ["async-await", "compat"] }
+in_memory_commit_graph_storage = { version = "0.1.0", path = "../../repo_attributes/commit_graph/in_memory_commit_graph_storage" }
+mononoke_api = { version = "0.1.0", path = "../../mononoke_api" }
+mononoke_app = { version = "0.1.0", path = "../../cmdlib/mononoke_app" }
+mononoke_types = { version = "0.1.0", path = "../../mononoke_types" }
+repo_authorization = { version = "0.1.0", path = "../../repo_authorization" }
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
+/*
+ * Copyright (c) Meta Platforms, Inc. and affiliates.
+ *
+ * This software may be used and distributed according to the terms of the
+ * GNU General Public License version 2.
+ */
+
+use std::sync::Arc;
+
+use commit_graph::CommitGraph;
+use futures::future::try_join_all;
+use futures::TryStreamExt;
+use in_memory_commit_graph_storage::InMemoryCommitGraphStorage;
+use mononoke_api::changeset_path::ChangesetPathHistoryContext;
+use mononoke_api::changeset_path::ChangesetPathHistoryOptions;
+pub use mononoke_api::BookmarkFreshness;
+use mononoke_api::ChangesetContext;
+use mononoke_api::MononokeError;
+use mononoke_api::MononokePath;
+use mononoke_types::RepositoryId;
+
+/// Given a list of paths and a changeset, return a commit graph
+/// containing only commits that are ancestors of the changeset and have
+/// modified at least one of the paths.
+pub async fn build_partial_commit_graph_for_export<P>(
+    paths: Vec<P>,
+    chgset_ctx: ChangesetContext,
+) -> Result<CommitGraph, MononokeError>
+where
+    P: TryInto<MononokePath>,
+    MononokeError: From<P::Error>,
+{
+    let mononoke_paths = paths
+        .into_iter()
+        .map(|path| path.try_into())
+        .collect::<Result<Vec<MononokePath>, _>>()?;
+
+    let chgset_path_hist_ctxs: Vec<ChangesetPathHistoryContext> = chgset_ctx
+        .paths_with_history(mononoke_paths.clone().into_iter())
+        .await?
+        .try_collect()
+        .await?;
+
+    // Get each path's history as a vector of changesets
+    let history_changesets: Vec<Vec<ChangesetContext>> = try_join_all(
+        try_join_all(
+            chgset_path_hist_ctxs
+                .iter()
+                // TODO(T160600443): support other ChangesetPathHistoryOptions
+                .map(|csphc| csphc.history(ChangesetPathHistoryOptions::default())),
+        )
+        .await?
+        .into_iter()
+        .map(|stream| stream.try_collect()),
+    )
+    .await?;
+
+    let sorted_changesets = merge_and_sort_changeset_lists(history_changesets)?;
+    let stripped_changesets = strip_irrelevant_changes(sorted_changesets, mononoke_paths).await?;
+
+    println!("sorted_changesets: {0:#?}", &stripped_changesets);
+
+    // TODO(gustavoavena): remove these prints for debugging after adding tests
+    let chgset_msgs: Vec<_> =
+        try_join_all(stripped_changesets.clone().iter().map(|csc| csc.message())).await?;
+    println!("chgset_msgs: {0:#?}", chgset_msgs);
+
+    create_commit_graph(stripped_changesets)
+}
+
+fn create_commit_graph(_changesets: Vec<ChangesetContext>) -> Result<CommitGraph, MononokeError> {
+    let cg_storage = Arc::new(InMemoryCommitGraphStorage::new(RepositoryId::new(1)));
+
+    let commit_graph = CommitGraph::new(cg_storage);
+
+    // TODO(T161204758): add commits to the commit graph
+
+    // TODO(T161204758): properly sort and dedupe the list of relevant changesets
+    Ok(commit_graph)
+}
+
+/// Given a list of changeset lists, merge, dedupe and sort them topologically
+/// into a single changeset list that can be used to build a commit graph.
+fn merge_and_sort_changeset_lists(
+    changesets: Vec<Vec<ChangesetContext>>,
+) -> Result<Vec<ChangesetContext>, MononokeError> {
+    // TODO(T161204758): properly sort and dedupe the list of relevant changesets
+    Ok(changesets.into_iter().flatten().collect())
+}
+
+/// Given a commit graph, create a new graph with every commit stripped of all
+/// changes that are not in any of the provided paths.
+async fn strip_irrelevant_changes(
+    changesets: Vec<ChangesetContext>,
+    _paths: Vec<MononokePath>,
+) -> Result<Vec<ChangesetContext>, MononokeError> {
+    // TODO(T161205476): strip irrelevant changes from a CommitGraph
+    Ok(changesets)
+}
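`merge_and_sort_changeset_lists` currently only flattens its input, leaving the dedupe and topological sort as a TODO. A minimal sketch of one way that step could work is shown below, assuming each changeset exposes a generation number (1 for a root, otherwise one more than the largest generation among its parents). The `Commit` struct and `merge_and_sort` function are hypothetical stand-ins, not Mononoke types. Since an ancestor always has a strictly smaller generation than its descendants, sorting by ascending generation gives a valid topological order.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a changeset: an id plus its generation number.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Commit {
    id: &'static str,
    generation: u64,
}

/// Merge several per-path histories into one deduplicated list ordered from
/// ancestors to descendants.
fn merge_and_sort(lists: Vec<Vec<Commit>>) -> Vec<Commit> {
    let mut seen = HashSet::new();
    let mut merged: Vec<Commit> = lists
        .into_iter()
        .flatten()
        // `insert` returns false when the id was already present, which
        // drops commits that show up in more than one path history.
        .filter(|commit| seen.insert(commit.id))
        .collect();
    // Ascending generation is a topological order; the id is only a
    // tie-breaker that keeps the output deterministic.
    merged.sort_by_key(|commit| (commit.generation, commit.id));
    merged
}
```

For the graph in the commit message, two histories such as `[D, C, A]` and `[G, A]` would merge into `A, C, D, G`, with `A` kept once.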
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
+/*
+ * Copyright (c) Meta Platforms, Inc. and affiliates.
+ *
+ * This software may be used and distributed according to the terms of the
+ * GNU General Public License version 2.
+ */
+
+use std::str::FromStr;
+
+use anyhow::Error;
+use bookmarks_types::BookmarkKey;
+use fbinit::FacebookInit;
+use gitexport_tools::build_partial_commit_graph_for_export;
+pub use mononoke_api::BookmarkFreshness;
+use mononoke_api::RepoContext;
+use mononoke_app::fb303::AliveService;
+use mononoke_app::fb303::Fb303AppExtension;
+use mononoke_app::MononokeApp;
+use mononoke_app::MononokeAppBuilder;
+use repo_authorization::AuthorizationContext;
+
+use crate::types::GitExportArgs;
+
+pub mod types {
+    use clap::Parser;
+    use mononoke_app::args::RepoArgs;
+
+    /// Mononoke Git Exporter
+    #[derive(Debug, Parser)]
+    pub struct GitExportArgs {
+        /// Name of the hg repo being exported
+        #[clap(flatten)]
+        pub hg_repo_args: RepoArgs,
+
+        /// Path to the git repo being created
+        #[clap(long)]
+        pub output: Option<String>, // TODO(T160787114): Make this required
+
+        /// List of directories in `hg_repo` to be exported to a git repo
+        #[clap(long)]
+        // TODO(T161204758): change this to a Vec<String> when we can support multiple export paths
+        pub export_path: String,
+        // TODO(T160600443): support last revision argument
+        // TODO(T160600443): support until_timestamp argument
+    }
+}
+
+#[fbinit::main]
+fn main(fb: FacebookInit) -> Result<(), Error> {
+    let app: MononokeApp = MononokeAppBuilder::new(fb)
+        .with_app_extension(Fb303AppExtension {})
+        .build::<GitExportArgs>()?;
+
+    app.run_with_monitoring_and_logging(async_main, "gitexport", AliveService)
+}
+
+async fn async_main(app: MononokeApp) -> Result<(), Error> {
+    let args: GitExportArgs = app.args()?;
+    let ctx = app.new_basic_context();
+
+    let repo = app.open_repo(&args.hg_repo_args).await?;
+
+    let auth_ctx = AuthorizationContext::new_bypass_access_control();
+    let repo_ctx: RepoContext = RepoContext::new(ctx, auth_ctx.into(), repo, None, None).await?;
+
+    // TODO(T160600443): support using a specific changeset as starting commit,
+    // instead of a bookmark.
+    let bookmark_key = BookmarkKey::from_str("master")?;
+
+    let chgset_ctx = repo_ctx
+        .resolve_bookmark(&bookmark_key, BookmarkFreshness::MostRecent)
+        .await?
+        .unwrap();
+
+    // TODO(T161204758): get multiple paths from arguments once we sort and
+    // dedupe changesets properly and can build a commit graph from multiple
+    // changeset lists
+    let export_paths = vec![args.export_path.as_str()];
+
+    let _commit_graph = build_partial_commit_graph_for_export(export_paths, chgset_ctx).await;
+
+    // TODO(T160787114): export commit graph to temporary mononoke repo
+
+    // TODO(T160787114): export temporary repo to git repo
+
+    Ok(())
+}
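In `async_main`, the resolved bookmark is unwrapped, so a repo without a `master` bookmark would panic, and the `Result` returned by `build_partial_commit_graph_for_export` is bound to `_commit_graph` without being checked. The sketch below shows one way to turn a missing bookmark into a regular error using only the standard library. `ExportError`, `resolve_bookmark` and `starting_changeset` are hypothetical stand-ins for illustration and do not reflect the `mononoke_api` signatures.

```rust
use std::fmt;

/// Hypothetical error type for the export tool.
#[derive(Debug, PartialEq)]
enum ExportError {
    BookmarkNotFound(String),
}

impl fmt::Display for ExportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ExportError::BookmarkNotFound(name) => write!(f, "bookmark '{}' not found", name),
        }
    }
}

impl std::error::Error for ExportError {}

/// Hypothetical stand-in for bookmark resolution: `Ok(None)` means the lookup
/// succeeded but the bookmark does not exist.
fn resolve_bookmark(name: &str) -> Result<Option<u64>, ExportError> {
    Ok(if name == "master" { Some(42) } else { None })
}

/// Convert the "no such bookmark" case into an error instead of unwrapping.
fn starting_changeset(name: &str) -> Result<u64, ExportError> {
    resolve_bookmark(name)?.ok_or_else(|| ExportError::BookmarkNotFound(name.to_string()))
}
```

The same `?` propagation could apply to the graph-building call, so that a failure there surfaces as a non-zero exit rather than being silently dropped.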
