@@ -1689,3 +1689,49 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 	clear_survey_context(&ctx);
 	return 0;
 }
+
+/*
+ * NEEDSWORK: The following is a bit of a laundry list of things
+ * that I'd like to add.
+ *
+ * [] Dump stats on all of the packfiles. The number and size of each.
+ * Whether each is in the .git directory or in an alternate. The state
+ * of the IDX or MIDX files, etc. Delta chain stats. All of this
+ * data is relative to the "lived-in" state of the repository. Stuff
+ * that may change after a GC or repack.
+ *
+ * [] Dump stats on each remote. When we fetch from a remote the size
+ * of the response is related to the set of haves on the server. You
+ * can see this in `GIT_TRACE_CURL=1 git fetch`. We get a `ls-refs`
+ * payload that lists all of the branches and tags on the server, so
+ * at a minimum the RefName and SHA for each. But for annotated tags
+ * we also get the peeled SHA. The size of this overhead on every
+ * fetch is proportional to the size of the `git ls-remote` response
+ * (roughly, although the latter repeats the RefName of the peeled
+ * tag). If, for example, you have 500K refs on a remote, you're
+ * going to have a long "haves" message, so every fetch will be slow
+ * just because of that overhead (not counting new objects to be
+ * downloaded).
+ *
+ * Note that the local set of tags in "refs/tags/" is a union over all
+ * remotes. However, since most people only have one remote, we can
+ * probably estimate the overhead value directly from the size of the
+ * set of "refs/tags/" that we visited while building the `ref_info`
+ * and `ref_array` and not need to ask the remote.
+ *
+ * [] Dump info on the complexity of the DAG. Criss-cross merges.
+ * The number of edges that must be touched to compute merge bases.
+ * Edge length. The number of parallel lanes in the history that must
+ * be navigated to get to the merge base. What affects the cost of
+ * the Ahead/Behind computation? How often do criss-crosses occur and
+ * do they cause various operations to slow down?
+ *
+ * [] If there are primary branches (like "main" or "master"), are they
+ * always on the left side of merges? Does the graph have a clean
+ * left edge? Or are there normal and "backwards" merges? Do these
+ * cause problems at scale?
+ *
+ * [] If we have a hierarchy of FI/RI branches like "L1", "L2", ...,
+ * can we learn anything about the shape of the repo around these FI
+ * and RI integrations?
+ */