- [ ] **I**
  - 95% Assemble corpora of data from various prior performance-research initiatives (both within and outside of PL)
  - 💯 Enumerate/obtain test datasets
  - 90% Document the rationales for the test datasets
  - 95% Publish all of the above as a plain-HTTP + IPFS-pinned download
- [ ] **I**
  - 85% Document prior art, motivation, and the precise scope and types of sought metrics
  - 💯 Solicit/assemble feedback from various stakeholders
  - 💯 Collect/determine the relevance of existing academic research into chunking (14 distinct papers selected for evaluation)
  - 💯 Convert the pre-PL chunk-tester to proper multi-streaming, dramatically lowering the cost of experiments (aiming at about 500 MB/s stream processing; with the correct implementation and hardware, about 3.5 GiB/s standard ingestion 🎉)
  - 80% Generate a few preliminary datapoints to aid understanding of the goal/scope
  - 90% In-depth study/evaluation/application of findings from the above works
  - 💯 Understand and reuse the existing `go-ipfs` implementations of CDCs (Rabin + Buzzhash) in a simpler, `go-ipfs`-independent utility, allowing rapid retries of different parameters
  - 💯 Same as above, but pertaining to linking strategies (trickle-dag etc.), as ignoring the link layer of streams skews the results disproportionately
  - 98% (subsumes a large portion of the points below; `v0.1` ETA: DEMO AT TEAM-WEEK) Fully implement a standalone CLI utility re-implementing/converging with `go-ipfs` on all of the above algorithms. The distinguishing feature of said tool is the exposure of each chunker/linker as an atomic, composable primitive. The UX is similar to that of `ffmpeg`, whereby an input stream is processed via multiple "filters", with the result being a stream of blocks, a statistic on their counts/sizes, and a valid IPFS CID. Current remaining tasks:
    - 💯 Profile/optimize baseline stream ingestion; ensure there is no penalty from applying a "null filter", which allows one to benchmark a particular hardware setup's theoretical maximum throughput
    - 💯 Finalize the "stackable chunkers" UI/UX, allowing effortless demonstration of the impact of such chunker chains on the output
    - 💯 Adjust statistics compilation/output for the above (it currently looks like this, ignoring various "filter levels")
    - 💯 Make a final pass on the memory-allocation profile and fix obvious low-hanging fruit before `v0.1`
    - 80% README / godoc / stuffz
  - 80% Rewrite the previously utilized plotly.js-based visualiser to aid with the above point
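For context on the chunkers mentioned above: Rabin and Buzzhash are content-defined chunkers (CDCs) that cut a stream wherever a rolling hash over a small byte window matches a target pattern, so boundaries depend on content rather than fixed offsets. Below is a minimal buzhash-style sketch in Go; every name and parameter here (`windowSize`, `targetMask`, `minChunk`, `maxChunk`, the table fill) is illustrative, not the values `go-ipfs` actually uses.

```go
// Minimal sketch of a buzhash-style content-defined chunker.
// All parameters are illustrative, not the go-ipfs defaults.
package main

import "fmt"

const (
	windowSize = 32         // rolling-hash window; removal below relies on windowSize == 32
	targetMask = 1<<13 - 1  // cut when low 13 bits are all set: ~8 KiB average chunks
	minChunk   = 2048       // never cut before this many bytes
	maxChunk   = 65536      // always cut at this many bytes
)

// byteTable maps each byte value to a pseudo-random 32-bit word;
// real implementations ship a fixed random table.
var byteTable [256]uint32

func init() {
	s := uint32(2654435761) // deterministic xorshift fill, illustrative only
	for i := range byteTable {
		s ^= s << 13
		s ^= s >> 17
		s ^= s << 5
		byteTable[i] = s
	}
}

// cutpoints returns chunk boundaries (exclusive end offsets) for data.
func cutpoints(data []byte) []int {
	var cuts []int
	start := 0
	var h uint32
	for i := 0; i < len(data); i++ {
		h = h<<1 | h>>31        // rotate hash left by 1
		h ^= byteTable[data[i]] // byte entering the window
		if i-start >= windowSize {
			// Byte leaving the window: after 32 rotations its
			// contribution has cycled fully, so a plain xor removes it.
			h ^= byteTable[data[i-windowSize]]
		}
		size := i - start + 1
		if size >= maxChunk || (size >= minChunk && h&targetMask == targetMask) {
			cuts = append(cuts, i+1)
			start = i + 1
			h = 0
		}
	}
	if start < len(data) {
		cuts = append(cuts, len(data)) // trailing partial chunk
	}
	return cuts
}

func main() {
	// 1 MiB of deterministic pseudo-random input.
	data := make([]byte, 1<<20)
	s := uint32(42)
	for i := range data {
		s ^= s << 13
		s ^= s >> 17
		s ^= s << 5
		data[i] = byte(s)
	}
	cuts := cutpoints(data)
	fmt.Printf("cut 1 MiB of pseudo-random data into %d chunks\n", len(cuts))
}
```

Because boundaries are content-defined, edits near the start of a stream only perturb nearby cut points and later chunks realign, which is the property that makes CDC-based deduplication (and parameter comparison across chunkers) meaningful.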
- [ ] **I** Open the document to a short discussion soliciting feedback from workgroups
- [ ] **II** Perform a number of "brute force" tests aiming at reproducible results (utilizing https://github.com/ipfs/testground; for the purposes of what we are trying to quantify, `iptb` will be sufficient)
- [ ] **II** (half-covered by the initial writeup) Convert raw results into multi-dimensional scatter-plot visualizations (plotly.js)
- [ ] **III** Combine all available results into a "compromise chunking settings" RFC document
- [ ] **IV** Publish the results for discussion and a decision on the level of incorporation into IPFS implementations (default parameters, use of the selected algorithm by default, etc.)