- **Phase I** (95%) Assemble corpora of data from various prior performance research initiatives (both within and outside of PL)
  - 💯 Enumerate/obtain test datasets
  - 90% Document rationales for the test datasets
  - 95% Publish all of the above as a plain-HTTP + IPFS-pinned download
- **Phase I** (85%) Document prior art, motivation, and the precise scope and types of sought metrics
  - 💯 Solicit/assemble feedback from various stakeholders
  - 💯 Collect/determine relevance of existing academic research into chunking (14 distinct papers selected for evaluation)
  - 💯 Convert the pre-PL chunk-tester to proper multi-streaming, dramatically lowering the cost of experiments (aiming at about 500 megabyte/s stream processing); with the correct implementation and hardware, about 3.5 GiB/s standard ingestion 🎉
  - 80% Generate a few preliminary datapoints to aid understanding of the goal/scope
  - 90% In-depth study/evaluation/application of findings from the above works
  - 💯 Understand and reuse the existing go-ipfs implementations of CDCs (Rabin + Buzzhash) in a simpler, go-ipfs-independent utility, allowing rapid retries of different parameters
  - 💯 Same as above, but pertaining to linking strategies (trickle-dag etc.), as ignoring the link layer of streams skews the results disproportionately
  - 98% (subsumes a large portion of the points below; v0.1 ETA: demo at team-week) Fully implement a standalone CLI utility re-implementing/converging with go-ipfs on all of the above algorithms. The distinguishing feature of said tool is the exposure of each chunker/linker as an atomic, composable primitive. The UX is similar to that of ffmpeg, whereby an input stream is processed via multiple "filters", the result being a stream of blocks with statistics on their counts/sizes plus a valid IPFS CID. Current remaining tasks:
    - 💯 Profile/optimize baseline stream ingestion; ensure there is no penalty from applying a "null filter", which allows one to benchmark a particular hardware setup's theoretical maximum throughput
    - 💯 Finalize the "stackable chunkers" UI/UX, allowing effortless demonstration of the impact of such chunker chains
    - 💯 Adjust statistics compilation/output for the above (it currently looks like this, ignoring various "filter levels")
    - 💯 Make a final pass on the memory-allocation profile and fix obvious low-hanging fruit before v0.1
    - 80% README / godoc / stuffz
- 80% Rewrite the previously utilized plotly.js-based visualiser to aid with the above point
- **Phase I** Open the document to a short discussion soliciting feedback from workgroups
- **Phase II** Perform a number of "brute force" tests aiming at reproducible results (utilizing https://github.com/ipfs/testground); for the purposes of what we are trying to quantify, iptb will be sufficient
- **Phase II** (half-covered by the initial writeup) Convert raw results into multi-dimensional scatter-plot visualizations (plotly.js)
- **Phase III** Combine all available results into a "compromise chunking settings" RFC document
- **Phase IV** Publish the results for discussion and a decision on the level of incorporation into IPFS implementations (default parameters, use of the selected algorithm by default, etc.)