@Aaron-Hartwig
Collaborator

Our current toolchain is pinned to a release of the OSS CAD Suite that is ~1.5 years old. A lot of progress has been made in that time, and using the QSFP design as a spot check, we can get quite the benefit from an update. On that design I saw a ~32% reduction in utilization and a 45% increase in Fmax, all while actually completing PnR 7x faster.

I'm opening this PR to generate images in CI. I'll look at all the PnR reports and make sure they make sense, then I'll pull images into Hubris branches and let them soak on niles and see if anything quirky happens. I'll be testing:

  • Sidecar Mainboard Controller
  • QSFPx32 Controller
  • Gimlet Sequencer
  • Ignition Target (I'll ask Eric to flash this on any sleds he can)

If we aren't seeing any issues after a few days, I'll explore integrating these into Hubris and try to get the new images released in the dogfood/colo context ASAP to get a good long soak before R9.

@Aaron-Hartwig
Collaborator Author

Aaron-Hartwig commented May 13, 2024

Quick analysis shows a smol regression in iCE40 fit, but a massive improvement to ECP5 fit:
[image: iCE40 and ECP5 fit comparison]

Unfortunately, the iCE40 ignition target design is rather tight and this bump has kept our *_reset_button.bit designs from finding a fit.

I think landing this toolchain bump is important given:

  1. Sidecar is missing timing by a mile. (Retracted: per Arjen's comment below, this is untrue.)
  2. QSFP is a PITA to work on as its timing has been very sensitive and often nextpnr cannot even complete PnR.

I'm going to look at creating ignition target images that forgo button functionality altogether. That should get us healthy from a build perspective without impacting the product (we don't install the buttons and haven't for over a year).

Collaborator

@nathanaelhuffman nathanaelhuffman left a comment

lgtm. One q on the bsv change that may or may not have been intentional

@Aaron-Hartwig Aaron-Hartwig force-pushed the aaron/toolchain-bump branch from 4808caa to 289d6af Compare May 14, 2024 17:07
@Aaron-Hartwig
Collaborator Author

I rebased this to grab the NoButton ignition PR. On niles I've deployed Sidecar Main/QSFP images to sidecar-b and the Gimlet Sequencer image to gimlet-c. I've also requested @ericaasen to update Ignition Target on those two boards plus psc-c to the NoButton flavored ignition image as a smoke test.

@arjenroodselaar
Contributor

arjenroodselaar commented May 14, 2024

FWIW timing of the Sidecar mainboard image is not actually failing (looking at https://buildomat.eng.oxide.computer/wg/0/artefact/01HXWF3Z0CJBQJAK8QRNTT4TPW/0Dqe7RtFEik1kKyKN0KJ28nLqvpVFPPJqV7UrhAX3xDyuYx6/01HXWF6ZA7BP58245F7WCMV75A/01HXWHGRH3XJV1GYTBMK4H1BAT/sidecar_mainboard_controller_rev_b.report.txt, which is from the CI run of current main). Timing fails upon initial placement with 40.60 MHz, but after the router has done its work and rearranged the deck chairs the final Fmax for $glbnet$clk_50m_fpga_refclk$TRELLIS_IO_IN is found to be 77.69 MHz.
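The point above (a failing Fmax after initial placement, a passing one after routing) can be checked mechanically by taking only the last Fmax line reported for each clock. The following is a minimal sketch, not the project's tooling. It assumes the report contains nextpnr-style lines of the form `Info: Max frequency for clock '<name>': <x> MHz (PASS|FAIL at <y> MHz)`; the sample text is illustrative, and its 50.00 MHz target is an assumption inferred from the clock name rather than a value quoted in this thread.

```python
import re

# Assumed line shape (based on typical nextpnr output, not taken from the linked report):
#   Info: Max frequency for clock 'clk': 77.69 MHz (PASS at 50.00 MHz)
FMAX_RE = re.compile(
    r"Max frequency for clock\s+'(?P<clock>[^']+)':\s+"
    r"(?P<fmax>[\d.]+) MHz \((?P<status>PASS|FAIL) at (?P<target>[\d.]+) MHz\)"
)


def final_fmax(report_text):
    """Return {clock: (fmax_mhz, target_mhz, passed)}, keeping the LAST match per clock.

    Earlier matches (e.g. the estimate after initial placement) are overwritten,
    so only the post-routing result survives.
    """
    results = {}
    for m in FMAX_RE.finditer(report_text):
        results[m["clock"]] = (
            float(m["fmax"]),
            float(m["target"]),
            m["status"] == "PASS",
        )
    return results


# Illustrative sample only. The two Fmax values come from the comment above;
# the 50.00 MHz target and the surrounding lines are assumptions.
SAMPLE = """\
Info: Max frequency for clock '$glbnet$clk_50m_fpga_refclk$TRELLIS_IO_IN': 40.60 MHz (FAIL at 50.00 MHz)
Info: Routing..
Info: Max frequency for clock '$glbnet$clk_50m_fpga_refclk$TRELLIS_IO_IN': 77.69 MHz (PASS at 50.00 MHz)
"""

if __name__ == "__main__":
    for clock, (fmax, target, ok) in final_fmax(SAMPLE).items():
        verdict = "PASS" if ok else "FAIL"
        print(f"{clock}: {fmax:.2f} MHz vs {target:.2f} MHz target -> {verdict}")
```

Keeping the last match per clock is the design choice that matters here: a script that stopped at the first match would report the pre-routing estimate and flag a failure that does not exist in the final result.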

I'll take another look at why the iCE40 targets are regressing when I return, because peeps in the community working on this claim the abc9 synthesis pass is supposed to be an all-around win. I found last year that at these utilization levels the FF enable and set signals for groups of FFs can cause routing pressure. There are some knobs one can use to tune this, but at the time they weren't exposed in the top level synth_ice40 and synth_ecp5 commands. I'll revive my PR (YosysHQ/yosys#3595) and see if/how it affects the results, then get it into Yosys if these knobs haven't been exposed in the meantime.

@Aaron-Hartwig Aaron-Hartwig force-pushed the aaron/toolchain-bump branch from 289d6af to 5f36092 Compare May 15, 2024 15:19
@Aaron-Hartwig
Collaborator Author

@arjenroodselaar you are correct! My mistake there. And thanks for that PR to yosys, I think giving folks control over any knobs that exist makes a ton of sense. We'll have to bump the toolchain again once that lands! They cut a release daily, so we could probably make a much more regular habit of doing this.
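One way to make regular bumps a habit is a small staleness check on the pinned release. This is a rough sketch only: it assumes the pinned release is identified by an ISO date tag (`YYYY-MM-DD`), and the tag value and 90-day threshold below are hypothetical, since this thread does not state the actual pin.

```python
from datetime import date


def release_age_days(tag, today):
    """Age in days of a release whose tag is an ISO date (assumed YYYY-MM-DD)."""
    return (today - date.fromisoformat(tag)).days


def is_stale(tag, today, max_age_days=90):
    """True when the pinned release is older than max_age_days (hypothetical policy)."""
    return release_age_days(tag, today) > max_age_days


if __name__ == "__main__":
    pinned = "2022-11-15"          # hypothetical tag, roughly 1.5 years before the merge
    merge_day = date(2024, 5, 15)  # merge date of this PR, from the thread
    print(release_age_days(pinned, merge_day), is_stale(pinned, merge_day))
```

A check like this could run in CI and warn rather than fail, so an old pin becomes visible without blocking builds.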

I'm going to make a plan with the Hubris team on how to get these new images integrated and tested before hitting merge on this, but at this point things all seem fine.

@Aaron-Hartwig Aaron-Hartwig merged commit eb298ab into main May 15, 2024
@Aaron-Hartwig Aaron-Hartwig deleted the aaron/toolchain-bump branch May 15, 2024 17:31
Aaron-Hartwig added a commit to oxidecomputer/hubris that referenced this pull request May 15, 2024
We updated the FPGA toolchain in
oxidecomputer/quartz#150. This refreshes the
images we have as part of our applications with fresh ones from CI.

cc: @nathanaelhuffman @arjenroodselaar