Use a callable template param in DmpCeff #120

Open: kbieganski wants to merge 2 commits into master from dcalc-lambda-template

kbieganski
Contributor

Replaces std::function in DmpCeff.cc with a callable template parameter. This speeds up CTS in OpenROAD by about 1-3% on ORFS designs and by up to 4% on some others. It also speeds up floorplanning by about 1%.
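To illustrate the kind of change described, here is a minimal sketch. It is not the code from DmpCeff.cc: `State`, `newtonErased`, `newtonTemplated`, and `sqrt2` are hypothetical names, and the solver is reduced to a scalar Newton iteration. The only point is the difference between a `std::function` parameter and a deduced callable type.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical scalar stand-in for the solver state; not taken from DmpCeff.cc.
struct State {
  double x;   // current estimate
  double f;   // residual at x, filled by eval
  double df;  // derivative at x, filled by eval
};

// Type-erased version: eval is invoked through std::function's indirect
// dispatch on every iteration.
static int
newtonErased(const int max_iter,
             State &s,
             const double x_tol,
             std::function<void ()> eval)
{
  assert(max_iter > 0);
  for (int k = 1; k <= max_iter; k++) {
    eval();  // fills s.f and s.df from s.x
    const double dx = s.f / s.df;
    s.x -= dx;
    if (std::abs(dx) <= x_tol)
      return k;
  }
  return max_iter;
}

// Templated version: Eval is deduced as the closure type itself, so the
// compiler sees the call target and is free to inline it.
template <class Eval>
static int
newtonTemplated(const int max_iter,
                State &s,
                const double x_tol,
                Eval eval)
{
  assert(max_iter > 0);
  for (int k = 1; k <= max_iter; k++) {
    eval();  // fills s.f and s.df from s.x
    const double dx = s.f / s.df;
    s.x -= dx;
    if (std::abs(dx) <= x_tol)
      return k;
  }
  return max_iter;
}

// Solve x^2 - 2 = 0 with either variant, starting from x = 1.
static double
sqrt2(const bool use_template)
{
  State s{1.0, 0.0, 0.0};
  auto eval = [&s]() {
    s.f = s.x * s.x - 2.0;
    s.df = 2.0 * s.x;
  };
  if (use_template)
    newtonTemplated(20, s, 1e-12, eval);
  else
    newtonErased(20, s, 1e-12, eval);
  return s.x;
}
```

Both variants compute the same root. Whether the templated one is measurably faster depends on the compiler, the optimization level, and how hot the loop is.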

Ibex (asap7)

| Branch | Min [s] | Max [s] | Mean [s] | Relative mean | Median [s] | Relative median |
|---|---|---|---|---|---|---|
| baseline | 99.12 | 99.64 | 99.39 ± 0.16 | 102% | 99.41 | 102% |
| optimized | 97.62 | 98.25 | 97.95 ± 0.27 | 100% | 97.85 | 100% |

Ibex (nangate45)

| Branch | Min [s] | Max [s] | Mean [s] | Relative mean | Median [s] | Relative median |
|---|---|---|---|---|---|---|
| baseline | 350.64 | 369.86 | 363.37 ± 8.29 | 102% | 367.68 | 103% |
| optimized | 340.50 | 369.86 | 354.89 ± 8.17 | 100% | 357.47 | 100% |

black_parrot (nangate45)

| Branch | Min [s] | Max [s] | Mean [s] | Relative mean | Median [s] | Relative median |
|---|---|---|---|---|---|---|
| baseline | 175.57 | 177.48 | 176.66 ± 0.74 | 101% | 176.73 | 101% |
| optimized | 171.77 | 176.99 | 175.21 ± 1.99 | 100% | 175.76 | 100% |

CLAassistant commented Nov 6, 2024

CLA assistant check
All committers have signed the CLA.

jan-malek and others added 2 commits November 14, 2024 15:06
Signed-off-by: Jan Malek <jmalek@antmicro.com>
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
kbieganski force-pushed the dcalc-lambda-template branch from 3fc1f3d to aa5b0bb on November 14, 2024 14:06
@kbieganski
Contributor Author

Can we get some feedback if possible?

@jjcherry56
Collaborator

I have very little free time to deal with reviewing and validating optimizations that may be 4% faster on some test cases. And I am no fan of templates. So it is very low priority.

static void
newtonRaphson(const int max_iter,
              double x[],
              const int n,
              const double x_tol,
              // eval(state) is called to fill fvec and fjac.
              function<void ()> eval,
Contributor

Could the original slowness be because this is passed by value instead of by const reference? I guess it depends on how much the callable captures.

It might help unstall this PR if changing the signature to const function<void()> &eval improved performance without requiring a callable template parameter.
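As a rough way to reason about the by-value question, the sketch below counts how often the wrapped callable is copied under each signature. `CountingEval`, `callByValue`, `callByConstRef`, and `copiesPerCall` are hypothetical names made up for this illustration, not anything from the PR. It assumes the caller already holds a `std::function` object. If a lambda is passed directly, both signatures construct a temporary `std::function` from it, so a const reference would save nothing in that case.

```cpp
#include <cassert>
#include <functional>

// Hypothetical callable that counts how often it is copy-constructed.
struct CountingEval {
  int *copies;
  explicit CountingEval(int *c) : copies(c) { assert(c != nullptr); }
  CountingEval(const CountingEval &other) : copies(other.copies) { ++*copies; }
  CountingEval(CountingEval &&other) noexcept : copies(other.copies) {}
  void operator()() const {}
};

// By value: the std::function argument, and with it the stored callable,
// is copied when the caller passes an existing std::function.
static void
callByValue(std::function<void ()> eval)
{
  eval();
}

// By const reference: the existing std::function is bound without a copy.
static void
callByConstRef(const std::function<void ()> &eval)
{
  eval();
}

// Returns how many times the stored callable was copied by a single call
// that passes an already constructed std::function.
static int
copiesPerCall(const bool by_value)
{
  int copies = 0;
  std::function<void ()> eval = CountingEval(&copies);
  copies = 0;  // ignore any copies made while building eval
  if (by_value)
    callByValue(eval);
  else
    callByConstRef(eval);
  return copies;
}
```

Either way the call itself still goes through `std::function`'s type-erased dispatch, which is the part a callable template parameter avoids.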

@parallaxsw
Owner

Neither change makes enough of a difference to bother with.
