Skip to content

Adding smallest_unused_suffix cache #3296

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

JohnDumbell
Copy link
Contributor

@JohnDumbell JohnDumbell commented Nov 8, 2018

This just speeds up symbol name generation in situations where there are large amounts of symbols already existing with the same prefix. A similar approach to the journalling_symbol_tablet in how it works - you just wrap any current symbol table, and as long as that instance lasts it'll store what the next available index is so usually you'll only need to do a minimal amount of finds to get the next free suffix.

It doesn't go overboard with trying to work out if there are any holes in the symbol table that we could re-use because it's essential that each suffix + prefix combo be predictable, so only clears the cache when you clear the whole symbol table.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see an analysis as to why it is even happening that there are this many symbols with the same prefix. And then data (how much faster is it, on which benchmarks). And commits that actually provide a rationale for the change happening.

I dare saying this because ff25b2c.

@@ -787,8 +788,11 @@ bool java_bytecode_languaget::typecheck(
break;
case LAZY_METHODS_MODE_EAGER:
{
symbol_table_buildert suffix_aware_table =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suffix_aware_table -> symbol_table_builder

@@ -787,8 +788,11 @@ bool java_bytecode_languaget::typecheck(
break;
case LAZY_METHODS_MODE_EAGER:
{
symbol_table_buildert suffix_aware_table =
symbol_table_buildert::build(symbol_table);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

::build -> ::wrap

journalling_symbol_tablet journalling_symbol_table =
journalling_symbol_tablet::wrap(symbol_table);
journalling_symbol_tablet::wrap(suffix_aware_table);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check: does the journalling table forward next_suffix to its wrapped table?

@@ -692,37 +693,40 @@ void compilet::add_compiler_specific_defines(configt &config) const

void compilet::convert_symbols(goto_functionst &dest)
{
symbol_table_buildert suffix_aware_table =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suffix_aware_table -> symbol_table_builder

@@ -789,7 +789,7 @@ bool java_bytecode_languaget::typecheck(
case LAZY_METHODS_MODE_EAGER:
{
symbol_table_buildert suffix_aware_table =
symbol_table_buildert::build(symbol_table);
symbol_table_buildert::build(symbol_table);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally squash this formatting commit into the appropriate precessor commits

@smowton
Copy link
Contributor

smowton commented Nov 9, 2018

In brief your previous fix changed the complexity of generating N symbols with the same prefix in a symbol table of size M from O(NM) to O(N^2), which is definitely an improvement, but this patch bumps it to O(N). I believe the typical case where this really bites us is e.g. 1000 some.package.SomeClass.somefunction::new_tmp symbols costing ~500000 string-concats and symbol-table lookups, but we'll put a concrete benchmark together soon.

@tautschnig
Copy link
Collaborator

Why do we ever have

I believe the typical case where this really bites us is e.g. 1000 some.package.SomeClass.somefunction::new_tmp symbols

?

@tautschnig
Copy link
Collaborator

That is: can we really not do any better than calling all of them new_tmp?

@smowton
Copy link
Contributor

smowton commented Nov 9, 2018

They're temps generated at a new site. We've also got return-value temps, temps created by goto_convertt e.g. to isolate side-effects (f() + 2 -> tmp = f(); tmp + 2), and so on: there are 20 get_fresh_aux_symbol or goto_convertt::new_tmp_symbol in cbmc and another 30 in jbmc.

Possible solutions:

  1. For each user, give the variable some appropriate qualifier that makes clashes less likely -- for example, something derived directly from Java bytecode could be named after its bytecode index. This puts extra burden on new users though, and people making future changes will continue to be tempted to use get_fresh_aux... naively. It's also likely we would not bother to upgrade some use sites on the rationale "eh, when will this ever be frequently visited," which will doubtless bite us in the future.

  2. Introduce module-scoped name allocators. For example, goto_convertt::new_tmp_symbol would be an ideal place to count newly-allocated symbols and try harder to avoid the quadratic cost with an educated guess. However this is quite intrusive -- for example if goto_convertt calls some_utility which allocates symbols we must teach some_utility how to use goto_convertt's name allocator, probably by parameterising it.

  3. Improve symbol_tablet so it can always answer repeated fresh-name queries in linear, not quadratic time. This is very unintrusive, but @kroening doesn't like it because the next-name map would stick around consuming memory even when we proceed to phases when the symbol table is read-mostly.

  4. The solution adopted here: a symbol-table wrapper that holds the next-name map, and whose lifespan is chosen to include the table-building phase but not the mostly-read phase. The wrapped table has an efficient get-fresh-symbol operation without needing each individual user to solve that problem, and can be passed to any utility expecting a symbol_table_baset without that utility needing specific modification.

@smowton
Copy link
Contributor

smowton commented Nov 9, 2018

Also @JohnDumbell recommend separating a commit that just turns symbol_tablet -> symbol_table_baset from one that actually introduces uses of the new wrapper.

@tautschnig
Copy link
Collaborator

Also @JohnDumbell recommend separating a commit that just turns symbol_tablet -> symbol_table_baset from one that actually introduces uses of the new wrapper.

+1 to that, and possibly even do so in a separate PR: this is quite likely the syntactically most intrusive part of this PR, and at the same time actually just the right thing to do irrespective of this PR.

@tautschnig
Copy link
Collaborator

@smowton, in response to your earlier elaboration: this PR, and thus item 4 from your list, may be absolutely the right thing to do. But I want to make as sure as by any means possible that we are not fixing symptoms, but actually treat the root cause. And having symbols named tmp seems like a bad idea either way, so I would strongly encourage also investing in item 1 from your list. If we are naming a symbol "tmp", then you might as well argue that we are wasting three bytes in the string table as using just a number would be about as informative. (And indeed translating f() + 2 to tmp = f(); tmp + 2 is just bad, the name ought to reflect that it's about f.) So let's please try to fix this wherever we can, and then we can also include the changes proposed in this PR as a safety net. But there must be data, and there must a clear rationale for each change.

@JohnDumbell
Copy link
Contributor Author

JohnDumbell commented Nov 9, 2018

I'll apply the recommendations / git changes now and then get to work on an actual benchmark. In the particular case where I saw this (and then thought doing something about it might be useful long-term) the actual suffix was in the 4000's.

@smowton
Copy link
Contributor

smowton commented Nov 9, 2018

@tautschnig that sounds good -- better names wherever possible, and a fresh-symbol routine that can be relied upon not to explode when there are a few thousand name clashes.

@tautschnig
Copy link
Collaborator

@JohnDumbell 4000+ name clashes seem scary - wouldn't that imply that the function as at least 4000 instructions? Of course that's possible, but it seems worth taking another look. Taking action to fix a problem is good, but I want to be really sure we are fixing the right problem.

@smowton
Copy link
Contributor

smowton commented Nov 9, 2018

At a guess the pathalogical cases will be mechanically-generated code

@JohnDumbell JohnDumbell force-pushed the jd/enhancement/find_free_suffix_improvement branch 2 times, most recently from d976534 to b90ae0b Compare November 9, 2018 11:28
@chrisr-diffblue
Copy link
Contributor

@smowton I've also seen such pathological code in certain embedded systems.... sigh.

@JohnDumbell JohnDumbell force-pushed the jd/enhancement/find_free_suffix_improvement branch from b90ae0b to 4f793eb Compare November 21, 2018 15:31
@JohnDumbell JohnDumbell force-pushed the jd/enhancement/find_free_suffix_improvement branch from 4f793eb to 842c199 Compare November 21, 2018 15:50
One keeps the old functionality, while the other just allows us to tell it on what number to start the suffix search.
@JohnDumbell JohnDumbell force-pushed the jd/enhancement/find_free_suffix_improvement branch from 842c199 to f48ca75 Compare November 21, 2018 18:28
@JohnDumbell JohnDumbell force-pushed the jd/enhancement/find_free_suffix_improvement branch 3 times, most recently from 5f9eb36 to 7f5890d Compare November 22, 2018 13:38
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 7f5890d).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/92299022

@JohnDumbell
Copy link
Contributor Author

JohnDumbell commented Nov 22, 2018

@tautschnig Sorry it took a little while, but we did some tests on a c method with 10,000 lines of x += argc ? f() : g(); to generate a large amount of temp symbols needing to be created, and with this PR applied it took 1:11 to finish generating the GOTO (it then got stuck on symex), and without it, I stopped timing after 30 minutes of it not finishing the generation.

I tried recreating the test on some less extreme Java examples and only got around a 5% reduction in run-time due to bytecode per method limitation and me not knowing the absolute best way to generate as many temporaries as possible, but I also didn't recognise any noticeable slowdown either. I know it provides similar benefits to the c benchmark on the library I originally found the issue in (Groovy), but trying to measure them precisely is a bit harder because something else is going on to cause that many temporaries to be needed in the first place.

If you need anything more precise just say.

(Also thanks to @smowton on this)

@smowton
Copy link
Contributor

smowton commented Nov 22, 2018

(the reason for that particular formulation btw: ternary conditionals get a temporary allocated if either of their branches may have side-effects)

@smowton smowton requested a review from tautschnig November 26, 2018 10:12
@smowton smowton assigned tautschnig and unassigned JohnDumbell Nov 26, 2018
Copy link
Member

@peterschrammel peterschrammel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we create names with function_id::tmp::<hash of expr/symbol names in expr>::tmpX to make clashes less likely?

@JohnDumbell
Copy link
Contributor Author

@peterschrammel don't see any reason why not, beyond the hashing/name creation overall might be slower than just accepting any clashes and doing one additional find using the wrapper? (also more unique names = more instances in the map with smaller counts against them)

I guess the differences would be minor either way though, and you'd definitely benefit if you didn't have the wrapper. I think I'd like to do it either way though, better to do it in a different PR?

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good except for the comments below. Also it would be nice to copy the report on the experiment into one of the commit messages as it's somewhat cumbersome to go from the git history to GitHub to then find the data.

@@ -0,0 +1,113 @@
// Copyright 2016-2018 Diffblue Limited. All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is different from any other file headers in use. The "All Rights Reserved." scares me.

#ifndef CPROVER_UTIL_SYMBOL_TABLE_BUILDER_H
#define CPROVER_UTIL_SYMBOL_TABLE_BUILDER_H

#include "symbol_table.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this header required here?

// Check if we have an entry for this particular suffix, if not,
// create baseline.
auto suffix_iter = next_free_suffix_for_prefix
.insert(std::pair<std::string, std::size_t>(prefix, 0))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to use {prefix, 0} instead of std::pair<....>(prefix, 0)

@@ -100,6 +100,11 @@ class journalling_symbol_tablet : public symbol_table_baset
return result;
}

std::size_t next_unused_suffix(const std::string &prefix) const override
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment about the commit message: I don't understand "Due to the inclusion of another method we just need to make sure any calls are routed to the wrapped table and not the parent." Which method is this "another" method that's causing trouble?

@@ -23,6 +23,7 @@ Author: Daniel Kroening, kroening@kroening.com
#include <json/json_parser.h>

#include <goto-programs/class_hierarchy.h>
#include <util/symbol_table_builder.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Place this include with the other ones from util above.

@tautschnig tautschnig assigned JohnDumbell and unassigned tautschnig Nov 28, 2018
@JohnDumbell JohnDumbell force-pushed the jd/enhancement/find_free_suffix_improvement branch from 7f5890d to ca71045 Compare November 29, 2018 12:13
This wrapper is purely to allow the symbol table to keep track of the suffixes that it has currently seen. This improves speed in a situation where a large amount of instances with the same prefix are requested.
Due to the inclusion of an additional overridable method on symbol_table_baset we just need to make sure any calls to this wrapper are routed to the wrapped table and not to the base instance.
Tests were done on a c method with 10,000 lines of x += argc ? f() : g(); to generate a large amount of temp symbols needing to be created, and with this change applied it took 1:11 (m:s) to finish generating the GOTO before it got stuck on symex, and without it the generation ran longer than 30 minutes.

In attempting to recreate the issue in Java it only got around a 5% reduction in run-time due to bytecode per method limitation and me not knowing the absolute best way to generate as many temporaries as possible, but I also didn't recognise any noticeable slowdown either. In the Java instance where this issue was first recognized it was building a similar number of temporaries as the C program, but directly benchmarking that is problematic because there is likely other issues causing it to require this number of temporaries in the first place.

But it would cause similar speed increases in any language that requires a large amount of symbols to be created that have a clashing prefix.
@JohnDumbell JohnDumbell force-pushed the jd/enhancement/find_free_suffix_improvement branch from ca71045 to 6425eb1 Compare November 29, 2018 18:39
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 6425eb1).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/93065007

@tautschnig tautschnig merged commit 5b66cd3 into diffblue:develop Nov 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants