Error out when contracts collide on name#1397
Error out when contracts collide on name#1397chriseth merged 20 commits intoargotorg:developfrom roadriverrail:contract_collision
Conversation
|
Some things to consider before merging--
|
| if (m_contracts.find(contract->name()) != m_contracts.end()) | ||
| { | ||
| const ContractDefinition* existingContract = m_contracts.find(contract->name())->second.contract; | ||
| if (contract != existingContract) |
There was a problem hiding this comment.
We are trying to keep TAB-only indentation.
There was a problem hiding this comment.
I swear that I know this. I'm going to have to colorize my tabs so I don't keep doing this.
| if (!resolver.resolveNamesAndTypes(*contract)) return false; | ||
| if (m_contracts.find(contract->name()) != m_contracts.end()) | ||
| { | ||
| const ContractDefinition* existingContract = m_contracts.find(contract->name())->second.contract; |
| if (contract != existingContract) | ||
| BOOST_THROW_EXCEPTION(CompilerError() << | ||
| errinfo_sourceLocation(contract->location()) << | ||
| errinfo_comment(contract->name() + " conflicts with!!! contract at " |
There was a problem hiding this comment.
I guess !!! was for debugging.
There was a problem hiding this comment.
Yeah, while I was sorting some test case stuff out. I can't believe I didn't see that. No more commits before bed for me.
| if (contract != existingContract) | ||
| BOOST_THROW_EXCEPTION(CompilerError() << | ||
| errinfo_sourceLocation(contract->location()) << | ||
| errinfo_comment(contract->name() + " conflicts with contract at " |
There was a problem hiding this comment.
The verb "conflicts" is a bit vague. Maybe something like Contract name is already defined at ... is more direct?
|
The planned proper fix would be to disambiguate such cases through the file name - this should also help in the case of libraries. Do you think that this would fix all the cases you have seen? |
|
Could you explain in more detail what you mean by "disambiguate such cases through the file name"? Maybe offer a small example? |
|
If you want to refer to a contract, you should be able to do it by |
libsolidity/ast/AST.cpp
Outdated
| } | ||
|
|
||
|
|
||
| std::string ContractDefinition::fullyQualifiedName() const |
There was a problem hiding this comment.
Please drop the std:: prefixes in cpp files.
libsolidity/ast/AST.cpp
Outdated
|
|
||
| std::string ContractDefinition::fullyQualifiedName() const | ||
| { | ||
| std::string sourceString = *(location().sourceName); |
There was a problem hiding this comment.
Declaration, which should be a base class for this one, already has a function sourceUnitName() which does a similar thing, but based on scopes and not on source locations.
Can you add a function fullyQualifiedName to the Declaration class and return something like sourceUnitName() + ":" + name()? I think the : should always be present, even if the source unit name is empty.
There was a problem hiding this comment.
Will look into this. FYI, the reason I dropped the : when the source unit name is not present is because it breaks a bunch of tests that expect to look up contracts by their name. It's really just busy work to fix, but I also opted for the path-of-least-edits for a first go-round.
There was a problem hiding this comment.
But that is actually a different thing. In the tests, the source file is not named. I would say that it is always possible to do a qualified lookup and in addition, you can do an unqualified lookup if the name is unique, which would guarantee backwards-compatibility.
| m_contracts[contract->name()].contract = contract; | ||
| if (m_contracts.find(contract->fullyQualifiedName()) != m_contracts.end()) | ||
| { | ||
| const ContractDefinition* existingContract = m_contracts.find(contract->fullyQualifiedName())->second.contract; |
There was a problem hiding this comment.
Perhaps use m_contracts[contract->fullyQualifiedName()] here? Also please put the const after the type name: ContractDefinition const*
There was a problem hiding this comment.
Sure. It's safe to use [] by that point. Also, I appreciate the style nits; y'all just use a different style than I've seen before (though I'm far more a C guy than a C++ guy)
|
|
||
| if (contract != existingContract) | ||
| { | ||
| auto err = make_shared<Error>(Error::Type::DeclarationError); |
There was a problem hiding this comment.
Are you able to trigger this? I think scoping rules should not allow it. If you cannot trigger it, please change it into a solAssert().
There was a problem hiding this comment.
I actually put that in there because I can trigger it. It came up fairly regularly, I think possibly having to do with different phases of compilation.
There was a problem hiding this comment.
Can you tell me a test that triggers it? I would like to investigate.
There was a problem hiding this comment.
ContractsCollision1.sol:
pragma solidity ^0.4.0;
contract Deed {
}
contract Registrar {
function newBid(bytes32 sealedBid) payable {
Deed newBid = new Deed();
}
}
ContractsCollision2.sol:
pragma solidity ^0.4.0;
contract Deed {
}
contract Registrar {
}
contract NoCollision {
}
If you drop something like this right before the if condition we're discussing:
if (contract == existingContract)
{
cerr << "contract: " << contract->fullyQualifiedName() << " existingContract : " << existingContract->fullyQualifiedName() << endl;
}
Then you should see each contract listed once. It happened so regularly to me that I just assumed it was part of some two-phase compilation process.
There was a problem hiding this comment.
I don't get it, all contracts in your example have different fully qualified names, so we should not get to line 226.
There was a problem hiding this comment.
Right, they do all have different fully-qualified names. My intuition, however, is that somehow or another this code path ends up getting run twice for each contract. The first time causes each contract to get added, but then the hashmap is built up, and so the second time, each contract "collides" with itself once. I didn't know why at the time (even just writing some of this code gave me more comfort with how solc works internally), but like I said, it was almost like the parsing process happened twice. Before I put in this stuff that actually checks for collision between insert, running through this code path twice was a non-issue, since all that happened was an insert, and inserting the same contract a second time wasn't going to cause any issues. But now that collisions are detected before the insert, it becomes more apparent.
It should be pretty easy to figure out each contract is being passed through this function twice, and where. I can take on that investigative effort if you want.
There was a problem hiding this comment.
Okay. It looks like what's happening is that there were two places in this function where contracts get added to the hashmap. The first one is back around line 176 or so. I replaced both cases of the insertion into the hashmap with a collision detector, because I naively believed there was a reason for both to exist; both of those inserts exist in the context of slightly different loop control logic. I'm not sure what really differentiates them, though; it might be one isn't actually necessary.
|
|
||
| } | ||
| else | ||
| { |
| } | ||
| // If no collision, return the contract's name | ||
| // String is copied to ensure that the contract's name can't be messed with | ||
| return std::string(matchContract.contract->name()); |
There was a problem hiding this comment.
No explicit conversion needed here.
| } | ||
| } | ||
| // If no collision, return the contract's name | ||
| // String is copied to ensure that the contract's name can't be messed with |
solc/CommandLineInterface.cpp
Outdated
| { | ||
| if (m_args.count("output-dir")) | ||
| createFile(_contract + ".bin", m_compiler->object(_contract).toHex()); | ||
| { |
| } | ||
| } | ||
| else | ||
| { |
There was a problem hiding this comment.
Another pair of braces to remove.
roadriverrail
left a comment
There was a problem hiding this comment.
Responses included; anything without a response will be corrected as requested.
|
Did a stylistic cleanup and incorporated all the proposed changes, with the exception of the ones from @chriseth regarding my guard code to prevent double-adding a contract in CompilerStack.cpp (see conversation above). I'm also going to fix up the tests so they use fully qualified names for looking up contracts. |
test/libsolidity/Imports.cpp
Outdated
| CompilerStack c; | ||
| c.addSource("a", "library A {} pragma solidity >=0.0;"); | ||
| c.addSource("b", "library A {} pragma solidity >=0.0;"); | ||
| c.addSource("a", "library A {} library A {} pragma solidity >=0.0;"); |
There was a problem hiding this comment.
After this change, the test does not test the intended feature anymore. There is a separate module that checks for library name clashes. This might be something we need to address here, too: Clashes in the linker.
There was a problem hiding this comment.
At this point, the libraries are actually internally called "a:A" and "b:A" and would be referenced as import A from "a" and import A from "b". The only way I could think to cause clashes would be to introduce a third source with imports the two of them (both as A), but I am presuming, perhaps incorrectly, that the compiler would notice this and give an error before getting to a linker. So, I'm not completely sure how to set up this test correctly.
There was a problem hiding this comment.
Regardless of what they will be called, this contract will produce an error in the type checker and not in the linker. Can you please also check how the library is called in the unlinked bytecode? There should be something like __A______ or __a:A____.
There was a problem hiding this comment.
In unlinked bytecode, it's currently in the __A_______ format, because I left the codegen stuff using name( ) instead of fullyQualifiedName( ). The code that prevents library name collisions is in CompilerStack.cpp and uses fullyQualifiedName( ), hence the reason the test fails. So, one of two things probably needs to happen-- the linker symbols start using fully qualified names or the library clash checker goes back to using name( ).
I'm somewhat more fond of the first of those two ideas, so I've actually already written it up but haven't pushed it (since I don't want to update the PR). Two issues there. The first is people's existing linker map files will need to be updated. The second is that it looks like there's a 36-character limit on link references; if that's the case, I'd love to know how we can truncate starting at the tail rather than the head, so we preserve the :A part and as much path as possible.
There was a problem hiding this comment.
Right, I would also prefer to use fully-qualified names for the linker. In the future, libraries will be linked differently and you do not have the 36 character limit. Also, the format of using __A____ is only a semi-backwards-compatible hack. You should really use solc --libraries my/file/name.sol:A=0x123 my/contract.sol for compiling, and there, there is obviously not such a small size limit.
If you want, you can merge this pull request and do the library handling in another one.
| if (!resolver.resolveNamesAndTypes(*contract)) return false; | ||
| m_contracts[contract->name()].contract = contract; | ||
|
|
||
| if (m_contracts.find(contract->fullyQualifiedName()) != m_contracts.end()) |
There was a problem hiding this comment.
I checked this and the other place again: The compiler returns an error, which means it detects the double declaration, but the aim for the compiler is to still continue as much as possible even if there was an error.
If you leave this code in, the error will be reported twice. So please change this (and the other place) to silently drop the second contract and not store it in m_contracts.
There was a problem hiding this comment.
I noticed the double-reporting of the error. Actually, it's worth noting that the only way to cause a collision now is to have the same contract twice in one file, which is actually covered under another error already, anyway. So, I think you're right that it's actually okay to simply silently continue without adding the contract since it's covered by the other error.
I will, however, be leaving comments explaining what's going on in the code, because it feels like the sort of thing that a future developer could trip over.
chriseth
left a comment
There was a problem hiding this comment.
The only remaining issue is whether this modified the linker or not and whether we have to change the test accordingly or not.
|
I coded up the library stuff already and it feels right to be in this PR.
I also think I made the library clash test case a hair more useful. Will
push in a few minutes.
…On Thu, Dec 22, 2016, 08:35 chriseth ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In test/libsolidity/Imports.cpp
<#1397>:
> @@ -104,8 +104,7 @@ BOOST_AUTO_TEST_CASE(simple_alias)
BOOST_AUTO_TEST_CASE(library_name_clash)
{
CompilerStack c;
- c.addSource("a", "library A {} pragma solidity >=0.0;");
- c.addSource("b", "library A {} pragma solidity >=0.0;");
+ c.addSource("a", "library A {} library A {} pragma solidity >=0.0;");
Right, I would also prefer to use fully-qualified names for the linker. In
the future, libraries will be linked differently and you do not have the 36
character limit. Also, the format of using __A____ is only a
semi-backwards-compatible hack. You should really use solc --libraries
my/file/name.sol:A=0x123 my/contract.sol for compiling, and there, there
is obviously not such a small size limit.
If you want, you can merge this pull request and do the library handling
in another one.
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#1397>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAlhYF8J2ydnj8o1zTFZ1LyFidsEPqm1ks5rKnydgaJpZM4K2naU>
.
|
chriseth
left a comment
There was a problem hiding this comment.
It would be great if you could change the query functions to be backwards-compatible, so that e.g. the linker also works without having to prefix the library name by path:. If it is backwards-compatible, no changes to the tests should be necessary.
solc/CommandLineInterface.cpp
Outdated
| { | ||
| if (m_args.count(g_argOutputDir)) | ||
| createFile(_contract + ".bin", m_compiler->object(_contract).toHex()); | ||
| if (m_args.count("output-dir")) |
There was a problem hiding this comment.
Is this change here on purpose?
There was a problem hiding this comment.
No. It looks like something left over from merging up that shouldn't be that way. Will fix.
|
Please also add an entry into |
|
Apologies on the delay; transition in my day job to blame. Fixed the most critical parts of this. Will look at the query process tomorrow to add a fallback like you asked. |
|
No worries! Do you still plan to make it backwards-compatible? |
|
I'm not a huge fan of having fallback behaviors like that, but I'm going to
do it. There isn't really a query "function" being used in those places.
It's really just hash lookup, so I wanted to refactor the code into a
function. That's when I also noticed it looks like there's a function in
CompilerStack called contract( ) that might already do this sort of thing,
so I just needed a little more peeking in on doing it right.
On Mon, Jan 9, 2017 at 5:23 AM chriseth <notifications@github.com> wrote:
No worries! Do you still plan to make it backwards-compatible?
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#1397 (comment)>,
or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAlhYMdEcVnr5uMBtMwHEeevXJhGT0Spks5rQgqdgaJpZM4K2naU>
.
|
The previous behaviour, courtesy of the [] operator in std::map, would uncritically store a new ContractDefinition in m_contracts even when a ContractDefinition already existed. This "resolved" collissions on contract names by clobbering the original one with the new one, and could lead to scenarios where the clobber would only be discovered when the original ContractDefinition could not be found or referred to, which was an unhelpful InternalCompilerError. This change checks the m_contracts map for a collision first and will not let the ContractDefinition be changed to a new one once it's set, throwing a CompilerError with information about the conflict.
@chriseth had suggested that it would be better if contracts were referenced in a file:contract notation, and that we output .bin files that prepend original path names if necessary to avoid a collision. This commit is mostly a draft; it still needs to be run through the test suite.
A large number of tests compile contracts while passing in an empty string for the source name. This leads to it being keyed by the name ":<contract>", while the tests try to look it up under the name "<contract>". This change resolves that issue by dropping the ':' in cases where there is, effectively, no source file to prepend anyway.
Throwing a CompilerError on multiple contract definition violates the expectations of the test suite, which thinks that compile() will return false if the code can't compile. This brings contract collision reporting in line with most of the other errors.
Since contracts and libraries only collide if they share a common source file now, this test only works if both libraries are in the same source.
The fully-qualified name of a contract with no source unit is :<Name> instead of just <Name>, so the test system needed to be adjusted accordingly.
Because contracts are uniquely identified by their source unit, there is no need for a unique error for this; it's actually covered by the checker for double-declaration of identifiers.
Using libraries leaves behind a library link reference in the binary which the linker must later resolve. These link references were still being generated by name and not by fully-qualified name. This would lead to a link-time collision between two libraries having the same name but in different source units. This change changes linker symbols over to fully-qualified names, which resolves that issue. This does potentially introduce a new problem, which is that linker symbols appear to be limited to 36 characters and are truncated. Storing paths extends the average symbol size, and it would be great if truncation was from the tail rather than the head.
The library name clash checker throws errors when two libraries of the same name are spotted. In a previous commit, this function was rewritten to use fully-qualified names instead, which makes it redundant to the checker for multiply-declared identifiers. Since it no longer serves a clear purpose, the function is being dropped.
This reverts commit c4a9ca5.
Since libaraies no longer collide on name but on fully-qualified name, you can only induce collision by colliding them as idenfitiers.
Looks like merging up munged line 188 in CommandLineInterface.cpp, so that a string literal was being used where a global variable should be.
This reverts commit f8914c6.
Properly, contracts are now looked up via <source>:<contract> identifiers called "fully qualified names." As a modicum of backward-compatibility, failure on a lookup is now backed up by seeing if the ":" exists at all, and if it doesn't, then the known contracts are scanned for any matching contract name.
|
Rebased, added, the fallback, and reverted the changes to the tests. I'm not super stoked about the fallback, mostly because somone may have really meant "A:B," but the lookup is for just "B," and the lookup could return "C:B" instead. Again, though, that might live in a world of theory that exists mostly in my head. |
|
Nice, thanks a lot! There are some tiny things, which I will just do after merging |
The previous behavior, courtesy of the [] operator in std::map, would
uncritically store a new ContractDefinition in m_contracts even when a
ContractDefinition already existed. This "resolved" collissions on contract
names by clobbering the original one with the new one, and could lead to
scenarios where the clobber would only be discovered when the original
ContractDefinition could not be found or referred to, which was an unhelpful
InternalCompilerError.
This change checks the m_contracts map for a collision first and will not let
the ContractDefinition be changed to a new one once it's set, throwing a
CompilerError with information about the conflict.
Fixes issue #1120