
Changing from 1.1.0 to 1.1.2 breaks the hash #30

Closed
gergelyke opened this issue Feb 23, 2016 · 11 comments

@gergelyke

Hey,

did you notice anything like this? In our use case the result hash has changed :/

@addaleax
Collaborator

Yeah, I’ve thought about this a bit, but never really reached a conclusion.
The problem is: for virtually any change here (at least in the past) there is some object for which the hash changed, and it’s not obvious anyway whether the exact returned hash value should be considered part of the public API here.

Some thoughts I’ve had on this:

  • It’s hard to be portable anyway. This module supports browsers, but when the objects get complicated, browser support for certain things starts to differ (e.g. Function.prototype.name); so having the same hash in different environments is hard to guarantee, with or without code changes. Maybe things across different Node.js versions are better, but I’m not 100 % sure of that, either.
  • If one doesn’t want to hash complicated objects, then someHash(JSON.stringify(…)) will do most of the time (see the sketch at the end of this comment). The very purpose of this module is to go beyond that and be able to hash just about anything correctly, but that’s hard without some loss of portability.
  • Maybe some trade-offs can be made, e.g. that the hash value does not change (without a major version number increase) when using hash(…) with the default options or for some specific set of options (or provide some kind of option that leads to a maximum amount of portability, however that might be measured).
  • Always increasing the major version number for changed returned hashes virtually strips all meaning brought by semver from this repository; there would likely be no minor or patch version increases at all, essentially leading to version lock for all dependents of this module.

It would definitely be a good idea to specify how exactly the return values do or do not conform to semver, but it’s not an easy task and I would love to hear other people’s thoughts on this.
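
To make the JSON.stringify point concrete, here is a rough sketch (the object literals and the use of Node's built-in crypto module are only illustrative; this is not how the module works internally):

```js
const crypto = require('crypto');

const sha1 = (s) => crypto.createHash('sha1').update(s).digest('hex');

// Key order changes the JSON string, and therefore the hash,
// even though the two objects are semantically equal.
sha1(JSON.stringify({ a: 1, b: 2 })); // differs from...
sha1(JSON.stringify({ b: 2, a: 1 })); // ...this one

// Values JSON cannot represent are silently dropped or mangled.
JSON.stringify({ fn: function named() {}, u: undefined, m: new Map([[1, 2]]) });
// returns '{"m":{}}': the function and the undefined vanish, the Map becomes {}
```

Handling values like these consistently, in environments that disagree on things like Function.prototype.name, is exactly where the portability trade-off comes from.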

@brianreavis
Contributor

Always increasing the major version number for changed returned hashes virtually strips all meaning brought by semver from this repository; there would likely be no minor or patch version increases at all, essentially leading to version lock for all dependents of this module.

Is the emphasized part a bad thing?

I agree with you though, semver on a project like this is very tough to wrestle with (if not impossible). I think keeping the versioning strategy that's in place now is reasonable, but there at least needs to be a note in the readme telling people to lock object-hash at a specific version, because new versions (even patch versions) are likely to affect the result.
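
For anyone following that advice, pinning to an exact version instead of the default caret range would look roughly like this in a consumer's package.json (1.1.0 is just the version from this issue's title, used as an example; npm install --save-exact object-hash produces the same kind of entry):

```json
{
  "dependencies": {
    "object-hash": "1.1.0"
  }
}
```

With the default "object-hash": "^1.1.0" entry, npm is free to install 1.1.2 automatically, which is how a patch release with changed output can reach consumers without anyone opting in.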

brianreavis added a commit to brianreavis/object-hash that referenced this issue May 7, 2016
@addaleax
Collaborator

addaleax commented May 7, 2016

Is the emphasized part a bad thing?

Not necessarily, but it makes semver kind of pointless. Ideally, there would be a way to specify two versions, one for the API and one for the returned data (you could compare that to an extra ABI version), but I don’t really see a good way to implement that.

For now, your disclaimer seems like a reasonable idea.

@wclr

wclr commented Jul 6, 2016

I think the best way is probably to increase the major version when you change the algorithm or the approach to hash calculation, minor versions for adding capabilities, and patch versions for bug fixes. I don't see any conflict with the semver concept.

Consistency is the most important thing for this lib.

@rubennorte

I agree with @whitecolor. You could follow semver by increasing the major version whenever the algorithm generates a different hash for the same input. You can test this by comparing the output of several samples across versions (in the same environment).

For changes to the algorithm that generate the same output, you can just increase the minor or patch version.
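
A hypothetical regression test along those lines (fixtures.json and the sample values are made up for illustration; the file would hold hashes recorded with the previously released version):

```js
const assert = require('assert');
const hash = require('object-hash');

// Inputs whose hashes were recorded with the previous release.
const samples = [
  42,
  'a string',
  [1, 2, 3],
  { nested: { key: 'value' }, list: [true, null] },
];

// Hypothetical fixture file, e.g. written once with the old version installed:
// require('fs').writeFileSync('fixtures.json', JSON.stringify(samples.map((s) => hash(s))))
const recorded = require('./fixtures.json');

samples.forEach((sample, i) => {
  assert.strictEqual(
    hash(sample),
    recorded[i],
    `hash output changed for sample #${i}; treat the release as semver-major`
  );
});
```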

@jsuiker

jsuiker commented Jul 27, 2017

+1 for following semver guidelines

@addaleax
Collaborator

@jsuiker Nobody is suggesting anything else; the question is how semver would be interpreted in the context of this library.

I think it’s a good sign that there hasn’t been much change in this library this year; if the ultimate resolution to this problem is to just avoid changes in general, that might be an acceptable way out.

@jsuiker

jsuiker commented Jul 27, 2017

@addaleax Sorry for causing you any extra headache! Now that I've spent a few minutes reading the source, it makes sense why we're having this discussion - it's the very nature of a hash that you can't make a small change without completely changing the SINGLE return value.

So I say, just keep going with whatever versioning scheme works for you 👍

@jsuiker

jsuiker commented Jul 27, 2017

As for the portability issues you are talking about, I'm curious to hear why they're such a problem. I would think that once you have an idea of what sort of string should represent a function, you can strong-arm whatever output the browser gives you into what you need... thoughts?

@jwueller

jwueller commented Aug 6, 2017

I think semver does not leave any room for interpretation here:

Major version X (X.y.z | X > 0) MUST be incremented if any backwards incompatible changes are introduced to the public API. It MAY include minor and patch level changes. Patch and minor version MUST be reset to 0 when major version is incremented.

The return value is part of the public API. If the return value is not backwards-compatible (which, according to the disclaimer, often seems to be the case), that seems to require a MAJOR version increase in order to be compliant.

This is especially important when tools like npm make assumptions about compatibility (e.g. caret ranges, which are the default when saving packages). I do not think a disclaimer is a real solution, since it requires users to manually make exceptions for this library instead of being able to rely on the expected semantics of the versioning, which kind of defeats the purpose.
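
To illustrate with the standalone semver package (which implements the same range semantics npm applies):

```js
const semver = require('semver');

// A consumer that saved "^1.1.0" automatically receives 1.1.2,
// including whatever change to the hash output came with it.
semver.satisfies('1.1.2', '^1.1.0'); // true

// Only a major bump falls outside the caret range and forces an explicit opt-in.
semver.satisfies('2.0.0', '^1.1.0'); // false
```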

I do not see any disadvantage to only having MAJOR versions, either. There is no hard limit on the number of versions that can exist, so what is the harm in having a version 127.0.0, even if only a single line of code has changed? The only argument against it appears to be an aesthetic one, and I would advocate for function over form, especially when writing a library.

@addaleax
Collaborator

addaleax commented Aug 7, 2017

Okay, I’m closing this thread since by now it’s just going in circles and the replies don’t actually address the issue in question.

No, of course there is no debate about whether semver should apply to this module; the tricky bit is whether the exact return value is part of the API contract, or just the fact that it yields the same result if it’s used twice.

I’ve modified the policy in 03e12a7 to match the stricter option, i.e. changes to hash values will be considered semver-major.
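
A minimal sketch of that distinction, assuming the usual hash(value) call with default options:

```js
const hash = require('object-hash');

const value = { a: 1, b: [2, 3] };

// Always part of the contract: the same value hashes to the same string
// within a single installed version.
hash(value) === hash(value); // true

// Under the stricter policy, the exact string is part of the contract too:
// if a code change alters it, the release must be semver-major.
```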

@addaleax closed this as completed Aug 7, 2017