Bug in the caching implementation

After trying out `atheris`, based on [your example](https://github.com/google/atheris/pull/2) (it is awesome, I'd say!), I found an interesting bug in the caching. It comes from the following fact:

```python
>>> hash(-2)
-2
>>> hash(-1)
-2
```

From [PEP-456](https://www.python.org/dev/peps/pep-0456/#requirements-for-a-hash-function):

> The internal interface code between the hash function and the tp_hash slots implements special cases for zero length input and a return value of -1. An input of length 0 is mapped to hash value 0. **The output -1 is mapped to -2.**

This leads to wrong canonicalisation. For example, if `{'exclusiveMaximum': 1, 'exclusiveMinimum': -1, 'type': 'number'}` was cached first, then canonicalising `{'exclusiveMaximum': 1, 'exclusiveMinimum': -2, 'type': 'number'}` will return the result cached for the first schema, `{'exclusiveMaximum': 1, 'exclusiveMinimum': -1, 'type': 'number'}` :(
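A minimal sketch of the failure mode, assuming a cache whose key is built from `hash()` of the values rather than from the values themselves. `naive_key` and `canonicalise_cached` are hypothetical names for illustration, not the library's actual code, and the collision relies on CPython's `hash(-1) == hash(-2)`:

```python
def naive_key(schema):
    # Hypothetical cache key: keeps only the hash of each value.
    # Distinct values with equal hashes (-1 and -2) yield the same key.
    return tuple(sorted((k, hash(v)) for k, v in schema.items()))


_cache = {}


def canonicalise_cached(schema):
    # Stand-in for the real canonicalisation: it just copies the schema.
    key = naive_key(schema)
    if key not in _cache:
        _cache[key] = dict(schema)
    return _cache[key]


first = {"exclusiveMaximum": 1, "exclusiveMinimum": -1, "type": "number"}
second = {"exclusiveMaximum": 1, "exclusiveMinimum": -2, "type": "number"}

canonicalise_cached(first)             # populates the cache
result = canonicalise_cached(second)   # hits the entry stored for `first`
print(result["exclusiveMinimum"])      # -1 on CPython
```

A plain `dict` keyed on the values themselves would not have this problem, because equal hashes are followed by an equality check; the collision only becomes a wrong answer when the hash is the whole key.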

`-1` is quite common, and these cache collisions make me question the current implementation. I am not completely sure how to implement caching efficiently enough. However, in #69, after reducing how many schemas are inlined, the performance improved dramatically, and I am not sure this caching layer is worth having (at least in its current form).
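One possible direction, sketched under the assumption that schemas are plain JSON-serialisable dicts: key the cache on a sorted-key JSON string, so lookups compare the full serialised value instead of a hash. `json_key` is a hypothetical helper, and whether this is fast enough for real schemas is not measured here:

```python
import json


def json_key(schema):
    # Sorted keys make the string independent of dict insertion order.
    # The full text is compared on lookup, so -1 and -2 stay distinct.
    return json.dumps(schema, sort_keys=True)


a = {"exclusiveMaximum": 1, "exclusiveMinimum": -1, "type": "number"}
b = {"exclusiveMaximum": 1, "exclusiveMinimum": -2, "type": "number"}
a_reordered = {"type": "number", "exclusiveMinimum": -1, "exclusiveMaximum": 1}

print(json_key(a) == json_key(b))            # False
print(json_key(a) == json_key(a_reordered))  # True
```

This also keeps `1` and `1.0` apart (they serialise to different strings), which may or may not be what the canonicaliser wants.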

What do you think?
