What information should the fingerprint be based on? #69

Open
@konstin

Description

We want ruff to generate suitable fingerprint values for the GitLab Code Quality integration (PR). The question is: what information should the fingerprint be based on?

If we take, for example, the following Python code ...

def a(x=[]):
    x.append(1)
    print(x)


def b(y=[]):
    y.append(2)
    print(y)

... and run ruff on it, we get two B006 violations:

$ ruff --select B --show-source scratch.py
scratch.py:1:9: B006 [*] Do not use mutable data structures for argument defaults
  |
1 | def a(x=[]):
  |         ^^ B006
2 |     x.append(1)
3 |     print(x)
  |
  = help: Replace with `None`; initialize within function

scratch.py:6:9: B006 [*] Do not use mutable data structures for argument defaults
  |
6 | def b(y=[]):
  |         ^^ B006
7 |     y.append(2)
8 |     print(y)
  |
  = help: Replace with `None`; initialize within function

How should these be hashed for the fingerprint? If we include only the message and the source of the violation ([]), we get two identical fingerprints. If, on the other hand, we include the line number, the fingerprint changes whenever a line is inserted or removed above the violation.
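To make the trade-off concrete, here is a minimal sketch of both options applied to the two violations above. The `fingerprint` helper, the dictionary layout, and the choice of SHA-256 are assumptions made for illustration only; none of this is ruff's actual implementation.

```python
import hashlib


def fingerprint(*parts: str) -> str:
    """Hash the given parts into a hex digest (hypothetical helper)."""
    # A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()


MESSAGE = "Do not use mutable data structures for argument defaults"

# The two B006 violations from the ruff output above (assumed layout).
violations = [
    {"line": 1, "message": MESSAGE, "source": "[]"},
    {"line": 6, "message": MESSAGE, "source": "[]"},
]


def by_content(v: dict) -> str:
    """Option 1: hash only the message and the violating source."""
    return fingerprint(v["message"], v["source"])


def by_line(v: dict) -> str:
    """Option 2: additionally hash the line number."""
    return fingerprint(v["message"], v["source"], str(v["line"]))


content_fps = [by_content(v) for v in violations]
line_fps = [by_line(v) for v in violations]

# Simulate inserting one line at the top of the file: every violation moves down.
shifted = [dict(v, line=v["line"] + 1) for v in violations]
shifted_fps = [by_line(v) for v in shifted]

print("option 1 collides:", content_fps[0] == content_fps[1])
print("option 2 collides:", line_fps[0] != line_fps[1] and False or line_fps[0] == line_fps[1])
print("option 2 stable after insert:", line_fps == shifted_fps)
```

With option 1 both violations hash to the same value. With option 2 they are distinct, but none of the fingerprints survives the inserted line.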
