"No match found" when SHA-256 hash is lowercase and starts with 8+ numbers #165

Closed
@kurtmckee

Description

When a lowercase SHA-256 hash that starts with 9 digits is copied to the clipboard, OpenHashTab does not recognize it as a SHA-256 hash and reports no match.

Steps to reproduce

  1. Configure OpenHashTab with the settings shown at the bottom of this post.

  2. Download Logseq-win-x64-0.9.5.exe (direct link) and SHA256SUMS.txt (direct link) from the Logseq 0.9.5 tag page.

  3. Open SHA256SUMS.txt and copy the SHA-256 checksum of Logseq-win-x64-0.9.5.exe to the clipboard.

  4. Open the Explorer properties for Logseq-win-x64-0.9.5.exe and go to the Hashes tab.

  5. Observe that "No match found" is reported for the SHA-256 hash in the clipboard. In addition, CRC-32 and XXH-32 (algorithms whose hashes are 8 characters long) are unexpectedly enabled.

  6. Close the properties dialog.


  7. Paste the lowercase SHA-256 checksum into Notepad++, highlight the hash, and press [CTRL]+[SHIFT]+U to uppercase the checksum. Copy the uppercase checksum into the clipboard.

  8. Open the Explorer properties again for Logseq-win-x64-0.9.5.exe and go to the Hashes tab.

  9. Observe that the uppercase checksum is recognized and valid.

Things I tested

I attempted to isolate the issue in these ways:

  • I toggled all four combinations of the "Display hashes in uppercase" and "Export hashes in uppercase" settings and tested against lowercase versions of the checksum. In all cases, the lowercase hash resulted in "No match found".
  • I tried to reproduce the issue with other arbitrary files on my computer, but OpenHashTab correctly matched each of them against its own lowercase SHA-256 checksum.

Suspected source of bug

I noticed that the first 9 characters of the checksum are digits, and that the short 8-character hash algorithms (CRC-32 and XXH-32) kept getting enabled for this specific checksum. That made me think the source of the bug might be the algorithm detection, which led me to the following location:

std::vector<uint8_t> utl::FindHashInString(std::wstring_view wv)
{
  static auto regex = ctre::match<LR"(((?:[0-9A-F]{2} ?)(?:[0-9A-F]{2} ?)(?:[0-9A-F]{2} ?)(?:[0-9A-F]{2} ?)++|(?:[0-9a-f]{2} ?)(?:[0-9a-f]{2} ?)(?:[0-9a-f]{2} ?)(?:[0-9a-f]{2} ?)++))">;
  if (auto [whole, hash] = regex(wv); whole)
    return HashStringToBytes(std::wstring_view{ std::wstring(hash)});
  return {};
}

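I think the uppercase alternative of this regular expression, which is listed first, may be matching the leading digits (digits are valid in both the [0-9A-F] and [0-9a-f] character classes) and stopping at the first lowercase letter, so that only the first 8 characters are extracted. An 8-character result would also explain why CRC-32 and XXH-32 get enabled.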
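To illustrate the suspicion, here is a minimal sketch using std::regex rather than CTRE, so it only approximates the real code. The assumptions: the possessive `++` is rewritten as `{4,}` because std::regex has no possessive quantifiers, std::regex_search stands in for the CTRE call, and the function names `FirstAlternativeMatch` and `LongestSingleCaseMatch` are hypothetical. The second function is one possible workaround, not necessarily how OpenHashTab should fix it.

```cpp
#include <regex>
#include <string>

// Simplified stand-in for the pattern in utl::FindHashInString.
// The uppercase alternative is listed first, as in the original.
// ECMAScript alternation is ordered: the first alternative that
// succeeds wins, even if a later one would match more text.
inline std::string FirstAlternativeMatch(const std::string& text)
{
    static const std::regex pattern(
        "((?:[0-9A-F]{2} ?){4,}|(?:[0-9a-f]{2} ?){4,})");
    std::smatch m;
    return std::regex_search(text, m, pattern) ? m[1].str() : std::string{};
}

// Hypothetical workaround: search for each case separately and keep
// the longer result, so leading digits cannot cut a lowercase hash short.
inline std::string LongestSingleCaseMatch(const std::string& text)
{
    static const std::regex upper("(?:[0-9A-F]{2} ?){4,}");
    static const std::regex lower("(?:[0-9a-f]{2} ?){4,}");
    std::smatch u, l;
    const std::string up =
        std::regex_search(text, u, upper) ? u.str() : std::string{};
    const std::string lo =
        std::regex_search(text, l, lower) ? l.str() : std::string{};
    return lo.size() > up.size() ? lo : up;
}
```

With a made-up 64-character lowercase string that starts with 9 digits (for example "123456789abcdef0" repeated four times), FirstAlternativeMatch returns only the first 8 digits, while LongestSingleCaseMatch returns the whole string. An uppercase version of the same string is returned in full by both functions.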

My OpenHashTab configuration

image
