Skip to content

The stupidest way to prove wrong to those who think the glyphs are letters.

Notifications You must be signed in to change notification settings

Halocrea/HuragokForce

Repository files navigation

STUPID BRUTEFORCING OF THE SUBGLYPHS

Intro

The numbers in the variable sequence are just meant to be the subglyphs and not to point at a specific order[1]. The number of variation for 10 characters and 26 choices each is 19,275,223,968,000[2] (and that's too much).

Presumptions

As computations of all variations without repetition means factorial complexity, we needed to narrow down a few things:

  1. We're going to assume that the subglyphs are roman letters we can "translate" such as "subglyph 1 = X, subglyph 2 = N" and so on.
  2. We're assuming they are english words.
  3. Given that the subglyph 2 is by far the most common and repeated 6 times, and looking at frequency analysis of the letters in english vocabulary[3], we're going to assume this is the "E" letter.
  4. Our alphabet is going to be amputated of the 4 less commonly used letters in english vocabulary: Q, J, Z, X ( < 0.2% frequency)[3].
  5. risky=iest asumption: we're going to assume that the subglyph 0, being between two "E"s, is a "Y". It's probably stupid, but we need to narrow those numbers down to something almost reasonable.

Basic knowledge

With all those base criterias, we end up with 8 characters that can be any of 20 letters, which makes 5,079,110,400 possibilities.

I ran a test (in debug and for one minute, so give or take) locally, and I had on average 150,000 computed variations every minute.

Based on that observation, computing and testing 5,079,110,400 possibilities may take around 23 days on a computer running the program 24/7.

Language Detection

Once the program computed a variation, we'll get it tested with a library that will determine the language that may be used in it[4]. The issue is: the library (and many others, this one is actually performing pretty OK) is obviously struggling with the lack of space in the sentence.

But choices need to be made: I ran some tests, and the library gives a "level of confidence":

  • the lower that number is, the most confident the program is about its deduction,
  • trying 19 character long, non-spaced, english sentences gave me levels between 3953 and 3962,
  • based on this, the program will only store english sentences with a level of confidence equal or smaller than 3960. [EDIT: was 3964, but running a test on 150,000 variations, it seemed to keep still too many results]

Links & Resources

About

The stupidest way to prove wrong to those who think the glyphs are letters.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages