Sized Naturals are like Natural numbers, but for hierarchical labeling, where 0
and 00
are distinct entities.
-
Sized BigInt's are bit strings for Javascript implementations (using native BigInts), to represent hashes, labels and hierarchical indexes.
For main implementation, see src_js/SizedBigInt.mjs, runs with NodeJS and main browsers.
For demos and simplified implementations, see src_js/examples. The assert files are at data/assert*.txt. -
SQL implementation: see src_sql.
-
Text, etc. under construction. See PDF or help at this link to improve the text of foundations.
-
Collabore: reporting issues, reviewing text of foundations, or installing and testing the SizedBigInt class.
(summarizing the contents of the PDF - "Sized Naturals as foundation for hierarchical labelingand extend hexadecimals for any bit string size").
Sometimes we need natural numbers, but a kind of number where 0 is not equal to 00.
Sized BigInt's are arbitrary-precision integers (BigInt) with defined number of bits, to represent hashes, labels, encodings, hierarchical indexes or any other that need to differenciate 0
and 00
, preserving all other numeric interpretations, like order (002
>001
) and the freedom to translate its positional notation to some especific radix (e.g. binary to quaternary or hexadecimal).
The following examples can be mathematically described as a finite sets of pseudo-numeric representations. Limiting examples in 8 bits:
-
Samples of base2 representations: X1 = {
0
,1
} X2 = {0
,00
,01
,1
,10
,11
} X3 = ...
X8 = {0
,00
,000
,000
,...,00000000
,00000001
, ...,11111111
}. -
The same set X8 without some (non-compatible) items, expressed in quaternary (base4):
Y8= {0
,00
,000
,0000
,0001
,0002
,0003
,001
,0010
,0011
, ...,3333
}. -
Ordering. The order in ordinary mathematical sets is arbitrary, but to group or list elements we can adopt some order. The main ordering options for typical SizedBigInts are the lexicographic order, to enhance "same prefix" grouping or hierarchy; and the pseudo-numeric order. Using the bit-length as default criterium and numeric order as optional.
Here a set of elements illustrated with different representations, listed by lexicographic order of the binary representation:
Representation
(size,value) BitString Base4
(1,0) 0 ?
(2,0) 00 0
(3,0) 000 ?
(4,0) 0000 00
(5,0) 00000 ?
(6,0) 000000 000
(7,0) 0000000 ?
(8,0) 00000000 0000
(8,1) 00000001 0001
(7,1) 0000001 ?
(8,2) 00000010 0002
(8,3) 00000011 0003
(6,1) 000001 001
(7,2) 0000010 ?
(8,4) 00000100 0010
(8,5) 00000101 0011
... ... ...
Each SizedBigInt is an element of a set. The formal definition of this set is the mathematical reference-concept for implementations.
As showed in the table above, we can represent elements of a set X as ordered pairs, (l,n) of bit‑length l and numeric value n, a Natural number. Supposing a minimum bit-length function, minBL(), the set Xk is a SizedBigInt set constrained by k, the maximum number of bits:
where
Natural numbers can be expressed with positional notation, using the rule of "remove leading zeros". The rule is used in any base (radix) representation.
The SizedBigInt's are like BigInt's without the rule of remove leading zeros, and the SizedBigInt must be the same in any base representation. This last condiction is a problem: as we see in the table, there are no base4 representation for 0
, because each digit in base4 need 2 bits.
How to convert base2 one-digit numbers 0
and 1
to base4?
The solution proposed here is to use a fake (optional) final digit that represent these values. To avoid cofusion with hexadecimal letters we can start with G
to represent 0
and H
to represent 1
. It was named "hierarchical digit" (hDigit) because the ordinary base4 digits use two bits. The resulted notation is base4h. For base16 e neeed more hDigits, but the logic is the same.
Listing some bit strings and its base4h representations.
(size,value) BitString Base4h
(1,0) 0 G
(2,0) 00 0
(3,0) 000 0G
(4,0) 0000 00
(5,0) 00000 00G
(6,0) 000000 000
(7,0) 0000000 000G
(8,0) 00000000 0000
(8,1) 00000001 0001
(7,1) 0000001 000H
(8,2) 00000010 0002
(8,3) 00000011 0003
(6,1) 000001 001
(7,2) 0000010 001G
(8,4) 00000100 0010
(8,5) 00000101 0011
... ... ...
(7,127) 1111111 333H
(8,254) 11111110 3332
(8,255) 11111111 3333
Base4h numbers are strings with usual base4 pattern and the halfDigit as optional suffix. This syntax rule can be expressed by a regular expression:
/^([0123]*)([GH])?$/
To translate from binary, only values with odd number of bits will be translate the last bit as halfDigit. The complete translation table, from binary to base4 representations, is:
{ "0":"G", "1":"H", "00":"0", "01":"1", "10":"2", "11":"3" }
Extending the hexadecimal representation, in a similar way to the previous one used for base4h: the last digit as a fake-digit that can represent all these incompatible values — so using the halphDigit values G
and H
for 1-bit values, and including more values for 2 bits (4 values) and 3 bits (8 values). The total is 2+4+8=14 values, they can be represented by the letters G
to T
.
The name of this new representation is Base16h, because it is the ordinary Base16 "plus an optional hDigit", and is used to represent hierarchical bit strings. Its string pattern is:
/^([0-9a-f]*)([G-T])?$/
TABLE-3
(size,value) BitString Base16h
(1,0) 0 G
(2,0) 00 I
(3,0) 000 M
(4,0) 0000 0
(5,0) 00000 0G
(6,0) 000000 0I
(7,0) 0000000 0M
(8,0) 00000000 00
(8,1) 00000001 01
(7,1) 0000001 0N
(8,2) 00000010 02
(8,3) 00000011 03
(6,1) 000001 0J
...
(6,63) 111111 fL
(7,126) 1111110 fS
(8,252) 11111100 fc
(8,253) 11111101 fd
(7,127) 1111111 fT
(8,254) 11111110 fe
(8,255) 11111111 ff
To translate from a binary string with b bits, there are b % 4
last bits to be translated as special digits. Splitting the value as binary prefix (part[0]
) and suffix (part[1]
with 1, 2 or 3 last bits),
let part = strbin.match(/^((?:[01]{4,4})*)([01]*)$/)
the prefix will be translated to usual hexadecimal number, and the suffix, when exists, translated by this complete "bits to haslDigit" map:
{
"0":"G","1":"H",
"00":"I","01":"J","10":"K","11":"L",
"000":"M","001":"N","010":"O","011":"P","100":"Q","101":"R","110":"S","111":"T"
}
The textual contents, images and datasets of this git repository are dedicated to the public domain.
Algorithms and software source-code licensed as Apache-2.0.