@@ -8,18 +8,8 @@ An Entry to the One Billion Row Challenge in Object Pascal using Delphi 12 by [E
### Dependencies

- Project uses Delphi units: `Classes`, `System.SysUtils`, `System.StrUtils` and `Math`.
-
- ### UTF8 vs. Windows Terminal
-
- The text in the Windows Terminal console uses the system code page, which does not play well with `UTF8`.
- The only way to match the approved result is to write the output to a file, with resulting `SHA256` hash:\
- `4256d19d3e134d79cc6f160d428a1d859ce961167bd01ca528daca8705163910`
-
- If the Windows console output is redirected to a file, some characters are mangled, and the resulting `SHA256` hash is:\
- `5c1942377034a69c7457f7cf671b5f8605df597ef18037c1baf4b9ead3c84678`
-
- For the challenge, compiled for LINUX, the console result will (hopefully) be correct.
+ Project uses Delphi System units: `Classes`, `SysUtils`, `StrUtils`, `Diagnostics`,
+ `Threading` and `SyncObjs`.

### Execution
```
@@ -30,6 +20,9 @@ For the challenge, compiled for LINUX, the console result will (hopefully) be co
bfire -i <file_1> -o <file_2> | <file_1> contains Weather Data
                              | <file_2> contains result
If <file_2> is not defined, result goes to CONSOLE (STDOUT)
+
+ Select 1, 2 or 3 reading threads (use -r in addition to -o)
+ bfire -i <file_1> -o <file_2> -r <n>
```
#### Contest Mode
@@ -51,19 +44,23 @@ The list is initially unsorted and has linked objects for records holding accumu
Finally, the TStringList is sorted and used to output sorted data.

Third version has a thread for the console (which waits for tabulation, then sorts and writes results),
- one thread to read file, four threads to tabulate stations (split by section of alphabet). File is read
- byte-wise into "classic" byte arrays for station name and temperature. The arrays are passed to one of
- four stacks, split by section of alphabet, for tabulation. Tabulation threads hash station name, use hash
- as index into a data array. After all data is read and tabulated, the four data arrays are added to an
+ two threads to read file, five threads to tabulate stations. Stations are grouped into five separate stacks,
+ so each tabulation thread has roughly the same work load. File is read byte-wise into "classic" byte array
+ for each file line ending in ascii 10. Each of these arrays is queued as a record in a last-in-first-out stack.
+ Tabulation threads split the data into station name and temperature, then hash station name and use hash
+ as index into one of five data arrays. After all data is read and tabulated, the five data arrays are added to an
initially unsorted TStringList that holds unsorted Unicode station name and has linked pointers to
tabulated data for each station. Finally, the TStringList is sorted, and the data is output.

## History

- - Version 1.0: First working version, based on TStringList.
- - Version 1.1: Modified rounding to new baseline.
- - Version 2.0: Use hashing, sort later.
- - Version 2.1: Minor speed tweaks.
- - Version 2.2: Try hash functions modification.
+ - Version 1.0: first working version, based on TStringList.
+ - Version 1.1: modified rounding to new baseline.
+ - Version 2.0: use hashing, sort later.
+ - Version 2.1: minor speed tweaks.
+ - Version 2.2: try hash functions modification.
- Version 3.0: Six threads: one to read, four to tabulate, one (console) to rule them all...
- - Version 3.1: Safer locking strategy
+ - Version 3.1: Safer locking strategy - didn't work.
+ - Version 3.2: Eight threads: two to read, five to tabulate, one (console) to rule them all...
+ - Version 3.3: Use 1, 2 or 3 threads to read.
+
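
The tabulation step described for the third version (split a line into station name and temperature, hash the name, use the hash as index into a data array) could look roughly like the sketch below. It is a single-threaded illustration of one data array, not the code from this entry: the FNV-1a hash, the linear probing, the table size and every identifier (`TStation`, `HashName`, `SplitLine`, `Tabulate`) are assumptions, and `AnsiString` stands in for the "classic" byte arrays.

```pascal
program TabulateSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
{$Q-}{$R-} // the hash relies on 32-bit wrap-around, so overflow checks stay off

const
  TableSize = 1 shl 16; // hypothetical capacity; a power of two so "and" can mask

type
  TStation = record
    Name: AnsiString;    // raw bytes of the station name
    Count: Integer;
    Sum: Int64;          // temperatures are kept in tenths of a degree
    MinT, MaxT: Integer;
  end;

var
  Table: array[0..TableSize - 1] of TStation; // globals start zeroed, so Count = 0
  I: Integer;

// FNV-1a over the name bytes (an assumed hash, not necessarily the one bfire uses)
function HashName(const S: AnsiString): Cardinal;
var
  K: Integer;
begin
  Result := 2166136261;
  for K := 1 to Length(S) do
    Result := (Result xor Byte(S[K])) * Cardinal(16777619);
end;

// Split 'Name;-12.3' into the name and the temperature in tenths of a degree
procedure SplitLine(const Line: AnsiString; out Name: AnsiString; out Tenths: Integer);
var
  P, K: Integer;
  Negative: Boolean;
begin
  P := 1;
  while (P <= Length(Line)) and (Line[P] <> ';') do
    Inc(P);
  Name := Copy(Line, 1, P - 1);
  Tenths := 0;
  Negative := False;
  for K := P + 1 to Length(Line) do
    case Line[K] of
      '-': Negative := True;
      '0'..'9': Tenths := Tenths * 10 + (Ord(Line[K]) - Ord('0'));
    end; // the decimal point is skipped, so '12.3' becomes 123
  if Negative then
    Tenths := -Tenths;
end;

procedure Tabulate(const Line: AnsiString);
var
  Name: AnsiString;
  Tenths: Integer;
  Idx: Cardinal;
begin
  SplitLine(Line, Name, Tenths);
  Idx := HashName(Name) and (TableSize - 1);
  // linear probing: step past slots already owned by a different station
  while (Table[Idx].Count > 0) and (Table[Idx].Name <> Name) do
    Idx := (Idx + 1) and (TableSize - 1);
  if Table[Idx].Count = 0 then
  begin
    Table[Idx].Name := Name;
    Table[Idx].MinT := Tenths;
    Table[Idx].MaxT := Tenths;
  end
  else
  begin
    if Tenths < Table[Idx].MinT then Table[Idx].MinT := Tenths;
    if Tenths > Table[Idx].MaxT then Table[Idx].MaxT := Tenths;
  end;
  Inc(Table[Idx].Count);
  Inc(Table[Idx].Sum, Tenths);
end;

begin
  Tabulate('Hamburg;12.3');
  Tabulate('Hamburg;-4.0');
  Tabulate('Oslo;7.5');
  // entries come out in hash order; sorting by name would happen afterwards
  for I := 0 to TableSize - 1 do
    if Table[I].Count > 0 then
      WriteLn(Table[I].Name, ' min=', Table[I].MinT / 10:0:1,
        ' mean=', Table[I].Sum / Table[I].Count / 10:0:1,
        ' max=', Table[I].MaxT / 10:0:1);
end.
```

Keeping temperatures as integer tenths avoids floating-point work per line. With one such array per tabulation thread, as the description suggests, each array would be written by a single thread only.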