This course is targeted towards computer science students of the University of Applied Sciences in Fulda, but you can still follow along if you are familiar with Java. There are accompanying videos in German on PeerTube and YouTube. You can have a look at the slides used in the videos here.
If you do not speak German: the important parts will be covered here in this README or with alternative resources.
You might have learned the basics of Java in the module Programming 1 but are not sure how to apply them in a full-blown project. Well, that's where Programming 2 comes in handy, but until then let's build a blockchain. Needless to say that this will be a prototype and not the bullet-proof foundation of the next hot crypto-currency. However, we will learn a thing or two about real-world blockchains along the way.
Before we build, we shall know what to build.
Course video in German: PeerTube, YouTube (~9min)
Hearing about blockchain probably triggers bitcoin somewhere else in the brain. Rightfully so, as it is a striking example of a blockchain's possibilities. Instead of the course video you can also watch this to get an impression: But how does bitcoin actually work? (~27min) or read this paper by the "founder" of bitcoin.
To put it briefly: A blockchain is just a list of stuff you want to record. Apart from the actual data you care about, each entry has some additional information to make it almost impossible to alter the entries later on without getting caught in the act. Immutability is achieved by linking entries together (it is a chain after all). How? you might ask. Let's find out.
Course video in German: PeerTube, YouTube (~12min)
horstl is the management system of Hochschule Fulda.
One feature of it is requesting one's grades.
We will create a blockchain that securely stores read-only grades of students and call it the horstlChain.
We instantiate objects of the class ExamAttendance
as our data points.
To avoid thinking up exam results every time we want to create an instance of ExamAttendance
,
we instead call the static method getRandomAttendance
that creates random objects of this class.
Course video in German: PeerTube, YouTube (~14min)
First, let us make things easier and only consider a single block. Think of it as one entry in the list of things we want to record.
public class Block {
private ExamAttendance examAttendance;
private String hash;
private String previousHash;
private String timestamp;
/* omitting parts of the code to keep snippet brief */
}
Wait, that is way more than just an exam attendance. Well yes. Otherwise, we wouldn't need an additional class :)
One attribute is the timestamp
, which is not a necessity but nice to have.
Not only can we later trace the creation date of the block, but it also helps with checking consistency,
e.g. a block later on in the chain could not have an earlier timestamp than any previous block.
What helps even more with consistency is storing the previousHash
, i.e. linking to the previous block.
A hash
is an (almost) unique identifier for a block, like a fingerprint.
It is created by putting all the data of the block into a blender.
The result is a long row of zeroes and ones.
Here comes the trick: part of the data we put into the blender is the previousHash
, meaning the hash of the predecessor block.
A block's resulting hash
(identity) is therefore tied to its ancestor.
Like we humans carry parts of our ancestors in us.
What a beautiful thing.
Course video in German: PeerTube, YouTube (~17min)
Obviously it is not a real blender that creates the hash
.
We rather rely on an ingenious cryptographic algorithm to produce a unique outcome.
There are several algorithms available.
In our project we will use one from the SHA-3 family, namely SHA3-256.
We don't need to get into the details of the algorithm, that's something for your cryptography or security class.
Nevertheless, we should be aware of two of its features.
To follow along, you can try out some hashing here.
For example:
input:
University of Applied Sciences Fulda
SHA3-256 output (64 characters in hexadecimal):
18a399ab855ef479c77820ac4155f3954835bc914e3f378844beb5e15a5317f8
It is noticeable that, no matter what input we are hashing, the SHA3-256 will always output a string with a fixed length of 64.
This string consists of the characters 0-9
and a-e
.
It is the hexadecimal representation of a sequence of zeroes and ones with the length of 256.
That's where the 256 in SHA3-256 comes from.
SHA3-256 output (256 digits in binary):
1100010100011100110011010101110000101010111101111010001111001110001110111100000100000101011000100000101010101111100111001010101001000001101011011110010010001010011100011111100110111100010000100010010111110101101011110000101011010010100110001011111111000
The algorithm has another exciting property. If we change the input just slightly, we will get a completely different output:
input:
University of Applied Science Fulda
SHA3-256 output (in hexadecimal):
0db1827db07b5a4a2e75adfeb720df2af6115b68d4053c6803a6ce61b838abe2
Do you spot the difference? Yes, I omitted the trailing s
in Sciences
and the hash looks nothing like before.
We will later see why these properties are important to us.
First, let's look at the code.
Don't worry if you don't fully get what is happening here.
It will be explained below.
public class Utils {
public static String getSha3256Hash(String originalString) {
/* ... */
}
private static String bytesToHex(byte[] hash) {
/* ... */
}
}
So we have a Utils
class here with two methods.
Programmers often create this kind of classes to have helper methods at hand, e.g. to convert something into something else.
The advantage of these public static
methods is that we can call them from anywhere in our code.
getSha3256Hash
turns an input string into a SHA3-256 hashbytesToHex
converts an array ofbyte
values into aString
To be honest, the standard Java class MessageDigest
does the heavy lifting for us.
All we need to know is, that we can call getSha3256Hash
to get the hash
of whatever we put as an argument.
Considering our previous example:
Utils.getSha3256Hash("University of Applied Sciences Fulda")
will return 18a399ab855ef479c77820ac4155f3954835bc914e3f378844beb5e15a5317f8
.
Now that we are equipped with a hash-puking method, we can start creating hashes.
To achieve the blockchain's traceability and integrity, we mix everything we want to secure into the input.
Back in our Block
class we call our helper method in the method calculateHash
:
public class Block {
/* omitting parts of the code to highlight only the discussed parts */
public Block(ExamAttendance examAttendance, String previousHash) {
/* ... */
this.hash = calculateHash();
}
private String calculateHash() {
return Utils.getSha3256Hash(examAttendance.toString() + previousHash + timestamp);
}
}
Course video in German: PeerTube, YouTube (~15min)
It's time to get the ball rolling by building our chain.
You might have guessed what that means.
We need another class: the HorstlChain
.
Its main task is to manage blocks (add and read).
Two attributes help to achieve this:
public class HorstlChain {
private Block[] horstlChain;
private int currentIndex = 0;
/* ... */
}
The horstlChain
array stores the blocks.
currentIndex
records the position of the latest block within the array.
When we give birth to our blockchain by calling the constructor, we initialize the array with a length and put the first (genesis) block into the array:
public class HorstlChain {
/* ... */
public HorstlChain(ExamAttendance examAttendance) {
horstlChain = new Block[100];
horstlChain[0] = new Block(examAttendance, "genesis");
}
/* ... */
}
A caveat with this implementation is that the size of the blockchain is fixed (in this case to 100 elements). However, if we reach that limit we can always create a larger array to which we transfer the content of the smaller array.
Afterwards we can add further blocks with the addBlock
method.
Here we retrieve the hash of the latest block to use it as the previousHash
.
public class HorstlChain {
/* ... */
private void addBlock(ExamAttendance examAttendance) {
String previousHash = horstlChain[currentIndex].getHash();
horstlChain[++currentIndex] = new Block(examAttendance, previousHash);
}
/* ... */
}
So far we have only done simple array manipulations. No trace of the promises of blockchain technology: easy to verify but difficult to change. Remember the hashes? Now it is their time to shine.
Course video in German: PeerTube, YouTube (~15min)
As we don't want to trust blindly, we verify with isValid()
that everything has been stored correctly.
This method iteratively grabs two adjacent blocks and runs checks on them.
public class HorstlChain {
/* ... */
private boolean isValid() {
Block currentBlock;
Block previousBlock;
int i = 1;
while (i <= currentIndex) {
currentBlock = horstlChain[i];
previousBlock = horstlChain[i - 1];
// stored hash does not equal calculated hash
if (!currentBlock.getHash().equals(currentBlock.calculateHash())) {
return false;
}
// stored previousHash does not equal actual previous hash
if (!currentBlock.getPreviousHash().equals(previousBlock.getHash())) {
return false;
}
i++;
}
return true;
}
/* ... */
}
There are several ways an ill-intentioned individual could tamper our chain. Let's consider these two:
- change the
ExamAttendance
object in a block without changing the block'shash
It is highly unlikely that with a differentExamAttendance
object the hash function produces the samehash
as before the alteration. To check the integrity, we run the calculation of the block'shash
again and look if the new result is identical to the already storedhash
. - completely replace a block with a different one
In this case, the culprit would do well to include thehash
of the predecessor as thepreviousHash
in the new block. Regardless, the successor of the illegitimately added block still contains thehash
of the original predecessor as itspreviousHash
, so this is what we check for in the second part ofisValid()
.
We can imagine more checks, e.g. trace if all timestamps are in chronological order or if a block's timestamp is earlier than the exam date etc. I leave that as an exercise to the reader.
Course video in German: PeerTube, YouTube (~12min)
In a public blockchain, the data that is used to calculate the hash
is visible for everyone.
That's why a well-equipped hacker might still get away with fraudulently exchanging blocks.
We visualize the attack with a blockchain of 4 blocks:
__________ __________ __________ __________
| | |prev: ab| |prev: ee| |prev: xy|
|hash: ab|---|hash: ee|---|hash: xy|---|hash: jk|
|________| |________| |________| |________|
The attacker changes the content of the second block, resulting in a new hash.
changed
__________ __________ __________ __________
| | |prev: ab| |prev: ee| |prev: xy|
|hash: ab|---|hash: ff|---|hash: xy|---|hash: jk|
|________| |________| |________| |________|
To not get detected, the attacker stores a new previousHash
in the third block.
As the previousHash
is part of the calculated hash
, the hash
of the third block needs to change, too.
changed
__________ __________ __________ __________
| | |prev: ab| |prev: ff| |prev: xy|
|hash: ab|---|hash: ff|---|hash: rp|---|hash: jk|
|________| |________| |________| |________|
This procedure continues until the last block in the chain is reached.
changed
__________ __________ __________ __________
| | |prev: ab| |prev: ff| |prev: rp|
|hash: ab|---|hash: ff|---|hash: rp|---|hash: zt|
|________| |________| |________| |________|
If we examine our chain now, everything looks innocent, although the second block's content has changed.
So the attacker "simply" has to recalculate all the hashes that follow the manipulated block. Depending on the chain's length and the position of the targeted block this might take a while, but it is still somehow doable. We need a mechanism that turns somehow doable into highly impractical. Introducing: Proof of Work.
Proof of Work means it is easily recognizable that you have put effort into something. Imagine, before attaching a new block to the blockchain you would need to solve a Sudoku puzzle. For each block you need to spend some time. If an attacker now tries to change one block and reassign all following blocks, there would be a lot of puzzles to solve...
In reality, there are no Sudokus involved.
Instead, the hashes serve as a (not so entertaining) puzzle.
The rule: a hash must comply with a certain characteristic, otherwise it is treated as invalid.
Let's say all hashes need to start with six zeroes, e.g. 000000a44b24323cdd3
.
If our hash does not comply, we need to generate new hashes, until we find one that adheres to the rule.
Wait a minute. Hashes are based on the input to the hash function. In order to generate a different hash we would have to change the input.
Exactly.
But we don't want to change the timestamp
nor the previousHash
nor the data we would like to store.
These are fixed.
Therefore, we add another element to the input.
We call it nonce.
We then modify this element until we find a matching hash.
The nonce could be a number that we increment by 1 with each hash calculation.
Finding a solution to this hash puzzle will take a considerable amount of time (depending on the rules for the hash), because the only known way at the moment is by brute force searching for a solution, i.e. trying out all the possibilities.
Course video in German: PeerTube, YouTube (~12min)
We add the discussed nonce as another attribute of Block
.
public class Block {
/* ... */
int nonce = Integer.MIN_VALUE;
/* ... */
}
nonce
is initialized with the lowest available value, so that we can count up as often as possible.
The difficulty
is controlled by the HorstlChain
class.
To see an effect during hash calculation while also not waiting forever I settled with five zeroes.
public class HorstlChain {
/* ... */
private String difficulty = "00000";
/* ... */
}
Now we are ready to mine blocks.
The associated method resides in Block
, so that we can call it on each Block
object.
If the current hash does not have the required number of leading zeroes, it calculates a new hash with a different nonce
.
If we are very lucky, we get a suitable hash with the first hash calculation in the
Block
constructor. But the probability of it decreases exponentially with each additional zero. Find out more about it here.
public class Block {
/* ... */
public String mineBlock(String difficulty) {
while (!this.hash.startsWith(difficulty)) {
nonce++;
this.hash = calculateHash();
}
return this.hash;
}
/* ... */
}
Back in the chain we need to adjust our addBlock
method to make sure we have a valid block before adding it to the chain.
public class HorstlChain {
/* ... */
private void addBlock(ExamAttendance examAttendance) {
String previousHash = horstlChain[currentIndex].getHash();
Block blockToAdd = new Block(examAttendance, previousHash);
blockToAdd.mineBlock(difficulty);
horstlChain[++currentIndex] = blockToAdd;
}
/* ... */
}
Yes, we are done!
We have achieved what we set out to do.
Of course, this is just the beginning.
It takes one look at the incredible size of bitcoin's source code to set our sights low.
Nevertheless, we got off to a good start.
And now try to add your own ideas to the horstlChain
.