with ⏳ and ❤ by Victor Moreno Arribas for learning purposes from the subject Sistemas de Gestión de Seguridad de Sistemas de Información at the Facultad de Informática in the campus of Gipuzkoa from the Universidad del Pais Vasco (UPV/EHU).
- Asymmetric Cryptography
- SHA-256
- Digital Signatures
- Cryptographic Hash Functions
- Merkle Trees
- Proof of Work
- Ledger
- Blockchain
- Bitcoin
- Symmetric cryptography uses the same key for encryption and decryption.
- Asymmetric cryptography uses two different keys for encryption and decryption.
Asymmetric cryptography is a cryptographic system that uses two different keys: a public key and a private key.
Public Key | Private Key |
---|---|
Can be shared with anyone | Should be kept secret |
Encryption | Decryption |
Signing | Verification |
Video by Simply Explained
Online key demo
Online RSA key generator
SHA stands for Secure Hash Algorithm.
SHA-256 is a cryptographic hash function that takes an input and produces a 256-bit (32-byte) hash value.
This hash value is known as a message digest (very important) – typically rendered as a hexadecimal number, 64 digits (0-f) long.
SHA-256 online algorithm
Online SHA-256 generator
Steps in SHA-256
Video by Xiuminseokie21
A digital signature is a mathematical scheme for verifying the authenticity of a digital message or document.
Input: message, private key
Output: signature
Input: message, signature, public key
Output: true/false
- Authentication: Digital signature makes the receiver believe that the data was created and sent by the claimed user.
- Non-Repudiation: The sender cannot deny sending a message later on.
- Integrity: This ensures that the message was not altered during the transfer.
Video by khan Academy
Video by Lisk
Video by Computerphile
Medium article
Book for dummies
This are a special class of hash functions that have certain properties that make them suitable for cryptography.
- Deterministic: the same input always produces the same output.
- One-way: it is infeasible to compute the input data from its hash value.
- Collision resistant: it is infeasible to find two different inputs that produce the same hash value.
- Computationally efficient: it is easy to compute the hash value for any given input data.
- Hide information: the output looks random and does not reveal any information about the input data.
Video by Lisk
What is hashing?
Video by Khan Academy
Online SHA-256 hash demo
Cryptographic vs Non-crpytographic
Video by 3Blue1Brown
A Merkle tree is a data structure that allows for efficient verification of the contents of large data structures.
It is a tree in which every leaf node is labelled with the cryptographic hash of a data block and every non-leaf
node is labelled with the cryptographic hash of the labels of its child nodes.
Intro to Merkle Trees
Video by Cryptoeconomic Study
Proof of work is a piece of data which is difficult (costly, time-consuming) to produce but easy for others to verify.
This piece of data needs to comply a set of rules that are difficult to satisfy but easy to verify.
Bitcoin uses the proof of work algorithm to determine the order of transactions in the blockchain. Etherium uses the proof of stake algorithm to determine the order of transactions in the blockchain.
Video by Binance Academy
PoW by Investopedia
Video by Khan Academy
PoW vs PoS
A ledger is a record of transactions. This is the key element to keep the integrity of the blockchain.
The ledger is distributed all over the network and it is updated by the nodes.
If one node changes the ledger, in the following step it will be verified and just with a single bit been modified
the hash will be completely different and the block will be rejected.
The rest of the nodes still have an updated ledger that is valid.
Online demo ledger
Video by Sonar Systems
Video by Genesis Block HK
Video by Anders Brownworth
A blockchain is a distributed ledger that is used to record transactions across many computers so that the
record cannot be altered retroactively without the alteration of all subsequent blocks and the collusion of the network.
Blockchain is a combination of all the concepts we have explained above.
Video by 3Blue1Brown
Video by Anders Brownworth
Video by Simply Explained
Bitcoin is the first cryptocurrency and the most popular one.
First of all, we need to sign the files and also get the hash of a given file.
import hashlib
# Generates the hash for a given filename
def hasher(file):
with open(file, "rb") as f:
bytes = f.read() # read entire file as bytes
return hashlib.sha256(bytes).hexdigest()
import secrets
# Sign the file with an 8-len HEX string and a signature
def signer(file, signature):
f = open(file, "a")
sign = secrets.token_hex(4) + " " + signature
f.write(sign + "\n")
f.close()
return sign
import shutil
import os
import basics
# Signs a file with a given signature, and return the hash of the signed file
def block_hasher(path, signature):
# makes a copy of the input file
copy = shutil.copyfile(path, "signed.txt")
# signs the file and return the last line added to the file
sign = basics.signer(copy, signature)
# hashes the file
signed_hash = basics.hasher(copy)
# copy the signed file to the input file
shutil.copyfile(copy, path)
# removes the copy
os.remove(copy)
return signed_hash
--- CASE 1 ---
Hash: a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e
--- CASE 2 ---
Hash before signing: a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e
Hash after signing: 05b71e0acef1ec412c915ffad65bb3a7b38decba5bd7d8958685e1c9d32c6992
--- CASE 3 ---
Before signing: ['Hello World']
Hash: 8b41337cb9563e93cbc0c999d2c3f544eee93f2664b66ad58bcde3fcb5a5c54c
After signing: ['Hello World\n', '40d654d2 G39']
With these test cases, we can observe some properties of the SHA-256 hash function as described in the cruptographic hash functions section.
Hide information and non-predictable output
In the first case we can see that the hash value seems random, but it is not.
Collision resistant
With a 64 character hash (256 bits), it is very unlikely to find two different files with the same hash value.
Deterministic and one-way
In the second case we can check that the SHA-256 hash function is deterministic, since the hash value of the file (test.txt) is the same as in the first case since we didn't change the file.
Also, we can see that the hash function is one-way, since we can't get the original file from the hash value information.
Computationaly efficient
After running the three test cases we can observe that the hash function is computationaly efficient, since it takes a lot less than 1 second to obtain the hash value.
As we have seen in the proof of work section, the proof of work algorithm is based on finding a number that when hashed with the data of the block, the resulting hash starts with a given number of zeros.
Therefore, we are going to implement some methods to find hashes with a given number of zeros.
import os
import shutil
import blocks
import basics
# Finds a hash with a given number of zeros
def num_finder(file, signature, n_zeros):
# if n_zeros = 2 then zeros = "00"
zeros = __convert_string(n_zeros)
# make a copy of input file
copy = shutil.copyfile(file, "nf_tmp.txt")
while True:
# sign the copy and get it hash until it starts with at least n_zeros
signed_hash = blocks.block_hasher(copy, signature)
if signed_hash.startswith(zeros):
return signed_hash
# Number of zeros to string
def __convert_string(n):
zeros = ""
for i in range(n):
zeros += "0"
return zeros
When there are a lot of computers in the blockchain network that obtain a hash with a given number of zeros, the winner that gets the transaction fee is the one that obtains the hash with the largest amount of zeros in less time.
# Finds the maximum number of zeros under the given time
# We can't use num_find because process can take more time than the given seconds
def max_finder(path_input, path_output, signature, seconds):
# counter of zeros that will be incremented each time a hash with n zeros is found
max_zeros = 1
# string of zeros
zeros = __convert_string(max_zeros)
# copy of the input file
copy = shutil.copyfile(path_input, "mf_tmp.txt")
# last valid hash obtained
valid_hash = ""
# time
t = 0
# timer starts now
start = time.time()
while t < seconds:
# time left
t = time.time() - start
# hash of the copy
signed_hash = blocks.block_hasher(copy, signature)
if signed_hash.startswith(zeros):
valid_hash = signed_hash
# remove previous output file generated with fewer zeros
if os.path.exists(path_output):
os.remove(path_output)
# copy is now the output file, is signed
shutil.copyfile(copy, path_output)
# increments the number of zeros, try to find a hash with more zeros
max_zeros += 1
zeros = __convert_string(max_zeros)
# resets the file to remove the last signature
os.remove(copy)
copy = shutil.copyfile(path_input, "tmp.txt")
os.remove(copy)
return valid_hash
If we give more time to the max_finder method, we can find have more probability to find a hash with more zeros.
--- CASE 5 ---
Search time: 20 seconds
Hash: 00079cef5307c6aef6fac056684699cc332cc4d22163aee2f242a6aacb0144e4
Last line of the output file: 308ece06 ikaslea
In this case, we found a hash with 3 zeros in 20 seconds. So, we were a little lucky, usually it would find a hash with 2 zeros in 20 seconds. And this is a proof of work.
In our case, we had to hash the file "SGSSI-22.CB.O2.txt" with signature "G39", into "SGSSI-22.CB.O2.VMOR.txt".
def test_lab5():
path_input = "SGSSI-22.CB.02.txt"
path_output = "SGSSI-22.CB.02.VMOR.txt"
signature = "G39"
seconds = 20
print("Search time: " + str(seconds) + " seconds")
hash = max_finder(path_input, path_output, signature, seconds)
print("Hash: " + hash)
with open(path_output, "r") as f:
print("Last line of the output file: " + f.readlines()[-1])
--- LAB 5 ---
Search time: 20 seconds
Hash: 00e0dc3d6e03a8158da9a00cf00c10ca9667fa2a8ae246aab89a6a94d3039d43
Last line of the output file: f6cb905d G39
Now we are going to be the checkers instead of the miners, so we will have to check is a file is signed and if the hash of that file starts with zeros.
import filecmp
import basics
def checker(path_input1, path_input2):
# check if is the same file
if path_input1 == path_input2:
return False
# check if the content is the same
if not filecmp.cmp(path_input1, path_input2):
with open(path_input1, "r") as f1:
with open(path_input2, "r") as f2:
lines1 = f1.readlines()
lines2 = f2.readlines()
# checks every line except the last one
for i in range(len(lines1) - 1):
if lines1[i] != lines2[i]:
return False
# checks the last line
if not re.match(r"[a-z0-9]{8} G([0-9]{2})+", lines2[-1]):
return False
# check if the hash of the second file starts with zeros
hash = basics.hasher(path_input2)
if not hash.startswith("00"):
return False
return True
When we test with the files "SGSSI-22.CB.02.txt" and "SGSSI-22.CB.02.VMOR.txt", we get True.
And that is correct, as the second file is the one we signed in the last test from LAB 5.
The hash of this file starts with 2 zeros and is signed by "G39".
Valid file: True
As we are in a network with a lot of computers that are mining, we have to check a lot of files. Therefore, now we are going to implement a method that checks a lot of files from the same folder and returns the winner. We already mentioned, that the winner is the one that obtains the hash with the largest amount of zeros. In this case, all the files are in a folder, so we don't know the time that it took to each file to obtain the hash.
import os
import shutil
import basics
import checker
# Return the number of consecutive zeros at the beginning of the hash
def zeros_counter(hash):
zeros = 0
for char in hash:
if char == "0":
zeros += 1
else:
break
return zeros
# Returns the last line of a file
def last_line(path):
with open(path, "r") as f:
lines = f.readlines()
return lines[-1]
def files_checker(path_input, directory):
# dictionary with key=number of consecutive zeros and value=hash
zeros_hash = {}
hash_input = basics.hasher(path_input)
zeros_input = zeros_counter(hash_input)
line_input = last_line(path_input)
zeros_hash[zeros_input] = (hash_input, line_input)
# check every file in directory
for file in os.listdir(directory):
# path of the file
path = os.path.join(directory, file)
tmp1 = shutil.copyfile(path_input, "tmp1.txt")
tmp2 = shutil.copyfile(path, "tmp2.txt")
# check if the file is a valid candidate
if checker.checker(tmp1, tmp2):
hash = basics.hasher(tmp2)
zeros = zeros_counter(hash)
line = last_line(tmp2)
zeros_hash[zeros] = (hash, line)
# check if there is a tie in the max number of consecutive zeros
max_zeros = max(zeros_hash.keys())
winners = []
for key in zeros_hash.keys():
if key == max_zeros:
winners.append(zeros_hash[key])
# there is more than one winner
if len(winners) > 1:
winners.sort()
return winners[0]
# there is only one winner
else:
return winners[0]
tmp = files_checker("SGSSI-22.CB.02.VMOR.txt", "SGSSI-22.Lab06.CB.02.Candidatos")
print(tmp)
After analyzing all the files one by one, we create a dictionary that contains the hash of the file, the number of consecutive zeros as the key anf the last line (signature) to know who is the owner of the winner hash.
('00000068919edb1b9cc570cc592f754938680a7bdd82adb7e6517d0f5839e64b', '0bde8f37 G14')
Now for the activity 7.2 we are requested to sign the file "SGSSI-22.CB.03.txt" until we find a 8 hex length string in the signature that provides a hash of the file that starts with a sequence of zeros.
--- LAB 6 ---
Search time: 60 seconds
Hash: 000bf1d6a4af10b9fde91887e83aa09cf295cc0bc9ae5b1b71cba40bb0adba3d
Last line of the output file: 8e5b9fd3 G39
In our case, we obtained 3 zeros in 60 seconds, pretty good. The resulting file is "SGSSI-22.CB.03.VMOR.txt", this file is signed with an 8 lenght hex string and "G39", and also the hash of the file starts with "000". Therefore, is a valid resulting file.