LIP: 0003
Title: Uniform ordering of delegates list
Author: Iker Alustiza <iker@lightcurve.io>
Discussions-To: https://research.lisk.com/t/uniform-ordering-of-delegates-list/
Status: Replaced
Type: Standards Track
Created: 2018-08-17
Updated: 2024-01-04
Requires: 0022
Superseded-By: 0057
This LIP addresses two issues affecting delegate list uniformity found within the reference implementation of the Lisk protocol. It proposes a way to uniformly order the delegates list by hashing each delegate’s unique identifier together with a unique identifier for the round.
This LIP is licensed under the Creative Commons Zero 1.0 Universal.
Currently, there is a bug in how the order of the delegates list is generated for every round. This is implemented in the shuffleDelegateListForRound
function of delegates_list.ts. Below you can find the reference code which generates the delegate list for every round:
const seedSource = round.toString();
const delegateList = [...list];
let currentSeed = hash(seedSource, 'utf8');
for (let i = 0, delCount = delegateList.length; i < delCount; i++) {
for (let x = 0; x < 4 && i < delCount; i++, x++) {
const newIndex = currentSeed[x] % delCount;
const b = delegateList[newIndex];
delegateList[newIndex] = delegateList[i];
delegateList[i] = b;
}
currentSeed = hash(currentSeed);
}
This code snippet contains two unrelated issues, either of which is sufficient to introduce the generation of non-uniform lists of delegates:
-
The fourth and fifth lines of code contain two
for
loops that increment the variablei
twice, every 5th time (in the inner and outer loop). This means that the shuffling logic is skipped for every 5th delegate. -
The line of code after the second
for
loop calculates the new index for the current delegate. It takes what can be assumed to be a uniformly distributed random number between 0 and 255 (currentSeed[x]
) and calculates the mod101 of it (delCount = 101). This implies that some indices are more probable than others. This happens because, for example:
0 mod 101 = 101 mod 101 = 202 mod 101 = 0
but,
100 mod 101 = 201 mod 101 = 100
Therefore, delegates in positions from 1 to 54 are 50% more likely to be chosen in the loop process than the others.
If we consider the worst affected cases caused by the two issues mentioned above:
- Being a delegate in the 5th position, 10th position, 15th position... until 55 i.e. [4, 9, ..., 54].
- Being a delegate in the 60th position, 65th position, 70th position... until 101 i.e. [59, 64,..., 99].
And if we assume the vote weighting does not change between rounds, which is the worst case scenario.
For case 1, taking for example delegate[4]:
Knowing that the probability of delegate[4] being swapped by delegate[0] (during the first loop) is 3/253, we have that:
The probability of delegate[4] not being swapped by any of the looped delegates (82 loops) is 0.38. In other words, the delegate[4] will change its position for 61.9% of the rounds. Which is the same for the rest of the delegates considered in this case since we assume a uniform distribution in the hash is generated.
For case 2, taking for example delegate[99]:
Knowing that the probability of delegate[99] being swapped by delegate[0] (during the first loop) is 2/253, we have that:
The probability of delegate[99] not being swapped by any of the looped delegates (82 loops) is 0.526. In other words, the delegate[99] will change its position 47.4% of the rounds. Which is the same for the rest of the delegates considered in this case since we assume a uniform distribution in the hash is generated. So given the current situation, it is noticeable when a delegate is one of these in the case 2, and it may likely not change its position in 2 or more rounds.
Given the study of the issue presented in the previous section, this LIP proposes a solution which will ensure that every delegate's position in the list is recalculated every round, and the probability for a delegate to end up in any position is the same for the whole set of possible positions.
The proposed fix is based on hashing each delegate’s unique identifier together with a unique identifier for the round:
- The delegate's unique identifier is the binary address of the delegate. Note that this proposal does not depend on the implementation of LIP 0018. The only requirement here is that the addresses have to be unique for each forging delegate which is ensured by the protocol in any case.
- The round's unique identifier is one of the two random seed generated according to the specifications in LIP 0022.
By having a unique hash per delegate for every round, we will ensure that in every round the list of delegates will be re-ordered based on a uniform distribution. Moreover, these changes not only fix the uniformity issue (caused by the two issues raised in the Motivation section), but it will also allow simpler and more human-readable code in any implementation.
This section specifies how the delegate list has to be ordered for each round. Currently, this is implemented in the shuffleDelegateListForRound
function mentioned above. Assuming that the delegate list for round r
has to be ordered, the process is:
-
One hash per delegate is generated using a seed of the previous round and the delegate address. For simplicity, we propose to use SHA-256, which is the hashing function already utilised in the existing function:
forEach delegate in delegateList delegate.roundHash = hash(randomSeed1 | delegate.address)
where
|
indicates bytes concatenation andrandomSeed1
is the random seed for roundr-1
as specified in LIP 0022. The listdelegateList
contains the address of each of the forging delegates for roundr
. -
The delegates are reordered in lexicographical order according to the hashes generated in step 1. In the unlikely event that several delegates have the same hash, they are ordered lexicographically by delegate address:
(delegate1.roundHash < delegate2.roundHash) || ((delegate1.roundHash == delegate2.roundHash) && (delegate1.address < delegate2.address))
This change will cause a hard fork. Nodes implementing this fix will generate a different delegate list every round compared to the previous version, therefore every node will need to use an implementation which contains the proposed fix in order to maintain consensus.
Additionally, the current algorithm will need to be maintained for syncing/validation of blocks before this proposal is implemented.
TBD