-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build blocks asynchronously #38
base: main
Are you sure you want to change the base?
Conversation
0ccaf2d
to
3425471
Compare
92eb71e
to
0f3ab5e
Compare
When it is a node's turn to propose or verify a block, it calls the BlockBuilder's BuildBlock() and Verify() methods respectively. However, this method may take a long time to complete, and during this time, the Epoch instance cannot process messages (since proposing a new block may be called indirectly via HandleMessage). This commit adds an asynchronous task scheduler which the Epoch uses to register a callback to make itself propose the block or verify it it asynchronously and continue the program flow from that point onwards. While the block is verified or built, Simplex is free to handle other messages. Signed-off-by: Yacov Manevich <[email protected]>
epoch.go
Outdated
|
||
// This finalization may correspond to a proposal from a future round, or to the proposal of the current round | ||
// which we are still verifying. | ||
if (e.round < finalization.Round && finalization.Round-e.round < e.maxRoundWindow) || pendingProposal { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Would be a little more clear if we dont have a pendingProposal
variable and simply perform all the logic in if statement above? Would make the log more descriptive, and any common code could be separated into a helper method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
epoch.go
Outdated
msgsForRound.finalization = message | ||
return nil | ||
} | ||
|
||
// Have we already finalized this round? | ||
round, exists := e.rounds[finalization.Round] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should we just initialize round, exists
before line 428 and remove this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
epoch.go
Outdated
msgsForRound.vote = message | ||
return nil | ||
} | ||
|
||
// TODO: what if we've received a vote for a round we didn't instantiate yet? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont think this comment is relevant anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
epoch.go
Outdated
return nil | ||
} | ||
|
||
func (e *Epoch) scheduleBlockVerification(block Block, md BlockHeader, from NodeID, vote Vote) func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: a better name could be createBlockVerificationTask
since this method doesn't actually schedule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -705,11 +775,24 @@ func (e *Epoch) handleBlockMessage(message *BlockMessage, _ NodeID) error { | |||
return nil | |||
} | |||
|
|||
pendingBlocks := e.sched.Size() | |||
if pendingBlocks > e.maxPendingBlocks { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im concerned some malicious node in the future could dos the scheduler and single handedly fill it up. This would starve all blockMessageMessages
from other nodes even if they are from rounds before the malicious nodes block. Should we add some check to avoid adding duplicate tasks to the scheduler or is this handled in a different way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent point.
We should not accept a block for a round that is more than e.maxRoundWindow
ahead. I thought this check was being performed, but after going through the code I see it doesn't enforce it explicitly.
I added the check + a test
epoch.go
Outdated
// TODO: timeout | ||
} | ||
// Create a task that will verify the block in the future, after its predecessors have also been verified. | ||
task := e.scheduleBlockVerification(block, md, from, vote) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the point of lines 813-827 to do some basic pre verification?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we couple votes from block proposers with the blocks they propose, we don't want some nefarious actor to take a vote from the real proposer and send us its own block + the vote from the real block proposer.
We therefore check the vote actually votes on the block, and is not voting for an alternative block.
epoch.go
Outdated
round.votes[string(vote.Signature.Signer)] = &vote | ||
|
||
if err := e.doProposed(block, vote, from); err != nil { | ||
e.Logger.Debug("Failed voting on block", zap.Error(err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe change the logger levels to Warn
on error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
epoch.go
Outdated
func (e *Epoch) isBlockReadyToBeScheduled(seq uint64, prev Digest) bool { | ||
var ready bool | ||
|
||
if seq > 0 { | ||
// A block can be scheduled if its predecessor either exists in storage, | ||
// or there exists a round object for it. | ||
// Since we only create a round object after we verify the block, | ||
// it means we have verified this block in the past. | ||
_, ok := e.locateBlock(seq-1, prev[:]) | ||
ready = ok | ||
} else { | ||
// The first block is always ready to be scheduled | ||
ready = true | ||
} | ||
return ready |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (e *Epoch) isBlockReadyToBeScheduled(seq uint64, prev Digest) bool { | |
var ready bool | |
if seq > 0 { | |
// A block can be scheduled if its predecessor either exists in storage, | |
// or there exists a round object for it. | |
// Since we only create a round object after we verify the block, | |
// it means we have verified this block in the past. | |
_, ok := e.locateBlock(seq-1, prev[:]) | |
ready = ok | |
} else { | |
// The first block is always ready to be scheduled | |
ready = true | |
} | |
return ready | |
func (e *Epoch) isBlockReadyToBeScheduled(seq uint64, prev Digest) bool { | |
if seq > 0 { | |
// A block can be scheduled if its predecessor either exists in storage, | |
// or there exists a round object for it. | |
// Since we only create a round object after we verify the block, | |
// it means we have verified this block in the past. | |
_, ok := e.locateBlock(seq-1, prev[:]) | |
return ok | |
} | |
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed but keep in mind that your suggestion isn't semantically the same as the original code, because if seq == 0 then we should return true, not false.
if as.close { | ||
return | ||
} | ||
as.signal.Wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im concerned we could deadlock here.
- Say we enter
maybeExecuteTask
and execute a task that callsSchedule(ready = true)
- Schedule will call
Signal
onas.signal
- We will then finish executing the original task and this line
as.signal.Wait()
would have missed the priorSignal
call, therefore waiting when it should be executing another task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we execute a task that is ready to run, it will add itself to the ready queue.
Then, it will perform another iteration of the loop in maybeExecuteTask
which will not exit, as the ready queue is still not empty, and will execute the next task.
Signed-off-by: Yacov Manevich <[email protected]>
Signed-off-by: Yacov Manevich <[email protected]>
Signed-off-by: Yacov Manevich <[email protected]>
Signed-off-by: Yacov Manevich <[email protected]>
Signed-off-by: Yacov Manevich <[email protected]>
Signed-off-by: Yacov Manevich <[email protected]>
When it is a node's turn to propose a block, it calls the BlockBuilder's BuildBlock method.
However, this method may take a long time to complete, and during this time, the Epoch instance
cannot process messages (since proposing a new block may be called indirectly via HandleMessage).
This commit adds an asynchronous task scheduler which the Epoch uses to register a callback to make itself propose the block after building it asynchronously.