Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Start LLMQContext early to let VerifyDB() check ChainLock signatures in coinbase #5752

Merged
merged 2 commits into from
Dec 5, 2023

Conversation

UdjinM6
Copy link

@UdjinM6 UdjinM6 commented Dec 4, 2023

Issue being fixed or feature implemented

Now that we have ChainLock sigs in coinbase VerifyDB() have to process them. It works most of the time because usually we simply read contributions from quorum db https://github.com/dashpay/dash/blob/develop/src/llmq/quorums.cpp#L385. However, sometimes these contributions aren't available so we try to re-build them https://github.com/dashpay/dash/blob/develop/src/llmq/quorums.cpp#L388. But by the time we call VerifyDB() bls worker threads aren't started yet, so we keep pushing jobs into worker's queue but it can't do anything and it halts everything.

backtrace:

  * frame #0: 0x00007fdd85a2873d libc.so.6`syscall at syscall.S:38
    frame #1: 0x0000555c41152921 dashd_testnet`std::__atomic_futex_unsigned_base::_M_futex_wait_until(unsigned int*, unsigned int, bool, std::chrono::duration<long, std::ratio<1l, 1l> >, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) + 225
    frame #2: 0x0000555c40e22bd2 dashd_testnet`CBLSWorker::BuildQuorumVerificationVector(Span<std::shared_ptr<std::vector<CBLSPublicKey, std::allocator<CBLSPublicKey> > > >, bool) at atomic_futex.h:102:36
    frame #3: 0x0000555c40d35567 dashd_testnet`llmq::CQuorumManager::BuildQuorumContributions(std::unique_ptr<llmq::CFinalCommitment, std::default_delete<llmq::CFinalCommitment> > const&, std::shared_ptr<llmq::CQuorum> const&) const at quorums.cpp:419:65
    frame #4: 0x0000555c40d3b9d1 dashd_testnet`llmq::CQuorumManager::BuildQuorumFromCommitment(Consensus::LLMQType, gsl::not_null<CBlockIndex const*>) const at quorums.cpp:388:37
    frame #5: 0x0000555c40d3c415 dashd_testnet`llmq::CQuorumManager::GetQuorum(Consensus::LLMQType, gsl::not_null<CBlockIndex const*>) const at quorums.cpp:588:37
    frame #6: 0x0000555c40d406a9 dashd_testnet`llmq::CQuorumManager::ScanQuorums(Consensus::LLMQType, CBlockIndex const*, unsigned long) const at quorums.cpp:545:64
    frame #7: 0x0000555c40937629 dashd_testnet`llmq::CSigningManager::SelectQuorumForSigning(Consensus::LLMQParams const&, llmq::CQuorumManager const&, uint256 const&, int, int) at signing.cpp:1038:90
    frame #8: 0x0000555c40937d34 dashd_testnet`llmq::CSigningManager::VerifyRecoveredSig(Consensus::LLMQType, llmq::CQuorumManager const&, int, uint256 const&, uint256 const&, CBLSSignature const&, int) at signing.cpp:1061:113
    frame #9: 0x0000555c408e2d43 dashd_testnet`llmq::CChainLocksHandler::VerifyChainLock(llmq::CChainLockSig const&) const at chainlocks.cpp:559:53
    frame #10: 0x0000555c40c8b09e dashd_testnet`CheckCbTxBestChainlock(CBlock const&, CBlockIndex const*, llmq::CChainLocksHandler const&, BlockValidationState&) at cbtx.cpp:368:47
    frame #11: 0x0000555c40cf75db dashd_testnet`ProcessSpecialTxsInBlock(CBlock const&, CBlockIndex const*, CMNHFManager&, llmq::CQuorumBlockProcessor&, llmq::CChainLocksHandler const&, Consensus::Params const&, CCoinsViewCache const&, bool, bool, BlockValidationState&, std::optional<MNListUpdates>&) at specialtxman.cpp:202:60
    frame #12: 0x0000555c40c00a47 dashd_testnet`CChainState::ConnectBlock(CBlock const&, BlockValidationState&, CBlockIndex*, CCoinsViewCache&, bool) at validation.cpp:2179:34
    frame #13: 0x0000555c40c0e593 dashd_testnet`CVerifyDB::VerifyDB(CChainState&, CChainParams const&, CCoinsView&, CEvoDB&, int, int) at validation.cpp:4789:41
    frame #14: 0x0000555c40851627 dashd_testnet`AppInitMain(std::variant<std::nullopt_t, std::reference_wrapper<NodeContext>, std::reference_wrapper<WalletContext>, std::reference_wrapper<CTxMemPool>, std::reference_wrapper<ChainstateManager>, std::reference_wrapper<CBlockPolicyEstimator>, std::reference_wrapper<LLMQContext> > const&, NodeContext&, interfaces::BlockAndHeaderTipInfo*) at init.cpp:2098:50
    frame #15: 0x0000555c4082fe11 dashd_testnet`AppInit(int, char**) at bitcoind.cpp:145:54
    frame #16: 0x0000555c40823c64 dashd_testnet`main at bitcoind.cpp:173:20
    frame #17: 0x00007fdd85934083 libc.so.6`__libc_start_main(main=(dashd_testnet`main at bitcoind.cpp:160:1), argc=3, argv=0x00007ffcb8ca5b88, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffcb8ca5b78) at libc-start.c:308:16
    frame #18: 0x0000555c4082f27e dashd_testnet`_start + 46

Fixes #5741

What was done?

Start LLMQContext early. Alternative solution could be moving bls worker Start/Stop into llmq context ctor/dtor.

How Has This Been Tested?

I had a node with that issue. This patch fixed it.

Breaking Changes

Not sure, hopefully none.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

PastaPastaPasta
PastaPastaPasta previously approved these changes Dec 4, 2023
Copy link
Member

@PastaPastaPasta PastaPastaPasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK for squash merge

ogabrielides
ogabrielides previously approved these changes Dec 4, 2023
Copy link
Collaborator

@ogabrielides ogabrielides left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@UdjinM6 UdjinM6 dismissed stale reviews from ogabrielides and PastaPastaPasta via d8d1be9 December 4, 2023 17:04
@UdjinM6 UdjinM6 modified the milestones: 20.1, 20.0.2 Dec 4, 2023
@UdjinM6
Copy link
Author

UdjinM6 commented Dec 4, 2023

CI looks good, pls re-review

Copy link
Collaborator

@ogabrielides ogabrielides left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@knst knst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@PastaPastaPasta PastaPastaPasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK for squash merge

@PastaPastaPasta PastaPastaPasta merged commit e1630d8 into dashpay:develop Dec 5, 2023
10 checks passed
ogabrielides pushed a commit to ogabrielides/dash that referenced this pull request Dec 5, 2023
…atures in coinbase (dashpay#5752)

## Issue being fixed or feature implemented
Now that we have ChainLock sigs in coinbase `VerifyDB()` have to process
them. It works most of the time because usually we simply read
contributions from quorum db
https://github.com/dashpay/dash/blob/develop/src/llmq/quorums.cpp#L385.
However, sometimes these contributions aren't available so we try to
re-build them
https://github.com/dashpay/dash/blob/develop/src/llmq/quorums.cpp#L388.
But by the time we call `VerifyDB()` bls worker threads aren't started
yet, so we keep pushing jobs into worker's queue but it can't do
anything and it halts everything.

backtrace:
```
  * frame #0: 0x00007fdd85a2873d libc.so.6`syscall at syscall.S:38
    frame #1: 0x0000555c41152921 dashd_testnet`std::__atomic_futex_unsigned_base::_M_futex_wait_until(unsigned int*, unsigned int, bool, std::chrono::duration<long, std::ratio<1l, 1l> >, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) + 225
    frame #2: 0x0000555c40e22bd2 dashd_testnet`CBLSWorker::BuildQuorumVerificationVector(Span<std::shared_ptr<std::vector<CBLSPublicKey, std::allocator<CBLSPublicKey> > > >, bool) at atomic_futex.h:102:36
    frame #3: 0x0000555c40d35567 dashd_testnet`llmq::CQuorumManager::BuildQuorumContributions(std::unique_ptr<llmq::CFinalCommitment, std::default_delete<llmq::CFinalCommitment> > const&, std::shared_ptr<llmq::CQuorum> const&) const at quorums.cpp:419:65
    frame #4: 0x0000555c40d3b9d1 dashd_testnet`llmq::CQuorumManager::BuildQuorumFromCommitment(Consensus::LLMQType, gsl::not_null<CBlockIndex const*>) const at quorums.cpp:388:37
    frame dashpay#5: 0x0000555c40d3c415 dashd_testnet`llmq::CQuorumManager::GetQuorum(Consensus::LLMQType, gsl::not_null<CBlockIndex const*>) const at quorums.cpp:588:37
    frame dashpay#6: 0x0000555c40d406a9 dashd_testnet`llmq::CQuorumManager::ScanQuorums(Consensus::LLMQType, CBlockIndex const*, unsigned long) const at quorums.cpp:545:64
    frame dashpay#7: 0x0000555c40937629 dashd_testnet`llmq::CSigningManager::SelectQuorumForSigning(Consensus::LLMQParams const&, llmq::CQuorumManager const&, uint256 const&, int, int) at signing.cpp:1038:90
    frame dashpay#8: 0x0000555c40937d34 dashd_testnet`llmq::CSigningManager::VerifyRecoveredSig(Consensus::LLMQType, llmq::CQuorumManager const&, int, uint256 const&, uint256 const&, CBLSSignature const&, int) at signing.cpp:1061:113
    frame dashpay#9: 0x0000555c408e2d43 dashd_testnet`llmq::CChainLocksHandler::VerifyChainLock(llmq::CChainLockSig const&) const at chainlocks.cpp:559:53
    frame dashpay#10: 0x0000555c40c8b09e dashd_testnet`CheckCbTxBestChainlock(CBlock const&, CBlockIndex const*, llmq::CChainLocksHandler const&, BlockValidationState&) at cbtx.cpp:368:47
    frame dashpay#11: 0x0000555c40cf75db dashd_testnet`ProcessSpecialTxsInBlock(CBlock const&, CBlockIndex const*, CMNHFManager&, llmq::CQuorumBlockProcessor&, llmq::CChainLocksHandler const&, Consensus::Params const&, CCoinsViewCache const&, bool, bool, BlockValidationState&, std::optional<MNListUpdates>&) at specialtxman.cpp:202:60
    frame dashpay#12: 0x0000555c40c00a47 dashd_testnet`CChainState::ConnectBlock(CBlock const&, BlockValidationState&, CBlockIndex*, CCoinsViewCache&, bool) at validation.cpp:2179:34
    frame dashpay#13: 0x0000555c40c0e593 dashd_testnet`CVerifyDB::VerifyDB(CChainState&, CChainParams const&, CCoinsView&, CEvoDB&, int, int) at validation.cpp:4789:41
    frame dashpay#14: 0x0000555c40851627 dashd_testnet`AppInitMain(std::variant<std::nullopt_t, std::reference_wrapper<NodeContext>, std::reference_wrapper<WalletContext>, std::reference_wrapper<CTxMemPool>, std::reference_wrapper<ChainstateManager>, std::reference_wrapper<CBlockPolicyEstimator>, std::reference_wrapper<LLMQContext> > const&, NodeContext&, interfaces::BlockAndHeaderTipInfo*) at init.cpp:2098:50
    frame dashpay#15: 0x0000555c4082fe11 dashd_testnet`AppInit(int, char**) at bitcoind.cpp:145:54
    frame dashpay#16: 0x0000555c40823c64 dashd_testnet`main at bitcoind.cpp:173:20
    frame dashpay#17: 0x00007fdd85934083 libc.so.6`__libc_start_main(main=(dashd_testnet`main at bitcoind.cpp:160:1), argc=3, argv=0x00007ffcb8ca5b88, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffcb8ca5b78) at libc-start.c:308:16
    frame dashpay#18: 0x0000555c4082f27e dashd_testnet`_start + 46
```

Fixes dashpay#5741

## What was done?
Start LLMQContext early. Alternative solution could be moving bls worker
Start/Stop into llmq context ctor/dtor.

## How Has This Been Tested?
I had a node with that issue. This patch fixed it.

## Breaking Changes
Not sure, hopefully none.

## Checklist:
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have added or updated relevant unit/integration/functional/e2e
tests
- [ ] I have made corresponding changes to the documentation
- [x] I have assigned this pull request to a milestone _(for repository
code-owners and collaborators only)_
PastaPastaPasta added a commit that referenced this pull request Dec 9, 2024
…erministicMNManager`, `LLMQContext` member ctors, reduce use in `CJContext`

7d26061 refactor: move `CConnman`, `PeerManager` out of `CCoinJoinClientQueueManager` ctor (Kittywhiskers Van Gogh)
953ba96 refactor: move `CConnman` out of `CoinJoinWalletManager` ctor (Kittywhiskers Van Gogh)
ac930a8 refactor: remove unused `CConnman` from `CDeterministicMNManager` ctor (Kittywhiskers Van Gogh)
a14e604 refactor: remove `CConnman`, `PeerManager` from `LLMQContext` ctor (Kittywhiskers Van Gogh)
d9e5cc7 refactor: move `PeerManager` out of `CInstantSendManager` ctor (Kittywhiskers Van Gogh)
82d1aed refactor: move `CConnman`, `PeerManager` out of `CSigSharesManager` ctor (Kittywhiskers Van Gogh)
7498a38 refactor: move `PeerManager` out of `CSigningManager` ctor (Kittywhiskers Van Gogh)
7ebc61e refactor: move `CConnman` out of `CQuorumManager` ctor (Kittywhiskers Van Gogh)
c07b522 refactor: move `CConnman` out of `CDKGSession{,Handler,Manager}` ctor (Kittywhiskers Van Gogh)
01876c7 refactor: move `PeerManager` out of `CDKGSession{,Handler,Manager}` ctor (Kittywhiskers Van Gogh)
cc0e771 refactor: start BLS thread in `LLMQContext` ctor, move `Start` downwards (Kittywhiskers Van Gogh)

Pull request description:

  ## Additional Information

  * Depends on #6425
  * Dependency for #6304
  * In order to reduce the logic used in chainstate initialization (that will be split out of `init.cpp` in [bitcoin#23280](bitcoin#23280)), the spinning up of threads (as done in `LLMQContext::Start()`) needs to be moved down.
    * They were moved up in [dash#5752](#5752) as `CBLSWorker` is a part of `LLMQContext` and `CBLSWorker` is needed during chainstate verification. As suggested in dash#5752, an alternate fix to the one already merged in was to move `CBLSWorker` `Start()`/`Stop()` to the constructor, which is done here.
      * Another alternate fix is that we move it out of `LLMQContext` entirely and let it remain in `NodeContext` though this approach has not been taken.
    * The reason we cannot retain the status quo is because `bitcoin-chainstate` (the binary introduced in [bitcoin#24304](bitcoin#24304) that's built on the code split off in [bitcoin#23280](bitcoin#23280)) aims to be devoid of P2P logic and this is reflected in the source files used to build it ([source](https://github.com/bitcoin/bitcoin/pull/24304/files#diff-4cb884d03ebb901069e4ee5de5d02538c40dd9b39919c615d8eaa9d364bbbd77R794)). (Also, there's no `NodeContext`, [source](https://github.com/bitcoin/bitcoin/pull/24304/files#diff-4cb884d03ebb901069e4ee5de5d02538c40dd9b39919c615d8eaa9d364bbbd77R795-R798))
      * This means need to separate P2P and validation components from Dash-specific logic in order for the split to work as expected. This PR is a step in that direction by moving P2P elements (`CConnman` and `PeerManager`) out of constructors.
  * As it stands, there are two sources for Dash-specific components to have access to P2P components, initialization (e.g. through `LLMQContext::Start()` or `PeerManagerImpl::ProcessMessage()`).

  ## Breaking Changes

  None expected. While changes are present in initialization order, consensus behaviour should remain unchanged.

  ## Checklist:

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)**
  - [x] I have added or updated relevant unit/integration/functional/e2e tests **(note: N/A)**
  - [x] I have made corresponding changes to the documentation **(note: N/A)**
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  knst:
    utACK 7d26061
  UdjinM6:
    utACK 7d26061

Tree-SHA512: 4f12cbda935cad3a186acb31fed513cea489b0d3a55aa80be8e1336d10cbe1579d6d3db862a78a167134650c9e97816732acaf0c85ab759f6555b1b6be99ec02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants