FollowUp: Move agreement negotiation off-chain, keep only mutual commitment on-chain#207
FollowUp: Move agreement negotiation off-chain, keep only mutual commitment on-chain#207danielbui12 wants to merge 26 commits into
Conversation
Add an upper bound on AgreementTerms.valid_until at both redemption sites: a quote is now redeemable only within [now, now + RequestTimeout]. Previously valid_until was a lower bound only, leaving every quote effectively immortal and replayable against a re-registered provider. - New Error::TermsValidityTooLong rejecting quotes past the cap. - integrity_test asserts RequestTimeout < DeregisterAnnouncementPeriod, making the timing invariant that closes the re-register replay hole a build-time check. - Mock RequestTimeout lowered to 50 (< DeregisterAnnouncementPeriod=100); mock + benchmark term builders now produce in-window valid_until. - Regression tests: too-long validity, re-register replay blocked by expiry, and nonce non-reuse after early termination. No field rename, no ABI/wire change.
Added `client/src/block_subscription.rs`: a `BlockSubscriber` that subscribes to finalized blocks, fetches all events per block once (every pallet), and dispatches them through a registry of type-erased `EventPlugin`s. Each plugin pairs a parser with its own handler; one plugin erroring or panicking is caught and logged — it never aborts the loop or starves other plugins. The existing `EventSubscriber` / `EventParser` trait in `event_subscription.rs` was left untouched; the new module reuses the `EventParser` seam.
Fixed a stale event type in `client/src/event_subscription.rs`: `StorageEvent::AgreementRequested` was removed after agreement negotiation moved off-chain in commit `651726f`. The variant was dead (parser arm could never fire) and the actual replacement events — `StorageAgreementEstablished` and `ReplicaAgreementEstablished` — were falling through to `StorageEvent::Unknown`. The fix adds both replacement variants with correct field shapes matching `pallet/src/lib.rs:626-641`, including a `field_terms_scalars` helper that reads nested `AgreementTerms` scalar fields gracefully.
…r updates Replace the plugin-based block subscriber with a Stream of raw finalized blocks and have provider-node keep its ChainState live from it. - pallet: ProviderSettingsUpdated / ProviderMultiaddrUpdated now carry the updated settings / multiaddr so off-chain consumers can apply changes without a re-fetch. - storage-client: add matching StorageEvent variants + parser arms; replace the plugin-based BlockSubscriber (EventPlugin / ParserPlugin / isolation) with a Stream-based BlockSubscriberStream that yields raw finalized blocks; drop the now-unused plugin re-exports and example. - provider-node: ChainStateCoordinator drives the stream in a single async loop — storing current_block every block and merging this provider's settings / multiaddr into the cached ProviderInfo in place, re-fetching the full info on registration.
Wire a finalized-block ChainStateCoordinator into the provider node so it keeps current_block + provider_info in sync with the runtime, and use current_block + request_timeout to issue bounded negotiate quotes instead of valid_until: u32::MAX. - ChainStateCoordinator/Handle subscribe to finalized blocks, update current_block on every block, and refresh provider_info on ProviderSettingsUpdated/ProviderMultiaddrUpdated events. - request_timeout bootstrapped from the on-chain StorageProvider::RequestTimeout constant via ProviderClient::fetch_request_timeout + substrate constants module. - negotiate_terms computes valid_until = current_block.saturating_add(request_timeout) and returns 503 ChainStateNotReady when either is 0.
…n chain-state readiness - provider-node: seed a fresh NonceCounter when no on-chain replay state exists yet, so newly-registered (or soon-to-be) providers can negotiate - provider-node: return ChainStateNotReady from /negotiate when the nonce counter is absent, and clarify the signing-unavailable error message - papi demo: add getProviderNodeInfo and wait for the provider node to sync registration settings before proceeding; log block numbers; bump demo price
4308e19 to
d955615
Compare
|
@mudigal there is a limitation in provider-node coverage: it is not able to run integration tests related to runtime due to the lack of blockchain startup in the check.yml job. Should we add start-chain to check.yml to increase test coverage, or do you want to keep it as simple as it is? |
Reconcile the merge of origin/dev: adopt dev's reconciler-based provider_info + bootstrappable nonce_counter + readiness reporting, while keeping this branch's replay-window fix (bounded valid_until = current_block + request_timeout, ChainStateNotReady guard). Move RequestTimeout fetching into the background reconciler (request_timeout is now AtomicU32) so it self-heals if the chain was unreachable at boot, and split reconcile_once into focused helpers.
The merge left provider registration info in two places: ProviderState's own provider_info (reconciler-maintained) and chain_state.provider_info (coordinator-maintained). Remove the former and route /negotiate, /info, and the reconciler through chain_state.provider_info as the single source of truth.
…tate Consolidate all on-chain state the node needs at runtime into ChainState, with the chain-state coordinator as its only writer. Fixes the dual-writer race between the polling reconciler and the event coordinator (review #2), the dropped settings/multiaddr events when provider_info was None (#3), and removes duplicated ChainState tests (#5). - ChainState gains `constants` (PalletConstants/RequestTimeout) and `nonce_counter`; coordinator fetches constants once per connect, does an initial full sync for restart recovery, and re-fetches the full ProviderInfo on any relevant provider event so committed_bytes/stake/ settings stay consistent (no field-patching). - Drop ProviderState.request_timeout and .nonce_counter; /negotiate and /info now read from chain_state. - Remove the polling reconciler (spawn_chain_reconciler + helpers) and the now-unused --reconcile-interval-secs CLI flag and its CI usage. - examples/papi: replace the broken object-equality registration poll with a /info readiness-flag poll plus a hard timeout (review #1).
The method was never called anywhere (single definition, zero call sites). The coordinator uses its own parse_pallet_events free function which also filters by pallet name — the stronger, correct implementation. Closes review finding #4.
- pallet: skip providers with deregister_at set in query_find_matching_providers - client: surface deregister_at on ProviderInfo; add DeregisterAnnounced, ProviderDeregistered, and DeregisterCancelled StorageEvent variants + parsing - provider-node: coordinator refreshes provider_info on all three deregister events; /negotiate returns 503 provider_deregistering when deregister_at is set; /info exposes a deregistering readiness flag - tests: integration coverage for deregister lifecycle (announce, complete, cancel) and /info readiness flags
|
@danielbui12 my assistant says:
|
| // more blocks), so an old quote cannot be replayed against the new | ||
| // incarnation. | ||
| assert!( | ||
| T::RequestTimeout::get() < T::DeregisterAnnouncementPeriod::get(), |
There was a problem hiding this comment.
this is just compile-time check, but we allow to change those with pub storage and setStorage root calls (those calls also does not catch this invariant).
We should definitely add fn try_state checks also, to hit the stuff when doing runtime upgrade at least (@danielbui12 please, create follow-up issue for adding try_state checks and cover some important).
| max_bytes: maxBytes, | ||
| duration, | ||
| price_per_byte: 1n, | ||
| price_per_byte: 100n, |
| .ok_or_else(|| ClientError::Chain("Missing 'accepting_extensions'".to_string()))?, | ||
| agreements_total, | ||
| challenges_failed, | ||
| deregister_at: named_field(&value, "deregister_at").and_then(|v| match &v.value { |
There was a problem hiding this comment.
@danielbui12 @ilchu not sure, if this storage item parsing is the right thing, we should use runtime api's or something typed as mentioned here: #125 (comment)
| pub max_capacity: u64, | ||
| } | ||
|
|
||
| /// Mirror StorageProvider::ProviderSettings |
There was a problem hiding this comment.
@danielbui12 where do we need there Default impl? this can easily go out of sync. Can we remove this default impl?
|
@danielbui12 also please check the up-to-date docs, my assistant says:
|
Primary Changes
On-chain validity-window enforcement
valid_untilmust be in range [current block, current block + RequestTimeout]RequestTimeout < DeregisterAnnouncementPeriod— this is the core invariant that closes the re-register replay window (an old quote always expires before a provider can deregister + reregister).ProviderMultiaddrUpdatedevent emitmultiaddr.Provider node: live chain-state sync
chain_state_coordinator.rsintroduces ChainState (atomic current_block + RwLock<provider_info>) holds state that frequently change.RequestTimeoutfrom pallet constant at startup.valid_until = u32::MAXin /negotiate; stamps valid_until = current_block + request_timeoutIssue: