Prometheus metrics library upgrade #8877

Open
wants to merge 20 commits into
base: master
Changes from 17 commits
2 changes: 1 addition & 1 deletion .circleci/config.yml
Original file line number Diff line number Diff line change
@@ -372,7 +372,7 @@ jobs:
at: ~/project
- run:
name: AcceptanceTests
no_output_timeout: 20m
no_output_timeout: 30m
Member

Do we need this increase or was it part of your testing?

Contributor Author

Good catch. I changed the value while testing and forgot to roll it back.

command: |
CLASSNAMES=$(circleci tests glob "**/src/acceptance-test/java/**/*.java" \
| sed 's@.*/src/acceptance-test/java/@@' \
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,21 @@
## Unreleased Changes

### Breaking Changes
- The upgrade of the Prometheus Java Metrics library introduces the following changes:
  - Gauge names are not allowed to end with `total`, so metrics such as `beacon_proposers_data_total` and `beacon_eth1_current_period_votes_total` drop the `_total` suffix.
  - The `_created` timestamps are not returned by default.
  - Some JVM metrics have been renamed to adhere to the OTEL standard (see the table below). The [Teku Detailed Dashboard](https://grafana.com/grafana/dashboards/16737-teku-detailed/) has been updated to support both names.

| Old Name | New Name |
|---------------------------------|---------------------------------|
| jvm_memory_bytes_committed | jvm_memory_committed_bytes |
| jvm_memory_bytes_init | jvm_memory_init_bytes |
| jvm_memory_bytes_max | jvm_memory_max_bytes |
| jvm_memory_bytes_used | jvm_memory_used_bytes |
| jvm_memory_pool_bytes_committed | jvm_memory_pool_committed_bytes |
| jvm_memory_pool_bytes_init | jvm_memory_pool_init_bytes |
| jvm_memory_pool_bytes_max | jvm_memory_pool_max_bytes |
| jvm_memory_pool_bytes_used | jvm_memory_pool_used_bytes |

### Additions and Improvements
- Optimized blobs validation pipeline
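
The renames listed under Breaking Changes can be captured in a small lookup, for example when migrating alert rules or dashboard queries. This is only a sketch: the class `MetricRenames` and its method are hypothetical and not part of Teku; the name pairs are taken from the CHANGELOG entry above.

```java
import java.util.Map;

// Hypothetical helper, not part of Teku: maps metric names that changed with the
// Prometheus library upgrade to their new names, as listed in the CHANGELOG entry.
final class MetricRenames {
  static final Map<String, String> OLD_TO_NEW =
      Map.of(
          // Gauges that drop the "_total" suffix
          "beacon_proposers_data_total", "beacon_proposers_data",
          "beacon_eth1_current_period_votes_total", "beacon_eth1_current_period_votes",
          // JVM metrics renamed to follow the OTEL naming
          "jvm_memory_bytes_committed", "jvm_memory_committed_bytes",
          "jvm_memory_bytes_init", "jvm_memory_init_bytes",
          "jvm_memory_bytes_max", "jvm_memory_max_bytes",
          "jvm_memory_bytes_used", "jvm_memory_used_bytes",
          "jvm_memory_pool_bytes_committed", "jvm_memory_pool_committed_bytes",
          "jvm_memory_pool_bytes_init", "jvm_memory_pool_init_bytes",
          "jvm_memory_pool_bytes_max", "jvm_memory_pool_max_bytes",
          "jvm_memory_pool_bytes_used", "jvm_memory_pool_used_bytes");

  private MetricRenames() {}

  // Returns the new name for a renamed metric, or the input unchanged otherwise.
  static String currentName(final String metricName) {
    return OLD_TO_NEW.getOrDefault(metricName, metricName);
  }
}
```

Names that are not in the map pass through unchanged, so the helper can be applied to every metric name in a query without special-casing.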
@@ -97,7 +97,7 @@ public static MetricValue parse(final String line) {

final Map<String, String> metricLabels = new HashMap<>();
final String labelsString = line.substring(line.indexOf("{") + 1, line.indexOf("}"));
final Pattern p = Pattern.compile("(.*?)=\"(.*?)\",");
final Pattern p = Pattern.compile("(.*?)=\"(.*?)\",?");
final Matcher m = p.matcher(labelsString);
while (m.find()) {
metricLabels.put(m.group(1), m.group(2));
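
The hunk above makes the trailing comma in the label pattern optional. A plausible reason, which the diff itself does not state, is that the upgraded library no longer emits a comma after the last label, so the old pattern would skip it. The sketch below compares both patterns on the same input; the class name `LabelParsingSketch` and the sample label strings are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: compares the old and new label patterns from the hunk above.
final class LabelParsingSketch {
  // Old pattern: every label must be followed by a comma.
  static final Pattern OLD_PATTERN = Pattern.compile("(.*?)=\"(.*?)\",");
  // New pattern: the comma after a label is optional.
  static final Pattern NEW_PATTERN = Pattern.compile("(.*?)=\"(.*?)\",?");

  private LabelParsingSketch() {}

  // Collects name/value pairs from the text between '{' and '}' of a metric line.
  static Map<String, String> parseLabels(final Pattern pattern, final String labelsString) {
    final Map<String, String> labels = new LinkedHashMap<>();
    final Matcher m = pattern.matcher(labelsString);
    while (m.find()) {
      labels.put(m.group(1), m.group(2));
    }
    return labels;
  }
}
```

With the input `slot="1",type="head"` the old pattern only finds `slot`, because `type="head"` is not followed by a comma, while the new pattern finds both labels. Both patterns behave the same when the input still ends with a comma.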
@@ -35,7 +35,7 @@
public class Eth1DataCache {
static final String CACHE_SIZE_METRIC_NAME = "eth1_block_cache_size";
static final String VOTES_MAX_METRIC_NAME = "eth1_current_period_votes_max";
static final String VOTES_TOTAL_METRIC_NAME = "eth1_current_period_votes_total";
static final String VOTES_TOTAL_METRIC_NAME = "eth1_current_period_votes";
static final String VOTES_UNKNOWN_METRIC_NAME = "eth1_current_period_votes_unknown";
static final String VOTES_CURRENT_METRIC_NAME = "eth1_current_period_votes_current";
static final String VOTES_BEST_METRIC_NAME = "eth1_current_period_votes_best";
@@ -99,7 +99,7 @@ public long getGossipBytesTotalReceived() {
}

private void storeObservationIfNeeded(final Observation observation) {
MetricCategory category = observation.getCategory();
MetricCategory category = observation.category();
if (category.equals(PROCESS)) {
readProcessCategoryItem(observation);
} else if (category.equals(VALIDATOR)) {
@@ -113,37 +113,37 @@ private void storeObservationIfNeeded(final Observation observation) {

private void readBeaconCategoryItem(final Observation observation) {
isBeaconNodePresent = true;
switch (observation.getMetricName()) {
switch (observation.metricName()) {
case "head_slot":
headSlot = getLongValue(observation.getValue());
headSlot = getLongValue(observation.value());
break;
case "eth1_request_queue_size":
isEth1Connected = true;
break;
case "peer_count":
peerCount = getIntValue(observation.getValue());
peerCount = getIntValue(observation.value());
break;
case "node_syncing_active":
isEth2Synced = getIntValue(observation.getValue()) == 0;
isEth2Synced = getIntValue(observation.value()) == 0;
break;
}
}

private void readProcessCategoryItem(final Observation observation) {
if ("cpu_seconds_total".equals(observation.getMetricName())) {
cpuSecondsTotal = getLongValue(observation.getValue());
if ("cpu_seconds_total".equals(observation.metricName())) {
cpuSecondsTotal = getLongValue(observation.value());
}
}

private void readJvmCategoryItem(final Observation observation) {
if ("memory_pool_bytes_used".equals(observation.getMetricName())) {
addToMemoryPoolBytesUsed((Double) observation.getValue());
if ("memory_pool_bytes_used".equals(observation.metricName())) {
addToMemoryPoolBytesUsed((Double) observation.value());
}
}

private void readValidatorCategoryItem(final Observation observation) {
if ("local_validator_counts".equals(observation.getMetricName())) {
addToLocalValidators(observation.getLabels(), (Double) observation.getValue());
if ("local_validator_counts".equals(observation.metricName())) {
addToLocalValidators(observation.labels(), (Double) observation.value());
}
}
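
The switch from `getMetricName()`, `getValue()` and `getLabels()` to `metricName()`, `value()` and `labels()` is consistent with `Observation` having become a Java record, although the diff does not show the type itself, so treat that as an assumption. The sketch below uses a stand-in record, `ObservationSketch`, with a `String` category instead of `MetricCategory`; it requires Java 16 or later.

```java
import java.util.List;

// Stand-in for illustration only; it is NOT the Besu Observation type.
// A record generates accessors named after its components: category(), metricName(), ...
record ObservationSketch(String category, String metricName, Object value, List<String> labels) {}

final class RecordAccessorDemo {
  private RecordAccessorDemo() {}

  // Reads a numeric observation value using the record-style accessor.
  static long longValue(final ObservationSketch observation) {
    return ((Number) observation.value()).longValue();
  }
}
```

Call sites change only in the accessor names, which matches the mechanical one-for-one replacements in the hunk above.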

@@ -26,8 +26,8 @@
import java.util.stream.Stream;
import org.hyperledger.besu.plugin.services.MetricsSystem;
import org.hyperledger.besu.plugin.services.metrics.Counter;
import org.hyperledger.besu.plugin.services.metrics.LabelledGauge;
import org.hyperledger.besu.plugin.services.metrics.LabelledMetric;
import org.hyperledger.besu.plugin.services.metrics.LabelledSuppliedMetric;
import tech.pegasys.teku.infrastructure.async.AsyncRunner;
import tech.pegasys.teku.infrastructure.async.SafeFuture;
import tech.pegasys.teku.infrastructure.collections.LimitedMap;
@@ -88,8 +88,8 @@ public static <K, V> CachingTaskQueue<K, V> create(
}

public void startMetrics() {
final LabelledGauge taskQueueMetrics =
metricsSystem.createLabelledGauge(
final LabelledSuppliedMetric taskQueueMetrics =
metricsSystem.createLabelledSuppliedGauge(
TekuMetricCategory.STORAGE,
metricsPrefix + "_tasks",
"Labelled task queue metrics",
@@ -272,33 +272,29 @@ private void assertCacheSizeMetric(final int expectedSize) {

private void assertCacheHitCount(final int expectedCount) {
final double value =
metricsSystem
.getCounter(TekuMetricCategory.STORAGE, METRICS_PREFIX + "_tasks_total")
.getValue("cached");
metricsSystem.getCounterValue(
TekuMetricCategory.STORAGE, METRICS_PREFIX + "_tasks_total", "cached");
assertThat(value).isEqualTo(expectedCount);
}

private void assertNewTaskCount(final int expectedCount) {
final double value =
metricsSystem
.getCounter(TekuMetricCategory.STORAGE, METRICS_PREFIX + "_tasks_total")
.getValue("new");
metricsSystem.getCounterValue(
TekuMetricCategory.STORAGE, METRICS_PREFIX + "_tasks_total", "new");
assertThat(value).isEqualTo(expectedCount);
}

private void assertDuplicateTaskCount(final int expectedCount) {
final double value =
metricsSystem
.getCounter(TekuMetricCategory.STORAGE, METRICS_PREFIX + "_tasks_total")
.getValue("duplicate");
metricsSystem.getCounterValue(
TekuMetricCategory.STORAGE, METRICS_PREFIX + "_tasks_total", "duplicate");
assertThat(value).isEqualTo(expectedCount);
}

private void assertRebasedTaskCount(final int expectedCount) {
final double value =
metricsSystem
.getCounter(TekuMetricCategory.STORAGE, METRICS_PREFIX + "_tasks_total")
.getValue("rebase");
metricsSystem.getCounterValue(
TekuMetricCategory.STORAGE, METRICS_PREFIX + "_tasks_total", "rebase");
assertThat(value).isEqualTo(expectedCount);
}

@@ -567,9 +567,11 @@ private void verifyEngineCalled(

private void verifySourceCounter(final Source source, final FallbackReason reason) {
final long actualCount =
stubMetricsSystem
.getCounter(TekuMetricCategory.BEACON, "execution_payload_source_total")
.getValue(source.toString(), reason.toString());
stubMetricsSystem.getCounterValue(
TekuMetricCategory.BEACON,
"execution_payload_source_total",
source.toString(),
reason.toString());
assertThat(actualCount).isOne();
}
}
@@ -887,9 +887,11 @@ private void verifyEngineCalled(

private void verifySourceCounter(final Source source, final FallbackReason reason) {
final long actualCount =
stubMetricsSystem
.getCounter(TekuMetricCategory.BEACON, "execution_payload_source_total")
.getValue(source.toString(), reason.toString());
stubMetricsSystem.getCounterValue(
TekuMetricCategory.BEACON,
"execution_payload_source_total",
source.toString(),
reason.toString());
assertThat(actualCount).isOne();
}
}
@@ -20,7 +20,7 @@
import org.hyperledger.besu.ethereum.chain.GenesisState;
import org.hyperledger.besu.ethereum.core.Block;
import org.hyperledger.besu.ethereum.core.BlockHeader;
import org.hyperledger.besu.ethereum.core.MiningParameters;
import org.hyperledger.besu.ethereum.core.MiningConfiguration;
import org.hyperledger.besu.ethereum.mainnet.MainnetProtocolSchedule;
import org.hyperledger.besu.ethereum.mainnet.ProtocolSchedule;
import org.hyperledger.besu.metrics.noop.NoOpMetricsSystem;
@@ -40,7 +40,7 @@ public static ExecutionPayloadHeader createPayloadForBesuGenesis(
final ProtocolSchedule protocolSchedule =
MainnetProtocolSchedule.fromConfig(
genesisConfigOptions,
MiningParameters.MINING_DISABLED,
MiningConfiguration.MINING_DISABLED,
badBlockManager,
false,
new NoOpMetricsSystem());
@@ -23,7 +23,7 @@
import org.apache.logging.log4j.Logger;
import org.apache.tuweni.bytes.Bytes32;
import org.hyperledger.besu.plugin.services.MetricsSystem;
import org.hyperledger.besu.plugin.services.metrics.LabelledGauge;
import org.hyperledger.besu.plugin.services.metrics.LabelledSuppliedMetric;
import tech.pegasys.teku.ethereum.events.SlotEventsChannel;
import tech.pegasys.teku.ethereum.execution.types.Eth1Address;
import tech.pegasys.teku.infrastructure.async.SafeFuture;
@@ -66,10 +66,10 @@ public ProposersDataManager(
final RecentChainData recentChainData,
final Optional<Eth1Address> proposerDefaultFeeRecipient,
final boolean forkChoiceUpdatedAlwaysSendPayloadAttribute) {
final LabelledGauge labelledGauge =
metricsSystem.createLabelledGauge(
final LabelledSuppliedMetric labelledGauge =
metricsSystem.createLabelledSuppliedGauge(
TekuMetricCategory.BEACON,
"proposers_data_total",
"proposers_data",
"Total number proposers/validators under management",
"type");

@@ -27,6 +27,7 @@
import org.apache.tuweni.bytes.Bytes;
import org.hyperledger.besu.plugin.services.MetricsSystem;
import org.hyperledger.besu.plugin.services.metrics.Counter;
import org.hyperledger.besu.plugin.services.metrics.Histogram;
import tech.pegasys.teku.bls.BLS;
import tech.pegasys.teku.bls.BLSPublicKey;
import tech.pegasys.teku.bls.BLSSignature;
@@ -53,7 +54,7 @@ public class AggregatingSignatureVerificationService extends SignatureVerificati
private final AsyncRunner asyncRunner;
private final Counter batchCounter;
private final Counter taskCounter;
private final MetricsHistogram batchSizeHistogram;
private final Histogram batchSizeHistogram;

@VisibleForTesting
AggregatingSignatureVerificationService(
@@ -89,13 +90,11 @@ public class AggregatingSignatureVerificationService extends SignatureVerificati
"signature_verifications_task_count_total",
"Reports the number of individual verification tasks processed");
batchSizeHistogram =
MetricsHistogram.create(
metricsSystem.createHistogram(
TekuMetricCategory.EXECUTOR,
metricsSystem,
"signature_verifications_batch_size",
"Histogram of signature verification batch sizes",
3,
List.of());
MetricsHistogram.getDefaultBuckets());
}

public AggregatingSignatureVerificationService(
@@ -188,7 +187,7 @@ private List<SignatureTask> waitForBatch() {
void batchVerifySignatures(final List<SignatureTask> tasks) {
batchCounter.inc();
taskCounter.inc(tasks.size());
batchSizeHistogram.recordValue(tasks.size());
batchSizeHistogram.observe(tasks.size());
final List<List<BLSPublicKey>> allKeys = new ArrayList<>();
final List<Bytes> allMessages = new ArrayList<>();
final List<BLSSignature> allSignatures = new ArrayList<>();
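
The hunks above replace Teku's `MetricsHistogram` with a bucket-based histogram whose values are recorded through `observe(...)`. As a rough sketch of how a bucketed histogram records values, the class below counts each observation in the first bucket whose upper bound is at least the value and reports cumulative counts, as Prometheus `le` buckets do. The class `BucketHistogramSketch` and its bucket bounds are hypothetical; this is not the Besu or Prometheus implementation, and the values returned by `MetricsHistogram.getDefaultBuckets()` are not shown in the diff.

```java
// Hypothetical, simplified histogram; not the Besu or Prometheus implementation.
final class BucketHistogramSketch {
  private final double[] upperBounds; // bucket upper bounds, sorted ascending
  private final long[] counts; // per-bucket counts, with one extra slot for +Inf
  private long count;
  private double sum;

  BucketHistogramSketch(final double... upperBounds) {
    this.upperBounds = upperBounds.clone();
    this.counts = new long[upperBounds.length + 1];
  }

  // Records one value in the first bucket whose upper bound is >= value.
  void observe(final double value) {
    int i = 0;
    while (i < upperBounds.length && value > upperBounds[i]) {
      i++;
    }
    counts[i]++;
    count++;
    sum += value;
  }

  // Number of observations <= the bound at bucketIndex (the last index is +Inf).
  long cumulativeCount(final int bucketIndex) {
    long total = 0;
    for (int i = 0; i <= bucketIndex; i++) {
      total += counts[i];
    }
    return total;
  }

  long count() {
    return count;
  }

  double sum() {
    return sum;
  }
}
```

For batch sizes, bounds such as 1, 10 and 100 would make it possible to see how many batches fell at or below each size; the bounds used in practice depend on the default buckets chosen by the PR.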
@@ -155,15 +155,15 @@ private void prepareRegistrations() {
private void assertPreparedProposersCount(final int expectedCount) {
final OptionalDouble optionalValue =
metricsSystem
.getLabelledGauge(TekuMetricCategory.BEACON, "proposers_data_total")
.getLabelledGauge(TekuMetricCategory.BEACON, "proposers_data")
.getValue("prepared_proposers");
assertThat(optionalValue).hasValue(expectedCount);
}

private void assertRegisteredValidatorsCount(final int expectedCount) {
final OptionalDouble optionalValue =
metricsSystem
.getLabelledGauge(TekuMetricCategory.BEACON, "proposers_data_total")
.getLabelledGauge(TekuMetricCategory.BEACON, "proposers_data")
.getValue("registered_validators");
assertThat(optionalValue).hasValue(expectedCount);
}
@@ -42,6 +42,7 @@
import org.hyperledger.besu.metrics.ObservableMetricsSystem;
import org.hyperledger.besu.metrics.Observation;
import org.hyperledger.besu.metrics.prometheus.PrometheusMetricsSystem;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import tech.pegasys.infrastructure.logging.LogCaptor;
@@ -79,6 +80,7 @@ public class BlockBlobSidecarsTrackersPoolImplTest {
private final StubAsyncRunner asyncRunner = new StubAsyncRunner();
private final RecentChainData recentChainData = mock(RecentChainData.class);
private final ExecutionLayerChannel executionLayer = mock(ExecutionLayerChannel.class);
BlockBlobSidecarsTrackersPoolImpl blockBlobSidecarsTrackersPool;

@SuppressWarnings("unchecked")
private final Function<BlobSidecar, SafeFuture<Void>> blobSidecarPublisher = mock(Function.class);
@@ -88,21 +90,6 @@ public class BlockBlobSidecarsTrackersPoolImplTest {

private final BlockImportChannel blockImportChannel = mock(BlockImportChannel.class);
private final int maxItems = 15;
private final BlockBlobSidecarsTrackersPoolImpl blockBlobSidecarsTrackersPool =
new PoolFactory(metricsSystem)
.createPoolForBlockBlobSidecarsTrackers(
blockImportChannel,
spec,
timeProvider,
asyncRunner,
recentChainData,
executionLayer,
() -> blobSidecarGossipValidator,
blobSidecarPublisher,
historicalTolerance,
futureTolerance,
maxItems,
this::trackerFactory);

private UInt64 currentSlot = historicalTolerance.times(2);
private final List<Bytes32> requiredBlockRootEvents = new ArrayList<>();
@@ -116,6 +103,21 @@ public class BlockBlobSidecarsTrackersPoolImplTest {

@BeforeEach
public void setup() {
blockBlobSidecarsTrackersPool =
new PoolFactory(metricsSystem)
.createPoolForBlockBlobSidecarsTrackers(
blockImportChannel,
spec,
timeProvider,
asyncRunner,
recentChainData,
executionLayer,
() -> blobSidecarGossipValidator,
blobSidecarPublisher,
historicalTolerance,
futureTolerance,
maxItems,
this::trackerFactory);
// Set up slot
blockBlobSidecarsTrackersPool.subscribeRequiredBlockRoot(requiredBlockRootEvents::add);
blockBlobSidecarsTrackersPool.subscribeRequiredBlockRootDropped(
@@ -130,6 +132,11 @@ public void setup() {
setSlot(currentSlot);
}

@AfterEach
public void tearDown() {
metricsSystem.shutdown();
}

private void setSlot(final long slot) {
setSlot(UInt64.valueOf(slot));
}
@@ -1298,7 +1305,7 @@ private static BlobIdentifier blobIdentifierFromBlobSidecar(final BlobSidecar bl

private void assertStats(final String type, final String subType, final double count) {
assertThat(
getMetricsValues("block_blobs_trackers_pool_stats_total").get(List.of(type, subType)))
getMetricsValues("block_blobs_trackers_pool_stats_total").get(List.of(subType, type)))
.isEqualTo(count);
}

@@ -1321,8 +1328,8 @@ private void assertBlobSidecarsTrackersCount(final int count) {
private Map<List<String>, Object> getMetricsValues(final String metricName) {
return metricsSystem
.streamObservations(TekuMetricCategory.BEACON)
.filter(ob -> ob.getMetricName().equals(metricName))
.collect(Collectors.toMap(Observation::getLabels, Observation::getValue));
.filter(ob -> ob.metricName().equals(metricName))
.collect(Collectors.toMap(Observation::labels, Observation::value));
}

private BlockBlobSidecarsTracker trackerFactory(final SlotAndBlockRoot slotAndBlockRoot) {