"expr": "100 * abs(1-(sum(stddev by (redpanda_topic) (sum(redpanda_kafka_max_offset{redpanda_namespace=\"kafka\",redpanda_cloud_data_cluster_name=~\"\"}) by (redpanda_topic,redpanda_partition))) / sum(avg by (redpanda_topic) ((sum(redpanda_kafka_max_offset{redpanda_namespace=\"kafka\",redpanda_cloud_data_cluster_name=~\"\"}) by (redpanda_topic,redpanda_partition))))))",
This currently shows the balance of writes to partitions across the cluster. It's confusingly named, since we also talk about the balance of partition replicas (data) and of partition leadership (which brokers own updates to a primary partition and replication to the partition followers).
I get that the intent is to show how evenly data is distributed on writes, but I'm not sure we're being clear about that. We probably need to rename this to something like "Partition Write Distribution" and give it an info block so people understand how to interpret it.
On a cluster with only a handful of topics, it's probably ok to interpret this as topic write evenness. On complex clusters with hundreds or thousands of topics/partitions, I'm not sure how this should really come across, because writes could be fairly skewed toward a subset of topics and the panel would still show a relatively even write load.
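To make the concern concrete, here is a rough Python sketch of what the expression above appears to compute. The function name and the sample offsets are made up for illustration, and it assumes the per-partition `redpanda_kafka_max_offset` values have already been summed by topic and partition. It uses population standard deviation, which is what PromQL's `stddev` computes.

```python
from statistics import mean, pstdev


def write_balance_score(offsets_by_topic):
    """Hypothetical re-implementation of the panel expression.

    offsets_by_topic maps a topic name to a list holding one max offset
    per partition (assumed to be pre-aggregated per topic/partition).
    """
    # sum(stddev by (redpanda_topic) (...)): spread of offsets within each topic
    total_stddev = sum(pstdev(parts) for parts in offsets_by_topic.values())
    # sum(avg by (redpanda_topic) (...)): mean offset within each topic
    total_avg = sum(mean(parts) for parts in offsets_by_topic.values())
    return 100 * abs(1 - total_stddev / total_avg)


# Illustrative data only: almost all writes go to one topic, but each
# topic's partitions are perfectly even among themselves.
skewed_across_topics = {"busy": [1_000_000] * 4, "quiet": [10] * 4}

# Illustrative data only: a single topic with uneven partitions.
uneven_within_topic = {"orders": [100, 300]}

score_skewed = write_balance_score(skewed_across_topics)  # 100.0
score_uneven = write_balance_score(uneven_within_topic)   # 50.0
```

Under these assumptions the first cluster scores a perfect 100 even though nearly all writes land on one topic, so the number only reflects evenness across partitions within each topic, not across topics.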