Monitoring node runs out of RAM and CPU as the number of tables and the data in them grow #2429

Open
vponomaryov opened this issue Dec 2, 2024 · 11 comments

@vponomaryov

vponomaryov commented Dec 2, 2024

Installation details
Panel Name: any
Dashboard Name: any
Scylla-Monitoring Version: 4.8.0
Scylla-Version: 2024.2.0~rc3-20241004.89f8638e9e9b
Monitor node instance type: m6i.xlarge

Running a test that creates tables in batches of 125, we observe steady growth in memory and CPU utilization on the monitoring node:

Image

Disk utilization grows the same way:
Image

Result of the top command:

Tasks: 134 total,   1 running, 133 sleeping,   0 stopped,   0 zombie
%Cpu(s): 25.6 us,  0.2 sy,  0.0 ni, 74.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  15717.2 total,   1244.2 free,  12641.1 used,   1831.9 buff/cache
MiB Swap:  20480.0 total,  16750.0 free,   3730.0 used.   2393.5 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                                                                                                
   5527 ubuntu    20   0  113.7g  12.1g 570644 S 100.3  79.1   8266:01 prometheus                                                                                                                                                                                             
   9710 scylla    20   0   16.0t  76860  20480 S   1.0   0.5  92:07.13 scylla                                                                                                                                                                                                 
    414 root      20   0 1949744  17860   8192 S   0.3   0.1   4:16.46 containerd                                                                                                                                                                                             
   2977 root      20   0 2134828  32928  14080 S   0.3   0.2   3:13.23 dockerd                                                                                                                                                                                                
   5508 root      20   0 1238716   6408   3456 S   0.3   0.0   1:22.68 containerd-shim                                                                                                                                                                                        
   9718 scylla-+  20   0 1266796  25560  11904 S   0.3   0.2   4:57.53 scylla-manager                                                                                                                                                                                         
  57000 root      20   0 1319948  24704  16768 S   0.3   0.2   0:00.04 snapd                                                                                                                                                                                                  
      1 root      20   0  167584   6480   4048 S   0.0   0.0   0:23.68 systemd                                                                                                                                                                                                
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.06 kthreadd                                                                                                                                                                                               
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp     

DB nodes load:
Image

The DB nodes load screenshot shows the batch pattern:
each tooth corresponds to populating one batch of 125 tables.

Argus: scylla-staging/valerii/vp-scale-5000-tables-test#3
CI job: https://jenkins.scylladb.com/view/staging/job/scylla-staging/job/valerii/job/vp-scale-5000-tables-test/3

@vponomaryov vponomaryov added the bug Something isn't working right label Dec 2, 2024
@vponomaryov
Author

vponomaryov commented Dec 2, 2024

@tzach
Contributor

tzach commented Dec 2, 2024

@fruch
Contributor

fruch commented Dec 2, 2024

@amnonh

I found that we have some table-specific metrics, such as scylla_column_family_memtable_row_hits.
After a quick tour of the Scylla code, I found the flag that enables them:
https://github.com/scylladb/scylladb/blob/acd643bd75468703150b2e23b1bbf05a3e95e42d/db/config.cc#L1012

It is on by default.

Is that on purpose?

@mykaul
Contributor

mykaul commented Dec 2, 2024

@amnonh

> I found that we have some table-specific metrics, such as scylla_column_family_memtable_row_hits. After a quick tour of the Scylla code, I found the flag that enables them: https://github.com/scylladb/scylladb/blob/acd643bd75468703150b2e23b1bbf05a3e95e42d/db/config.cc#L1012
>
> It is on by default.
>
> Is that on purpose?

Yes.

@fruch
Contributor

fruch commented Dec 2, 2024

I found the answer: scylladb/scylladb#13293

Yes, it was deliberate.

And @tzach, you got the benchmark you asked for back then :)
It's bad, and the calculator at https://monitoring.docs.scylladb.com/stable/install/monitoring-stack.html#calculating-prometheus-minimal-memory-space-requirement doesn't help much when you have 5000+ tables.

We have a t3.large for the monitor, which may not be exactly what the calculator suggests, but two years ago it worked OK for this case...

@vponomaryov
Author

> @vponomaryov does the monitoring server match the memory space requirement? https://monitoring.docs.scylladb.com/stable/install/monitoring-stack.html#calculating-prometheus-minimal-memory-space-requirement

> We have a t3.large for the monitor, which may not be exactly what the calculator suggests, but two years ago it worked OK for this case...

The test run used for this bug report used the following instance type for the monitoring node: m6i.xlarge

@mykaul
Contributor

mykaul commented Dec 3, 2024

> > @vponomaryov does the monitoring server match the memory space requirement? https://monitoring.docs.scylladb.com/stable/install/monitoring-stack.html#calculating-prometheus-minimal-memory-space-requirement
>
> > We have a t3.large for the monitor, which may not be exactly what the calculator suggests, but two years ago it worked OK for this case...
>
> The test run used for this bug report used the following instance type for the monitoring node: m6i.xlarge

Please fetch the TSDB status page from the Prometheus UI; it will help us analyze this.
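
For anyone who prefers to collect this without the UI: the data behind the TSDB status page is also served by the Prometheus HTTP API at `/api/v1/status/tsdb`. Below is a minimal sketch in Python, assuming Prometheus is reachable at `http://localhost:9090` (an assumption; substitute the monitoring node's address). The helper names `fetch_tsdb_status`, `top_metrics`, and `main` are illustrative and not part of any Scylla tooling.

```python
import json
import urllib.request


def fetch_tsdb_status(base_url="http://localhost:9090"):
    """Fetch the TSDB status JSON from the Prometheus HTTP API.

    base_url is an assumption; point it at the monitoring node.
    """
    url = f"{base_url}/api/v1/status/tsdb"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def top_metrics(status, limit=10):
    """Return (metric name, series count) pairs, largest first."""
    entries = status["data"].get("seriesCountByMetricName") or []
    ranked = sorted(entries, key=lambda e: e["value"], reverse=True)
    return [(e["name"], e["value"]) for e in ranked[:limit]]


def main():
    """Print the head series count and the heaviest metric names."""
    status = fetch_tsdb_status()
    print("series in head:", status["data"]["headStats"]["numSeries"])
    for name, count in top_metrics(status):
        print(f"{count:>10}  {name}")
```

Calling `main()` on the monitoring node prints the head series count followed by the metric names with the most series, which is the part of the status page most relevant here.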

@vponomaryov
Author

vponomaryov commented Dec 3, 2024

> Please fetch the TSDB status page from the Prometheus UI; it will help us analyze this.

TSDB Status

Head Stats

| Number of Series | Number of Chunks | Number of Label Pairs | Current Min Time | Current Max Time |
| --- | --- | --- | --- | --- |
| 2843270 | 15940315 | 12293 | 2024-12-01T06:00:00.714Z (1733032800714) | 2024-12-01T09:37:40.845Z (1733045860845) |

Head Cardinality Stats

Top 10 label names with value count

| Name | Count |
| --- | --- |
| cf | 10056 |
| `__name__` | 1193 |
| le | 143 |
| type | 115 |
| devices | 83 |
| handler | 51 |
| collector | 46 |
| name | 35 |
| cpu | 32 |
| shard | 30 |

Top 10 series count by metric names

| Name | Count |
| --- | --- |
| scylla_column_family_write_latency_bucket | 1366170 |
| scylla_column_family_read_latency_bucket | 679835 |
| wlatencyaks | 55646 |
| wlatencyp95ks | 55646 |
| wlatencyp99ks | 55646 |
| scylla_column_family_cache_hit_rate | 50280 |
| scylla_column_family_live_sstable | 50280 |
| scylla_column_family_total_disk_space | 50280 |
| scylla_column_family_live_disk_space | 50280 |
| rlatencyp99ks | 27724 |

Top 10 label names with high memory usage

| Name | Bytes |
| --- | --- |
| `__name__` | 106236467 |
| cf | 44598208 |
| cluster | 28081620 |
| instance | 26033983 |
| le | 25698553 |
| dc | 25301132 |
| job | 15793954 |
| ks | 13335682 |
| by | 2367290 |
| class | 339838 |

Top 10 series count by label value pairs

| Name | Count |
| --- | --- |
| dc=eu-west-1 | 2810750 |
| cluster=my-cluster | 2808162 |
| ks=feeds | 2664829 |
| job=scylla | 2556887 |
| `__name__`=scylla_column_family_write_latency_bucket | 1366170 |
| instance=10.4.4.193 | 853470 |
| instance=10.4.6.77 | 853400 |
| instance=10.4.4.64 | 853271 |
| `__name__`=scylla_column_family_read_latency_bucket | 679835 |
| instance=10.4.6.145 | 132761 |
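
To put the numbers above in proportion, here is a small arithmetic sketch in Python. All counts are copied from the TSDB status output in this comment; the function name `share_of_total` is only illustrative. It suggests the two per-table latency histograms alone account for roughly 72% of all head series, and the top 10 metric names together for about 86%.

```python
# Counts copied from the TSDB status output in this comment.
TOTAL_SERIES = 2_843_270

TOP_SERIES_BY_METRIC = {
    "scylla_column_family_write_latency_bucket": 1_366_170,
    "scylla_column_family_read_latency_bucket": 679_835,
    "wlatencyaks": 55_646,
    "wlatencyp95ks": 55_646,
    "wlatencyp99ks": 55_646,
    "scylla_column_family_cache_hit_rate": 50_280,
    "scylla_column_family_live_sstable": 50_280,
    "scylla_column_family_total_disk_space": 50_280,
    "scylla_column_family_live_disk_space": 50_280,
    "rlatencyp99ks": 27_724,
}


def share_of_total(names, counts=TOP_SERIES_BY_METRIC, total=TOTAL_SERIES):
    """Fraction of all head series contributed by the given metric names."""
    return sum(counts[name] for name in names) / total


# The two per-table latency histograms.
histograms = [n for n in TOP_SERIES_BY_METRIC if n.endswith("_latency_bucket")]

print(f"latency histograms: {share_of_total(histograms):.1%}")
print(f"top 10 combined:    {share_of_total(TOP_SERIES_BY_METRIC):.1%}")
```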

@vponomaryov
Author

vponomaryov commented Dec 4, 2024

Setting the Scylla config option enable_node_aggregated_table_metrics: false removed the problem in scylla-staging/valerii/vp-scale-5000-tables-test#4.

Resource usage on the monitoring node:

Image

@amnonh
Collaborator

amnonh commented Dec 7, 2024

There's really nothing we can do. Part of the metric count is proportional to the number of nodes multiplied by the number of tables. I will need to come up with a better metrics prediction formula (though it will always be difficult). When there are many tables, we can either use a bigger monitoring node or disable the per-table metrics.
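
The nodes-times-tables relationship can be sketched as a rough estimate in Python. This is not the monitoring project's prediction formula, only an illustration of the scaling: the function name and both per-node parameters are hypothetical, and the example values are placeholders rather than measurements from this test.

```python
def estimate_series(nodes, tables, series_per_table_per_node,
                    fixed_series_per_node=0):
    """Rough Prometheus series estimate for a cluster.

    The per-table part grows with nodes * tables; the fixed part grows
    with nodes only. Both per-node figures are inputs you must measure
    yourself (for example from the TSDB status page); they are not known
    constants.
    """
    per_table_part = nodes * tables * series_per_table_per_node
    fixed_part = nodes * fixed_series_per_node
    return per_table_part + fixed_part


# Placeholder inputs, chosen only to show how the estimate scales.
small = estimate_series(nodes=3, tables=500,
                        series_per_table_per_node=100,
                        fixed_series_per_node=20_000)
large = estimate_series(nodes=3, tables=5_000,
                        series_per_table_per_node=100,
                        fixed_series_per_node=20_000)
print(small, large)
```

With these placeholder inputs, ten times more tables gives nearly ten times more series, because the per-table part dominates the fixed part once the table count is large.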

@vponomaryov
Author

For this specific test scenario it was decided to disable per-table metrics gathering: scylladb/scylla-cluster-tests#9843
