rocksdb-resharding: add a retry/until #107

Merged
merged 1 commit into from
Nov 23, 2023

guits
Collaborator

@guits guits commented Jun 14, 2022

When retrieving the container_image, if the corresponding OSD
isn't running yet (not started or not ready), retry for up to 120 seconds
before letting the playbook fail.

This addresses a failure seen in the CI, such as the following:

TASK [get container image currently used by osd container] *********************
task path: /home/jenkins-build/build/workspace/cephadm-ansible-prs-el8-functional/rocksdb-resharding.yml:66
Monday 13 June 2022  15:20:38 +0000 (0:00:00.015)       0:00:00.683 ***********
ok: [localhost -> ceph-node0] => changed=false
  cmd:
  - cephadm
  - shell
  - ceph
  - orch
  - ps
  - --daemon_type
  - osd
  - --daemon_id
  - '0'
  - --format
  - json
  delta: '0:00:02.669373'
  end: '2022-06-13 15:20:42.149214'
  rc: 0
  start: '2022-06-13 15:20:39.479841'
  stderr: |-
    Inferring fsid 4217f198-b8b7-11eb-941d-5254004b7a69
    Inferring config /var/lib/ceph/4217f198-b8b7-11eb-941d-5254004b7a69/mon.ceph-node0/config
    Using ceph image with id '7d10b4103611' and tag 'master' created on 2022-05-23 21:52:02 +0000 UTC
    quay.ceph.io/ceph-ci/ceph@sha256:0ece388ce186bf2122eb4f3389d6b108fa94aa3541d40d08449286fffc34e29f
  stderr_lines: <omitted>
  stdout: |2-

    [{"daemon_id": "0", "daemon_name": "osd.0", "daemon_type": "osd", "events": ["2022-06-13T15:20:20.821851Z daemon:osd.0 [INFO] \"Deployed osd.0 on host 'ceph-node4'\""], "hostname": "ceph-node4", "is_active": false, "memory_request": 4294967296, "ports": [], "service_name": "osd.osd", "status": 2, "status_desc": "starting"}]
  stdout_lines: <omitted>

The output shows that 'status_desc' is 'starting', so the OSD was not yet running when the task queried it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
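
For readers unfamiliar with the pattern, the retry described above could be expressed with Ansible's `until`/`retries`/`delay` task keywords. The following is only a sketch, not the merged change: the variable names `osd_id`, `admin_node`, and `ceph_orch_ps` are hypothetical, the `retries`/`delay` split is one possible way to reach roughly 120 seconds, and the condition assumes `ceph orch ps` returns a one-element JSON list as in the log above and reports `running` once the daemon is up.

```yaml
- name: get container image currently used by osd container
  # Same command as in the CI log above; osd_id is a hypothetical variable.
  command: "cephadm shell ceph orch ps --daemon_type osd --daemon_id {{ osd_id }} --format json"
  register: ceph_orch_ps            # hypothetical result variable name
  delegate_to: "{{ admin_node }}"   # hypothetical; the log shows ceph-node0
  changed_when: false
  # Re-run the command until the daemon reports 'running'.
  # Assumes stdout is a JSON list with exactly one entry for the OSD.
  until: (ceph_orch_ps.stdout | from_json)[0]['status_desc'] == 'running'
  # 60 attempts with a 2 s pause is roughly 120 s of waiting,
  # not counting the runtime of the command itself.
  retries: 60
  delay: 2
```

With this shape, the task only fails once every attempt has been used up, so an OSD that is still in the `starting` state gets time to come up before the playbook gives up.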

@guits guits force-pushed the sync_orch_ps_rocksdb-resharding_playbook branch from 9139093 to e57a9d2 Compare June 14, 2022 08:08
@asm0deuz asm0deuz force-pushed the sync_orch_ps_rocksdb-resharding_playbook branch 2 times, most recently from 2e2aed6 to 9e0df31 Compare November 22, 2023 20:26
@asm0deuz asm0deuz force-pushed the sync_orch_ps_rocksdb-resharding_playbook branch from 9e0df31 to 5743d5b Compare November 22, 2023 21:02
@asm0deuz asm0deuz added the backport-quincy and backport-reef labels Nov 23, 2023
Collaborator

@asm0deuz asm0deuz left a comment

LGTM

@asm0deuz asm0deuz merged commit b0d56f8 into devel Nov 23, 2023
3 checks passed
@asm0deuz asm0deuz deleted the sync_orch_ps_rocksdb-resharding_playbook branch November 23, 2023 10:11
asm0deuz added a commit that referenced this pull request Nov 23, 2023
rocksdb-resharding: add a retry/until (backport #107)
asm0deuz added a commit that referenced this pull request Nov 23, 2023
rocksdb-resharding: add a retry/until (backport #107)