[aarch64, debug] topology_custom/test_tablets2 failed with PytestCollectionWarning: #22182

Open
scylladb-promoter opened this issue Jan 6, 2025 · 4 comments
Labels: area/tablets, symptom/ci stability, tests/flaky

Comments

@scylladb-promoter
Contributor

https://jenkins.scylladb.com/job/scylla-master/job/next/8649/ failed with the following error:


_______________ test_tombstone_gc_disabled_on_pending_replica.1 ________________

manager = <test.pylib.manager_client.ManagerClient object at 0xffff683e0e90>

    @pytest.mark.asyncio
    @skip_mode('release', 'error injections are not supported in release mode')
    async def test_tombstone_gc_disabled_on_pending_replica(manager: ManagerClient):
        logger.info("Bootstrapping cluster")
        servers = [await manager.server_add()]
    
        await manager.api.disable_tablet_balancing(servers[0].ip_addr)
    
        cql = manager.get_cql()
        await cql.run_async("CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 1} AND tablets = {'initial': 4};")
        await cql.run_async("CREATE TABLE test.test (pk int PRIMARY KEY, c int) WITH gc_grace_seconds = 0;")
    
        servers.append(await manager.server_add())
    
        key = 7 # Whatever
        tablet_token = 0 # Doesn't matter since there is one tablet
        await cql.run_async(f"INSERT INTO test.test (pk, c) VALUES ({key}, 1) USING timestamp 9")
        rows = await cql.run_async("SELECT pk from test.test")
        assert len(rows) == 1
    
        replica = await get_tablet_replica(manager, servers[0], 'test', 'test', tablet_token)
    
        s0_host_id = await manager.get_host_id(servers[0].server_id)
        s1_host_id = await manager.get_host_id(servers[1].server_id)
        dst_shard = 0
    
        await manager.api.enable_injection(servers[1].ip_addr, "stream_mutation_fragments", one_shot=True)
        s1_log = await manager.server_open_log(servers[1].server_id)
        s1_mark = await s1_log.mark()
    
        migration_task = asyncio.create_task(
            manager.api.move_tablet(servers[0].ip_addr, "test", "test", replica[0], replica[1], s1_host_id, dst_shard, tablet_token))
    
        await s1_log.wait_for('stream_mutation_fragments: waiting', from_mark=s1_mark)
        s1_mark = await s1_log.mark()
    
        # write a tombstone with timestamp X to DB
        await cql.run_async(f'DELETE FROM test.test USING timestamp 10 WHERE pk = {key}')
    
        # flush both servers
        for s in servers:
            await manager.api.flush_keyspace(s.ip_addr, "test")
    
        await asyncio.sleep(1)
    
        # major compact both servers
        for s in servers:
            await manager.api.keyspace_compaction(s.ip_addr, "test")
    
        # write backdated data to test.test with timestamp X-1 with the same key as the tombstone
        await cql.run_async(f'INSERT INTO test.test (pk, c) VALUES ({key}, 0) USING timestamp 9')
    
        # release streaming
        await manager.api.message_injection(servers[1].ip_addr, "stream_mutation_fragments")
        await s1_log.wait_for('stream_mutation_fragments: done', from_mark=s1_mark)
    
        logger.info("Waiting for migration to finish")
        await migration_task
        logger.info("Migration done")
    
        for s in servers:
            await manager.api.flush_keyspace(s.ip_addr, "test")
    
        # verify result
        rows = await cql.run_async(f'SELECT pk, c FROM test.test WHERE pk = {key};')
>       assert len(rows) == 0
E       assert 1 == 0
E        +  where 1 = len([Row(pk=7, c=0)])

test/topology_custom/test_tablets2.py:1388: AssertionError
------------------------------ Captured log setup ------------------------------
scylladb-promoter added the symptom/ci stability, tests/flaky, and triage/master labels Jan 6, 2025
@mykaul
Contributor

mykaul commented Jan 6, 2025

@bhalevy - please assign someone to look at this (I have not seen this issue before)

@bhalevy
Member

bhalevy commented Jan 6, 2025

@nodep I see you added this test in cdf775d
I wonder if it's valid... can't tombstone_gc happen right after tablet migration completes?

@nodep
Contributor

nodep commented Jan 6, 2025

@bhalevy Yes, you are right: tombstone_gc can happen after the migration, which could explain what happened here. I will look into this to validate the hypothesis.
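
To make the hypothesis concrete, here is a toy last-write-wins model of the test's writes (timestamps 9, 10, then backdated 9). It is only a sketch: `Cell`, `winner`, `read`, `compact`, and `scenario` are hypothetical names, and `compact` only looks at the cells it is given, so it does not model ScyllaDB's real compaction or its purge checks. It assumes that, once the migration completes, a compaction with GC re-enabled could see the tombstone before the backdated write is merged with it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int
    value: Optional[int]  # None marks a tombstone


def winner(cells: List[Cell]) -> Optional[Cell]:
    """Highest timestamp wins; on a tie, a tombstone beats live data."""
    if not cells:
        return None
    return max(cells, key=lambda c: (c.timestamp, c.value is None))


def read(cells: List[Cell]) -> Optional[int]:
    """Return the visible value, or None when the row is deleted or absent."""
    w = winner(cells)
    return None if w is None else w.value


def compact(cells: List[Cell], gc_enabled: bool) -> List[Cell]:
    """Toy major compaction over `cells` only: keep the winner, and drop
    it as well when it is a tombstone and GC is allowed."""
    w = winner(cells)
    if w is None or (w.value is None and gc_enabled):
        return []
    return [w]


def scenario(gc_after_migration: bool) -> Optional[int]:
    flushed = [Cell(9, 1), Cell(10, None)]        # INSERT ts=9, DELETE ts=10
    flushed = compact(flushed, gc_enabled=False)  # during migration: tombstone kept
    unflushed = [Cell(9, 0)]                      # backdated INSERT ts=9
    if gc_after_migration:
        # Assumption: a compaction with GC re-enabled runs before the
        # backdated write is merged with the tombstone.
        flushed = compact(flushed, gc_enabled=True)
    return read(flushed + unflushed)


print(scenario(gc_after_migration=False))  # None: tombstone still shadows c=0
print(scenario(gc_after_migration=True))   # 0: the row is visible again
```

In this model the second case yields the same visible value as the `Row(pk=7, c=0)` in the failed assertion. Whether that ordering is what actually happened in the CI run still needs to be confirmed from the logs.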

mykaul removed the triage/master label Jan 9, 2025