Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AmazonS3MoveCleanupPolicy doesn't cleanup files during KafkaConnect workers restart #671

Open
arnitolog opened this issue Sep 23, 2024 · 0 comments

Comments

@arnitolog
Copy link

Hello,
I have KafkaConnect with 2 workers and FilePulse connector with 4 tasks. During the rolling restart of the workers, connector stops cleaning up files from S3. At the same time once the restart is finished, new files are processed and cleaned up successfully.
All not-cleaned files have the status "COMMITTED" in the status topic.
here is my connector config:

    aws.s3.region: "us-east-1"
    aws.s3.bucket.name: "test-xxxxx-data-us-east-1"
    aws.s3.bucket.prefix: "emails/"

    topic: "test-xxxxx-data"
    fs.listing.class: io.streamthoughts.kafka.connect.filepulse.fs.AmazonS3FileSystemListing
    fs.listing.interval.ms: 5000

    fs.cleanup.policy.class: io.streamthoughts.kafka.connect.filepulse.fs.clean.AmazonS3MoveCleanupPolicy
    fs.cleanup.policy.move.success.aws.prefix.path: "success"
    fs.cleanup.policy.move.failure.aws.prefix.path: "failure"

    allow.tasks.reconfiguration.after.timeout.ms: 120000
    tasks.reader.class: io.streamthoughts.kafka.connect.filepulse.fs.reader.AmazonS3BytesArrayInputReader
    tasks.file.status.storage.bootstrap.servers: kafka-cluster-kafka-bootstrap:9093
    tasks.file.status.storage.topic: test-xxxxx-data-status-internal
    tasks.file.status.storage.topic.partitions: 10
    tasks.file.status.storage.topic.replication.factor: 3

There are no errors in the logs.
Just a thought, is it possible/makes sense to check file status by AmazonS3MoveCleanupPolicy retrospectively and if it has "COMMITTED" status clean it up?

@arnitolog arnitolog changed the title AmazonS3MoveCleanupPolicy doesn't cleanup after restart AmazonS3MoveCleanupPolicy doesn't cleanup files during KafkaConnect workers restart Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant