
Backup and Restore detailed

Daria Kharlan edited this page Nov 10, 2022 · 7 revisions

Docker

This series of steps describes how to back up data from one Docker container and restore it on another.

1. Back up the Agent's database into a single dump file

Perform a full dump of the Agent's database (PostgreSQL) into a single dump file. Use either of the two options below.

# 1.1. From a shell inside the Docker container
$ docker exec -it anodot-agent bash
$ agent backup
agent database successfully dumped to /usr/src/app/backup-data/agent_2022-11-09_12:03:38.dump

# 1.2. From the host, without entering the Docker container
$ docker exec anodot-agent agent backup
agent database successfully dumped to /usr/src/app/backup-data/agent_2022-11-09_12:03:38.dump

2. Copy the dump file from the Docker container to the local machine

# Copy the dump file
$ docker cp anodot-agent:/usr/src/app/backup-data/agent_2022-11-09_12:03:38.dump ./agent.dump

# Verify that the file was copied
$ ls -la agent.dump
-rw-r--r-- 1 ubuntu ubuntu 28124 Nov  9 12:03 agent.dump
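
Because the dump file name contains a timestamp, steps 1 and 2 can be scripted by reading the path from the backup output. The following is a minimal sketch and not part of the Agent itself: the function names `extract_dump_path` and `backup_and_copy` are hypothetical, and it assumes the success message has exactly the form shown in step 1.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not Agent commands.

# Reads `agent backup` output on stdin and prints the dump path.
# Assumption: the success line looks like
#   agent database successfully dumped to <path>.dump
extract_dump_path() {
  sed -n 's/^agent database successfully dumped to \(.*\.dump\)[[:space:]]*$/\1/p' | tail -n 1
}

# Backs up the Agent in container $1 and copies the dump to local path $2.
backup_and_copy() {
  local container dest dump_path
  container="$1"
  dest="$2"
  dump_path="$(docker exec "$container" agent backup | extract_dump_path)"
  if [ -z "$dump_path" ]; then
    echo "could not find the dump path in the backup output" >&2
    return 1
  fi
  docker cp "$container:$dump_path" "$dest"
}
```

With these functions loaded, `backup_and_copy anodot-agent ./agent.dump` would replace the manual backup and `docker cp` commands above.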

3. Restore the database and StreamSets on another Docker container

The restore recreates or updates existing pipelines, their offsets, and their statuses using the information in the database. If the StreamSets instance associated with a pipeline is not available, that pipeline is not restored unless --use-available is provided.

# Copy the dump file to the new container
$ docker cp agent.dump anodot-agent-new:/usr/src/app/

# Connect to the container
$ docker exec -it anodot-agent-new bash

# Restore the database
$ agent restore database /usr/src/app/agent.dump
Are you sure you want to restore `agent` database from the dump? All current data in the database will be overwritten [y/N]: y
Database `agent` successfully restored

# Restore StreamSets
# Because the pipelines are moving to other StreamSets instances, --use-available is recommended
# For each restored pipeline, the output shows which StreamSets instance it is assigned to
$ agent restore streamsets --use-available
2022-11-09 12:50:52,983 - INFO - Get pipelines
2022-11-09 12:50:52,989 - INFO - Get pipelines
Creating pipeline `pipeline_1`
2022-11-09 12:50:53,111 - INFO - Create pipeline `pipeline_1` in `http://dc2:18630`
2022-11-09 12:50:53,171 - INFO - Get pipeline `pipeline_1`
2022-11-09 12:50:53,209 - INFO - Update pipeline `pipeline_1` in `http://dc2:18630`
2022-11-09 12:50:53,261 - INFO - Get pipeline status `pipeline_1`
Success
Creating pipeline `pipeline_2`
2022-11-09 12:50:53,323 - INFO - Create pipeline `pipeline_2` in `http://dc2:18630`
2022-11-09 12:50:53,394 - INFO - Get pipeline `pipeline_2`
2022-11-09 12:50:53,433 - INFO - Update pipeline `pipeline_2` in `http://dc2:18630`
2022-11-09 12:50:53,482 - INFO - Get pipeline status `pipeline_2`
Success
StreamSets pipelines successfully restored
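
When many pipelines are restored, it can help to condense the log into one line per pipeline showing the StreamSets instance it was assigned to. The following is a minimal sketch: `list_pipeline_assignments` is a hypothetical helper, and it assumes the `Create pipeline` log lines keep the backtick-quoted format shown above.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name is hypothetical, not an Agent command.

# Reads `agent restore streamsets` output on stdin and prints
# "<pipeline> -> <StreamSets URL>" for every "Create pipeline" log line.
# Splitting on backticks gives: $2 = pipeline name, $4 = StreamSets URL.
list_pipeline_assignments() {
  awk -F'`' '/ - INFO - Create pipeline `/ { print $2 " -> " $4 }'
}
```

For example, saving the restore output to a file (say `restore.log`, a name chosen here for illustration) and running `list_pipeline_assignments < restore.log` would print one assignment per created pipeline.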

4. Verify pipeline statuses

$ agent pipeline list

Check that no pipeline has an error status and that all required pipelines are running.
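
This check can be scripted in a rough way. The sketch below is an assumption-heavy illustration: the output format of `agent pipeline list` is not shown on this page, so the hypothetical helper `no_error_statuses` only assumes that error statuses appear as text containing `ERROR`. Adjust the pattern to whatever the listing actually prints.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name is hypothetical, not an Agent command.

# Succeeds (exit 0) when no line on stdin contains "ERROR".
# Assumption: error statuses show up in the listing as text containing "ERROR".
no_error_statuses() {
  ! grep -q 'ERROR'
}
```

A possible use from the host is `docker exec anodot-agent-new agent pipeline list | no_error_statuses && echo "no error statuses found"`. This does not confirm that the required pipelines are running, so the listing should still be reviewed by eye.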
