Folder Synchroniser

This is a brief Python (3.10.0) project that aims to synchronise source and destination directories using a client and server that communicate over IP.

This was developed and tested on Windows 11, so I cannot verify whether it will work on Linux.

Usage

To run the program, run main.py and specify the existing source and destination folders as the --src and --dst arguments respectively, such as:

python .\src\main.py --src=test_components/src/ --dst=test_components/dst

To run the behavioural tests:

behave .\test\features\

Methods

Identifying modifications to the source directory

The client component recursively scans the source directory and retrieves the path of every file found. The client then iterates over the list of paths and determines whether each file is:

New - the path hasn't been saved by the client, and so has not been synchronised yet.
Modified - the path has been seen already, and the file's modification timestamp is more recent than the timestamp of the last scan.
Deleted - a previously found file is no longer present in the list of paths found within the directory.
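
The three categories above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `scan` and `classify`, and the idea of keeping the known paths in a set, are assumptions made for the example.

```python
import os


def scan(root):
    """Return {relative_path: modification_time} for every file under root."""
    found = {}
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(folder, name)
            found[os.path.relpath(full_path, root)] = os.path.getmtime(full_path)
    return found


def classify(known_paths, found, last_scan_time):
    """Split a scan result into new, modified and deleted paths.

    known_paths: paths the client has already seen (hypothetical storage).
    found: result of scan() for the current pass.
    last_scan_time: timestamp of the previous scan, in seconds.
    """
    new = sorted(p for p in found if p not in known_paths)
    modified = sorted(
        p for p, mtime in found.items()
        if p in known_paths and mtime > last_scan_time
    )
    deleted = sorted(p for p in known_paths if p not in found)
    return new, modified, deleted
```

A client loop would call `scan` on each pass, feed the result to `classify`, and then replace its known paths with the keys of `found`.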

Communication between the client and server

Once the list of new, modified and deleted files has been obtained, the client reads the data of each new or modified file and encodes it using the ISO 8859-1 scheme. This scheme facilitates the sending and writing of files with different extensions while keeping a relatively compact format, and it worked reliably when tested against popular document, video, and audio file types. The client then wraps the encoded data, together with the file's path relative to the source directory, in a JSON object and sends it to the server over IP using the socket library.
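
As a rough sketch of the encoding step described above: ISO 8859-1 maps every byte value from 0 to 255 to exactly one character, so any file's bytes can be turned into a string and back without loss. The JSON field names `path` and `data`, and the function names, are assumptions for illustration and may not match the project's message format.

```python
import json


def build_message(relative_path, raw_bytes):
    """Wrap file bytes and their relative path in a JSON string."""
    # Decoding as ISO 8859-1 never fails, because every byte is a valid character.
    text = raw_bytes.decode("iso-8859-1")
    return json.dumps({"path": relative_path, "data": text})


def read_message(message):
    """Recover the relative path and the original bytes from a JSON string."""
    payload = json.loads(message)
    # Encoding with the same scheme restores the original bytes exactly.
    return payload["path"], payload["data"].encode("iso-8859-1")
```

The string returned by `build_message` would still need to be encoded to bytes (for example with `.encode("utf-8")`) before being passed to a socket's `sendall`.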

If the file has been deleted, the client simply sends the path to the server. While this works well, the message does not make the intended behaviour explicit, given that it contains nothing but a path. At the cost of some added complexity, this could be improved by creating two separate channels of communication between the client and server: one for writing data and one for deleting the file at a given path. It would then be clear to the server, or to any future software, what it should do with the data.
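
One way the two-channel idea might look on the server side is sketched below, with one handler per channel. This is only an illustration of the proposal: the handler names and the `path` and `data` fields are hypothetical, and how the channels themselves are opened (for example, two sockets on different ports) is left out.

```python
import json
import os


def handle_write(dst_root, message):
    """Write channel: every message carries a path and the file data."""
    payload = json.loads(message)
    target = os.path.join(dst_root, payload["path"])
    # Create any missing parent folders before writing the file.
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    with open(target, "wb") as handle:
        handle.write(payload["data"].encode("iso-8859-1"))


def handle_delete(dst_root, message):
    """Delete channel: every message carries only a path."""
    payload = json.loads(message)
    target = os.path.join(dst_root, payload["path"])
    if os.path.exists(target):
        os.remove(target)
```

Because each handler only ever receives one kind of message, neither has to infer the intent from the shape of the payload.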

Testing

This project features behavioural tests using the behave library. The high-level scenarios and results are shown below, and a further breakdown can be found in /test/features/*.feature.

Results

Detection and saving of new files

Synchronise single tiered directory - Passed ✅
Synchronise multi tiered directory - Passed ✅
Synchronise copied file - Passed ✅

Detection and removal of existing files

Synchronise directories when a file is removed - Passed ✅

The client rejects significantly large files

The client is given a large file, but it is too large to send. - Passed ✅
A rejected large file is later deleted from the client directory. - Passed ✅

Detection and saving of different file types

Synchronise .mp4 files - Passed ✅
Synchronise .avi files - Passed ✅
Synchronise .mov files - Passed ✅
Synchronise .jpg files - Passed ✅
Synchronise .tiff files - Passed ✅
Synchronise .png files - Passed ✅
Synchronise .mp3 files - Passed ✅
Synchronise .wav files - Passed ✅
Synchronise .pdf files - Passed ✅
Synchronise .docx files - Passed ✅
Synchronise .xlsx files - Passed ✅

Evidence

Improvements

Other than the aforementioned communication, another improvement would be to reduce the amount of data sent when a file is modified. Currently, if a file is updated, all of the file's data is resent to the server, which is inefficient. To improve this, the client could first identify the modified regions of the file, and then upload each region along with its starting position within the file. For example, if a text file containing the string Hello were updated to Hello World!, the client would only need to send World! along with an index telling the server to append it to the end of the existing string.
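
A minimal sketch of this idea is shown below. It is not something the project implements, and it uses a deliberately simple technique: it trims the shared prefix and suffix and sends only the single changed region in between, rather than a full diff with several regions. The names `make_patch` and `apply_patch` are hypothetical.

```python
def make_patch(old, new):
    """Describe the change from old to new as (offset, removed_length, inserted)."""
    # Count how many leading characters the two versions share.
    start = 0
    limit = min(len(old), len(new))
    while start < limit and old[start] == new[start]:
        start += 1
    # Count shared trailing characters, without overlapping the shared prefix.
    end = 0
    while end < limit - start and old[-1 - end] == new[-1 - end]:
        end += 1
    return start, len(old) - start - end, new[start:len(new) - end]


def apply_patch(old, patch):
    """Rebuild the new version on the server from the old version and a patch."""
    offset, removed_length, inserted = patch
    return old[:offset] + inserted + old[offset + removed_length:]


patch = make_patch("Hello", "Hello World!")
print(patch)  # (5, 0, ' World!')
print(apply_patch("Hello", patch))  # Hello World!
```

Note that the inserted text includes the leading space, and that the offset 5 tells the server to place it at the end of Hello. The same functions work on bytes objects as well as strings.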

Furthermore, testing could be improved with stub messaging. Currently, the tests in place are behavioural tests for the entire system, so they don't indicate particularly well, at a low level, what has failed. This could be improved by introducing a messaging interface, which would be implemented using the socket library for actual software usage, but could also be implemented by a stub messaging class that simply sends and/or receives data through a list. This would allow a further breakdown of testing: for example, a test could check that when a new file is added, the stub messaging list receives a single entry for the file, and this could be verified standalone without the use of the server. I do something similar at work, hence the observation!
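
The stub approach might look something like the sketch below. All names here (`Messenger`, `StubMessenger`, `report_new_files`) are hypothetical and only illustrate the shape of the idea; the real client would be given a socket-backed class with the same `send` method.

```python
from typing import Protocol


class Messenger(Protocol):
    """Interface that both the socket-backed and stub messengers would satisfy."""

    def send(self, message: str) -> None: ...


class StubMessenger:
    """Test double that records messages in a list instead of using a socket."""

    def __init__(self) -> None:
        self.sent: list[str] = []

    def send(self, message: str) -> None:
        self.sent.append(message)


def report_new_files(new_paths: list[str], messenger: Messenger) -> None:
    """Hypothetical client step: send one message per new file."""
    for path in new_paths:
        messenger.send(path)


# A low-level test can now run without a server.
stub = StubMessenger()
report_new_files(["notes.txt"], stub)
assert stub.sent == ["notes.txt"]
```

Because the client code depends only on the `send` method, swapping the stub for the real messenger requires no change to the logic under test.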

