Performance Tweaks, Iterators, and Lazy Evaluations #308
A new function, `dejavu.logic.decoder.find_files_g`, has been created as an iterator replacement for `find_files` in the same file. This allows the updated `Dejavu.fingerprint_directory` method to use `concurrent.futures.ProcessPoolExecutor` together with `concurrent.futures.as_completed`: files are submitted to the executor as they are yielded, so processing starts immediately. Once all of the files have been submitted, their results are iterated over in `as_completed` as they finish.

These two modifications should allow considerable speed improvements over the existing methods. If anyone knows where to find a large number of Creative Commons or otherwise openly licensed audio files to download and test on, I would be glad to post comparison results. I tried finding some today, but every website was either broken, required creating an account, or was so heavily rate-limited as to be nearly useless.
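For reviewers who want to see the shape of the submit-as-yielded pattern described above, here is a minimal sketch. It is not the code from this PR: the body of `find_files_g` is my own guess at a lazy file walker, and `fingerprint_all`, `file_size_worker`, and the `executor_cls` parameter are hypothetical names used only for illustration. The worker just returns the file size as a stand-in for real fingerprinting.

```python
import os
from concurrent.futures import ProcessPoolExecutor, as_completed
from typing import Callable, Iterable, Iterator, List, Tuple


def find_files_g(path: str, extensions: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Lazily yield (file_path, extension) pairs found under `path`.

    Illustrative only; the real function in the PR may differ.
    """
    # Accept both ".mp3" and "mp3", and match case-insensitively.
    wanted = {ext.lstrip(".").lower() for ext in extensions}
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            ext = os.path.splitext(name)[1].lstrip(".").lower()
            if ext in wanted:
                yield os.path.join(dirpath, name), ext


def file_size_worker(file_path: str) -> int:
    """Hypothetical stand-in for the real fingerprinting worker.

    Defined at module level so a process pool can pickle it.
    """
    return os.path.getsize(file_path)


def fingerprint_all(
    path: str,
    extensions: Iterable[str],
    worker: Callable[[str], int],
    max_workers=None,
    executor_cls=ProcessPoolExecutor,  # assumption: swappable for testing
) -> List[Tuple[str, int]]:
    """Submit each file as soon as it is yielded, then collect results
    in completion order rather than submission order."""
    results = []
    with executor_cls(max_workers=max_workers) as pool:
        # The dict comprehension drains the generator one file at a time,
        # so workers can start before the directory walk has finished.
        futures = {
            pool.submit(worker, file_path): file_path
            for file_path, _ext in find_files_g(path, extensions)
        }
        for future in as_completed(futures):
            file_path = futures[future]
            try:
                results.append((file_path, future.result()))
            except Exception as exc:  # keep going if one file fails
                print(f"Failed to process {file_path}: {exc}")
    return results
```

With a process pool, call `fingerprint_all("some_dir", [".mp3", ".wav"], file_size_worker)` from under an `if __name__ == "__main__":` guard, since platforms that spawn worker processes re-import the main module.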
Other minor improvements are:

- Changed `songs` and `songhashes_set` in the `__init__` of `Dejavu` to take advantage of `__init__`'s special dict
- Converted `counts` and `songs_matches` in `Dejavu.align_matches` to generator comprehensions
- Added `song_hash` directly to `self.songhashes_set` instead of creating the variable first. Saves a lookup per iteration
- Deferred `Dejavu.__load_fingerprinted_audio_hashes()` until after all files have been processed
- Changed `channels` in `dejavu.logic.decoder.read` to use list comprehensions
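To illustrate the flavor of two of these smaller changes, here is a rough sketch using plain Python containers. Neither function is taken from the PR: `split_channels`, `collect_song_hashes`, and the `"file_sha1"` key are hypothetical names chosen for the example, and the sample data stands in for decoded audio and database rows.

```python
from typing import Dict, Iterable, List, Sequence, Set


def split_channels(samples: Sequence[int], n_channels: int) -> List[Sequence[int]]:
    """De-interleave samples into one sequence per channel.

    A list comprehension replaces the append-in-a-loop form:
        channels = []
        for ch in range(n_channels):
            channels.append(samples[ch::n_channels])
    """
    return [samples[ch::n_channels] for ch in range(n_channels)]


def collect_song_hashes(
    songs: Iterable[Dict[str, str]],
    key: str = "file_sha1",  # assumption: hypothetical field name
) -> Set[str]:
    """Add each hash straight to the set, with no intermediate variable."""
    hashes: Set[str] = set()
    for song in songs:
        hashes.add(song[key])
    return hashes
```

For example, `split_channels([1, 2, 3, 4, 5, 6], 2)` separates a stereo stream into its left and right sample lists, and `collect_song_hashes` drops duplicate hashes because it builds a set.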