Actually make dask.distributed optional #22 #27

Merged 9 commits on Oct 30, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Changed

- fix bug by making dependency `distributed` optional ![\#27](https://github.com/mllam/mllam-data-prep/pull/27)
- change config example to call validation split `val` instead of `validation` [\#28](https://github.com/mllam/mllam-data-prep/pull/28)
- fix typo in install dependency `distributed` ![\#20](https://github.com/mllam/mllam-data-prep/pull/20)
- add missing `psutil` requirement. [\#21](https://github.com/mllam/mllam-data-prep/pull/21).
19 changes: 16 additions & 3 deletions mllam_data_prep/__main__.py
@@ -1,13 +1,19 @@
 import os
 from pathlib import Path

-import psutil
-from dask.diagnostics import ProgressBar
-from dask.distributed import LocalCluster
 from loguru import logger

 from .create_dataset import create_dataset_zarr

+# Attempt to import psutil and dask.distributed modules
+DASK_DISTRIBUTED_AVAILABLE = True
+try:
+    import psutil
+    from dask.diagnostics import ProgressBar
+    from dask.distributed import LocalCluster
+except ImportError or ModuleNotFoundError:
+    DASK_DISTRIBUTED_AVAILABLE = False
+
 if __name__ == "__main__":
     import argparse
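
A note on the `except` clause in the hunk above: `ImportError or ModuleNotFoundError` is an ordinary boolean expression, so Python evaluates it to `ImportError` before the clause is matched. The guard still works because `ModuleNotFoundError` is a subclass of `ImportError`, but naming both is redundant. Below is a minimal sketch of the same availability check with a plain `except ImportError`. The helper `optional_import_available` and the module name `hypothetical_missing_module_xyz` are illustrative assumptions and are not part of mllam-data-prep.

```python
import importlib


def optional_import_available(*module_names: str) -> bool:
    """Return True only if every named module can be imported."""
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:  # also catches ModuleNotFoundError, its subclass
            return False
    return True


# `A or B` on two classes yields the first truthy operand, i.e. `A`.
assert (ImportError or ModuleNotFoundError) is ImportError
assert issubclass(ModuleNotFoundError, ImportError)

# A stdlib module is always importable; the second name is made up.
HAS_JSON = optional_import_available("json")
HAS_MISSING = optional_import_available("hypothetical_missing_module_xyz")
```

If both exception types ever need to be listed explicitly, the usual spelling is a tuple, `except (ImportError, ModuleNotFoundError):`.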

@@ -36,6 +42,13 @@
ProgressBar().register()

     if args.dask_distributed_local_core_fraction > 0.0:
+        # Only run this block if dask.distributed is available
+        if not DASK_DISTRIBUTED_AVAILABLE:
+            raise ModuleNotFoundError(
+                "Currently dask.distributed isn't installed and therefore can't "
+                "be used in mllam-data-prep. Please install the optional dependency "
+                'with `python -m pip install "mllam-data-prep[dask-distributed]"`'
+            )
         # get the number of system cores
         n_system_cores = os.cpu_count()
         # compute the number of cores to use
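
The hunk above fails at the point of use rather than at import time: the missing dependency only becomes an error once the user actually requests a local cluster. A self-contained sketch of that pattern follows. The function names `require_distributed` and `n_cores_for_fraction` are hypothetical, and because the diff is cut off after the "compute the number of cores to use" comment, the rounding rule here (truncate, but use at least one core) is an assumption rather than the project's actual logic.

```python
import os
from typing import Optional


def require_distributed(available: bool) -> None:
    """Raise a descriptive error if the optional dependency is missing."""
    if not available:
        raise ModuleNotFoundError(
            "dask.distributed isn't installed. Install the optional dependency "
            'with `python -m pip install "mllam-data-prep[dask-distributed]"`'
        )


def n_cores_for_fraction(fraction: float, n_system_cores: Optional[int] = None) -> int:
    """Map a core fraction to a core count (assumed rule: truncate, minimum 1)."""
    if n_system_cores is None:
        # os.cpu_count() may return None when the count cannot be determined
        n_system_cores = os.cpu_count() or 1
    return max(1, int(fraction * n_system_cores))


half_of_eight = n_cores_for_fraction(0.5, n_system_cores=8)
tiny_fraction = n_cores_for_fraction(0.01, n_system_cores=8)
```

Keeping the check in a small function leaves the default code path free of any reference to `dask.distributed`, which is what makes the dependency optional in practice.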