Commit

Merge pull request #3 from aidotse/master
Small correction
sancarlim authored Jul 6, 2022
2 parents e853d7f + 0af0779 commit 29ed075
Showing 7 changed files with 69 additions and 18 deletions.
4 changes: 2 additions & 2 deletions .gitignore
@@ -5,12 +5,12 @@ __pycache__/
*jpg
*pth
*npz
*txt*
*even*
*tar*
*jpeg
*json*
*tsv
.ipynb_checkpoints/
*batch*
wandb/
wandb/
data/
24 changes: 24 additions & 0 deletions Dockerfile
@@ -0,0 +1,24 @@
FROM nvcr.io/nvidia/pytorch:21.11-py3

ARG USER_ID
ARG GROUP_ID

ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

WORKDIR /workspace

COPY requirements.txt .

RUN pip install -r requirements.txt

# Unset TORCH_CUDA_ARCH_LIST and exec. This makes pytorch run-time
# extension builds significantly faster as we only compile for the
# currently active GPU configuration.
#RUN (printf '#!/bin/bash\nunset TORCH_CUDA_ARCH_LIST\nexec \"$@\"\n' >> /entry.sh) && chmod a+x /entry.sh
#ENTRYPOINT ["/entry.sh"]

RUN addgroup --gid $GROUP_ID user
RUN adduser --disabled-password --gecos '' --uid $USER_ID --gid $GROUP_ID user
USER user

30 changes: 25 additions & 5 deletions README.md
@@ -1,5 +1,8 @@
# Federated Learning in Healthcare with Flower

[![Watch the video](https://img.youtube.com/vi/kifLAY_5JA0/maxresdefault.jpg)](https://youtu.be/kifLAY_5JA0)
Results from the project were presented at the [Flower Summit, Cambridge 2022](https://flower.dev/conf/flower-summit-2022/).

Harnessing the potential of AI in healthcare requires access to huge amounts of data to build robust models. One solution to overcome the problems of sharing healthcare data to develop better models is federated learning. In this scenario, different models are trained on each hospital's local data and share their knowledge (parameters) with a central server that performs the aggregation in order to achieve a more robust and fair model.
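
The aggregation step described above is often done with federated averaging (FedAvg), where the server averages client parameters weighted by each client's number of training examples. The block below is a minimal pure-Python sketch of that idea under stated assumptions, not the code used in this repository: `fed_avg`, `hospital_a`, and `hospital_b` are illustrative names, and real Flower strategies work on NumPy arrays rather than plain lists.

```python
from typing import List, Tuple


def fed_avg(updates: List[Tuple[List[float], int]]) -> List[float]:
    """Average parameter vectors, weighting each client by its example count.

    `updates` holds one (parameters, num_examples) pair per client.
    This is a hypothetical helper for illustration only.
    """
    total_examples = sum(n for _, n in updates)
    if total_examples == 0:
        raise ValueError("no training examples reported by any client")

    num_params = len(updates[0][0])
    averaged = [0.0] * num_params
    for params, n in updates:
        if len(params) != num_params:
            raise ValueError("all clients must send the same number of parameters")
        for i, value in enumerate(params):
            # Each client contributes in proportion to its share of the data.
            averaged[i] += value * n / total_examples
    return averaged


# Two simulated hospitals: the second has three times as much data,
# so its parameters pull the global model three times as strongly.
hospital_a = ([1.0, 2.0], 100)
hospital_b = ([3.0, 4.0], 300)
global_params = fed_avg([hospital_a, hospital_b])
```

Weighting by example count keeps a small site from having the same influence as a large one; other strategies (for example the FedBN variant named in the `--tags` default of `client_isic.py`) change which parameters are shared.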

This repository contains the code to reproduce the experiments performed in the framework of the Decentralized AI in Healthcare project at Sahlgrenska University Hospital and AI Sweden. In this example repository we work with publicly available data (ISIC Archive) and simulate the decentralised setup internally. We are actively working on two tasks:
@@ -9,7 +12,8 @@ This repository contains the code to reproduce the experiments performed in the
* Image classification in FL/SL setup.

Our main use case is connected with Melanoma Diagnosis using ISIC Dataset:
* **ISIC 2020**: Download the [ISIC 2020 dataset](https://www.kaggle.com/nroman/melanoma-external-malignant-256)

* **ISIC 2020**: Download the [ISIC 2020 dataset](https://www.kaggle.com/nroman/melanoma-external-malignant-256)

## Flower framework
Flower is a user-friendly framework designed for implementing the Federated Learning approach.
@@ -18,7 +22,7 @@ Flower is a user-friendly framework designed for implementing the Federated Lear

Installing the Flower framework requires Python 3.6 or higher.

To install its stabte version found on PyPI:
To install its stable version found on PyPI:

```pip install flwr ```

@@ -30,6 +34,22 @@ To install its latest version from GitHub

```pip install git+https://github.com/adap/flower.git ```


### Requirements

Use the provided Dockerfile to build an image with the required library dependencies, which are listed in requirements.txt.

```docker build -t decentralized_ai_dermatology --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) -f Dockerfile .```

Start the container

```docker run -d --rm -it --volume $(pwd):/workspace --shm-size 8G --name decentralized_ai_dermatology decentralized_ai_dermatology```

Open a shell in the running container

```docker exec -it decentralized_ai_dermatology bash```


### Federated learning pipeline

A federated learning system needs two parts
@@ -434,7 +454,7 @@ With both client and server ready, we can now run everything and see federated l
Once the server is running we can start the clients in different terminals. Open a new terminal per client and start the client:

[`client_isic.py`](https://github.com/aidotse/decentralizedAI_dermatology/blob/master/client_isic.py) per terminal:
```python client_isic.py –path_data <path> –num_partitions <> –partition <> –gpu <gpu ID>```
```python client_isic.py --path <path> --num_partitions <> --partition <> --gpu <gpu ID>```

Note: Use `--nowandb` flag if you want to disable wandb logging.

@@ -443,7 +463,7 @@ Note: Use `--nowandb` flag if you want to disable wandb logging.
To train the model in a centralized way in case you want to make a comparison, you can run:

[`train_local.py`](https://github.com/aidotse/decentralizedAI_dermatology/blob/master/train_local.py)
```python train_local.py –path_data <path> –num_partitions <> –partition <> –gpu <gpu ID>```
```python train_local.py --path_data <path> --num_partitions <> --partition <> --gpu <gpu ID>```

Note: Use `--nowandb` flag if you want to disable wandb logging.

@@ -465,7 +485,7 @@ For StylGAN2-ADA implementation we used:
The model is evaluated in a decentralized manner. For some of the GAN parameters, see the script.

2. Launch one [`client_isic_gan.py`](https://github.com/aidotse/decentralizedAI_dermatology/blob/master/client_isic_gan.py) per terminal:
```python client_isic_gan.py –data <path> –num_partitions <> –partition <> –gpu <gpu ID>```
```python client_isic_gan.py --data <path> --num_partitions <> --partition <> --gpu <gpu ID>```

Note: Use `--wandb` flag if you want to enable wandb logging.

18 changes: 10 additions & 8 deletions client_isic.py
@@ -111,18 +111,20 @@ def evaluate(

if __name__ == "__main__":
parser = ArgumentParser()
parser.add_argument("--model", type=str, default='efficientnet-b2')
parser.add_argument("--model", type=str, default="efficientnet-b2")
parser.add_argument("--log_interval", type=int, default=100)
parser.add_argument("--epochs", type=int, default=2)
parser.add_argument("--batch_train", type=int, default=32)
parser.add_argument("--num_partitions", type=int, default=20)
parser.add_argument("--partition", type=int, default=0)
parser.add_argument("--gpu", type=int, default=0)
parser.add_argument("--tags", type=str, default='Exp 5. FedBN')
parser.add_argument("--tags", type=str, default="Exp 5. FedBN")
parser.add_argument("--nowandb", action="store_true")
parser.add_argument("--path", type=str, default='/workspace/melanoma_isic_dataset')
parser.add_argument("--path", type=str, default="/workspace/melanoma_isic_dataset")
parser.add_argument("--host", type=str, default="0.0.0.0")
parser.add_argument("--port", type=str, default="8080")
args = parser.parse_args()


# Setting up GPU for processing or CPU if GPU isn't available
os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
device = torch.device( "cuda" if torch.cuda.is_available() else "cpu")
@@ -132,13 +134,13 @@ def evaluate(
model = utils.load_model(args.model, device)

if not args.nowandb:
wandb.init(project="dai-healthcare" , entity='eyeforai', group='FL', tags=[args.tags], config={"model": args.model})
wandb.init(project="dai-healthcare" , entity="eyeforai", group="FL", tags=[args.tags], config={"model": args.model})
wandb.config.update(args)
# wandb.watch(model, log="all")

# Load data
# Normal partition
trainset, valset, num_examples = utils.load_isic_data()
trainset, valset, num_examples = utils.load_isic_data(args.path)
trainset, valset, num_examples = utils.load_partition(trainset, valset, num_examples, idx=args.partition, num_partitions=args.num_partitions)
# Exp 1
# trainset, testset, num_examples = utils.load_exp1_partition(trainset, testset, num_examples, idx=args.partition)
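
`utils.load_partition` is not part of this diff, so its implementation is not shown here. As a rough sketch of what selecting partition `idx` out of `num_partitions` could look like, the hypothetical helper `partition_indices` below splits a range of sample indices into contiguous, nearly equal shards. The real function may shuffle, stratify, or split by patient instead.

```python
from typing import List


def partition_indices(num_samples: int, num_partitions: int, idx: int) -> List[int]:
    """Return the sample indices that belong to shard `idx`.

    Hypothetical illustration only; not the repository's load_partition.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    if not 0 <= idx < num_partitions:
        raise ValueError("idx must be in [0, num_partitions)")

    base, extra = divmod(num_samples, num_partitions)
    # The first `extra` shards receive one additional sample each,
    # so shard sizes differ by at most one.
    start = idx * base + min(idx, extra)
    size = base + (1 if idx < extra else 0)
    return list(range(start, start + size))


# Ten samples split across three simulated clients.
shards = [partition_indices(10, 3, i) for i in range(3)]
```

Every sample lands in exactly one shard, which mirrors the simulated setup where each client sees only its own slice of the ISIC data.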
@@ -150,12 +152,12 @@ def evaluate(

print(f"Train dataset: {len(trainset)}, Val dataset: {len(valset)}, Test dataset: {len(testset)}")

train_loader = DataLoader(trainset, batch_size=32, num_workers=4, worker_init_fn=utils.seed_worker, shuffle=True)
train_loader = DataLoader(trainset, batch_size=args.batch_train, num_workers=4, worker_init_fn=utils.seed_worker, shuffle=True)
val_loader = DataLoader(valset, batch_size=16, num_workers=4, worker_init_fn=utils.seed_worker, shuffle = False)
test_loader = DataLoader(testset, batch_size=16, num_workers=4, worker_init_fn=utils.seed_worker, shuffle = False)

# Start client
client = Client(model, train_loader, val_loader, test_loader, num_examples)
fl.client.start_numpy_client("0.0.0.0:8080", client)
fl.client.start_numpy_client(args.host + ":" + args.port, client)
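
The new `--host` and `--port` options are joined into the `host:port` string passed to `fl.client.start_numpy_client`. A small stand-alone sketch of that pattern follows. `build_parser` and `server_address` are illustrative names that do not exist in the repository, and the port is parsed as `int` here (the commit keeps it as `str`) so that invalid values are rejected before a connection is attempted.

```python
from argparse import ArgumentParser


def build_parser() -> ArgumentParser:
    """Parser with only the two connection options added in this commit."""
    parser = ArgumentParser()
    parser.add_argument("--host", type=str, default="0.0.0.0")
    # Assumption: int instead of the commit's str, to validate the range below.
    parser.add_argument("--port", type=int, default=8080)
    return parser


def server_address(host: str, port: int) -> str:
    """Build the 'host:port' string expected as the server address."""
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"{host}:{port}"


# Simulate: python client_isic.py --host 10.0.0.5 --port 9000
args = build_parser().parse_args(["--host", "10.0.0.5", "--port", "9000"])
address = server_address(args.host, args.port)
```

The resulting `address` would take the place of `args.host + ":" + args.port` in the call above.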


5 changes: 5 additions & 0 deletions requirements.txt
@@ -0,0 +1,5 @@
imageio-ffmpeg==0.4.3
pyspng==0.1.0
flwr==0.18.0
efficientnet_pytorch==0.7.1
wandb==0.12.17
2 changes: 1 addition & 1 deletion server_advanced.py
@@ -31,7 +31,7 @@ def get_eval_fn(model, path):
# Load data and model here to avoid the overhead of doing it in `evaluate` itself

# Exp 1
trainset, testset, num_examples = utils.load_isic_data()
trainset, testset, num_examples = utils.load_isic_data(path)
trainset, testset, num_examples = utils.load_partition(trainset, testset, num_examples, idx=3, num_partitions=10) # Use validation set partition 3 for evaluation of the whole model

# Exp 2
4 changes: 2 additions & 2 deletions train_local.py
@@ -48,13 +48,13 @@

# Load data
# trainset, testset, num_examples = utils.load_exp1_partition(trainset, testset, num_examples, idx=args.partition)
train_df, validation_df, num_examples = utils.load_isic_by_patient(args.partition)
train_df, validation_df, num_examples = utils.load_isic_by_patient(args.partition, args.path_data)

trainset = utils.CustomDataset(df = train_df, train = True, transforms = training_transforms)
valset = utils.CustomDataset(df = validation_df, train = True, transforms = testing_transforms )
train_loader = DataLoader(trainset, batch_size=32, num_workers=8, worker_init_fn=utils.seed_worker ,shuffle=True)
val_loader = DataLoader(valset, batch_size=16, num_workers=4, worker_init_fn=utils.seed_worker, shuffle = False)
testset = utils.load_isic_by_patient(-1)
testset = utils.load_isic_by_patient(-1, args.path_data)
test_loader = DataLoader(testset, batch_size=16, num_workers=4, worker_init_fn=utils.seed_worker, shuffle = False)
print(f"Train dataset: {len(trainset)}, Val dataset: {len(valset)}, Test dataset: {len(testset)}")

