
add hardware guide for bioimage analysis #47

Open
wants to merge 10 commits into
base: main

Conversation

jackyko1991

brief hardware guide for bioimage analysis

@haesleinhuepf
Owner

haesleinhuepf commented Jul 6, 2024

Hi @jackyko1991 ,

thanks for sending this! Before I read through this in detail, can you confirm that you hold the copyright for the figures?

Edit: One more thing: the text does not really give advice on how to choose the optimal computer and operating system. But that is what the headline promises...

Thanks!

Best,
Robert

@jackyko1991
Author

Hi @haesleinhuepf, I have updated the figures to make sure I hold the copyright.

A hardware comparison table has been added for quick reference across the different hardware types.

Owner

@haesleinhuepf left a comment


Hi @jackyko1991,

thanks again for working on this! Computing hardware is certainly a section I haven't considered yet. However, the question "What computer should I buy to do bio-image analysis?" comes up frequently, hence this page could be a good resource to point at. I made some comments in specific cases. Some general feedback: I would love to see this document simplified quite a bit. At the moment it contains many terms that must be deterrent to a bio-image analysis beginner, and even to advanced folks who did not study computer science. When updating the document, think about the target audience as specified here: "Python beginners who are interested in analyzing images". I'm sure it's interesting for them to read what an NPU is, but spending multiple sentences and text sections on the benefits of NPUs might not be worth the effort, because the reader may struggle with setting up a functional Python environment on the remote computer which has an NPU.
The same goes for "sockets", "edge computing", "shells", "VSCode", "SSH" and "IPPs". All of those things might not be very relevant for someone who aims at learning Python to analyse their images. But they would be thankful for some advice regarding which computer to buy and which GPU to use for which kind of tasks.

In the GPU-acceleration section, a more advanced document could live explaining the differences between NVidia, AMD and Intel GPUs, CUDA, OpenCL, etc.; NPUs could live there as well. Do you by chance have an example notebook demonstrating how to use an NPU?

I know I'm requesting quite some changes. But I think it's worth the effort.

Thanks for your work!

Best,
Robert

docs/01_introduction/hardware.md Outdated

Though Python runs on most modern operating systems (OS), including Windows, macOS and Linux, it is beneficial to keep scripting under a *nix environment. Here we provide a guide for beginners to choose their computing hardware.

This guide is intentionally written for programming beginners to code locally. For advanced research units equipped with Python servers, we will cover a series of remote coding techniques to enable more complex bioimage analysis.
Owner


Suggested change
This guide is intentionally written for programming beginners to code locally. For advanced research units equipped with Python servers, we will cover a series of remote coding techniques to enable more complex bioimage analysis.
This guide is intentionally written for programming beginners to code locally.

I'm removing the sentence about remote coding, because the mentioned series is not linked yet. We can add the reference back once these tutorials have been written.

Author


Removed accordingly

| **OS** | Windows, macOS, Linux | Windows, macOS, Linux | Windows, macOS, Linux | Linux, Windows Server |
| **Portability** | Highly portable | Not portable | Not portable | Not portable |
| **Application Scenarios** | Mobile work, basic to moderate tasks | Stationary use, moderate to intensive tasks | Intensive tasks, advanced analysis | Large-scale projects, remote access, collaborative research |
| **ARM vs x86** | Mostly x86 (some ARM options like Apple Silicon and Snapdragon X Elite) | Mostly x86 except for Apple | Mostly x86 except for Apple | Mostly x86 (ARM servers available, e.g., AWS Graviton) |
Owner


I'm curious: have you tried installing Python on a non-Mac ARM computer, e.g. one featuring a Snapdragon CPU? I'm curious how well the Python ecosystem is compatible with these machines. Also, which operating system runs on non-Mac ARM computers?

Author


So far I have tested several ARM platforms like the Raspberry Pi and an NVidia SBC. Both are running Ubuntu.

Software is slowly catching up with pre-compiled libraries on the Linux side. If one uses VSCode and miniforge, the environment is quite mature.

miniforge has no pre-compiled version for ARM Windows, so that is 100% not recommended.

Certain image libraries require OpenGL, and some SoCs from Broadcom (RPi) natively run OpenGL ES 2, which emulates OpenGL using the Mesa drivers from Debian. High-performance rendering like 3D data plots is therefore very bottlenecked.

Owner


ok, then how likely is it that someone who aims at analysing images uses a Raspberry Pi or an NVidia SBC?

| **Portability** | Highly portable | Not portable | Not portable | Not portable |
| **Application Scenarios** | Mobile work, basic to moderate tasks | Stationary use, moderate to intensive tasks | Intensive tasks, advanced analysis | Large-scale projects, remote access, collaborative research |
| **ARM vs x86** | Mostly x86 (some ARM options like Apple Silicon and Snapdragon X Elite) | Mostly x86 except for Apple | Mostly x86 except for Apple | Mostly x86 (ARM servers available, e.g., AWS Graviton) |
| **ARM Performance** | Energy-efficient, good for battery life | Limited use, lower performance than x86, suitable for edging computing like smart microscopy | Rare, used in specific scenarios | High efficiency, used in cloud services |
Owner


The not-so-computational reader might wonder what "edging computing" is.

Author


removed the term

| **Application Scenarios** | Mobile work, basic to moderate tasks | Stationary use, moderate to intensive tasks | Intensive tasks, advanced analysis | Large-scale projects, remote access, collaborative research |
| **ARM vs x86** | Mostly x86 (some ARM options like Apple Silicon and Snapdragon X Elite) | Mostly x86 except for Apple | Mostly x86 except for Apple | Mostly x86 (ARM servers available, e.g., AWS Graviton) |
| **ARM Performance** | Energy-efficient, good for battery life | Limited use, lower performance than x86, suitable for edging computing like smart microscopy | Rare, used in specific scenarios | High efficiency, used in cloud services |
| **x86 Performance** | High performance, widely supported | Higher performance, widely supported | Highest performance, widely supported | Highest performance, widely supported |
Owner


What do you think about the difference between x86 and x64?

Author


x64 (in full, x86-64) is the 64-bit version of the x86 instruction set; x86 is the broader family name.


## GPU Support
### AI Training
Though all SoC manufacturers embed GPUs in their chipsets, AI-based analysis largely relies on NVidia CUDA as the base software stack. Common neural network libraries in Python (PyTorch and TensorFlow) are the foundation of popular models like UNet, Cellpose and StarDist. While we are seeing recent support for PyTorch on AMD ROCm and Intel oneAPI AI acceleration, the community support is fairly limited compared to CUDA. Considering the training scalability and infrastructure support across major GPU farms/research clusters, NVidia is still the only serious option for new model training.
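
For illustration, a minimal sketch (not from the original guide, assuming a CUDA-enabled PyTorch build is installed) of checking whether PyTorch actually sees an NVidia GPU before relying on CUDA-dependent tools such as Cellpose or StarDist:

```python
# Minimal sketch: check whether PyTorch can use a CUDA-capable NVidia GPU.
import torch

if torch.cuda.is_available():
    # This is the GPU that CUDA-based tools like Cellpose or StarDist would use.
    print("CUDA GPU found:", torch.cuda.get_device_name(0))
else:
    # Training and inference will fall back to the (much slower) CPU.
    print("No CUDA GPU detected; falling back to CPU.")
```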
Owner


I'm not sure if SoC users are the primary target audience. I'm wondering if most imaging scientists in "rich" institutes have a workstation with an NVidia GPU, while less wealthy image analysts may do their work on cheaper laptops, perhaps with gaming GPUs.

Author


NPUs are more for inference. In my analysis experience, cell detection nowadays relies about 80% on AI-based segmentation (mainly Cellpose).

I don't think many bioimage analysts will train their own specific cell detection model. That's why I think the NPU will play a significant role in the upcoming years and is worth mentioning more than the GPU.

From 2024 onwards, all new laptop CPUs are SoCs, most of them with an embedded NPU. The only difference is whether there is an additional, independent CUDA chipset.

Though all SoC manufacturers embed GPUs in their chipsets, AI-based analysis largely relies on NVidia CUDA as the base software stack. Common neural network libraries in Python (PyTorch and TensorFlow) are the foundation of popular models like UNet, Cellpose and StarDist. While we are seeing recent support for PyTorch on AMD ROCm and Intel oneAPI AI acceleration, the community support is fairly limited compared to CUDA. Considering the training scalability and infrastructure support across major GPU farms/research clusters, NVidia is still the only serious option for new model training.

### AI Inference
Machine learning algorithms consist of two parts: model training and inference. The computational resources needed to apply a fixed AI model to new data are much smaller than those needed to train it from scratch. For such smaller AI tasks, non-CUDA chipsets bring more options for bioimage analysis. The inference of neural-network-based AI can be physically accelerated with specifically designed circuits; such designs are often referred to as neural processing units (NPUs). NVidia specifically added Tensor Cores, bundled with optimised packages like cuDNN and the Transformer Engine, to their later GPU products. We will cover this topic later in the article.
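
For illustration, a rough sketch of inference-only use of a pretrained model (here Cellpose; `image` is a placeholder array, and the exact call signature may differ between Cellpose versions):

```python
# Rough sketch: run a pretrained Cellpose model for inference only; no
# training hardware is required for this step.
import numpy as np
from cellpose import models

image = np.random.rand(256, 256)                       # placeholder 2D image
model = models.Cellpose(gpu=True, model_type="cyto")   # uses the GPU if available
masks, flows, styles, diams = model.eval(image, diameter=None, channels=[0, 0])
print("number of detected objects:", masks.max())
```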
Owner


Again, I'm not sure if specifically designed circuits and NPUs are available to common bio-image analysts.

Author


They are in all newly released laptops, from Intel, Apple, AMD and NVidia.

Apple has shipped an NPU since the M1, and NVidia has included Tensor Cores since Volta (V100/GeForce RTX 20 series).

All Tensor Cores are enabled by default with TensorFlow (https://docs.nvidia.com/deeplearning/frameworks/tensorflow-user-guide/index.html#tf_disable_tensor_op_math).

Regarding the concern about beginners: if one cannot afford the high price of NVidia devices, I am wondering whether they should still have a guide on NPU-accelerated inference. Apple users will need the forked version of TensorFlow: https://developer.apple.com/metal/tensorflow-plugin/
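
As a minimal sketch (not part of the guide): one can check which accelerators TensorFlow sees and opt into mixed precision, which is what Tensor-Core-style hardware accelerates.

```python
# Minimal sketch: list the accelerators TensorFlow can see and enable mixed
# precision (float16 compute), which Tensor Cores and similar units accelerate.
# On Apple Silicon this requires the tensorflow-metal plugin mentioned above.
import tensorflow as tf

print("Visible GPUs:", tf.config.list_physical_devices("GPU"))

# float16 for compute, float32 for variables; matrix multiplications and
# convolutions can then be dispatched to Tensor-Core-style hardware.
tf.keras.mixed_precision.set_global_policy("mixed_float16")
print("Compute dtype:", tf.keras.mixed_precision.global_policy().compute_dtype)
```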

</div>

### GPGPU Acceleration
Apart from AI applications, bioimage analysis tasks like single plane illumination fluorescence correlation spectroscopy (SPIM-FCS) perform [pixelwise fitting of the autocorrelation function](https://github.com/bpi-oxford/Gpufit/blob/master/Gpufit/models/spim_acfN.cuh). In quantitative imaging one may be interested in photon counting or camera-calibrated denoising, which largely rely on [pixel-by-pixel gain fitting](https://github.com/jackyko1991/sCMOS-Denoise/blob/main/notebooks/camera_calibration.ipynb). Such image analysis can utilise the parallelisation power of the GPU to accelerate the research.
Owner


Author


link removed

### GPGPU Acceleration
Apart from AI applications, bioimage analysis tasks like single plane illumination fluorescence correlation spectroscopy (SPIM-FCS) perform [pixelwise fitting of the autocorrelation function](https://github.com/bpi-oxford/Gpufit/blob/master/Gpufit/models/spim_acfN.cuh). In quantitative imaging one may be interested in photon counting or camera-calibrated denoising, which largely rely on [pixel-by-pixel gain fitting](https://github.com/jackyko1991/sCMOS-Denoise/blob/main/notebooks/camera_calibration.ipynb). Such image analysis can utilise the parallelisation power of the GPU to accelerate the research.

One high-level analysis package, [py-clesperanto](https://github.com/clEsperanto/pyclesperanto_prototype), provides GPU acceleration based on OpenCL. This computing approach means bioimage analysis is not bound to graphics processing but can use the GPU for more generic calculations; for this reason the GPU is often referred to as a general-purpose GPU (GPGPU). Vendors like AMD and Intel are alternatives to NVidia in this sense.
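
For illustration, a minimal sketch of such an OpenCL-based workflow (assuming pyclesperanto_prototype is installed; the image is a placeholder):

```python
# Sketch of an OpenCL-based (GPGPU) workflow with py-clesperanto; it runs on
# NVidia, AMD and Intel GPUs alike.
import numpy as np
import pyclesperanto_prototype as cle

device = cle.select_device()                           # pick the first available OpenCL device
print("Using OpenCL device:", device)

image = np.random.rand(512, 512).astype(np.float32)   # placeholder image

input_gpu = cle.push(image)                            # copy the image to GPU memory
blurred = cle.gaussian_blur(input_gpu, sigma_x=2, sigma_y=2)
labels = cle.voronoi_otsu_labeling(blurred, spot_sigma=3, outline_sigma=1)

result = cle.pull(labels)                              # copy the result back to the CPU
print("objects found:", int(result.max()))
```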
Owner


How about mentioning AMD and Intel GPUs in a table above?

Author


added comparison between GPU vendors

docs/_toc.yml Outdated
@@ -8,6 +8,7 @@ parts:
- caption: Basics
chapters:
- file: 01_introduction/trailer
- file: 01_introduction/hardware
Owner


I'm wondering if the document might fit better in an advanced section, e.g. in the GPU-acceleration section? We should certainly not introduce hardware aspects before anything else.

Author


moved to a separate hardware section

@jackyko1991
Author

@haesleinhuepf I think the hardware part may be less relevant to beginners, but it still serves as a good reference for intermediate to advanced level bioimage analysts. I have moved the article out of the beginning and put it in a separate "hardware" folder.

For better readability, I split it into several pages. Hope it helps.

I'm not quite sure shell scripting is needed, as this page focuses on Python coding. Yet in my analysis experience, bash coding is essential for automating analysis tasks and advancing to more complex bioimage analysis, so I have kept the section under advanced Python.

My idea is to open this teaching material not just to Windows users but to a broader audience using different OSs. To accommodate cross-platform compatibility, following bash conventions is advantageous. Let me know if my thoughts meet the page's intention.

Owner

@haesleinhuepf left a comment


Hi again,

it would be great if you could comment on the questions I raised in my review this afternoon. The new text is still full of aspects which I think are irrelevant to the target audience (SoCs, NPUs, IPP, ...), and huge parts of the text require having studied computer science to understand. Please think of the target audience, and consider shortening.

Thank you for your time!

Best,
Robert


PyTorch provides automatic mixed precision (AMP) through the torch.cuda.amp module.

```python
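# (Hedged sketch filling in the truncated example: a minimal mixed-precision
# training step using torch.cuda.amp; the model, optimizer and data below are
# placeholders, not the guide's original code.)
import torch

model = torch.nn.Linear(1024, 10).cuda()                 # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()                      # rescales the loss to avoid float16 underflow

for _ in range(10):                                       # placeholder training loop
    inputs = torch.randn(32, 1024, device="cuda")
    targets = torch.randint(0, 10, (32,), device="cuda")

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():                       # forward pass in mixed precision
        outputs = model(inputs)
        loss = loss_fn(outputs, targets)

    scaler.scale(loss).backward()                         # backward pass on the scaled loss
    scaler.step(optimizer)                                 # unscale gradients, then optimizer step
    scaler.update()
```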
Owner


Could you put that code into a notebook like all the other notebooks in this collection?

<p><em>Intel Meteor Lake processor architecture. Modern-day IC vendors tend to integrate various computation components on a single chipset to improve performance. When performing bioimage analysis we often utilise the processor's different computational units. Certain processor architectures facilitate specific tasks particularly well, e.g. image decode/encode tasks can take advantage of the Intel Integrated Performance Primitives (IPP) library with hardware-level acceleration.</em></p>
</div>

Modern computer CPUs lean more towards a System-on-a-Chip (SoC) design that integrates all major components of a computing device, including CPU, GPU, NPU and RAM. The physical compactness brings shorter communication routes between the computing units, hence improving computing performance.
Owner


Do you think that detailed knowledge of what SoCs are is relevant for the target audience (Python beginners who strive to analyse images)?

Author


I will remove all technical terms.

@jackyko1991
Author

@haesleinhuepf All of the technical terms have been removed. An example notebook running on Tensor Cores has been added.

If you think the NPU is still too out of context for beginners, I can move it elsewhere.
