Fogg Lab Tissue Model Analysis Tools

An application for automated high-throughput analysis of cancer and endothelial cell dynamics in hydrogels.

gui-screenshot

Try the interactive demo notebook in Google Colab

Table of Contents

Capabilities
GUI Setup
CLI Setup
Tools
Supported Image Formats
Image Input Directory Structure
Usage

Capabilities

For a detailed description of analysis capabilities, see the capabilities overview notebook.

Setup Option 1: Graphical User Interface (GUI)

This option is currently available for Windows users only. Linux and macOS users are encouraged to install the CLI tools.

  1. Navigate to the latest release on the releases page.
  2. Under "Assets", download the tmat-win64.zip file.
  3. Extract the contents of the zip file to any location on your system (e.g. Desktop). Make sure you use the extract button ("Extract all" or "Extract to") instead of manually dragging files out of the zip file. To analyze your images, open the tmat program stored within the extracted folder and follow the on-screen guidance (see the Usage section for more info).

Setup Option 2: Command Line Interface (CLI)

Windows, macOS, and Linux are all supported for regular installation without GPU acceleration. GPU acceleration requires Linux or Windows Subsystem for Linux (WSL) and an NVIDIA CUDA-capable GPU.

Prerequisite for the CLI Option: Install Python and pipx

Note: Windows users also need to install Microsoft C++ Redistributable.

Note: As an alternative to pipx, you can install Tissue Model Analysis Tools in a Conda environment. Otherwise, follow the instructions below.

1. Install a version of Python in the range >=3.9,<3.12 (such as Python 3.11.9). To confirm that a supported version is installed, run each of the commands below in a terminal or command prompt window and note which one is recognized. Depending on your system configuration, Python may be installed as python, python3, or py, or under a version-specific name such as python3.10:

python --version
python3 --version
py --version
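If you would rather check the version from inside Python, the following is a minimal sketch. The function name `is_supported_python` is illustrative and not part of tmat; the bounds simply mirror the >=3.9,<3.12 range stated above.

```python
import sys

# Supported range taken from the setup instructions: >=3.9, <3.12.
MIN_VERSION = (3, 9)
MAX_VERSION_EXCLUSIVE = (3, 12)


def is_supported_python(version=None):
    """Return True if the (major, minor) version is in the supported range.

    `version` is any sequence starting with (major, minor); it defaults to
    the interpreter that is running this script.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {found} is {status}")
```

Save it to a file and run it with whichever command was recognized (python, python3, or py) to check that specific interpreter.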

2. In your terminal or command prompt window, install pipx. To do so, run the two commands below that correspond to your Python installation (starting with either python -m ..., python3 -m ..., or py -m ...):

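# If "python" is the recognized command:
python -m pip install --user pipx
python -m pipx ensurepath
# If "python3" is the recognized command:
python3 -m pip install --user pipx
python3 -m pipx ensurepath
# If "py" is the recognized command:
py -m pip install --user pipx
py -m pipx ensurepath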

After you run these commands, close the terminal or command prompt window. pipx will be available the next time you open a terminal or command prompt window, and you can proceed with the setup.

CLI Setup

Run the following commands in a terminal or command prompt window.

1. Install the fl_tissue_model_tools command-line utility, tmat for short (estimated time: 5 minutes):

Note: Use the "Regular Installation" option unless you have an NVIDIA GPU and are running Linux or WSL.

Regular Installation

pipx install fl_tissue_model_tools@git+https://github.com/fogg-lab/tissue-model-analysis-tools.git

Installation with CUDA (GPU Acceleration)

pipx install 'fl_tissue_model_tools[and-cuda]@git+https://github.com/fogg-lab/tissue-model-analysis-tools.git'

2. Configure the base directory used to store data, scripts, and script configuration files:

tmat configure

3. Note that commands will follow this layout (more details in usage):

tmat [SUBCOMMAND] [OPTIONS]

Or you can use the interactive mode:

tmat

Uninstall fl_tissue_model_tools CLI Utility

Execute

pipx uninstall fl_tissue_model_tools

Update fl_tissue_model_tools CLI Utility

To update tmat, just reinstall it with the --force flag:

pipx install fl_tissue_model_tools@git+https://github.com/fogg-lab/tissue-model-analysis-tools.git --force
tmat configure

Tools

tmat consists of four automated image analysis tools:

  • Z projection of image Z stacks. The input is a directory of Z stacks. The output is a directory of Z projections.
  • Cell coverage area computation. The input is a directory of images (for instance, Z projections). The output is a CSV file and a directory of binary masks, one mask per image, to visually show what was detected as cells.
  • Invasion depth computation (of Z stacks). The input is a directory of Z stacks. The output is a CSV file containing invasion predictions for each Z position in each Z stack.
  • Quantify microvessel formation (number of branches, lengths of branches). The input is a directory of images, which can be either 2D images (such as Z projections) or 3D Z stacks. Z stacks should be provided either as single files or as numbered image sequences (with z0, z1, z2, etc. in the filenames, not case sensitive) contained in subdirectories, one subdirectory per Z stack. The output is a CSV file containing the total number of branches, total branch length, and average branch length. This tool also writes a directory of intermediate outputs: visualizations that you can use to confirm the validity of the analysis. If the results look poor, these visualizations can help you tweak the configuration parameters and run the tool again.

Supported Image Formats

tmat supports all images that can be read by aicsimageio[nd2] such as TIFF, OME-TIFF, and ND2.

As a workaround for other formats, you can use the bioformats command line tool or Fiji to convert your files to TIFF or OME-TIFF and feed the converted files to tmat. Be aware that conversion might not preserve metadata related to physical pixel sizes/spacing. In that case you will need to manually specify parameters such as image_width_microns, because they can no longer be inferred from the converted image's metadata.

Z stacks can be provided as input in two ways: either as a single file in a multidimensional image format such as OME-TIFF or ND2, or as a numbered image sequence (the same layout you would use to load a Z stack from an image sequence in ImageJ). The scripts infer whether input Z stacks are provided as image sequences or single files by checking whether there are multiple Z slices per file.

Image Input Directory Structure

Input images should be organized in directories containing only input images, and no other files. Make sure your input images for each tool are organized in one of the following ways:

1. Single File Z Stacks (e.g. ND2, OME-TIFF)

Valid input directory structure for all tools

For example:

  • input_directory/
    • first_zstack.nd2
    • second_zstack.nd2
    • ...

2. Image Sequences

Valid input for all tools

Example 1 (using subdirectories):

  • input_directory/
    • first_zstack/
      • zstack1_z0.tif
      • zstack1_z1.tif
      • zstack1_z2.tif
      • ...
    • second_zstack/
      • zstack2_z0.tif
      • zstack2_z1.tif
      • zstack2_z2.tif
      • ...

Example 2 (no subdirectories):

  • input_directory/
    • first_zstack_Z0.tif
    • first_zstack_Z1.tif
    • first_zstack_Z2.tif
    • ...
    • second_zstack_Z0.tif
    • second_zstack_Z1.tif
    • second_zstack_Z2.tif
    • ...

What's important here is that all filenames contain the z position denoted by the letter z (uppercase or lowercase) followed by a number. Other characters in the filename must be consistent across all files in the sequence. Also note that nested subdirectories are not supported.
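To check a folder against this naming convention before running an analysis, a short script can group filenames by everything except the z index. The sketch below is an illustration only, not tmat's own parsing logic; the function name `group_z_sequences` and the choice to use the last "z + number" match in each filename are assumptions.

```python
import re
from collections import defaultdict

# The letter z (either case) followed by one or more digits.
Z_PATTERN = re.compile(r"z(\d+)", re.IGNORECASE)


def group_z_sequences(filenames):
    """Group filenames into Z stacks, each sorted by z position.

    Files belong to the same stack when their names are identical apart
    from the z index. Files without a z index are skipped.
    """
    groups = defaultdict(list)
    for name in filenames:
        matches = list(Z_PATTERN.finditer(name))
        if not matches:
            continue
        last = matches[-1]  # assumption: the last match is the z position
        key = name[:last.start()] + "z{}" + name[last.end():]
        groups[key].append((int(last.group(1)), name))
    # Sort numerically so that z10 comes after z2.
    return {key: [name for _, name in sorted(items)]
            for key, items in groups.items()}


if __name__ == "__main__":
    example = [
        "first_zstack_Z10.tif",
        "first_zstack_Z2.tif",
        "second_zstack_Z0.tif",
    ]
    for pattern, files in group_z_sequences(example).items():
        print(pattern, files)
```

A stack that splits into more groups than expected usually means some other part of the filename differs between slices.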

3. Z Projections or 2D Images

Valid input for cell coverage area computation and microvessel quantification

For example:

  • input_directory/
    • image1.tiff
    • image2.tiff
    • ...

Usage

GUI

  1. Open the program.
  2. Select one of the four tools from the tabs near the top of the window.
  3. Fill in the required input fields, and check the optional fields for more customization.
  4. Click the "Start" button.
  5. Once the analysis is complete, the output will be saved in the output directory you specified.

CLI

To use the tmat CLI utility, open a terminal or command prompt window and execute commands in the following format:

# For non-interactive use, specify all arguments in a single command
tmat [command_script] [-flags] [arguments]

For interactive use, just execute tmat.

# Interactive mode can be useful if you forget what command line arguments are available
tmat

For input data paths on Windows, it is usually easiest to copy the path from the file explorer search bar.

For a description of all parameters that each command-line tool accepts, execute one of the following:

tmat [command_script] -h
# or
tmat [command_script] --help
# or (get help at the interactive prompt)
tmat
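If you want to script several non-interactive runs, one option is to call tmat from Python. The sketch below only assembles and runs commands in the layout shown above; the helper names `build_tmat_command` and `run_tmat` are hypothetical, and tmat must already be installed and on your PATH.

```python
import subprocess


def build_tmat_command(subcommand, input_dir, output_dir, *options):
    """Assemble the argument list: tmat SUBCOMMAND [OPTIONS] INPUT OUTPUT."""
    return ["tmat", subcommand, *options, str(input_dir), str(output_dir)]


def run_tmat(subcommand, input_dir, output_dir, *options):
    """Run one tmat command and raise an error if it fails."""
    command = build_tmat_command(subcommand, input_dir, output_dir, *options)
    # Passing a list (not a single string) avoids shell quoting problems
    # with paths that contain spaces.
    subprocess.run(command, check=True)


# Example (paths are placeholders):
# run_tmat("compute_cell_area", "/path/to/input/folder",
#          "/path/to/output/folder", "--detect-well")
```

Because the arguments are passed as a list, Windows paths copied from the file explorer can be used without extra quoting.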

Cell Area (CLI Usage)

Basic usage (accept the default configuration)

tmat compute_cell_area "/path/to/input/folder" "/path/to/output/folder"

Here, /path/to/input/folder is the full path to a directory of images which will be analyzed.

If your images are not cropped to the region inside the well, you can have the script automatically detect the well region by adding the --detect-well flag (or -w for short). For instance, if your wells are circular and you add the --detect-well flag, the script will detect the well and mask out the region outside of it. This also works for "squircle"-shaped wells (i.e. square with rounded corners). Example usage:

tmat compute_cell_area --detect-well "/path/to/input/folder" "/path/to/output/folder"

Custom usage in the CLI utility (customize the analysis configuration)

  • Create a custom configuration .json file, using config/default_cell_area_computation.json as a template. The following parameters can be customized:
    • dsamp_size (integer): Size that input images will be downsampled to for analysis. Smaller sizes give faster but less accurate analysis. Default is 512, meaning the image will be downscaled so that the maximum dimension is 512 (e.g., 1000x1500 is downsampled to 341x512).
    • sd_coef (float): Strictness of thresholding. Positive numbers are more strict, negative numbers are less strict. This is a multiplier of the foreground pixel standard deviation, so values in the range (-2, 2) are the most reasonable.
    • rs_seed (integer): A random seed for the algorithm. Fixing the seed makes results reproducible, since the Gaussian curves are randomly initialized. Default is 0.
    • batch_size (integer): Number of images to process at once. Larger numbers are faster but require more memory. Default is 4.
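To get a feel for what a given dsamp_size does to your images, the arithmetic can be sketched as follows. This reproduces the 1000x1500 example above; the function name `downsampled_shape` is illustrative, and rounding to the nearest pixel is an assumption rather than a statement about tmat's internals.

```python
def downsampled_shape(height, width, dsamp_size=512):
    """Scale (height, width) so that the larger dimension equals dsamp_size."""
    scale = dsamp_size / max(height, width)
    # Assumption: dimensions are rounded to the nearest whole pixel.
    return round(height * scale), round(width * scale)


if __name__ == "__main__":
    print(downsampled_shape(1000, 1500))  # (341, 512)
```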

Run with a custom configuration file:

tmat compute_cell_area --config "/path/to/config/file.json" "/path/to/input/folder" "/path/to/output/folder"
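If you generate configuration files programmatically (for example, to compare several sd_coef values), a sketch along these lines may help. The helper names are hypothetical. In practice the template values should be loaded from config/default_cell_area_computation.json; the sd_coef value in the usage comment is a placeholder, since its default is not listed above.

```python
import json
from pathlib import Path

# Parameters documented above as customizable.
ALLOWED_KEYS = {"dsamp_size", "sd_coef", "rs_seed", "batch_size"}


def merge_config(template, overrides):
    """Return a copy of `template` with `overrides` applied.

    Raises KeyError for parameter names that are not documented, which
    catches typos such as "sd_coeff".
    """
    unknown = set(overrides) - ALLOWED_KEYS
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    return {**template, **overrides}


def write_config(config, out_path):
    """Write the configuration as indented JSON."""
    Path(out_path).write_text(json.dumps(config, indent=4) + "\n")


# Example (template path and sd_coef value are placeholders):
# template = json.loads(Path("config/default_cell_area_computation.json").read_text())
# write_config(merge_config(template, {"sd_coef": 0.5}), "custom_cell_area.json")
```

The resulting file can then be passed to tmat with the --config flag shown above.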

Z Projection (CLI Usage)

Basic usage (accept the default configuration)

tmat compute_zproj "/path/to/input/directory" "/path/to/output/folder"

Here, "/path/to/input/directory" is the full path to a directory of Z stacks. Z stacks should be in one of the following supported formats:

  • Each Z stack can be a sequence of images with numbered z positions. Each Z stack image sequence can be in its own subdirectory.
  • Each Z stack can be contained in one file. ND2, TIFF and OME-TIFF files are supported.

To compute Z projections and their cell coverage area, add the --area flag:

tmat compute_zproj --area "/path/to/input/folder" "/path/to/output/folder"
  • Use the --method flag to select a Z projection method from the following:
    • Minimum: Minimum intensity projection, use --method min
    • Maximum (default): Maximum intensity projection, use --method max
    • Median: Median intensity projection, use --method med
    • Average: Average intensity projection, use --method avg
    • Focus Stacking: Focus stacking projection, use --method fs.

Example: Compute Z projections and cell coverage area with the focus stacking method

tmat compute_zproj --area --method fs "/path/to/input/folder" "/path/to/output/folder"

See Capabilities for details.
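For intuition about the four intensity-based methods, the sketch below computes each projection on a tiny Z stack stored as nested lists. It is a conceptual illustration in pure Python, not tmat's implementation; focus stacking (fs) is deliberately left out because it is more involved, and the function name `z_project` is an assumption.

```python
from statistics import mean, median

# Map each --method value to the function applied across Z at every pixel.
REDUCERS = {"min": min, "max": max, "med": median, "avg": mean}


def z_project(stack, method="max"):
    """Project a Z stack (indexed as stack[z][row][col]) to one 2D image."""
    if method not in REDUCERS:
        raise ValueError(f"Unsupported method: {method!r}")
    reduce_fn = REDUCERS[method]
    # zip(*stack) pairs up the same row from every slice;
    # zip(*rows) then pairs up the same pixel from every slice.
    return [[reduce_fn(pixels) for pixels in zip(*rows)]
            for rows in zip(*stack)]


if __name__ == "__main__":
    stack = [
        [[1, 5], [3, 0]],  # z0
        [[4, 2], [3, 9]],  # z1
        [[2, 8], [6, 1]],  # z2
    ]
    print(z_project(stack, "max"))  # [[4, 8], [6, 9]]
    print(z_project(stack, "min"))  # [[1, 2], [3, 0]]
    print(z_project(stack, "med"))  # [[2, 5], [3, 1]]
```

On real images the same reductions are normally done with an array library, for example NumPy's `np.max(stack, axis=0)`.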

Invasion Depth (CLI Usage)

Usage

tmat compute_inv_depth "/path/to/input/folder" "/path/to/output/folder"

For a description of the input directory structure, see Z Projection.

Branches (quantify vessel formation)

Basic usage (accept the default configuration)

tmat compute_branches "/path/to/input/folder" "/path/to/output/folder"

Here, /path/to/input/folder is the full path to a directory of images which will be analyzed.

If your images are not cropped to the region inside the well, you can have the script automatically detect the well region by adding the --detect-well flag (or -w for short). For instance, if your wells are circular and you add the --detect-well flag, the script will detect the well and mask out the region outside of it. This also works for "squircle"-shaped wells (i.e. square with rounded corners; lens). Example usage:

tmat compute_branches --detect-well "/path/to/input/folder" "/path/to/output/folder"

Custom usage in the CLI utility (customize the analysis configuration)

Customize configuration variables (you can edit config/default_branching_computation.json in your base directory, or refer to src/fl_tissue_model_tools/config in this repository):

  • image_width_microns (float): Optional but recommended. Physical width in microns of the region captured by each image. For instance, if 1 pixel in the image corresponds to 0.8 microns, this value should equal 0.8 times the image width in pixels. If not specified, the script will attempt to infer this value from the image metadata.
  • model_cfg_path (string): Optional. This is the path to the configuration file of the segmentation model. This parameter is not included in the default configuration file. If it is not specified, the latest pretrained model in the model_training folder will be used.
  • graph_thresh_1 (float): May require some experimentation to find the best value for your data. This threshold controls how much of the Morse graph is used to compute the number of branches. Lower values include more of the graph, so more branches are detected. Higher values include less of the graph, so fewer branches are detected. The default is 5. If the default value does not work well, try different values like 0.25, 0.5, 1, 2, 4, etc. up to around 64.
  • graph_thresh_2 (float): May also require some tuning. This is the threshold for connecting branches, e.g. where it is ambiguous whether two branches are part of the same component. Lower values result in more connected branches, and higher values result in more disconnections. The default is 10. If the default value does not work well, try values like 0.0, 0.25, 0.5, 1, 2, 4, etc. up to around 64.
  • min_branch_length (integer): The minimum branch length (in microns) to consider. The default is 12.
  • max_branch_length (integer): Optional. This is the maximum branch length (in microns) to consider. By default, this parameter is not included in the configuration file. If it is not in the configuration, no maximum branch length will be enforced.
  • remove_isolated_branches (boolean): Whether to remove branches that are not connected to any other branches after the network is trimmed per the branch length constraints. Enforcing minimum and maximum branch lengths might isolate some branches, which may or may not be desired. The default is false.
  • graph_smoothing_window (float): This is the window size (in microns) for smoothing the branch paths. The default is 12.
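Because several of these parameters are expressed in microns, it can help to work out the conversion for your own images. The sketch below shows the arithmetic behind image_width_microns and the equivalent pixel length of a micron-valued setting. The function names are illustrative, and the pixel conversion is only meant to aid intuition; it is not a description of how tmat applies these values internally.

```python
def image_width_in_microns(microns_per_pixel, image_width_pixels):
    """Physical width of the imaged region: pixel size times width in pixels."""
    return microns_per_pixel * image_width_pixels


def microns_to_pixels(length_microns, image_width_microns, image_width_pixels):
    """Convert a physical length to pixels for a given image."""
    return length_microns * image_width_pixels / image_width_microns


if __name__ == "__main__":
    # Example values: 0.8 microns per pixel, image 1250 pixels wide.
    width_um = image_width_in_microns(0.8, 1250)
    print(f"image_width_microns: {width_um:.1f}")
    # The default min_branch_length of 12 microns, expressed in pixels.
    print(f"12 microns is about {microns_to_pixels(12, width_um, 1250):.1f} pixels")
```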

Trying out a few different values for the graph thresholds tends to yield more accurate quantification of vessel formation. An efficient way to do this is to specify a list of values directly in the configuration file, for example:

{
    "image_width_microns": 1000.0,
    "graph_thresh_1": [0.5, 2, 5, 12, 25],
    "graph_thresh_2": [0, 4, 8, 16],
    "graph_smoothing_window": 12,
    "min_branch_length": 12,
    "remove_isolated_branches": false
}

The example configuration above runs the analysis for all 20 combinations of thresholds.
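To see where the count of 20 comes from, the threshold lists can be expanded into individual parameter sets. This is a sketch of the idea only: the function name `expand_threshold_grid` is hypothetical, and the order in which tmat itself iterates over combinations is not specified here.

```python
from itertools import product


def expand_threshold_grid(config):
    """Return one config per (graph_thresh_1, graph_thresh_2) combination.

    A threshold given as a single number is treated as a one-item list.
    """
    def as_list(value):
        return value if isinstance(value, list) else [value]

    thresh_1_values = as_list(config["graph_thresh_1"])
    thresh_2_values = as_list(config["graph_thresh_2"])
    return [
        {**config, "graph_thresh_1": t1, "graph_thresh_2": t2}
        for t1, t2 in product(thresh_1_values, thresh_2_values)
    ]


if __name__ == "__main__":
    config = {
        "image_width_microns": 1000.0,
        "graph_thresh_1": [0.5, 2, 5, 12, 25],
        "graph_thresh_2": [0, 4, 8, 16],
        "graph_smoothing_window": 12,
        "min_branch_length": 12,
        "remove_isolated_branches": False,
    }
    combos = expand_threshold_grid(config)
    print(len(combos))  # 20 (5 values x 4 values)
```

Keep the lists short: the number of runs is the product of the two list lengths, so it grows quickly.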