
Processing Package Tutorial

ProcessData is a Python class designed to process data from MRXS, NDPI, CZI and BIF files, merge it with inventory data, and calculate immunopositivity statistics.

Last updated 7 months ago

General Information

From this point on, the tutorial refers to Python code. If you are not familiar with Python or a terminal prompt, please first read the non-expert documentation on installing Git Bash and Python before running the files and producing the positivity rates for your data. Then follow the remaining steps of this tutorial. If your computer runs Linux, a normal Terminal prompt should work.

Step by Step Instructions:

Getting Started:

  1. Create a ProjectFolder on your system:

    You should manually create an empty folder named ProjectFolder for the next steps.

  2. Configure the ProjectFolder:

    The internal structure of the folder should be like this:

The Data_output folder is a new empty folder you create manually. The data files generated by the Python files in the next steps will automatically be stored in this folder and can be opened after the processing.

The Results folder contains all the output files from QuPath in .txt format. They contain the PositiveCell and NegativeCell classifications, which are now used to calculate the positivity rate.
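The folder layout described above can also be created with a few lines of Python. This is only a minimal sketch: the function name `create_project_folder` is hypothetical and not part of the ProcessScan package, and in practice the Results folder is the one already filled by QuPath rather than a new empty one.

```python
from pathlib import Path
import tempfile


def create_project_folder(parent: Path, name: str = "ProjectFolder") -> Path:
    """Create the project folder with the two subfolders named in this tutorial."""
    project = parent / name
    for sub in ("Data_output", "Results"):
        # exist_ok=True keeps an already existing Results folder untouched
        (project / sub).mkdir(parents=True, exist_ok=True)
    return project


# Demonstration in a throwaway directory so nothing is left on disk
with tempfile.TemporaryDirectory() as tmp:
    project = create_project_folder(Path(tmp))
    print(sorted(p.name for p in project.iterdir()))  # ['Data_output', 'Results']
```

Creating the folders by hand, as described above, gives exactly the same result.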

For the ProcessScan package, the column names of the Inventory file must be set to ID_Sample, Antibody, and ID_Slidescanner. The package expects either a simple Excel sheet (for one antibody) or, if multiple antibodies are analysed, a multi-sheet Excel workbook containing information on all the samples and antibodies.

ID_Slidescanner

The ID_Slidescanner is the ID given to a sample automatically during slide scanning. In our case, those IDs look like slides-2023-06-02T09-24-14-R1-S1. Performing the analysis with the ID_Slidescanner allows us to do the analysis blinded, without bias.

ID_Sample

The ID_Sample is the Sample number.

In our lab, we mainly work with human samples. They are saved in a database containing information about the donor such as age, gender, etc., and are labeled with numbers e.g. 435.

The ID_Sample however does not have to be a number but can also be e.g. Rat5_TreatmentX_WeekY.

By not working with the ID_Sample directly but only linking it up at the end, the risk of unconscious bias is minimized.
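The late linking step can be pictured as a simple lookup from ID_Slidescanner to ID_Sample. The sketch below uses plain dictionaries; the function name `link_samples`, the `positivity_rate` key, and the value 30.0 are hypothetical illustrations, not output of the package.

```python
def link_samples(inventory, rates):
    """Attach a rate to each inventory row whose ID_Slidescanner has a result."""
    linked = []
    for row in inventory:
        scanner_id = row["ID_Slidescanner"]
        if scanner_id in rates:
            # Copy the row and add the result; the inventory itself is not modified
            linked.append({**row, "positivity_rate": rates[scanner_id]})
    return linked


# Inventory row using the example IDs from this tutorial
inventory = [
    {
        "ID_Sample": "435",
        "Antibody": "antibody",
        "ID_Slidescanner": "slides-2023-06-02T09-24-14-R1-S1",
    }
]
# Hypothetical result, keyed by the blinded scanner ID
rates = {"slides-2023-06-02T09-24-14-R1-S1": 30.0}

for row in link_samples(inventory, rates):
    print(row["ID_Sample"], row["Antibody"], row["positivity_rate"])
```

Because the analysis itself only ever sees the scanner ID, the sample identity enters the picture only in this final lookup.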

If you analysed multiple antibodies within the same slides, you will be able to differentiate them within the annotations in the output file. In that case, write "antibody" in the Antibody column of the Excel file.

If you already changed the name of the ID_Slidescanner at the beginning of the project, just use the same name for the ID_Sample and the ID_Slidescanner, and add the Antibody to the Excel sheet.
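Since a misspelled column name is a common source of errors, it can help to check the header row before running the analysis. The following is a rough sketch that assumes the inventory has been exported as CSV text; the function name `missing_columns` is hypothetical and not part of the ProcessScan package.

```python
import csv
import io

# Column names required by this tutorial (spelling and capitalisation matter)
REQUIRED_COLUMNS = ("ID_Sample", "Antibody", "ID_Slidescanner")


def missing_columns(csv_text: str) -> list:
    """Return the required column names that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_COLUMNS if col not in present]


# The third header below has a lowercase "s", so it does not match
example = (
    "ID_Sample,Antibody,ID_slidescanner\n"
    "435,antibody,slides-2023-06-02T09-24-14-R1-S1\n"
)
print(missing_columns(example))  # ['ID_Slidescanner']
```

An empty list means all three required columns were found.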

There should be no spaces in your directories or file names!

Set the system for the analysis:

  1. Open Git Bash:

First time using a terminal window prompt?

  2. Open the project folder:

    You can open the ProjectFolder from the terminal using the command cd.

cd Path/To/ProjectFolder 
How can you retrieve the path of your folder?
  • Windows:

    • To retrieve the path of your folder, open the folder in the file explorer. The address bar at the top displays the full path. You can copy it by right-clicking the address bar and selecting "Copy address", or by highlighting the path and copying it (Ctrl+C).

    • Alternatively, navigate to the folder for which you want the path, right-click on it and select "Properties".

      In the "Properties" window, go to the "General" tab. The "Location" field displays the path. You can copy it by selecting the text and copying (Ctrl+C).

  • macOS:

    • You can retrieve the path using Finder. Navigate to the folder for which you want the path, right-click (or Control-click) on the folder while holding down the Option key. Choose "Copy (folder name) as Pathname." The path is now copied to your clipboard.

    • You can also retrieve it through Get Info. Navigate to the folder for which you want the path, right-click (or Control-click) on the folder and select "Get Info", or press Command + I. In the Get Info window, the "Where" section shows the path. You can copy it by selecting the text and copying (Command+C).

Bash uses Unix-style paths, so on Windows every backslash \ in the path must be changed to a forward slash /:

C:/Users/andre/OneDrive/Desktop/ProjectFolder
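If you prefer not to edit the slashes by hand, Python's standard `pathlib` module can do the conversion. This is a small sketch; the helper name `to_bash_path` is hypothetical.

```python
from pathlib import PureWindowsPath


def to_bash_path(windows_path: str) -> str:
    """Rewrite a Windows path with forward slashes, as Git Bash expects."""
    return PureWindowsPath(windows_path).as_posix()


# The raw string (r"...") keeps the backslashes exactly as copied from Windows
print(to_bash_path(r"C:\Users\andre\OneDrive\Desktop\ProjectFolder"))
# C:/Users/andre/OneDrive/Desktop/ProjectFolder
```

`PureWindowsPath` only manipulates the text of the path, so this works on any operating system and does not require the folder to exist.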

  3. Download the Process_scan class:

  4. Move the directory to the ProjectFolder:

    The downloaded archive, process_scan-main.zip, is located in the Downloads folder and should be moved into the ProjectFolder. Move it in whichever way is most convenient for you, for example by dragging it in your file manager.

Everything installed?

Have you installed the required packages?

No? Then run this command in your Git Bash/Terminal:

pip install pandas numpy seaborn matplotlib scipy openpyxl
  5. De-compress the ProcessScan Python folder:

To access and use the ProcessScan Python class via the command line, go back to the Git Bash terminal, extract the archive, and open the extracted directory using the cd command:

unzip process_scan-main
cd process_scan
cd process_scan-main

Launch the ProcessScan Classification from the terminal

python workflow_template.py path/to/your/Results path/to/your/Inventory.xlsx path/to/your/directory/to/Data_output xlsx

Modify the command according to your paths. The last argument, typed after the paths, is the extension for the final spreadsheets; you can choose between xlsx and csv.

Don't know how to construct the command in the terminal prompt?

If you are not familiar with coding, it might be easiest to prepare the paths in a Word document and then copy them into the Git Bash terminal. The paths you will need are:

  • path/to/your/Results

  • path/to/your/Inventory.xlsx

  • path/to/your/Data_output

Remember that \ needs to be changed to / in Windows!

Remember that no spaces should be in the names of the folders or files!

A good practice is to replace spaces with underscores (_).
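Both reminders can be checked quickly in Python. The helpers below are a minimal sketch with hypothetical names; they only inspect or rewrite the text of a name and do not rename anything on disk.

```python
def has_spaces(path: str) -> bool:
    """Return True if the path contains at least one space."""
    return " " in path


def underscore_name(name: str) -> str:
    """Replace every space in a file or folder name with an underscore."""
    return name.replace(" ", "_")


print(has_spaces("path/to/your/Data_output"))  # False
print(underscore_name("Project Folder"))       # Project_Folder
```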

Once you have retrieved the paths of the subfolders in the ProjectFolder, you can put them all together in a new line with one space in between them:

path/to/your/Results path/to/your/Inventory.xlsx path/to/your/Data_output

In the command above, workflow_template.py is the script to be run. It is already in the directory you opened with cd, so to complete the command you only need to copy the line with the paths (Ctrl+C).

Back to the Git Bash terminal prompt, type in:

python workflow_template.py

Type a space after it, then press Shift + Insert to paste the copied paths.

Finally, type the extension for the final spreadsheets after the paths: xlsx or csv.

If you get an error message, check that there is only a single space between the different paths, that your spelling is correct, and that everything is in the right place in the ProjectFolder.
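The same command line can be assembled programmatically, which avoids double spaces and forgotten arguments. This is only a sketch: `build_command` is a hypothetical helper, and the paths are the placeholders used in this tutorial.

```python
import shlex


def build_command(results_dir, inventory_file, output_dir, extension="xlsx"):
    """Join the script name, the three paths and the extension with single spaces."""
    if extension not in ("xlsx", "csv"):
        raise ValueError("extension must be 'xlsx' or 'csv'")
    args = ["python", "workflow_template.py",
            results_dir, inventory_file, output_dir, extension]
    # shlex.join separates the arguments with exactly one space each
    return shlex.join(args)


print(build_command("path/to/your/Results",
                    "path/to/your/Inventory.xlsx",
                    "path/to/your/Data_output"))
# python workflow_template.py path/to/your/Results path/to/your/Inventory.xlsx path/to/your/Data_output xlsx
```

The printed line can then be pasted into the Git Bash terminal as a whole.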

Some error messages and how to solve them

KeyError: "['ID_Sample'] not in index"

-> Check your Inventory file: a column name is probably misspelled (here it must be exactly ID_Sample).

usage: workflow_template.py [-h] directory_path inventory_file output_path output_extension
workflow_template.py: error: the following arguments are required: output_extension

-> You forgot the output extension (xlsx or csv) at the end of the command: python workflow_template.py path/to/your/Results path/to/your/Inventory.xlsx path/to/your/directory/to/Data_output xlsx
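The usage message above is the kind of text Python's `argparse` module prints when a positional argument is missing. The sketch below rebuilds a parser with the four argument names taken from that message. It is an assumption about how such a script could be written, not the actual contents of workflow_template.py, and the help texts are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser with the four positional arguments named in the usage message."""
    parser = argparse.ArgumentParser(prog="workflow_template.py")
    parser.add_argument("directory_path", help="folder with the QuPath .txt results")
    parser.add_argument("inventory_file", help="Inventory spreadsheet")
    parser.add_argument("output_path", help="Data_output folder")
    parser.add_argument("output_extension", help="xlsx or csv")
    return parser


# All four arguments present: parsing succeeds
args = build_parser().parse_args([
    "path/to/your/Results",
    "path/to/your/Inventory.xlsx",
    "path/to/your/Data_output",
    "xlsx",
])
print(args.output_extension)  # xlsx
```

Leaving out the last item of the list makes `parse_args` print a usage error like the one shown above and exit.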

main\process_mrxs_data.py", line 7, in <module>
    from scipy.stats import pearsonr, spearmanr, kendalltau
ModuleNotFoundError: No module named 'scipy'

-> 'scipy' is not installed: run pip install scipy. The same applies to any other module that cannot be found.
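To find all missing modules in one go instead of one error at a time, you can ask Python which of the required packages it can locate. This is a minimal sketch; the list of names is taken from the pip install command in this tutorial, and `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Package names from the pip install command in this tutorial
REQUIRED = ("pandas", "numpy", "seaborn", "matplotlib", "scipy", "openpyxl")


def missing_modules(names=REQUIRED):
    """Return the names that Python cannot locate in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    # Suggest a single install command covering everything that is missing
    print("pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```

`find_spec` only looks the packages up without importing them, so the check is quick.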

Once the code has been successfully run, you can find spreadsheets in your Data_output folder containing the positivity rate for each antibody and region, as well as basic heatmaps and scatterplots.
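For orientation, a positivity rate is commonly computed as the share of positive cells among all classified cells. The sketch below assumes that definition and the PositiveCell/NegativeCell labels mentioned earlier; the exact formula used inside the package is not stated in this tutorial, and the function names and counts are hypothetical.

```python
from collections import Counter


def positivity_rate(positive_cells: int, negative_cells: int) -> float:
    """Percentage of positive cells among all classified cells (assumed definition)."""
    total = positive_cells + negative_cells
    if total == 0:
        return float("nan")  # no cells detected, the rate is undefined
    return 100.0 * positive_cells / total


def rate_from_labels(labels) -> float:
    """Count PositiveCell and NegativeCell labels, then compute the rate."""
    counts = Counter(labels)
    return positivity_rate(counts["PositiveCell"], counts["NegativeCell"])


print(positivity_rate(30, 70))  # 30.0
print(rate_from_labels(["PositiveCell", "NegativeCell", "NegativeCell", "NegativeCell"]))  # 25.0
```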

If you have multiple antibodies, or just want to link the antibody with the donor/sample information, use our merger to create a sheet containing all the information linked by the ID_Sample.

Limitations

The Inventory file can be .csv or .xlsx, and a multi-sheet format is expected.

For the output extension, the options csv and xlsx are implemented; you can choose based on your preference.

License

This project is licensed under the MIT License; see the LICENSE file for details.

As the ID_Slidescanner must be linked with the ID_Sample and the Antibody, an Inventory file connecting them needs to be generated manually. We generated this file by opening the slides with CaseViewer, NDPI.view2 or ZEISS ZEN lite, depending on the slide scanner. The viewer shows the "barcode" that identifies the ID_Sample, which can then be connected with the ID_Slidescanner.

To open Git Bash, search for it in your computer's menu and double-click the icon. A standard prompt window will open.

Navigation within the Git Bash interface does not work with the mouse; you need to use the arrow keys on the keyboard.

Download this repository, or clone it to your local machine/laptop: click the Code button and choose the Download ZIP option that appears.

The Python script is launched from the same Git Bash terminal. If Python is not yet installed on your system, follow the instructions in the Python Installation tutorial first. The command to paste into the terminal prompt is the python workflow_template.py command given in the launch section.

We have a brief tutorial on how to do that.

Please note that the current version is only intended to be launched from an sh-compatible shell, such as Git Bash.

The directory names in the example are consistent with the ones used in QuPath, so if you want to change them, change them in the previous QuPath step as well.