Setup
This section outlines the steps required to download and configure the CDPP (CRS4 Digital Pathology Platform) using Docker Compose. The platform is composed of multiple containerized services—including the Slides Manager, Annotations Manager, and CWL-based Workflow Engine—designed to work together within a shared environment. You will begin by cloning the deployment repository and generating the necessary configuration files, which define environment variables and service-specific parameters.
Download
You can download the Docker Compose setup from GitHub.
To begin, clone the repository to your local machine:
git clone https://github.com/crs4/cdpp-workflows.git
Configuration
The first step is to generate the default configuration files. Navigate into the cloned repository directory and run the create_env.sh script:
cd cdpp-workflows
./create_env.sh
After the script completes, the following configuration files will be created:
.env
promort_config/config.yaml
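Before editing anything, it can help to confirm that both files were actually generated. The following is a minimal sketch, not part of the repository: the function name check_config_files is hypothetical, and it only tests for the two paths listed above.

```shell
# Report any expected configuration file that is missing under a directory.
# Returns 0 when both files exist, 1 otherwise.
# NOTE: check_config_files is a hypothetical helper, not shipped with CDPP.
check_config_files() {
    ccf_dir="${1:-.}"
    ccf_status=0
    for ccf_file in .env promort_config/config.yaml; do
        if [ ! -f "$ccf_dir/$ccf_file" ]; then
            echo "missing: $ccf_dir/$ccf_file" >&2
            ccf_status=1
        fi
    done
    return "$ccf_status"
}
```

Calling check_config_files . from the root of the cloned repository prints nothing and succeeds when both files are present.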
While promort_config/config.yaml is used specifically by the CDPP Annotations Manager, the main configuration for the entire platform is in the .env file. This file includes default values for all services, which you can adjust based on your setup.
The configuration for each component is described in detail below:
Slides Manager and Virtual Microscope
The Slides Manager component (which also provides Virtual Microscope features) is configured by editing the following variables in the .env file:
CWL_INPUTS_FOLDER
: This folder contains all the whole slide images (WSI) managed by OMERO, including those in MIRAX format
PREDICTIONS_DIR
: This folder stores the output datasets generated by the computational pipelines managed by the Workflow Engine. These outputs are automatically registered in the OMERO server database. Some outputs are post-processed to create artifacts, such as ROIs for review, while others are rendered on-the-fly as visual layers, such as cancer heatmaps, during slide viewing
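Since both variables point at folders on the host, a quick check that those folders exist can catch path typos early. The sketch below makes several assumptions: the .env file uses plain, unquoted KEY=value lines, the paths are absolute or relative to the current directory, and the helper names env_value and check_data_dirs are hypothetical.

```shell
# Print the value assigned to a variable in a KEY=value file.
# Assumes unquoted KEY=value lines; if a key repeats, the last one wins.
env_value() {
    ev_name="$1"
    ev_file="${2:-.env}"
    sed -n "s/^${ev_name}=//p" "$ev_file" | tail -n 1
}

# Warn about Slides Manager data folders that do not exist yet.
# Returns 0 when both folders exist, 1 otherwise.
check_data_dirs() {
    cdd_file="${1:-.env}"
    cdd_status=0
    for cdd_var in CWL_INPUTS_FOLDER PREDICTIONS_DIR; do
        cdd_path="$(env_value "$cdd_var" "$cdd_file")"
        if [ -z "$cdd_path" ] || [ ! -d "$cdd_path" ]; then
            echo "$cdd_var: '$cdd_path' is not an existing directory" >&2
            cdd_status=1
        fi
    done
    return "$cdd_status"
}
```

Running check_data_dirs .env after editing the file lists any variable whose folder still has to be created.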
Annotations Manager
To configure the Annotations Manager, edit the following variables in the .env file:
PROMORT_IMG
: Specifies the Docker image used for the Annotations Manager. You can browse available image versions on Docker Hub.
PROMORT_PORT
: Port used to access the Annotations Manager’s web user interface
PROMORT_DB_NAME
: Name of the database that will store the data for the Annotations Manager
PROMORT_DB_USER
: Username for connecting to PROMORT_DB
PROMORT_DB_PASSWORD
: Password for the PROMORT_DB_USER
PROMORT_SESSION_ID
: Session ID used for the Django session cookie
The system automatically creates a user in the Annotations Manager, which the Workflow Engine’s tools use whenever they need to read data from it or write data to it. To set up this user, edit the following variables:
PROMORT_USER
: Username of the user that will be used by the Workflow Engine’s tools
PROMORT_PASSWORD
: Password for the PROMORT_USER to access the Annotations Manager API
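If you prefer not to keep default passwords, one option is to generate random values and paste them into the .env file. This is only a sketch under stated assumptions: random_hex is a hypothetical helper, it relies on /dev/urandom and on a head that supports -c, and nothing in the source says whether the platform restricts password characters (hexadecimal output is used here to stay conservative).

```shell
# Print a random hexadecimal string built from N bytes of /dev/urandom.
# The output has 2*N characters; N defaults to 16.
# NOTE: random_hex is a hypothetical helper, not part of CDPP.
random_hex() {
    rh_bytes="${1:-16}"
    # od prints the bytes as space-separated hex pairs; tr strips
    # the spaces and line breaks so a single token remains.
    head -c "$rh_bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}
```

For example, the output of random_hex 16 could be used as the value of PROMORT_DB_PASSWORD, and a second call as the value of PROMORT_PASSWORD.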
Workflow Engine
To configure the Workflow Engine, edit the following variables in the .env file:
AIRFLOW_HOME
: Base directory for Airflow
CWL_TMP_FOLDER
: Temporary directory for CWL-based workflow execution
CWL_INPUTS_FOLDER
: Directory for CWL inputs (shared with the Slides Manager, if both systems are running on the same host)
CWL_OUTPUTS_FOLDER
: Directory for CWL outputs
CWL_PICKLE_FOLDER
: Directory for CWL pickled files
AIRFLOW_WEBSERVER_PORT
: Port to access Airflow web interface (default: 8080)
CWL_AIRFLOW_API_PORT
: Port to contact the Airflow API (default: 8081)
AIRFLOW_USER
: Admin username for Airflow
AIRFLOW_PASSWORD
: Admin password for Airflow
INPUT_DIR
: Directory for workflow inputs
FAILED_DIR
: Directory for storing data from failed workflows
BACKUP_DIR
: Directory for storing backups of workflow-processed data
PREDICTIONS_DIR
: Directory for model outputs (shared with the Slides Manager, if both systems are running on the same host)
CWLDOCKER_GPUS
: GPU IDs for running inference (if available)
MYSQL_ROOT_PASSWORD
: Root password for the workflow engine’s database
MYSQL_DATABASE
: Name of the workflow database
MYSQL_USER
: Workflow database user
MYSQL_PASSWORD
: Password for the workflow database user
MYSQL_PORT
: Database port
MYSQL_DATA
: Volume for the database
OME_SEADRAGON_URL
: URL of the Slides Manager web application
PROMORT_HOST
: Hostname of the Annotations Manager
PROMORT_CONN_TYPE
: Protocol for connecting to the Annotations Manager
PROMORT_PORT
: Port used by the Annotations Manager
PROMORT_USER
: Username of the user to interact with the Annotations Manager
PROMORT_PASSWORD
: Password for the PROMORT_USER
PROMORT_SESSION_ID
: Session ID for the Django session cookie
PROMORT_TOOLS_IMG
: Specifies the Docker image used for the Annotations Manager auxiliary tools. You can explore available versions on Docker Hub.
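With this many variables, it is easy to leave one empty by accident. The sketch below lists the names that are absent or empty in a given file; missing_vars is a hypothetical helper, and it assumes plain KEY=value lines. Which variables must be non-empty is your call: CWLDOCKER_GPUS, for instance, is described above as optional.

```shell
# Print each variable name that has no non-empty assignment in the file.
# Usage: missing_vars FILE NAME [NAME...]
# NOTE: missing_vars is a hypothetical helper, not part of CDPP.
missing_vars() {
    mv_file="$1"
    shift
    for mv_name in "$@"; do
        # "NAME=." requires at least one character after the equals sign.
        grep -q "^${mv_name}=." "$mv_file" || echo "$mv_name"
    done
}
```

A call such as missing_vars .env AIRFLOW_HOME CWL_TMP_FOLDER MYSQL_USER MYSQL_PASSWORD prints one line per variable that still needs a value, and prints nothing when all of them are set.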
Misc
DOCKER_NETWORK
: Docker Compose network name
PROJECT
: Docker Compose project name
PROXY_PORT
: Proxy port
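Several variables across the sections above end in _PORT. As a rough sanity check, the following sketch reports any port number assigned to more than one such variable. It is an assumption that a shared value indicates a problem: some of these ports may be internal to a container, in which case a duplicate is harmless. The name duplicate_ports is hypothetical.

```shell
# Print port numbers that are assigned to more than one *_PORT variable.
# Identical repeated lines are collapsed first, so a variable that is
# simply listed twice with the same value is not reported.
# NOTE: duplicate_ports is a hypothetical helper, not part of CDPP.
duplicate_ports() {
    grep -E '^[A-Z_]+_PORT=' "${1:-.env}" \
        | sort -u \
        | cut -d= -f2 \
        | sort \
        | uniq -d
}
```

An empty result from duplicate_ports .env means no two port variables share a value.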