- Introduction
- Installation Intructions
- DiMo3D Functions
- Separate Programs
- MATLAB Scripts
- Example Use of Pipeline
DiMo3d is a python package (with underlying matlab functions and c++ libraries) that can be used to compute DM graph reconstructions for 3D imaging data. There are functions for dividing an image stack into subregions, computing persistence diagram for each subregion (required for computing DM graph reconstruction, but could also be of independent interest), computing DM graph reconstruction for each subregion, merging graphs of subregions into a single graph for the full image_stack, and writing graphs to .vtp format to be visualize by applications such as paraview (https://www.paraview.org/).
DM graph reconstruction is a methodology used to extract true underlying graph structure behind noisy input data. The input for the algorithm is a density function defined on a triangulation. Intuitively, picturing the density plotted over the domain as a terrain, the output graph is the mountain ridges of the terrain. For mouse brain imaging data, the density function used is simply the voxel values. The mountain ridges of the density function do an excellent job of captures the neuronal branches in the image stack. However, the persistence computation is a computational bottleneck, and cannot be performed on large datasets. By dividing input images into overlapping subregions, the pipeline is able to efficiently extract accurate graphs for each subregion, and merge these graphs into a single reconstruction for the full domain.
This package was designed to be an intergral part of the pipeline to extract single neuron reconstructions from full mouse brain imaging data. While the pipeline will output a graph capture all neuronal branches, further postprocess of the graph will be required to obtain individual neuron reconstructions. The package, while designed with mouse imaging data specifically in mind, can be used on any 3D imaging dataset. Optimal parameters will vary by dataset and intended use case of the graphs.
Copy DiMo3d to your current working directory, then ensure all system requirements and python libraries are installed. Then compile c++ programs that the library calls, and copy matlab functions to the current working directory. See below for specific instructions.
- Python 3.8.8 (or newer)
- g++ 9.4.0 (or newer)
- cmake 3.16.3 (or newer)
-
cv2 (https://pypi.org/project/opencv-python/)
pip install opencv-python
-
PIL (https://pypi.org/project/Pillow/)
pip install pillow
-
vtk (https://pypi.org/project/vtk/)
pip install vtk
Dipha Persistence Module
> cd ./DiMo3d/code/dipha-3d/
> mkdir build
> cd build
> cmake ../
> make
Discrete Morse Graph Reconstruction Module
> cd ./DiMo3d/code/dipha-output/
> g++ ComputeGraphReconstruction.cpp
Merge Complex Module
> cd DiMo3d/code/merge/
> g++ combine.cpp
Complex Persistence + Discrete Morse Graph Reconstruction Module
> cd DiMo3d/code/spt_cpp/
> g++ DiMoSC.cpp -I./phat/include -std=c++11 -o spt_cpp
> cp ./DiMo3d/code/matlab/* ./
Given an input image stack, divide the image stack into overlapping subregions. The purpose of this is to have subregions of a size such that the DM graph reconstruction algorithm is efficient to run on each individual region. It is recommended to take as large of subregions as possible such that the running time for DM graph reconstruction is still acceptable to the user. Overlap of subregions allows for the merging of the graphs of subregions into a single graph for the full domain. It is recommended to take as small of an overlap as possible such that all individual neuron branches that cross subregions are correctly captured after merging graphs.
- Input_dir - path to input image stack
- output_dir - path to dir containing results for each subregion
- x_len - x-axis length of each subregion
- y_len - y-axis length of each subregion
- z_len - z-axis length of each subregion
- overlap - pixel overlap for each axis between adjacent subregions
Output dir is made containing subdirectories for each region. Each region contains an image stack and its startings x,y,z coordinates Returns nx, ny, nz, and overlap - the x/y/z dimensions of the image stack and the overlap for each axis
>import DiMo3d as dm
>image_stack_dir = “data/image_stack/”
>morse_dir = “results/image_stack_morse/”
>dm.split_domain(image_stack_dir, morse_dir, 64, 64, 64, 5)
The first step of compute DM graph reconstruction to compute the persistence diagrams of each subregion. This packages uses a modified version of the DIPHA software for this computation. DIPHA takes specifically formatted input file - this function creates the required input file for each subregion in the domain.
- input_path - input path to the directory containing subregions for which we will need to compute persistence on. This argument should match output_dir of a previous DiMo3d.split_domain call.
Input file for DIPHA program. A file is written for each subregion.
>import DiMo3d as dm
>image_stack_dir = “data/image_stack/”
>morse_dir = “results/image_stack_morse/”
>dm.split_domain(image_stack_dir, morse_dir, 64, 64, 64, 5)
>dm.write_dipha_persistence_input(morse_dir)
The first step of compute DM graph reconstruction to compute the persistence diagrams of each subregion. This packages uses a modified version of the DIPHA software for this computation. DIPHA takes specifically formatted input file. After generating the input files for the dipha program for each subregion (DiMo3d.write_dipha_persistence_input) this function will call the DIPHA program to compute the persistence diagram for each subregion.
- input_path - input path to the directory containing subregions for which we will need to compute persistence on. This argument should be the same as input_path of a previous DiMo3d.write_dipha_persistence_input call
- threads - number of threads used to run in parallel
Persistence Diagram for each subregion. A file is written for each subregion.
>import DiMo3d as dm
>image_stack_dir = “data/image_stack/”
>morse_dir = “results/image_stack_morse/”
>dm.split_domain(image_stack_dir, morse_dir, 64, 64, 64, 5)
>dm.write_dipha_persistence_input(morse_dir)
>dm.compute_dipha_persistence(morse_dir)
The DM graph reconstruction program takes a specifically formatted file of persistence values of edges in domain as input. The DIPHA program outputs this information but in an incompatiable format. This function will convert the DIPHA output containing the persistence information needed to perform DM graph reconstruction for each subregion.
- input_path - input path to the directory containing subregions for which we will need to compute persistence on. This argument should be the same as input_path of a previous DiMo3d.compute_dipha_persistence call
- threads - number of threads used to run in parallel
Persistence Diagram for each subregion in format meant for discrete Morse graph reconstruction program. A file is written for each subregion.
>import DiMo3d as dm
>image_stack_dir = “data/image_stack/”
>morse_dir = “results/image_stack_morse/”
>dm.split_domain(image_stack_dir, morse_dir, 64, 64, 64, 5)
>dm.write_dipha_persistence_input(morse_dir)
>dm.compute_dipha_persistence(morse_dir)
>dm.convert_persistence_diagram(morse_dir)
The DM graph reconstruction program takes a file as input that contains the coordinates and density (voxel) function values. This function will convert the image stack of each subregion into the .txt format required by the DM graph reconstruction algorithm.
- input_path - input path to the directory containing subregions for which we will need to compute persistence on. This argument should be the same as input_path of a previous DiMo3d.convert_persistence_diagram call
- threads - number of threads used to run in parallel
Text file containing vertex coordinates for each subregion in format meant for discrete Morse graph reconstruction program. A file is written for each subregion.
>import DiMo3d as dm
>image_stack_dir = “data/image_stack/”
>morse_dir = “results/image_stack_morse/”
>dm.split_domain(image_stack_dir, morse_dir, 64, 64, 64, 5)
>dm.write_vertex_files(morse_dir)
Run discrete Morse graph reconstruction on each subregion within input_path directory. The persistence threshold is the only parameter required - a higher threshold will results in a greater simplification of the output graph (fewer edges). Outputs are stored in a folder named after the persistence threshold - so a user can see results at a specific threshold then rerun with an appropriately adjusted threshold.
- input_path - input path to directory containing subregions for which we will need to compute persistence on. This argument should be the same as input_path of a previous DiMo3d.convert_persistence_diagram call
- persistence_threshold - value of persistence threshold parameter used by discrete Morse graph reconstruction algorithm.
- threads - number of threads used to run in parallel
Vertex and edge file representing the discrete Morse graph reconstruction output. The graph files are stored in a directory for the persistence threshold for each threshold
>import DiMo3d as dm
>image_stack_dir = “data/image_stack/”
>morse_dir = “results/image_stack_morse/”
>dm.split_domain(image_stack_dir, morse_dir, 64, 64, 64, 5)
>dm.write_dipha_persistence_input(morse_dir)
>dm.compute_dipha_persistence(morse_dir)
>dm.convert_persistence_diagram(morse_dir)
>dm.write_vertex_file(morse_dir)
>dm.graph_reconstruction(morse_dir, 32)
DiMo3d.merge(input_path, merge_dir, persistence_threshold, merge_threshold, nx, ny, nz, x_len, y_len, z_len, overlap, threads=1)
Perform DM-based hierarchical merging of graphs of subregions. After getting graph reconstructions for each subregion, a user can merge the graphs together in order to obtain a single graph for the entire domain. Merging is required because branches that cut across different subregions might not already be connected by taking the union of the two graphs. First a triangulation is built in the overlap regions, which connects branches meant to be merged. Density estimation is used assign density values to each vertex in the triangulation. Then DM graph reconstruction is performed to extract the true connection path between overlap regions. These triangulations are significantly smaller than that of the full domain, making the merging significantly more efficient than running DM graph reconstruction on full domain.
- input_path - input path to directory containing subregions for which we will need to compute persistence on. This argument should be the same as input_path of a previous DiMo3d.convert_persistence_diagram call
- merge_dir - directory for merge results will be written to
- persistence_threshold - value of persistence threshold parameter used by discrete Morse graph reconstruction algorithm.
- merge_threshold - persistence threshold used during DM computation for merges
- nx - size of x axis of original image stack
- ny - size of y axis of original image stack
- nz - size of z axis of original image stack
- x_len - base size of x axis for subregions
- y_len - base size of y axis for subregions
- z_len - base size of z axis for subregions
- overlap - size of overlap on axes between regions
- threads - number of threads used to run in parallel
Vertex and edge files representing the discrete Morse graph reconstruction output.
>import DiMo3d as dm
>image_stack_dir = “data/image_stack/”
>morse_dir = “results/image_stack_morse/”
>dm.split_domain(image_stack_dir, morse_dir, 64, 64, 64, 5)
>dm.write_dipha_persistence_input(morse_dir)
>dm.compute_dipha_persistence(morse_dir)
>dm.convert_persistence_diagram(morse_dir)
>dm.write_vertex_file(morse_dir)
>dm.graph_reconstruction(morse_dir, 32)
>dm.merge(morse_dir, merge_dir, 32, 32, 256, 256, 256, 64, 64, 64, 5, 1)
Convert .txt format graph (vert file and edge file) to .vtp format. Through the package, DM graphs are outputted as two .txt files - a text file for vertices and a text file for edges. Different formatting is required to visualize graphs in 3rd party software. This function will output a .vtp file for the specified input graph. .vtp can be viewed in softwares such as Paraview.
- vert_filename - vertex coordinates of graph
- edge_filename - edges of graph
- output_filename - vtp formatted graph file
A single file (output_filename) written in .vtp format containing the input graph
>import DiMo3d as dm
>vert_filename = “/path/to/dimo_vert.txt”
>edge_filename = “/path/to/dimo_edge.txt”
>output_filename = "/path/to/graph.vtp"
>dm.write_vtp_graph(vert_filename, edge_filename, output_filename)
DiMo3d.extract_subregion(input_dir, output_dir, x_center, y_center, z_center, x_len, y_len, z_len, threads=1):
Extract a subregion from full brain fMOST image stack. This function is meant for extracting a subregion off of an fMOST image stack. fMOST brain imaging data is huge and can take a very long time to process. Taking a smaller - but still sizable subregion is a great way to see DM graphs for meaningful portions of the brain.
- input_dir - input image stack of large (presumably full) domain
- output_dir - directory subregion image stack will be stored
- x_center - subregion center's x coordinate (in microns! - take from ground truth .swc file)
- y_center - subregion center's y coordinate (in microns! - take from ground truth .swc file)
- z_center - subregion center's z coordinate (in microns! - take from ground truth .swc file)
- x_len - length of x-axis of subregion (in pixels!)
- y_len - length of y-axis of subregion (in pixels!)
- z_len - length of z-axis of subregion (in pixels!)
- threads - images handled at a time
Image stack of a subregion of an fMOST brain
>import DiMo3d as dm
>dm.extract_subregion("data/image_stack/", "results/smaller_image_stack/", 128, 128, 128, 64, 64, 64)
The first step of the DM graph reconstruction algorithm is to compute persistence on the so called lower star filtration of the input triangulation with respect to the density function. This package uses the state of the art DIPHA persistence program for this step. The program is a modified version of the code found at (https://github.com/DIPHA/dipha) that is meant to explicitly take 3D data and output all of the persistence information needed for the DM graph reconstruction algorithm.
DiMo3d.compute_dipha_persistence
- input_filename - path to DIPHA input file
- output_filename - filename for traditional DIPHA program persistence diagram
- edge_filename - filename for edge information - vertices, edge type (negative or positive), and persistence value
Binary file (edge_filename) containing persistence information for all edges
After computing persistence of the edges in the domain, the DM algorithm uses the persistence information to build an appropriate spanning forest and extracts the stable 1-manifolds (mountain ridges) on top of the spanning forest. The program takes a list of vertices in the domain and a file containing the persistence information computed by DIPHA.
DiMo3d.graph_reconstruction
- vert_filename - file contain verts in domain
- dipha_edge_filename - .txt format of edge persistence information
- persistence threshold - persistence threshold for graph reconstruction algorithm
- output_dir - directory where output graph will be written
Graph (dimo_vert.txt and dimo_edge.txt) written to output_dir
Builds a single complex out of multiple subregion's graphs to output that will be simplified with the DM graph reconstruction algorithm. After computing graph reconstruction of each subregion - the first step of the merging process is to build a complex containing a triangulation of overlap regions that contain graphs in order to connect falsely broken branches running between subregions. This program build a complex for a single merge.
DiMo3d.merge
- input_dir - directory with results we will need to merge
- output_dir - directory where merge results will be saved
- config_filename - file containing information of regions we will merge
- persistence_threshold - threshold of graphs we wish to merge - this is NOT the persistence threshold used for simplifying DM graph of the output complex
A single complex built from up to 8 individual subregions DM graph reconstructions. Will perform another round of DM on this output to get a single graph reconstruction for all subregions
Run DM Graph reconstruction on a simplicial complex with density function defined on its vertices. After building a triangulation of the overlap region, another round of DM on the overlap regions is required to extract the true connection between subregions - without this connections will be entire triangulations which is not a true representation of the underlying neuronal branches. The program is a modified version of the code found at (https://github.com/wangjiayuan007/graph_recon_DM) meant to include what are known as unpaired edges and their stable 1-manifolds. Unpaired edges capture loops that are in the input complex and cannot be removed with persistence thresholding.
DiMo3d.merge
- input_filename - simplicial complex file (.bin)
- output_dir - directory where morse graph will be written
- persistence_threshold - threshold used by DM graph reconstruction algorithm
- dimension - dimension simplicial complex lies within (this should always be 3 for the sake of this pipeline)
Graph in .txt format (overlap_dimo_vert.txt and overlap_dimo_edge.txt)
Creates DIPHA input file for all subregions. The DIPHA program takes a specifically formatted input file. The authors of DIPHA included a script for generating such file for imaging data. This matlab function is a modified version meant to build write the input file needed for every subregion.
DiMo3d.write_dipha_persistence_input
input_path - input path to directory containing subregions for which we will need to compute persistence on. valid_filename - filename containing list of valid subregions. This is generated within the python pipeline
DIPHA input file for each subregion. A subregion’s file is written to its results directory
Converts DIPHA persistence diagram to .txt format for use by Discrete Morse Graph Reconstruction program. The DIPHA program includes a matlab function to read in binary output file of program to visualize persistence diagram. This matlab function is a modified version meant to convert the dipha output of a subregion into a usable format for the DM-graph reconstruction program.
DiMo3d.convert_persistence_diagram
- input_filename - Persistence diagram output directly from DIPHA program (.bin format)
- output_filename - filename for .txt format of persistence diagram
Persistence Diagram in .txt format for each subregion
>import DiMo3d as dm
>persistence_threshold = 256
>merge_persistence_threshold = 256
>image_stack_dir = “data/image_stack/”
>morse_dir = “results/image_stack_morse/”
>merge_dir = “results/image_stack_merge/”
>nx, ny, nz, overlap = dm.split_domain(image_stack_dir, morse_dir, 128, 128, 128, 16)
>dm.write_dipha_persistence_input(morse_dir)
>dm.compute_dipha_persistence(morse_dir)
>dm.convert_persistence_diagram(morse_dir)
>dm.write_vertex_file(morse_dir)
>dm.graph_reconstruction(morse_dir, persistence_threshold)
>dm.merge(morse_dir, merge_dir, persistence_threshold, merge_persistence_threshold, nx, ny, nz, 128, 128, 128, overlap, 1)