Usage
Upload data and select coordinates
To upload a dataset, use the Drag and drop or select a file to upload button and then click on Upload data. After the data is uploaded correctly, a confirmation message will appear. If the uploaded data is not normalized, an additional window will appear that allows you to normalize it. Next, select the appropriate features with cell coordinates and the corresponding velocity vector components. Click on the Submit selected coordinates button to confirm your selection. The Scatter Plot and Cone Plot sections will update automatically.
Alternatively, the file can be uploaded by specifying its path directly on the command line with the --file flag, e.g.
python celljourney.py --file=/data/projects/sequencing/file.h5ad
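When Cell Journey is started from another script, the command above can be assembled programmatically. The following is a minimal sketch and not part of Cell Journey itself: the helper name build_launch_command and the set of accepted extensions (taken from the file types mentioned in this guide) are assumptions made for illustration.

```python
import shlex
from pathlib import PurePosixPath

# Assumption: extensions inferred from the file types this guide mentions
# (h5ad in the example above, h5mu for multimodal data, CSV).
SUPPORTED_EXTENSIONS = {".h5ad", ".h5mu", ".csv"}


def build_launch_command(data_path, script="celljourney.py"):
    """Return the argument list for starting Cell Journey with a preloaded file."""
    suffix = PurePosixPath(data_path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unexpected file extension: {suffix!r}")
    return ["python", script, f"--file={data_path}"]


command = build_launch_command("/data/projects/sequencing/file.h5ad")
# shlex.join renders the list as a single shell-safe command line.
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run if the application should be started from Python.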
Global plot configuration
This section allows global customization of the Cell Journey visualizations.
Show features on hover
- information about the cell to be displayed when the cursor hovers over it. This is an experimental feature and may noticeably slow down the program, especially when applied to many features.
Template
- general theme of the chart. The default value is simple_white. In addition, the appearance can be modified through the Legend switch (value Show (default) or Hide), the Axes switch, which allows you to turn off all axes (value Show (default) or Hide), and the Color scale switch, which controls the appearance of the scale of a continuous feature. Available templates: ggplot2, seaborn, simple_white, plotly, plotly_white, plotly_dark, presentation.
Legend orientation
- the direction in which the legend should display the elements (Vertical (default) or Horizontal).
Legend: horizontal/vertical position
- a value between 0 and 1 that controls the legend's position. For the horizontal position, 0 indicates left and 1 right. For the vertical position, 1 indicates top and 0 bottom.
To hide the legend, change the Legend switch to Hide (the default value is Show). For scatter plots combined with trajectories, streamlines or streamlets can also be excluded from the legend (switch Streamlines/streamlets in legend).
Scatter plot
Important note. The 3D scatter plot customization is applied not only to the Scatter plot section, but also to Plot trajectories and Cell Journey.
Select feature
- a variable used for coloring the plot. Its selection automatically changes the value of the button below from Single color to Feature-base colors. To return to a single color, select Single color. A single color can be selected using the color picker located at the bottom.
Modality
- the field is visible only when multimodal data (h5mu file) is loaded. After selecting the modality, an additional field appears directly below, allowing, for example, the selection of a gene when RNA modality is chosen. The plot will be colored based on the selected feature. This parameter also affects the appearance of the heatmap in the Cell Journey section.
Point size
- the size of the displayed points. The minimum effective value for this parameter is 0.1. No upper limit.
Opacity
- the level of point transparency. The lowest value is 0 (full transparency/no visibility) and the maximum value is 1 (no transparency/full visibility).
Built-in continuous color scale
- The color palette used when a continuous variable is selected. The color order can be reversed using the switch Continuous color scale in reversed direction (default value is OFF). Available palettes: Balance, Blackbody, Blues, Bluered, Electric, Greens, Greys, Hot, Inferno, Jet, Magma, Plasma, Plotly3, Rainbow, Reds, Solar, Temps, Turbo, Twilight, Viridis. This parameter also affects the volume plot.
Built-in discrete color scale
- The color palette used when a discrete variable is selected. Available palettes: Crayola, Crayola_Mix, Crayola_Fluorescent, Crayola_Magic_Scent, Crayola_Changeables, Crayola_Pearl_Brite, Crayola_Glitter, Crayola_Metallic_FX, Crayola_Gel_FX, Crayola_Silly_Scents, Crayola_Heads_n_Tails, Crayola_Mini_Twistables, Crayola_10, Crayola_15, Crayola_20.
Space-separated list of hex values (max 20 colors).
- a text field where custom colors can be specified.
The user can define their own color palette for both continuous and discrete data. To do this, switch Use my custom color scale to ON and define the colors in the Space-separated list of hex values field. Colors can also be selected using the color picker at the bottom by enabling the Create custom palette from the color picker switch. When it is set to ON, colors chosen by mouse click are added automatically. Below the color picker, there is also a palette of the colors defined in the Built-in discrete color scale field.
If the continuous data scale has an unusual character, such as being highly nonlinear, the user can adjust the gradient transitions using the slider located directly above the Create custom palette from the color picker switch. By default, this slider is inactive and becomes active only when at least two custom colors are present. When a continuous feature is selected, a histogram of that feature's values appears below the slider, with added dashed lines indicating the transition points between colors.
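The relationship between the list of hex values and the transition points can be illustrated with a short sketch. It assumes that the custom palette is eventually expressed as a Plotly-style color scale, i.e. a list of [position, color] pairs with positions running from 0 to 1. The function name build_colorscale and the example colors are hypothetical and do not come from Cell Journey.

```python
def build_colorscale(hex_colors, transitions=None):
    """Return a Plotly-style colorscale: [[position, color], ...] with positions in [0, 1].

    Assumption: `transitions` plays the role of the slider handles described above.
    When omitted, the colors are spread evenly.
    """
    n = len(hex_colors)
    if n < 2:
        raise ValueError("At least two colors are needed for a gradient.")
    if n > 20:
        raise ValueError("The text field accepts at most 20 colors.")
    if transitions is None:
        transitions = [i / (n - 1) for i in range(n)]
    if len(transitions) != n:
        raise ValueError("One transition point per color is required.")
    if transitions[0] != 0 or transitions[-1] != 1:
        raise ValueError("Transitions must start at 0 and end at 1.")
    if any(b < a for a, b in zip(transitions, transitions[1:])):
        raise ValueError("Transitions must be non-decreasing.")
    return [[float(pos), color] for pos, color in zip(transitions, hex_colors)]


# A space-separated list of hex values, as typed into the text field.
colors = "#440154 #21918c #fde725".split()

even = build_colorscale(colors)
# Moving the middle transition towards 0 devotes most of the gradient
# to the upper range of a highly skewed feature.
skewed = build_colorscale(colors, [0.0, 0.1, 1.0])
print(even)
print(skewed)
```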
Volume plot
Add volume plot to continuous feature
- switch to add a cloud that interpolates the selected continuous variable in space. Default is off.
Single color scatter when volume is plotted
- forces a single point color when the volume plot is drawn. Default is off.
Volume plot transparency cut-off quantile
- the value from which the spatially interpolated value of the selected feature becomes visible. Default is off. The transparency level of the highest value is set separately (default is 0.1); transparency increases gradually, starting from 0 for the smallest value, set by the cut-off quantile.
Radial basis function
- radial basis function used to interpolate the selected feature. Available options: gaussian, inverse (default), linear, cubic, multi-quadric, thin_plate. The functions are described in detail in the scipy documentation.
Smoothing level
- smoothing level for the radial basis function. Increasing the value smooths the approximation. For 0, the function always goes through the nodal points. Default is 20.
Grid size
- density of the grid on which the values of the selected feature are interpolated. Default is 25.
Radius scaler
- the multiplier of the radius length that checks for the presence of points in space. A larger value smooths the cloud but reduces its accuracy. The default value is 1.
Gaussian filter standard deviation multiplier
- the multiplier of the components of the standard deviation vector for the Gaussian filter. The default value is 5 * (dx, dy, dz), where dx, dy and dz are the lengths of the sides of the grid on which the values of the selected feature are interpolated. Increasing this value smooths the plot but also reduces its accuracy.
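The interplay between the cut-off quantile and the transparency level of the highest value can be sketched as follows. This is an illustration only: the linear ramp, the quantile interpolation rule, and the names quantile and volume_opacity are assumptions, and the actual mapping used by Cell Journey may differ.

```python
def quantile(values, q):
    """Quantile of a non-empty list with linear interpolation, q in [0, 1]."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def volume_opacity(values, cutoff_quantile, max_opacity=0.1):
    """Map interpolated feature values to opacities.

    Values at or below the cut-off quantile are invisible (opacity 0).
    Above it, opacity grows linearly (an assumption) up to `max_opacity`,
    which is reached at the highest value.
    """
    threshold = quantile(values, cutoff_quantile)
    span = max(values) - threshold
    opacities = []
    for value in values:
        if value <= threshold or span == 0:
            opacities.append(0.0)
        else:
            opacities.append(max_opacity * (value - threshold) / span)
    return opacities


interpolated = [0.0, 1.0, 2.0, 3.0, 4.0]
opacities = volume_opacity(interpolated, cutoff_quantile=0.5, max_opacity=0.1)
print(opacities)
```

Raising the cut-off quantile hides more of the low-valued cloud, while the maximum opacity only changes how solid the densest region looks.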
Cone plot
Cone size
- the relative size of the displayed cones.
Opacity
- the level of cone transparency. The lowest value is 0 (full transparency/no visibility) and the maximum value is 1 (no transparency/full visibility).
Color scale
- a selector with a list of built-in palettes for continuous data. Cones are colored based on the lengths of the velocity vectors. Available palettes: Balance, Blackbody, Blues, Bluered, Electric, Greens, Greys, Hot, Inferno, Jet, Magma, Plasma, Plotly3, Rainbow, Reds, Solar, Temps, Turbo, Twilight, Viridis.
Streamline plot
The Trajectories section is designed to determine the global characteristic of cell differentiation. By default, trajectories are calculated for each grid point where the value of the averaged vector field is non-zero. Once the trajectories are generated, the user can adjust their appearance in real-time.
Grid size
- the number of splits into which the interval from the minimum to the maximum value for each axis is divided. The averaged value of the vector field is subsequently determined for each grid point.
Integration method
- algorithm used to determine the position of the cell for the next step. Euler's method (default) is faster but less accurate than 4th Order Runge-Kutta.
Number of steps
- the maximum number of steps taken to determine the trajectory. This is the fundamental parameter that controls the length of the trajectory. Note that the final number of steps can be lower if one of two conditions is met:
  - The point is outside the point cloud. The criterion for determining whether a point should be considered to be outside the point cloud can be controlled using Grid size and Scale grid.
  - The progress of the trajectory is not large enough (the trajectory moves too slowly). This criterion can be controlled with the Difference threshold parameter.
Step size
- a parameter that scales the size of the step. A larger step size translates into longer and less accurate trajectories. The parameter affects not only the length of the trajectories but also their total number.
Difference threshold
- the minimum distance between the last two points of the trajectory. If this threshold is not exceeded, the integration is aborted. This parameter is intended to speed up the calculation process. It can be disabled by setting its value to 0.
Scale grid
- changes the grid size, which is considered when checking the cells' presence. A value greater than 1 extends the grid and, therefore, increases the probability of generating a longer trajectory but also runs the risk of trajectories going outside the point cloud. A value less than 1 allows better control of trajectories in terms of not leaving the point cloud but will result in fewer and shorter trajectories.
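The parameters above can be related to each other with a small integration sketch. It is not Cell Journey's implementation: the function names, the callable `inside` (standing in for the grid-based test of whether a point has left the point cloud), and the toy rotation field are assumptions chosen to keep the example self-contained.

```python
import math


def euler_step(field, p, h):
    """One explicit Euler step: p + h * v(p)."""
    return tuple(x + h * v for x, v in zip(p, field(p)))


def rk4_step(field, p, h):
    """One classical 4th-order Runge-Kutta step."""
    def shift(vec, scale):
        return tuple(x + scale * v for x, v in zip(p, vec))

    k1 = field(p)
    k2 = field(shift(k1, h / 2))
    k3 = field(shift(k2, h / 2))
    k4 = field(shift(k3, h))
    return tuple(
        x + h / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(p, k1, k2, k3, k4)
    )


def integrate(field, start, n_steps, step_size, diff_threshold, inside, step=euler_step):
    """Follow the field from `start` for at most `n_steps` steps.

    Stops early when the next point leaves the region accepted by `inside`
    or when it moves less than `diff_threshold` (never true for 0).
    """
    path = [tuple(start)]
    for _ in range(n_steps):
        nxt = step(field, path[-1], step_size)
        if not inside(nxt):
            break
        if math.dist(path[-1], nxt) < diff_threshold:
            break
        path.append(nxt)
    return path


def rotation(p):
    """Toy 2D field whose exact trajectories are circles around the origin."""
    x, y = p
    return (-y, x)


def always_inside(p):
    return True


euler_path = integrate(rotation, (1.0, 0.0), 100, 0.1, 0.0, always_inside)
rk4_path = integrate(rotation, (1.0, 0.0), 100, 0.1, 0.0, always_inside, step=rk4_step)
# The exact radius stays 1; Euler drifts outwards while RK4 stays close to it.
print(math.hypot(*euler_path[-1]), math.hypot(*rk4_path[-1]))
```

The toy field makes the accuracy trade-off visible: with the same number of steps and step size, the Euler trajectory spirals away from the circle, whereas the Runge-Kutta trajectory remains on it at the cost of four field evaluations per step.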
After selecting the preferred values, click the Generate trajectories (streamlines and streamlets) button. The time needed to determine the trajectories depends heavily on the selected parameters and can range from a few seconds to several minutes. The calculation progress can be monitored in the terminal. An example of the final result might look like this:
Generating trajectories for n_grid=30, n_steps=400, step_size=1, diff=0.002
1/2 Averaging vector space consisting of 27000 grid cells
[========================================================================] 100%
2/2 Calculating trajectories for 1196 starting points
[========================================================================] 100%
Finished! Generated 861 trajectories in total.
Trajectories can be depicted as streamlines or streamlets. To choose the preferred method, click on Show streamlines (default value) or Show streamlets.
Streamlets length
- the length of fragments into which trajectories (streamlines) are to be divided. Too large a value relative to the Number of steps may result in generating an insufficient number of streamlets. This parameter can be updated separately, after generating the averaged vector field and trajectories, by clicking on the Update streamlets button.
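How the streamlet length interacts with the number of steps can be shown with a minimal sketch. It assumes that a streamline is cut into consecutive fragments of a fixed number of points and that an incomplete trailing fragment is dropped; both the rule and the function name split_into_streamlets are assumptions, not Cell Journey's documented behavior.

```python
def split_into_streamlets(streamline, length):
    """Cut a list of points into consecutive fragments of `length` points.

    Assumption: fragments shorter than `length` are discarded, so a
    streamline shorter than `length` yields no streamlets at all.
    """
    if length < 2:
        raise ValueError("A streamlet needs at least two points.")
    last_start = len(streamline) - length
    return [streamline[i:i + length] for i in range(0, last_start + 1, length)]


# A toy streamline with 10 points along the x axis.
streamline = [(float(i), 0.0, 0.0) for i in range(10)]

short = split_into_streamlets(streamline, 4)
too_long = split_into_streamlets(streamline, 12)
print(len(short), len(too_long))
```

Under these assumptions, a streamlet length larger than the number of points in a trajectory produces no fragments, which is one way an insufficient number of streamlets can arise.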
Once the calculation is complete, the figure in the Plot trajectories tab will be automatically updated. In addition, a histogram with the length (number of steps) of the trajectories (streamlines or streamlets) and a red slider will appear. The slider allows you to filter out trajectories of unwanted length, such as those that are too short.
By default, trajectories are combined with a scatter plot. To turn this feature off, switch Combine trajectories with scatter plot to OFF.
Subset current number of trajectories
- the proportion of displayed trajectories. To reduce their number, decrease the default value of 1 to a smaller value and then confirm by clicking on Confirm. To restore all trajectories, click on the Restore all button. Note: It is recommended to use this function before filtering trajectories using the red slider under the histogram.
Line width
- the thickness of the trajectory. This parameter also affects the appearance of the single trajectory in the Cell Journey section.
Opacity
- the level of transparency of the trajectory (0 - full transparency, 1 - trajectories opaque). This parameter also affects the appearance of the single trajectory in the Cell Journey section.
Color scale
- the palette of colors used to draw trajectories. To reverse the order of the selected color scale, so that the color representing the beginning of the trajectory represents its end, switch Color scale in reversed direction to ON (the default value is OFF). Available options: Balance, Blackbody, Blues, Bluered, Electric, Greens, Greys, Hot, Inferno, Jet, Magma, Plasma, Plotly3, Rainbow, Reds, Solar, Temps, Turbo, Twilight, Viridis. This parameter also affects the appearance of the single trajectory in the Cell Journey section.
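Subsetting by proportion and filtering by length (the red slider under the histogram) can be sketched as two independent operations. The example below applies them in the order recommended above. Random sampling, the fixed seed, and the function names are assumptions made for illustration; trajectory length is taken to be the number of steps, i.e. the number of points minus one.

```python
import random


def subset_trajectories(trajectories, proportion, seed=0):
    """Keep a random share of the trajectories (assumption: uniform sampling)."""
    if not 0 < proportion <= 1:
        raise ValueError("The proportion must lie in (0, 1].")
    if not trajectories:
        return []
    keep = max(1, round(proportion * len(trajectories)))
    return random.Random(seed).sample(trajectories, keep)


def filter_by_length(trajectories, min_steps, max_steps):
    """Keep trajectories whose number of steps lies within [min_steps, max_steps]."""
    return [t for t in trajectories if min_steps <= len(t) - 1 <= max_steps]


# Ten toy trajectories with 1, 2, ..., 10 steps.
trajectories = [[(float(i), 0.0, 0.0) for i in range(n + 1)] for n in range(1, 11)]

half = subset_trajectories(trajectories, 0.5)
medium = filter_by_length(trajectories, 3, 8)
# Recommended order: subset first, then filter by length.
shown = filter_by_length(half, 3, 8)
print(len(half), len(medium), len(shown))
```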
Cell Journey (trajectory)
In the Cell Journey section, the user can simulate and analyze the trajectories of selected cells simply by clicking on them. Here, a second vector field is generated. This seemingly redundant feature allows the user to generate a denser grid than in the Plot trajectories section, where hundreds of trajectories need to be integrated. The simulation results can be more precise while avoiding the most time-consuming calculations. Determining multiple trajectories for a very dense grid can take a long time, and exploring the results may not run smoothly. This problem does not occur here because only one trajectory is drawn. Moreover, a user interested solely in analyzing individual trajectories can skip the Plot trajectories section and avoid generating the global characterization of cell trajectories altogether.
Be careful when manipulating the plot view, as any accidental click on a cell may overwrite the current result with a new trajectory. Selecting cells by clicking can be disabled by switching Lock trajectory to ON.
Select trajectory
- this field is active only if the user has generated trajectories in the Plot trajectories section. It is recommended to use the same grid size in the Cell Journey section as in Plot Trajectories when using one of the trajectories.
Grid size
- the number of splits into which the interval from the minimum to the maximum value for each axis is divided. The averaged value of the vector field is subsequently determined for each grid point. This is the only parameter that needs to be specified before clicking on Generate grid.
The progress of grid calculation can be monitored in the terminal. An example result:
Averaging vector space consisting of 8000 grid cells
[========================================================================] 100%
Finished!
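The number of grid cells reported in the terminal grows with the cube of the grid size (20 splits per axis give 20 * 20 * 20 = 8000 cells). The averaging step can be sketched as below. Binning cells into axis-aligned boxes and taking the mean velocity per box is an assumption about how the averaged vector field is built, and the function name is hypothetical.

```python
from collections import defaultdict


def average_vector_field(points, velocities, n_grid):
    """Average velocity vectors per grid cell.

    Each axis is split into `n_grid` intervals between its minimum and
    maximum coordinate. Returns {cell index tuple: mean velocity tuple}
    for non-empty cells only.
    """
    dims = len(points[0])
    mins = [min(p[d] for p in points) for d in range(dims)]
    maxs = [max(p[d] for p in points) for d in range(dims)]
    sums = defaultdict(lambda: [0.0] * dims)
    counts = defaultdict(int)
    for point, velocity in zip(points, velocities):
        index = []
        for d in range(dims):
            span = maxs[d] - mins[d]
            i = int((point[d] - mins[d]) / span * n_grid) if span > 0 else 0
            index.append(min(i, n_grid - 1))  # the maximum falls into the last cell
        key = tuple(index)
        counts[key] += 1
        for d in range(dims):
            sums[key][d] += velocity[d]
    return {key: tuple(s / counts[key] for s in total) for key, total in sums.items()}


points = [(0.0, 0.0, 0.0), (0.1, 0.1, 0.1), (1.0, 1.0, 1.0)]
velocities = [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
field = average_vector_field(points, velocities, n_grid=2)
print(field)
print(20 ** 3)  # number of grid cells for a grid size of 20
```

A denser grid resolves finer structure in the velocity field but leaves fewer cells per box to average over, and the cubic growth in the number of boxes explains why the calculation time rises quickly with the grid size.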
When the second grid is generated, a scatter plot with the same visual parameters as used in the Scatter plot section will appear. To select a cell, click on it. The appearance of the generated trajectory (line thickness, transparency, color scale) can be controlled using the parameters of the Plot trajectories section. The points can be adjusted in the same way using the parameters of the Scatter plot section.
Integration method
- algorithm used to determine the position of the cell for the next step. Euler's method (default) is faster but less accurate than 4th Order Runge-Kutta.
Starting velocity
- the initial velocity value for integrating the trajectory. For highly noisy data, the Cells exact velocity option may not provide results as robust as the Interpolated or Highest velocity (within the grid) options.
Number of steps
- the maximum number of steps taken to determine the trajectory. This is the fundamental parameter that controls the length of the trajectory. Note that the final number of steps can be lower if one of two conditions is met:
  - The point is outside the point cloud. The criterion for determining whether a point should be considered to be outside the point cloud can be controlled using Grid size and Scale grid.
  - The progress of the trajectory is not large enough (the trajectory moves too slowly). This criterion can be controlled with the Difference threshold parameter.
Step size
- a parameter that scales the size of the step. A larger step size translates into longer and less accurate trajectories. The parameter affects not only the length of the trajectories but also their total number.
Difference threshold
- the minimum distance between the last two points of the trajectory. If this threshold is not exceeded, the integration is aborted. This parameter is intended to speed up the calculation process. It can be disabled by setting its value to 0.
Scale grid
- changes the grid size, which is considered when checking the cells' presence. A value greater than 1 extends the grid and, therefore, increases the probability of generating a longer trajectory but also runs the risk of trajectories going outside the point cloud. A value less than 1 allows better control of trajectories in terms of not leaving the point cloud but will result in fewer and shorter trajectories.
Heatmap
By default, the heatmap is determined automatically on the basis of the most variable features in cells close to the selected trajectory. In addition to the automatic selection of features (button Automatically selected), the user can select their own features of interest (Custom features) or combine their selection with the automatic results (Show both sets). The list of custom features can be specified in the field directly below the buttons. For custom features, the user must select at least two features. In general, there must be at least as many features as there are clusters (Number of clusters field).
Heatmap method
- the method of displaying the average expression on the heatmap. Absolute shows the mean value, while Relative to first segment shows the value subtracted from the mean expression of the first segment of the trajectory. Note: This feature is not available when loading CSV data.
Number of clusters
- the number of clusters for the k-means algorithm to divide the data on the heatmap.
Number of features
- the number of features (genes) displayed on the heatmap.
Tube radius
- the radius of the cylinder tube whose center is defined by the trajectory.
Tube segment
- the number of segments the trajectory is divided into.
Tube cells size
- the size of the displayed cells that are within a distance no greater than the Tube radius from the trajectory. These cells can also be colored by selecting a preferred value from the color picker. To avoid highlighting cells belonging to the tube, change the Highlight tube cells switch to OFF (default value is ON).
Heatmap color scale
- the color scale used for visualizing the heatmap. Available options: Balance, Blackbody, Blues, Bluered, Electric, Greens, Greys, Hot, Inferno, Jet, Magma, Plasma, Plotly3, Rainbow, Reds, Solar, Temps, Turbo, Twilight, Viridis.
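The tube and heatmap parameters can be connected with a simplified sketch: cells within the tube radius of the trajectory are assigned to a segment, and a feature is averaged per segment. The segment assignment rule, the reading of Relative to first segment as "segment mean minus the mean of the first segment", and all function names are assumptions. Clustering with k-means and feature selection are left out.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment from a to b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    t = 0.0 if denom == 0 else sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(p, closest)


def assign_tube_segments(cells, trajectory, radius, n_segments):
    """Return {cell index: segment index} for cells inside the tube."""
    n_pieces = len(trajectory) - 1
    assignment = {}
    for cell_index, cell in enumerate(cells):
        distances = [
            point_segment_distance(cell, trajectory[i], trajectory[i + 1])
            for i in range(n_pieces)
        ]
        nearest = min(range(n_pieces), key=distances.__getitem__)
        if distances[nearest] <= radius:
            # Group consecutive trajectory pieces into n_segments tube segments.
            assignment[cell_index] = min(nearest * n_segments // n_pieces, n_segments - 1)
    return assignment


def segment_means(expression, assignment, n_segments, relative=False):
    """Mean expression per segment; None for segments without cells."""
    sums = [0.0] * n_segments
    counts = [0] * n_segments
    for cell_index, segment in assignment.items():
        sums[segment] += expression[cell_index]
        counts[segment] += 1
    means = [s / c if c else None for s, c in zip(sums, counts)]
    if relative and means[0] is not None:
        first = means[0]
        means = [None if m is None else m - first for m in means]
    return means


trajectory = [(float(i), 0.0, 0.0) for i in range(5)]
cells = [(0.5, 0.1, 0.0), (3.5, 0.0, 0.2), (2.0, 5.0, 0.0)]
expression = [1.0, 4.0, 100.0]  # one feature, one value per cell

assignment = assign_tube_segments(cells, trajectory, radius=0.5, n_segments=2)
absolute = segment_means(expression, assignment, n_segments=2)
relative = segment_means(expression, assignment, n_segments=2, relative=True)
print(assignment, absolute, relative)
```

In this toy example, the third cell lies far from the trajectory and is therefore excluded from the tube, so its high expression value does not influence either segment.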
Heatmap popover
Remove zeros from feature activities
- when set to ON (default value is OFF), zero values are ignored in the heatmap popover plot (expression of selected feature vs segment).
Plot type
- type of the heatmap popover plot. Available options: Strip plot (default), Box plot, Mean, and Median.
Trendline
- adds an expression-level trendline to the heatmap popover plot. Available options: None (default), Mean-based cubic spline, Median-based cubic spline, Ordinary least squares, LOWESS (fine), LOWESS (medium), and LOWESS (coarse).
Export results
Select figure
- the plot to be saved. Available options: Scatter plot, Cone plot, Trajectories (streamlines/streamlets), Single trajectory (Cell Journey), Heatmap (Cell Journey).
Filename
- the file name with or without the extension. Please name your files carefully as they will overwrite files existing in the saved_plots directory.
Format
- the format in which the visualization will be saved. Available formats: png, jpg, webp, svg, pdf, html.
Width/height
- the width and height of the graphic. The final dimensions differ from these values when the Scale figure parameter is not equal to 1.
Scale figure (size multiplier)
- a multiplier that allows you to enlarge (or reduce) the size of the plot without changing the values in the width and height fields.
Select table
- the data to be saved. Available options: Heatmap expression (with feature names and clusters), Trajectory cells (with segment).
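The export options for figures can be summarized in a short sketch that resolves the output path and the effective size. It assumes that the scale multiplies both width and height and that the extension is appended only when it is missing. The function name resolve_export is hypothetical; only the saved_plots directory and the list of formats come from this guide.

```python
from pathlib import Path

FORMATS = ("png", "jpg", "webp", "svg", "pdf", "html")


def resolve_export(filename, fmt, width, height, scale=1.0, out_dir="saved_plots"):
    """Return the target path, an overwrite flag, and the effective size."""
    if fmt not in FORMATS:
        raise ValueError(f"Unsupported format: {fmt}")
    # Accept file names given with or without the extension.
    has_extension = filename.lower().endswith("." + fmt)
    name = filename if has_extension else f"{filename}.{fmt}"
    path = Path(out_dir) / name
    return {
        "path": path,
        # An existing file with the same name would be overwritten.
        "overwrites": path.exists(),
        # Assumption: the scale multiplies both dimensions of the output.
        "size": (round(width * scale), round(height * scale)),
    }


export = resolve_export("trajectories", "png", width=800, height=600, scale=2)
print(export["path"], export["size"], export["overwrites"])
```

Checking the overwrite flag before saving is a simple way to avoid replacing an earlier figure in the saved_plots directory.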