plotting - plotting utilities

Utilities for plotting.

plotting.plot_2d_dim_red(df_dim_red, output_file, config, columns=['C1', 'C2'], groups_column=None, groups=None, plot_other_groups=False)

Plot the results of a two-dimensional dimensionality reduction.

Parameters:
df_dim_red : pandas.DataFrame

A data frame containing the results of the dimensionality reduction.

The rows should contain the data points, while the columns should contain the values of each data point’s projection along the principal components.

output_file : str

The file where the plot will be saved.

config : dict

A dictionary containing the configuration for the plot’s aesthetics.

columns : list, ["C1", "C2"]

A list with the names of the two columns that contain the values of the two dimensions of the projection’s space to be considered when plotting.

groups_column : str, optional

The name of the column containing the labels of different groups, if any.

If not provided, the data points will be assumed to belong to one group.

If provided, the data points will be colored according to the group they belong to.

groups : list, optional

A list of groups of interest. If a list of groups is provided and plot_other_groups is False, only data points belonging to the groups of interest will be plotted. If plot_other_groups is True, the other groups will be plotted according to the aesthetic specifications provided in the configuration.

plot_other_groups : bool, False

If a list of groups of interest is provided, set whether to plot the data points belonging to the other groups according to the aesthetic specifications provided in the configuration (True) or to omit them from the plot entirely (False).
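The interplay between groups_column, groups, and plot_other_groups can be illustrated with a minimal sketch. It only mirrors the behavior documented above and is not the library's implementation: the split_by_groups helper, the "tissue" column, and the sample labels are hypothetical and introduced purely for illustration.

```python
import pandas as pd

# Toy projection results: one row per data point, one column per dimension.
# The "tissue" column and its labels are hypothetical.
df_dim_red = pd.DataFrame(
    {
        "C1": [0.1, 1.2, -0.7, 2.3, -1.5],
        "C2": [0.4, -0.9, 1.1, 0.2, -0.3],
        "tissue": ["liver", "lung", "liver", "brain", "lung"],
    },
    index=["s1", "s2", "s3", "s4", "s5"],
)


def split_by_groups(df, groups_column=None, groups=None):
    """Hypothetical helper: separate the groups of interest from the rest."""
    if groups_column is None or groups is None:
        # No selection requested: every data point is kept, none is "other".
        return df, df.iloc[0:0]
    of_interest = df[groups_column].isin(groups)
    return df[of_interest], df[~of_interest]


selected, others = split_by_groups(df_dim_red, "tissue", ["liver", "lung"])

# With plot_other_groups=False only `selected` would be drawn; with
# plot_other_groups=True `others` would be drawn as well, styled as
# specified in the configuration.
```

A data frame shaped like df_dim_red, together with columns=["C1", "C2"] and groups_column="tissue", matches the inputs described for plot_2d_dim_red. The contents of config are not specified in this section, so no call is shown.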

plotting.plots.plot_multiple_2d_dim_red(dfs_dim_red, output_prefix, output_fmt, config, plots_per_output=9, columns=['C1', 'C2'], groups_column=None, groups=None, plot_other_groups=False, dfs_names=None)

Plot the results of a series of dimensionality reduction analyses on a single figure (which may be split across multiple output files).

Parameters:
dfs_dim_red : list of pandas.DataFrame

A list of data frames containing the results of the dimensionality reduction analyses.

The rows of each data frame should contain the data points, while the columns should contain the values of each data point’s projection along the principal components.

output_prefix : str

The prefix of the output file(s) that will be written.

The number of output files depends on the number of data frames passed and on the value of plots_per_output.

output_fmt : str

The format of the output file(s) that will be written.

config : dict

A dictionary containing the configuration for the plots’ aesthetics.

plots_per_output : int, 9

The maximum number of plots to draw on each output file.

columns : list, ["C1", "C2"]

A list with the names of the two columns in each data frame that contain the values of the two dimensions of the projection’s space to be considered when plotting.

groups_column : str, optional

The name of the column containing the labels of different groups in the data frames, if any.

If not provided, the data points will be assumed to belong to one group.

If provided, the data points will be colored according to the group they belong to.

groups : list, optional

A list of groups of interest. If a list of groups is provided and plot_other_groups is False, only data points belonging to the groups of interest will be plotted. If plot_other_groups is True, the other groups will be plotted according to the aesthetic specifications provided in the configuration.

plot_other_groups : bool, False

If a list of groups of interest is provided, set whether to plot the data points belonging to the other groups according to the aesthetic specifications provided in the configuration (True) or to omit them from the plots entirely (False).

dfs_names : list, optional

A list of names for the data frames passed. These names, if passed, will be used as the titles of the corresponding plots.
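Because plots_per_output is a maximum per output file, the number of files can be estimated ahead of time. The sketch below assumes the data frames are assigned to files in order, filling each file before starting the next. This is a plausible reading of the parameter descriptions, not confirmed behavior, and the chunk_for_outputs helper is hypothetical.

```python
import math


def chunk_for_outputs(dfs, plots_per_output=9):
    """Hypothetical helper: split a list into consecutive batches,
    each holding at most `plots_per_output` items."""
    if plots_per_output < 1:
        raise ValueError("plots_per_output must be a positive integer.")
    return [
        dfs[i:i + plots_per_output]
        for i in range(0, len(dfs), plots_per_output)
    ]


# Placeholders standing in for 20 data frames.
dfs_placeholder = [f"df_{i}" for i in range(20)]

batches = chunk_for_outputs(dfs_placeholder, plots_per_output=9)

# Under the assumption above, the number of files is a ceiling division.
n_outputs = math.ceil(len(dfs_placeholder) / 9)
```

Under this assumption, 20 data frames with the default plots_per_output of 9 would be spread over three files holding 9, 9, and 2 plots.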

plotting.plot_get_representations_time(df_time, output_file, config)

Plot the CPU/wall clock time spent in each epoch of each round of optimization when finding the representations for a set of samples (both for the full epoch and for the backward step performed in each epoch).

Parameters:
df_time : pandas.DataFrame

A data frame containing the time data. This data frame is produced as an output by the bulkDGD.core.model.DGDModel.get_representations method.

output_file : str

The file where the plot will be saved.

config : dict

A dictionary containing the configuration for the plot’s aesthetics.

plotting.plot_r_values_hist(r_values, output_file, config)

Plot a histogram of the r-values.

Parameters:
r_values : numpy.ndarray

The r-values. This is a 1D array whose length is equal to the number of genes included in the DGD model.

output_file : str

The file where the plot will be saved.

config : dict

A dictionary containing the configuration for the plot’s aesthetics.
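As a rough illustration of the expected input, the sketch below builds a synthetic 1D array and bins it with numpy.histogram. The number of genes, the distribution the values are drawn from, and the number of bins are all assumptions made for the example. In practice the r-values would come from the DGD model, and the binning used by plot_r_values_hist is not specified in this section.

```python
import numpy as np

rng = np.random.default_rng(0)

n_genes = 1000  # hypothetical number of genes in the model

# Synthetic stand-in for the r-values (one value per gene).
r_values = rng.gamma(shape=2.0, scale=1.5, size=n_genes)

# The documented input is a 1D array, so check the shape up front.
if r_values.ndim != 1:
    raise ValueError("r_values must be a 1D array.")

# Bin the values; `counts` holds the number of genes falling in each bin.
counts, bin_edges = np.histogram(r_values, bins=30)
```

Checking the array's dimensionality before plotting helps catch a 2D array (for example, one of shape (1, n_genes)) passed by mistake.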