simplicity package

Subpackages

Submodules

simplicity.dir_manager module

Created on Tue Aug 27 19:36:24 2024

@author: pietro

simplicity.dir_manager.create_directories(experiment_name)[source]: Create necessary subdirectories within the data directory.

simplicity.dir_manager.get_data_dir()[source]: Get the current data directory path.

simplicity.dir_manager.get_experiment_cluster_dir(experiment_name)[source]

simplicity.dir_manager.get_experiment_dir(experiment_name)[source]: Get the experiment_name directory path.

simplicity.dir_manager.get_experiment_fit_result_dir(experiment_name)[source]

simplicity.dir_manager.get_experiment_foldername_from_SSOD(seeded_simulation_output_dir)[source]

simplicity.dir_manager.get_experiment_output_dir(experiment_name)[source]: Get the experiment_name output directory path.

simplicity.dir_manager.get_experiment_plots_dir(experiment_name)[source]

simplicity.dir_manager.get_experiment_settings_dir(experiment_name)[source]: Get the experiment_name settings directory path.

simplicity.dir_manager.get_experiment_simulations_plots_dir(experiment_name)[source]

simplicity.dir_manager.get_experiment_tree_dir(experiment_name)[source]

simplicity.dir_manager.get_experiment_tree_simulation_dir(experiment_name, seeded_simulation_output_dir)[source]

simplicity.dir_manager.get_experiment_tree_simulation_files_dir(experiment_name, seeded_simulation_output_dir)[source]

simplicity.dir_manager.get_experiment_tree_simulation_plots_dir(experiment_name, seeded_simulation_output_dir)[source]

simplicity.dir_manager.get_nextstrain_dir(experiment_name)[source]

simplicity.dir_manager.get_reference_parameters_dir()[source]

simplicity.dir_manager.get_seed_from_SSOD(seeded_simulation_output_dir)[source]: Extract the zero-padded seed number (e.g. ‘0007’)

simplicity.dir_manager.get_seeded_simulation_output_dirs(simulation_output_dir)[source]

simplicity.dir_manager.get_seeded_simulation_parameters_dir(experiment_name)[source]: Get the experiment_name seeded simulation parameters directory path.

simplicity.dir_manager.get_simulation_output_dirs(experiment_name)[source]

simplicity.dir_manager.get_simulation_output_foldername_from_SSOD(seeded_simulation_output_dir)[source]: Get the simulation_output folder name from an SSOD path of the form Data/experiment/04_Output/simulation_output/seed_nr.
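
To illustrate how the SSOD helpers relate to one another, the sketch below splits a path of the form Data/experiment/04_Output/simulation_output/seed_nr into its parts. The function name split_ssod and the assumption that the seed folder is called seed_ followed by zero-padded digits are hypothetical; the package's own helpers may parse the path differently.

```python
import re
from pathlib import PurePosixPath


def split_ssod(ssod):
    """Split a seeded simulation output dir (SSOD) path into its parts.

    Assumes (hypothetically) the layout
    <data>/<experiment>/04_Output/<simulation_output>/seed_<digits>.
    """
    path = PurePosixPath(ssod)
    match = re.fullmatch(r"seed_(\d+)", path.name)
    if match is None:
        raise ValueError(f"not a seed folder: {path.name!r}")
    return {
        "experiment": path.parents[2].name,
        "simulation_output": path.parent.name,
        "seed": match.group(1),  # zero padding is preserved, e.g. '0007'
    }


parts = split_ssod("Data/my_experiment/04_Output/sim_A/seed_0007")
```

Keeping the seed as a string rather than an int preserves the zero padding mentioned for get_seed_from_SSOD.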

simplicity.dir_manager.get_simulation_parameters_dir(experiment_name)[source]: Get the experiment_name simulation parameters directory path.

simplicity.dir_manager.get_slurm_id_map_dir(experiment_name)[source]

simplicity.dir_manager.get_slurm_logs_dir(experiment_name)[source]

simplicity.dir_manager.get_ssod(sim_out_dir, seed_number)[source]: Returns the seeded simulation output directory (SSOD) for the given seed number.

simplicity.dir_manager.set_data_dir(path)[source]: Set the data directory path.

simplicity.extrande module

Created on Fri Mar 28 13:31:54 2025

@author: pietro

class simplicity.extrande.ProgressReporter(total_time, simulation_id)[source]

Bases: object

__init__(total_time, simulation_id)[source]

close()[source]

update(population, delta_t, reaction_id=None, event_type=None)[source]

simplicity.extrande.extrande_core_loop(parameters, population, helpers, sim_id)[source]: Core extrande loop.

simplicity.extrande.extrande_factory(phenotype_model, parameters, sim_id, rng1, rng2)[source]

simplicity.extrande.get_helpers(phenotype_model, parameters, rng1, rng2)[source]: Returns helper functions for extrande (SIMPLICITY engine).

simplicity.intra_host_model module

Intra-host transient-state model of SARS-CoV-2 infection

@author: Pietro Gerletti

class simplicity.intra_host_model.Host(tau_1, tau_2, tau_3, tau_4, update_mode='matrix')[source]

Bases: object

This class defines the intra-host model of SARS-CoV-2 pathogenesis.

__init__(tau_1, tau_2, tau_3, tau_4, update_mode='matrix')[source]

compute_all_probabilities(delta_t)[source]: Compute probability vectors for all initial states.

data_plot_ih_solution(state, time, step)[source]

Compute:
p_inf - probability of being infectious after a time t
p_det - probability of being detectable after a time t
p_rec - probability of being recovered after a time t

Parameters:

state (int) – Intra-host model starting state
time (float) – Time for the intra-host model solution

Returns:

Either p_inf, p_det or p_rec. The output is used to plot the intra-host model results.

factory_get_A_t()[source]: Compute or retrieve matrix exponential expm(A * t) from the precomputed table.

get_jump_rate(state)[source]

get_p_t(A_t, state)[source]: Compute state probability vector p(t) for given A^t and state.
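
The relationship between expm(A * t) and p(t) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the package's code: the three-state chain, the row-vector convention (rows of A sum to zero, so row `state` of expm(A * t) is the distribution at time t when starting in `state`), and the helper names mat_mul, expm and p_t are all hypothetical. In practice a library routine such as scipy.linalg.expm would normally replace the hand-written series.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(a, terms=30, squarings=10):
    """Matrix exponential via a scaled Taylor series plus repeated squaring."""
    n = len(a)
    scale = 2 ** squarings
    scaled = [[x / scale for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def p_t(a_t, state):
    """Row `state` of expm(A*t): distribution at time t when starting in `state`."""
    return a_t[state]


# Hypothetical chain 0 -> 1 -> 2 with the same rate for each jump.
rate, t = 0.5, 2.0
A = [[-rate, rate, 0.0],
     [0.0, -rate, rate],
     [0.0, 0.0, 0.0]]
p = p_t(expm([[x * t for x in row] for row in A]), 0)
expected_stay = math.exp(-rate * t)  # analytic probability of still being in state 0
```

For a fixed step size the matrix expm(A * delta_t) never changes, which is presumably why a precomputed table is worthwhile.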

get_update_mode()[source]

simulate_trajectory(delta_t, rng=None, exponential_dt=False)[source]

Simulate disease progression trajectory.

Parameters:

delta_t (float) – Mean time step or fixed step size
rng (np.random.Generator) – Optional random generator for reproducibility
exponential_dt (bool) – If True, draw each step from an exponential distribution with mean delta_t

Returns:

trajectory, time_points, info – Full simulation result

Return type:

tuple
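
The two stepping modes described for delta_t can be sketched as below. The function draw_time_points and its t_end argument are hypothetical, and the sketch uses the standard-library random.Random rather than np.random.Generator to stay dependency-free; it only shows how time points might be generated, not how states are updated.

```python
import random


def draw_time_points(delta_t, t_end, rng=None, exponential_dt=False):
    """Return increasing time points from 0 up to (not beyond) t_end."""
    rng = rng or random.Random()
    times = [0.0]
    while True:
        # expovariate takes a rate, so 1/delta_t gives steps with mean delta_t.
        step = rng.expovariate(1.0 / delta_t) if exponential_dt else delta_t
        if times[-1] + step > t_end:
            return times
        times.append(times[-1] + step)


fixed = draw_time_points(0.5, 2.0)
randomised = draw_time_points(0.5, 50.0, rng=random.Random(1), exponential_dt=True)
```

Passing a seeded generator makes the exponential variant reproducible, which matches the purpose stated for the rng parameter.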

static update_state(p_t, tau)[source]: Sample next state based on rejection sampling.
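
As a generic illustration of rejection sampling from a state probability vector, consider the sketch below. It is not the package's update_state: the role of the tau argument is not documented here, so it is omitted, and the function name sample_state_rejection is hypothetical. A candidate state is proposed uniformly and accepted with probability p[candidate] / max(p).

```python
import random


def sample_state_rejection(p, rng):
    """Draw an index i with probability p[i] / sum(p) by rejection sampling."""
    p_max = max(p)
    if p_max <= 0:
        raise ValueError("probability vector has no positive entry")
    while True:
        candidate = rng.randrange(len(p))        # uniform proposal
        if rng.random() * p_max < p[candidate]:  # accept w.p. p[candidate]/p_max
            return candidate


rng = random.Random(42)
draws = [sample_state_rejection([0.1, 0.0, 0.9], rng) for _ in range(2000)]
share_of_state_2 = draws.count(2) / len(draws)
```

States with zero probability are never accepted, and the vector does not need to be normalised beforehand.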

simplicity.intra_host_model.compute_phase_durations(trajectories)[source]: Compute durations in major infection phases for each trajectory. Returns: list of dicts with phase durations per individual

simplicity.intra_host_model.compute_state_durations(trajectories, max_state=20)[source]: Correctly compute residence time in each state, accounting for repeated states. Returns: dict of state -> list of durations
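
One way to account for repeated states is to merge consecutive samples of the same state into a single visit before measuring its length. The sketch below assumes trajectories are time-ordered lists of (t, s) pairs, as described for plot_state_duration_stats_grid; the function name state_durations and the choice to skip the final, unfinished visit are assumptions.

```python
from collections import defaultdict


def state_durations(trajectories):
    """Map each state to the list of its uninterrupted residence times.

    Each trajectory is a time-ordered list of (t, state) pairs. Consecutive
    samples in the same state are merged into one visit, so a state recorded
    at several successive time points counts once. The final visit has no
    observed exit and is skipped.
    """
    durations = defaultdict(list)
    for trajectory in trajectories:
        if not trajectory:
            continue
        entry_time, current = trajectory[0]
        for t, state in trajectory[1:]:
            if state != current:
                durations[current].append(t - entry_time)
                entry_time, current = t, state
    return dict(durations)


example = [[(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 1), (4.5, 2)]]
result = state_durations(example)
```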

simplicity.intra_host_model.load_results(filename)[source]

simplicity.intra_host_model.main()[source]

simplicity.intra_host_model.plot_duration_summary_scatter(fixed_results, exp_results, fixed_dts, exp_lambdas, x_shift=0.015)[source]: Compare phase durations vs. Δt (fixed) and λ (exp) using scatter plots with error bars. Points are shifted slightly for clarity.

simplicity.intra_host_model.plot_infectious_duration_vs_step(fixed_phase_durations, exp_phase_durations, fixed_dts, exp_lambdas)[source]

Plot infectious duration vs Δt or λ on log scale.

Parameters:

fixed_phase_durations (dict) – { Δt: list of dicts with phase durations }
exp_phase_durations (dict) – { λ: list of dicts with phase durations }
fixed_dts (list of floats) – Fixed step sizes
exp_lambdas (list of floats) – Exponential step scales

simplicity.intra_host_model.plot_state_duration_stats_grid(trajectories_dict, keys, title_prefix)[source]

Create a 2x2 grid of bar charts showing durations in each intra-host state (0–20) for each Δt or λ value.

Parameters:

trajectories_dict (dict) – Dict of { Δt or λ : list of trajectories ([(t, s), …]) }
keys (list) – List of Δt or λ values to plot
title_prefix (str) – Title prefix for each subplot (e.g., ‘Fixed’, ‘Exp’)

simplicity.intra_host_model.plot_state_timeline_summary(state_durations_dict, phase_durations_dict, title_prefix='')[source]

Visualize average residence times for each state as a timeline-style plot (one per Δt or λ). Each state’s duration is shown as a horizontal line, placed sequentially on the time axis. States are color-coded by the infection phase they belong to.

Parameters:

state_durations_dict (dict) – Dictionary of {Δt or λ: state_durations}, where each state_durations is a dict: { state_index -> list of durations } (e.g. output of compute_state_durations)
title_prefix (str) – Prefix to add to each subplot title, e.g. “Fixed” or “Exp”

simplicity.intra_host_model.run_parallel_simulations(delta_t, n_runs=100, tau_1=2.86, tau_2=3.91, tau_3=7.5, tau_4=8.0, exponential_dt=False, base_seed=None)[source]

simplicity.intra_host_model.save_results(filename, data)[source]

simplicity.output_manager module

Created on Thu Aug 29 13:56:20 2024

@author: pietro

simplicity.output_manager.archive_experiment(experiment_name)[source]

Archives the specified experiment folder as a .tar.gz file and deletes the original folder.

This script:
1. Checks if the ‘Archive’ directory exists within the data directory, and creates it if not.
2. Compresses the experiment folder into a .tar.gz archive that is compatible with Windows.
3. Deletes the original experiment folder after archiving.

Parameters:: experiment_name (str) – The name of the experiment to be archived.
Raises:: FileNotFoundError – If the experiment folder does not exist.
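
The three archiving steps can be sketched with the standard library as follows. The function name archive_folder and its explicit data_dir argument are hypothetical (the documented function takes only experiment_name and presumably resolves the data directory itself).

```python
import shutil
import tarfile
from pathlib import Path


def archive_folder(data_dir, experiment_name):
    """Pack <data_dir>/<experiment_name> into Archive/<name>.tar.gz, then delete it."""
    source = Path(data_dir) / experiment_name
    if not source.is_dir():
        raise FileNotFoundError(f"experiment folder not found: {source}")
    archive_dir = Path(data_dir) / "Archive"
    archive_dir.mkdir(exist_ok=True)
    archive_path = archive_dir / f"{experiment_name}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source, arcname=experiment_name)
    shutil.rmtree(source)  # only reached if the archive was written
    return archive_path
```

Deleting the folder only after the archive has been closed means a failed write leaves the original data untouched.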

simplicity.output_manager.create_combined_sequencing_df(source, min_seq_number=0, min_sim_lenght=0, individual_type=None)[source]

Join sequencing_data_regression.csv files.

Parameters:: source – Either a string (path to simulation directory) or a list of SSOD paths.
- If str: Globs all seeds in that directory.
- If list: Uses only the provided seed paths.

simplicity.output_manager.detect_sod_outliers(sod_df, threshold=1.5, hard_floor=1e-09)[source]: Identifies outliers using a Hard Floor for failures AND IQR for statistical deviants.
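
A combination of a hard floor and an IQR rule could look like the sketch below. It works on a plain list of values rather than a DataFrame, and the function name flag_outliers, the quartile method (statistics.quantiles with its default settings) and the decision to compute quartiles only from values above the floor are all assumptions.

```python
import statistics


def flag_outliers(values, threshold=1.5, hard_floor=1e-09):
    """Return one boolean per value: True if it is flagged as an outlier."""
    valid = [v for v in values if v >= hard_floor]
    if len(valid) < 2:
        # Too few valid points for quartiles: flag only the floor failures.
        return [v < hard_floor for v in values]
    q1, _, q3 = statistics.quantiles(valid, n=4)
    iqr = q3 - q1
    low, high = q1 - threshold * iqr, q3 + threshold * iqr
    return [v < hard_floor or v < low or v > high for v in values]


values = [0.90, 0.95, 0.98, 1.00, 1.02, 1.05, 1.08, 1.10, 0.0, 50.0]
flags = flag_outliers(values)
```

Here 0.0 is caught by the floor (a failed estimate) and 50.0 by the IQR fence; the remaining values pass.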

simplicity.output_manager.export_tree(tree, experiment_name, seeded_simulation_output_dir, tree_type, tree_subtype, file_type)[source]: Export an anytree tree to file

simplicity.output_manager.filter_lineage_frequency_df(lineage_frequency_df, threshold)[source]

Filters the lineage frequency DataFrame by threshold.

Parameters:

lineage_frequency_df – raw DataFrame from read_lineage_frequency()
threshold – float, minimum frequency a lineage must reach at any time

Returns:

filtered_df – pivoted and filtered DataFrame (Time_sampling x Lineage)

Return type:

DataFrame
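
The filter-then-pivot logic can be illustrated without pandas. In the sketch below, records are hypothetical (time, lineage, frequency) tuples and the result is a nested dict standing in for the Time_sampling x Lineage table; the function name filter_and_pivot is an assumption.

```python
def filter_and_pivot(records, threshold):
    """Pivot (time, lineage, frequency) records, dropping rare lineages.

    A lineage is kept if its frequency reaches `threshold` at any time.
    Returns {time: {lineage: frequency}} with missing entries filled by 0.0.
    """
    peak = {}
    for _, lineage, freq in records:
        peak[lineage] = max(peak.get(lineage, 0.0), freq)
    kept = sorted(name for name, value in peak.items() if value >= threshold)
    table = {}
    for time, lineage, freq in records:
        row = table.setdefault(time, dict.fromkeys(kept, 0.0))
        if lineage in kept:
            row[lineage] = freq
    return table


records = [(0, "A", 1.0), (1, "A", 0.9), (1, "B", 0.1),
           (2, "A", 0.6), (2, "B", 0.02), (2, "C", 0.38)]
table = filter_and_pivot(records, threshold=0.2)
```

Lineage B never reaches 0.2 and is dropped from every row, while C is kept for all times because it reaches the threshold once.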

simplicity.output_manager.filter_sequencing_files_by_simulation_lenght(files, min_sim_lenght)[source]: Filters sequencing files, keeping only those from simulations that lasted at least min_sim_lenght.

simplicity.output_manager.get_IH_lineages_data_experiment(experiment_name)[source]

simplicity.output_manager.get_IH_lineages_data_simulation(simulation_output_dir)[source]

simplicity.output_manager.get_OSR_vs_parameter_csv_file_path(experiment_name, parameter, min_seq_number, min_sim_lenght, individual_type=None)[source]

simplicity.output_manager.get_all_individuals_data_for_simulation_output_dir(simulation_output_dir)[source]

simplicity.output_manager.get_clustering_table_filepath(experiment_name: str, ssod: str) → str[source]: Get the file path 07_Clustering/{SOD}_seed_{SEED}_clustering.csv.

simplicity.output_manager.get_combined_OSR_vs_parameter_csv_file_path(experiment_name, parameter, min_seq_number, min_sim_lenght, individual_type=None)[source]

simplicity.output_manager.get_fit_results_filepath(experiment_name, model_type, individual_type=None, experiment_group=None)[source]

simplicity.output_manager.get_mean_std_OSR(experiment_name, parameter, min_seq_number, min_sim_lenght, individual_type=None, include_outliers=False)[source]

simplicity.output_manager.get_nextstrain_dataset_paths(experiment_name: str, ssod: str)[source]: Returns (dataset_json_path, metadata_tsv_path, base_name) for a given SSOD.

simplicity.output_manager.get_procomputed_matrix_table_filepath(tau_1, tau_2, tau_3, tau_4)[source]

simplicity.output_manager.get_r_effective_lineages_csv_filepath(experiment_name, seeded_simulation_output_dir, window_size, threshold)[source]

simplicity.output_manager.get_r_effective_lineages_traj(ssod, time_window, threshold)[source]: Compute and return the lineage-level R_effective trajectories as a dictionary of Series.

simplicity.output_manager.get_r_effective_population_csv_filepath(experiment_name, seeded_simulation_output_dir, window_size, threshold)[source]

simplicity.output_manager.get_r_effective_population_traj(ssod, time_window)[source]: Compute and return the population-level R_effective trajectory as a pandas Series.

simplicity.output_manager.get_tree_file_filepath(experiment_name, seeded_simulation_output_dir, tree_type, tree_subtype, file_type)[source]

simplicity.output_manager.get_tree_filename(experiment_name, seeded_simulation_output_dir, tree_type, tree_subtype, file_type)[source]

simplicity.output_manager.get_tree_plot_filepath(experiment_name, seeded_simulation_output_dir, tree_type, tree_subtype, file_type='img')[source]

simplicity.output_manager.read_OSR_vs_parameter_csv(experiment_name, parameter, min_seq_number, min_sim_lenght, individual_type=None, include_outliers=False)[source]

simplicity.output_manager.read_R_effective_trajectory(seeded_simulation_output_dir)[source]

simplicity.output_manager.read_clustering_table(experiment_name: str, ssod: str) → DataFrame[source]: Reads the single clustering CSV and literal-evals nested columns.

simplicity.output_manager.read_combined_OSR_vs_parameter_csv(experiment_name, parameter, min_seq_number, min_sim_lenght, individual_type=None)[source]

simplicity.output_manager.read_final_time(seeded_simulation_output_dir)[source]

simplicity.output_manager.read_fit_results_csv(experiment_name, model_type, individual_type=None, experiment_group=None)[source]

simplicity.output_manager.read_individuals_data(seeded_simulation_output_dir)[source]

simplicity.output_manager.read_lineage_frequency(seeded_simulation_output_dir)[source]

simplicity.output_manager.read_phylogenetic_data(seeded_simulation_output_dir)[source]

simplicity.output_manager.read_r_effective_trajs_csv(experiment_name, seeded_simulation_output_dir, time_window, threshold)[source]

Read R_effective (population + lineage) trajectories from CSV. If files are missing, issues a warning and generates them.

Returns:

r_effective_population_traj: pd.Series
r_effective_lineages_traj: dict of {lineage_name: pd.Series}

simplicity.output_manager.read_sequencing_data_regression(seeded_simulation_output_dir)[source]

simplicity.output_manager.read_simulation_trajectory(seeded_simulation_output_dir)[source]

simplicity.output_manager.save_R_effective_trajectory(simulation_output, seeded_simulation_output_dir)[source]

simplicity.output_manager.save_final_time(simulation_output, seeded_simulation_output_dir)[source]

simplicity.output_manager.save_fitness_trajectory(simulation_output, seeded_simulation_output_dir)[source]

simplicity.output_manager.save_individuals_data(simulation_output, seeded_simulation_output_dir)[source]

simplicity.output_manager.save_lineage_frequency(simulation_output, seeded_simulation_output_dir)[source]

simplicity.output_manager.save_phylogenetic_data(simulation_output, seeded_simulation_output_dir)[source]

simplicity.output_manager.save_sequencing_dataset(simulation_output, output_path, sequence_long_shedders=False)[source]

Writes the simulated sequencing data to FASTA files and a regression CSV.

Parameters:

simulation_output – The object containing simulation results.
output_path – Directory to save files.
sequence_long_shedders (bool) – If True, extracts and saves all lineages from long shedders (at end of infection/sim) to a separate FASTA and includes them in the regression CSV.

Outputs:

1. sequencing_data.fasta: Random surveillance sequences
2. sequencing_data_long.fasta: All lineages from long-shedders
3. sequencing_data_regression.csv: Combined metrics for both groups
4. sequencing_data.csv: Raw surveillance metadata

simplicity.output_manager.save_simulation_trajectory(simulation_output, seeded_simulation_output_dir)[source]

simplicity.output_manager.setup_output_directory(experiment_name, seeded_simulation_parameters_path)[source]

Sets up the output directory structure for a simulation.

This function constructs the output directory path based on the provided seeded simulation parameters file path and the experiment name.

Parameters:

seeded_simulation_parameters_path (str) – The file path to the seeded simulation parameters.
experiment_name (str) – The name of the experiment for which the output directory is being set up.

Returns:

The path to the main output directory created for this experiment.

Return type:

str

Example

If the seeded_simulation_parameters_path is ‘/path/to/seed/files/seed_file.txt’ and the experiment_name is ‘Experiment_1’, the function will create and return the path: ‘/data_dir/Experiment_1/04_Output/files/seed_file’
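
The path construction in this example can be reproduced with pathlib. The sketch below only builds the string and does not create any directory; the function name output_dir_for and the explicit data_dir argument are hypothetical.

```python
from pathlib import PurePosixPath


def output_dir_for(data_dir, experiment_name, seeded_simulation_parameters_path):
    """Build <data_dir>/<experiment>/04_Output/<parent folder>/<file stem>."""
    params = PurePosixPath(seeded_simulation_parameters_path)
    return str(PurePosixPath(data_dir) / experiment_name / "04_Output"
               / params.parent.name / params.stem)


path = output_dir_for("/data_dir", "Experiment_1", "/path/to/seed/files/seed_file.txt")
```

The parent folder name and the file stem of the parameters file become the last two path components, which is what ties each output directory back to the parameters that produced it.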

simplicity.output_manager.write_OSR_vs_parameter_csv(experiment_name, parameter, min_seq_number=0, min_sim_lenght=0, individual_type=None)[source]: Calculates OSR for every individual seed and detects outliers per parameter set.

simplicity.output_manager.write_clustering_table(experiment_name: str, ssod: str, df: DataFrame)[source]: Writes the clustering CSV.

simplicity.output_manager.write_combined_OSR_vs_parameter_csv(experiment_name, parameter, min_seq_number=0, min_sim_lenght=0, individual_type=None, include_outliers=False)[source]: Create df of observed substitution rate (tempest regression on joint data). Uses individual OSR results to filter out outliers before combining, unless include_outliers=True.

simplicity.output_manager.write_fit_results_csv(experiment_name, model_type, fit_result, individual_type=None, experiment_group=None)[source]

simplicity.output_manager.write_r_effective_trajs_csv(experiment_name, seeded_simulation_output_dir, time_window, threshold)[source]: Compute and write R_effective (population + lineage) trajectories to CSV.

simplicity.plots_manager module

simplicity.plots_manager.apply_plos_rcparams()[source]

simplicity.plots_manager.apply_standard_axis_style(ax, has_secondary_y=False)[source]

Apply PLOS-compliant style to an axis:
- 8pt font (inherited via rcParams)
- No top/right spines unless has_secondary_y is True

simplicity.plots_manager.get_fitness_color(fitness_score, nodes_data)[source]

simplicity.plots_manager.get_lineage_color(lineage_name, colormap_df, cmap_name='gist_rainbow')[source]

Retrieve the color assigned to a specific lineage.

Given a lineage name and a lineage-to-color mapping DataFrame (as generated by make_lineages_colormap), this function returns the corresponding color as a hexadecimal string.

Parameters:

lineage_name (str) – Name of the lineage to retrieve the color for.
colormap_df (pd.DataFrame) – DataFrame containing ‘Lineage_name’ and ‘Color’ columns, as returned by make_lineages_colormap.
cmap_name (str, optional) – Unused in this function (retained for compatibility), default is ‘gist_rainbow’.

Returns:

Hexadecimal color code (e.g., ‘#aabbcc’) corresponding to the given lineage.

Return type:

str

Raises:

ValueError – If lineage_name is None or not found in the colormap.

simplicity.plots_manager.get_node_color(node, coloring, tree_data, colormap_df)[source]

simplicity.plots_manager.get_state_color(state)[source]

simplicity.plots_manager.ideal_subplot_grid(num_plots)[source]

simplicity.plots_manager.make_lineages_colormap(seeded_simulation_output_dir, cmap_name='gist_rainbow')[source]

Generate a colormap for lineages based on their order of emergence.

This function reads phylogenetic data for a seeded simulation and assigns each lineage a unique color from a specified matplotlib colormap, ordered by time of emergence. The output is a DataFrame that maps lineage names to hexadecimal color codes.

Parameters:

seeded_simulation_output_dir (str) – Path to the directory containing simulation output and phylogenetic data.
cmap_name (str, optional) – Name of the matplotlib colormap to use for color assignment (default is ‘gist_rainbow’).

Returns:

DataFrame with columns:
- ‘Lineage_name’: str, lineage identifiers
- ‘Color’: str, hexadecimal color codes

Return type:

pd.DataFrame
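
The idea of assigning colors by order of emergence, and of looking one up with a ValueError for unknown names, can be sketched without matplotlib or pandas. In the sketch below a plain dict replaces the DataFrame and evenly spaced HSV hues stand in for the ‘gist_rainbow’ colormap; the function names make_colormap and lineage_color and the example lineage names are hypothetical.

```python
import colorsys


def make_colormap(emergence_times):
    """Map lineage names to hex colors, ordered by time of emergence.

    `emergence_times` is {lineage_name: time}. Hues are spread evenly over
    the HSV wheel as a stand-in for a matplotlib colormap.
    """
    ordered = sorted(emergence_times, key=emergence_times.get)
    colors = {}
    for i, name in enumerate(ordered):
        r, g, b = colorsys.hsv_to_rgb(i / max(len(ordered), 1), 1.0, 1.0)
        colors[name] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255))
    return colors


def lineage_color(lineage_name, colors):
    """Look up a lineage color, raising ValueError for unknown names."""
    if lineage_name is None or lineage_name not in colors:
        raise ValueError(f"lineage not in colormap: {lineage_name!r}")
    return colors[lineage_name]


colors = make_colormap({"B.1": 12.5, "wt": 0.0, "B.2": 30.0})
```

Because the mapping is built once per simulation, every plot that uses it shows a given lineage in the same color.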

simplicity.plots_manager.plot_IH_lineage_distribution(experiment_name)[source]

Plot distribution histograms for intra-host lineage variability across an experiment.

Two subplots are shown:
- Left: total number of intra-host lineages per individual
- Right: number of distinct lineages per individual

Parameters:: experiment_name (str) – Name of the experiment used to load data and save the plot.

simplicity.plots_manager.plot_IH_lineage_distribution_grouped_by_simulation(experiment_name)[source]

Plot intra-host lineage distributions grouped by tau_3 parameter across simulations.

Each bar represents the proportion of individuals with a given number of distinct intra-host lineages, grouped and color-coded by tau_3 values.

Parameters:: experiment_name (str) – Name of the experiment used to retrieve simulation data and save the plot.

simplicity.plots_manager.plot_IH_lineage_distribution_simulation(experiment_name)[source]

Plot intra-host distinct lineage distributions for each simulation in the experiment.

Each subplot shows the distribution of the number of distinct intra-host lineages per individual, grouped by simulation.

Parameters:: experiment_name (str) – Name of the experiment used to load simulation data and save the plot.

simplicity.plots_manager.plot_OSR_and_IH_lineages_by_parameter(experiment_name, parameter='tau_3', min_seq_number=0, min_sim_lenght=0)[source]

simplicity.plots_manager.plot_OSR_fit(experiment_name, fit_result, model_type, min_seq_number, min_sim_lenght)[source]

Plot nucleotide substitution rate vs. observed substitution rate (OSR) including three panel views (linear, semilog, and log-log).

Each panel shows:
- OSR estimates (single simulations)
- Fitted curve
- Combined regression estimates
- Mean OSR values

Parameters:

experiment_name (str) – Name of the experiment, used to load and save data.
fit_result (lmfit.model.ModelResult) – Fitted model result containing .best_fit.
model_type (str) – Model used (used in output file name).
min_seq_number (int) – Minimum number of sequences required for inclusion.
min_sim_lenght (int) – Minimum simulation duration required for inclusion.

simplicity.plots_manager.plot_OSR_fit_figure(experiment_name, parameter, fit_result, OSR_single_sim_data, OSR_combined_sim_data, OSR_mean_data, model_type, min_seq_number, min_sim_lenght)[source]: Plot the fitted curve of nucleotide substitution rate vs. observed substitution rate.

simplicity.plots_manager.plot_R_effective(experiment_name, seeded_simulation_output_dir, window_size, threshold)[source]

Create a two-panel plot:
- Top: population-level Rₑ and infection histogram
- Bottom: lineage-specific Rₑ (filtered by frequency threshold)

Data is loaded or computed from CSV using output_manager functions.

simplicity.plots_manager.plot_circular_tree(ete_root, tree_type, colormap_df, individuals_lineages, file_path)[source]

simplicity.plots_manager.plot_combined_OSR_fit(experiment_name, fit_result, model_type, min_seq_number, min_sim_lenght)[source]

Plot the fit of nucleotide substitution rate (NSR) vs. observed substitution rate (OSR).

Includes three views of the same data and fit:
1. Linear scale
2. Semi-log scale (log x-axis)
3. Log-log scale (log x- and y-axis)

Parameters:

experiment_name (str) – Name of the experiment for reading data and saving plots.
fit_result (lmfit.model.ModelResult) – Fitted model result containing .best_fit.
model_type (str) – Type of model used (used in file name).
min_seq_number (int) – Minimum number of sequences for inclusion.
min_sim_lenght (int) – Minimum simulation length for inclusion.

simplicity.plots_manager.plot_combined_OSR_vs_parameter(experiment_name, parameter, min_seq_number=0, min_sim_lenght=0)[source]

Plot observed substitution rate (OSR) against the desired simulation parameter.

This figure shows the distribution of OSR values across different values of the specified simulation parameter using a boxplot.

Parameters:

experiment_name (str) – Name of the experiment to fetch output data and save plots.
parameter (str) – Parameter (e.g., ‘nucleotide_substitution_rate’) to group OSR values by.
min_seq_number (int, optional) – Minimum number of sequences required for inclusion (default is 0).
min_sim_lenght (int, optional) – Minimum simulation duration to include (default is 0).

simplicity.plots_manager.plot_combined_tempest_regressions(experiment_name, parameter, min_seq_number=0, min_sim_lenght=0, individual_type=None, y_axis_max=0.01)[source]

Plot a grid of tempest regressions for each simulation, grouped by a parameter value.

Each subplot shows the observed substitution rate (OSR) regression for a simulation with a specific value of the given parameter.

Parameters:

experiment_name (str) – Name of the experiment to retrieve simulation outputs.
parameter (str) – Parameter name to sort simulation outputs by (e.g., ‘nucleotide_substitution_rate’).
min_seq_number (int, optional) – Minimum number of sequences required for regression (default is 0).
min_sim_lenght (int, optional) – Minimum simulation length for inclusion (default is 0).
y_axis_max (float, optional) – Maximum y-axis value for all plots (default is 0.01).

simplicity.plots_manager.plot_comparison_intra_host_models(experiment_name, intra_host_model)[source]

Compare intra-host model dynamics for standard vs. long-shedder individuals.

Two curves are plotted:
- One for typical infectious period (standard)
- One for long shedders (immunocompromised)

Parameters:

experiment_name (str) – Name of the experiment used to save the plot.
intra_host_model (IntraHostModel) – The intra-host model instance with matrix and solver methods.

simplicity.plots_manager.plot_effective_theoretical_diagnosis_rate(experiment_name, individual_type)[source]

Plot effective vs. theoretical diagnosis rate as a scatter plot with regression fit.

For each simulation, the plot compares the user-specified (theoretical) diagnosis rate with the effective diagnosis rate observed from the output, including standard deviation as vertical error bars.

Parameters:: experiment_name (str) – Name of the experiment used to retrieve simulation outputs and save the plot.

simplicity.plots_manager.plot_extrande_pop_runtime(extrande_pop_runtime_csv)[source]

Plot the number of infected individuals versus simulation runtime.

This figure is generated from a CSV file containing runtime profiling output.

Parameters:: extrande_pop_runtime_csv (str) – Path to the CSV file with two columns: runtime (s) and infected count.

simplicity.plots_manager.plot_fitness(simulation_output)[source]

Plot the average fitness trajectory of a simulation over time.

This figure displays the mean fitness and its standard deviation (as a shaded area) across time steps of the simulation.

Parameters:: simulation_output (object) – A simulation output object containing a .fitness_trajectory attribute, which is a list of (time, (mean, std)) tuples.

simplicity.plots_manager.plot_heatmap_R_diagnosis_rate(experiment_name)[source]

Plot a heatmap showing the relationship between R and theoretical diagnosis rate.

The heatmap values represent the ratio of effective to theoretical diagnosis rate, across all simulation outputs within the experiment.

Parameters:: experiment_name (str) – Name of the experiment used to retrieve simulation output and save the plot.

simplicity.plots_manager.plot_histograms(experiment_name, final_times_data_frames, r_order=None)[source]

Plot histograms of final times across multiple seeded simulations.

Parameters:

experiment_name (str) – Name of the experiment used for organizing output.
final_times_data_frames (pd.DataFrame) – DataFrame where each column contains final time values from different conditions or folders.
r_order (list of float, optional) – Optional list to reorder plots based on their corresponding R values.

simplicity.plots_manager.plot_infection_tree(root, infection_tree_data, tree_subtype, coloring, colormap_df, tree_plot_filepath)[source]

simplicity.plots_manager.plot_infections_hist(individuals_df, ax, colormap_df, bin_size)[source]

Plot stacked histogram of infections per lineage over time (colored by infecting lineage).

Parameters:

individuals_df (pd.DataFrame) – DataFrame containing individual infection data with ‘t_infection’ and ‘inherited_lineage’.
ax (matplotlib.axes.Axes) – Axis object on which to draw the histogram.
colormap_df (pd.DataFrame) – DataFrame mapping lineage names to colors.
bin_size (int) – Width of the time bins (in days) for the histogram.

simplicity.plots_manager.plot_intra_host(experiment_name, intra_host_model, time, step)[source]

Plot intra-host model solutions over time for all initial states.

Displays curves for the probability of being infectious at time t for all possible initial states (0 to 20), with a continuous color mapping.

Parameters:

experiment_name (str) – Name of the experiment used for saving the plot.
intra_host_model (IntraHostModel) – Intra-host model object with _data_plot_model(state, time, step) method.
time (float) – Maximum time to simulate.
step (float) – Time step resolution.

simplicity.plots_manager.plot_lineages_colors_tab(seeded_simulation_output_dir)[source]

Plot a labeled color reference for all lineages in the simulation.

Each lineage is represented by a colored circle and its label, displayed in a vertical table format. Colors are assigned based on the order of emergence using a specified colormap.

The figure is saved to the experiment’s simulation plots directory using the seeded simulation folder and seed as part of the filename.

Parameters:: seeded_simulation_output_dir (str) – Path to the seeded simulation output directory. Used to read lineage data and determine the output save location.

simplicity.plots_manager.plot_phylogenetic_tree(root, phylogenetic_data, tree_subtype, coloring, colormap_df, tree_plot_filepath)[source]

simplicity.plots_manager.plot_simulation(seeded_simulation_output_dir, threshold)[source]

Plot a combined simulation summary figure.

This multi-panel figure includes:
1. System trajectory: Number of infected individuals over time.
2. Lineage frequencies over time (filtered by threshold).
3. Average fitness score with standard deviation.

Parameters:

seeded_simulation_output_dir (str) – Path to the seeded simulation output directory.
threshold (float) – Relative frequency threshold for filtering lineages (e.g., 0.05 = 5%).

simplicity.plots_manager.plot_tempest_regression(sequencing_data_df, fitted_tempest_regression, ax)[source]

Plot the tempest regression for observed substitution rates (OSR).

This plot shows the linear regression of root-to-tip distance over time, with the observed substitution rate (OSR) shown as the slope. The OSR value is also included in the legend.

Parameters:

sequencing_data_df (pandas.DataFrame) – DataFrame with ‘Sequencing_time’ and ‘Distance_from_root’ columns.
fitted_tempest_regression (sklearn.linear_model.LinearRegression) – Fitted regression model for OSR estimation.
ax (matplotlib.axes.Axes) – The matplotlib axis object to draw the plot on.
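
The quantity being plotted is the slope of an ordinary least-squares line through (sequencing time, root-to-tip distance) points. A dependency-free sketch of that fit is shown below; the documented function expects an already fitted sklearn LinearRegression, so the helper name fit_root_to_tip and the toy data are purely illustrative.

```python
def fit_root_to_tip(times, distances):
    """Ordinary least squares of distance on time; returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, distances))
    slope = sxy / sxx
    return slope, mean_d - slope * mean_t


# Toy data lying exactly on a line; the slope plays the role of the OSR.
osr, intercept = fit_root_to_tip([0, 10, 20, 30], [0.001, 0.003, 0.005, 0.007])
```

The slope has units of distance per unit time, so with distances in substitutions per site and time in days it reads as substitutions per site per day.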

simplicity.plots_manager.plot_trajectory(seeded_simulation_output_dir)[source]

Plot the population trajectory of a SARS-CoV-2 simulation.

This figure shows the evolution of three compartments over time:
- Susceptibles
- Infected individuals
- Diagnosed individuals (cumulative)

Parameters:: seeded_simulation_output_dir (str) – Path to the seeded simulation output directory.

simplicity.plots_manager.plot_w_t(t_max, t_eval, experiment_name)[source]

Plot the time-dependent weight function w(t) used in the phenotype model.

The weight function represents the relative importance of sequences at different time points in computing the consensus used for fitness estimation.

Parameters:

t_max (float) – Maximum simulation time.
t_eval (float) – Current simulation time at which the weight curve is evaluated.

simplicity.plots_manager.save_plos_figure(fig, filepath)[source]: Save figure as high-resolution TIFF (PLOS compatible).

simplicity.plots_manager.tree_fitness_legend(tree_data, tree_type, tree_plot_filepath)[source]

Create a legend for the fitness colors in plotted trees.

Parameters:

tree_data – phylogenetic data OR individuals data
tree_type (str) – ‘infection’ or ‘phylogenetic’

simplicity.population module

Created on Tue Jun 6 13:13:14 2023

@author: pietro

class simplicity.population.Population(size, I_0, ih_model_parameters, rng3, rng4, rng5, rng6, NSR_long, long_shedders_ratio=0, sequence_long_shedders=False, reservoir=100000)[source]

Bases: object

The class defines a population for the SIMPLICITY simulations. It contains the data about every individual as well as their intra-host model.

__init__(size, I_0, ih_model_parameters, rng3, rng4, rng5, rng6, NSR_long, long_shedders_ratio=0, sequence_long_shedders=False, reservoir=100000)[source]

fitness_trajectory_to_df()[source]

get_lineage_genome(lineage_name)[source]: Fetch lineage genome from lineage name

individuals_data_to_df()[source]

lineage_frequency_to_df()[source]

phylogenetic_data_to_df()[source]

update_fitness_trajectory()[source]

update_ih_lineages_trajectories()[source]

update_lineage_frequency_t(t)[source]: Count how many individuals are infected by each lineage at time t in the population and store the following information: time, lineage_name, frequency.
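As a rough illustration of the counting step, the sketch below tallies infected individuals per lineage and produces one record per lineage. The function name, the flat list input, and the choice to store the raw count under "frequency" are assumptions made for the example; the Population class keeps richer per-individual data.

```python
from collections import Counter


def lineage_frequency_at(t, infected_lineages):
    """Count infected individuals per lineage and return one record per lineage.

    infected_lineages is a hypothetical, simplified input: one lineage name
    per infected individual.
    """
    counts = Counter(infected_lineages)
    return [
        {"time": t, "lineage_name": name, "frequency": count}
        for name, count in sorted(counts.items())
    ]


# Invented lineage names, for illustration only.
records = lineage_frequency_at(5.0, ["wt", "wt", "lineage_1", "wt", "lineage_2"])
```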

update_states(delta_t)[source]

update_time(time)[source]

update_trajectory()[source]

simplicity.population.create_population(parameters)[source]: Create population instance from parameters file and return it.

simplicity.population_model module

Created on Fri Mar 28 12:51:33 2025

@author: pietro

simplicity.population_model.SIDR_propensities(population, beta_standard, beta_long, k_ds, k_dl, k_v, seq_rate)[source]

simplicity.population_model.add_lineage(population)[source]: Adds a new lineage to a randomly selected infected individual, duplicating an existing lineage or replacing one if at max capacity.

simplicity.population_model.diagnosis(population, from_long_shedder, seq_rate=0)[source]: Select an infected (and detectable) individual at random and tag it as “diagnosed”. Update the infected and diagnosed compartments, as well as detectable_i, infectious_i, infected_i and diagnosed_i.

simplicity.population_model.infect_long_shedder(population, new_infected_index)[source]

simplicity.population_model.infection(population, from_long_shedder=False)[source]: Select a random susceptible individual to be infected and tag it as such. Update the compartments and infected_i.

simplicity.random_gen module

simplicity.random_gen.randomgen(seed=None)[source]

simplicity.runme module

@author: pietro

STANDARD_VALUES for SIMPLICITY simulation:

“population_size”: 1000
“infected_individuals_at_start”: 100
“R”: 1.5
“k_d”: 0.0055
“k_v”: 0.0085
“e”: 0.0017 (evolutionary rate)
“final_time”: 365*3
“max_runtime”: 300
“phenotype_model”: ‘immune waning’ or ‘distance from wt’
“sequencing_rate”: 0.05
“seed”: None
“F”: 1.25

To change any of these values, specify them in the parameters dictionary below. For each parameter, give a list of the values to use in the simulations. To change more than one parameter at a time, enter the same number of values for each parameter, e.g.:

par 1 = [value1, value2]
par 2 = [value3, value4]

This will run one simulation with par 1 = value1 and par 2 = value3, and another with par 1 = value2 and par 2 = value4.

Each simulation will be repeated n_seeds times, each time with a different random seed.

The set of all simulations is what we call an experiment.
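The positional pairing described above can be sketched as follows. The helper pair_parameter_values is hypothetical, the non-standard parameter values are invented, and using 0 … n_seeds-1 as seeds is an assumption made only to show how the number of runs scales.

```python
def pair_parameter_values(parameters):
    """Turn {"name": [v1, v2, ...]} into one dict per simulation, pairing values by position."""
    lengths = {len(values) for values in parameters.values()}
    if len(lengths) > 1:
        raise ValueError("every parameter needs the same number of values")
    names = list(parameters)
    return [dict(zip(names, combo)) for combo in zip(*parameters.values())]


# Two parameters with two values each give two simulations, not four.
parameters = {"R": [1.5, 2.0], "sequencing_rate": [0.05, 0.1]}
simulations = pair_parameter_values(parameters)

# Each simulation is repeated once per seed (seed values are an assumption).
n_seeds = 3
runs = [(settings, seed) for settings in simulations for seed in range(n_seeds)]
```

Here the experiment would consist of 2 parameter sets × 3 seeds = 6 runs.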

simplicity.runme.run_experiment(experiment_name: str, set_experiment_parameters: LambdaType, simplicity_runner: ModuleType, archive_experiment=False)[source]

simplicity.runme.user_set_experiment_settings()[source]

simplicity.settings_manager module

Created on Tue Aug 27 19:38:43 2024

@author: pietro

simplicity.settings_manager.check_parameters_names(parameters_dic)[source]

simplicity.settings_manager.generate_experiment_settings(varying_params: dict, fixed_params: dict = None)[source]

Generates a list of parameter combinations from varying and fixed parameters.

Parameters:

varying_params (dict) – Parameters for which all combinations should be generated.
fixed_params (dict) – Parameters that should have the same value across all combinations.

Returns:

A list of dictionaries with combined parameter sets.

Return type:

List[dict]
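A minimal sketch of how such a list could be built with a Cartesian product is shown below. It is an illustration under assumptions, not the package's code: the function name sketch_experiment_settings and the example values are hypothetical, and the real function may order or validate parameters differently.

```python
from itertools import product


def sketch_experiment_settings(varying_params, fixed_params=None):
    """Return one dict per combination of varying values, each merged with the fixed values."""
    fixed = dict(fixed_params or {})
    names = list(varying_params)
    settings = []
    for combo in product(*(varying_params[name] for name in names)):
        entry = dict(fixed)              # start from the shared values
        entry.update(zip(names, combo))  # add this combination's values
        settings.append(entry)
    return settings


settings = sketch_experiment_settings(
    {"R": [1.5, 2.0], "sequencing_rate": [0.05, 0.1]},
    {"population_size": 1000},
)
```

Two varying parameters with two values each yield four parameter sets, all sharing the same fixed population_size.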

simplicity.settings_manager.generate_filename_from_params(params: dict)[source]

simplicity.settings_manager.get_experiment_settings_file_path(experiment_name)[source]

simplicity.settings_manager.get_n_seeds_file_path(experiment_name)[source]

simplicity.settings_manager.get_n_seeds_from_experiment_settings(experiment_numbered_name)[source]: Reads n_seeds from the specific setting JSON file.

simplicity.settings_manager.get_parameter_specs_file_path()[source]

simplicity.settings_manager.get_parameter_value_from_simulation_output_dir(simulation_output_dir, parameter)[source]

simplicity.settings_manager.get_seeded_simulation_parameters_paths(experiment_name)[source]

Retrieves all seeded simulation parameter file paths for a given experiment.

This function searches through the directory structure of the provided experiment name, looking for all JSON seeded simulation parameters files. It returns a list of full paths to these files.

Parameters:

experiment_name (str) – The name of the experiment for which seeded simulation parameter paths are to be retrieved.

Returns:

A list of file paths, each pointing to a JSON file containing seeded simulation parameters.

Return type:

list of str

Example

If experiment_name is ‘Experiment_1’, and the directory structure contains multiple JSON files under

‘/data_dir/Experiment_1/03_Seeded_simulation_parameters’,

the function will return a list like:

[
‘/data_dir/Experiment_1/03_Seeded_simulation_parameters/subdir/file1.json’,
‘/data_dir/Experiment_1/03_Seeded_simulation_parameters/subdir/file2.json’
]
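The search described above can be approximated with a recursive glob. The sketch below is an assumption about one reasonable way to do it, not the package's implementation; find_seeded_parameter_files is a hypothetical helper, and a temporary directory stands in for the data directory.

```python
import tempfile
from pathlib import Path


def find_seeded_parameter_files(seeded_parameters_dir):
    """Return the sorted full paths of all .json files below the given directory."""
    return sorted(str(path) for path in Path(seeded_parameters_dir).rglob("*.json"))


# Build a throwaway directory tree mirroring the example layout.
with tempfile.TemporaryDirectory() as data_dir:
    root = Path(data_dir) / "Experiment_1" / "03_Seeded_simulation_parameters"
    (root / "subdir").mkdir(parents=True)
    for name in ("file1.json", "file2.json"):
        (root / "subdir" / name).write_text("{}")

    found = find_seeded_parameter_files(root)
    found_names = [Path(p).name for p in found]
```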

simplicity.settings_manager.get_simulation_parameters_filepath_of_simulation_output_dir(simulation_output_dir)[source]

simplicity.settings_manager.get_standard_parameters_values_file_path()[source]

simplicity.settings_manager.read_OSR_NSR_regressor_parameters()[source]

simplicity.settings_manager.read_experiment_settings(experiment_name)[source]

simplicity.settings_manager.read_n_seeds_file(experiment_name)[source]

simplicity.settings_manager.read_parameter_specs()[source]

simplicity.settings_manager.read_seeded_simulation_parameters(experiment_name, seeded_simulation_parameters_path)[source]

simplicity.settings_manager.read_settings_and_write_simulation_parameters(experiment_name)[source]

Reads an experiment settings file in JSON format and generates individual simulation parameter files based on the combinations of parameters in the settings. The function will create a separate JSON file for each parameter combination within a directory named after the experiment.

Parameters:

experiment_name (str) – The name of the experiment. This is used to locate the settings file and to create the corresponding simulation parameters directory.

simplicity.settings_manager.read_standard_parameters_values()[source]

simplicity.settings_manager.read_user_set_parameters_file(filename)[source]

simplicity.settings_manager.write_experiment_settings(experiment_name: str, experiment_settings: list, n_seeds: int)[source]

Writes experiment settings (a list of parameter dictionaries) to a JSON file.

Parameters:

experiment_name (str) – Name of the experiment (used for output folder).
experiment_settings (list) – List of parameter dictionaries.
n_seeds (int) – Number of random seeds to be stored separately.

simplicity.settings_manager.write_parameter_specs()[source]

simplicity.settings_manager.write_seeded_simulation_parameters(experiment_name: str)[source]

Generates multiple JSON files with different seeds for each simulation parameter file within a specified experiment. The function reads the original simulation parameter files, adds a ‘seed’ field, and writes the modified files to subdirectories named after the original files.

Parameters:

experiment_name (str) – The name of the experiment.
n_seeds (int) – The number of seeded JSON files to generate for each simulation parameter file.

Directory Structure:

Data/
└── experiment_name/
    ├── 02_Simulation_parameters/
    │   ├── param_file_1.json
    │   ├── param_file_2.json
    │   └── …
    └── 03_Seeded_simulation_parameters/
        ├── param_file_1/
        │   ├── seed_0.json
        │   ├── seed_1.json
        │   └── …
        ├── param_file_2/
        │   ├── seed_0.json
        │   ├── seed_1.json
        │   └── …
        └── …
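The directory structure above could be produced along the following lines. This is a sketch under assumptions: write_seeded_copies is a hypothetical helper that takes explicit directories and n_seeds as arguments, and it uses 0 … n_seeds-1 as seed values to match the seed_0.json, seed_1.json file names; the package function takes only experiment_name and may choose seeds differently.

```python
import json
import tempfile
from pathlib import Path


def write_seeded_copies(simulation_parameters_dir, seeded_parameters_dir, n_seeds):
    """For each parameter file, write n_seeds copies that add a 'seed' field."""
    written = []
    for param_file in sorted(Path(simulation_parameters_dir).glob("*.json")):
        parameters = json.loads(param_file.read_text())
        # One subdirectory per original file, named after it (without .json).
        target_dir = Path(seeded_parameters_dir) / param_file.stem
        target_dir.mkdir(parents=True, exist_ok=True)
        for seed in range(n_seeds):
            seeded = dict(parameters, seed=seed)
            target = target_dir / f"seed_{seed}.json"
            target.write_text(json.dumps(seeded, indent=2))
            written.append(target)
    return written


# Demonstrate on a temporary directory standing in for Data/.
with tempfile.TemporaryDirectory() as data_dir:
    experiment_dir = Path(data_dir) / "experiment_name"
    sim_dir = experiment_dir / "02_Simulation_parameters"
    seeded_dir = experiment_dir / "03_Seeded_simulation_parameters"
    sim_dir.mkdir(parents=True)
    (sim_dir / "param_file_1.json").write_text(json.dumps({"R": 1.5}))

    written = write_seeded_copies(sim_dir, seeded_dir, n_seeds=2)
    relative = [p.relative_to(seeded_dir).as_posix() for p in written]
    first = json.loads(written[0].read_text())
```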

simplicity.settings_manager.write_simulation_parameters(file_path, population_size, long_shedders_ratio, tau_1, tau_2, tau_3, tau_3_long, tau_4, infected_individuals_at_start, R, R_long, nucleotide_substitution_rate_long, diagnosis_rate_standard, diagnosis_rate_long, IH_virus_emergence_rate, nucleotide_substitution_rate, final_time, max_runtime, phenotype_model, sequencing_rate, sequence_long_shedders, seed)[source]

simplicity.settings_manager.write_standard_parameters_values()[source]

simplicity.settings_manager.write_user_set_parameters_file(user_set_parameters, filename)[source]

simplicity.simulation module

@author: pietro
@author: jbescudie

class simplicity.simulation.Simplicity(parameters, output_directory, sim_id)[source]

Bases: object

__init__(parameters, output_directory, sim_id)[source]

plot()[source]

run()[source]

save_consensus()[source]
