phenonaut.data package
Submodules
phenonaut.data.dataset module
- class phenonaut.data.dataset.Dataset(dataset_name: str, input_file_path_or_df: Path | str | DataFrame | None = None, metadata: dict | Path | str = {}, kind: str | None = None, init_hash: str | bytes | None = None, h5_key: str | None = None, features: list[str] | None = None)
Bases:
object
Dataset constructor
Dataset holds source agnostic datasets, read in using hints from a user configurable YAML file describing the input CSV file format and indicating key columns.
- Parameters:
dataset_name (str) – Dataset name
input_file_path_or_df (Union[Path, str, pd.DataFrame]) – Location of the input CSV/TSV/H5 file to be read, or a pd.DataFrame to use. If None, then an empty DataFrame object is returned. As well as CSV/TSV files, the location of an h5 file containing a pandas DataFrame may be given. If an h5 file is given, then an h5_key argument is expected.
metadata (Union[dict, Path, str]) – Dictionary or path to a YAML file describing the CSV file format and key columns. A ‘sep’ key:value pair may be supplied; if absent, the first line of the file is examined and, if a TAB character is present, the TAB character is assumed to delimit values. This check is not performed if a ‘sep’ key is found in metadata, allowing a simple way to override it. By default {}.
kind (Optional[str]) – Instead of providing metadata, presets are available which make reading in formats such as DRUG-Seq easier, without the need to explicitly set all required transforms. If used together with metadata, the preset metadata dictionary from the kind argument is loaded first, then updated with anything in the metadata dictionary, allowing specific presets within kind dictionaries to be overridden. Available ‘kind’ dictionaries may be listed by examining: phenonaut.data.recipes.recipes.keys()
init_hash (Optional[Union[str, bytes]]) – Cryptographic hashing within Phenonaut Datasets can be initialised with a starting/seed hash. This is useful in the creation of blockchain-like chains of hashes. In environments where timestamping is unavailable, hashes may be published and then used as input to subsequent experiments, building up a provable chain along the way. By default None, implying an empty bytes array.
h5_key (Optional[str]) – If input_file_path is an h5 file, then a key to access the target DataFrame must be supplied.
features (Optional[list[str]]) – Features may be supplied here which are then added to the metadata dict if supplied.
- Raises:
FileNotFoundError – Input CSV file not found
DataError – Metadata could not be used to parse input CSV
- add_well_id(numerical_column_name: str = 'COLUMN', numerical_row_name: str = 'ROW', plate_type: int = 384, new_well_column_name: str = 'Well', add_empty_wells: bool = False, plate_barcode_column: str | None = None, no_sort: bool = False)
Add standard well IDs - such as A1, A2, etc.
If a dataset contains numerical row and column names, then they may be translated into standard letter-number well IDs.
- Parameters:
numerical_column_name (str, optional) – Name of column containing numeric column number, by default “COLUMN”.
numerical_row_name (str, optional) – Name of column containing numeric row number, by default “ROW”.
plate_type (int, optional) – Plate type - note, at present, only 384 well plate format is supported, by default 384.
new_well_column_name (str, optional) – Name of new column containing letter-number well ID, by default “Well”.
add_empty_wells (bool, optional) – Should all wells from a plate be inserted, even when missing from the data, by default False.
plate_barcode_column (str, optional) – Multiple plates may be in a dataset, this column contains their unique ID, by default None.
no_sort (bool, optional) – Do not resort the dataset by well ID, by default False.
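The row/column to well ID translation can be sketched with plain pandas (an illustrative sketch, not phenonaut's implementation; the column names mirror the defaults above):

```python
# Translate numeric ROW/COLUMN values into letter-number well IDs,
# as add_well_id does for plate formats up to 26 rows.
import string

import pandas as pd

df = pd.DataFrame({"ROW": [1, 1, 2], "COLUMN": [1, 2, 1]})
df["Well"] = [
    f"{string.ascii_uppercase[row - 1]}{col}"
    for row, col in zip(df["ROW"], df["COLUMN"])
]
# df["Well"] -> ["A1", "A2", "B1"]
```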
- copy()
Return a deep copy of the Dataset object
- Returns:
Copy of the input object.
- Return type:
PhenonautData
- property data
Return self.df[self.features]
- Returns:
DataFrame containing only features and index
- Return type:
pd.DataFrame
- df_to_csv(output_path: Path | str, **kwargs)
Write DataFrame to CSV
Convenience function to write the underlying DataFrame to a CSV. Additional arguments will be passed to the Pandas.DataFrame.to_csv function.
- Parameters:
output_path (Union[Path, str]) – Target output file
- df_to_multiple_csvs(split_by_column: str, output_dir: str | Path | None = None, file_prefix: str = '', file_suffix='', file_extension='.csv', **kwargs)
Write multiple CSV files from a dataset DataFrame.
In the case where one output CSV is required per plate, then splitting the underlying DataFrame on something like a PlateID serves the purpose of generating one output CSV file per plate. This can be achieved with this function and providing the column to split on.
- Parameters:
split_by_column (str) – Column containing unique values within a split output CSV file
output_dir (Optional[Union[str, Path]], optional) – Target output directory for split CSV files, by default None
file_prefix (str, optional) – Prefix for split CSV files, by default “”
file_suffix (str, optional) – Suffix for split CSV files, by default “”
file_extension (str, optional) – File extension for split CSV files, by default “.csv”
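The split-and-write behaviour can be sketched with plain pandas (in-memory buffers stand in for files on disk, and the ‘PlateID’ column name is illustrative):

```python
# Split a DataFrame on a column and write one CSV per unique value,
# mirroring df_to_multiple_csvs splitting on a plate ID column.
import io

import pandas as pd

df = pd.DataFrame({"PlateID": ["P1", "P1", "P2"], "feat_1": [0.1, 0.2, 0.3]})
csvs = {}
for plate_id, sub_df in df.groupby("PlateID"):
    buffer = io.StringIO()  # stands in for an output file
    sub_df.to_csv(buffer, index=False)
    csvs[f"screen_{plate_id}.csv"] = buffer.getvalue()
# csvs now holds one CSV string per unique PlateID
```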
- distance_df(candidate_dataset: Dataset, metric: str | Callable = 'euclidean', return_best_n_indexes_and_score: int | None = None, lower_is_better=True) DataFrame
Generate a distance DataFrame
Distance DataFrames allow simple generation of pd.DataFrames where the index takes the form of perturbations and the columns other perturbations. The values at the intersections are therefore the distances between these perturbations in feature space. Many different metrics, both inbuilt and custom/user defined, may be used.
- Parameters:
candidate_dataset (Dataset) – The dataset to which the query (this) should be compared.
metric (Union[str, Callable], optional) – Metric which should be used for the distance calculation. May be a simple string understood by scipy.spatial.distance.cdist, or a callable, like a function or lambda accepting two vectors representing query and candidate features. By default “euclidean”.
return_best_n_indexes_and_score (Optional[int], optional) – If an integer is given, then just that number of best pairs/measures are returned. By default None.
lower_is_better (bool, optional) – If using the above ‘return_best_n_indexes_and_score’ then it needs to be flagged if lower is better (default), or higher is better. By default True
- Returns:
Returns a distance DataFrame, unless ‘return_best_n_indexes_and_score’ is an int, in which case a list of the top scoring pairs is returned in the form of a nested tuple: ((from, to), score)
- Return type:
Union [pd.DataFrame, tuple(tuple(int, int), float)]
- Raises:
ValueError – Error raised if this Dataset and the given candidate Dataset do not share common features.
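The all-vs-all distance matrix that distance_df builds can be sketched with NumPy alone (rows are query perturbations, columns are candidate perturbations):

```python
# Pairwise Euclidean distances between query and candidate feature vectors.
import numpy as np

query = np.array([[0.0, 0.0], [1.0, 1.0]])      # 2 query feature vectors
candidate = np.array([[3.0, 4.0], [0.0, 1.0]])  # 2 candidate feature vectors
distances = np.linalg.norm(
    query[:, None, :] - candidate[None, :, :], axis=-1
)
# distances[0, 0] is the distance from query 0 to candidate 0 (5.0)
```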
- divide_mean(query: str) None
Divide dataset features by the mean of rows identified in the query
Useful function for normalising to controls.
- Parameters:
query (str) – Pandas style query to retrieve rows from which means are calculated.
- divide_median(query: str) None
Divide dataset features by the median of rows identified in the query
Useful function for normalising to controls.
- Parameters:
query (str) – Pandas style query to retrieve rows from which medians are calculated.
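The control-normalisation this performs can be sketched with plain pandas (the ‘cpd’ column and ‘DMSO’ label are illustrative):

```python
# Divide a feature by the median of rows matched by a pandas-style query,
# mirroring divide_median used for normalising to controls.
import pandas as pd

df = pd.DataFrame({"cpd": ["DMSO", "DMSO", "drugA"], "feat_1": [2.0, 4.0, 9.0]})
control_median = df.query("cpd == 'DMSO'")["feat_1"].median()  # 3.0
df["feat_1"] = df["feat_1"] / control_median
```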
- drop_columns(column_labels: str | list[str] | tuple[str], reason: str | None = None) None
Drop columns inplace, update features if needed and set new history.
Intelligently drop columns from the dataset (inplace). If any of those columns were listed as features, then remove them from the features list and record a new history entry. Updating features and recording new history only happens if needed (a removed column was a feature). Updating of features will cause a hash update.
- Parameters:
column_labels (Union[str, list[str], tuple[str]]) – List of column labels which should be removed. Can also be a str to remove just one column.
reason (Optional[str]) – A reason may be given for dropping the column. If not None and the column was a feature, then this reason is recorded along with the history. If None, and the column was a feature, then the history entry will state: “Dropped columns ({column_labels})” where column_labels contains the dropped columns. If reason is not None and the column is a feature, then the history entry will state: “Dropped columns ({column_labels}), reason:{reason}” where {reason} is the given reason. Has no effect if the dropped column is not a feature, or the list of dropped columns does not contain a feature. By default None.
- drop_nans_with_cutoff(axis: int | None = None, nan_cutoff: float = 0.1) None
Drop rows or columns containing NaN or Inf values above a specified cutoff percentage.
- Parameters:
axis (Optional[int], optional) – Axis along which to drop NaN or Inf values. If None, both rows and columns are dropped. By default None.
nan_cutoff (float, optional) – Cutoff percentage for NaN or Inf values. Rows or columns with NaN or Inf percentages greater than this value will be dropped. By default 0.1.
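The column-dropping case can be sketched with plain pandas (the row case along the other axis is analogous):

```python
# Drop columns whose NaN-or-Inf fraction exceeds the cutoff.
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.inf, np.nan], "b": [1.0, 2.0, 3.0]})
nan_cutoff = 0.1
# treat Inf like NaN, then compute the per-column bad-value fraction
bad_fraction = df.replace([np.inf, -np.inf], np.nan).isna().mean()
df = df.loc[:, bad_fraction <= nan_cutoff]
# only "b" survives: "a" is 2/3 NaN-or-Inf, above the 0.1 cutoff
```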
- drop_rows(row_indices: Index) None
Drop rows inplace given a set of indices.
Intelligently drop rows from the dataset (inplace). Updating of rows will not cause hash update as features are unchanged.
- Parameters:
row_indices (pd.Index) – List of row indexes which should be removed. Can also be an int to remove just one row.
- Raises:
KeyError – Error raised if the index is missing from the dataframe index.
- property features
Return current dataset features
- Returns:
List of strings containing current features
- Return type:
list
- filter_columns(column_names: list, keep=True, regex=False)
Filter dataframe columns
- Parameters:
column_names (list) – Column names
keep (bool, optional) – Keep columns listed in column_names, if false, then the opposite happens and these columns are removed, by default True.
- filter_columns_with_prefix(column_prefix: str | list, keep: bool = False)
Filter columns based on prefix
- Parameters:
column_prefix (Union[str, list]) – Prefix for columns as a string, or alternatively, a list of string prefixes
keep (bool, optional) – If true, only columns matching the prefix are kept, if false, these columns are removed, by default False
- filter_inplace(query: str) None
Apply a pandas query style filter, keeping all that pass
- Parameters:
query (str) – Pandas style query which when applied to rows, will keep all those which return True.
- filter_rows(query_column: str, values: list | str, keep: bool = True)
Filter dataframe rows
- Parameters:
query_column (str) – Column name which is being filtered on
values (Union[list, str]) – List or string of values to be filtered on
keep (bool, optional) – If true, then only rows containing listed values in query column are kept. If this argument is false, then the opposite occurs, and the rows matching are discarded, by default True
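The keep/discard behaviour can be sketched with plain pandas (the ‘cpd’ column and values are illustrative):

```python
# Keep or discard rows whose query column matches a list of values.
import pandas as pd

df = pd.DataFrame({"cpd": ["DMSO", "drugA", "drugB"], "feat_1": [1.0, 2.0, 3.0]})
mask = df["cpd"].isin(["DMSO", "drugA"])
kept = df[mask]        # keep=True behaviour
discarded = df[~mask]  # keep=False behaviour
```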
- get_df_features_perturbation_column(quiet: bool = False) tuple[DataFrame, list[str], str | None]
Helper function to obtain DataFrame, features and perturbation column name.
Some Phenonaut functions allow passing of a Phenonaut object, or DataSet. They then access the underlying pd.DataFrame for calculations. This helper function is present on Phenonaut objects and Dataset objects, allowing more concise code and less replication when obtaining the underlying data.
- Parameters:
quiet (bool) – When checking if perturbation is set, check without inducing a warning if it is None.
- Returns:
Tuple containing the Dataframe, a list of features and the perturbation column name.
- Return type:
tuple[pd.DataFrame, list[str], str]
- get_ds_from_query(name: str, query: str)
Make a new Dataset object from a pandas style query.
- Parameters:
name (str) – Name of new dataset
query (str) – Pandas style query from which all rows returning true will be included into the new PhenonautGenericData set object.
- Returns:
New dataset created from query
- Return type:
Dataset
- get_feature_ranges(pad_by_percent: float | int) tuple
Get the ranges of feature columns
- Parameters:
pad_by_percent (Union[float, int]) – Optionally, pad the ranges by a percent value
- Returns:
Returns tuple of tuples with shape (features, 2), for example, with two features it would return: ((min_f1, max_f1), (min_f2, max_f2))
- Return type:
tuple
- get_history() list[TransformationHistory]
Get dataset history
- Returns:
List of TransformationHistory (named tuples) which contain a list of features as the first element and then a plain text description of what was applied to arrive at those features as the second element.
- Return type:
list[TransformationHistory]
- get_non_feature_columns() list
Get columns which are not features
- Returns:
Returns list of Dataset columns which are not currently features.
- Return type:
list[str]
- get_unique_perturbations()
- groupby(by: str | List[str])
Returns multiple new Dataset objects by splitting on columns
Akin to performing groupby on a pd.DataFrame, split a dataset on one or many columns and return a list of Phenonaut Datasets containing the information contained within each unique split.
- Parameters:
by (Union[str, list[str]]) – If a string, then this is used as a column name upon which to group the dataset and return unique classes based on this column. A list of strings is also allowed, enabling grouping of datasets by multiple columns, such as [‘timepoint’, ‘concentration’]
- Returns:
A list of new phenonaut.Dataset objects split on the value(s) of the by argument
- Return type:
List[phenonaut.Dataset]
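The underlying split can be sketched with plain pandas (phenonaut wraps each resulting subset in a new Dataset; the ‘timepoint’ column is illustrative):

```python
# Split a DataFrame into one subset per unique value of a column.
import pandas as pd

df = pd.DataFrame({"timepoint": [1, 1, 2], "feat_1": [0.1, 0.2, 0.3]})
subsets = {timepoint: sub_df for timepoint, sub_df in df.groupby("timepoint")}
# two subsets: timepoint 1 (two rows) and timepoint 2 (one row)
```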
- property history
Get dataset history
Returns the same as calling .get_history on the dataset
- Returns:
List of TransformationHistory (named tuples) which contain a list of features as the first element and then a plain text description of what was applied to arrive at those features as the second element.
- Return type:
list[TransformationHistory]
- impute_nans(groupby_col: str | list[str] | None = None, impute_fn: Callable | str | None = 'median') None
Impute missing values in the DataFrame.
- Parameters:
groupby_col (Optional[Union[str, list[str]]], optional) – The name(s) of the column(s) to group by when imputing missing values. If None, impute missing values across the entire DataFrame. By default None.
impute_fn (Union[Callable, str, None], optional) – The callable to use for imputing missing values on the DataFrame, or grouped DataFrame as defined by groupby_col. Special cases exist for ‘median’ and ‘mean’, whereby pd.median and pd.mean are applied. If None, then no action is taken. By default ‘median’.
- new_aggregated_dataset(identifier_columns: list[str], new_dataset_name: str = 'Merged rows dataset', transformation_lookup: dict[str, Callable | str] | None = None, tranformation_lookup_default_value: str | Callable = 'mean')
Merge dataset rows and make a new dataframe
If we have a pd.DataFrame containing data derived from 2 fields of view from a microscopy image, a sensible approach is averaging features. If we have the DataFrame below, we may merge FOV 1 and FOV 2, taking the mean of all features. As strings such as filenames should be kept, they are concatenated together, separated by a comma, unless the strings are the same, in which case just one is used.
Consider the following DataFrame:

| ROW | COLUMN | BARCODE | feat_1 | feat_2 | feat_3 | filename | FOV |
|-----|--------|---------|--------|--------|--------|-----------|-----|
| 1 | 1 | Plate1 | 1.2 | 1.2 | 1.3 | fileA.png | 1 |
| 1 | 1 | Plate1 | 1.3 | 1.4 | 1.5 | FileB.png | 2 |
| 1 | 1 | Plate2 | 5.2 | 5.1 | 5 | FileC.png | 1 |
| 1 | 1 | Plate2 | 6.2 | 6.1 | 6.8 | FileD.png | 2 |
| 1 | 2 | Plate1 | 0.1 | 0.2 | 0.3 | fileE.png | 1 |
| 1 | 2 | Plate1 | 0.2 | 0.2 | 0.38 | FileF.png | 2 |

Merging produces:

| ROW | COLUMN | BARCODE | feat_1 | feat_2 | feat_3 | filename | FOV |
|-----|--------|---------|--------|--------|--------|----------------------|-----|
| 1 | 1 | Plate1 | 1.25 | 1.3 | 1.40 | fileA.png,FileB.png | 1.5 |
| 1 | 1 | Plate2 | 5.70 | 5.6 | 5.90 | FileC.png,FileD.png | 1.5 |
| 1 | 2 | Plate1 | 0.15 | 0.2 | 0.34 | FileF.png,fileE.png | 1.5 |

Note that the FOV column has also been averaged.
- Parameters:
identifier_columns (list[str]) – If a biochemical assay evaluated through imaging is identified by a row, column, and barcode (for the plate), but multiple images are taken per well, then these multiple fields of view can be merged, creating averaged features.
new_dataset_name (str, optional) – Name for the new Dataset, by default “Merged rows dataset”
transformation_lookup (dict[str,Union[Callable, str]]) – Dictionary mapping data types to aggregations. When None, it is as if the dictionary: {np.dtype(“O”): lambda x: “,”.join([f”{item}” for item in set(x)])} was provided, concatenating strings together (separated by a comma) if they are different and just using one if they are the same across rows. If a type not present in the dictionary is encountered (such as int, or float) in the above example, then the default specified by transformation_lookup_default_value is returned. By default, None.
tranformation_lookup_default_value (Union[str, Callable]) – Transformation to apply if the data type is not found in the transformation_lookup_dictionary, can be a callable or string to pandas defined string to function shortcut mappings. By default “mean”.
- Returns:
Dataset with samples merged.
- Return type:
Dataset
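The aggregation described above can be sketched with plain pandas: numeric columns are averaged, and differing strings are concatenated.

```python
# Merge rows sharing identifier columns, averaging numerics and
# joining distinct strings with a comma.
import pandas as pd

df = pd.DataFrame({
    "ROW": [1, 1], "COLUMN": [1, 1], "BARCODE": ["Plate1", "Plate1"],
    "feat_1": [1.2, 1.3], "filename": ["fileA.png", "FileB.png"], "FOV": [1, 2],
})
merged = df.groupby(["ROW", "COLUMN", "BARCODE"], as_index=False).agg(
    {
        "feat_1": "mean",
        "filename": lambda x: ",".join(sorted(set(x))),
        "FOV": "mean",
    }
)
# one merged row: feat_1 == 1.25, FOV == 1.5
```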
- property perturbation_column
Return the name of the treatment column
A treatment is an identifier relating to the perturbation. In many cases, it is the unique compound name or identifier. Many replicates may be present, with identifiers like ‘DMSO’ etc.
- Returns:
Column name of dataframe containing the treatment.
- Return type:
String
- pivot(feature_names_column: str, values_column: str)
- remove_blocklist_features(blocklist: Path | str | list[str], skip_first_line_in_file: bool = True, erase_data: bool = True, apply_to_non_features: bool = True, remove_prefixed: bool = True)
Remove blocklisted features/columns from a Dataset
Allows removal of predefined feature blocklists. Featurisation may generate features which are to be excluded from analysis as standard. This is the case with cellular images featurised with CellProfiler. As such, there is a set of blocklist features which are often applied. This function allows specification of a list of features for removal (in the form of a list), or a string or Path object denoting the location of a file containing this information. A special string may also be passed to this function: “CellProfiler”, which instructs Phenonaut to download the standard blocklist located here: https://figshare.com/ndownloader/files/23661539. Whilst matching features are removed, by default features which have a prefix on a blocklist matched feature are also removed. See parameters.
Note: matching columns which are not features are also removed by default, see parameters.
- Parameters:
blocklist (Union[Path, str, list[str]]) – A str or Path directing Phenonaut to where a text file of blocklisted features is stored. Alternatively, a list of blocklisted features may be supplied. A special value is also accepted, whereby a string of “CellProfiler” is passed in, causing Phenonaut to retrieve the commonly used CellProfiler blocklisted features from https://figshare.com/ndownloader/files/23661539 .
skip_first_line_in_file (bool, optional) – Commonly, blocklist files have a title line, which can be ignored before starting to list features. If True, then the first line is ignored. By default True.
erase_data (bool) – If False, then no removal of columns from the Dataset is performed, only ensuring that no features are set which match the blocklist. This means that blocklist columns could persist in the Dataset as non-features. If True, then features are removed, and matching columns deleted. If False, apply_to_non_features has no effect. By default, True.
apply_to_non_features (bool) – If True, then apply the filtering to columns as well as features. By default True.
remove_prefixed (bool, optional) – If True, features/columns may still be matched with blocklist features if they have a prefix followed by an underscore character. This allows transformations to be performed and features still removed. For example, applying the RobustMAD transform prefixes features with ‘RobustMAD_’, generating RobustMAD_FeatureA, RobustMAD_FeatureB etc. remove_blocklist_features will identify FeatureA (if in the blocklist) and still remove the prefixed feature. To deactivate this default behaviour, set remove_prefixed to False. By default True.
- Raises:
FileNotFoundError – Error raised if specified file is not found
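The matching logic, including the default prefix handling, can be sketched in plain Python (an illustrative sketch, not phenonaut's implementation):

```python
# A feature is blocked if it matches a blocklist entry exactly, or
# after stripping one leading "prefix_" (mirrors remove_prefixed=True).
blocklist = {"FeatureA", "FeatureC"}
features = ["RobustMAD_FeatureA", "FeatureB", "FeatureC"]

def is_blocked(name: str) -> bool:
    return name in blocklist or name.split("_", 1)[-1] in blocklist

kept = [f for f in features if not is_blocked(f)]
# kept == ["FeatureB"]
```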
- remove_features_with_outliers(outlier_cutoff=15.0, remove_data: bool = False)
Removes feature columns containing values greater than given cutoff
By default, any feature containing a value greater than 15 is removed. This cutoff can be raised and lowered as appropriate.
- Parameters:
outlier_cutoff (float, optional) – If a feature column contains a value greater than this cutoff, then the feature is removed. By default 15.
remove_data (bool, optional) – If True, then not only are feature columns with outliers removed from the Datasets list of features, but these columns are dropped from the DataFrames. If False, then only the Datasets list of features are changed. By default False.
- remove_low_variance_features(freq_cutoff=0.05, unique_cutoff=0.01)
Exclude low information content features.
Adapted from pycytominer variance_threshold method https://github.com/cytomining/pycytominer/blob/master/pycytominer/operations/variance_threshold.py
Sometimes features vary very little; this method allows the definition of cutoffs (ratios) of unique values that may exist in a feature. See parameters for further description of cutoffs.
- Parameters:
freq_cutoff (float, default 0.05) – Ratio defined as the count of the second most common feature value divided by the count of the most common feature value. Must range between 0 and 1. Features below this cutoff are dominated by a single value and will be removed.
unique_cutoff (float, default 0.01) – Remove features with little diversity in their measurements. Must range between 0 and 1. Dividing the number of unique values in a feature by the number of measurements gives a ‘unique’ ratio; features with a ratio below this cutoff are removed.
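The two cutoffs can be sketched for a single feature column with plain pandas:

```python
# Compute the frequency ratio and unique ratio for one feature.
import pandas as pd

feature = pd.Series([1, 1, 1, 1, 2])
counts = feature.value_counts()
freq_ratio = counts.iloc[1] / counts.iloc[0]      # 2nd most common / most common
unique_ratio = feature.nunique() / len(feature)   # unique values / measurements
drop = bool(freq_ratio < 0.05 or unique_ratio < 0.01)
# freq_ratio == 0.25 and unique_ratio == 0.4, so this feature is retained
```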
- rename_column(from_column_name: str, to_column_name: str)
Rename a single dataset column
- Parameters:
from_column_name (str) – Name of column to rename
to_column_name (str) – New column name
- rename_columns(from_to: dict)
Rename multiple columns
- Parameters:
from_to (dict) – Dictionary of the form {‘old_name’:’new_name’}
- replace_str(column: str | int, pat: str, repl: str)
Replace a string present in a column
- Parameters:
column (Union[str, int]) – Name of the column (which may be a feature) within which to search and replace instances of the string specified in the ‘pat’ argument.
pat (str) – The pattern (non-regex): the substring to find and replace.
repl (str) – Replacement text for the substring identified in the ‘pat’ argument.
- split_column(column: str | int, pat: str, new_columns: list[str])
Split a column on a delimiter
If a column named ‘data’ contained:

| idx | data |
|-----|----------------|
| 1 | A2_CPD1_Plate1 |

Then calling:
split_column('data', '_', ['WellID', 'CpdID', 'PlateID'])
would introduce the following new columns into the dataframe:

| idx | WellID | CpdID | PlateID |
|-----|--------|-------|---------|
| 1 | A2 | CPD1 | Plate1 |
- Parameters:
column (Union[str, int]) – Name of column to split, or the index of the column.
pat (str) – Pattern (non-regex), usually a delimiter to split on.
new_columns (list[str]) – List of new column names. Should be the correct size to absorb all produced splits.
- Raises:
DataError – Inconsistent number of splits produced when splitting the column.
ValueError – Incorrect number of new column names given in new_columns.
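The split described above can be sketched with plain pandas:

```python
# Split a delimited column into several new columns.
import pandas as pd

df = pd.DataFrame({"data": ["A2_CPD1_Plate1"]})
df[["WellID", "CpdID", "PlateID"]] = df["data"].str.split("_", expand=True)
# row 0 now holds WellID "A2", CpdID "CPD1", PlateID "Plate1"
```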
- subtract_func_results_on_features(query_or_perturbation_name: str, groupby: str | list[str] | None, func: Callable | str | None = 'median') None
Subtract the result of a function applied to rows
Useful function for centering plates on DMSO or control perturbations. If no func is given, the median is used. The result of the function applied to the rows identified by the query string (query_or_perturbation_name parameter) is subtracted from all perturbations. query_or_perturbation_name may also be an identifier present in the dataset’s perturbation column (if set). If a column name, or list of column names, is given in the groupby argument, then the operation is carried out within these groups before being merged back into the original dataframe.
- Parameters:
query_or_perturbation_name (str) – Pandas style query to retrieve rows from which quantities for subtraction are calculated, or, if the dataset has perturbation_column set and the parameter value can be found in the perturbation column, then these samples are used and have the given function applied to them. In short, for a Dataset with perturbation_column set to “cpd_name”, the same effect can be achieved with this parameter being “DMSO” or “cpd_name==’DMSO’”.
groupby (Optional[str, list[str]]) – The name, or list of names, of columns that the DataSet should be grouped by for application of the transformation on a group-by-group basis. This is very useful when needing to subtract median DMSO perturbation features on a plate-by-plate basis, whereby the column containing plate IDs would be supplied. Multiple column names may also be supplied.
func (Union[Callable, str, None]) – The callable to use in calculation of the quantity to subtract for each perturbation. Special cases exist for ‘median’ and ‘mean’ strings whereby pd.median and pd.mean are applied respectively. If None, then no action is taken. By default ‘median’.
- subtract_mean(query_or_perturbation_name: str, groupby: str | list[str] | None) None
Subtract the mean of rows identified in the query from features
Useful function for centering plates on DMSO or control perturbations. The mean of the row features identified by the query string (query_or_perturbation_name parameter) is subtracted from all perturbations. query_or_perturbation_name may also be an identifier present in the dataset’s perturbation column (if set). If a column name, or list of column names, is given in the groupby argument, then the operation is carried out within these groups before being merged back into the original dataframe.
- Parameters:
query_or_perturbation_name (str) – Pandas style query to retrieve rows from which quantities for subtraction are calculated, or, if the dataset has perturbation_column set and the parameter value can be found in the perturbation column, then these samples are used and have the given function applied to them. In short, for a Dataset with perturbation_column set to “cpd_name”, the same effect can be achieved with this parameter being “DMSO” or “cpd_name==’DMSO’”.
groupby (Optional[str, list[str]]) – The name, or list of names, of columns that the DataSet should be grouped by for application of the transformation on a group-by-group basis. This is very useful when needing to subtract mean DMSO perturbation features on a plate-by-plate basis, whereby the column containing plate IDs would be supplied. Multiple column names may also be supplied.
- subtract_median(query_or_perturbation_name: str, groupby: str | list[str] | None) None
Subtract the median of rows identified in the query from features
Useful function for centering plates on DMSO or control perturbations. The median of the row features identified by the query string (query_or_perturbation_name parameter) is subtracted from all perturbations. query_or_perturbation_name may also be an identifier present in the dataset’s perturbation column (if set). If a column name, or list of column names, is given in the groupby argument, then the operation is carried out within these groups before being merged back into the original dataframe.
- Parameters:
query_or_perturbation_name (str) – Pandas style query to retrieve rows from which quantities for subtraction are calculated, or, if the dataset has perturbation_column set and the parameter value can be found in the perturbation column, then these samples are used and have the given function applied to them. In short, for a Dataset with perturbation_column set to “cpd_name”, the same effect can be achieved with this parameter being “DMSO” or “cpd_name==’DMSO’”.
groupby (Optional[str, list[str]]) – The name, or list of names, of columns that the DataSet should be grouped by for application of the transformation on a group-by-group basis. This is very useful when needing to subtract median DMSO perturbation features on a plate-by-plate basis, whereby the column containing plate IDs would be supplied. Multiple column names may also be supplied.
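The grouped subtraction can be sketched with plain pandas (the ‘plate’ and ‘cpd’ column names are illustrative):

```python
# Subtract the per-plate median of the DMSO rows from every row.
import pandas as pd

df = pd.DataFrame({
    "plate": ["P1", "P1", "P2", "P2"],
    "cpd": ["DMSO", "drugA", "DMSO", "drugA"],
    "feat_1": [1.0, 3.0, 10.0, 14.0],
})
dmso_median = df[df["cpd"] == "DMSO"].groupby("plate")["feat_1"].median()
df["feat_1"] = df["feat_1"] - df["plate"].map(dmso_median)
# feat_1 is now [0.0, 2.0, 0.0, 4.0]
```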
- subtract_median_perturbation(perturbation_label: str, per_column_name: str | None = None, new_features_prefix: str = 'SMP_')
Subtract the median perturbation from all features
Useful for normalisation within a well/plate format. The median feature may be identified through the per_column_name variable and perturbation label. Newly generated features may have their prefixes controlled via the new_features_prefix argument.
- Parameters:
perturbation_label (str) – The perturbation label which should be used to calculate the median
per_column_name (Optional[str], optional) – The perturbation column name. This is optional and can be None, as the Dataset may already have perturbation column set. By default, None.
new_features_prefix (str) – Prefix for new features, each with the median perturbation subtracted. By default ‘SMP_’ (for subtracted median perturbation).
- transpose(reset_index: bool = True, new_header_column: int | None = 0)
Transpose internal DataFrame
phenonaut.data.platemap_querier module
- class phenonaut.data.platemap_querier.PlatemapQuerier(platemap_directory: str | Path, platemap_csv_files: list | str | Path | None = None, plate_name_before_underscore_in_filename=True)
Bases:
object
- get_compound_locations(cpd, plates: str | list | None = None)
- plate_to_cpd_to_well_dict = {}
- platemap_files = None