lamindb.curators.CatManager
- class lamindb.curators.CatManager(*, dataset, categoricals, sources, organism, exclude, columns_field=None)
- Bases: object
- Manage valid categoricals by updating registries.
- A CatManager object makes it easy to validate, standardize & annotate datasets.
- Example:
  >>> cat_manager = ln.CatManager(
  ...     dataset,
  ...     # define validation criteria as mappings
  ...     columns=Feature.name,  # map column names
  ...     categoricals={"perturbation": ULabel.name},  # map categories
  ... )
  >>> cat_manager.validate()  # validate the dataframe
  >>> artifact = cat_manager.save_artifact(description="my RNA-seq")
  >>> artifact.describe()  # see annotations
- cat_manager.validate() maps values within df according to the mapping criteria and logs validated & problematic values.
- If you find non-validated values, you have several options:
  - new values found in the data can be registered using add_new_from()
  - non-validated values can be accessed using the non_validated property and addressed manually
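The validate, inspect, register loop described above can be sketched in plain Python. This is a toy stand-in, not lamindb code: `ToyCatManager`, its in-memory registry, and the `add_new_from(column)` signature are all hypothetical and only mirror the names used in this page.

```python
# Toy model of the validate / non_validated / add_new_from workflow.
# Nothing here calls lamindb; the class and its registry are hypothetical.

class ToyCatManager:
    def __init__(self, dataset, categoricals):
        self.dataset = dataset            # dict: column name -> list of values
        self.categoricals = categoricals  # dict: column name -> set of registered values
        self.non_validated = {}           # dict: column name -> list of unknown values

    def validate(self):
        """Return True if every value in each mapped column is registered."""
        self.non_validated = {}
        for column, registry in self.categoricals.items():
            unknown = sorted({v for v in self.dataset[column] if v not in registry})
            if unknown:
                self.non_validated[column] = unknown
        return not self.non_validated

    def add_new_from(self, column):
        """Register the values that failed validation for one column."""
        self.categoricals[column].update(self.non_validated.pop(column, []))


dataset = {"perturbation": ["DMSO", "IFNG", "DMSO", "drug-x"]}
manager = ToyCatManager(dataset, {"perturbation": {"DMSO", "IFNG"}})

first_pass = manager.validate()          # "drug-x" is not registered yet
problems = dict(manager.non_validated)   # snapshot of the problematic values
manager.add_new_from("perturbation")     # option 1: register the new value
second_pass = manager.validate()         # re-validate after registering
```

In this sketch the second option from the list (addressing values manually) would amount to editing `dataset["perturbation"]` before calling `validate()` again.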
Attributes
- property categoricals: dict
- Return the column fields to validate against.

- property non_validated: dict[str, list[str]]
- Return the non-validated features and labels.
Class methods
- classmethod from_anndata(data, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), verbosity='hint', organism=None, sources=None)
- Return type:
- AnnDataCatManager 
 
- classmethod from_df(df, categoricals=None, columns=FieldAttr(Feature.name), verbosity='hint', organism=None)
- Return type:
 
- classmethod from_mudata(mdata, var_index, categoricals=None, verbosity='hint', organism=None)
- Return type:
- MuDataCatManager 
 
- classmethod from_spatialdata(sdata, var_index, categoricals=None, organism=None, sources=None, exclude=None, verbosity='hint', *, sample_metadata_key='sample')

- classmethod from_tiledbsoma(experiment_uri, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), organism=None, sources=None, exclude=None)
- Return type:
 
Methods
- save_artifact(*, key=None, description=None, revises=None, run=None)
- Save an annotated artifact.
- Parameters:
  - key (str | None, default: None) – A path-like key to reference the artifact in default storage, e.g., "myfolder/myfile.fcs". Artifacts with the same key form a version family.
  - description (str | None, default: None) – A description.
  - revises (Artifact | None, default: None) – Previous version of the artifact. An alternative to passing key to trigger a new version.
  - run (Run | None, default: None) – The run that creates the artifact.
 
- Return type:
- Returns:
- A saved artifact record. 
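To make the relationship between key and revises concrete, here is a toy model of a version family. It is a sketch under stated assumptions, not lamindb's storage logic: `ToyArtifact`, the `families` dictionary, and the rule that the toy requires either key or revises are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyArtifact:
    key: str
    description: Optional[str] = None
    revises: Optional["ToyArtifact"] = None  # previous version, if any


def save_artifact(families, *, key=None, description=None, revises=None):
    """Append a new version to the family identified by `key` or by `revises`."""
    if key is None:
        if revises is None:
            # Restriction of this toy model only.
            raise ValueError("toy model needs either key or revises")
        key = revises.key  # revises alone is enough to locate the family
    family = families.setdefault(key, [])
    if revises is None and family:
        revises = family[-1]  # same key: the latest member becomes the previous version
    artifact = ToyArtifact(key=key, description=description, revises=revises)
    family.append(artifact)
    return artifact


families = {}
v1 = save_artifact(families, key="myfolder/myfile.fcs", description="first")
v2 = save_artifact(families, key="myfolder/myfile.fcs", description="second")
v3 = save_artifact(families, revises=v2, description="third")
```

Here reusing the key and passing revises both end up extending the same family, which is the sense in which the two parameters are alternatives.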
 
- standardize(key)
- Replace synonyms with standardized values.
- Modifies the dataset in place.
- Parameters:
  - key (str) – The name of the column to standardize.
- Return type:
- None
- Returns:
- None 
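As a rough illustration of in-place standardization, the following plain-Python sketch replaces synonyms in one column and returns None. The `SYNONYMS` table and the free function `standardize(dataset, key)` are hypothetical; lamindb resolves synonyms from its registries, which this toy does not model.

```python
# Hypothetical synonym table mapping known synonyms to a standardized name.
SYNONYMS = {"IFN-gamma": "IFNG", "interferon gamma": "IFNG"}


def standardize(dataset, key, synonyms=SYNONYMS):
    """Replace synonyms in column `key` in place; returns None."""
    column = dataset[key]
    for i, value in enumerate(column):
        # Values without a known synonym mapping are left unchanged.
        column[i] = synonyms.get(value, value)


dataset = {"perturbation": ["IFN-gamma", "DMSO", "interferon gamma"]}
result = standardize(dataset, "perturbation")
```

Because the column is mutated in place, callers keep using the same `dataset` object and should not rely on a return value.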
 
- validate()
- Validate dataset.
- This method also registers the validated records in the current instance.
- Return type:
- bool
- Returns:
- The boolean True if the dataset is validated. Otherwise, a string with the error message.
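The declared return type is bool, while the description also mentions an error string. A caller who wants to be safe under either reading could compare against the literal True, since a non-empty string is truthy in Python. The helper below is a hypothetical convenience, not part of lamindb.

```python
def is_validated(result):
    """Treat only the literal True as success.

    A non-empty error string is truthy, so a bare `if result:` would
    wrongly count it as a successful validation.
    """
    return result is True


ok = is_validated(True)
failed_with_message = is_validated("2 terms are not validated")
failed_with_false = is_validated(False)
```

With a real manager this would be used as `is_validated(cat_manager.validate())`.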