nichecompass.benchmarking.compute_nasw

nichecompass.benchmarking.compute_nasw(adata, latent_knng_key='nichecompass_latent_knng', latent_key='nichecompass_latent', n_neighbors=15, min_res=0.1, max_res=1.0, res_num=3, n_jobs=1, seed=0)

Compute the Niche Average Silhouette Width (NASW). The NASW ranges between 0 and 1, with higher values indicating more distinct and compact clusters in the latent feature space. To compute the NASW, Leiden clusterings at several resolutions are computed on the latent nearest neighbor graph. The silhouette width is computed for each clustering resolution, and the average across resolutions is returned as a metric for clusterability.
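The averaging step described above can be illustrated with a minimal, dependency-free sketch. It is not the library's implementation: the helper names `silhouette_width` and `average_rescaled_silhouette` are hypothetical, the cluster labels are written by hand instead of coming from Leiden, and the mapping of the usual silhouette range [-1, 1] onto [0, 1] via (s + 1) / 2 is an assumption made to match the documented range.

```python
from math import dist
from statistics import mean


def silhouette_width(points, labels):
    """Mean silhouette width over all points, in [-1, 1]."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette width needs at least two clusters")

    widths = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            widths.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point itself contributes a distance of 0 to the sum).
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean([dist(point, other) for other in members])
            for key, members in clusters.items()
            if key != label
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return mean(widths)


def average_rescaled_silhouette(points, clusterings):
    """Average the rescaled silhouette width over several clusterings."""
    # Assumption: rescale from [-1, 1] to [0, 1] before averaging.
    scores = [(silhouette_width(points, labels) + 1) / 2 for labels in clusterings]
    return mean(scores)


# Toy latent space with two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
clusterings = [
    [0, 0, 1, 1],  # matches the two groups
    [0, 1, 0, 1],  # cuts across them
]
score = average_rescaled_silhouette(points, clusterings)
```

In this toy example the first labeling gets a high silhouette width and the second a negative one, so the average lands between the two rescaled values.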

If available, uses a precomputed latent nearest neighbor graph stored in ´adata.obsp[latent_knng_key + '_connectivities']´. Otherwise, computes the graph on the fly using ´latent_key´ and ´n_neighbors´, and stores it in ´adata.obsp[latent_knng_key + '_connectivities']´.

Parameters:
  • adata (AnnData) – AnnData object with a precomputed latent nearest neighbor graph stored in ´adata.obsp[latent_knng_key + '_connectivities']´ or the latent representation from a model stored in ´adata.obsm[latent_key]´.

  • latent_knng_key (str (default: 'nichecompass_latent_knng')) – Key under which the latent nearest neighbor graph is / will be stored in ´adata.obsp´ with the suffix '_connectivities'.

  • latent_key (Optional[str] (default: 'nichecompass_latent')) – Key under which the latent representation from a model is stored in ´adata.obsm´.

  • n_neighbors (Optional[int] (default: 15)) – Number of neighbors used to construct the latent nearest neighbor graph from the latent representation of a model. Only used if the graph is not precomputed.

  • min_res (float (default: 0.1)) – Minimum resolution for Leiden clustering.

  • max_res (float (default: 1.0)) – Maximum resolution for Leiden clustering.

  • res_num (int (default: 3)) – Number of linearly spaced Leiden resolutions between ´min_res´ and ´max_res´ for which Leiden clusterings will be computed.

  • n_jobs (int (default: 1)) – Number of jobs to use for parallelization of neighbor search.

  • seed (int (default: 0)) – Random seed for reproducibility.
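As a rough illustration of how ´min_res´, ´max_res´ and ´res_num´ interact, the sketch below builds a linearly spaced grid of resolutions. The helper `linear_resolutions` is hypothetical, and including both endpoints (as `numpy.linspace` does by default) is an assumption; the description above only states that the resolutions are linearly spaced.

```python
def linear_resolutions(min_res=0.1, max_res=1.0, res_num=3):
    """Return res_num linearly spaced values from min_res to max_res."""
    if res_num < 1:
        raise ValueError("res_num must be at least 1")
    if res_num == 1:
        return [min_res]
    # Assumption: both endpoints are part of the grid.
    step = (max_res - min_res) / (res_num - 1)
    return [min_res + i * step for i in range(res_num)]


# With the documented defaults this gives three values:
# roughly 0.1, 0.55 and 1.0 (up to floating-point rounding).
resolutions = linear_resolutions()
```

Under this assumption, a larger ´res_num´ samples the same interval more finely and adds one Leiden clustering per extra value.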

Return type:

float

Returns:

nasw: Average NASW across all clustering resolutions.
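
A minimal usage sketch, based only on the signature above. The wrapper `nasw_for` is hypothetical, and it assumes that ´adata´ already holds a latent representation in ´adata.obsm['nichecompass_latent']´ or a precomputed graph in ´adata.obsp['nichecompass_latent_knng_connectivities']´.

```python
def nasw_for(adata):
    """Compute the NASW for an AnnData object using the documented defaults."""
    # Imported lazily so this snippet can be defined without nichecompass installed.
    from nichecompass.benchmarking import compute_nasw

    return compute_nasw(
        adata,
        latent_knng_key="nichecompass_latent_knng",
        latent_key="nichecompass_latent",
        n_neighbors=15,
        min_res=0.1,
        max_res=1.0,
        res_num=3,
        n_jobs=1,
        seed=0,
    )
```

All keyword arguments are spelled out with their default values to make them visible; calling `compute_nasw(adata)` alone would be equivalent.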