conda create -n RADO_env python=3.7
conda activate RADO_env
pip install umap-learn==0.5.3  # pinned for compatibility with Python 3.7
pip install scRADO==1.3
For scRNA-seq data
from RADO import DoubletDetection
# adata is an AnnData object (typically loaded from an .h5ad file), a common data format in single-cell analysis
adata = DoubletDetection(adata)
# filter out doublets (keep droplets called as singlets)
adata = adata[adata.obs['RADO_doublet_call']==0,]
See also the tutorial. For any other questions, please raise an issue.
For scATAC-seq data
from RADO import DoubletDetection
# Assumes adata.X is the peak matrix
adata = DoubletDetection(adata, atac_data=True)
# filter out doublets (keep droplets called as singlets)
adata = adata[adata.obs['RADO_doublet_call']==0,]
DoubletDetection returns the AnnData object with a predicted doublet score and a doublet call for each droplet in the dataset. The predictions are stored in adata.obs['RADO_doublet_score'] and adata.obs['RADO_doublet_call']. In the doublet call, 0 represents a singlet and 1 represents a doublet.
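As a rough illustration of how these two columns can be inspected, the sketch below builds a small pandas DataFrame as a stand-in for adata.obs. The cell names and score values are made up for the example; only the column names come from RADO's output as described above.

```python
import pandas as pd

# Hypothetical stand-in for adata.obs after running DoubletDetection.
# The values are invented for illustration; only the column names are real.
obs = pd.DataFrame(
    {
        "RADO_doublet_score": [0.05, 0.92, 0.10, 0.77, 0.03],
        "RADO_doublet_call": [0, 1, 0, 1, 0],
    },
    index=["cell_1", "cell_2", "cell_3", "cell_4", "cell_5"],
)

# Boolean mask: True for droplets called as singlets (call == 0).
is_singlet = obs["RADO_doublet_call"] == 0

# Summarize how many droplets were called as doublets.
n_doublets = int((~is_singlet).sum())
doublet_rate = n_doublets / len(obs)
print(f"{n_doublets} of {len(obs)} droplets called as doublets ({doublet_rate:.0%})")

# Keep only the singlet rows.
singlets = obs[is_singlet]
print(singlets)
```

The filtering lines in the usage examples apply the same kind of boolean mask directly to the AnnData object.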
18 scRNA-seq datasets
2 scATAC-seq datasets
The 16 scRNA-seq datasets were collected from the benchmarking paper of Xi and Li. The datasets were converted to H5AD format using sceasy; the processing script is convertH5AD.R.
The 2 DOGMA-seq datasets are from Xu et al. Datasets with original singlet or doublet annotation need to be requested from the original authors.