Bioinformatics (Oxford, England)

MOTIVATION: Single-cell RNA sequencing allows us to study cell heterogeneity at an unprecedented cell-level resolution and to identify known and new cell populations. The current cell labeling pipeline uses unsupervised clustering and assigns labels to clusters by manual inspection. However, this pipeline does not use available gold-standard labels, because there are usually too few of them to be useful to most computational methods. This paper aims to facilitate cell labeling with a semi-supervised method in an alternative pipeline, in which a few gold-standard labels are first identified and then extended computationally to the rest of the cells.

RESULTS: We built a semi-supervised dimensionality reduction method, a network-enhanced autoencoder (netAE). Tested on three public datasets, netAE outperforms various dimensionality reduction baselines and achieves satisfactory classification accuracy even when the labeled set is very small, without disrupting the similarity structure of the original space.
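The alternative pipeline described above (embed the cells in a low-dimensional space, then extend a handful of gold-standard labels to all remaining cells) can be illustrated with a minimal sketch. This is not netAE: plain PCA stands in for the learned embedding, a k-nearest-neighbor vote stands in for the classifier, and the data are synthetic. All function names (`make_toy_cells`, `pca_embed`, `pick_labeled`, `extend_labels`) and parameter values are assumptions introduced for illustration only.

```python
import numpy as np

def make_toy_cells(n_per_type=100, n_genes=50, n_types=3, seed=0):
    """Synthetic 'expression' matrix: one Gaussian blob per hypothetical cell type."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 5.0, size=(n_types, n_genes))
    X = np.vstack([c + rng.normal(0.0, 1.0, size=(n_per_type, n_genes))
                   for c in centers])
    y = np.repeat(np.arange(n_types), n_per_type)
    return X, y

def pca_embed(X, n_components=2):
    """Project cells onto the top principal components (stand-in for a learned embedding)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def pick_labeled(y, n_per_type=5, seed=0):
    """Simulate a small gold-standard set: a few labeled cells per type."""
    rng = np.random.default_rng(seed)
    return np.concatenate([
        rng.choice(np.flatnonzero(y == t), n_per_type, replace=False)
        for t in np.unique(y)
    ])

def extend_labels(Z, labeled_idx, labeled_y, k=3):
    """Give every cell the majority label among its k nearest labeled cells in Z."""
    Z_lab = Z[labeled_idx]
    # Squared Euclidean distances, shape (n_cells, n_labeled).
    d2 = ((Z[:, None, :] - Z_lab[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = labeled_y[nearest]
    n_classes = int(labeled_y.max()) + 1
    pred = np.empty(len(Z), dtype=int)
    for i, row in enumerate(votes):
        pred[i] = np.bincount(row, minlength=n_classes).argmax()
    return pred

X, y = make_toy_cells()
Z = pca_embed(X, n_components=2)
labeled_idx = pick_labeled(y, n_per_type=5)
pred = extend_labels(Z, labeled_idx, y[labeled_idx], k=3)
accuracy = (pred == y).mean()
print(f"labeled cells: {len(labeled_idx)} of {len(y)}, accuracy: {accuracy:.3f}")
```

On well-separated toy data like this, even an unsupervised embedding suffices. The semi-supervised setting matters when cell types overlap in the original space and the few available labels must also shape the embedding itself.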

AVAILABILITY: The code of netAE is available on GitHub:

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Dong Z, Alterovitz G. netAE: Semi-supervised dimensionality reduction of single-cell RNA sequencing to facilitate cell labeling. Bioinformatics (Oxford, England). 2020. doi:10.1093/bioinformatics/btaa669.