Graphics modification:

A tSNE plot is drawn by projecting high-dimensional data onto a two-dimensional coordinate system using t-distributed stochastic neighbor embedding (tSNE), a nonlinear dimensionality reduction algorithm. Compared with linear dimension reduction, which often clusters poorly, tSNE exaggerates the distance relationships between samples: highly similar samples are drawn closer together and dissimilar samples are spread further apart, so the data show clearer clusters. This tool does not support single cell data at present.

(1) Function: Dynamically display the clustering relationships between samples in a tSNE plot.

(2) Note: When a scRNA-seq dataset is used as input, a group file is required and the number of cells must be less than 10,000.

(3) Input:

Example file:

(4) Analysis and operation:

① Linear dimension reduction: This is a preprocessing step. Reducing the data to its principal components first can considerably speed up the tSNE algorithm. The Partial_PCA algorithm is the fastest but gives the lowest accuracy after dimension reduction, so it is recommended only for large datasets. The PCA and Partial_PCA algorithms cannot be used at the same time.

② Normalization: The data are normalized with the z-score: for each gene, the mean expression across all samples is subtracted and the result is divided by that gene's standard deviation. Normalization keeps abnormally highly expressed genes from dominating the result and narrows the gap between the “rich” and the “poor” genes. We recommend normalizing the data.

③ Point density: This is the preset perplexity (data complexity) of tSNE, which controls how many neighboring points are taken into account around each point. When there are many samples, the point density can be reduced to make the clusters clearer; when there are few samples, it can be increased to make the points within a cluster more concentrated.

④ Selecting rows or columns for drawing: When rows are selected, each row of data is treated as one data point for dimensionality reduction; when columns are selected, each column of data is treated as one data point. If the selected items do not match the sample names in the group file, drawing will fail.

⑤ Modify the color, size and transparency of different sample points.

⑥ Customize the coordinate axis range.

⑦ Choose whether to display the group names in the graph.
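
Steps ① to ④ above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the tool's own implementation: it assumes NumPy and scikit-learn are installed, uses ordinary PCA (the tool's Partial_PCA is not reproduced), treats "point density" as scikit-learn's perplexity parameter, and uses a hypothetical function name (embed_samples), hypothetical default values and randomly generated example data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def embed_samples(expr, points_in_columns=True, normalize=True,
                  n_pcs=10, perplexity=5.0, seed=0):
    """Return an (n_points, 2) tSNE embedding of an expression matrix.

    Hypothetical helper: the name, arguments and defaults are assumptions
    made for this sketch.
    """
    x = np.asarray(expr, dtype=float)

    # Step 4: orient the matrix so that each row is one point to embed.
    if points_in_columns:
        x = x.T

    # Step 2: z-score each gene (now a column) across all samples.
    if normalize:
        mean = x.mean(axis=0)
        std = x.std(axis=0)   # population standard deviation (an assumption)
        std[std == 0] = 1.0   # constant genes would otherwise divide by zero
        x = (x - mean) / std

    # Step 1: linear reduction before tSNE. PCA cannot return more
    # components than min(n_points, n_features).
    n_pcs = min(n_pcs, x.shape[0], x.shape[1])
    x = PCA(n_components=n_pcs, random_state=seed).fit_transform(x)

    # Step 3: scikit-learn requires perplexity to be smaller than n_points.
    perplexity = min(perplexity, x.shape[0] - 1)
    tsne = TSNE(n_components=2, perplexity=perplexity,
                init="random", random_state=seed)
    return tsne.fit_transform(x)


# Made-up example data: 50 genes (rows) x 30 samples (columns), two groups.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(50, 15))
group_b = rng.normal(loc=3.0, scale=1.0, size=(50, 15))
expr = np.hstack([group_a, group_b])

embedding = embed_samples(expr, points_in_columns=True)
print(embedding.shape)
```

Each row of the returned array is the two-dimensional position of one sample, in the same order as the input columns, so it can be passed to any plotting library together with the group labels.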

(5) Graphic interpretation: Each point in the graph represents a sample, and the similarity between samples is reflected by how closely they are grouped in the two-dimensional plane. The closer together the samples are, the more similar they are.