
GO Enrichment Analysis Advanced


1:
Add the target gene file (an example file is available).
2:
The current default selection is Model organism, for which the platform uses a background gene file keyed by Ensembl ID. Make sure the target gene file from the previous step also uses Ensembl IDs.
Preview the reference file
For a custom background, provide a GO annotation file covering all genes: the first column is the gene ID and the second column is the GO annotation result.

GO enrichment analysis tutorial

 

Principle:
Gene Ontology (GO) is an internationally standardized gene function classification system. It offers a dynamically updated controlled vocabulary and strictly defined concepts to comprehensively describe the properties of genes and their products in any organism. GO has three ontologies: molecular function, cellular component and biological process. The basic unit of GO is the GO term, and each GO term belongs to one ontology. GO functional analysis provides both GO functional annotation of a given gene set and GO enrichment analysis of that gene set with respect to biological functions. First, all genes in the given gene set are mapped to GO terms in the Gene Ontology database, and the number of genes is counted for every term. Then the GO terms that are significantly enriched in the given gene set compared with the genome background are identified by a hypergeometric test. The P-value is calculated as:

$$P = 1 - \sum_{i=0}^{m-1} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$

Here N is the number of all genes with GO annotation; n is the number of genes in the given gene set in N; M is the number of all genes that are annotated to a certain GO term; m is the number of genes in the given gene set in M. The calculated P-values then undergo FDR correction, taking FDR ≤ 0.05 as the threshold. GO terms meeting this condition are defined as significantly enriched GO terms in the gene set.
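
The test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the platform's code: the function names are hypothetical, and the Benjamini-Hochberg procedure is assumed for the FDR correction because the text does not say which method is used.

```python
from math import comb


def hypergeom_pvalue(N, n, M, m):
    """Upper-tail hypergeometric P-value, P(X >= m).

    N: all genes with GO annotation; n: genes of the gene set in N;
    M: genes annotated to the term; m: genes of the gene set in M.
    Equivalent to 1 - sum_{i=0}^{m-1} C(M,i) C(N-M,n-i) / C(N,n).
    """
    total = comb(N, n)
    # math.comb(a, b) is 0 when b > a, so impossible terms drop out.
    upper = sum(comb(M, i) * comb(N - M, n - i)
                for i in range(m, min(M, n) + 1))
    return upper / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values (assumed FDR method)."""
    k = len(pvalues)
    # Walk from the largest P-value down, keeping a running minimum.
    order = sorted(range(k), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * k
    running_min = 1.0
    for offset, idx in enumerate(order):
        rank = k - offset  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * k / rank)
        adjusted[idx] = running_min
    return adjusted
```

A term would then count as significantly enriched when its adjusted value is at most 0.05.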

 

Function:

Given an input gene set or a differential-analysis gene set, the tool performs GO enrichment analysis with preset parameters and visualizes the results. The output graphics include the enrichment bubble chart, enrichment bar chart, enrichment circle chart, z-score bubble chart, network chart, and secondary classification statistics chart.

 

Scope of application:

Gene sets from 18 common species can be analyzed: cattle, zebrafish, human, macaque, mouse, rat, pig, C. elegans, Drosophila, Arabidopsis, rice, tomato, wheat, corn, yeast, goat, chicken and indica rice. 3 genome versions are provided.

For other species, you can also prepare the background gene file yourself and run the enrichment analysis.

 

Input: (The file format is the same as the GO enrichment analysis)

① The target gene list for enrichment, that is, the list of genes you want to study. Two file types are accepted, with or without fold change values (Diff / Nodiff), and two input methods: manual entry or upload of a txt file.

Nodiff file format:

 

Unigene117178
Unigene129340
Unigene66777
Unigene78052
Unigene171181

 

Diff file format:

 

Unigene117178 14.433051
Unigene129340 8.27829505
Unigene66777 10.68610256
Unigene77686 8.170500036
Unigene78052 8.083759015

 

Tip: If the fold change of the genes is not provided (that is, the Nodiff file format is selected), the z-score bubble chart will not be output.
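
As a rough illustration of the two target file formats, the sketch below reads either one. It assumes whitespace-separated columns and a numeric second column; the function name `parse_target_genes` is hypothetical and is not part of the platform.

```python
def parse_target_genes(text):
    """Parse a Nodiff or Diff gene list.

    Returns (genes, fold_changes). fold_changes is None for Nodiff
    input, otherwise a dict mapping gene ID to its numeric value.
    Assumes columns are separated by whitespace.
    """
    genes, values = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        genes.append(fields[0])
        if len(fields) > 1:
            values.append(float(fields[1]))
    if values and len(values) != len(genes):
        raise ValueError("some lines have a value and some do not")
    return genes, (dict(zip(genes, values)) if values else None)


nodiff_genes, nodiff_fc = parse_target_genes("Unigene117178\nUnigene129340\n")
diff_genes, diff_fc = parse_target_genes(
    "Unigene117178 14.433051\nUnigene129340 8.27829505\n"
)
```

Here `nodiff_fc` is `None`, which corresponds to the case where the z-score bubble chart cannot be drawn.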

 

② Background gene summary table. For a model organism with a reference genome, the existing reference genes can be used directly as the background gene file. The species currently provided are rice, Arabidopsis, mouse, rat, zebrafish, chicken, C. elegans, fruit fly, and human. The ID type can be either gene ID or transcript ID and should match the ID type of the target genes. Click "Preview Reference File" to view the specific IDs.

If the species under study is not in the list above, you need to prepare the GO background gene file yourself. Two formats are supported. In the first format, the first column is the gene ID and the second column is the GO annotation result. In the second format, all GO IDs of the same gene are listed side by side in the same row. After the task is submitted, the program detects the format automatically and handles it. As shown below:

 

[Figures: examples of the two background gene file formats]
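
One way to accept both background formats with a single reader is to collect every GO ID found after the gene ID on each row. The sketch below is an assumption about how such handling could work, not the platform's implementation: the delimiter between columns is assumed to be whitespace, GO IDs are assumed to look like `GO:0008150`, and `parse_background` is a hypothetical name.

```python
import re
from collections import defaultdict

# GO IDs are "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")


def parse_background(text):
    """Map each gene ID to the set of GO IDs annotated to it.

    Works for one GO ID per row (gene repeated on several rows) and
    for all GO IDs of a gene listed side by side in one row.
    """
    gene2go = defaultdict(set)
    for line in text.splitlines():
        fields = line.split(maxsplit=1)
        if len(fields) < 2:
            continue  # blank line or gene without annotation
        gene, rest = fields
        go_ids = GO_ID.findall(rest)
        if go_ids:
            gene2go[gene].update(go_ids)
    return dict(gene2go)


format_one = "g1\tGO:0000001\ng1\tGO:0000002\ng2\tGO:0000001\n"
format_two = "g1\tGO:0000001\tGO:0000002\ng2\tGO:0000001\n"
```

Both sample strings parse to the same mapping, which is why no explicit format switch is needed in this sketch.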

 

Parameter:

Choose p-value or q-value to plot: P-value/Q-value

 

Output:

① out.[PFC].html: results in web page format; the 3 files correspond to the 3 main categories of GO (P: biological process, F: molecular function, C: cellular component).

② out.[PFC].xls: the statistical results of the GO function classification of the genes.

③ out.[PFC].barplot/gradient.png/pdf: plots of the GO function classification results of the genes (bubble chart / bar chart / directed acyclic graph), in png/pdf format.

④ out.secLevel2.svg/png: the GO secondary classification chart, which counts the number of genes used for enrichment in each GO secondary category. The statistics are in the xls table, which lists the Ontology, Class (GO secondary category), number of genes, and the specific gene IDs.

⑤ out.level2.xls: GO secondary classification statistics.

⑥ out.bubble/bubble_sp.png/pdf: z-score bubble charts of the 3 categories (png/pdf format).

⑦ out.circular.png/svg: enrichment circle chart (png/svg format).

 

For how to interpret and apply the enrichment analysis graphics, see the following articles:

Come and receive a full set of detailed enrichment analysis related graphics! (Part 1)

Come and receive a full set of detailed enrichment analysis related graphics! (Part 2)

 

 

Sample files: target genes (including fold change)

                     Background gene

 

Input: GO enrichment analysis of the differential genes of a species without a reference genome

 

Output:

① out.[PFC].html: results in web page format; the 3 files correspond to the 3 main categories of GO. The result is shown in the figure below and contains two parts:

 

The first part is the table of GO enrichment statistics, including GO ID, GO function description, gene ratio, background gene ratio, P value, and Q value. P values and Q values below 0.05 are displayed in red.

 

The second part lists the specific genes enriched in each GO term. Click a GO ID to open http://amigo.geneontology.org and view the details of that GO term.

 

② out.[PFC].png, out.[PFC].pdf, out.[PFC].xls: the GO enrichment bubble chart, enrichment bar chart, and directed acyclic graph show only the enriched GO terms (that is, terms with a P value below 0.05). If no term has a P value below 0.05, these plots are not available. The results can be viewed in the xls file; they correspond to the web page results and include GO ID, GO function description, gene ratio, background gene ratio, P value, Q value, and the corresponding gene IDs.

 

③ out.secLevel.svg/png is shown in the figure below. It is the GO secondary classification chart, which counts the number of genes used for enrichment in each GO secondary category. The statistics are in the xls table, which lists the Ontology, Class (GO secondary category), number of genes, and the specific gene IDs.

 

④ out.bubble/bubble_sp.png/pdf is shown in the figure below. It is the z-score bubble chart of the GO enrichment analysis. The ordinate is -log10(P value), and the abscissa is the up-down normalization value (the difference between the number of differentially up-regulated genes and the number of differentially down-regulated genes, as a proportion of the total number of differential genes). The yellow line marks the threshold Q/P value = 0.05. On the right is the list of the top 20 terms/pathways by Q/P value, and different colors represent different Ontologies / A classes.
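
The two axes of this chart can be illustrated with a short Python sketch. It rests on assumptions the text does not confirm: the second column of the Diff file is treated as a signed value (positive for up-regulated, negative for down-regulated), the total is taken over the differential genes of one term, and both function names are hypothetical.

```python
from math import log10


def up_down_value(fold_changes):
    """(up - down) / total for the differential genes of one term.

    Assumes signed values: > 0 is up-regulated, < 0 is down-regulated.
    """
    if not fold_changes:
        raise ValueError("at least one differential gene is required")
    up = sum(1 for v in fold_changes if v > 0)
    down = sum(1 for v in fold_changes if v < 0)
    return (up - down) / len(fold_changes)


def bubble_y(p_value):
    """Ordinate of the bubble chart: -log10 of the P (or Q) value."""
    return -log10(p_value)


# Three up-regulated genes and one down-regulated gene in a term.
x = up_down_value([1.5, 2.0, -1.0, 0.7])
y = bubble_y(0.01)
```

Under these assumptions a value near 1 means the term is dominated by up-regulated genes and a value near -1 by down-regulated genes.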

 

⑤ out.circular.png/svg is shown in the figure below. It is the circle chart of the GO enrichment analysis, with four circles from the outside to the inside:

The first circle: the enriched categories; outside the circle is the scale for the number of genes. Different colors represent different categories;

The second circle: the number of background genes in the category and the Q value or P value. The more genes, the longer the bar; the smaller the value, the redder the color;

The third circle: bar graph of the proportions of up- and down-regulated genes. Dark purple represents the proportion of up-regulated genes and light purple the proportion of down-regulated genes; the specific values are displayed below. When the input has only one column of differential genes (up and down are not distinguished), the third circle shows the total number of foreground genes;

The fourth circle: the RichFactor of each category (the number of foreground genes in the category divided by the number of background genes); each cell of the background grid lines represents 0.1.
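
The RichFactor definition above amounts to a single division per category. A minimal sketch, assuming the foreground genes are the target genes and that a mapping from category to background genes is already available (the name `rich_factors` and the input layout are hypothetical):

```python
def rich_factors(target_genes, term2genes):
    """RichFactor per category: foreground genes / background genes.

    target_genes: iterable of foreground (target) gene IDs.
    term2genes: dict mapping a category to its background gene IDs.
    Categories with no background genes are skipped.
    """
    target = set(target_genes)
    result = {}
    for term, genes in term2genes.items():
        background = set(genes)
        if background:
            result[term] = len(target & background) / len(background)
    return result


factors = rich_factors(
    ["a", "b", "c"],
    {"GO:0000001": ["a", "b", "x", "y"], "GO:0000002": ["z"]},
)
```

In this toy input, two of the four background genes of the first category are in the foreground, so its RichFactor is 0.5.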