Over Representation Analysis (ORA)



Case study: Non-alcoholic fatty liver disease
Defined as a genetic-environmental-metabolic stress-related disease, Non-alcoholic Fatty Liver Disease (NAFLD) is a pathology characterized by excessive fat accumulation in the liver even in the absence of alcohol consumption. NAFLD encompasses a spectrum of diseases, from simple steatosis to non-alcoholic steatohepatitis (NASH), which can progress to cirrhosis and hepatocellular carcinoma. There is also increasing evidence that NAFLD represents the hepatic component of a metabolic syndrome characterized by obesity, hyperinsulinemia, peripheral insulin resistance, diabetes, hypertriglyceridemia and hypertension (Zeng et al., 2014). In the last decade, several studies have identified many genetic variations that may be associated with the development of NAFLD (Sookoian and Pirola, 2017).

We retrieved from Phenopedia (Yu et al., 2010) the list of genes that may contribute to the development of NAFLD. Among the 408 NAFLD-associated genes, we selected those supported by at least five publications, obtaining a list of 28 genes.
The 28 genes (UniProtKB accession numbers) were analysed with NETGE-PLUS by performing ORA over the KEGG-NET resource. We considered a term statistically enriched when its p-value was < 0.01 after Bonferroni correction.
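
The publication-support filter described above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout and the placeholder gene identifiers are assumptions, not Phenopedia's export format or real data.

```python
from typing import Dict, List


def select_supported_genes(publication_counts: Dict[str, int],
                           min_publications: int = 5) -> List[str]:
    """Return the genes backed by at least `min_publications` publications."""
    return sorted(
        gene for gene, count in publication_counts.items()
        if count >= min_publications
    )


# Toy counts with placeholder identifiers (hypothetical, not Phenopedia data).
toy_counts = {"GENE_A": 12, "GENE_B": 5, "GENE_C": 4, "GENE_D": 1}
selected = select_supported_genes(toy_counts)
print(selected)
```

Applied to the real Phenopedia list, the same threshold of five publications reduces the 408 NAFLD-associated genes to the 28 used in this case study.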

The gene set can be found here. To proceed with the analysis, follow these steps:
  • select H. sapiens as source organism,
  • select UNIPROT_ACC as gene/protein identifier,
  • copy and paste the gene/protein set into the input box,
  • select KEGG-NET as annotation resource,
  • set the p-value threshold to 1e-02,
  • select the Bonferroni correction.
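
To make the last two settings concrete, here is a minimal sketch of what a Bonferroni correction followed by a 1e-02 cut-off does to a set of raw p-values. The function names and the toy p-values are assumptions made for illustration; this is not NETGE-PLUS code, and the number of terms the server actually tests is not stated here.

```python
from typing import Dict


def bonferroni(p_values: Dict[str, float]) -> Dict[str, float]:
    """Multiply each raw p-value by the number of tested terms, capped at 1."""
    n_tests = len(p_values)
    return {term: min(1.0, p * n_tests) for term, p in p_values.items()}


def enriched_terms(p_values: Dict[str, float],
                   alpha: float = 1e-02) -> Dict[str, float]:
    """Keep only the terms whose corrected p-value is below `alpha`."""
    corrected = bonferroni(p_values)
    return {term: p for term, p in corrected.items() if p < alpha}


# Toy raw p-values for four placeholder terms (hypothetical values).
toy_p = {"TERM_1": 1e-5, "TERM_2": 2e-3, "TERM_3": 0.2, "TERM_4": 0.6}
print(enriched_terms(toy_p))
```

Because the correction scales with the number of tested terms, a raw p-value that looks small can still fail the threshold when many pathways are tested at once.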

    Over-represented KEGG pathways and the KEGG-NET resource
    All 28 genes are mapped over the KEGG-NET sets in NETGE-PLUS. A total of 22 and 24 genes are included in the seed sets and in the network-based modules, respectively. The standard enrichment highlights three pathways, and the network-based procedure adds three more (Table 1).

    Table 1. NAFLD case study. Gene enrichment analysis over the KEGG-NET resource.

    Enrichment  Term      N1  N2   Background  Bonferroni  Description
    S           hsa04920  6   69   6516        8.44E-05    Adipocytokine signaling pathway
    S           hsa04211  6   90   6516        4.12E-04    Longevity regulating pathway
    S           hsa04152  6   121  6516        2.32E-03    AMPK signaling pathway
    N           hsa03320  8   133  7921        2.85E-06    PPAR signaling pathway
    N**         hsa04145  9   426  7921        2.28E-03    Phagosome
    N           hsa04659  6   175  7921        6.39E-03    Th17 cell differentiation

    Enrichment: Standard (S) and Network-based (N) procedure. N** indicates a new enriched term not directly associated with the input genes/proteins;
    Term: functional annotation identifier;
    N1: input genes/proteins belonging to the term;
    N2: genes associated with the functional term;
    Background: number of genes used as background of the Fisher’s exact test;
    Bonferroni: p-value corrected by using the Bonferroni procedure;
    Description: brief explanation of the term.
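
The columns of Table 1 are the inputs of a one-sided Fisher's exact test, which for ORA reduces to the upper tail of a hypergeometric distribution. The sketch below uses only the Python standard library. The function name is hypothetical, and the input-set size of 22 is an assumption taken from the number of genes reported in the seed sets. The function returns an uncorrected p-value, so it is not expected to match the Bonferroni column, which also depends on the number of tested terms.

```python
from math import comb


def ora_p_value(n_input: int, n1: int, n2: int, background: int) -> float:
    """One-sided Fisher's exact test as a hypergeometric upper tail.

    Probability of finding at least `n1` term genes when `n_input` genes
    are drawn from `background` genes, of which `n2` belong to the term.
    """
    upper = min(n_input, n2)
    total = comb(background, n_input)
    tail = sum(
        comb(n2, k) * comb(background - n2, n_input - k)
        for k in range(n1, upper + 1)
    )
    return tail / total


# Adipocytokine signaling pathway row: N1 = 6, N2 = 69, Background = 6516.
# n_input = 22 is assumed from the seed-set size given in the text.
raw_p = ora_p_value(n_input=22, n1=6, n2=69, background=6516)
print(f"raw (uncorrected) p-value: {raw_p:.3e}")
```

Exact integer binomials avoid the rounding problems that appear when very small tail probabilities are accumulated in floating point.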


    Figure 1. NAFLD case study: network of pathways. The image shows the ORA results, over the KEGG-NET database, for the NAFLD-related gene set. Circles represent enriched pathways, while diamonds represent the connecting pathways. Circle colour represents the magnitude of enrichment, while a green diamond indicates that at least one input gene is associated with that pathway.



    The pathways over-represented by the standard method are the “Adipocytokine signaling pathway”, the “Longevity regulating pathway” and the “AMPK signaling pathway”. In the graph generated by linking the whole set of enriched pathways (Figure 1), the three pathways are connected in a chain. All of them are also linked to the insulin signalling pathway (not included in the enriched set). Insulin resistance plays an important role in NAFLD and it is caused by adipocytokines, a class of cytokines secreted by the adipose tissue. Among them, adiponectin is an anti-inflammatory and anti-diabetic adipocytokine that exerts its actions through the activation of adenosine monophosphate (AMP)-activated kinase (AMPK) and PPARα (Berlanga et al., 2014). Interestingly, the PPAR signalling pathway is one of the terms enriched with the network-based procedure.

    Another important cytokine within the adipocytokine pathway is leptin. It binds the leptin receptor (LEP-R) and triggers a phosphorylation chain resulting in the activation of the MAPK pathway. This is another connecting pathway in the network. One of the members of the MAPK pathway, namely the protein kinase c-Jun N-terminal kinase (JNK), is closely related to insulin resistance. Moreover, rat models with activated JNK present phenotypes related to NAFLD, such as hepatocyte fat accumulation and cell injury (Zeng et al., 2014).

    In the graph of Figure 1, the MAPK pathway links the adipocytokine and insulin signalling pathways with the Th17 cell differentiation pathway (enriched with the network-based procedure). Interestingly, Th17 cells have been associated with hepatocellular steatosis and inflammatory processes via the production of IL-17, which is also implicated in insulin resistance. Moreover, secretion of IL-17 is triggered and perpetuated through the nuclear factor-κB (NF-κB) (Procaccini et al., 2013), and the NF-κB pathway is a connecting node.
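
The way connecting pathways join enriched ones can be illustrated with a small graph search. The sketch below encodes only the links stated explicitly in the text (the three standard pathways to insulin signalling, and MAPK to the adipocytokine, insulin and Th17 pathways). It is not the full network of Figure 1, and the node labels and function names are assumptions for illustration.

```python
from collections import deque

# Only the links stated in the text; the complete Figure 1 network is larger.
EDGES = [
    ("Adipocytokine signaling", "Insulin signaling"),
    ("Longevity regulating", "Insulin signaling"),
    ("AMPK signaling", "Insulin signaling"),
    ("MAPK signaling", "Adipocytokine signaling"),
    ("MAPK signaling", "Insulin signaling"),
    ("MAPK signaling", "Th17 cell differentiation"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of pathway pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search returning one shortest path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(EDGES)
path = shortest_path(graph, "AMPK signaling", "Th17 cell differentiation")
print(" -> ".join(path))
```

In this reduced graph, the AMPK pathway reaches Th17 cell differentiation only by passing through the insulin and MAPK connecting pathways, which is the kind of indirect relation the network view is meant to expose.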

    The last term enriched with the network procedure is “Phagosome”, which is linked to the other nodes through the “Toll-like receptor signalling pathway”. Both of these terms, computationally derived with NETGE-PLUS, suggest the involvement of macrophages in NAFLD, in particular in relation to the reprogramming induced by cytokines. The role of macrophages in NAFLD, from initial steatosis to advanced fibrosis, has been reviewed by Krenkel and Tacke (2017).

    In conclusion, by highlighting the links among the different pathways, KEGG-NET helps the user to gain a global understanding of the biology underlying the phenotype of interest.



    References
  • Berlanga, A. et al. (2014) Molecular pathways in non-alcoholic fatty liver disease. Clinical and Experimental Gastroenterology, 7:221-239.
  • Krenkel, O. and Tacke, F. (2017) Macrophages in Nonalcoholic Fatty Liver Disease: A Role Model of Pathogenic Immunometabolism. Seminars in Liver Disease, 37(3):189-197.
  • Polyzos, S.A. et al. (2014) Leptin in nonalcoholic fatty liver disease: a narrative review. Metabolism, 64(1):60-78.
  • Procaccini, C. et al. (2013) Role of adipokines signaling in the modulation of T cells function. Frontiers in Immunology, 4:332.
  • Sookoian, S. and Pirola, C.J. (2017) Genetic predisposition in nonalcoholic fatty liver disease. Clinical and Molecular Hepatology, 23(1):1-12.
  • Yu, W. et al. (2010) Phenopedia and Genopedia: disease-centered and gene-centered views of the evolving knowledge of human genetic associations. Bioinformatics, 26(1):145-146.
  • Zeng, L. et al. (2014) Signal transductions and nonalcoholic fatty liver: a mini-review. International Journal of Clinical and Experimental Medicine, 7(7):1624-1631.