Identification of human HK genes and gene expression regulation study in cancer from transcriptomics data analysis.

Meili Chen, Jingfa Xiao, Zhang Zhang, Jingxing Liu, Jiayan Wu, Jun Yu
Author Information
  1. Meili Chen: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.

Abstract

The regulation of gene expression is essential for eukaryotes, as it drives cellular differentiation and morphogenesis, giving rise to the different cell types of multicellular organisms. RNA sequencing (RNA-Seq) provides researchers with a powerful toolbox for characterizing and quantifying the transcriptome. Many human tissue and cell transcriptome datasets generated by RNA-Seq are available in public data resources. The fundamental issue is how to develop an effective analysis method for estimating the similarity of expression patterns between tumor tissues and their corresponding normal tissues. We define the gene expression pattern from three perspectives: 1) expression breadth, which reflects the on/off status of gene expression and mainly concerns ubiquitously expressed genes; 2) low/high and constant/variable expression, based on gene expression level and variation; and 3) the regulation of gene expression at the gene structure level. Cluster analysis indicates that gene expression patterns are more closely related to physiological condition than to the spatial distance between tissues. Two sets of human housekeeping (HK) genes are defined, one based on cell types and one based on tissue types. To characterize gene expression patterns in terms of expression level and variation, we first apply an improved K-means algorithm and a gene expression variance model. We find that cancer-associated HK genes (genes that are HK genes in the cancer group but not in the normal group) are expressed at higher levels and are more variable in the cancer condition than in the normal condition. Cancer-associated HK genes tend to be AT-rich, are enriched in functions related to cell cycle regulation, and constitute some cancer signatures. Expression of large genes also appears to be avoided in the cancer group. These findings will help clarify how gene expression patterns differ among cell types, particularly in cancer.
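
The notions of expression breadth and cancer-associated HK genes described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration only: the gene names, the toy expression values, the detection threshold of 1.0, and the rule that an HK gene must be expressed in every sample. The coefficient of variation is used as a simple stand-in for expression variability; it is not the improved K-means algorithm or the variance model used in the study, neither of which is specified in the abstract.

```python
from statistics import mean, pstdev
from typing import Dict, List, Set

# Hypothetical layout: gene name -> expression values, one per sample.
Expression = Dict[str, List[float]]


def expression_breadth(values: List[float], threshold: float = 1.0) -> float:
    """Fraction of samples in which the gene is 'on' (value above threshold)."""
    if not values:
        raise ValueError("need at least one sample")
    return sum(v > threshold for v in values) / len(values)


def housekeeping_genes(expr: Expression, threshold: float = 1.0) -> Set[str]:
    """Genes expressed in every sample (assumed HK criterion: breadth of 1)."""
    return {g for g, vals in expr.items()
            if expression_breadth(vals, threshold) == 1.0}


def cancer_associated_hk(cancer: Expression, normal: Expression,
                         threshold: float = 1.0) -> Set[str]:
    """Genes that are HK in the cancer group but not in the normal group."""
    return (housekeeping_genes(cancer, threshold)
            - housekeeping_genes(normal, threshold))


def coefficient_of_variation(values: List[float]) -> float:
    """Population standard deviation divided by the mean (simple variability proxy)."""
    m = mean(values)
    return pstdev(values) / m if m else float("nan")


# Toy data, invented for illustration; not taken from the study.
cancer = {
    "GENE_A": [5.0, 6.0, 7.0],
    "GENE_B": [3.0, 4.0, 2.5],
    "GENE_C": [0.0, 8.0, 0.2],
}
normal = {
    "GENE_A": [4.0, 5.0, 4.5],
    "GENE_B": [0.5, 2.0, 0.0],
    "GENE_C": [0.0, 0.1, 0.0],
}

print(sorted(cancer_associated_hk(cancer, normal)))  # ['GENE_B']
```

In this toy example GENE_A is ubiquitously expressed in both groups, GENE_C in neither, and GENE_B only in the cancer group, so only GENE_B is flagged as cancer-associated. A real analysis would need normalized expression values and a justified detection threshold.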

MeSH Term

Algorithms
Base Sequence
Cell Lineage
Gene Expression Profiling
Gene Expression Regulation, Neoplastic
Genes, Essential
Humans
Neoplasms
Sequence Analysis, RNA
