SOX2-OT

From LncRNAWiki

Annotated Information

Genomic organization of Sox2ot locus in mouse.[1]
SOX2-OT dynamic expression in mouse adult neurogenesis.[1]
Expression of Sox2ot orthologs in different vertebrate species.[1]

Approved Symbol

SOX2-OT (SOX2 overlapping transcript)

Previous Symbols

SOX2OT

Synonyms

DKFZp761J1324, NCRNA00043

Chromosome

3q26.33

RefSeq ID

NR_004053

OMIM ID

616338

Ensembl ID

ENSG00000242808

Disease

Esophageal squamous cell cancer, Alzheimer's disease, neurodevelopmental syndromes associated with the SOX2 locus

Characteristics

SOX2-OT was originally reported as a spliced ncRNA mapping to human chromosome 3q26.3-q27, with an intron overlapping the SOX2 gene in the same transcriptional orientation.[1] SOX2-OT is transcribed by RNA polymerase II (RNAP II), and therefore represents a putative “mRNA-like” ncRNA.[1]

Expression

SOX2-OT is dynamically expressed during different developmental processes.[1] It is expressed in pluripotent embryonic stem (ES) cells and is initially down-regulated upon embryoid body (EB) differentiation.[1] SOX2-OT is highly enriched in the human brain, where its expression is spatially regulated.[1]

Function

The expression profiles of SOX2-OT, as well as its evolutionarily preserved genomic association with Sox2, suggest that it has conserved functions in vertebrate development, and that it may participate in the regulation of Sox2 or related processes.[1]

Labs working on this lncRNA

  • ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, QLD 4072, Australia.

References

  1. Amaral PP, Neyt C, Wilkins SJ, Askarian-Amiri ME, Sunkin SM, Perkins AC, Mattick JS. Complex architecture and regulated expression of the Sox2ot locus during vertebrate development. RNA. 2009 Nov;15(11):2013-27. doi:10.1261/rna.1705309. Epub 2009 Sep 18.

Sequence

>gi|449020153|ref|NR_004053.3| Homo sapiens SOX2 overlapping transcript (SOX2-OT), transcript variant 4, long non-coding RNA

000001 AAAAATGTTG AATATCAACA ATGGGTTTCA GTGCTGTATT GTTGAGTTTT GTATGAACTA ATCTAGCTTC TCTAAAAAAA 000080
000081 AAAAAAAAAA AAAAAAAAAG AAAGAAAAGA AAGAAAGAAA AGAAAGAAAG AAAGAAAAAA AAAGAAAAGA AAAAAAAGAG 000160
000161 AACCACAGCT TTGGACAAAA GTCTATAGCC AGTGCATTTG AAAACTCCCC TTGCACCAGG GCTGACTGTA GCTGGGGGAA 000240
000241 GACATTTTAC TCATTTGCTC TTTCCTTTCA TACAGTCAGC TGGTCCCTCT TCTCAGGTGT AGATCACCTA TTATAATTTT 000320
000321 ACCTAATTTT GATACTCTGA TAAGGAAAGT CCCAGGACTT ATCAGCTGGG ATAGGCCTCA CTTACAAGAC AGCTCTGTTC 000400
000401 AGTATTTGGA AGAAAGCCTG ACAGGTTTCT TCGGAAAAGT GGCCATCCAT GGAATGAATG AAATGTTCTC TTTCTATTCC 000480
000481 AGGGATTGCA GTGGCAAAGC TAGGCTAGGT CTTGGAGGCT GGTGTAAGGC GATGTGGGTG AAGGCAGGAG GCTGATGGAA 000560
000561 AGACTGGGGG GAAGAAAAGC CGAAATGGAT TCACGGTGCC TTGGATGAAG GACGAGAGGG GAACTGCAAG CTCCTTCAAC 000640
000641 TGGTTCTGTC CGGTGAGAAG TGATCAAGCT TGGGCTGACA AGAGGCTCAG GGAGCCCTCA CGTTCTTTCG CTTTTTTACC 000720
000721 TGCCAATCAA ACTGCTACAA GACAACACCC TGATCTGGCA TGGACATCGC GGGTCCAAGC CTGTAGCCCC AAATCGGATA 000800
000801 ATCTCTGCAG CTGATAACAA GCAAAAGAGA AGCCAGGCAA CAGCCATATT AAAGAAGAAA ACAATCAACT CTGAGATCCA 000880
000881 ACTTAGAAAT AATGTTTCAT TCAAGATAAG GCTCGTGGCT TAGGAGATTG TGACCTGGCT TGCATCATTC TAAGACTTCT 000960
000961 ATCGTCTGTT TTCAAAACCC AAGGAGGACC TCCTTTTCTG TGTGATAGTT CCTCATGCTT TGCCAGCCAC TGGGTCTTTA 001040
001041 AGGAAGCTTG CCAAGAGTTC CCGGCTGGGA AGGACAGTTC GAGTTCATCA TAGACAATGA GCTGGCACCA CAGTTTAAGT 001120
001121 GACTCACCAA CCTGGTTGCC CACTTCAAAA TGTCAAGAAC CACACTATCT AATTGAAAGA ACTGTGTTTT AGCAAAGTGT 001200
001201 AATTGGATCG CCTGGCAAGA TCATCTCAAA CTATGTCTCT GACCTTGCCT TTCAGTGGCT ATTTTGAGCA CAAGAGGAGA 001280
001281 AAGGAAATGA TTCCTGAATT ATATCTTGTT TTTAAAAATT AAGATTAATT AATACTTGGT TTTATGTTCA TTCCATTGTA 001360
001361 GGAACTTAGT GATTATTACA AAACAGTCAA AATAGGTCAT AGCAAAATGA ACGTTTAAGA TTAGGTTAAA ACATTCCTCT 001440
001441 GAAATTTTAG TATTCACTTT TGAAGTGTCA ATTACTATAT TTCTGTAATT TCTCATTTTT TTCTTGGCTT AGAGTCAAGA 001520
001521 TGAAAAATGA TATTTGAATC AAAGACATTG TTTTGTTATA TAGCCACTCC TTTCTAAAAT GCAAATCCAC CCTCCCCTCA 001600
001601 CCAATGCTTT ATTCTTTGTG CGCGCACACA CACACACACA CACACACACA CACACACACA CACACAGACA TACACAGAAC 001680
001681 CAGCACCCAA ATGGTAGAGG AGGTCCTTCC AAGACTTCTA GGACCTGGAT CATGTAAGCA GTAGATGAGA AAATGAAGTA 001760
001761 AGAGATATAG AGTAATAACA ATAAAAGTCA ATGATTTCAT TTTACAAAAA AGTCATTGTG TAGTACTCAG ATATTTTTAA 001840
001841 GAGCAAGTAT TATTATGACT AAAGTATATT ATACTTCTAA ATTATTATGA TTCTCATCTA ATTTCAAGCC ACACATAGGC 001920
001921 GCCATTTGAA AGCATTTTGT GTTACAACCT ATCAACAAAA ATCAACTAAT AGTTTTATAA CGTGCATAGG TTTACCCCAT 002000
002001 GGGGAATTAA AATTTTTATC TGCAAGTATT TCTGTTGTTG TGAGACTTTA GGACAAGACA ACATTTGGTG TTAACATAGA 002080
002081 AGAATATTCT ATGGCTTTTT TAGCTTTAAA GTTCATTACT CTAAAATAAA GTACACGTGG TATTTTTTTC TTCCTAGTGT 002160
002161 TCTCTTATTA TATGTAGAGT ATTTTCGGTT GTTGTTTTTA ACGTATACAT TATTTTTATT AAGTCATTTC ATAAAAGATA 002240
002241 AGATATATAC TAATAGACCC TTGGTTTTCT CCTAATTTAA TTATTTCAAG TCAAGCAAAA TTGGGCAATT TTCTTTCAGG 002320
002321 AAAAAAAAAA TATTGAGCAT GATTTTGAGG TAAAACTGAT TATTTACAGC ATCTTTTCTC TAATAGGCTA ACATTTATGA 002400
002401 TGGATCTTAT TTCAAAGTTG ATTATTCTCA AAGAAAAGTG GAGATATTAG AATTTCCGAG TAAATTGCAT TGTTTTAAAA 002480
002481 AGTACTTTAA TGCATTGTTC ACATTGTTGC CAGCTATTGA CCAAAAAAGA GAAGAAGAAA AAAGAAATAA AAAAAGCAGT 002560
002561 ATCTGCTCCA AACGTCAAGT TTTGCAGTTT GAAAGACTTG TACATTATAA TTGTTTTTTT CTTTTTTGGA AAGCAGTAGT 002640
002641 AATTAATACC AAAGGCATAG ACAATTTAGG GCCATAACCT CCCTTCCCCC CTCTCCACCT TTTTTTTTTT TTTTTTTTTT 002720
002721 TTTTAACTTT AACCATGATA GGCATCTCTG CAAGGTAGGT TTTAACATAT TTTTTTCCTA ATTTTTTTCC CTCCATTCAG 002800
002801 TAAGAAGGAT AAGCTTTTAG AGTACCCAGT CACCGTGGTA ACTGCCAAAA AAATTCTCCT GAAGTTTGAA GAGGTCAGCT 002880
002881 AGCAGGGTCA ACAACCCTGG CCTCTCTCTG AGCTCGGTCT AGCATGCCCC AGCCTGCGGT TGGAGTGTCA GGCCAGATCT 002960
002961 GACCCTGGGA GGTGGACCAT TCTGGCTCTG ATATAAATTT TTCGAGTCAG TTCATGGCCT GGACTCTCCA GGTGGCCTCC 003040
003041 AAAATCGATT TTGACCCCCT GACCCTGTCT CACGGAGGCA TAGCATCCAC CCTAAGTAAG GATTTAGCCA GGAATGACAG 003120
003121 CTTTCCACTG AGTGTTATTG GGGACAGTAT AAACAGTTTG GAAAGCAAGT AAGGAGGTGT GAGGGCTAGC AATCATTAAA 003200
003201 AGCACATTAA AAAACAATTT TTAAAAATCT TTATTAGAGA AAGTTGTAAA TGCATATCTG TTACAGAATT CCTGTGATTT 003280
003281 GTAGAGTTTT CTAGTCAGTT TTGAATATAA ATAGGTCACA TATCTTTATC TTTTTGCATA CTTTGTTACA AATATGCAAA 003360
003361 TAAGATAGGC ATACTTGCTA CAAAATAGGT ACAGCAGGAA TGTAATATGT TCAGAAAAGG CAAACTGGTT ATTAAAACAC 003440
003441 ATTAACAGGT AAGGAGTTTC TAATATTTTA AATACTAAAA TTTTACATGT GTATTTTGAA GTTTTTAAAA TGGAAAAATA 003520
003521 AAGAACCTTT AAAAA
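
The listing above uses a numbered layout: each row begins with the position of its first base, holds up to eight 10-base groups, and (on full rows) ends with the position of its last base. As a minimal sketch of how the plain sequence might be recovered from such rows, the Python below drops the position counters and joins the groups. The function name `parse_numbered_rows` is hypothetical, and only the first and last rows of the listing are used as sample input.

```python
import re


def parse_numbered_rows(text):
    """Return the bases of a numbered sequence listing as one string."""
    bases = []
    for row in text.strip().splitlines():
        # Position counters are all digits; keep only nucleotide groups.
        bases.extend(tok for tok in row.split() if re.fullmatch(r"[ACGTN]+", tok))
    return "".join(bases)


# First and last rows, copied verbatim from the listing.
first_row = ("000001 AAAAATGTTG AATATCAACA ATGGGTTTCA GTGCTGTATT "
             "GTTGAGTTTT GTATGAACTA ATCTAGCTTC TCTAAAAAAA 000080")
last_row = "003521 AAGAACCTTT AAAAA"

print(len(parse_numbered_rows(first_row)))  # 80
print(len(parse_numbered_rows(last_row)))   # 15
```

Because the last row starts at position 3521 and carries 15 bases, the listing as shown covers 3535 bases in total.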

Predicted Small Protein

Name SOX2-OT_smProtein_2489:2638
Length 49
Molecular weight 5749.9873
Aromaticity 0.122448979592
Instability index 25.1142857143
Isoelectric point 9.65399169922
Runs 9
Runs residual 0.0216216216216
Runs probability 0.0694027360694
Amino acid sequence MHCSHCCQLLTKKEKKKKEIKKAVSAPNVKFCSLKDLYIIIVFFFFGKQ
Secondary structure LLLHHHHHHHLLHHHHHHHHHHHHLLLLLEEELLLLLEEEEEEEEELLL
PRMN LLLLLLLLLLLLLLLLLLLLLLLLLLLLLHHHHHHHHHHHHHHHHHHLL
PiMo oooooooooooooooooooooooooooooTTTTTTTTTTTTTTTTTTii
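
The record does not state how these values were derived. As a rough cross-check, the Python sketch below translates the relevant stretch of the Sequence listing (rows 002481 to 002640) with the standard genetic code and recomputes two of the simpler fields, length and aromaticity. It rests on two assumptions: that 2489 in the record name is a 0-based offset of the start codon (inferred from where the ATG sits in the listing), and that aromaticity means the fraction of F, W and Y residues. The helper names `translate` and `aromaticity` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from the first base up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def aromaticity(protein):
    """Fraction of aromatic residues (F, W, Y); assumed definition."""
    return sum(protein.count(x) for x in "FWY") / len(protein)


# Rows 002481-002640, copied verbatim from the Sequence listing.
rows = [
    "002481 AGTACTTTAA TGCATTGTTC ACATTGTTGC CAGCTATTGA CCAAAAAAGA GAAGAAGAAA AAAGAAATAA AAAAAGCAGT 002560",
    "002561 ATCTGCTCCA AACGTCAAGT TTTGCAGTTT GAAAGACTTG TACATTATAA TTGTTTTTTT CTTTTTTGGA AAGCAGTAGT 002640",
]
# Drop the numeric position counters and join the 10-base groups.
region = "".join(tok for row in rows for tok in row.split() if tok.isalpha())

# The region starts at 1-based position 2481, i.e. 0-based offset 2480.
orf_start = 2489 - 2480  # assumption: 2489 is a 0-based start offset
protein = translate(region[orf_start:])

print(protein)       # MHCSHCCQLLTKKEKKKKEIKKAVSAPNVKFCSLKDLYIIIVFFFFGKQ
print(len(protein))  # 49
print(round(aromaticity(protein), 12))  # 0.122448979592
```

Under these assumptions the translated peptide, its length and its aromaticity agree with the values listed above; the remaining fields (instability index, isoelectric point, runs statistics and structure predictions) are not reproduced here.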