Difference between revisions of "Help:Contents"

Latest revision as of 02:16, 19 October 2016

FAQ

Introduction

What is LncRNAWiki?

LncRNAWiki is a wiki-based, publicly editable, open-content platform for community curation of human long non-coding RNAs (lncRNAs); in other words, a community-curated lncRNA knowledgebase. Unlike conventional biological databases that rely on expert curation, LncRNAWiki harnesses collective intelligence to collect, edit and annotate information about lncRNAs, quantifies each user's contribution to every annotated lncRNA, and provides explicit authorship to encourage participation from the whole scientific community.

You can contribute in several ways to help make LncRNAWiki the online encyclopedia for lncRNAs.

If you are a researcher, please share your knowledge and curate genes in your area of expertise.
If you are a teacher or investigator, you can incorporate community curation of lncRNA genes in LncRNAWiki into student assignments, where each contribution can be quantified as a score.
If you are a student, you can volunteer, e.g., by collecting data or formatting content.
If you are a journal publisher, please consider making community curation a compulsory post-publication step whenever an lncRNA-related paper is accepted by the journal.
At the very least, please spread the word to anyone who might be interested.

How do I cite lncRNAWiki?
- The citation for lncRNAWiki is:
  - LncRNAWiki: harnessing community knowledge in collaborative curation of human long non-coding RNAs, submitted and under review.
- Other related publications:
  - On the classification of long non-coding RNAs, RNA Biology, 2013, 10(6):925-933.
  - AuthorReward: increasing community curation in biological knowledge wikis through automated authorship quantification, Bioinformatics, 2013, 29(14):1837-1839.
  - RiceWiki: a wiki-based database for community curation of rice genes, Nucleic Acids Research, 2014, 42(D1):D1222-D1228.
  - Community intelligence in knowledge curation: an application to managing scientific nomenclature, PLoS ONE, 2013, 8(2):e56961.

LncRNAWiki Accounts

Can any user edit lncRNAWiki?

LncRNAWiki allows any user to view and search, but only registered users can add and edit content.

Why should I disclose my identity and become a registered user?

The open identity that registration provides improves content reliability, fosters collaboration and communication among users, and makes it possible to reward community curation with explicit authorship. This is crucial for lncRNAWiki, which aims to credit every contributor for community-provided content.

How do I acquire a LncRNAWiki account?

Please email the LncRNAWiki Team at lncwiki@big.ac.cn to tell us your preferred login name, real name, research interests, etc., and we will set up an account for you.

How do I update my account information or change password?

To update your account information, log in first and then follow the "My preferences" link at the top right.

Why can't I log into lncRNAWiki?
- Make sure that Caps Lock is off. Passwords are case sensitive.
- Make sure your browser is set to accept cookies.
- Contact us at lncwiki@big.ac.cn.

Editing tips

Formatting: http://www.mediawiki.org/wiki/Help:Formatting
Images: http://www.mediawiki.org/wiki/Help:Images
Links: http://www.mediawiki.org/wiki/Help:Links
Tables: http://www.mediawiki.org/wiki/Help:Tables
Lists: http://www.mediawiki.org/wiki/Help:Lists

Database content

Annotated Information

Annotated Information is organized as free text, so users can share their knowledge and contribute edits without training in curation or wiki techniques. It can also be divided into several sub-sections, such as Function, Evolution and Expression, which makes it easy to direct users to the sub-section(s) of interest. Although these sub-sections are preset, new sub-sections can easily be added and irrelevant ones deleted.

The nomenclature of locus-specific transcripts

Example: N07QT0022001-SABC-LHPXX02 indicates the first non-coding transcript on the q-arm of chromosome 7, positioned in the 22nd bin and transcribed toward the telomere, which resides in the ABC gene locus in the sense direction, is a long, highly expressed transcript, and has an alternative promoter.

A locus-specific nomenclature is proposed as the accession number for transcripts. It contains three consecutive segments joined by hyphens.

Segment 1 contains 12 positions that encode genomic information. The first three positions define the nature of a transcript and the chromosome on which it resides. The capital letters N and C denote non-coding and coding transcripts, respectively. The chromosome is given in two positions, with a leading 0 filling the vacant position when the chromosome number has a single digit (such as 01 and 07) or when the chromosome is X or Y (0X and 0Y). The fourth and fifth positions give the chromosome arm and the transcription direction: Q for the q-arm and P for the p-arm (when the transcript lies on the centromere or telomere, or in a centromeric or subtelomeric region, this position is 0 rather than Q or P), followed by T for toward the telomere or C for toward the centromere. The next seven digits position the transcript: the first four give the number of the bin in which the gene locus resides, and the remaining three give the number of the transcript within the bin, assigned on a first-come, first-named basis. The bins of a given chromosome are numbered from centromere to telomere, with a nominal size of 100 Kb. For the centromere location of each chromosome, refer to Media:Hg19cytoBand.txt.
Segment 2 has four positions that define a gene locus. In the first position, S indicates that a non-coding transcript overlaps a protein-coding gene on the same strand (in the same direction), and A indicates that it overlaps a protein-coding gene on the antisense strand (in the opposite direction). For a protein-coding locus, the gene name is adopted directly. For a non-coding transcript, either the name of a nearby gene or a named non-coding RNA locus is abbreviated. When a non-coding RNA has both S and A relationships with different protein-coding genes, the S relationship takes priority.
Segment 3, the last segment, defines the characteristics of all transcripts in a locus and has seven positions: (1) Transcript size is large (L, >500 bp), medium (M, 100 bp to 500 bp), or small (S, 20 bp to 100 bp). (2) Expression level is high (H, RPKM or similar units >100), moderate (M, 10 to 100), or low (L, <10). For lncRNAs that are differentially expressed, the highest expression level is used. (3) Alternative promoter (P), alternative exon (E) and alternative poly(A) (A) are indicated for all transcripts in the locus. When an alternative form is absent, X fills its position. (4) Two digits give the number of transcriptional variants.
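
The naming scheme described above can be sketched as a small parser. This is a minimal illustration in Python, not LncRNAWiki's own code: the function name parse_accession, the field names and the regular expression are hypothetical. It assumes that the three alternative-form positions always appear in the order promoter, exon, poly(A), that the code letter for alternative poly(A) is A, that the locus abbreviation in Segment 2 is exactly three characters, and that only human chromosomes 1 to 22, X and Y occur.

```python
import re

# Hypothetical pattern for the three hyphen-joined segments (an assumption
# derived from the prose description, not an official grammar).
ACCESSION_RE = re.compile(
    r"^(?P<nature>[NC])"                       # N = non-coding, C = coding
    r"(?P<chrom>0[1-9XY]|1[0-9]|2[0-2])"       # two-position chromosome
    r"(?P<arm>[QP0])"                          # q-arm, p-arm, or 0
    r"(?P<direction>[TC])"                     # toward telomere / centromere
    r"(?P<bin>[0-9]{4})"                       # bin number (100 Kb bins)
    r"(?P<index>[0-9]{3})"                     # transcript number within bin
    r"-(?P<relation>[SA])(?P<locus>[A-Za-z0-9]{3})"
    r"-(?P<size>[LMS])(?P<expression>[HML])"
    r"(?P<promoter>[PX])(?P<exon>[EX])(?P<polya>[AX])"
    r"(?P<variants>[0-9]{2})$"
)


def parse_accession(accession: str) -> dict:
    """Split a locus-specific accession into named fields."""
    match = ACCESSION_RE.match(accession)
    if match is None:
        raise ValueError(f"not a locus-specific accession: {accession!r}")
    f = match.groupdict()
    return {
        "coding": f["nature"] == "C",
        "chromosome": f["chrom"].lstrip("0"),   # "07" -> "7", "0X" -> "X"
        "arm": {"Q": "q", "P": "p", "0": None}[f["arm"]],
        "direction": {"T": "telomere", "C": "centromere"}[f["direction"]],
        "bin": int(f["bin"]),
        "index_in_bin": int(f["index"]),
        "relation": {"S": "sense", "A": "antisense"}[f["relation"]],
        "locus": f["locus"],
        "size": f["size"],                      # L, M or S
        "expression": f["expression"],          # H, M or L
        "alt_promoter": f["promoter"] == "P",
        "alt_exon": f["exon"] == "E",
        "alt_polya": f["polya"] == "A",
        "variants": int(f["variants"]),
    }


if __name__ == "__main__":
    for key, value in parse_accession("N07QT0022001-SABC-LHPXX02").items():
        print(f"{key}: {value}")
```

Applied to the example accession, the sketch reports a non-coding transcript on chromosome 7, q-arm, bin 22, first transcript in the bin, in sense relation to the ABC locus. Anchoring the pattern at both ends means identifiers carrying an extra prefix are rejected rather than silently accepted.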

Basic Information

Basic Information includes 10 sub-sections: ‘Transcript ID’, ‘Source’, ‘Same with’, ‘Classification’, ‘Length’, ‘Genomic location’, ‘Exon number’, ‘Exons’, ‘Genome context’, and ‘Sequence’.

Transcript ID: The original lncRNA ID in each database.
Source: The database and version that this lncRNA is from.
Same with: LncRNAs that have the same sequence and also the same genomic location in other databases.
Classification: Classification based on genomic location and context. We obtained genomic location information from GENCODE, NONCODE and LNCipedia. Following the categories of Derrien et al. (Derrien, T. et al. (2012) The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression. Genome Res), we classified lncRNAs into seven groups (Intergenic, Intronic (S), Intronic (AS), Overlapping (S), Overlapping (AS), Sense and Antisense) based on their genomic location with respect to protein-coding genes. Our classification differs from Derrien’s in that we assign lncRNAs that intersect protein-coding genes to Sense or Antisense by considering the whole transcript sequence rather than the exonic regions only. Most lncRNAs belong to a single category; a small number (1,264) belong to more than one.
- Intergenic: lncRNAs are transcribed from intergenic regions.
- Intronic (S): lncRNAs are transcribed entirely from introns of protein-coding genes.
- Intronic (AS): lncRNAs are transcribed from the antisense strand of protein-coding genes and the entire sequences are covered by introns of protein-coding genes.
- Overlapping (S): lncRNAs that contain coding genes within an intron on the sense strand.
- Overlapping (AS): lncRNAs that contain coding genes within an intron on the antisense strand.
- Sense: lncRNAs are transcribed from the sense strand of protein-coding genes, and either the entire lncRNA sequence is covered by a protein-coding gene (Intronic lncRNAs are not included), or the entire sequence of a protein-coding gene is covered by the lncRNA (Overlapping lncRNAs are not included), or the lncRNA and the protein-coding gene intersect partially.
- Antisense: lncRNAs are transcribed from the antisense strand of protein-coding genes, and either the entire lncRNA sequence is covered by a protein-coding gene (Intronic lncRNAs are not included), or the entire sequence of a protein-coding gene is covered by the lncRNA (Overlapping lncRNAs are not included), or the lncRNA and the protein-coding gene intersect partially.
Length, Genomic location, Exon number and Exons: The length, genomic location and exon number of the lncRNA, and the genomic location of each exon. This information is obtained from GENCODE, NONCODE and LNCipedia annotation.
Genome context: We integrated JBrowse (version 1.11.4) (http://jbrowse.org/) into LncRNAWiki to facilitate visualization of the genomic context and transcript structure of each lncRNA.
Sequence: The transcript sequence of this lncRNA.

Figure: Genomic location and context of lncRNAs. Protein-coding genes and their exons are shown in blue, while lncRNAs and their exons are shown in red.
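
The seven classification groups can be illustrated with a short interval-based sketch. This is only one possible reading of the category definitions, not the pipeline actually used by LncRNAWiki: the Transcript class, the classify function and the coordinate convention (1-based, inclusive) are all hypothetical, and the sketch assumes that (S) and (AS) mean same strand and opposite strand relative to the protein-coding gene.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Interval = Tuple[int, int]  # assumed 1-based, inclusive (start, end)


@dataclass
class Transcript:
    strand: str               # "+" or "-"
    exons: List[Interval]     # exon coordinates on one chromosome

    @property
    def span(self) -> Interval:
        return (min(s for s, _ in self.exons), max(e for _, e in self.exons))

    def introns(self) -> List[Interval]:
        ex = sorted(self.exons)
        return [(a_end + 1, b_start - 1)
                for (_, a_end), (b_start, _) in zip(ex, ex[1:])
                if b_start - a_end > 1]


def _within(inner: Interval, outer: Interval) -> bool:
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def _overlaps(a: Interval, b: Interval) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]


def classify(lnc: Transcript, genes: List[Transcript]) -> Set[str]:
    """Return every category the lncRNA falls into, given coding genes."""
    categories: Set[str] = set()
    for gene in genes:
        if not _overlaps(lnc.span, gene.span):
            continue  # this gene does not intersect the lncRNA at all
        tag = "S" if lnc.strand == gene.strand else "AS"
        if any(_within(lnc.span, intron) for intron in gene.introns()):
            categories.add(f"Intronic ({tag})")
        elif any(_within(gene.span, intron) for intron in lnc.introns()):
            categories.add(f"Overlapping ({tag})")
        else:
            # contained, containing, or partial intersection over the
            # whole transcript span rather than exons only
            categories.add("Sense" if tag == "S" else "Antisense")
    return categories or {"Intergenic"}


if __name__ == "__main__":
    gene = Transcript("+", [(1000, 1200), (3000, 3400), (4800, 5000)])
    lnc = Transcript("-", [(1500, 2500)])
    print(classify(lnc, [gene]))
```

Returning a set rather than a single label reflects the note above that some lncRNAs belong to more than one category when they intersect several genes.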

Predicted Small Protein

Predicted Small Protein (proteins of 100 amino acids or fewer in the absence of processing) includes 13 sub-sections: 'Name', 'Length(aa)', 'Molecular weight', 'Aromaticity', 'Instability index', 'Isoelectric point', 'Runs', 'Runs residual', 'Runs probability', 'Amino acid sequence', 'Secondary structure', 'PRMN', and 'PiMo'.

Name: The name of predicted small proteins.
Length (aa), Molecular weight, Aromaticity, Instability index, Isoelectric point: The length (aa), molecular weight, aromaticity, instability index and isoelectric point of the predicted small protein.
Runs: The runs of secondary structure of predicted small proteins.
Runs residual: The difference between the runs/length of the predicted small protein and the average runs/length of mRNAs of the same length as the predicted small protein.
Runs probability: The average probability of the predicted small protein runs.
Secondary structure: The amino acid sequence of the predicted small protein and its secondary structure.
PRMN: Refined predicted transmembrane helix.
PiMo: Predicted transmembrane region.
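
A few of these descriptors can be sketched in plain Python. The definitions below are assumptions made for illustration and may differ from those LncRNAWiki uses: aromaticity is taken as the fraction of Phe, Trp and Tyr residues, a run is taken as a maximal stretch of identical secondary-structure states, and the reference value passed to runs_residual is a hypothetical number that would have to come from the same-length reference set described above. All function names are hypothetical.

```python
from itertools import groupby

AROMATIC = frozenset("FWY")  # Phe, Trp, Tyr (assumed definition)


def is_small_protein(sequence: str, max_length: int = 100) -> bool:
    """True for proteins of 100 amino acids or fewer."""
    return 0 < len(sequence) <= max_length


def aromaticity(sequence: str) -> float:
    """Fraction of aromatic residues in the sequence."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(aa in AROMATIC for aa in sequence.upper()) / len(sequence)


def count_runs(secondary_structure: str) -> int:
    """Number of maximal stretches of one state, e.g. 'CCHHC' has 3."""
    return sum(1 for _ in groupby(secondary_structure))


def runs_residual(secondary_structure: str, reference_runs_per_length: float) -> float:
    """Runs/length of this protein minus a reference average runs/length."""
    if not secondary_structure:
        raise ValueError("empty secondary-structure string")
    per_length = count_runs(secondary_structure) / len(secondary_structure)
    return per_length - reference_runs_per_length


if __name__ == "__main__":
    ss = "CCHHHHCCEEEC"            # made-up secondary-structure string
    print(count_runs(ss))
    print(runs_residual(ss, 0.25))  # 0.25 is a made-up reference value
```

Molecular weight, instability index and isoelectric point are left out because they depend on lookup tables that the page does not specify.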

@@ Line 15: / Line 15: @@
 * How do I cite lncRNAWiki?
 ** The citation for lncRNAWiki is:
-*** Unavailable
+*** LncRNAWiki: harnessing community knowledge in collaborative curation of human long non-coding RNAs, submitted and under review.
 ** Other related publications:
-*** [http://www.ncbi.nlm.nih.gov/pubmed/24136999 RiceWiki: a wiki-based database for community curation of rice genes], ''Nucleic Acids Research'', 2014,42(D1),D1222-D1228.
 *** [http://http://www.ncbi.nlm.nih.gov/pubmed/23696037 On the classification of long non-coding RNAs], ''RNA Biology'', 2013, 10(6),925-933.
 *** [http://www.ncbi.nlm.nih.gov/pubmed/23732274 AuthorReward: increasing community curation in biological knowledge wikis through automated authorship quantification], ''Bioinformatics'', 2013, 29(14):1837-1839.
+*** [http://www.ncbi.nlm.nih.gov/pubmed/24136999 RiceWiki: a wiki-based database for community curation of rice genes], ''Nucleic Acids Research'', 2014,42(D1),D1222-D1228.
 *** [http://www.ncbi.nlm.nih.gov/pubmed/23451119 Community intelligence in knowledge curation: an application to managing scientific nomenclature], ''PLoS ONE'', 2013, 8(2):e56961.
@@ Line 38: / Line 38: @@
 ** Make sure that the Caps Lock key is not depressed. Passwords are case sensitive.
 ** Make sure your browser is set to accept cookies.
-** Contact us at lncrnawiki@big.ac.cn.
+** Contact us at lncwiki@big.ac.cn.
 == Editing tips ==
@@ Line 50: / Line 50: @@
 == Database content ==
 ====Annotated Information====
+Annotated Information is organized as free text and of great helpfulness for users who share their knowledge and contribute edits without training in the curation or wiki techniques. It can also fall into several sub-sections, such as Function, Evolution, Expression, making it convenient to direct users to the sub-section(s) of interest. Although these sub-sections are preset, new sub-section can be easily added and irrelevant sub-section(s) can be deleted.
 * '''The nomenclature of locus-specific transcripts'''
-Example: '''hsa-N07QT0022001-SABC-LHPXX02''' indicates the first non-coding transcript on the q-arm of chromosome 7 positioned on the 22th bin toward telomere, which resides in ABC gene locus in a sense direction, is a long transcript and highly expressed, and has an alternative promoter.
+Example: '''N07QT0022001-SABC-LHPXX02''' indicates the first non-coding transcript on the q-arm of chromosome 7 positioned on the 22th bin toward telomere, which resides in ABC gene locus in a sense direction, is a long transcript and highly expressed, and has an alternative promoter.
 A locus-specific nomenclature is proposed as accession number for transcripts. It contains four consecutive segments joined by hyphens.
-:#Segment 1 contains a three-letter code for species name. The standard naming convention is adapted: the first letter of genus name + the first two letter of the species name. For instance, Homo sapiens and Mus musculus are abbreviated as hsa and mmu, respectively.
+:#Segment 1 contains 12 positions that display genomic information. The first three positions define the nature of a transcript and its residing chromosome. Capital letter N and C are dedicated to non-coding and coding transcripts, respectively. Chromosome numbers are indicated with two positions and 0 can be added to take the vacant position when the chromosome number is less than two digits (such as 01 and 07) or when X and Y are encountered (0X and 0Y are used). Transcription direction is, defined by the fourth and fifth positions, indicated with Q for the q-arms, P for the p-arms (when the transcript is on the centromere or telomere as well as subtelomeric and centromeric regions, the position is labeled as 0 rather than the definite Q or P), T for toward telomere, and C for toward centromere. The next seven digits are devoted to position the transcript; the first four define the bin number where the gene locus resides and the rest three define the number of transcripts within the bin based on first-come-first-name rule. The bins of a given chromosome are named from centromere to telomere in a nominal size of 100 Kb.To obtain the centromere location of each chromosome, you can refer to [[Media:Hg19cytoBand.txt]].
-:#Segment 2 contains 12 positions that display genomic information. The first three positions define the nature of a transcript and its residing chromosome. Capital letter N and C are dedicated to non-coding and coding transcripts, respectively. Chromosome numbers are indicated with two positions and 0 can be added to take the vacant position when the chromosome number is less than two digits (such as 01 and 07) or when X and Y are encountered (0X and 0Y are used). Transcription direction is, defined by the fourth and fifth positions, indicated with Q for the q-arms, P for the p-arms (when the transcript is on the centromere or telomere as well as subtelomeric and centromeric regions, the position is labeled as 0 rather than the definite Q or P), T for toward telomere, and C for toward centromere. The next seven digits are devoted to position the transcript; the first four define the bin number where the gene locus resides and the rest three define the number of transcripts within the bin based on first-come-first-name rule. The bins of a given chromosome are named from centromere to telomere in a nominal size of 100 Kb.
+:#Segment 2 has four positions that define a gene locus. At the first position, S is used to indicate a non-coding transcript overlaps with a protein-coding gene on the same strand or in the same direction. A is used to indicate a non-coding transcript overlapping with the protein-coding gene on the antisense strand or in the opposite direction. For a protein-coding locus, the gene name is directly adapted. For a non-coding transcript, either the name of a nearby gene or a named non-coding RNA locus is abbreviated. When non-coding RNA has both S and A relationships with different protein-coding genes, S relationship is considered with priority.
-:#Segment 3 has four positions that define a gene locus. At the first position, S is used to indicate a non-coding transcript overlaps with a protein-coding gene on the same strand or in the same direction. A is used to indicate a non-coding transcript overlapping with the protein-coding gene on the antisense strand or in the opposite direction. For a protein-coding locus, the gene name is directly adapted. For a non-coding transcript, either the name of a nearby gene or a named non-coding RNA locus is abbreviated. When non-coding RNA has both S and A relationships with different protein-coding genes, S relationship is considered with priority.
+:#The last segment is used to define the characteristics of all transcripts in a locus. Seven positions are included: (1) transcript size is defined as large (L, >500 bp), mediate (M, from 100 bp to 500 bp), and small (S, from 20 bp to 100 bp). (2) Expression level is defined in three categories that include highly (H, RPKM or similar units>100), moderate (M, from 10 to 100), and low (L, <10). For lncRNAs that are differentially expressed, the highest expression level is considered. (3) Alternative promoter (P), alternative exon (E), alternative poly (A) are also indicated for all transcripts in the locus.  When one or two of the three alternative forms are absent, X can be filled in to take the position(s). (4) Two digits are dedicated to accommodate the number of transcriptional variants.
-:#The last segment is used to define the characteristics of all transcripts in a locus. Seven positions are included: (1) transcript size is defined as large (L, >500 bp), mediate (M, from 100 bp to 500 bp), and small (S, from 20 bp to 100 bp). (2) Expression level is defined in three categories that include highly (H, RPKM or similar units>100), moderate (M, from 10 to 100), and low (L, <10). For lncRNAs that are differentially expressed, the highest expression level is considered. (3) Alternative promoter (P), alternative exon (E), alternative poly (A) (A) are also indicated for all transcripts in the locus.  When one or two of the three alternative forms are absent, X can be filled in to take the position(s). (4) Two digits are dedicated to accommodate the number of transcriptional variants.
 ====Basic Information====
-* '''Genomic location''': This information is obtained from GENCODE and LNCipedia annotation.
+Basic information includs 10 sub-sections, ‘Transcript ID’, ‘Source’, ‘Same with’, ‘Classification’, ‘Length’, ‘Genomic location’, ‘Exon number’, ‘Exons’, ‘Genome context’, and ‘Sequence’.
-* '''Same with''': LncRNAs that have the same sequence and also the same genomic location in other databases. For lncRNAs that do not have genomic location annotation, same with means having the same sequence.
-* '''Classification''': Classification based on genomic location and context. We obtained genome location information from Gencode and LNCipedia annotation and classified lncRNAs into four groups ('''Intergenic''', '''Intronic''', '''Sense''' and '''Antisense''') based on their genomic location in respect to protein-coding genes. As it is more complicated for the condition of sense and antisense, we also listed different kinds of sense and antisense lncRNAs, which can be viewed through different categories.
+* '''Transcript ID''': The original lncRNA ID in each database.
+* '''Source''': The database and version that this lncRNA is from.
+* '''Same with''': LncRNAs that have the same sequence and also the same genomic location in other databases.
+* '''Classification''': Classification based on genomic location and context. We obtained genome location information from GENCODE, NONCODE and LNCipedia. Based on the categories of Derrien et al. (Derrien, T. et al. (2012) The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression. Genome Res), we classified lncRNAs into seven groups ('''Intergenic''', '''Intronic (S)''','''Intronic (AS)''','''Overlapping (S)''','''Overlapping (AS)''', '''Sense''' and '''Antisense''') based on their genomic location in respect to protein-coding genes. The difference between our classification and Derrien’s is that we classified lncRNAs that intersect protein-coding genes into Sense or Antisense by considering the whole transcript sequence instead of exonic region only.Most of the lncRNAs belong to only one category, a small number (1,264) belong to more than one category.
 ** Intergenic: lncRNAs are transcribed from intergenic regions.
-** Intronic: lncRNAs are transcribed entirely from introns of protein-coding genes.
+** Intronic (S): lncRNAs are transcribed entirely from introns of protein-coding genes.
-** Sense(contained): lncRNAs are transcribed from the sense strand of protein-coding genes and the entire sequence of lncRNAs are covered by protein-coding genes. Note, intronic lncRNAs are not included.
+** Intronic (AS): lncRNAs are transcribed from antisense strand of protein-coding genes and the entire sequences are covered by introns of protein-coding genes.
-** Sense(contain): lncRNAs are transcribed from the sense strand of protein-coding genes and cover the entire sequences of protein-coding genes.
+** Overlapping (S): lncRNAs that contain coding genes within an intron on the sense strand.
-** Sense(overlap): lncRNAs are transcribed from the sense strand of protein-coding genes and part of their sequences are overlapped with part of the protein-coding genes.
+** Overlapping (AS): lncRNAs that contain coding genes within an intron on the antisense strand.
-** Antisense(intronic): lncRNAs are transcribed from antisense strand of protein-coding genes and the entire sequences are covered by introns of protein-coding genes.
+** Sense: lncRNAs are transcribed from the sense strand of protein-coding genes and the entire sequence of lncRNAs are covered by protein-coding genes (Intronic lncRNAs are not included), or the entire sequence of protein-coding genes are covered by lncRNAs (Overlapping lncRNAs are not included ), or both lncRNAs and protein-coding genes intersect each other partially.
-** Antisense(contained): lncRNAs are transcribed from the antisense strand of protein-coding genes and the entire sequence of lncRNAs are covered by protein-coding genes. Note, antisense intronic lncRNAs are not included.
+** Antisense: lncRNAs are transcribed from the antisense strand of protein-coding genes and the entire sequence of lncRNAs are covered by protein-coding genes (Intronic lncRNAs are not included), or the entire sequence of protein-coding genes are covered by lncRNAs (Overlapping lncRNAs are not included ), or both lncRNAs and protein-coding genes intersect each other partially.
-** Antisense(contain): lncRNAs are transcribed from the antisense strand of protein-coding genes and cover the entire sequences of protein-coding genes.
+* '''Length, Genomic location, Exon number and Exons''': The length, genomic location, exon number of lncRNA, and genomic location of each exon. These information is obtained from GENCODE, NONCODE and LNCipedia annotation.
-** Antisense(overlap): lncRNAs are transcribed from the antisense strand of protein-coding genes and part of their sequences are overlapped with part of the protein-coding genes.
+* '''Genome context''': We integrated JBrowse (version 1.11.4) (http://jbrowse.org/) into LncRNAWiki to facilitate visualization of the genomic context and transcript structure of each lncRNA.
+* '''Sequence''': The transcript sequence of this lncRNA.
-[[File:web_figure3.png|center|550px]]
+[[File:web3.png|center|550px]]
-<center>'''Figure 1. Genomic location and context of lncRNAs. Protein-coding genes and their exons are represented by blue color, while lncRNAs and their exons are represented by red color.''' </center>
-===Annotation (From GENCODE)===
+<center>'''Genomic location and context of lncRNAs'''</center>
-The annotation section now includes at most three sub-sections, namely, ‘Annotation from GENCODE’, ‘Annotation from LNCipedia’, and ‘Annotation from LncRNAdb’. We extracted genomic location information from GENCODE and LNCipedia and put this information in section ‘Basic Information’. Therefore, genomic location information is removed from the corresponding annotation sections. Also, sequence information is transferred to the section of ‘Basic Information’. For users who want to find detailed information of lncRNA exons, they can refer to ‘Annotation from GENCODE’.
+<center>Protein-coding genes and their exons are represented by blue color, while lncRNAs and their exons are represented by red color. </center>
-===Annotation (From LNCipedia)===
-If users want to find information such as lncRNA secondary structure and miRNA binding, they can refer to ‘Annotation from LNCipedia’.
-===Annotation (From LncRNAdb)===
+====Predicted Small Protein====
-If users want to find literature reported information such as expression, evolution, and biological functional, they can refer to ‘Annotation from lncRNAdb’
+Predicted small protein (proteins of 100 amino acids or less in the absence of processing) includs 13 sub-sections, 'Name', 'Length(aa)', 'Molecular weight', 'Aromaticity', 'Instability index', 'Isoelectric point', 'Runs', 'Runs residual', 'Runs probability', 'Amino acid sequence', 'Secondary structure', 'PRMN', and 'PiMo'.
+* '''Name''': The name of predicted small proteins.
+* '''Length (aa)''', '''Molecular weight''', '''Aromaticity''', '''Instability index''', '''Isoelectric point''': The length (aa), molecular weight, aromaticity, instability index, isoelectric point of predicted small proteins.
+* '''Runs''': The runs of secondary structure of predicted small proteins.
+* '''Runs residual''': The residual between runs/length of the predicted small protein and the average runs/length of mRNAs which have the same length with the predicted small protein.
+* '''Runs probability''': The average probability of the predicted small protein runs.
+* '''Secondary structure''': The amino acid sequence and its secondary structure of the predicted small protein.
+* '''PRMN''': Refined predicted transmembrane helix.
+* '''PiMo''': Predicted transmembrane region.
