Frequently Asked Questions

For the gene pages, PomBase curators have chosen default UTR features using three data sources and a set of precedence criteria:

  1. Highest priority is given to data from low-throughput "conventional" experiments performed on individual mRNAs and reported in publications or submitted to EMBL. Where low-throughput data are not available, one of two high-throughput datasets is used.
  2. The Broad dataset published in 2011 by Rhind et al. (PMID:21511999) takes precedence because it is the most recent, has higher resolution, and detected splicing within the UTRs.
  3. For genes not covered by (1) or (2), start/end data from Lantermann et al. (PMID:20118936) based on transcriptome data from Dutrow et al. (PMID:18641648) are used.
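
The precedence rules above amount to "take the first source that has data". A minimal Python sketch of that logic follows; the source labels and the coordinate tuples are hypothetical names chosen for illustration, not identifiers used by PomBase.

```python
from typing import Dict, Optional, Tuple

# Hypothetical labels for the three data sources, highest priority first.
PRECEDENCE = ["low_throughput", "broad_rhind_2011", "lantermann_dutrow"]

Coords = Tuple[int, int]


def choose_default_utr(candidates: Dict[str, Coords]) -> Optional[Tuple[str, Coords]]:
    """Return (source, coords) from the highest-priority source that has data."""
    for source in PRECEDENCE:
        coords = candidates.get(source)
        if coords is not None:
            return source, coords
    return None  # no UTR data from any source


# A gene with no low-throughput data falls back to the Broad dataset.
example = {"broad_rhind_2011": (1200, 1350), "lantermann_dutrow": (1180, 1350)}
print(choose_default_utr(example))
```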


More information is available in the mailing list archive for both HTP datasets (Broad: http://lists.sanger.ac.uk/pipermail/pombelist/2011/000856.html ; Lantermann/Dutrow: http://lists.sanger.ac.uk/pipermail/pombelist/2011/000814.html).

Transcript start and end coordinates from all sources will be available as individual data tracks in the Ensembl genome browser in the near future, which will allow you to view and evaluate them. PomBase will also curate splice and transcript variants as data become available.

Subscribe to this list: http://publists.sanger.ac.uk/mailman/listinfo/yeast_orthologous_groups

In the future we will make the manually curated list of orthologs and orthologous groups identified between fission and budding yeast available for download from the PomBase web site.

Each gene is assigned exactly one characterisation status that reflects how much is known about the gene, whether it is conserved, etc. Specific status descriptions:

  • Experimentally characterised: Completely or partially characterised in a small scale experiment, with some published information about the biological role (corresponding to any of the fission yeast GO slim biological process terms)
  • Role inferred from homology: A biological role (as above, a fission yeast GO slim term) is inferred from homology to an experimentally characterised gene product
  • Conserved protein (unknown biological role): Conserved outside the Schizosaccharomyces, but nothing known about the biological role in any organism
  • S. pombe specific families: Unpublished and found only in fission yeast (S. pombe, S. octosporus or S. japonicus); nothing known about biological role, but are not single copy (duplications in fission yeast)
  • Sequence orphans: Unpublished and found only in fission yeast (S. pombe, S. octosporus or S. japonicus); nothing known about biological role
  • Dubious: Unlikely to be protein coding

A current summary of gene characterisation status for the S. pombe genome is available, as well as a table of historical characterisation status counts.

The list is on the Priority Unstudied Genes page. As genes are annotated, each is assigned a status (see Gene Characterisation for status descriptions), and genes with "conserved unknown" status are evaluated for inclusion in the Priority Unstudied Genes list. Species distribution is assigned manually on a case-by-case basis, taking into account multiple criteria; additional information is available from PomBase curators upon request.

Most non-coding RNAs in PomBase are based on transcriptome data, either from Jurg Bähler's lab (Solexa/deep sequencing; PMID:18488015) or Nick Rhind's lab (RNA sequencing; PMID:21511999). For any ncRNA, the source should be linked as a publication in the "Literature" section at the bottom of the PomBase gene page. To get an idea of the transcription in a region, you can look at the Bähler Lab Transcriptome Viewer, which is linked from most gene pages, e.g. SPNCRNA.200. Unfortunately some genes, such as SPNCRNA.1115, post-date the viewer and therefore do not have entries, but you can look at transcription in this region by accessing a neighboring gene.

Yes. At present the best way is to search for the Fission Yeast Phenotype Ontology term "inviable", using its unique ID (FYPO:0000049). Go to the "Find" tab, and click "Advanced Search" to go to the query builder (or go directly to http://www.pombase.org/spombe/query/builder). In the "Select Filter" pulldown, choose "FYPO Accession", and then type or paste "FYPO:0000049" into the box. Click the Submit button (below the filter pulldown) to run the search.

At present you'll get a list of 1275 genes; this list includes all genes that showed inviable phenotypes in the HTP deletion project as well as some manually annotated genes, and therefore should be comprehensive. You can download the list in plain text or a few other formats from the query results page.

In the near future we hope to make this and other frequently-used gene lists available for download without requiring users to do a search.

The reference sequence was last updated in January 2007; only feature coordinates and annotation have changed since then. See Sequence Updates and Sequence Updates Pending for more information.

On the Genome Statistics page: http://www.pombase.org/status/statistics

The Fission Yeast GO slim terms page provides a generic GO slim for S. pombe, and shows total genes annotated directly or by inference to each term. For further information on using the generic S. pombe slim, or on creating your own GO slim, please see the Fission Yeast GO slimming tips page.

Yes, there is a file that lists GO macromolecular complex assignments for fission yeast gene products in the FTP directory:

ftp://ftp.ebi.ac.uk/pub/databases/pombase/pombe/Complexes/

There is some redundancy in the list, because some gene products are annotated to both complexes and subcomplexes. For example, subunits of the DASH complex (GO:0042729) are annotated to 'condensed nuclear chromosome kinetochore' (GO:0000778) as well as GO:0042729. Additional notes are available in a README file: ftp://ftp.ebi.ac.uk/pub/databases/pombase/pombe/Complexes/README

We plan to add a filter to the Advanced Search that will let you retrieve all genes at once. In the meantime, you can construct a compound query by performing a series of queries, one for each type of gene, and then combining them in the Query Management panel:

  1. Go to the New Query panel (http://www.pombase.org/spombe/query/builder), select the filter 'Genes By Type', and leave the default 'protein_coding' option selected in the second pulldown. Submit the query.
  2. Repeat for the feature types 'snoRNA', 'snRNA', 'tRNA', 'rRNA', 'ncRNA' and (if desired) 'pseudogene'.
  3. Go to the Query Management tab, and select all of the feature type queries you have just done. Click the 'Join (OR)' button. The result of this last query is all S. pombe genes. You can combine this result set with other queries in Query Management.


Note that if you instead search for 'Genes By Chromosome' you will not retrieve genes on the separate contig representing the h+ mating type region (see the Mating Type Region page).

The current S. pombe genome assembly does not include the complete telomeric regions or the telomeric short repeats. These omissions are beyond the control of PomBase curators. Subtelomeric repeats are also not explicitly defined at present, although we hope to provide this information in the future. Additional information about S. pombe telomeres is available on the Telomeres page (linked via the Genome Status menu).

Browse for Chromosome II:2129208-2137121, and see the Mating Type Region page.

The mating type region will soon be annotated as a feature, and refer to a Sequence Ontology term.

Centromeres can be retrieved in the PomBase Ensembl browser; the coordinates are:
Chromosome I:3753687-3789421
Chromosome II:1602264-1644747
Chromosome III:1070904-1137003
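
If you need these coordinates in a script, a small lookup like the following Python sketch may help. It assumes the coordinates are 1-based and inclusive, which is the usual Ensembl convention but is not stated above; the function names are illustrative.

```python
# Centromere coordinates as listed above (assumed 1-based, inclusive).
CENTROMERES = {
    "I": (3753687, 3789421),
    "II": (1602264, 1644747),
    "III": (1070904, 1137003),
}


def in_centromere(chromosome: str, position: int) -> bool:
    """Return True if the position lies within the listed centromere."""
    start, end = CENTROMERES[chromosome]
    return start <= position <= end


def centromere_length(chromosome: str) -> int:
    """Length in bp of the listed centromere interval (inclusive ends)."""
    start, end = CENTROMERES[chromosome]
    return end - start + 1


for name in CENTROMERES:
    print(name, centromere_length(name))
```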

Sequence features within the centromeres, such as repeats, will soon be annotated with Sequence Ontology terms. In the meantime, see the diagram on the Centromere sequencing status page.

The reference genome sequence excludes most of the ribosomal DNA (rDNA) repeats, which are present in two tandem arrays on chromosome III. These arrays are estimated to be 1225 kb and 240 kb in size for the sequenced strain (972 h-). The reference sequence includes two partial and one complete representative rDNA repeats:

In the Ensembl genome browser, click the "Configure this page" button in the left-hand bar. A pop-up box will appear. Note that this box has several tabs along its top, and the exact selection of tabs and configuration options depends on whether you are configuring the "Location", "Gene", or "Transcript" tab of the main browser.

To turn a track on or off, click the small box to the left of its description. Note that some tracks simply toggle on and off, whereas for others a small popup appears, in which you can select from a set of options controlling exactly how the track appears. The left-hand bar of the configuration popup organizes available tracks into subsets, and offers a few additional options (including "Reset configuration", which restores the default display).

For example, to show or hide repeat regions, make sure you have the "Location" tab selected. The tabs for this configuration allow you to configure the "Region" (lower) and "Overview" (upper) images separately. In the "Configure Region Image" tab, click "Repeat regions" in the popup's left-hand bar. You can then check one box to show all repeats, or select specific types of repeat to display.

When you are finished choosing tracks, click the tick/check mark in the upper right corner of the configuration popup.

No; only the Ensembl genome browser is available via the PomBase web site.

If you want to browse the S. pombe genome in the Artemis environment, it is fairly easy to download and run locally:

Once you have loaded the file(s), you can do many different things, e.g.:

  • Find features by name or ID
  • Find all features of a given type (e.g. see the "can I find transposons" FAQ)
  • Find matches to a specific nucleotide sequence (e.g. see the "restriction enzyme map" FAQ)
  • View the nucleotide or amino acid sequence of a region or feature
  • Export selected sequences

Also see the Artemis manual (pdf) for additional information.

Yes. In the Advanced Search, the Gene Systematic IDs and Gene Names filters both accept lists. You can type or paste lists of IDs/names into the box, separated by commas or with one ID or name per line.
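
If your gene list comes from a spreadsheet or a mixed source, it can be tidied before pasting. The Python sketch below accepts commas or line breaks as separators, as described above; removing blanks and duplicates is an extra convenience assumed here, not something the search is documented to require.

```python
import re
from typing import List


def parse_gene_list(text: str) -> List[str]:
    """Split on commas or line breaks, trim whitespace, drop blanks and repeats."""
    seen = set()
    result = []
    for token in re.split(r"[,\n]+", text):
        token = token.strip()
        if token and token not in seen:
            seen.add(token)
            result.append(token)
    return result


raw = "SPNCRNA.200, SPNCRNA.1115\ncti1\n\nSPNCRNA.200"
ids = parse_gene_list(raw)
print("\n".join(ids))  # one ID or name per line, ready to paste
```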

At present, there is a fixed set of data retrieved when you execute the search. We plan to offer more flexible options in the near future. Later, we also hope to allow you to upload a file containing your gene list.

For convenience, there is a direct link to a search page pre-configured to accept a list of systematic IDs available in the Find menu, on the Find page, and here:

http://www.pombase.org/spombe/query/builder?filter=12

Once you have done a search for your genes, the list of results will be available in the Query Management section of the Advanced Search, allowing you to combine the list with other lists or with additional search criteria.

Go to the Genome Browser (in the Tools menu), and enter coordinates in the 'Search for:' box. The format is 'I:100000..200000' or 'I:100000-200000' (i.e. use Roman numerals to specify the chromosome, and don't include the word "chromosome"; use either '..' or '-' between the start and end coordinates.)
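
To check a coordinate string before entering it, the format above can be expressed as a regular expression. This Python sketch is illustrative only: restricting chromosome names to I, II and III and requiring start <= end are assumptions made here, not documented browser behaviour.

```python
import re
from typing import Tuple

# Roman-numeral chromosome, a colon, then start and end joined by '..' or '-'.
REGION_RE = re.compile(r"^(I|II|III):(\d+)(?:\.\.|-)(\d+)$")


def parse_region(text: str) -> Tuple[str, int, int]:
    """Parse 'I:100000..200000' or 'I:100000-200000' into (chromosome, start, end)."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised region string: {text!r}")
    chromosome = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start coordinate is greater than end coordinate")
    return chromosome, start, end


print(parse_region("I:100000..200000"))
print(parse_region("II:2129208-2137121"))
```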

We plan to make conveniently downloadable intron datasets available in the near future. In the meantime, you can find genes with introns using the PomBase Advanced Search.

Search for genes with a specified number of exons, using the range 2 (i.e. at least one intron) to 20 (more than the maximum known, 16 introns). You can also restrict the search to protein-coding genes. Note that the PomBase count includes introns in UTRs (the count in the old GeneDB search didn't).

Instructions for searching PomBase

  1. Go to the Advanced Search - http://www.pombase.org/spombe/query/builder
  2. Under "Select Filter" choose "Genes That Have N Exons"
  3. Enter values: Minimum 2, Maximum 20
  4. Optional: to restrict to protein-coding genes, click "+". Leave the operator set to "AND", and choose "Genes by Type", then choose "protein_coding".
  5. Click "Submit". The results page has links to download the resulting list of genes or the genomic, cDNA or protein sequences. Note that we plan to offer additional download options, including coordinates, in the future.
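
The reasoning behind the 2 to 20 range is that a gene with N exons has N - 1 introns. The Python sketch below applies the same filter to a table of exon counts; the gene names and counts are made up for illustration and do not come from PomBase.

```python
from typing import Dict, List


def intron_count(exon_count: int) -> int:
    """A transcript with N exons has N - 1 introns (never fewer than zero)."""
    return max(exon_count - 1, 0)


def genes_with_introns(exon_counts: Dict[str, int],
                       minimum: int = 2, maximum: int = 20) -> List[str]:
    """Return genes whose exon count falls in [minimum, maximum], sorted by name."""
    return sorted(gene for gene, n in exon_counts.items() if minimum <= n <= maximum)


# Hypothetical exon counts, for illustration only.
exon_counts = {"geneA": 1, "geneB": 2, "geneC": 17}
for gene in genes_with_introns(exon_counts):
    print(gene, intron_count(exon_counts[gene]))
```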

We do not have up-to-date intron branch site data available on the PomBase web site yet; we plan to create and maintain files of these data in the future. In the meantime, PomBase curators can provide files created in July 2011 upon request. Note that the files were generated using old scripts and therefore may not be comprehensive.

Transcript start and end coordinates from all sources will be available as individual data tracks in the Ensembl genome browser in the near future, which will allow you to view, evaluate and download them. We also provide downloadable UTR data sets that are updated periodically.

Also see the precedence criteria used to choose default UTR features to display on gene pages.

You can search for genes annotated to a Fission Yeast Phenotype Ontology term in the Advanced Search (http://www.pombase.org/spombe/query/builder or go to the Find tab and click "Advanced Search").

In the "Select Filter" pulldown, if you know the ID (for example, "inviable" is FYPO:0000049, and "elongated cells" is FYPO:0000017) choose "FYPO Accession", and then type or paste the ID into the box. Otherwise, choose "FYPO (Partial) Name" and start typing; the autocomplete feature will suggest phenotypes. Choose one, and click the Submit button to run the search. You can download the list in plain text or a few other formats from the query results page.

In PomBase, S. cerevisiae orthologues are curated for S. pombe genes; note that other types of homologues (e.g. paralogues) are not curated. Ortholog curation in PomBase includes individual orthologs, and also orthologous groups that have been identified using multiple sources and manually curated in the pombe/cerevisiae ortholog resource.

To find S. pombe orthologs for a budding yeast gene, you can search for the systematic name (ORF name) of the S. cerevisiae gene in the Simple Search (go to http://www.pombase.org/search/ensembl or use the search box in the page header). For example, S. cerevisiae LRP1 has the systematic name YHR081W, and a search on this in PomBase will retrieve the S. pombe gene cti1. To find systematic names of S. cerevisiae genes, you can search SGD.

Yes: In the Advanced Search (http://www.pombase.org/spombe/query/builder), choose the "Conserved in ..." filter option. Then choose one of the descriptions, and submit.

Go to the Advanced Search - http://www.pombase.org/spombe/query/builder
Under "Select Filter" choose "Genes By Type", then choose "snoRNA". Click "Submit". You can download the resulting list of genes or the genomic sequences. We plan to offer additional download options, including coordinates, in the future.

Note that there are likely a number of snoRNAs that have not yet been identified and annotated in S. pombe; we hope to investigate further soon.

In the Advanced Search (http://www.pombase.org/spombe/query/builder), choose "Proteins That Have Transmembrane Domains" in the "Select Filter" pulldown, and submit.
We plan to add the ability to restrict the number of transmembrane domains (e.g. find proteins with 7 transmembrane domains) soon.

There are various ways you can find protein family members.

  1. If you know the Pfam, PRINTS, PROSITE, or InterPro accession for the family or domain you want, you can use the Advanced Search (http://www.pombase.org/spombe/query/builder). Go to the New Query tab, choose "Proteins That Have Specific Protein Domains" in the "Select Filter" pulldown, enter the accession, and submit.
  2. If you don't have an accession, but do know any member of the family, go directly to its gene page. In the "Protein Features" section of the gene page there is a table of protein domains and families, which includes a link to a list of all family members in S. pombe.
  3. If you know neither accessions nor family members, you can search for keywords in the InterPro database (http://www.ebi.ac.uk/interpro/), which combines signatures from a number of member databases, including Pfam. Record the accession number(s) of the family, and use them in the PomBase advanced search as described in item 1 above. (If necessary, you can use Query Management to combine the results of several queries.)

You can also try a keyword search in the PomBase advanced search, but this is much less reliable, because a keyword search may retrieve some proteins that don't have the domain or aren't family members due to coincidentally matching words in gene product descriptions. In the future, we plan to add the ability to search the full text of gene pages, which will provide another option for finding protein family information.

Unfortunately, no; a wild card can only be used in the middle or at the end of a search string.

Further info: The simple search uses Lucene, which cannot use a wild card at the start of a search string. Additional limitations on the EBI search implemented for PomBase preclude workarounds. We apologise for the inconvenience to users.
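
If you generate search strings in a script, you can catch unsupported queries before submitting them. This Python sketch assumes the wild card characters are '*' and '?', as in standard Lucene query syntax; the characters accepted by the PomBase search box are not specified above.

```python
WILDCARDS = ("*", "?")  # assumed from standard Lucene syntax


def is_supported_query(query: str) -> bool:
    """Return False for empty queries or queries that begin with a wild card."""
    query = query.strip()
    return bool(query) and not query.startswith(WILDCARDS)


for candidate in ["cdc*", "c*c2", "*dc2"]:
    print(candidate, is_supported_query(candidate))
```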

S. pombe GO annotations are available in browsers that use the GO repository, notably AmiGO and QuickGO. Both browsers have extensive documentation available:

Hint: to find S. pombe annotations, use Taxon: 4896 (Schizosaccharomyces pombe) or Source: PomBase. 

Also see the FAQ on obtaining consistent AmiGO and PomBase query results.

In the future, we plan to make Fission Yeast Phenotype Ontology (FYPO) terms and annotations available in a browser analogous to AmiGO or QuickGO. Until such a browser becomes available, FYPO is accessible in these external resources:

NCBO BioPortal - search on the BioPortal home page or go to the FYPO page.

EBI's Ontology Lookup Service (OLS) - search on the OLS home page or go to the FYPO page.

In AmiGO, the "term association" search retrieves gene products annotated to a GO term and to any of its child terms, following all relationships in the ontology, including regulates.

The PomBase GO search excludes regulates relationships by default, so annotation totals will differ from those in AmiGO for any terms that have child terms connected by regulates links. For example, a search for "cytokinesis" in AmiGO will include genes annotated to "regulation of cytokinesis", whereas a search in PomBase will not. To include regulation, and have results that match AmiGO, you must search for both terms -- see instructions here.

For more information on regulates, see the GO Ontology Relations documentation.

In the Advanced Search, you can search for two GO terms, one for the process and another for regulation of the process, joined by the "OR" operator. For example, to find genes involved in cytokinesis or its regulation, search for "'cell cycle cytokinesis' (GO:0033205) OR 'regulation of cell cycle cytokinesis' (GO:0071775)".

You can use "OR" in the Query Management interface. First search for each term in a separate New Query, then click the Query Management tab. Check the boxes beside the two queries you want to combine, and then click the "Join (OR)" button to submit the compound query. The new query will appear in the Query Management list itself, and you can do further combinations with additional queries.
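
Conceptually, "Join (OR)" is a set union of the gene lists returned by each query. If you have downloaded two result lists, the same combination can be reproduced offline, as in this Python sketch; the gene names are placeholders, not real query results.

```python
from typing import Iterable, Set


def join_or(*result_sets: Iterable[str]) -> Set[str]:
    """Union of any number of gene lists: a gene is kept if it is in at least one."""
    return set().union(*result_sets)


# Placeholder results for a process term and its 'regulation of' term.
process_genes = {"gene1", "gene2"}
regulation_genes = {"gene2", "gene3"}

combined = join_or(process_genes, regulation_genes)
print(sorted(combined))
```

A gene annotated to both terms appears only once in the combined set, which is why the joined total can be smaller than the sum of the two individual totals.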

To search PomBase for transposable elements:

  1. Go to the Advanced Search - http://www.pombase.org/spombe/query/builder
  2. Under "Select Filter" choose "Gene Annotation Status" and then choose "Transposon".
  3. Click "Submit". The results page has links to download the resulting list of genes or the genomic, cDNA or protein sequences. Note that we plan to offer additional download options, including coordinates, in the future.

At present, there are 11 full-length transposons annotated, and two frameshifted copies.

Lone LTRs are also annotated as sequence features. They cannot yet be retrieved by the simple or advanced searches, but they can be displayed on a track in the Ensembl browser.

Finally, if you wish to install Artemis (available from http://www.sanger.ac.uk/resources/software/artemis/), you can use it to view LTRs in more detail. Read in the EMBL format files of sequence and annotation (available from the DNA Datasets page). To see LTRs,

  1. In the Select menu, choose "By Key".
  2. In the pulldown that pops up, choose "LTR".

See the Artemis manual (pdf) for additional information.

Genome sequence files can be downloaded from the DNA Datasets page in several different formats.

Go to the Genome Browser (in the Tools menu), and find the region of interest by searching for a gene name, a systematic ID, or a set of coordinates. Then click the Export Data button on the left-hand side.

Select the number of bases up- and downstream (even if you have searched using coordinates, you can add flanking sequence to what you download), which strand and the features you would like. Click "next". Select your download option. Your browser will save or display the data, depending on which format you select.

You can retrieve sequences from a gene page or in the Genome Browser.

On the gene page: Scroll down or click the quick link to the Sequence section of the page, where there is a set of pre-set one-click options and a Custom option. For protein-coding genes, there are pre-set options to retrieve the coding sequence (CDS), CDS + UTRs, CDS + UTRs + introns, or a translation of the CDS; for non-coding RNA genes only the relevant options are offered. Under Display Options, you can choose whether to retrieve plain text or add color highlighting of different regions.

To include flanking sequences, use the Custom Sequence option. Clicking the View button takes you to a page where you can specify whether to include UTRs and introns, and how much upstream and downstream sequence to include. Click the Download button to see the sequence. You can save by copying and pasting from the browser.

To use the Genome Browser: Click the "View in Genome Browser" link under the map graphic on a gene page, or go directly to the Genome Browser via the Tools menu, and search for a gene name or systematic ID. Click Export Data (a button on the left hand side). Select the number of bases up- and downstream, which strand and the features you would like. Click "next". Select your download option. Your browser will save or display the data, depending on which format you select.

There is no single transcriptome sequence file available from PomBase at present. We suggest using the data available from the Broad Institute: http://www.broadinstitute.org/annotation/genome/schizosaccharomyces_grou...

The genomic GTF file available on the DNA Datasets page includes all gene features. We plan to replace this with a GFF file in the future, and will add all other annotated genome features (such as repeats) to the file at that time.

The EMBL format files also contain all annotated features.

Another option for extracting all annotated features (or if you need to specify which feature types to include) is to use the Ensembl API. See the FAQ "Can I access PomBase via an API?" for more information on using the API.

Although old cosmid sequences used in the reference assembly are not available in PomBase directly, they are all stored in the International Nucleotide Sequence Database Collaboration database (ENA, GenBank, DDBJ) archives. For ease of searching, PomBase curators recommend finding the accession, e.g. AL137130, for a cosmid, and using GenBank to retrieve the sequence:

Go to http://www.ncbi.nlm.nih.gov/nucleotide/ (or choose "Nucleotide" in the search pull-down menu on any NCBI search page). Enter the accession. The resulting page will inform you that the sequence has been replaced by one of the whole-chromosome entries, but offers links to both the current chromosome entry and the obsolete contig entry.

Yes, the Ensembl API can be used with PomBase, as documented here:

  1. Ensembl Perl API installation instructions
  2. Ensembl core database API documentation
  3. Tutorial for using the API with the core database - Includes examples of connecting to the database and retrieving chromosomes, genes, transcripts and translations, along with the corresponding xrefs.

We will add examples for common API uses soon.

The Fission Yeast GO slim terms page provides a generic GO slim for S. pombe, and shows total genes annotated directly or by inference to each term. For further information on using the generic S. pombe slim, or on creating your own GO slim, please see the Fission Yeast GO slimming tips page.

PomBase does not have its own tool for ID conversion. We suggest you try the EBI's PICR web service (http://www.ebi.ac.uk/Tools/picr/), which can convert between UniProtKB, RefSeq, Ensembl Genomes (including S. pombe systematic IDs) and many other common database IDs.

For UniProt IDs, we provide a static mapping file of PomBase systematic IDs and UniProtKB accessions, available on the Data Mapping page and by FTP from ftp://ftp.ebi.ac.uk/pub/databases/pombase/pombe/Mappings/gp2swiss.txt.
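Once downloaded, a mapping file of this kind can be loaded into a lookup table. The sketch below assumes a two-column, tab-separated layout (systematic ID, then UniProtKB accession). That column order is an assumption, so check the actual file before relying on it. The example IDs are placeholders, not real entries, and load_mapping is a hypothetical helper name.

```python
import csv
import io

# Placeholder rows standing in for the downloaded file; real IDs will differ.
# The two-column, tab-separated layout is an assumption about the file format.
SAMPLE_MAPPING = "SYSID0001\tP00001\nSYSID0002\tP00002\nSYSID0003\t\n"


def load_mapping(handle):
    """Build systematic ID -> accession and the reverse lookup."""
    to_uniprot = {}
    to_systematic = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        systematic_id, accession = row[0].strip(), row[1].strip()
        if not systematic_id or not accession:
            continue  # skip IDs that have no accession
        to_uniprot[systematic_id] = accession
        to_systematic[accession] = systematic_id
    return to_uniprot, to_systematic


to_uniprot, to_systematic = load_mapping(io.StringIO(SAMPLE_MAPPING))
print(to_uniprot.get("SYSID0001"))
print(to_systematic.get("P00002"))
```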

In the Ensembl genome browser, click the "Configure this page" button in the left-hand bar. A pop-up box will appear. Note that this box has several tabs along its top, and the exact selection of tabs and configuration options depends on whether you are configuring the "Location", "Gene", or "Transcript" tab of the main browser.

To turn a track on or off, click the small box to the left of its description. Note that some tracks simply toggle on and off, whereas for others a small popup appears, in which you can select from a set of options controlling exactly how the track appears. The left-hand bar of the configuration popup organizes available tracks into subsets, and offers a few additional options (including "Reset configuration", which restores the default display).

For example, to show or hide repeat regions, make sure you have the "Location" tab selected. The tabs for this configuration allow you to configure the "Region" (lower) and "Overview" (upper) images separately. In the "Configure Region Image" tab, click "Repeat regions" in the popup's left-hand bar. You can then check one box to show all repeats, or select specific types of repeat to display.

When you are finished choosing tracks, click the tick/check mark in the upper right corner of the configuration popup.

See the Citing PomBase page, which lists papers to cite for PomBase, the S. pombe genome sequence, and Compara. Additional key papers may be added as needed.

Clones included in the final sequence are available from archives@sanger.ac.uk

PomBase does not offer a GFF-to-GTF converter. A Perl script that uses the Bio::Tools::GFF module from the BioPerl library is available on SEQanswers at http://seqanswers.com/forums/showpost.php?p=22529&postcount=4

PomBase does not offer a converter. The Sequence Ontology site has a conversion script that can be used via a web form (at http://www.sequenceontology.org/cgi-bin/converter.cgi) or checked out from their CVS repository.

No, but this can be done within Artemis.

Install Artemis (available from http://www.sanger.ac.uk/resources/software/artemis/).

You can then read in the EMBL format files of sequence and annotation (available from the DNA Datasets page). To generate a restriction map:

  1. Create a new entry using the "Create" menu item "New Entry"
  2. Toggle off the main annotation by un-checking the chromosome contig file (this will make your new file "no name" the active entry).
  3. Save your new file with your preferred name.
  4. Use the Create menu option "Mark From Pattern" to create features for any restriction patterns of interest and save them into your file.
  5. You can add "color" labels to distinguish the different restriction sites. See the Artemis manual (pdf) for additional information.
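For a quick check outside Artemis, finding restriction sites amounts to a pattern search over the sequence. The sketch below is not how Artemis works internally. It is a minimal standalone illustration in Python, using a made-up sequence and a small hand-entered table of recognition sites (EcoRI GAATTC, BamHI GGATCC, HindIII AAGCTT). The function names are hypothetical.

```python
import re

# Recognition sequences for three common enzymes (all palindromic).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(sequence, sites=SITES):
    """Return {enzyme: sorted 1-based start positions} on the forward strand.

    Non-palindromic sites are also searched as their reverse complement,
    so that occurrences on the opposite strand are reported too.
    """
    sequence = sequence.upper()
    hits = {}
    for enzyme, pattern in sites.items():
        positions = set()
        for pat in {pattern, reverse_complement(pattern)}:
            # A lookahead allows overlapping matches to be reported.
            for match in re.finditer("(?=" + pat + ")", sequence):
                positions.add(match.start() + 1)
        hits[enzyme] = sorted(positions)
    return hits


# Made-up sequence for illustration only.
demo = "ttgaattcaaggatccaagaattcgg"
site_hits = find_sites(demo)
print(site_hits)
```

Positions are reported as 1-based coordinates of the first base of each site, which should make them easy to compare with coordinates shown in a genome viewer.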

At present PomBase does not have its own GO enrichment tool. We recommend using the Generic GO Term Finder at Princeton with the current PomBase GO annotation dataset (there is also a "slimming" tool, GO Term Mapper, which is useful for a broad overview).
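For orientation, term enrichment tools of this kind are commonly based on a hypergeometric test: given a genome of N genes of which K are annotated to a term, how surprising is it to see k or more annotated genes in a list of n? The sketch below computes that p-value with the Python standard library (math.comb needs Python 3.8 or later). It is a generic illustration under those assumptions, not a description of how any particular tool is implemented. The counts are invented, and multiple-testing correction is omitted.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background set
    K: background genes annotated to the term
    n: genes in the query list
    k: query-list genes annotated to the term
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(K, n)):
        raise ValueError("counts are inconsistent")
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probability mass for k, k+1, ..., min(K, n) annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Invented counts: 5000 background genes, 100 annotated to the term,
# a query list of 50 genes of which 5 carry the term.
p_value = hypergeom_enrichment_p(5000, 100, 50, 5)
print(p_value)
```

In practice one p-value is computed per term, so a correction for multiple testing would normally be applied before interpreting the results.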

There is no single transcriptome sequence file available from PomBase at present. We suggest using the data available from the Broad Institute: http://www.broadinstitute.org/annotation/genome/schizosaccharomyces_grou...