-
Can I submit sequences to scan simply by telling the
software the accession id of the genes I'm interested
in?
-
Can I submit sequences saved as a Word document?
-
My sequence file is not being read in by the software?
-
Are sequences scanned for repeats before they are
analysed?
-
How are the results of the scans reported?
-
What is GFF format?
-
Can I view my results using other software?
-
Using the most stringent scans cut-off, I get no results?
-
When I run my sequences I get a lot of hits.
How can I reduce them?
-
Do I need to change any of the parameter values
for the individual algorithms to get them to
run?
-
The results in the PDF file are difficult to
see as they are so dense?
-
In Argo, why do the arrows for the predictions point in two
different directions?
-
The results in the Argo browser are mixed up.
How do I create a clearer picture?
-
Can I scan using my own matrices that I have?
-
How long will my results be saved on the server?
-
The Argo display does not work for me. What is
Java Web Start and where do I get it from?
-
Are promoter sequences scanned on both strands?
-
I get an error message when trying to run Clover
with sequences of greater than 5K basepairs. Why?
-
What's different about the stringent matrices?
-
How many matrices are there per set?
-
What does Mogul stand for? Is it an acronym?
1. Can I submit sequences to scan simply by telling the
software the accession id of the genes I'm interested
in?
No. MotifMogul is not connected to any sequence database
and so does not 'understand' identifiers from, say, EntrezGene,
RefSeq, EnsEMBL, etc. Sequences to be scanned must be
input in FASTA format only.
This decision was made to avoid problems arising when
trying to scan genes with multiple translations. Questions
over which sequence is the correct promoter become
tricky to answer. Also, there are differing views as
to whether UTR regions should be included in the analyses
too. Rather than having to deal with these issues, the
user determines exactly what sequence is scanned. Therefore,
if an 'expected' site is not reported, then this may be
due to the wrong sequence being scanned and is not
the fault of the software.
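As a rough illustration of what "FASTA format" means in practice, here is a minimal Python sketch that reads FASTA text into named sequences. It is not part of MotifMogul; the function name parse_fasta and the checks it performs are assumptions made for this example only.

```python
from typing import Dict, Optional


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}.

    Hypothetical helper for illustration; MotifMogul's own
    parser may accept or reject different inputs.
    """
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if not header:
                raise ValueError("empty FASTA header")
            if header in records:
                raise ValueError("duplicate FASTA header: " + header)
            records[header] = ""
        else:
            if header is None:
                raise ValueError("sequence data before first '>' header line")
            # Sequence lines may be wrapped; join them into one string.
            records[header] += line.upper()
    return records
```

Running a check like this on your own machine before submitting can help confirm that each sequence has a '>' header line and that the sequence you intend to scan is the one actually in the file.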
[Back to top of page]
2. Can I submit sequences saved as a Word document?
No. Sequences should be saved as simple text files.
If you are using Word, use Save As to create a simple
text file.
[Back to top of page]
3. My sequence file is not being read in by the software?
Check that you are not trying to submit a Word document
or any other file with formatted text. MotifMogul ONLY
understands simple text files.
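If you are unsure whether a file is simple text, a quick local check along the following lines may help. This is only a sketch: the function name is hypothetical, the ASCII-only rule is an assumption, and MotifMogul itself may apply different checks.

```python
def looks_like_plain_fasta(data: bytes) -> bool:
    """Heuristic check that raw file bytes are plain-text FASTA.

    Hypothetical helper; the rules below are assumptions for
    illustration, not MotifMogul's actual validation.
    """
    # .docx files are ZIP archives, which start with "PK\x03\x04".
    if data.startswith(b"PK\x03\x04"):
        return False
    # Legacy .doc files are OLE2 compound files with this signature.
    if data.startswith(b"\xd0\xcf\x11\xe0"):
        return False
    # Rich Text Format files start with "{\rtf".
    if data.startswith(b"{\\rtf"):
        return False
    try:
        text = data.decode("ascii")  # assumption: sequences are plain ASCII
    except UnicodeDecodeError:
        return False
    # A FASTA file begins with a '>' header line.
    return text.lstrip().startswith(">")
```

For example, reading a file with open(path, "rb").read() and passing the bytes to this function returns False for a Word document and True for a text file that begins with a '>' header.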
[Back to top of page]
4. Are sequences scanned for repeats before they are
analysed?
No. The algorithms should, however, run with masked sequences.
[Back to top of page]
5. How are the results of the scans reported?
Results are reported in 3 ways.
[Back to top of page]
6. What is GFF format?
GFF stands for General Feature Format and is designed
to report genomic features in a standard way.
Amongst other data, it saves key details such as start and
end coordinates of features and the scores
assigned to them (if applicable). The full specification can be
read here.
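To make the layout concrete, the Python sketch below splits one GFF line into its tab-separated columns (sequence name, source, feature, start, end, score, strand, frame, and an optional attributes column). The class and function names are hypothetical, and the exact values MotifMogul writes into each column are not described here, so treat this as an illustration of the general format only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GffFeature:
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]  # None when the score column is "."
    strand: str
    frame: str
    attributes: str


def parse_gff_line(line: str) -> Optional[GffFeature]:
    """Parse one GFF line; return None for blank or comment lines."""
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:
        raise ValueError("expected at least 8 tab-separated columns")
    score = None if cols[5] == "." else float(cols[5])
    attributes = cols[8] if len(cols) > 8 else ""
    return GffFeature(cols[0], cols[1], cols[2], int(cols[3]),
                      int(cols[4]), score, cols[6], cols[7], attributes)
```

A line such as "promoter1<TAB>MotifLocator<TAB>misc_feature<TAB>120<TAB>131<TAB>0.93<TAB>-<TAB>.<TAB>matrix_A" (an invented example, not real MotifMogul output) would give a feature with start 120, end 131, score 0.93 and strand "-".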
[Back to top of page]
7. Can I view my results using other software?
One reason for choosing GFF as the format in which to report
results is that it is a format that can be read by a number
of genomic visualisation tools. Below is a list of other
standalone software tools that read and display GFF files.
Hence, if you wanted to download the raw GFF file from
a set of scans and change the colours, number of scan
tracks, etc., then a standalone package may be the way
to go.
We do not promote any one particular tool here,
but in-house at the ISB we have found that Argo and
Apollo have worked well for our needs.
[Back to top of page]
8. Using the most stringent scans cut-off, I get no results?
This is entirely possible, as the most stringent cut-off is
quite severe: it selects only the top 0.01% of predictions for
a matrix's scores compared to random. Try using a lower stringency threshold.
[Back to top of page]
9. When I run my sequences I get a lot of hits.
How can I reduce them?
Try selecting a more stringent cut-off threshold
and/or adapting the parameters of the individual matrix
scanning algorithms. For example, increase the value of
MotifLocator's Motif Threshold parameter.
[Back to top of page]
10. Do I need to change any of the parameter values
for the individual algorithms to get them to
run?
No. The parameter values that initially appear in
the windows are the default values that the algorithms
run with out-of-the-box.
[Back to top of page]
11. The results in the PDF file are difficult to
see as they are so dense?
The PDF visualisation of results is only intended
to be a quick-and-easy way to view the results. It
works well when there are around 20-30 predictions. For
only single hits, one tends to get large single
blocks with large text. With tens to hundreds of
hits, the display becomes too complex to read. Finding
a middle ground is difficult.
If your results are too complex to interpret in PDF, try
running scans one analysis at a time and viewing each of
the individual PDFs separately. Alternatively, look at the
results in Argo (or some other standalone GFF display tool),
where you are able to interactively change the way the
results are displayed.
[Back to top of page]
13. In Argo, why do the arrows for the predictions point in two
different directions?
Arrows pointing to the right of the screen indicate
matrix hits predicted on the forward strand of the
sequence. Arrows pointing to the left are predictions
on the reverse strand.
[Back to top of page]
14. The results in the Argo browser are mixed up.
How do I create a clearer picture?
In a blank area of the sequence display window,
right-click to bring up an
on-screen menu. Select the Track Table..
item. In the new menu that appears, in the Stack
column, change the option for each track from
Integrated to Segregated. This then
separates the results into their own individual
tracks.
[Back to top of page]
15. Can I scan using my own matrices that I have?
At the current time, no. This is something
that we may explore in the near future for
use with the individual scanning algorithms.
[Back to top of page]
16. How long will my results be saved on the server?
The raw GFF files remain on the server for a
week before they are removed. PDF and Argo
specific files are removed daily as they
can accumulate very rapidly.
[Back to top of page]
17. The Argo display does not work for me.
What is Java Web Start and where do I get it from?
Java Web Start is a software technology from Sun that allows
applications to be launched on an end user's desktop machine, using
any standard web browser.
If it isn't installed on your machine, here's a
simple tutorial guiding you through installation. And this
is a more detailed one.
[Back to top of page]
18. Are promoter sequences scanned on both strands?
Yes. In the GFF file, a prediction on the forward
strand is represented as a +. Predictions on the
reverse strand are represented as a -.
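If you download the raw GFF file, the strand of each prediction sits in the seventh tab-separated column. The short Python sketch below tallies predictions per strand; the function name is hypothetical and the sketch assumes a standard tab-separated GFF layout.

```python
from collections import Counter


def count_by_strand(gff_text: str) -> Counter:
    """Count GFF features per strand symbol ('+' or '-')."""
    counts: Counter = Counter()
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        cols = line.split("\t")
        if len(cols) < 7:
            continue  # not a complete feature line
        counts[cols[6]] += 1  # column 7 holds the strand
    return counts
```

A count of zero for "-" across a large result set might suggest that the file was filtered or truncated somewhere along the way.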
[Back to top of page]
19. I get an error message when trying to run Clover
with sequences of greater than 5K basepairs. Why?
To assess the significance of predictions, Clover uses
a set of background sequences as a reference set.
This set of sequences must be at least as long as
the promoter sequence(s) being analysed. This then
becomes a trade-off between the maximum length of
sequence to analyse and the time that it takes
Clover to run: the longer the promoter sequence,
the longer the run time. The maximum sequence length is
set to 5K basepairs.
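To avoid the error, you could check sequence lengths before choosing Clover. The Python sketch below flags sequences over the limit; it assumes "5K" means 5,000 basepairs, and the function and constant names are hypothetical.

```python
from typing import Dict, List

# Assumption: the "5K" limit is read as 5,000 basepairs.
MAX_CLOVER_LENGTH = 5000


def sequences_too_long_for_clover(
    records: Dict[str, str], max_length: int = MAX_CLOVER_LENGTH
) -> List[str]:
    """Return the names of sequences longer than max_length."""
    return [name for name, seq in records.items() if len(seq) > max_length]
```

Any sequence named in the returned list would need to be trimmed, or scanned with a different algorithm, before a Clover run.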
[Back to top of page]
20. What's different about the stringent matrices?
Stringent matrices are derived from a hand-curated
list of matrices used in the paper. These are a subset of all the matrices
on a species basis.
[Back to top of page]
21. How many matrices are there per set?
For a full description of the matrix sets,
please see here.
[Back to top of page]
22. What does Mogul stand for? Is it an acronym?
Mogul is Gaelic for the mesh of a net.
It is not an acronym.
[Back to top of page]