mRNA Stability Analyzer

Frequently asked questions:

How does it work?
What are the advantages of having an account?
Which algorithms are implemented and using what language?
What is the correct format of the input file?
How can the operation data (input and result) be downloaded?
What is the output of an operation?

How does it work?
This site analyzes microarray data obtained at different time points in mRNA experiments. Algorithms are available in 3 distinct groups: analysis, clustering and filtering. The application uses the term experiment for an overall container, with a clear purpose, that holds a suite of executions called operations. Each algorithm implemented here is run as an operation: an individual execution of a program with an input, a certain number of settings and an output. An experiment consists of 0 or more operations. A user has 2 options: create an experiment anonymously, or create an account and then start an experiment. The main difference is that, with an account, a MySQL database stores all of the user's experiments, so old experiments can be loaded and their results examined. Either way, the results (text or image) can be saved locally once the chosen algorithm has finished executing.
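As a rough illustration of the experiment/operation relationship described above, here is a minimal Python sketch. The class names, field names and sample values are assumptions made for this example; they do not come from the site's code or its database schema.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    """One execution of an algorithm: an input, some settings, an output."""
    algorithm: str                                   # illustrative label only
    input_file: str
    settings: dict = field(default_factory=dict)
    output_files: list = field(default_factory=list)


@dataclass
class Experiment:
    """Container holding zero or more operations."""
    name: str
    operations: list = field(default_factory=list)

    def add(self, operation: Operation) -> None:
        self.operations.append(operation)


# Hypothetical usage: an experiment starts empty and grows one operation at a time.
experiment = Experiment("example-experiment")
experiment.add(Operation("Kmeans", "input.txt", {"clusters": 4}))
print(len(experiment.operations))
```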

What are the advantages of having an account?
A user account has the advantage of retaining the data (experiments, operations, files) associated with a certain user. The user can manage his/her own data and keep track of the operations performed. The session limit still applies, but no executed operation is lost when the countdown ends. Every time an operation ends successfully, it is saved (along with the data related to it) in the database.

Which algorithms are implemented and using what language?
The site has 3 categories of algorithms: analysis, clustering and filtering. The first category contains C programs that use the values from the input file to generate a new file by combining the TR and RA values. It also employs a specific algorithm to determine the Spearman coefficient and slope for each gene. The clustering group has methods that analyse the data and generate graphical output and, if necessary, text information. They are written in R and make extensive use of packages available from Bioconductor. The filtering section contains methods that modify the values in the file and create a new set of data, depending on the chosen method. These methods are written in PHP.
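The site's own Spearman and slope routines are C programs whose details are not given here. The Python sketch below only illustrates the two quantities in their textbook form, under the assumption that each gene's values are compared against the time points: Spearman's coefficient as the Pearson correlation of ranks, and the slope as an ordinary least-squares fit. All function names and the sample numbers are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (spread_x * spread_y) ** 0.5


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranked values."""
    return pearson(average_ranks(x), average_ranks(y))


def slope(x, y):
    """Least-squares slope of y against x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return covariance / sum((a - mean_x) ** 2 for a in x)


# Made-up values for one gene, measured at time points 0, 2, 4 and 10.
times = [0, 2, 4, 10]
values = [8.0, 6.5, 5.1, 2.0]
print(spearman(times, values), slope(times, values))
```

A steadily decreasing series like the one above gives a Spearman coefficient of -1 and a negative slope.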

What is the correct format of the input file?
The input file must respect some rules to be accepted by the system. The following list presents them all:
- Four delimiters are currently supported: "\t" (tab), ";", "," and "|" (without the double quotes). The program detects the correct one automatically.
- The first row is the header, and it must contain as many cells as there are columns in the file (including the column with the names of the genes).
- Do not use any of the delimiters in the column names in the header. Otherwise, the parsing will fail.
- The first column contains the names of the genes (names must be unique), and it needs a title.
- Each column from the second onward has a title followed by the actual values (the first column must also have a title, but its format does not matter).
- Each column title must consist of the type followed by a number, where the type is either TR or RA and the number is the point in time at which the values in that column were obtained (for example TR 0, TR 2, RA 0, RA 10).
- Both the TR and RA values must be in the same file, and the columns of each type must be adjacent (for example, not RA, TR, RA, RA, TR).
- Within each group (TR or RA), the time values of the columns must be in ascending order.
- A third column type is possible, namely K followed by a mean time in parentheses. Columns of this kind are generated by the analysis algorithms. The only difference between this type of file and the other is the column names. The first of these columns is always K0, which is static and is used to calculate all the others. The following columns have the format K(mean), where mean is the arithmetic mean of the times of 2 adjacent columns as selected by the user in the settings screen (e.g. "RA 2" and "RA 4" will generate a column "K(3.00)").
- All gene names in the file must be unique. If a name appears 2 or more times, the upload fails and the application informs the user about the location of the duplicates. It reports at most 2 locations for a gene that has multiple copies in the file.
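The TR/RA rules above can be checked before uploading with a short script. The following Python sketch is not the site's parser. It makes several assumptions: the delimiter is guessed as the supported character that occurs most often in the header, times must be strictly ascending, and the function names and error messages are invented for this example. It does not handle K files.

```python
import re

DELIMITERS = ["\t", ";", ",", "|"]
# Assumed title pattern: type, optional whitespace, then a number.
TITLE = re.compile(r"^(TR|RA)\s*(\d+(?:\.\d+)?)$")


def detect_delimiter(header_line):
    """Guess the delimiter as the supported character seen most often."""
    return max(DELIMITERS, key=header_line.count)


def check_table(text):
    """Return a list of rule violations found in a TR/RA file's contents."""
    lines = [line for line in text.splitlines() if line.strip()]
    delimiter = detect_delimiter(lines[0])
    header = lines[0].split(delimiter)
    rows = [line.split(delimiter) for line in lines[1:]]
    errors = []

    # Every row needs as many cells as the header.
    for number, row in enumerate(rows, start=2):
        if len(row) != len(header):
            errors.append(f"line {number}: expected {len(header)} cells")

    # Titles after the first column must look like "TR 0" or "RA 10".
    columns = []
    for title in header[1:]:
        match = TITLE.match(title.strip())
        if match is None:
            errors.append(f"bad column title: {title!r}")
        else:
            columns.append((match.group(1), float(match.group(2))))

    # Both types must be present, grouped together, with ascending times.
    if {kind for kind, _ in columns} != {"TR", "RA"}:
        errors.append("both TR and RA columns are required")
    finished = set()
    for (prev_kind, prev_time), (kind, time) in zip(columns, columns[1:]):
        if kind != prev_kind:
            finished.add(prev_kind)
            if kind in finished:
                errors.append(f"{kind} columns are not adjacent")
        elif time <= prev_time:
            errors.append(f"{kind} times are not ascending")

    # Gene names in the first column must be unique.
    seen = set()
    for row in rows:
        if row[0] in seen:
            errors.append(f"duplicate gene name: {row[0]}")
        seen.add(row[0])
    return errors


# Made-up sample that breaks the adjacency and uniqueness rules.
sample = "gene;RA 0;TR 0;RA 2\ng1;1;2;3\ng1;4;5;6\n"
for problem in check_table(sample):
    print(problem)
```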
Download example 1 (tr/ra)
Download example 2 (k)
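The naming of the K columns described in the list above can be reproduced in a few lines. This is a sketch only: the function name is hypothetical, and it assumes that titles have the form "type number" separated by whitespace and that the mean is printed with two decimals, as in the "K(3.00)" example.

```python
def k_column_title(first_title, second_title):
    """Build a K column title from the titles of two adjacent time columns."""
    first_time = float(first_title.split()[1])
    second_time = float(second_title.split()[1])
    mean_time = (first_time + second_time) / 2
    return "K({:.2f})".format(mean_time)


print(k_column_title("RA 2", "RA 4"))  # prints K(3.00)
```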

How can the operation data (input and result) be downloaded?
On the right side of the site, a history column appears as soon as a user starts an experiment. Every algorithm page has this column. It lists all the operations associated with an experiment, and clicking an operation downloads a ZIP archive containing its input and output.

What is the output of an operation?
The output of an operation falls into 2 main classes: text and image. Text is generated by all algorithms (except UPGMA when the initial input file is not modified). Moreover, Filter missing values, Genarray and Stabilogen replace the current working file with their result. Images are produced only by clustering methods and can be a dendrogram (Sota, UPGMA), a cluster plot (Kmeans) or a heatmap (all clustering methods). The user can download both kinds of output even without an account.