We placed a sample output of the website on the front page so that the user can try interacting with the results the website produces without any effort.
Please note: if the data table takes up the entire page and there are no selection boxes on the left side of the webpage, you'll need to zoom out in your browser.
This is generally done by holding Ctrl and pressing '-' (minus). Afterwards, if you'd like to reset your browser's page to normal zoom, hold Ctrl and press '0' (zero).
There are a few things to note about using the data table:
At the top of the data table, the Organism, Gene, and Number of Genes to be displayed are shown, as selected by the user.
Below these are the brackets that group the species into their appropriate taxa (Animals, Insects, etc., all color coded) and subtaxa.
The first two columns of the heatmap (colored in a yellow-to-red hue) show the Pearson correlation coefficient and the Z-score for the similarity of each hit to the query profile. The Z-score is calculated under a simplistic null model that randomly shuffles profile values across species (columns), disregarding the actual structure of phylogenetic relationships between these species. As a general rule of thumb, we observe that a marginal Pearson correlation coefficient of 0.5 typically corresponds to a Z-score of approximately 5.0, whereas a highly significant Pearson correlation coefficient of 0.95 typically corresponds to a Z-score of approximately 8.0. These Z-score values can be used as approximate cutoffs for insignificant and extremely significant similarities, respectively.
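The shuffling null model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the website's actual implementation: the function names, the number of shuffles, and the fixed random seed are all hypothetical choices made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def shuffle_z_score(query, hit, n_shuffles=1000, seed=0):
    """Z-score of the observed correlation against a shuffling null model.

    The hit profile is shuffled across species (columns), ignoring
    phylogeny. n_shuffles and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    observed = pearson(query, hit)
    shuffled = list(hit)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null.append(pearson(query, shuffled))
    mean = sum(null) / len(null)
    std = math.sqrt(sum((r - mean) ** 2 for r in null) / (len(null) - 1))
    if std == 0:
        return 0.0
    return (observed - mean) / std
```

Because the shuffle ignores which species are closely related, the resulting Z-score should be read as a rough guide rather than a calibrated significance value.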
The data table is visibly divided by thin black lines so that the user can see where one taxon ends and another begins.
At the bottom of the data table (you may have to scroll), on the x-axis, is the list of species.
At the left of the data table, on the y-axis, are the top K genes most correlated with the gene you chose, where K is the number of genes you chose to display. (For example, if you chose to display 100 genes, the table will show the 100 genes whose profiles correlate most closely with the gene you selected.)
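As a rough illustration of the top-K selection, the Python sketch below ranks candidate genes by the Pearson correlation of their profiles with the query profile. The gene names and profile values are invented for the example, and `top_k_genes` is a hypothetical helper, not a function exposed by the website.

```python
import heapq
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def top_k_genes(query_profile, profiles, k):
    """Return the k (gene, correlation) pairs most correlated with the query."""
    scored = ((pearson(query_profile, p), gene) for gene, p in profiles.items())
    return [(gene, r) for r, gene in heapq.nlargest(k, scored)]


# Hypothetical profiles: one similarity value per species (column).
profiles = {
    "geneA": [1, 0, 1, 0],
    "geneB": [0, 1, 0, 1],
    "geneC": [1, 0, 1, 1],
}
top_two = top_k_genes([1, 0, 1, 0], profiles, k=2)
```

Here `top_two` lists `geneA` first, because its profile matches the query exactly, followed by `geneC`.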
The colored squares range from white to dark blue, with white showing no homology to the chosen gene and dark blue showing high sequence similarity to the chosen gene. The squares representing the Pearson correlation coefficient and Z-score for the similarity to the query are colored from yellow to dark red, with darker colors representing higher similarity.
Hovering over a square will bring up the gene you entered, the organism we BLAST against, and the sequence similarity value between 0 and 1. The color of the square is based on that value, with white scoring a "0" and dark blue scoring a "1".
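The mapping from a similarity value to a square's color can be pictured with the small Python sketch below. It assumes a simple linear blend between white and one particular dark blue (RGB 0, 0, 139). The exact color scale used by the website is not stated here, so both the endpoint color and the interpolation are assumptions made for illustration.

```python
def similarity_to_hex(value, low=(255, 255, 255), high=(0, 0, 139)):
    """Map a similarity value in [0, 1] to a hex color between low and high.

    low is white (value 0) and high is an assumed dark blue (value 1).
    Values outside [0, 1] are clamped to the nearest end of the scale.
    """
    v = min(1.0, max(0.0, value))
    rgb = [round(lo + (hi - lo) * v) for lo, hi in zip(low, high)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)
```

For example, `similarity_to_hex(0)` gives white (`#ffffff`) and `similarity_to_hex(1)` gives the assumed dark blue (`#00008b`); intermediate values fall between the two.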
Hovering over a gene name will bring up a short description. Clicking on the gene name will open a new tab connecting the user to the NCBI website, which has more information on the gene.
If you'd like to return to the generated data table later, you may simply copy the page's URL and paste it into your browser.
Tabach, Y., A. C. Billi, G. D. Hayes, M. A. Newman, O. Zuk, H. Gabel, R. Kamath, K. Yacoby, B. Chapman, S. M. Garcia, M. Borowsky, J. K. Kim and G. Ruvkun, 2013, Identification of small RNA pathway genes using patterns of phylogenetic conservation and divergence. Nature 493(7434):694-8.
Tabach, Y., T. Golan, A. Hernandez-Hernandez, A. R. Messer, T. Fukuda, A. Kouznetsova, J. G. Liu, I. Lilienthal, C. Levy and G. Ruvkun, 2013, Human disease locus discovery and mapping to molecular pathways through phylogenetic profiling. Mol Syst Biol 9: 692.