Login Into Tox-Profiler

Tox-Profiler manual

Login into Tox-Profiler

Before you can start using Tox-Profiler, you need to log in. Enter the supplied username and password and press “login” (see top-right in Figure 1).

Figure 1: Login screen Tox-Profiler

Screen-Layout of Tox-Profiler

The default screen of Tox-Profiler is divided into areas that give quick access to its main functions. Figure 2 shows a typical Tox-Profiler screen, divided into three columns with several boxes in each column. The middle column consists of a fixed Tox-Profiler “Title” at the top, a fixed “powered by” bar at the bottom and, most importantly, the “main screen” in between. The main screen shows the most important information, while the boxes in the left and right columns provide quick navigation. The left column contains a button to go back to the previous page, quick access to the “experiments” that are currently available to you, access to custom “analysis” tools and control over the custom-defined “genesets”. The right column provides access to the “home” menu, the “bugreport” tool, the user “groups” you are part of and a history of the pages you have visited.

In this course, instructions such as: “EXPERIMENTS -> upload” mean that you should click on “upload” in the “EXPERIMENTS” box, while “->EXPERIMENTS” means you should click on the header of the “EXPERIMENTS” box.

Furthermore, questions are asked throughout this labwork (indicated in bold by Q1, Q2, …). Please write down brief answers to these questions.

Figure 2: Tox-Profiler Navigation Layout

Available Experiments

Let’s look at which experiments are already available for you in Tox-Profiler:

“->EXPERIMENTS” (click on the EXPERIMENTS header in the top-left)

The main screen now shows that there are approximately 10 datasets available, including several “<organ>_hcb” datasets, “High Fat Mouse Adipose”, “HCA_liver”, “Spleen_HCA” and “toxic_compounds_liver_Spicker”. The “<organ>_hcb”, “HCA_liver” and “Spleen_HCA” datasets are taken from the example data that is used in all labworks of this PET course. The main screen shows some basic information about each experiment, such as the number of samples, the author, the organism, etc. You can always press the i-icon on the right to obtain more detailed information about an experiment. In addition, if a dataset is stored in GEO (a public repository of array data), a link to GEO is provided.

Q1. How many spleen samples of the example dataset are stored in Tox-Profiler?

Q2. What are the two groups of samples in the Spicker dataset?

Q3. What type of microarray was used in the Spicker dataset?

Click on “Spleen_HCA” to see which samples are stored in the spleen dataset. Note that Tox-Profiler calls these sub-experiments rather than arrays because datasets are loaded into Tox-Profiler as ratios and it is up to the user to decide how these are constructed (averaged over replicates? ratios between which samples? what is defined as control?).

In this case, the control was defined as the median of the 5 control arrays, and ratios were taken between each array and this median control.
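The ratio construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the toy intensities and the use of three (rather than five) control arrays are made up for the example, and Tox-Profiler itself receives the ratios ready-made.

```python
import math
from statistics import median

def log2_ratios_vs_median_control(arrays, control_names):
    """Return {array name: [log2(intensity / median control intensity) per gene]}."""
    n_genes = len(next(iter(arrays.values())))
    # The per-gene median over the control arrays acts as the shared reference.
    reference = [
        median(arrays[name][g] for name in control_names) for g in range(n_genes)
    ]
    return {
        name: [math.log2(values[g] / reference[g]) for g in range(n_genes)]
        for name, values in arrays.items()
    }

# Toy intensities for three genes (hypothetical numbers, not course data).
arrays = {
    "C1": [100.0, 200.0, 50.0],
    "C2": [110.0, 180.0, 50.0],
    "C3": [90.0, 220.0, 50.0],
    "H5": [400.0, 100.0, 50.0],
}
ratios = log2_ratios_vs_median_control(arrays, ["C1", "C2", "C3"])
```

Note that each control array is also compared against the median control, which is why a control such as C1 appears as a sub-experiment with its own log2 values.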

Q4. What compound was used in this example dataset? How many groups of samples are there in “Spleen_HCA”, what do they represent and how many samples/group?

Q5. How many genes were measured on each array?

Differentially expressed genes and genesets

Click on “C1” to see the log2 values of the genes in the first control array against the median control.

Q6. Which gene is downregulated the most in C1 compared to control and what is its z-score?

You can go back to “Spleen_HCA” by clicking on the “Back” button, on “Spleen_HCA” in the title of the main screen, or on “Spleen_HCA” in the history box on the right.

Q7. Which gene is downregulated the most in H5 compared to control and what is its z-score?

Q8. Which array (C1/H5) shows the strongest deviation from control? Can you confirm this based on the Std. Dev./Variance? Please give a biological explanation and a “data analysis”-based explanation for this difference.

Go back to “Spleen_HCA”.

The main feature of Tox-Profiler is that it can determine which sets of genes are particularly up- or down-regulated in an array/sub-experiment. For example, if 10,000 genes are measured on array H5 and we have defined 30 genes to be associated with “cell cycle”, Tox-Profiler computes a T-Value between H5 and “cell cycle” based on whether the ratios of these 30 genes are higher or lower than the background distribution (10,000 ratios).
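To make the idea of a geneset T-Value concrete, here is a minimal Python sketch. It assumes a pooled two-sample t-statistic that compares the genes in the set with all remaining genes; the exact formula used by Tox-Profiler is not given in this manual, so treat the function name, the choice of statistic and the toy log2 ratios as assumptions.

```python
from math import sqrt
from statistics import mean, variance

def geneset_t_value(log_ratios, geneset):
    """Pooled two-sample t-statistic: genes in `geneset` versus all other genes."""
    inside = [v for gene, v in log_ratios.items() if gene in geneset]
    outside = [v for gene, v in log_ratios.items() if gene not in geneset]
    n_in, n_out = len(inside), len(outside)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = (
        (n_in - 1) * variance(inside) + (n_out - 1) * variance(outside)
    ) / (n_in + n_out - 2)
    return (mean(inside) - mean(outside)) / sqrt(pooled_var * (1 / n_in + 1 / n_out))

# Hypothetical log2 ratios for eight genes; g1-g3 form the geneset of interest.
log_ratios = {
    "g1": -2.0, "g2": -1.8, "g3": -2.2,
    "g4": 0.1, "g5": -0.1, "g6": 0.0, "g7": 0.2, "g8": -0.2,
}
t_down = geneset_t_value(log_ratios, {"g1", "g2", "g3"})
```

In this toy example the geneset mean lies far below the background mean and both distributions are narrow, so the resulting value is strongly negative. A larger shift in the mean, a smaller spread and a larger geneset all push the statistic further from zero.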

Q9. Make a schematic drawing of this example situation for the case that the set of cell cycle genes is downregulated in H5. Do this by drawing distributions, and indicate in the drawing which characteristics cause a strong negative t-score.

Click on “Analyse” next to H5 and subsequently on (the now visible) WikiPathways. We now get an overview of the strongest up/downregulated genesets that correspond to the pathways stored in WikiPathways.

Q10. Which type of pathways are upregulated and which type of pathways are downregulated in H5?

Only pathways with an E-value < 5% are shown by default, but you can click on “ALL” in the title to see all pathway scores.
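The effect of the E-value threshold can be illustrated with a short Python sketch. It assumes that the E-value is the p-value of a geneset multiplied by the number of genesets tested (a Bonferroni-style correction); this definition is not stated in this manual, and the function names and p-values below are hypothetical.

```python
def e_values(p_values):
    """Multiply each p-value by the number of genesets tested (assumed definition)."""
    n_tests = len(p_values)
    return {name: p * n_tests for name, p in p_values.items()}

def significant(p_values, threshold=0.05):
    """Names of genesets whose E-value is below the threshold, in input order."""
    return [name for name, e in e_values(p_values).items() if e < threshold]

# Hypothetical p-values for three pathways (not taken from the course data).
p_values = {"Cell cycle": 0.0001, "Apoptosis": 0.02, "Glycolysis": 0.5}

default_hits = significant(p_values)                  # default 5% threshold
relaxed_hits = significant(p_values, threshold=0.10)  # less stringent threshold
```

Under this assumption, a pathway with a modest p-value can drop out of the default view simply because many genesets are tested at once, and it reappears when the threshold is relaxed.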

Q11. Which pathways are not significantly up/down at the current setting, but would be if the E-value threshold were less stringent? Are these still biologically relevant?

Q12. What is the average ratio of the “Cell cycle” genes and how many genes (Orfs) are associated with the cell cycle pathway in WikiPathways?

Click on “Cell Cycle”. You now see the T-Values for Cell Cycle in all sub-experiments, including a graphical representation and a list of the genes that are associated with the “Cell Cycle” pathway.

Q13. Describe the general relationship between the compound and the Cell Cycle (ignoring some deviations).

Q14. Which sub-experiment(s) show deviating behavior in terms of “Cell Cycle”?

Q15. How many Cell Cycle genes are downregulated in H5?

Q16. Compare log2 values of H5 and the deviating array and explain the technical reason for their difference in T-score.

Go back to Spleen_HCA by 1) pressing the Back button several times, 2) clicking on the top bar of the main screen (names with :: in between) or 3) clicking “EXPERIMENTS” and then “Spleen_HCA”. You can obtain an overview of which pathways are up/down-regulated in which sub-experiments by clicking on “WikiPathways” at the top of the main screen.

Q17. Which pathway, besides “Cell Cycle”, is also upregulated in the high dose group in Spleen?

Q18. Are there any pathways clearly downregulated at high dose in Spleen? If so, which pathways and do they show a monotonic dose-response? If not, check which pathways come close and discuss a situation in which a pathway may not show up in the overview, but still might be biologically relevant.

Q19. Which BioCarta pathway is most strongly affected by the compound?

Q20. Which of the Human-Tissue (CMGS) genesets has the highest/lowest T-Value for individual sub-experiments and is not compound/dose-related? In which sub-experiments does this geneset peak? Can you think of a biological/technical reason that explains this result?

Q21. Which Netpath pathway responds particularly to low dose conditions?

Besides spleen, the example dataset also contains measurements on the liver. Load “HCA_Liver” and perform a WikiPathway analysis on all sub-experiments. It is possible to download the entire table of T-values for all WikiPathways that have been significantly up/down-regulated in at least one sub-experiment. To do so, click “Download whole analysis” at the top of the main screen and save the file as “liver_wiki.txt” [ASK WHICH PATH]

To be able to import this file into GenePattern, we need to make a few modifications in Excel. Load “liver_wiki.txt” into Excel and insert a column between the first column (with pathway names) and the second column (with the first set of T-values). You can fill in whatever you want, e.g. “GeneSet” as header and “Wiki” in all rows. Make sure that on the second row of the Excel sheet the two numbers are in columns A and B (see Figure 3). Now save this file as tab-delimited text and change the extension to .gct, i.e. “liver_wiki.gct”.
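For reference, the layout of a GCT (version 1.2) file can be sketched in Python as follows. The sketch builds the file content from a list of pathway rows rather than from the downloaded file, because the exact layout of the Tox-Profiler download is not described here; the function name, the sample names and the T-values are made up for the example.

```python
def to_gct(sample_names, rows, description="Wiki"):
    """Build GCT v1.2 text from (pathway name, [T-values]) rows."""
    lines = [
        "#1.2",                                   # line 1: format version
        f"{len(rows)}\t{len(sample_names)}",      # line 2: number of rows and columns
        "\t".join(["Name", "Description", *sample_names]),
    ]
    for name, values in rows:
        # The second column is the free-text column that is inserted in Excel.
        lines.append("\t".join([name, description, *(str(v) for v in values)]))
    return "\n".join(lines) + "\n"

# Hypothetical T-values for two pathways in two sub-experiments.
gct_text = to_gct(
    ["C1", "H5"],
    [("Cell cycle", [0.3, -6.1]), ("Apoptosis", [1.2, 4.0])],
)
```

The two numbers on the second line are the number of data rows and the number of data columns, which is why they must end up in the first two columns after editing in Excel.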

Now open GenePattern and click on “HierarchicalClustering” under “Clustering” in the left column. Enter “liver_wiki.gct” as dataset, select Pearson correlation for both the column and row distance measure, select “Pairwise complete linkage”, set the remaining options to “no” (but leave the output base name at <input.filename_basename>) and press “Run”. Click “Return to Modules & Pipelines Start”. GenePattern has clustered the rows (pathways) and columns (sub-experiments) based on the T-values. To see the results, click on the downward arrow next to “liver_wiki.atr” under “Recent Jobs” and “HierarchicalClustering” in the right column, select “HierarchicalClusteringViewer” and press “Run”.

You now see a heatmap of the T-values, where the rows are reordered according to the clustering indicated by the dendrogram to the left and the columns are reordered according to the clustering indicated by the dendrogram at the top. Additionally, the pathway/row profiles are plotted as lines at the far right.
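The two clustering settings chosen above can be illustrated with a small Python sketch. It assumes that the Pearson distance between two profiles is 1 minus their correlation and that complete linkage takes the largest pairwise distance between two clusters; GenePattern's internal implementation may differ in detail, and the function names and profiles are hypothetical.

```python
from math import sqrt

def pearson_distance(x, y):
    """1 - Pearson correlation between two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (norm_x * norm_y)

def complete_linkage(cluster_a, cluster_b):
    """Cluster distance = largest pairwise distance between their members."""
    return max(pearson_distance(p, q) for p in cluster_a for q in cluster_b)

# Hypothetical T-value profiles of three pathways over three sub-experiments.
rising = [1.0, 2.0, 3.0]
rising_steeper = [2.0, 4.0, 6.0]
falling = [3.0, 2.0, 1.0]

d_same = pearson_distance(rising, rising_steeper)
d_opposite = pearson_distance(rising, falling)
d_clusters = complete_linkage([rising], [rising_steeper, falling])
```

With a correlation-based distance, two pathways with the same up/down pattern are close even when their T-values differ in magnitude, while pathways with opposite patterns are maximally distant.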

Q22. What colors are used in the heatmap to represent positive/high and negative/low T-values?

Q23. Which sub-experiments behave differently compared to the others in their group?

Q24. Describe the four main groups of pathways and describe their particular behavior in relation to the compound. (define a representative process per group and describe roughly up/down for C, L and H).

Q25. Compare these findings to the “Results and Discussion” part of the corresponding paper and describe at least one finding in agreement with the paper.

Additional Exercise 1

Tox-Profiler also allows you to merge (sub-)experiments. Try to merge the high dose sub-experiments of all tissues into one new experiment. Study which genesets are affected in the same way in all tissues and which are affected particularly in a specific set of tissues.

Additional Exercise 2

Analyze the Spicker dataset. Which WikiPathways pathways are related to toxicity?

Additional Exercise 3

Confirm the results shown in the lecture this morning using the High Fat Mouse Adipose dataset.