This is a README file for the Differential Expression Analysis of the Aging project. Below I've described what each folder and file contains. This is quite an extensive report, so please don't hesitate to ask any questions! If any part of this README is confusing, please let me know and I can adjust accordingly. :)

Please note that for the DEG analysis I used 4 comparisons:

Target---Reference
Dp16_old---Dp16_yng
2N_old---2N_yng
Dp16_old---2N_old
Dp16_yng---2N_yng

Whenever a gene or pathway is Up-regulated, that refers to the Target. For example, if GeneA has a positive log2FC in Dp16_old vs. Dp16_yng, then it is Up-regulated in Dp16_old and Down-regulated in Dp16_yng.

~~~~~

CP_OUT/ - this folder contains enriched pathways from three main databases: GO, KEGG, and REACT

csv_tables/ - CSV files are broken down by cluster number. For each cluster number, differential expression was performed for every comparison (4 total).

csv_tables/GO/ - each comparison and cluster number used three gene set libraries from EnrichR: BP (Biological Process), CC (Cellular Component), and MF (Molecular Function). Each comparison has three regulation directions: UP (up), DOWN (dwn), and COMBINED (cmb - contains both UP- and DOWN-regulated DEGs). For example, the file name for cluster 0 reads as follows:

"0_Dp16_old-2N_old_cmb_BP_go.csv"
0 = cluster number
Dp16_old-2N_old = Target-Reference
cmb = Combined regulation
BP = gene set library; in this case Biological Process
go = GO database

In total, there are 4 pairwise comparisons, 3 gene set libraries, and 3 regulation directions. Please note, however, that not every cluster will have DEGs detected for each comparison/library/regulation.

csv_tables/KEGG/ - same notation and file format as the GO folder, except KG = KEGG, and that is the only gene set library available.

csv_tables/REACT/ - same notation and file format as the GO folder, except RCT = REACT, and that is the only gene set library available.
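If you want to group or filter the GO tables programmatically, the file naming described above can be split with a regular expression. This is only a minimal sketch in Python: the pattern is my assumption, inferred from the single example file name, and the names GO_NAME and parse_go_filename are made up for illustration. KEGG and REACT file names may not follow exactly the same pattern, so they are not handled here.

```python
import re

# Assumed pattern for GO tables only, inferred from the one example
# "0_Dp16_old-2N_old_cmb_BP_go.csv". Adjust if real file names differ.
GO_NAME = re.compile(
    r"^(?P<cluster>\d+)_"                  # cluster number
    r"(?P<target>[^-]+)-(?P<reference>.+)_"  # Target-Reference
    r"(?P<regulation>up|dwn|cmb)_"         # regulation direction
    r"(?P<library>BP|CC|MF)_go\.csv$"      # gene set library + GO database tag
)


def parse_go_filename(name):
    """Split a GO table file name into its parts; return None if it does not match."""
    match = GO_NAME.match(name)
    return match.groupdict() if match else None


parts = parse_go_filename("0_Dp16_old-2N_old_cmb_BP_go.csv")
print(parts)
```

All values come back as strings (including the cluster number), and any file that does not fit the assumed pattern returns None rather than raising an error, so unexpected names are easy to spot.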
html_tables/ - this folder follows the same organization as csv_tables/, except that the tables are in an interactive HTML format.

plots/ - These dot plots display the top N enriched pathways from the GO database, including the BP, CC, and MF gene set libraries.

~~~~~

DESeq_out/ - This folder contains the DESeq2 output from the differential expression analysis. For each cluster number there should be 4 files total, one for each pairwise comparison. Each file contains both UP- and DOWN-regulated genes. The first condition listed in the file is the Target, followed by the Reference condition.

~~~~~

DEreports/ - This folder contains volcano plots of DEGs that passed the log2 fold-change and adj. p-value cutoffs. Each cluster has its own file with 2 volcano plots per pairwise comparison. The table on the right side displays how many DEGs passed various adj. p-value cutoffs. This file can be used to help decide whether we like the cutoffs or whether they should be stricter or looser.

~~~~~

Filtered_Matrices/ - This folder contains 4 filtered bulk matrices (one per pairwise comparison) per cluster number. These bulk matrices can be used in future analyses, like generating CPMs.

~~~~~

DEG_countSummary.txt - This file contains general information regarding the DEG analysis. The columns are as follows: cluster number, regulation direction, pairwise comparison (Target-Reference), DEG count, and the variables/covariates used in the DESeq2 formula.
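To make the Target/Reference sign convention and the cutoff table in DEreports/ a bit more concrete, here is a rough sketch in Python of how a gene could be labeled up or dwn and how DEG counts change across adj. p-value cutoffs. Everything specific in it is an assumption: the cutoff values are placeholders (this README does not state the ones actually used), the function names are made up, and the example rows are invented numbers, not results from this project. It only assumes the standard DESeq2 result columns log2FoldChange and padj.

```python
# Placeholder cutoffs; NOT the values used in this analysis.
LFC_CUTOFF = 1.0
PADJ_CUTOFF = 0.05


def classify(log2fc, padj, lfc_cutoff=LFC_CUTOFF, padj_cutoff=PADJ_CUTOFF):
    """Return 'up' or 'dwn' relative to the Target, or None if the gene fails the cutoffs."""
    if padj is None or padj >= padj_cutoff:  # DESeq2 can report NA for padj
        return None
    if log2fc >= lfc_cutoff:
        return "up"    # higher in Target than in Reference
    if log2fc <= -lfc_cutoff:
        return "dwn"   # lower in Target than in Reference
    return None


def count_by_padj(rows, padj_cutoffs=(0.1, 0.05, 0.01), lfc_cutoff=LFC_CUTOFF):
    """Count up/dwn DEGs at each adj. p-value cutoff, holding the log2FC cutoff fixed."""
    counts = {}
    for cutoff in padj_cutoffs:
        labels = [classify(lfc, padj, lfc_cutoff, cutoff) for _, lfc, padj in rows]
        counts[cutoff] = {"up": labels.count("up"), "dwn": labels.count("dwn")}
    return counts


# Invented rows: (gene, log2FoldChange, padj). For illustration only.
toy_rows = [
    ("GeneA", 2.3, 0.001),
    ("GeneB", -1.7, 0.03),
    ("GeneC", 0.4, 0.0005),
    ("GeneD", 1.2, 0.08),
    ("GeneE", -3.0, None),
]

summary = count_by_padj(toy_rows)
for cutoff, n in summary.items():
    print(cutoff, n)
```

Comparing the counts across cutoffs this way is the same kind of check the table next to the volcano plots supports: if the numbers barely move between cutoffs, the choice of cutoff matters little for that cluster and comparison.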