Transcription Start Site (TSS) Data from Drosophila S2 Cells

These data are provided as a service to the Drosophila community so that investigators can examine the TSSs of their favorite genes. The recommended citations are Vo ngoc et al. (Genetics, in preparation) and the Gene Expression Omnibus (GEO) accession number GSE68677.

5'GRO-seq data from Drosophila S2 cells were retrieved from the Gene Expression Omnibus (GSE68677). The FASTQ sequencing files for each replicate (samples GSM1678911 and GSM1678912) were mapped to the dm3 assembly with Bowtie2 (default parameters). Pile-ups were built from reads with a MAPQ score ≥ 10. Focused transcription start sites (TSSs) were then called as described in Vo ngoc et al. (2017) by using FocusTSS.py (parameters: RCmin = 14, FImin = 0.67) on each 5'GRO-seq dataset separately. TSS peaks common to both replicates were selected and further analyzed.
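The MAPQ filter and strand separation described above can be sketched as follows. This is a minimal illustration, not the code used to generate the files: the function name five_prime_counts is hypothetical, and counting only the 5' end of each read is an assumption (the text states the MAPQ threshold but not how read coverage was tallied). The input is assumed to be SAM text as written by Bowtie2, and positions are 1-based as in SAM.

```python
import re
from collections import Counter

MAPQ_MIN = 10  # threshold stated in the text
_CIGAR = re.compile(r"(\d+)([MIDNSHP=X])")
_REF_OPS = set("MDN=X")  # CIGAR operations that consume reference bases


def five_prime_counts(sam_lines, mapq_min=MAPQ_MIN):
    """Count read 5' ends per (chrom, position), separately for each strand.

    Assumption: the signal is tallied at the 5' end of each read.
    """
    plus, minus = Counter(), Counter()
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        f = line.rstrip("\n").split("\t")
        flag, chrom, pos, mapq, cigar = int(f[1]), f[2], int(f[3]), int(f[4]), f[5]
        if flag & 0x4 or mapq < mapq_min:  # unmapped or low-confidence read
            continue
        if flag & 0x10:
            # Reverse-strand read: the 5' end is the rightmost aligned base.
            ref_len = sum(int(n) for n, op in _CIGAR.findall(cigar) if op in _REF_OPS)
            minus[(chrom, pos + ref_len - 1)] += 1
        else:
            plus[(chrom, pos)] += 1
    return plus, minus
```

With a SAM file on disk, the function could be called as five_prime_counts(open("replicate1.sam")), where the file name is a placeholder.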

Pile-ups of the 5'GRO-seq signal are provided for each replicate in BedGraph and BigWig format. For each dataset, the signal on the top and bottom strands is separated into two files (plus and minus). We also provide a BED file containing the locations of the 3,355 focused TSSs that were observed in both 5'GRO-seq replicates. In addition, we provide an Excel file that indicates, for each TSS, the presence or absence of TATA, TATA-like, Inr, TCT, MTE, and DPE motifs as well as the identity of the closest gene. BED files containing all TSSs with specific core promoter motifs (TATA, TATA-like, Inr, TCT, MTE, and DPE, as identified by HOMER in Figure 4 of Vo ngoc et al., in preparation) are also included.
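To look up the focused TSSs near a gene of interest, the BED file can be read with a few lines of Python. The sketch below is illustrative only: BedRecord, parse_bed, and tss_in_region are hypothetical names, and the column layout is an assumption (the standard BED order, with name in column 4 and strand in column 6 when present). BED coordinates are 0-based and half-open, and they refer to the dm3 assembly.

```python
from typing import Iterable, List, NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str
    strand: str


def parse_bed(lines: Iterable[str]) -> List[BedRecord]:
    """Parse BED lines, skipping blank, comment, track, and browser lines."""
    records = []
    for line in lines:
        f = line.split()
        if len(f) < 3 or f[0] in ("track", "browser") or f[0].startswith("#"):
            continue
        name = f[3] if len(f) > 3 else "."
        strand = f[5] if len(f) > 5 else "."
        records.append(BedRecord(f[0], int(f[1]), int(f[2]), name, strand))
    return records


def tss_in_region(records, chrom, start, end):
    """Return the records that overlap the half-open interval [start, end)."""
    return [r for r in records
            if r.chrom == chrom and r.start < end and r.end > start]
```

For example, tss_in_region(parse_bed(open("Focused_TSSs.bed")), "chr2L", 100000, 120000) would list the focused TSSs in that window; the coordinates here are arbitrary placeholders.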

UCSC Genome Browser Track

Focused_TSSs.bed //BED file for the focused TSSs
GSM1678911_S2_GRO5-r1_minus.bedGraph //BedGraph file for the first replicate, bottom strand
GSM1678911_S2_GRO5-r1_plus.bedGraph //BedGraph file for the first replicate, top strand
GSM1678911_S2_GRO5-r1_minus.bw //BigWig file for the first replicate, bottom strand
GSM1678911_S2_GRO5-r1_plus.bw //BigWig file for the first replicate, top strand
GSM1678912_S2_GRO5-r2_minus.bedGraph //BedGraph file for the second replicate, bottom strand
GSM1678912_S2_GRO5-r2_plus.bedGraph //BedGraph file for the second replicate, top strand
GSM1678912_S2_GRO5-r2_minus.bw //BigWig file for the second replicate, bottom strand
GSM1678912_S2_GRO5-r2_plus.bw //BigWig file for the second replicate, top strand
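
The BigWig files listed above can be displayed in the UCSC Genome Browser (dm3 assembly) as custom tracks, which requires that the files be reachable at a public URL. The sketch below prints one custom track line per file. The base URL is a hypothetical placeholder, and the track names, colors, and visibility setting are illustrative choices rather than settings supplied with the data.

```python
BASE_URL = "https://example.org/tss_data"  # hypothetical hosting location

BIGWIG_FILES = [
    "GSM1678911_S2_GRO5-r1_plus.bw",
    "GSM1678911_S2_GRO5-r1_minus.bw",
    "GSM1678912_S2_GRO5-r2_plus.bw",
    "GSM1678912_S2_GRO5-r2_minus.bw",
]


def track_line(filename, base_url=BASE_URL):
    """Build a UCSC custom track line for one BigWig file."""
    stem = filename.rsplit(".", 1)[0]
    # Illustrative colors: blue for the top strand, red for the bottom strand.
    color = "0,0,255" if stem.endswith("_plus") else "255,0,0"
    return ('track type=bigWig name="{0}" description="GRO5 signal {0}" '
            'visibility=full color={1} bigDataUrl={2}/{3}'
            ).format(stem, color, base_url, filename)


for name in BIGWIG_FILES:
    print(track_line(name))
```

The printed lines can be pasted into the custom track submission box of the browser once BASE_URL points to the actual location of the files.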

FocusedTSSs_motif_sequence.xlsx //Excel file for the focused TSSs, their core promoter motifs, and the closest gene
INRpeaks.bed //BED file for Inr+ focused TSSs
TCTpeaks.bed //BED file for TCT+ focused TSSs
MTEpeaks.bed //BED file for MTE+ focused TSSs
DPEpeaks.bed //BED file for DPE+ focused TSSs
TATApeaks.bed //BED file for TATA-box+ focused TSSs
TATAlikepeaks.bed //BED file for TATA-like+ focused TSSs
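
The per-motif BED files can be combined into a single lookup that reports which motifs are present at each TSS. The following is a minimal sketch under stated assumptions: bed_keys and motif_table are hypothetical helper names, and a TSS is assumed to be identified by the same chromosome, start, end, and strand in every file. The Excel file remains the authoritative per-TSS summary.

```python
from collections import defaultdict

# Motif label -> file name, taken from the listing above.
MOTIF_FILES = {
    "Inr": "INRpeaks.bed",
    "TCT": "TCTpeaks.bed",
    "MTE": "MTEpeaks.bed",
    "DPE": "DPEpeaks.bed",
    "TATA": "TATApeaks.bed",
    "TATA-like": "TATAlikepeaks.bed",
}


def bed_keys(lines):
    """Return the set of (chrom, start, end, strand) keys found in BED lines."""
    keys = set()
    for line in lines:
        f = line.split()
        if len(f) < 3 or f[0] in ("track", "browser") or f[0].startswith("#"):
            continue
        strand = f[5] if len(f) > 5 else "."
        keys.add((f[0], int(f[1]), int(f[2]), strand))
    return keys


def motif_table(motif_lines):
    """Map each TSS key to the set of motif labels whose file contains it."""
    table = defaultdict(set)
    for motif, lines in motif_lines.items():
        for key in bed_keys(lines):
            table[key].add(motif)
    return dict(table)


# Usage with the files on disk (assumes they are in the working directory):
# table = motif_table({m: open(path) for m, path in MOTIF_FILES.items()})
```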

References:

Vo ngoc, L., Cassidy, C.J., Huang, C.Y., Duttke, S.H.C., and Kadonaga, J.T. (2017). The human initiator is a distinct and abundant element that is precisely positioned in focused core promoters. Genes Dev. 31, 6-11. PMCID: PMC5287114

Vo ngoc, L., Kassavetis, G.A., and Kadonaga, J.T. The RNA polymerase II core promoter from the Drosophila perspective. Commissioned review article to be submitted to Genetics as part of the Flybook series.