Hotspot and the SPOT data quality metric

Hotspot is a program for identifying regions of local enrichment of short-read sequence tags mapped to the genome, using a binomial distribution model. Regions flagged by the algorithm are called "hotspots." The algorithm uses a local background model that automatically normalizes for large regions of elevated tag levels due to, for example, copy number effects. Hotspot can also detect enriched regions of highly variable size, making it applicable to both broad and highly punctate signals. We have applied it extensively to DNase-seq and ChIP-seq data, including transcription factor (CTCF) and histone modification (H3K4me3, H3K36me3, H3K27me3) data.
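
As a rough illustration of what a binomial enrichment score against a local background can look like, here is a minimal Python sketch. It is not taken from the hotspot source code. The function name, the choice of a z-score, and the window widths in the example (250 bp inside 50 kb) are assumptions made only for illustration.

```python
import math


def binomial_z_score(n_small, n_large, small_width, large_width):
    """Z-score for seeing n_small tags in a small window, given n_large tags
    in a surrounding background window (hypothetical helper, not hotspot's API).

    Under a binomial model, each of the n_large background tags falls in the
    small window with probability p = small_width / large_width.
    """
    if n_large <= 0:
        raise ValueError("background window must contain at least one tag")
    if not 0 < small_width < large_width:
        raise ValueError("small window must be narrower than the background window")
    p = small_width / large_width
    expected = n_large * p                      # binomial mean
    sd = math.sqrt(n_large * p * (1.0 - p))     # binomial standard deviation
    return (n_small - expected) / sd


# Example with assumed widths: 20 tags in a 250 bp window,
# 1000 tags in the surrounding 50 kb background window.
z = binomial_z_score(20, 1000, 250, 50000)
print(round(z, 2))
```

Because the expected count is derived from the surrounding window rather than from the whole genome, a region with uniformly elevated tag density (for example, an amplified segment) raises both counts together and does not by itself produce a large score.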

Hotspot was originally conceived and implemented by Mike Hawrylycz. Additional contributors and developers include Bob Thurman, Eric Haugen, and Scott Kuehn.

This distribution also includes scripts for computing SPOT (Signal Portion of Tags), a quality measure for short-read sequence experiments. SPOT is simply the percentage of all tags that fall in hotspots.
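
The SPOT definition above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the scripts shipped in the distribution. The function name and the input layout (tags as (chromosome, position) pairs, hotspots as non-overlapping half-open (chromosome, start, end) intervals, as in BED files) are assumptions.

```python
from bisect import bisect_right
from collections import defaultdict


def spot_percentage(tags, hotspots):
    """Percentage of tags that fall inside hotspots (hypothetical helper).

    tags     : iterable of (chrom, position)
    hotspots : iterable of (chrom, start, end), half-open, assumed non-overlapping
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in hotspots:
        by_chrom[chrom].append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()
    starts = {c: [s for s, _ in iv] for c, iv in by_chrom.items()}

    total = 0
    in_hotspots = 0
    for chrom, pos in tags:
        total += 1
        intervals = by_chrom.get(chrom)
        if not intervals:
            continue
        # Find the last hotspot starting at or before this position.
        i = bisect_right(starts[chrom], pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            in_hotspots += 1
    if total == 0:
        raise ValueError("no tags given")
    return 100.0 * in_hotspots / total


tags = [("chr1", 100), ("chr1", 150), ("chr1", 500), ("chr2", 10)]
hotspots = [("chr1", 90, 200)]
print(spot_percentage(tags, hotspots))  # 2 of 4 tags are in hotspots: 50.0
```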

Documentation for hotspot (Word and Powerpoint documents are slightly out-of-date)

Sam John, Peter J Sabo, Robert E Thurman, Myong-Hee Sung, Simon C Biddie, Thomas A Johnson, Gordon L Hager, John A Stamatoyannopoulos, Chromatin accessibility pre-determines glucocorticoid receptor binding patterns. Nature Genetics, 43, 264-268 (2011).
Brief methods-type description (Word)
Powerpoint presentation

Feb 2014

Hotspot is currently hosted on Github

See the hotspot Github repository for the current version of hotspot.

5 Jul 2013

Hotspot-SPOT distribution (v4)

This version has been superseded by the current version hosted on Github. See above.
Gzipped tar-ball (254Mb). After unpacking, start with the top-level README file.
The changes between v3 and v4 are detailed in the CHANGES file in the top level of the distribution. Some highlights are as follows.
1. For ChIP-seq data, an input tags file is now accommodated. This will trigger subtracting input tags from the ChIP tags in hotspots in the final scoring of hotspots.
2. A final cleanup script is now provided which creates a simplified output directory with more intuitive file-naming conventions, and includes a "clean" option that removes all intermediate directories and files.
3. Auxiliary scripts are now provided for generating mappability files, so that you can do this yourself, provided you have access to the bowtie aligner (Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10:R25, http://bowtie-bio.sourceforge.net). This should remove the need for updating the "Mappability files" section, below (v3), but we will continue to entertain requests to generate new files as well.

25 Jan 2013

Hotspot-SPOT distribution (v3)

This version is now superseded, although I will continue to try to answer questions about it. It includes code for computing hotspots (including SPOT scores) and peaks, and for performing FDR thresholding. (Version 2, below, includes neither peak-finding nor FDR thresholding capabilities.) The code has only been tested on Linux.
Gzipped tar-ball (304Mb). After unpacking, start with the top-level README file.

Mappability files

The files below contain coordinates of uniquely-mappable regions of the genome for various read lengths and genomes. Use them for the _MAPPABLE_FILE_ variable defined in runall.tokens.txt. NOTE: the .starch files are bed files compressed using the starch tool, which is part of the BEDOPS suite. The file used in _MAPPABLE_FILE_ must be uncompressed (you can use unstarch from BEDOPS for this purpose). If you need a particular combination not available below, feel free to contact ehaugen(at)altiusinstitute.org. For help with running hotspot, please contact rsandstrom(at)altiusinstitute.org.
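
For the decompression step mentioned above, a small Python wrapper around the BEDOPS unstarch command might look like the following. It assumes unstarch is installed and on your PATH, and relies on unstarch writing the uncompressed bed data to standard output. The function names and file names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def unstarch_command(starch_path):
    """Build the command line; unstarch writes bed data to standard output."""
    return ["unstarch", str(starch_path)]


def decompress_starch(starch_path, bed_path):
    """Write the uncompressed bed file that _MAPPABLE_FILE_ should point to."""
    if shutil.which("unstarch") is None:
        raise RuntimeError("unstarch (BEDOPS) was not found on PATH")
    with open(bed_path, "wb") as out:
        subprocess.run(unstarch_command(starch_path), stdout=out, check=True)
    return Path(bed_path)


# Example usage (hypothetical file names):
# decompress_starch("mappable.36mer.starch", "mappable.36mer.bed")
```

The resulting bed file path is what you would then give as the value of _MAPPABLE_FILE_.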

hg38 (_CHROM_FILE_)

27bp reads (starch file, 23 Mb)
36bp reads (starch file, 14 Mb)
50bp reads (starch file, 6.5 Mb)
76bp reads (starch file, 1.9 Mb)

hg19 (_CHROM_FILE_)

20bp reads (bed file, 787 Mb, starch file, 48 Mb)
22bp reads (bed file, 484 Mb, starch file, 32 Mb)
26bp reads (bed file, 334 Mb, starch file, 24 Mb)
27bp reads (bed file, 326 Mb, starch file, 23 Mb)
32bp reads (bed file, 236 Mb, starch file, 17 Mb)
36bp reads (bed file, 182 Mb, starch file, 14 Mb)
40bp reads (bed file, 182 Mb, starch file, 14 Mb)
42bp reads (bed file, 119 Mb, starch file, 10 Mb)
50bp reads (bed file, 72 Mb, starch file, 6.0 Mb)
58bp reads (bed file, 42 Mb, starch file, 3.9 Mb)
72bp reads (bed file, 20 Mb, starch file, 2.0 Mb)
76bp reads (bed file, 17 Mb, starch file, 1.7 Mb)
100bp reads (bed file, 9 Mb, starch file, 900 kb)

mm9 (_CHROM_FILE_)

36bp reads (bed file, 117 Mb, starch file, 9.7 Mb)
40bp reads (bed file, 95 Mb, starch file, 8.1 Mb)
48bp reads (bed file, 66 Mb, starch file, 5.9 Mb)
51bp reads (bed file, 58 Mb, starch file, 5.3 Mb)
100bp reads (bed file, 15 Mb, starch file, 1.5 Mb)

mm10 (_CHROM_FILE_)

36bp reads (bed file, 120 Mb, starch file, 9.9 Mb)
50bp reads (bed file, 64 Mb, starch file, 5.7 Mb)
60bp reads (bed file, starch file)

dm3 (_CHROM_FILE_)

27bp reads (bed file, 4.3 Mb, starch file, 346 kb)
36bp reads (bed file, 3.8 Mb, starch file, 306 kb)

dm6 (_CHROM_FILE_)

35bp reads (bed file, starch file)
36bp reads (bed file, starch file)
45bp reads (bed file, starch file)

sacCer2 (_CHROM_FILE_)

36bp reads (bed file, 51 kb, starch file, 9.7 kb)

sacCer3 (_CHROM_FILE_)

36bp reads (bed file, 51 kb, starch file, 12 kb)
100bp reads (bed file, 18 kb, starch file, 8 kb)

ce4 (_CHROM_FILE_)

36bp reads (bed file, 1.5 Mb, starch file, 157 kb)

ce10 (_CHROM_FILE_)

50bp reads (bed file, 921 kb, starch file, 103 kb)

rn5 (_CHROM_FILE_)

30bp reads (bed file, 139 Mb, starch file, 11 Mb)

TAIR9 (_CHROM_FILE_)

36bp reads (bed file, starch file)

11 June 2010

Hotspot-SPOT distribution (v2)

Gzipped tar-ball (176Mb). After unpacking, start with the top-level README file. The documentation linked above is in the doc directory.
UW utils (gzipped tar-ball, 102 kb): a selection of bed file-oriented utilities for computing various set operations, sorting, etc., required by many of the scripts included in the SPOT distribution.
Mappability files for hg19 (.tgz, 47Mb). The above distribution is set up for hg18 coordinates. To work with hg19 coordinates, use the files here as the values of the token variables _CHROM_FILE_, _MAPPABLE_10KB_FILE_, and _MAPPABLE_FILE_.