Set up Juicer using the information below! If you have any questions at all, please do not hesitate to contact us!
Install ant with the package manager for your system, using one of:
brew install ant
sudo apt-get install ant
sudo yum install ant
/opt/juicer. You can also access a public mirror of these files by going to
https://s3.amazonaws.com/juicerawsmirror/opt/juicer/[paths_below], for example: https://s3.amazonaws.com/juicerawsmirror/opt/juicer/work/HIC003/fastq/HIC003_S2_L001_R1_001.fastq.gz.
# tmp directory
/opt/juicer/tmp

# sample work directory is /opt/juicer/work/HIC003
/opt/juicer/work/HIC003/fastq
/opt/juicer/work/HIC003/fastq/HIC003_S2_L001_R1_001.fastq.gz
/opt/juicer/work/HIC003/fastq/HIC003_S2_L001_R2_001.fastq.gz

# another sample work directory is /opt/juicer/work/MBR19
/opt/juicer/work/MBR19/fastq
/opt/juicer/work/MBR19/fastq/chr19_R1.fastq.gz
/opt/juicer/work/MBR19/fastq/chr19_R2.fastq.gz

# Core Juicer scripts from github in /opt/juicer/scripts
/opt/juicer/scripts/chimeric_blacklist.awk
/opt/juicer/scripts/statistics.pl
/opt/juicer/scripts/stats_sub.awk
/opt/juicer/scripts/split_rmdups.awk
/opt/juicer/scripts/countligations.sh
/opt/juicer/scripts/juicebox
/opt/juicer/scripts/juicebox_tools.jar
/opt/juicer/scripts/juicer_postprocessing.sh
/opt/juicer/scripts/dups.awk
/opt/juicer/scripts/juicer.sh
/opt/juicer/scripts/LibraryComplexity.class
/opt/juicer/scripts/hicInternalMenu.properties
/opt/juicer/scripts/abnormal.awk
/opt/juicer/scripts/check.sh
/opt/juicer/scripts/fragment.pl
/opt/juicer/scripts/makemega_addstats.awk
/opt/juicer/scripts/mega.sh
/opt/juicer/scripts/relaunch_prep.sh
/opt/juicer/scripts/cleanup.sh

# Sequence reference files in /opt/juicer/references
# hg19 and mm9 reference files in mirror
/opt/juicer/references/Homo_sapiens_assembly19.fasta
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta

Make sure to copy the appropriate scripts from the github repo to your cluster as well as the fastq reads and appropriate reference files.
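As a rough illustration of the mirror layout described above, the sketch below builds the mirror URL for a local path by prefixing it with the mirror base. The function name `mirror_url` is hypothetical and not part of Juicer, and the sketch assumes that every mirrored file keeps its /opt/juicer/... path unchanged.

```shell
#!/bin/sh
# Base URL of the public mirror, taken from the example URL above.
MIRROR_BASE="https://s3.amazonaws.com/juicerawsmirror"

# mirror_url is a hypothetical helper, not part of Juicer.
# It maps a local /opt/juicer/... path to the matching mirror URL.
mirror_url() {
    case "$1" in
        /opt/juicer/*) printf '%s%s\n' "$MIRROR_BASE" "$1" ;;
        *) echo "not under /opt/juicer: $1" >&2; return 1 ;;
    esac
}

mirror_url /opt/juicer/work/HIC003/fastq/HIC003_S2_L001_R1_001.fastq.gz
```

The printed URL could then be handed to a download tool of your choice, for example `curl -O "$(mirror_url <path>)"`.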
bwa index <fasta file>:
# after running BWA indexing
/opt/juicer/references/Homo_sapiens_assembly19.fasta.sa
/opt/juicer/references/Homo_sapiens_assembly19.fasta.ann
/opt/juicer/references/Homo_sapiens_assembly19.fasta.amb
/opt/juicer/references/Homo_sapiens_assembly19.fasta.pac
/opt/juicer/references/Homo_sapiens_assembly19.fasta.bwt
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta.bwt
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta.amb
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta.pac
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta.ann
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta.sa

After building the restriction sites files:
python generate_site_positions.py <enzyme> <genome ID> <fasta>
# restriction sites files in /opt/juicer/restriction_sites
/opt/juicer/restriction_sites/mm9_HindIII.txt
/opt/juicer/restriction_sites/mm10_MboI.txt
/opt/juicer/restriction_sites/mm10_DpnII.txt
/opt/juicer/restriction_sites/hg19_MboI.txt
/opt/juicer/restriction_sites/hg38_MboI.txt
/opt/juicer/restriction_sites/hg38_DpnII.txt
/opt/juicer/restriction_sites/hg19_DpnII.txt
/opt/juicer/restriction_sites/hg19_HindIII_new.txt
/opt/juicer/restriction_sites/mm9_DpnII.txt
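Before launching the pipeline, it may help to confirm that the BWA index files listed earlier sit next to the reference FASTA. The sketch below is one possible check. `check_bwa_index` is a hypothetical helper, not part of Juicer, and the demonstration uses placeholder files in a temporary directory rather than a real index.

```shell
#!/bin/sh
# check_bwa_index is a hypothetical helper, not part of Juicer.
# It reports which of the five BWA index files are missing or empty
# next to the given FASTA file, and fails if any are.
check_bwa_index() {
    fasta=$1
    missing=0
    for ext in amb ann bwt pac sa; do
        if [ ! -s "$fasta.$ext" ]; then
            echo "missing: $fasta.$ext"
            missing=1
        fi
    done
    return "$missing"
}

# Demonstration on a throwaway directory with placeholder files;
# the .sa file is left out on purpose so the check fails.
demo=$(mktemp -d)
for ext in amb ann bwt pac; do
    echo placeholder > "$demo/genome.fasta.$ext"
done
check_bwa_index "$demo/genome.fasta" || echo "index incomplete; rerun bwa index"
rm -rf "$demo"
```

On a real installation the call would look like `check_bwa_index /opt/juicer/references/Homo_sapiens_assembly19.fasta`.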
Start using Juicer using the information below!
Usage: juicer.sh [-g genomeID] [-d topDir] [-q queue] [-l long queue] [-s site]
                 [-a about] [-R end] [-S stage] [-p chrom.sizes path]
                 [-y restriction site file] [-z reference genome file]
                 [-C chunk size] [-D Juicer scripts directory]
                 [-Q queue time limit] [-L long queue time limit] [-r] [-h] [-x]
* [genomeID] must be defined in the script, e.g. "hg19" or "mm10" (default "hg19"); alternatively, it can be defined using the -z command
* [topDir] is the top level directory (default "/Users/nchernia/Downloads/neva-muck/UGER")
     [topDir]/fastq must contain the fastq files
     [topDir]/splits will be created to contain the temporary split files
     [topDir]/aligned will be created for the final alignment
* [queue] is the queue for running alignments (default "short")
* [long queue] is the queue for running longer jobs such as the hic file creation (default "long")
* [site] must be defined in the script, e.g. "HindIII" or "MboI" (default "MboI")
* [about]: enter description of experiment, enclosed in single quotes
* -r: use the short read version of the aligner, bwa aln (default: long read, bwa mem)
* [end]: use the short read aligner on read end, must be one of 1 or 2
* [stage]: must be one of "merge", "dedup", "final", "postproc", or "early".
    -Use "merge" when alignment has finished but the merged_sort file has not yet been created.
    -Use "dedup" when the files have been merged into merged_sort but merged_nodups has not yet been created.
    -Use "final" when the reads have been deduped into merged_nodups but the final stats and hic files have not yet been created.
    -Use "postproc" when the hic files have been created and only postprocessing feature annotation remains to be completed.
    -Use "early" for an early exit, before the final creation of the stats and hic files
* [chrom.sizes path]: enter path for chrom.sizes file
* [restriction site file]: enter path for restriction site file (locations of restriction sites in genome; can be generated with the script (misc/generate_site_positions.py) )
* [reference genome file]: enter path for reference sequence file, BWA index files must be in same directory
* [chunk size]: number of lines in split files, must be multiple of 4 (default 90000000, which equals 22.5 million reads)
* [Juicer scripts directory]: set the Juicer directory, which should have scripts/ references/ and restriction_sites/ underneath it (default /broad/aidenlab)
* [queue time limit]: time limit for queue, i.e. -W 12:00 is 12 hours (default 1200)
* [long queue time limit]: time limit for long queue, i.e. -W 168:00 is one week (default 3600)
* -x: exclude fragment-delimited maps from hic file creation
* -h: print this help and exit
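The chunk size rule in the usage text follows from the FASTQ format, where each read occupies four lines. The sketch below checks a candidate -C value and converts it to a read count. `reads_per_chunk` is a hypothetical helper, not part of Juicer.

```shell
#!/bin/sh
# reads_per_chunk is a hypothetical helper, not part of Juicer.
# A FASTQ record spans 4 lines, so a valid -C value must be a
# positive multiple of 4; the function prints the reads per chunk.
reads_per_chunk() {
    lines=$1
    case "$lines" in
        ''|*[!0-9]*)
            echo "chunk size must be a positive integer" >&2
            return 1 ;;
    esac
    if [ "$lines" -le 0 ] || [ $((lines % 4)) -ne 0 ]; then
        echo "chunk size must be a positive multiple of 4" >&2
        return 1
    fi
    echo $((lines / 4))
}

reads_per_chunk 90000000   # default chunk size; prints 22500000
```

A value such as 90000001 is rejected, because a split at that line count would cut a read in the middle of its four-line record.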
mkdir -p /custom/filepath/MyHIC
/local/path/scripts/juicer.sh [options]
where /local/path refers to the folder that contains the scripts folder and the other necessary files bundled with this distribution. Do not exit the screen or kill the script until you see a message saying that all jobs have been submitted.
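Since juicer.sh expects the reads under [topDir]/fastq, a small pre-flight check can catch a misplaced directory before any jobs are submitted. The sketch below is one way to do that. `check_topdir` is a hypothetical helper, not part of Juicer, and it assumes the reads end in .fastq or .fastq.gz as in the sample directories.

```shell
#!/bin/sh
# check_topdir is a hypothetical pre-flight helper, not part of Juicer.
# It verifies that <topDir>/fastq exists and holds at least one file
# ending in .fastq or .fastq.gz (an assumption based on the samples).
check_topdir() {
    dir=$1
    if [ ! -d "$dir/fastq" ]; then
        echo "no fastq directory under $dir" >&2
        return 1
    fi
    for f in "$dir"/fastq/*.fastq "$dir"/fastq/*.fastq.gz; do
        if [ -e "$f" ]; then
            echo "ok: $dir"
            return 0
        fi
    done
    echo "no fastq files in $dir/fastq" >&2
    return 1
}

# Demonstration on a throwaway directory with empty placeholder reads.
top=$(mktemp -d)
mkdir -p "$top/fastq"
: > "$top/fastq/sample_R1.fastq.gz"
: > "$top/fastq/sample_R2.fastq.gz"
check_topdir "$top"
rm -rf "$top"
```

With a real experiment the call would be `check_topdir /custom/filepath/MyHIC`, run just before the juicer.sh command.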
Use squeue to check the status of jobs. Eventually there will be no more jobs in the queue, and the ./debug folder will have a "Pipeline successfully completed" message.
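The completion message can also be checked from a script. The sketch below searches the files in a debug folder for that message. `pipeline_done` is a hypothetical helper, not part of Juicer, and the file name used in the demonstration is a placeholder, not a name Juicer is known to write.

```shell
#!/bin/sh
# pipeline_done is a hypothetical helper, not part of Juicer.
# It succeeds if any file directly inside the given debug folder
# contains the completion message quoted above.
pipeline_done() {
    grep -qs "Pipeline successfully completed" "$1"/*
}

# Demonstration on a throwaway folder; example.out is a placeholder name.
debug=$(mktemp -d)
echo "Pipeline successfully completed" > "$debug/example.out"
if pipeline_done "$debug"; then
    echo "finished"
else
    echo "still running or failed"
fi
rm -rf "$debug"
```

On a SLURM cluster this could be combined with a polling loop such as `while [ -n "$(squeue -h -u "$USER")" ]; do sleep 60; done`, followed by `pipeline_done ./debug`. Note that the loop waits for all of your jobs, not only the Juicer ones.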
NOTE: The Juicer pipeline described under Complete Pipeline automatically builds the Hi-C maps and annotates the features mentioned below. The additional documentation below is intended for advanced customization of the post-processing algorithms.
The Juicer postprocessing tools below require a valid .hic file and the Juicebox Command Line Tool jar file. Additional resources may be needed for some of the specific tools (e.g. CUDA and a GPU for finding loops).
arrowhead [-c chromosome(s)] [-m matrix size] [-r resolution] [-k normalization (NONE/VC/VC_SQRT/KR)] <HiC file(s)> <output_file> [feature_list] [control_list]
arrowhead https://hicfiles.s3.amazonaws.com/hiseq/ch12-lx-b-lymphoblasts/in-situ/combined_30.hic contact_domains_list
arrowhead https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined_30.hic contact_domains_list
hiccups [-m matrixSize] [-c chromosome(s)] [-r resolution(s)] [-k normalization (NONE/VC/VC_SQRT/KR)] [-f fdr] [-p peak width] [-i window] [-t thresholds] [-d centroid distances] <HiC file(s)> <outputLoopsList>
hiccups HIC006.hic all_hiccups_loops
hiccups -m 500 -r 5000,10000 -f 0.1,0.1 -p 4,2 -i 7,5 -d 20000,20000,0 -c 22 HIC006.hic all_hiccups_loops
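In the second hiccups example, the -r, -f, -p and -i options each carry one comma-separated value per resolution. The sketch below lines those lists up position by position to make the pairing easier to read. It rests on the assumption that the nth entry of each list applies to the nth resolution. `show_hiccups_params` is a hypothetical helper, not part of Juicer, and it does not validate list lengths. The -d list is left out because its length in the example differs from the number of resolutions.

```shell
#!/bin/sh
# show_hiccups_params is a hypothetical helper, not part of Juicer.
# It prints the -r, -f, -p and -i lists side by side, assuming the
# nth entry of each list belongs to the nth resolution.
show_hiccups_params() {
    res=$1; fdr=$2; peak=$3; win=$4
    # Count the resolutions; the arithmetic strips any padding from wc.
    n=$(( $(echo "$res" | tr ',' '\n' | wc -l) ))
    i=1
    while [ "$i" -le "$n" ]; do
        printf 'resolution=%s fdr=%s peak_width=%s window=%s\n' \
            "$(echo "$res"  | cut -d, -f"$i")" \
            "$(echo "$fdr"  | cut -d, -f"$i")" \
            "$(echo "$peak" | cut -d, -f"$i")" \
            "$(echo "$win"  | cut -d, -f"$i")"
        i=$((i + 1))
    done
}

# Values from the example command above.
show_hiccups_params 5000,10000 0.1,0.1 4,2 7,5
```

For the example values this prints one line for the 5000 resolution (fdr 0.1, peak width 4, window 7) and one for the 10000 resolution (fdr 0.1, peak width 2, window 5).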
apa [-n minval] [-x maxval] [-w window] [-r resolution(s)] [-c chromosome(s)] [-k NONE/VC/VC_SQRT/KR] <HiC file(s)> <PeaksFile> <SaveFolder>
apa HIC006.hic all_loops.txt results1
apa https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined.hic all_loops.txt results1
apa -r 10000,5000 -c 17,18 HIC006.hic+HIC007.hic all_loops.txt results
motifs <genomeID> <bed_file_dir> <looplist> [custom_global_motif_list]
motifs hg19 /path/to/local/bed/files gm12878_hiccups_loops.txt hg_19_custom_motif_list.txt
If you use Juicer in your research, please cite:
Neva C. Durand*, Muhammad S. Shamim*, Ido Machol, Suhas S. P. Rao, Miriam H. Huntley, Eric S. Lander, and Erez Lieberman Aiden. "Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments." Cell Systems 3(1), 2016.
Suhas S.P. Rao*, Miriam H. Huntley*, Neva C. Durand, Elena K. Stamenova, Ivan D. Bochkov, James T. Robinson, Adrian L. Sanborn, Ido Machol, Arina D. Omer, Eric S. Lander, Erez Lieberman Aiden. "A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping." Cell 159, 2014.
* contributed equally