Snakemake Tutorial - GRIOTE · 09/20/2016


09/20/2016 Snakemake Tutorial 1

Snakemake Tutorial


Summary

● Context

● Common uses (4 steps)

● Specific uses (2 steps)

● Still in progress


● An easy script scheduler in Python

● Can be used on a computing grid

● Can be used with virtual environments

Context Common uses Specific uses Still in progress


● What for?

For example, you need to process the 1500 fastq files of the Lanner experiment through a succession of parallelised treatments.


● The pipeline we will build

[Figure: pipeline diagram. For each input file (ERR1307260, ERR1307264, ...), rules 'fastqc' and 'unzip' lead to rule 'miniQC', and rules 'tophat' and 'htseq' run per file; rule 'all' is the final target.]


● Prerequisites

conda installed

access to the BiRD grid


● 1st Step

– Virtual environment creation
– Basic snakefile creation (first part)


● Environment (shell commands)

conda create -n myTutoEnv python=3.5 snakemake fastQC bioconductor-deseq2 -c bioconda

qlogin

source activate myTutoEnv

touch snakefile


● Snakefile (1/2)

import glob, os

inFiles = set() ## set of all input files (only file names)

## extract file names from paths
inPaths = glob.glob("/sandbox/ylelievre/singlecell/tuto/fastq/*.fastq.gz")
for p in inPaths:
    inFiles.add(os.path.basename(p).replace(".fastq.gz", ""))

##############

The basenames are extracted from the files to be processed.
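The basename extraction can be tried outside Snakemake. The sketch below is only an illustration: the paths are hypothetical stand-ins for what glob.glob() would return, and the helper name sample_names is not part of the tutorial.

```python
import os

# Hypothetical paths; in the tutorial they come from glob.glob() on the fastq directory.
paths = [
    "/data/fastq/ERR1307260.fastq.gz",
    "/data/fastq/ERR1307261.fastq.gz",
]

def sample_names(paths, suffix=".fastq.gz"):
    """Return the set of file names with the directory and the suffix removed."""
    names = set()
    for p in paths:
        base = os.path.basename(p)           # drop the directory part
        if base.endswith(suffix):
            names.add(base[: -len(suffix)])  # drop the suffix only at the end
    return names

print(sorted(sample_names(paths)))
```

Slicing the suffix off the end is a slightly stricter choice than replace(), which would also rewrite ".fastq.gz" if it appeared in the middle of a name.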


● Snakefile (2/2)

rule all:
    input:
        I1 = expand("fastQC/{filename}_fastqc.zip", filename=inFiles)

##############

rule fastqc:
    input:
        "/sandbox/ylelievre/singlecell/tuto/fastq/{afile}.fastq.gz"
    output:
        "fastQC/{afile}_fastqc.zip"
    shell:
        "fastqc -o fastQC {input}"

The first rule (all) defines the target of the pipeline.

The subsequent rules define the different steps needed to reach the target (here, just one step).
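To see how the target list of rule all relates to the wildcard of rule fastqc, here is a plain-Python sketch. It is a simplified stand-in, not Snakemake's implementation: expand_sketch is a hypothetical helper that handles a single wildcard, and the regular expression only imitates how an output pattern is matched against a requested file.

```python
import re

def expand_sketch(pattern, **wildcards):
    """Fill one named placeholder with every value given for it (single wildcard only)."""
    (name, values), = wildcards.items()
    return [pattern.format(**{name: v}) for v in sorted(values)]

inFiles = {"ERR1307260", "ERR1307261"}
targets = expand_sketch("fastQC/{filename}_fastqc.zip", filename=inFiles)
print(targets)

# Matching a requested target against the output pattern of rule fastqc
# recovers the value of the wildcard {afile}.
match = re.fullmatch(r"fastQC/(?P<afile>.+)_fastqc\.zip", targets[0])
print(match.group("afile"))
```

The recovered wildcard value is then substituted into the input pattern, which is how one rule definition serves every file.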


● Launch the pipeline (shell command)

snakemake -p -n -s mySnakefileName

-p : print the shell commands that are executed

-n : dry run (show what would be executed without running anything)

-s : path to the Snakefile, needed if it is not named 'Snakefile'


● What can be observed?

– The rules are executed sequentially (SGE is not used)
– The 5 fastqc jobs are executed before the rule all
– There is no specific order among the fastqc jobs

[Figure: one 'fastqc' job per file (ERR1307260, ERR1307265, ERR130726...), all feeding rule 'all'.]
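The ordering constraint (every fastqc job before all, no fixed order among the fastqc jobs) is what a topological sort of the dependency graph gives. The sketch below uses Python's standard graphlib module (Python 3.9+) as an illustration; the job labels are made up for the example and this is not how Snakemake schedules jobs internally.

```python
from graphlib import TopologicalSorter

samples = ["ERR1307260", "ERR1307261", "ERR1307262"]

# Map each job to the set of jobs it depends on:
# rule all needs one fastqc job per sample.
graph = {"all": {f"fastqc:{s}" for s in samples}}

order = list(TopologicalSorter(graph).static_order())
print(order)
```

Any ordering of the three fastqc jobs is valid; only the position of "all" at the end is guaranteed.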


● Preparation of the next step

– Delete a zip file (ERR1307264) from the directory ./fastQC


● 2nd Step

– Basic snakefile creation (second part)


● Snakefile update

rule all:
    input:
        I1 = expand("fastQC/{filename}_unzip_fastqc.done", filename=inFiles)

##############
...

rule unzip_fastqc:
    input:
        zips = "fastQC/{zipfile}_fastqc.zip"
    output:
        touch("fastQC/{zipfile}_unzip_fastqc.done")
    shell:
        "unzip -o {input.zips} -d ./fastQC"

The input of rule all is modified.

The rule unzip_fastqc is added.
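The ".done" file produced with touch() is an empty marker whose only role is to record that a step finished. The idea can be sketched in plain Python; run_step is a hypothetical helper and a simplified model, not Snakemake's own logic.

```python
import tempfile
from pathlib import Path

def run_step(done_file, action):
    """Run action() only if the marker file is absent, then create the marker."""
    if done_file.exists():
        return False              # step already done, nothing to run
    action()
    done_file.parent.mkdir(parents=True, exist_ok=True)
    done_file.touch()             # empty file that records success
    return True

with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "fastQC" / "ERR1307260_unzip_fastqc.done"
    first = run_step(marker, lambda: None)   # marker absent: the step runs
    second = run_step(marker, lambda: None)  # marker present: the step is skipped
    print(first, second)
```

Deleting a marker file, as the preparation slides ask, is therefore enough to make the corresponding step run again.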


● Launch the pipeline (shell command)

snakemake -p


● What can be observed?

– Not all fastqc jobs are necessarily executed before the unzip jobs

[Figure: rule 'unzip' runs for each file (ERR1307260, ERR1307263, ...); for ERR1307264, rule 'fastqc' runs before rule 'unzip'; rule 'all' comes last.]


● Preparation of the next step

– Delete 2 zip files (ERR1307263 and ERR1307264) from the directory ./fastQC

– Delete 2 done files (ERR1307262 and ERR1307264) from the directory ./fastQC


● 3rd Step

– Basic snakefile creation (second part)
– Missing files and snakemake


● Snakefile update

rule all:
    input:
        "QC/miniQC.done"

##############
...

rule miniQC:
    input:
        I1 = expand("fastQC/{filename}_unzip_fastqc.done", filename=inFiles)
    output:
        touch("QC/miniQC.done")
    shell:
        "./miniQC.sh"

The input of rule all is modified.

The rule miniQC is added.


● Launch the pipeline (shell command)

snakemake -p


● What can be observed?

[Figure: execution graph for files ERR1307260 to ERR1307264, showing rules 'fastqc', 'unzip', 'miniQC' and 'all'.]


● Preparation of the next step

– Delete 2 zip files (ERR1307263 and ERR1307264) from the directory ./fastQC

– Delete 2 done files (ERR1307263 and ERR1307264) from the directory ./fastQC

– Delete the miniQC.done file from the directory ./QC


● 4th Step

– Snakefile with a config file


● Snakefile with a config file (1/3)

import glob, os

configfile: "config_tuto.json"

inFiles = set() ## set of all input files (only file names)

## extract file names from paths
inPaths = glob.glob(config["FASTQ_PATH"] + "*.fastq.gz")
for p in inPaths:
    inFiles.add(os.path.basename(p).replace(".fastq.gz", ""))

##############

configfile: loads the config file.

config["FASTQ_PATH"] reads the field FASTQ_PATH from the JSON file.
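The slides do not show the content of config_tuto.json. The sketch below assumes a minimal version with placeholder paths (the two key names come from the Snakefile, the values are hypothetical) and shows why each path should end with a slash: the Snakefile joins strings with +, not with a path-joining function.

```python
import json

# Hypothetical content for config_tuto.json; the real paths are site-specific.
config_text = """
{
    "FASTQ_PATH": "/path/to/fastq/",
    "ANALYSIS_PATH": "/path/to/analysis/"
}
"""

config = json.loads(config_text)

# Plain string concatenation: the trailing slash in the config value is required.
pattern = config["FASTQ_PATH"] + "*.fastq.gz"
print(pattern)
```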


● Snakefile with a config file (2/3)

rule all:
    input:
        config["ANALYSIS_PATH"] + "QC/miniQC.done"

##############

rule fastqc:
    input:
        config["FASTQ_PATH"] + "{afile}.fastq.gz"
    output:
        config["ANALYSIS_PATH"] + "fastQC/{afile}_fastqc.zip"
    params:
        analysis_path = config["ANALYSIS_PATH"]
    shell:
        "fastqc -o {params.analysis_path}fastQC {input}"

params: is used here to inject a config field into the shell command.
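The placeholder {params.analysis_path} follows the same syntax as Python's str.format with attribute access. The sketch below imitates the substitution with a SimpleNamespace standing in for the params object; the paths are hypothetical and this is an analogy, not Snakemake's code.

```python
from types import SimpleNamespace

# Stand-in for the params object of the rule (hypothetical path).
params = SimpleNamespace(analysis_path="/path/to/analysis/")

template = "fastqc -o {params.analysis_path}fastQC {input}"
command = template.format(
    params=params,
    input="/path/to/fastq/ERR1307260.fastq.gz",
)
print(command)
```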


● Snakefile with a config file (3/3)

rule unzip_fastqc:
    input:
        zips = config["ANALYSIS_PATH"] + "fastQC/{zipfile}_fastqc.zip"
    output:
        touch(config["ANALYSIS_PATH"] + "fastQC/{zipfile}_unzip_fastqc.done")
    params:
        analysis_path = config["ANALYSIS_PATH"]
    shell:
        "unzip -o {input.zips} -d {params.analysis_path}fastQC"

rule miniQC:
    input:
        expand(config["ANALYSIS_PATH"] + "fastQC/{filename}_unzip_fastqc.done", filename=inFiles)
    output:
        touch(config["ANALYSIS_PATH"] + "QC/miniQC.done")
    params:
        fastq_type = config["FASTQ_TYPE"]
    shell:
        "./miniQC.sh {params.fastq_type}"


● Launch the pipeline (shell command)

snakemake -p


● What can be observed?

[Figure: execution graph for files ERR1307260 to ERR1307264, showing rules 'fastqc', 'unzip', 'miniQC' and 'all'.]


● Preparation of the next step

– Delete all zip files from the directory ./fastQC
– Delete all done files from the directory ./fastQC
– Delete the miniQC.done file from the directory ./QC
– Shell commands:

source deactivate
exit


● 5th Step

– Snakemake and computing grid
– Snakemake and wrappers


Use the option --cluster to run snakemake on SGE

● Environment (shell commands)

source activate myTutoEnv
mkdir logs
snakemake -p --cluster "qsub -o ./logs/ -e ./logs/" --jobs 50

-o and -e define the directory where SGE writes the standard output and error files

--jobs defines the maximum number of jobs submitted to the grid at the same time


● What can be observed? (1/2)

If you were lucky, you would observe this!

[Figure: execution graph on the grid: rules 'fastqc' and 'unzip' for each file (ERR1307260, ERR1307264, ...), then rule 'miniQC' and rule 'all'.]


● What can be observed? (2/2)

Except for those who are doing this tutorial directly in their home directory (…), your jobs should fail because your current working directory has not been passed to the computing grid.

Moreover, the variables of your virtual environment have not been transmitted to the grid either.


We will use a wrapper in order to transmit the environment variables

● Environment (shell commands)

ctrl c
touch myGridWrapper.sh


● myGridWrapper.sh (shell commands)

#$ -S /bin/bash
#$ -cwd
#$ -V

{exec_job}

-cwd : transmit the current working directory

-V : transmit the virtual environment variables to the node

{exec_job} : execute the script
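The wrapper is a template: the {exec_job} placeholder is replaced by the command of each job before the script is submitted. The substitution can be imitated with str.format, as in the sketch below; render_jobscript and the job command are hypothetical, and Snakemake's own templating may differ in detail.

```python
JOBSCRIPT = """#$ -S /bin/bash
#$ -cwd
#$ -V

{exec_job}
"""

def render_jobscript(exec_job):
    """Substitute the job command for the {exec_job} placeholder."""
    return JOBSCRIPT.format(exec_job=exec_job)

script = render_jobscript("echo hello")  # hypothetical job command
print(script)
```

The lines starting with #$ are comments for bash but are read as submission options by SGE, which is why they survive the substitution unchanged.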


● Shell commands

snakemake -p --cluster "qsub -o ./logs/ -e ./logs/" --jobs 50 --jobscript myGridWrapper.sh


● What can be observed?

Missed again!

You should get an error like 'OSError: Missing files after 5 seconds:'


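● Shell commands (try again)

ctrl c
snakemake -p --latency-wait 60 --cluster "qsub -o ./logs/ -e ./logs/" --jobs 50 --jobscript myGridWrapper.sh

Snakemake needs some latency to work properly with a grid engine: --latency-wait 60 makes it wait up to 60 seconds for output files to appear before reporting them as missing.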
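On a shared file system, a file written on a compute node can take a moment to become visible on the submitting machine. The idea behind a latency wait can be sketched as a polling loop with a deadline; wait_for_file is a hypothetical helper written for illustration, not Snakemake's code.

```python
import os
import tempfile
import time

def wait_for_file(path, latency_wait=5.0, poll=0.1):
    """Return True once path exists, False if it is still missing after latency_wait seconds."""
    deadline = time.monotonic() + latency_wait
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out.done")
    missing = wait_for_file(target, latency_wait=0.3)  # file never appears
    open(target, "w").close()                          # now create it
    found = wait_for_file(target, latency_wait=0.3)    # file is visible at once
    print(missing, found)
```

A longer wait costs nothing when the file is already there, since the loop returns as soon as the file exists.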


● What can be observed?

Finally!!!!

[Figure: execution graph on the grid: rules 'fastqc' and 'unzip' for each file (ERR1307260, ERR1307264, ...), then rule 'miniQC' and rule 'all'.]


● 6th Step

– Snakemake and virtual environments


● Let's play with virtual environments!

Often, we need to use incompatible versions of tools in the same pipeline.

One solution is to use virtual environments.

Example: use snakemake with Python 3 and tophat2 with Python 2


Preparation of a virtual environment that is compatible with tophat and htseq-count

● Environment (shell commands)

conda create --name myTophatTutoEnv python=2.7.8 tophat htseq pysam=0.8.3 -c bioconda


● Snakefile with a config file (1/2)

Update rule all:

rule all:
    input:
        I1 = config["ANALYSIS_PATH"] + "QC/miniQC.done",
        I2 = expand(config["ANALYSIS_PATH"] + "counts/{filename}.counts", filename=inFiles)

Add the following rules:

rule tophat2_SE:
    input:
        config["FASTQ_PATH"] + "{bamname}.fastq.gz"
    output:
        config["ANALYSIS_PATH"] + "BAM/{bamname}/accepted_hits.bam"
    params:
        cpu = config["TOPHAT_CPU"],
        gene = config["GENE_REFERENCE"],
        bowtie = config["BOWTIE_INDEX"],
        analysis_path = config["ANALYSIS_PATH"]
    shell:
        """
        source activate myTophatTutoEnv
        tophat2 -p {params.cpu} -G {params.gene} -o {params.analysis_path}BAM/{wildcards.bamname} {params.bowtie} {input}
        """


● Snakefile with a config file (2/2)

rule htseqcount:
    input:
        config["ANALYSIS_PATH"] + "BAM/{htseqname}/accepted_hits.bam"
    output:
        config["ANALYSIS_PATH"] + "counts/{htseqname}.counts"
    params:
        gene = config["GENE_REFERENCE"],
        analysis_path = config["ANALYSIS_PATH"]
    shell:
        """
        source activate myTophatTutoEnv
        htseq-count -f bam -r pos -i gene_name -s no {input} {params.gene} > {params.analysis_path}counts/{wildcards.htseqname}.counts
        """


● Preparation of the next step

– Delete all zip files from the directory ./fastQC
– Delete all done files from the directory ./fastQC
– Delete the miniQC.done file from the directory ./QC


● Shell commands

snakemake -p --latency-wait 60 --cluster "qsub -o ./logs/ -e ./logs/" --jobs 50 --jobscript myGridWrapper.sh


● What can be observed?

[Figure: execution graph on the grid: for each file (ERR1307260, ERR1307264, ...), rules 'fastqc' and 'unzip' lead to rule 'miniQC', and rules 'tophat' and 'htseq' run per file; rule 'all' is the final target.]


● A young technology

● Problem with the NFS file system (solved with BeeGFS)


Thank you for your perseverance!