Usage

ALoFT can be invoked as follows:

$ ./aloft --vcf=/path/to/file --data=path/to/data/dir --output=path/to/dir [--option=arg]...

Alternatively, to run ALoFT on a VCF file that has already been annotated by VAT, use the --vat option (here vat_file.vcf is the VAT-annotated VCF file):

$ ./aloft --vat=vat_file.vcf

When a VCF file is given as input, only its first five columns (CHROM, POS, ID, REF and ALT) are required.
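
Since only the first five columns are needed, a large VCF file could be reduced before it is handed to ALoFT. The following Python code is a minimal sketch of that idea, not part of ALoFT: first_five_columns is a hypothetical helper, the example record is made up, and whether ALoFT accepts a header line trimmed in this way has not been verified here.

```python
import io

def first_five_columns(lines):
    """Yield VCF lines trimmed to CHROM, POS, ID, REF and ALT."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##") or not line:
            # Meta-information lines (and blank lines) pass through unchanged.
            yield line
        else:
            # The #CHROM header line and data lines are tab-separated.
            yield "\t".join(line.split("\t")[:5])

# Made-up example input, for illustration only.
example = io.StringIO(
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t12345\t.\tA\tG\t100\tPASS\t.\n"
)
trimmed = list(first_five_columns(example))
print("\n".join(trimmed))
```

For a real file, the same generator can be fed an open file handle and its output written line by line to a new file.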

Input

  1. A VCF file containing unannotated variants, passed in with the --vcf option. Alternatively, a VAT-annotated VCF file can be used as input with the --vat option. See the Options section below.
  2. Reference files (defaults are already in place after installation). For a complete list, see the Options section below, which describes the options and their default values.

Output

An output directory may be specified with the --output option. Otherwise, the output files are written to the ./aloft_output directory.

The following files will be created in this output directory:

  1. A VAT-annotated VCF file, named input_file_name.vat.
  2. Three files pertaining to LoF variants:
    1. An output VCF file named input_file_name.aloft.vcf, which contains, in Variant Call Format, a culled subset of the information calculated by VAT together with the features calculated by ALoFT for putative LoF variants (premature stop variants, frameshift-causing indels and variants in canonical splice sites).
    2. An output file named input_file_name.aloft.lof, a tab-delimited file with extensive annotations for premature stop variants and frameshift-causing indels.
    3. An output file named input_file_name.aloft.splice, a tab-delimited file with extensive annotations for variants affecting canonical splice sites.
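
The naming scheme above can be sketched in Python, for example to check that a run produced every expected file. This is only an illustration: expected_outputs is a hypothetical helper, and it assumes that input_file_name means the full base name of the file given to --vcf, extensions included, as the listing in the Example Workflow section suggests.

```python
import os

def expected_outputs(input_path, output_dir="aloft_output"):
    """Return the output paths expected for a given --vcf input file."""
    # Assumption: the suffix is appended to the complete base name.
    name = os.path.basename(input_path)
    suffixes = (".vat", ".aloft.vcf", ".aloft.lof", ".aloft.splice")
    return [os.path.join(output_dir, name + suffix) for suffix in suffixes]

# "sample.vcf.gz" is a placeholder input name.
for path in expected_outputs("../sample.vcf.gz"):
    print(path)
```

After a run, os.path.exists can be applied to each returned path to report any file that is missing.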

Options

ALoFT recognizes the following options for altering input and reference files:

--version
Outputs the ALoFT version number.

--vcf=input.vcf
Specifies the path to the VCF input file. If omitted, ALoFT will try to skip VAT and run directly on the file given to the --vat option. Either --vcf or --vat must be specified for ALoFT to run.

--vat=input.vat
Specifies the path to a VAT output file for ALoFT to run on. If omitted, the --vcf option must be specified.

--cache=cache/ (Default)
Specifies path to directory containing cache of GERP score information and protein-protein interaction information. Directory will be created if it doesn't already exist.

--nmd_threshold=50 (Default)
Distance from a premature stop to the last exon-exon junction, used to predict nonsense-mediated decay (NMD). The default distance is 50 bp.

--output=aloft_output/ (Default)
Specifies the path to the directory where ALoFT writes its tab-delimited output files and VCF file.

--data=data/ (Default)
Specifies path to data directory containing a data.txt file and other data dependencies. data.txt contains paths to all data files that ALoFT requires. See data/data.txt bundled with ALoFT for more information on these files.

--verbose (Optional)
Runs ALoFT in verbose mode.
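
The --nmd_threshold option reflects the commonly cited rule that a premature stop lying more than about 50 nucleotides upstream of the last exon-exon junction is expected to trigger NMD. The Python code below is a minimal sketch of that rule under stated assumptions, not ALoFT's actual implementation: predicted_nmd is a hypothetical function, positions are taken to be transcript (mRNA) coordinates, and the strict "greater than" comparison is a choice made for this example.

```python
def predicted_nmd(stop_pos, last_junction_pos, threshold=50):
    """Apply the distance rule to one premature stop.

    stop_pos and last_junction_pos are positions along the transcript,
    so a larger value means further downstream (assumption).
    """
    # Distance from the premature stop to the last exon-exon junction;
    # negative if the stop lies in the last exon.
    distance = last_junction_pos - stop_pos
    return distance > threshold

print(predicted_nmd(100, 200))   # stop 100 nt upstream of the junction
print(predicted_nmd(180, 200))   # stop 20 nt upstream of the junction
print(predicted_nmd(250, 200))   # stop downstream, in the last exon
```

Raising or lowering the threshold argument mirrors what --nmd_threshold does on the command line.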

Example Workflow

Obtain variant calls in VCF format from the 1000 Genomes FTP site.

Download to home directory and uncompress.

$ wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/integrated_call_sets/ALL.wgs.integrated_phase1_v3.20101123.snps_indels_sv.sites.vcf.gz

Run the VCF file through ALoFT, which in this example is installed in the aloft directory under the home directory. Output is written to the default directory, aloft_output/.

$ cd aloft
$ ./aloft --vcf=../ALL.wgs.integrated_phase1_v3.20101123.snps_indels_sv.sites.vcf.gz
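
For scripted runs, the same invocation could be assembled in Python. The following is a minimal sketch: build_aloft_command is a hypothetical helper, it uses only options described in the Options section, and it enforces only the requirement that --vcf or --vat be given.

```python
def build_aloft_command(vcf=None, vat=None, data=None, output=None,
                        nmd_threshold=None, verbose=False, exe="./aloft"):
    """Return the ALoFT command line as a list of arguments."""
    if vcf is None and vat is None:
        raise ValueError("either vcf or vat must be given")
    cmd = [exe]
    if vcf is not None:
        cmd.append("--vcf=" + vcf)
    if vat is not None:
        cmd.append("--vat=" + vat)
    if data is not None:
        cmd.append("--data=" + data)
    if output is not None:
        cmd.append("--output=" + output)
    if nmd_threshold is not None:
        cmd.append("--nmd_threshold=" + str(nmd_threshold))
    if verbose:
        cmd.append("--verbose")
    return cmd

cmd = build_aloft_command(
    vcf="../ALL.wgs.integrated_phase1_v3.20101123.snps_indels_sv.sites.vcf.gz"
)
print(" ".join(cmd))
# To run it from inside the aloft directory (not executed here):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Returning a list rather than a single string lets the command be passed to subprocess.run without shell quoting concerns.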

To see the aloft output, enter the output directory.

$ cd aloft_output
$ ls
ALL.wgs.integrated_phase1_v3.20101123.snps_indels_sv.sites.vcf.gz.aloft.lof
ALL.wgs.integrated_phase1_v3.20101123.snps_indels_sv.sites.vcf.gz.aloft.splice
ALL.wgs.integrated_phase1_v3.20101123.snps_indels_sv.sites.vcf.gz.aloft.vcf

Check the annotated features for more information.
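
Because the .aloft.lof and .aloft.splice files are tab-delimited, they can be inspected with a few lines of Python. This is a minimal sketch: read_tab_delimited is a hypothetical helper, the sample content is a placeholder rather than real ALoFT output, and nothing is assumed about the columns ALoFT writes.

```python
import csv
import io

def read_tab_delimited(handle):
    """Return the non-empty rows of a tab-delimited file as lists of strings."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]

# Placeholder content; real ALoFT output has its own columns.
sample = io.StringIO("col_a\tcol_b\tcol_c\nx\ty\tz\n")
rows = read_tab_delimited(sample)
print(len(rows), "rows;", len(rows[0]), "fields in the first row")
```

For a real output file, pass a handle opened with open(path, newline="") instead of the in-memory sample.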