Usage

Before you can use Kyos to call variants, you will need to prepare the input datasets. The workflow begins with one or more BAM files. If you don’t already have BAM files, you could use the CFSAN SNP Pipeline to create the BAM files.

When extracting features from BAM files, you will need to supply the known-truth if you intend to use the tabulated features for training and testing the neural network. The -t command line option to the tabulate command adds an extra Truth column to the output tsv file.

Kyos is currently dependent upon SNP Mutator to generate the known-truth datasets for supervised learning. A future version will use VCF files instead.

To extract tabular data from a BAM file:

kyos tabulate -t TRUTH_FILE input.bam output.tsv ref.fasta

To merge multiple tabulated files:

kyos merge file1.tsv file2.tsv file3.tsv ... > train.tsv

To train a neural network model:

kyos train train.tsv validate.tsv model.h5

To test a neural network model:

kyos test model.h5 test.tsv