This sample demonstrates how to run a parametric sweep application that matches input nucleotide sequences against the human genome. The sample also demonstrates how to submit a job using the HPC Job Scheduler’s REST API and how to upload files to, and download files from, Windows Azure blob and table storage.
The National Center for Biotechnology Information’s Basic Local Alignment Search Tool (NCBI BLAST) HPC Sample demonstrates how to run a nucleotide match algorithm on the human genome using an HPC parametric sweep application.
The parametric sweep application takes a set of input files that contain nucleotide sequences and compares them against the human genome database. The application creates output files that contain the sequence similarities and uploads these files to a BLAST output visualizer (BOV) website.
To run the nucleotide match, the sample uses the blastn utility, which is part of the BLAST+ application.
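The idea of a sweep over input files can be sketched as follows: each sweep index selects one input file and produces one blastn command line. This is only an illustrative sketch, not the sample's actual job definition. The directory names, the file naming pattern (input1.fasta, output1.txt), and the database name human_genomic are assumptions made for the example; -query, -db, and -out are standard blastn options.

```python
def blastn_command(index, input_dir="input", output_dir="output", db="human_genomic"):
    """Build the blastn command line for one sweep index.

    The file naming pattern and the default database name are
    hypothetical; adjust them to match your own deployment.
    """
    query = f"{input_dir}/input{index}.fasta"   # assumed input file name
    out = f"{output_dir}/output{index}.txt"     # assumed output file name
    return ["blastn", "-query", query, "-db", db, "-out", out]


def sweep_commands(start, end, **kwargs):
    """Return one blastn command per sweep index in [start, end]."""
    return [blastn_command(i, **kwargs) for i in range(start, end + 1)]


if __name__ == "__main__":
    # Print the commands a four-step sweep would run.
    for cmd in sweep_commands(1, 4):
        print(" ".join(cmd))
```

Each command is independent of the others, which is what allows the scheduler to distribute the sweep steps across nodes.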
The architecture of the sample and the steps of its execution are described in Figure 1:
Figure 1. Architecture of the BLAST sample
To use this sample application, you need to download the compressed human genome database from the NCBI FTP server, extract the database, and copy it to Windows Azure blob storage, as described later in this document.
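As a rough illustration of the extraction step only, the following sketch unpacks a downloaded archive into a local folder. It assumes the database was downloaded as a gzip-compressed tar archive; the function name extract_database and the paths passed to it are hypothetical. Downloading the archive and copying the extracted files to blob storage are not shown here.

```python
import tarfile
from pathlib import Path


def extract_database(archive_path, dest_dir):
    """Extract a .tar.gz database archive and return the member names.

    Assumes the archive is a gzip-compressed tar file; both paths
    are supplied by the caller and are not defined by the sample.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        try:
            # Newer Python versions accept a safety filter for extraction.
            tar.extractall(dest, filter="data")
        except TypeError:
            # Older Python versions do not have the filter argument.
            tar.extractall(dest)
    return sorted(names)
```

The returned list of file names can then be used to decide which files to copy to storage.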
This sample demonstrates some of the new features offered by Windows HPC Server 2008 R2 Service Pack 2 (SP2). Refer to the What's New in Windows HPC Server 2008 R2 Service Pack 2 article on TechNet for the complete list of new features offered in this version.
This sample demonstrates the following features: