
Computational Biology Service Unit
BioHPC Web Computing Resources



Version 1 Rev 454
(2011/12/21 10:37:26)

HapMap Reads @ BioHPC

Please send comments to biohpc@cornell.edu.

This tool retrieves maize HapMap paired-end reads that can be aligned to your genome region of interest. We restrict the maximum input region to 1 kb.

The procedure is the following:

  1. BLAST is run with the input sequence against the HapMap reads of the selected germplasms to retrieve all aligned reads. If only one member of a pair can be aligned, both reads are still retrieved.
  2. Each retrieved read is then aligned with BLAST against the maize genome. Only pairs with at least one end uniquely aligned to the specified region are kept.
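
The pair filter in step 2 can be sketched as follows. This is only an illustration, not the code the service runs: the `Hit` record, the function names, and the reading of "uniquely aligned" as "exactly one genome hit, and that hit overlaps the region" are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One BLAST hit of a read on the genome (hypothetical record)."""
    chrom: str
    start: int
    end: int


def uniquely_in_region(hits: List[Hit], chrom: str,
                       region_start: int, region_end: int) -> bool:
    """True if the read has exactly one genome hit and it overlaps the region.

    Assumption: "uniquely aligned" means a single hit genome-wide.
    """
    if len(hits) != 1:
        return False
    hit = hits[0]
    return (hit.chrom == chrom
            and hit.start <= region_end
            and hit.end >= region_start)


def keep_pair(left_hits: List[Hit], right_hits: List[Hit], chrom: str,
              region_start: int, region_end: int) -> bool:
    """Keep a pair if at least one end is uniquely aligned to the region."""
    return (uniquely_in_region(left_hits, chrom, region_start, region_end)
            or uniquely_in_region(right_hits, chrom, region_start, region_end))
```

Under these assumptions, a pair whose mate has no hit at all is still kept as long as the other end maps uniquely inside the region.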

The output directory contains dozens of files. It is delivered to you as a tgz archive, and the graphical representation of the results may also be viewed online:
  1. An HTML file named [chr]_[start]_[end].html. Double-clicking this file gives an overview of read coverage in the region. Each red-blue pair of lines in the same row represents a pair of reads: blue is on the 5' plus strand of the genome and red is on the 3' minus strand. The word "reverse" indicates a flip between the two reads. If a row shows only a blue or only a red line, one member of the pair is not mapped to this region, which suggests a possible genome rearrangement. Bright red dots at one or both ends of a read mark the parts of the read that cannot be aligned to the genome. Mouse over the number (read ID) to see the mapping positions of the read pair and their copy numbers in the text box. For example, "#75 chr6: 1053397:1e-034:3; chr6: 1053607:4e-035:1" means that the two reads of read pair #75 are mapped to chromosome 6 at positions 1053397 and 1053607. The left end mapped to 3 loci on the genome with this quality, only one of which is shown here; the right end mapped to 1 unique locus on the genome.
  2. Multiple Excel files, one for each germplasm (e.g. 6_10023_11023_matches_B73.xls). These files contain the BLAST results showing how the reads align to the genome. Columns A to M describe the left read: read ID, chromosome, % identity, alignment length, mismatches, gap openings, read match start, read match end, chr match start, chr match end, e-value, bit score, copy number. Columns N to Z describe the right read, in the same order.
  3. Multiple *_clean.fasta files, one for each germplasm (e.g. 6_10023_11023_matches_B73_clean.fasta). These contain the filtered read data. You can use these files with assembly software, e.g. DNAstar or Sequencher, to assemble the reads into contigs.
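
As a rough sketch of how the output described above might be handled in a script, the Python below splits one 26-column result row into its left-read and right-read halves and parses a mouse-over string of the form shown in the example. The names `READ_COLUMNS`, `split_pair_row`, `EndMapping`, and `parse_tooltip` are hypothetical, and the field meanings (position, e-value, copy number) are inferred from the example text rather than from a format specification.

```python
import re
from typing import Dict, List, NamedTuple, Tuple

# Column names for one read, in the order listed for columns A to M.
READ_COLUMNS = [
    "read_id", "chromosome", "pct_identity", "alignment_length",
    "mismatches", "gap_openings", "read_start", "read_end",
    "chr_start", "chr_end", "evalue", "bit_score", "copy_number",
]


def split_pair_row(fields: List[str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a 26-field row into (left read, right read) dictionaries."""
    n = len(READ_COLUMNS)
    if len(fields) != 2 * n:
        raise ValueError(f"expected {2 * n} fields, got {len(fields)}")
    left = dict(zip(READ_COLUMNS, fields[:n]))
    right = dict(zip(READ_COLUMNS, fields[n:]))
    return left, right


class EndMapping(NamedTuple):
    chrom: str
    position: int
    evalue: float
    copies: int


# Matches e.g. "chr6: 1053397:1e-034:3" (layout assumed from the example).
TOOLTIP_RE = re.compile(r"(chr\w+):\s*(\d+):([^:;]+):(\d+)")


def parse_tooltip(text: str) -> List[EndMapping]:
    """Return one EndMapping per read end found in a mouse-over string."""
    return [
        EndMapping(chrom, int(pos), float(evalue), int(copies))
        for chrom, pos, evalue, copies in TOOLTIP_RE.findall(text)
    ]
```

How the rows are read from the .xls files is left out on purpose, since the on-disk format of those files is not described here.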

Calculations will be carried out on the BioHPC compute cluster at CBSU. You will receive e-mail notifications when the job is submitted, when it starts, and when it is finished. Output will be available via links embedded in the notification e-mails. For more information about this program and BioHPC interface in general, please visit our Frequently Asked Questions page.

Please acknowledge us in all publications and presentations of work that used our resources, using the following text.


You must be a registered user of BioHPC to run this application. Please log in.


Specify maize lines to pull the reads from








Specify chromosome and region from agp_v2 genome (maximum allowed region: 1 kb)

Chromosome:    Region start:    Region end:
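
A minimal sketch of checking a region against the 1 kb limit before submitting, and of building the [chr]_[start]_[end] prefix used in the output file names. The function name and the exact length rule are assumptions: the page does not say whether length is counted as end - start or end - start + 1, so the sketch uses end - start, which is consistent with the example file name 6_10023_11023.

```python
MAX_REGION_BP = 1000  # 1 kb limit stated on this page


def validate_region(chromosome: str, start: int, end: int) -> str:
    """Check the region and return the output prefix, e.g. '6_10023_11023'.

    Assumption: region length is measured as end - start.
    """
    if start < 1:
        raise ValueError("region start must be a positive coordinate")
    if end <= start:
        raise ValueError("region end must be greater than region start")
    if end - start > MAX_REGION_BP:
        raise ValueError(
            f"region spans {end - start} bp; maximum allowed is {MAX_REGION_BP} bp"
        )
    return f"{chromosome}_{start}_{end}"
```

For instance, `validate_region("6", 10023, 11023)` returns the prefix seen in the example file names, while a region one base longer raises a `ValueError`.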

Cluster: This application can't run at this time - no suitable clusters
or you are not authorized to use the service.
The service is available only to Cornell students, faculty, and staff.
 

