
This one page sheet provides some of the main Bioinformatics key terms with definitions
Typology: Cheat Sheet
Bait files: files describing the sequence capture regions, typically downloaded from the sequence capture kit manufacturer.
Binary Alignment/Map (BAM) file: the binary version of the SAM format, which makes files more compact.
Browser Extensible Data (BED) file: a standardized, tab-delimited file format containing, at a minimum, the chromosomal coordinates of regions of the genome.
CDCV (complex disease common variant hypothesis): the assumption that common variants make a substantial contribution to complex disease; the rationale behind GWAS (genome-wide association study) designs.
CDRV (complex disease rare variant hypothesis): the assumption that rare variants make a greater contribution to complex disease than common variants do; testing it requires deep sequencing.
Cluster: a group of linked computers that work together and, in many respects, form a single computer.
Depth of coverage: the number of sequencing reads aligned to a specific genomic location. Higher depth of coverage generally increases confidence in variant calls.
Edge prioritization: instead of prioritizing genes in isolation, generating hypotheses about potential interactions among the top candidates and ‘seed’ genes.
Exome sequencing: sequencing every exon of every gene in the genome.
FASTQ: a file format for storing short-read massively parallel sequencing data together with per-base quality scores.
Galaxy: a web-based platform that makes command-line tools available to biologists; it is flexible, shareable, and can be run from (almost) any computer.
Massively parallel sequencing: a next-generation DNA sequencing technology that allows millions or billions of base pairs to be sequenced simultaneously.
Multiplexing: the application of nucleotide barcodes to multiple DNA samples, followed by pooling of those samples. Pooling increases throughput and takes full advantage of sequencer output.
Sequence Alignment/Map (SAM) file: a standardized and widely accepted format for storing large nucleotide sequence alignments, designed to be flexible (it accommodates alignment information from various sequencers and alignment programs), compact in size, and easily convertible.
Sequence capture: also known as targeted sequence capture or targeted genomic enrichment; the massively parallel replacement for PCR. The process of simultaneously isolating thousands or millions of regions of the genome prior to massively parallel sequencing.
Target interval file: a file containing any areas of the genome that you think are biologically relevant (it can be a bait file, a subset of a bait file, or user-generated).
Variant Call Format (VCF) file: a standardized and widely accepted file type for storing genotype data generated by variant calling algorithms.
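
Worked example: the BED, target interval, and depth of coverage entries fit together, and a small Python sketch may make that concrete. It is only an illustration, not a production tool: the function names, the example BED line, and the read intervals are all made up, and reads are represented as plain (chromosome, start, end) tuples rather than parsed from a real SAM/BAM file. It assumes the standard BED convention of 0-based starts and exclusive ends.

```python
def parse_bed_line(line):
    """Return (chrom, start, end) from the three required BED columns.

    BED coordinates are 0-based and the end position is exclusive.
    Any extra columns (name, score, ...) are ignored here.
    """
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return chrom, int(start), int(end)


def depth_of_coverage(reads, target):
    """Count how many reads overlap each position of one target interval.

    reads  : iterable of hypothetical (chrom, start, end) aligned reads,
             using the same 0-based, end-exclusive convention as BED
    target : (chrom, start, end) interval, e.g. from parse_bed_line
    """
    t_chrom, t_start, t_end = target
    depth = {pos: 0 for pos in range(t_start, t_end)}
    for chrom, start, end in reads:
        if chrom != t_chrom:
            continue  # read aligned to a different chromosome
        # Only count the part of the read that falls inside the target.
        for pos in range(max(start, t_start), min(end, t_end)):
            depth[pos] += 1
    return depth


# Made-up target interval and aligned reads, for illustration only.
target = parse_bed_line("chr1\t100\t105\ttarget_1")
reads = [("chr1", 98, 103), ("chr1", 101, 106), ("chr2", 100, 105)]

depth = depth_of_coverage(reads, target)
mean_depth = sum(depth.values()) / len(depth)

print(depth)       # {100: 1, 101: 2, 102: 2, 103: 1, 104: 1}
print(mean_depth)  # 1.4
```

In practice, depth is computed from BAM files with dedicated tools; the per-position loop here is chosen for readability and would be too slow for real sequencing data.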