PSC Bioinformatics Summer Institute Workshop (NIH MARC)

Minority Access to Research Careers



This two-week intensive training workshop provides a robust background in bioinformatics suitable for teaching and research. Every day, participants complete hands-on exercises that put the concepts from the lectures into practice, using several of the Pittsburgh Supercomputing Center’s massively parallel computers and software tools such as Galaxy, a web-based biomedical research platform.

A Typical Summer Institute Schedule


Week 1

  • Introduction to the Computing Environments at the Pittsburgh Supercomputing Center
  • Bioinformatics Databases
  • Models and Significance in Searching Bioinformatics Databases
  • Sequence Alignment Algorithms (NW, SW, FASTA, BLAST, BW+FM, “Seeded” SW)
  • Multiple Sequence Alignment & Mapping Realignment
  • Computational Tools: Analyzing Data Using Relational Databases & SQL
  • Next Generation Sequencing (NGS) Technologies
  • Pattern Identification
  • Preparing NGS Datasets for Assembly/Mapping
  • Phylogenetics and Reconciliation with Notung
  • De Novo Genome Assembly
  • The R System for Statistical Analysis


Week 2


Dr. Bienvenido Velez: SQL as a Tool for BioInformatics Data Analysis

  • Functional Annotation for Assembled Genomes
  • Predicting Genes, Identifying Functions
  • Mapping Genome Assemblies
  • RNAseq: De Novo Assembly of RNA Data
  • Identifying Single-nucleotide Polymorphisms (a.k.a. SNPs) and Other Variants
  • RNAseq: De Novo Functional Annotation and Other Post-Assembly Analyses
  • Gene Annotation
  • Ribosomal Profiling: Genome-wide Measurements of mRNA Translation Rates


Some of the Computing Tools Used at the Institute

  • Unix Command Line Programs
  • GeneDoc
  • Notung
  • Galaxy
  • SQL Database Language using SQLite
  • Python Language Scripting and the BioPython Library
  • The R Statistical Analysis System
  • MEME
  • Muscle
  • Clustal
  • TCoffee
  • NCBI Entrez
  • Velvet (NGS)
  • Trinity (NGS)
  • BowTie and BowTie2
  • Trinotate
  • SAM Tools
  • edgeR
  • Phylip
  • Databases including PubMed, GenBank, UniProt, PFAM and others