Paperback publication date: December 2017
ISBN 9780870719264 (paperback)
7.5 x 9.25, 525 pages.

A Primer for Computational Biology


Shawn T. O'Neil
Published by Oregon State University Libraries and Press in Partnership with Open Oregon State
Summary

This book is available as an Open Access Textbook through Open Oregon State in collaboration with the Center for Genome Research and Biocomputing.

A Primer for Computational Biology aims to provide life scientists and
students with the skills necessary for research in a data-rich world. The text
covers accessing and using remote servers via the command line and writing
programs and pipelines for data analysis, and it provides useful vocabulary
for interdisciplinary work. The book is broken into three parts:

  1. Introduction to Unix/Linux:
    The command line is the “natural environment” of scientific computing,
    and this part covers a wide range of topics, including logging in,
    working with files and directories, installing programs and writing
    scripts, and the powerful “pipe” operator for file and data
    manipulation.
  2. Programming in Python:
    Python is both a premier language for learning and a common choice in scientific software
    development. This part covers the basic concepts in programming (data
    types, if-statements and loops, functions) via examples of DNA-sequence
    analysis. This part also covers more complex subjects in software
    development such as objects and classes, modules, and APIs.
  3. Programming in R:
    The R language specializes in statistical data analysis, and is also
    quite useful for visualizing large datasets. This third part covers the
    basics of R as a programming language (data types, if-statements,
    functions, loops and when to use them) as well as techniques for
    large-scale, multi-test analyses. Other topics include S3 classes and
    data visualization with ggplot2.


About the author

Shawn T. O’Neil earned a BS in computer science from Northern Michigan University, and later an MS and PhD in the same subject from the University of Notre Dame. His past and current research focuses on bioinformatics. O’Neil has developed and taught several courses in computational biology at both Notre Dame and Oregon State University.



Table of Contents

 

        Preface
        Acknowledgements
        Dedication

    Part I: Introduction to Unix/Linux
        Context
        Logging In
        The Command Line and Filesystem
        Working with Files and Directories
        Permissions and Executables
        Installing (Bioinformatics) Software
        Command Line BLAST
        The Standard Streams
        Sorting, First and Last Lines
        Rows and Columns
        Patterns (Regular Expressions)
        Miscellanea

    Part II: Programming in Python
        Hello, World
        Elementary Data Types
        Collections and Looping: Lists and for
        File Input and Output
        Conditional Control Flow
        Python Functions
        Command Line Interfacing
        Dictionaries
        Bioinformatics Knick-knacks and Regular Expressions
        Variables and Scope
        Objects and Classes
        Application Programming Interfaces, Modules, Packages, Syntactic Sugar
        Algorithms and Data Structures

    Part III: Programming in R
        An Introduction
        Variables and Data
        Vectors
        R Functions
        Lists and Attributes
        Data Frames
        Character and Categorical Data
        Split, Apply, Combine
        Reshaping and Joining Data Frames
        Procedural Programming
        Objects and Classes in R
        Plotting Data and ggplot2

        Files
        Index
        About the Author
