Genome RE Sites

Rather than leave them sitting on my hard disk getting dusty, I thought I should start publishing the bioinformatics scripts that I’ve written over the past few years of my PhD.

The first to go up is a Perl script called “Genome RE Sites” - it searches a genome of your choice for a restriction endonuclease recognition site and outputs the co-ordinates of all cut sites.
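To give a feel for the core idea, here is a minimal sketch in Python rather than the Perl of the actual script. The function name `find_sites` and the example sequence are made up for illustration. It also assumes that the reported co-ordinate is the 1-based start of the recognition site, which may not match what the real script outputs.

```python
def find_sites(sequence, site="AAGCTT"):
    """Return the 1-based start position of every occurrence of `site`.

    Matching is case-insensitive and overlapping occurrences are reported.
    Assumption: a "co-ordinate" here means the first base of the
    recognition site, not the exact position of the cut within it.
    """
    if not site:
        raise ValueError("recognition site must not be empty")
    sequence = sequence.upper()
    site = site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        # Resume one base further along so overlapping sites are found too
        start = sequence.find(site, start + 1)
    return positions


# Hypothetical toy sequence containing three HindIII sites
example = "ccAAGCTTggaagcttAAGCTT"
print(find_sites(example))  # [3, 11, 17]
```

For a whole genome the same search would simply be repeated for each chromosome sequence in turn.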

[ Update ] : You can find an online version of this tool here.

I use a technique called Chromosome Conformation Capture, which uses restriction enzymes, so I frequently need to generate a list of cut sites to help me analyse data.

This script needs to be run on the command line, either on a Linux system or using something like ActivePerl on Windows. Usage:

perl genome_RE_sites.pl [output file] [search string]

where [output file] is the filename that will hold your results and [search string] is the restriction enzyme recognition site. The search string can be omitted, in which case it defaults to HindIII (AAGCTT). There are a few other configuration options that you’ll need to edit before using the script, such as the location of the downloaded genome sequences and the chromosome names. Please leave a comment if you have any problems with these.
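The argument handling described above, with one required output file and one optional search string, can be sketched as follows. This is a Python illustration, not the Perl script itself; `parse_args`, `DEFAULT_SITE` and the example filename are hypothetical names chosen for the sketch.

```python
import argparse

# HindIII recognition site, used when no search string is supplied
DEFAULT_SITE = "AAGCTT"


def parse_args(argv):
    """Parse `[output file] [search string]`, the second being optional."""
    parser = argparse.ArgumentParser(prog="genome_RE_sites")
    parser.add_argument("output_file",
                        help="file that will hold the results")
    parser.add_argument("search_string", nargs="?", default=DEFAULT_SITE,
                        help="recognition site (default: HindIII, AAGCTT)")
    args = parser.parse_args(argv)
    # Normalise case so 'aagctt' and 'AAGCTT' are treated the same
    args.search_string = args.search_string.upper()
    return args


# Only the output file given, so the search string falls back to HindIII
args = parse_args(["hindiii_sites.txt"])
print(args.output_file, args.search_string)
```

Passing a second argument, for example `parse_args(["ecori_sites.txt", "gaattc"])`, would search for EcoRI sites instead.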

Download the code here, or copy and paste from below: