MUSC non-Net-based Transcription Factor Searches

If you wish to scan your DNA sequence for transcription factor binding
sites you have two choices in GCG. 

I. The first way: findpatterns

1) Start GCG and "fetch" tfsites.dat.
   tfsites.dat is a compilation of known transcription factor
   binding sites along with their literature references.
   It is a bit outdated: the copy supplied by GCG and fetched
   this way is dated 1996, and no newer release was available
   from NCBI. A 2003 version in GCG format (tfsites.gcg) may be
   retrieved from this site.

2) Select the sequence you want to scan and run findpatterns
   with the following option: -dat=tfsites.dat

   For example:
   
   findpatterns -dat=tfsites.dat filename.seq
                ^^^^^^^^^^^^^^^^

3) The output file (e.g. filename.find) lists every location
   where a recognition site for a transcription factor was
   found. Scan through the file to see whether anything
   looks useful to you.

4) GCG has no provision for looking up the references to these
   known sites. This is a flaw in GCG. To get around it, I wrote
   a short Unix shell script, with a great deal of help from
   Karen Jesmer at CCIT, that reads your findpatterns output file
   and retrieves the references that match your hits.

   Send an email to Starr Hazard to request the Unix shell program "starr".

5) Using "starr"
    a) Run findpatterns with -dat=tfsites.dat
    b) Type sh, then starr, then the name of the findpatterns output
       file, then the name of the reference file to be created, e.g.:

       sh starr filename.find filename.findref

    c) Examine filename.findref for the references to the sites that
       your findpatterns search located.
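
Steps a) and b) can be chained in a small wrapper. This is only a sketch
under stated assumptions: the function name tfscan is hypothetical (it is
not part of GCG or of starr), it assumes an sh-compatible shell rather than
csh, it assumes the starr script is in the current directory, and it
assumes findpatterns writes its output to filename.find as described in
step 3. findpatterns may still prompt you for parameters when it runs.

```shell
#!/bin/sh
# Hypothetical helper (not part of GCG): scan one sequence file, then
# look up the references for the hits.
# Assumes findpatterns is on the PATH and "starr" is in the current directory.
tfscan() {
    seq="$1"
    base="${seq%.seq}"      # filename.seq -> filename

    # Step a: scan the sequence against tfsites.dat.
    findpatterns -dat=tfsites.dat "$seq" || return 1

    # Step b: collect the references matching the hits.
    sh starr "$base.find" "$base.findref"
}
```

After defining the function, "tfscan filename.seq" would leave the
references in filename.findref, provided the assumptions above hold.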

II. The second way: the map programs
1) The tfsites.dat file may also be read by any of the map programs
   to locate transcription factor binding sites along your sequence.
   Type:
        mapplot -dat=tfsites.dat filename.seq
                ^^^^^^^^^^^^^^^^

   This creates a "digestion" map showing where along your sequence
   the recognition sites in tfsites.dat occur. The map program can
   use the tfsites.dat file in the same way. Type:
        
       map  -dat=tfsites.dat filename.seq
            ^^^^^^^^^^^^^^^^

2) Or you could use Dan Prestridge's SignalScan program. To use it you
   must add two lines to your .cshrc file; send an email to Starr Hazard
   to get these two lines. Save the modifications, then type
   "source .cshrc" to activate the changes (or start a new shell, or
   log out and log in again). Typing "signal" should then start
   SignalScan. This is not a better program, it's just different. You can
   go back and look at the references, but only one at a time.

3) Or, finally, refer to the following links to Web resources. These do
   not generally work better or faster, but they do give you hypertext
   links to the references and are therefore more convenient in that regard.

Net-based Transcription Factor and Promoter Search Services

revised by ESH September 8, 2003