Exons: Relevant bits of information for using the MUSC Shared Resource, the BioMolecular Computing Resource (BCR).

EXON_9: Getting a sequence into the GCG system.

1. A quick method is to use the GCG program SEQED. To create and edit a sequence data file with SEQED, type:

        % seqed filename.ext

   This launches SEQED. The text cursor is placed at the top of the screen, ready for you to enter comments or annotations for the sequence. This is the sequence header region. Type control-d to exit the header. The cursor then moves to position one (1) of the sequence buffer. Type your sequence in, symbol by symbol.

   The UNIX screen buffer allows highlighting and copy/paste operations. If you capture the text of a sequence with the screen buffer, you can paste it en masse into the SEQED sequence buffer at this point, saving you the entry chore. It is also possible to define certain keys to stand for, for instance, A, C, T, or G, as an aid when reading sequencing gels.

   When you have finished, type control-d again. The cursor moves to a new line with a ':' prompt. At this prompt, type 'EXIT' to save the file under the name you entered as filename.ext above.

2. Nucleic acid files from a remote gopher server or Entrez server are generally returned in GENBANK, FASTA or EMBL format, none of which is immediately usable with the GCG system. To convert these files into GCG format, use the 'FROM...' commands (see GENHELP or GENMANUAL under Sequence Exchange), e.g.:

        FROMGENBANK filename.txt
        FROMEMBL filename.txt
        FROMFASTA filename.txt

   These routines read files written in GENBANK, EMBL or FASTA format and create output files in GCG format.

   The PIR Gopher at Houston returns protein sequence files in a unique format. Conversion to GCG format is a two-step process:

   a. Open the file with an editor and place two dots just prior to the sequence, e.g. ".." (no spaces!).

   b. Run the GCG program "reformat" with the edited file containing the two dots as input. The output file will be in GCG format.
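
If you would rather not open an editor for step 2a, the two-dot edit can also be scripted. The following is only a minimal sketch in Python, not part of GCG. The function name insert_gcg_divider and the sample lines are hypothetical, and it assumes that the two dots may sit on a line of their own directly above the first sequence line. Because the layout of the PIR Gopher files is not described here, you must tell the function which line the sequence starts on.

```python
def insert_gcg_divider(lines, seq_start):
    """Return a copy of `lines` with a ".." line inserted just before
    index `seq_start` (0-based), the first line of the sequence.

    Assumption: the two dots are placed on their own line, with no spaces.
    """
    if not 0 <= seq_start < len(lines):
        raise ValueError("seq_start must point at a line in the file")
    return lines[:seq_start] + [".."] + lines[seq_start:]


# Hypothetical file content, for illustration only (not a real PIR record).
example = [
    "Hypothetical protein entry",
    "Some annotation line",
    "MKTAYIAKQR",
    "QISFVKSHFS",
]

# The sequence starts on the third line, i.e. index 2.
edited = insert_gcg_divider(example, seq_start=2)
print("\n".join(edited))
```

Write the edited lines back to a file and give that file to "reformat" as in step 2b. Check the result by eye the first time, since the position of the first sequence line is something only you can confirm for a given file.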