
                                 isochore 



Function

   Plots isochores in DNA sequences

Description

   isochore plots GC content in windows over a DNA sequence. The data may
   also be written to an output file. The window size and shift increment
   (the number of bases separating the start of each window) may be
   specified. isochore is suitable for use with large sequences such as
   complete chromosomes or large genomic contigs, although interesting
   results can also be obtained from shorter sequences.
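
   The windowing described above can be sketched in a few lines of
   Python. This is an illustration only, not the isochore source code.
   The function name gc_windows is hypothetical, and two details are
   assumptions: each window is reported at its midpoint (consistent with
   the sample output, which starts at position 500 for the default
   1000-base window), and any base other than G or C, including
   ambiguity codes, counts as non-GC.

```python
def gc_windows(seq, window=1000, shift=100):
    """Yield (midpoint, gc_fraction) for every full window over seq.

    Assumption: only complete windows are reported, and the reported
    position is the window midpoint (start + window // 2).
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, shift):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        yield start + window // 2, gc / window


# Small demonstration: a GC-rich half followed by an AT-rich half.
demo = "GGCC" * 5 + "ATAT" * 5
for pos, frac in gc_windows(demo, window=20, shift=10):
    print(f"{pos}\t{frac:.3f}")
```

   Under these assumptions, a 184666-base sequence with the default
   window and shift gives windows centred on 500, 600, and so on up to
   184100, which matches the range of positions in the example output
   file.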

Usage

   Here is a sample session with isochore


% isochore tembl:AF129756  -graph cps 
Plots isochores in DNA sequences
Output file [af129756.iso]: 

Created isochore.ps
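
   The same session can be driven from a script. The sketch below builds
   the command line from qualifiers listed in this document (-sequence,
   -outfile, -graph, -window, -shift, -auto) and runs it with Python's
   subprocess module. The helper names isochore_command and run_isochore
   are hypothetical. Because the exit status is documented as always 0,
   the sketch checks for a non-empty output file rather than the return
   code.

```python
import shutil
import subprocess
from pathlib import Path


def isochore_command(usa, outfile, window=1000, shift=100, graph="cps"):
    """Build an argument list using only qualifiers documented here."""
    return [
        "isochore",
        "-sequence", usa,
        "-outfile", str(outfile),
        "-graph", graph,
        "-window", str(window),
        "-shift", str(shift),
        "-auto",  # turn off prompts so the script never blocks
    ]


def run_isochore(usa, outfile, **kwargs):
    """Run isochore if it is on PATH; return the output path or None."""
    if shutil.which("isochore") is None:
        return None  # EMBOSS is not installed on this machine
    subprocess.run(isochore_command(usa, outfile, **kwargs), check=False)
    out = Path(outfile)
    # The exit status is always 0, so test for the output file instead.
    return out if out.exists() and out.stat().st_size > 0 else None


print(" ".join(isochore_command("tembl:AF129756", "af129756.iso")))
```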


Command line arguments

   Standard (Mandatory) qualifiers:
  [-sequence]          sequence   Nucleotide sequence filename and optional
                                  format, or reference (input USA)
  [-outfile]           outfile    [*.isochore] Output file name
   -graph              xygraph    [$EMBOSS_GRAPHICS value, or x11] Graph type
                                  (ps, hpgl, hp7470, hp7580, meta, cps, x11,
                                  tekt, tek, none, data, xterm, png, gif)

   Additional (Optional) qualifiers:
   -window             integer    [1000] Window size (Integer 1 or more)
   -shift              integer    [100] Shift increment (Integer 1 or more)

   Advanced (Unprompted) qualifiers: (none)
   Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1            integer    Start of the sequence to be used
   -send1              integer    End of the sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-outfile" associated qualifiers
   -odirectory2        string     Output directory

   "-graph" associated qualifiers
   -gprompt            boolean    Graph prompting
   -gdesc              string     Graph description
   -gtitle             string     Graph title
   -gsubtitle          string     Graph subtitle
   -gxtitle            string     Graph x axis title
   -gytitle            string     Graph y axis title
   -goutfile           string     Output file for non interactive displays
   -gdirectory         string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write first file to standard output
   -filter             boolean    Read first file from standard input, write
                                  first file to standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages

Input file format

   isochore reads any normal nucleic acid sequence USA (Uniform Sequence
   Address).

  Input files for usage example

   'tembl:AF129756' is a sequence entry in the example nucleic acid
   database 'tembl'

  Database entry: tembl:AF129756

ID   AF129756; SV 1; linear; genomic DNA; STD; HUM; 184666 BP.
XX
AC   AF129756;
XX
DT   12-MAR-1999 (Rel. 59, Created)
DT   14-NOV-2006 (Rel. 89, Last updated, Version 5)
XX
DE   Homo sapiens MSH55 gene, partial cds; and CLIC1, DDAH, G6b, G6c, G5b, G6d,
DE   G6e, G6f, BAT5, G5b, CSK2B, BAT4, G4, Apo M, BAT3, BAT2, AIF-1, 1C7, LST-1,
DE   LTB, TNF, and LTA genes, complete cds.
XX
KW   .
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC   Homo.
XX
RN   [1]
RP   1-184666
RX   DOI; 10.1101/gr.1736803.
RX   PUBMED; 14656967.
RA   Xie T., Rowen L., Aguado B., Ahearn M.E., Madan A., Qin S., Campbell R.D.,
RA   Hood L.;
RT   "Analysis of the gene-dense major histocompatibility complex class III
RT   region and its comparison to mouse";
RL   Genome Res. 13(12):2621-2636(2003).
XX
RN   [2]
RP   1-184666
RA   Rowen L., Madan A., Qin S., Shaffer T., James R., Ratcliffe A., Abbasi N.,
RA   Dickhoff R., Loretz C., Madan A., Dors M., Young J., Lasky S., Hood L.;
RT   "Sequence of the human major histocompatibility complex class III region";
RL   Unpublished.
XX
RN   [3]
RP   1-184666
RA   Rowen L.;
RT   ;
RL   Submitted (22-FEB-1999) to the EMBL/GenBank/DDBJ databases.
RL   Department of Molecular Biotechnology, Box 357730 University of Washington,
RL   Seattle, WA 98195, USA
XX
RN   [4]
RP   1-184666
RA   Rowen L.;
RT   ;
RL   Submitted (28-OCT-1999) to the EMBL/GenBank/DDBJ databases.
RL   Multimegabase Sequencing Center, University of Washington, PO Box 357730,
RL   Seattle, WA 98195, USA


  [Part of this file has been deleted for brevity]

     aaaccagttt accaccactc ctaacactaa acttaaatct gactctaaat gtaagtccaa    181740
     tctgagccac aagcctaaag ttgaacttta tcctgcttta tgaattattc atccattcct    181800
     ccatttagtg agtatctgcg tgcctaacac atgctgggca ttgtcctaag gcaggaggga    181860
     catggaggca aagggatcag agaaggtacc agcacctgtg gagcttgtat tccagtgagg    181920
     ccagacggaa aagaaagaaa ctgaagaaga aattggtact atgagaaaat aagacaggct    181980
     gatgttgtaa gagtggcagg gagctacttt taaatacagt agtcagcaaa atcctctttg    182040
     agtgtttggg tggcactgga gctgagaccc aaatgacaaa aaatagtgac caggtaaaag    182100
     tttgggagca aagcatttca ggtaaaggga gcagctactg caaaggctgg aaggcggaac    182160
     caagctgggg gtgttgacga caaacagaag gccagtgtgg ctggagcaga gagagagact    182220
     gggaggcggg tgggagatga ggtcagagag gagggcaggg gccaggtcat gcagggccat    182280
     gcaagaaggg taaagcctct agatttcatc cagccacagg aagcctttaa aggtcgtcag    182340
     agtgtgtggt gcgtgcgtgt gtgtgtgtgt gtgtgtgtgt gttgcagggg agagaggggg    182400
     agggagagag agagagagag agagaagagg gaggtgagca gaggtgattg gatttttttt    182460
     tcttttgaca tggtgtcttg ctctgtggcc taggctggag tgcagtggca ccatcatagc    182520
     ccactgcaac ctcaaaacca tgggctcaag tcatccttcc acctcagctt cccaagtatc    182580
     taggactaca ggtgtgtgcc actgtgcctg gctaatttta aaaaatattt taaaattttt    182640
     gttgagacag ggtctatgct gctcaggctg gtctcgaact cctggtttca agtgatctgc    182700
     ccatcttggc ctcccaaagt ttttttttgt tagtttgaga ggcggtttcg ctcgttgccc    182760
     aggctggagt gcaatgactg atctcatctc actgcaacct ctgcctcctg ggttcaagcg    182820
     attctcctgc ttcagcctcc caagtagctg ggattacagg tgcatgccac cattcccggc    182880
     taattttttg tatttagtag agatggggtt tcaccatgtt agtcaggctg atctcaaact    182940
     cctgacctca ggtgatccgc ctgcctcagc ctcccaaagt tttgggatta caggtgtgag    183000
     ccaccatgct gggccagcct cccaaagttt tgggattaca ggcatgagtc accacactgg    183060
     ccctggattt tttttctttc ttttttttgg agacggagtc tcactctgtt gcccaggctg    183120
     gagtgcaatg gcgtaatctc agctcactgc aacctctgct gcccgggttc aaacgattct    183180
     cctgtcttag cctcctgagt agctgggatt ataggtgcat gccaccatgc ctggctaatt    183240
     tttgtacttt tagtagagaa agtacaccat cttggccagg ctggtctcga actcctgacc    183300
     tcaggtgatc cacttgcgtc ggcctcccaa agtgctggga ttacaggcgt gagacaccgc    183360
     acccagcctt tttttttttt tttcttttaa gacagaatcg ctctgtcacc caggctggag    183420
     tgcagtggca caatctcggc tcactgcaac ctctgcctcc caggtttaag caatccacct    183480
     atgtcagtct cccaagtagc tgggattata ggtgcatgtc accatgcctg gctaattttt    183540
     gtacttttag tatagaaagt acaccatgtt ggccaggctg gtcttgaact cctgacctca    183600
     agtgatccgc ctgcctcagc ctcccgaagt gctggaatta cagacatgtg ccactgcacc    183660
     cggcctggtt ttttttttct aagagatgga gtctcacttt tctgcccagg ttggagtgca    183720
     atggcaccat catagctcac tgcagccttc aactcttggc ctcaggcaat ccttgcacct    183780
     tagcctcgca gtgttgggat tacaggcatg agccactgag ccttgcctgg actttttttt    183840
     ttttttgaga tggcgtctcg ctctgttgcc caggttggag tgctacggca tgatcttggc    183900
     tcactgcaac ttccacctcc caggttcaag cgattctctt gcctcggccc cccgagtagc    183960
     tgggattaca ggcatgcgcc accgtgcctg gctaattttg gtatttttag tagagatagg    184020
     gtttcatcat gttgggcagg ctggtcttga actcctgacc tcgtgatcca cccacctcgg    184080
     cctcccaaag tgctgggatt ataggcatag ccaacgcgcc cagcctggac ttgtttttaa    184140
     aagatcactg tggctcctgt gtttaggctg gctggtagga gacaggtggc agtggcattg    184200
     atggtgaaga gaaaatagtg gcagccatgg agatggagag aagtagacaa gtttgggata    184260
     tattatacat tccaggggta gaaacaacag gactagatga tggattgatg ggtgggagat    184320
     gtagatactg ggagagaagc aggattctga tggatggaaa aactaaaaaa ttctattttg    184380
     ggtgtggtaa gtctaagtct attagacatg caagtagaga tgtcactggg cagatacaca    184440
     tctggatttc aggggcaagg tccaagctag agaaagaaac ctgggcatgg tcagcatgag    184500
     gatggtgttt aaagccatgg aacttatctt gtgcatccct ataagacccc tttgaggcac    184560
     ttgtttcccc tcacaatgga tgcagtgcat cttccattct gaattccaga ggcaacaacc    184620
     tcctgctcct agaagctaaa ctctccagac ttagtcttct gaattc                   184666
//

Output file format

  Output files for usage example

  File: af129756.iso

Position        Percent G+C 1 .. 184666
500     0.471
600     0.485
700     0.482
800     0.482
900     0.475
1000    0.489
1100    0.496
1200    0.499
1300    0.479
1400    0.477
1500    0.466
1600    0.442
1700    0.451
1800    0.455
1900    0.470
2000    0.455
2100    0.443
2200    0.440
2300    0.458
2400    0.467
2500    0.480
2600    0.493
2700    0.501
2800    0.498
2900    0.501
3000    0.508
3100    0.522
3200    0.514
3300    0.518
3400    0.515
3500    0.517
3600    0.530
3700    0.517
3800    0.527
3900    0.509
4000    0.500
4100    0.490
4200    0.496
4300    0.492
4400    0.479
4500    0.470
4600    0.464
4700    0.463
4800    0.460
4900    0.467
5000    0.476
5100    0.477
5200    0.479
5300    0.476


  [Part of this file has been deleted for brevity]

179100  0.406
179200  0.422
179300  0.412
179400  0.402
179500  0.397
179600  0.397
179700  0.398
179800  0.402
179900  0.436
180000  0.456
180100  0.472
180200  0.456
180300  0.458
180400  0.462
180500  0.487
180600  0.477
180700  0.471
180800  0.479
180900  0.477
181000  0.463
181100  0.454
181200  0.448
181300  0.436
181400  0.444
181500  0.425
181600  0.435
181700  0.446
181800  0.459
181900  0.460
182000  0.471
182100  0.485
182200  0.483
182300  0.498
182400  0.495
182500  0.505
182600  0.513
182700  0.514
182800  0.500
182900  0.493
183000  0.500
183100  0.491
183200  0.502
183300  0.508
183400  0.509
183500  0.515
183600  0.517
183700  0.515
183800  0.508
183900  0.500
184000  0.492
184100  0.493
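
   The output file shown above has one header line followed by rows of
   two whitespace-separated columns. Despite the "Percent G+C" header,
   the sample values lie between 0 and 1, so they read as fractions
   rather than percentages. A minimal Python sketch for loading such a
   file follows. The function name parse_isochore_output is
   hypothetical, and the layout is assumed from the example rather than
   from a formal specification.

```python
def parse_isochore_output(text):
    """Return a list of (position, gc_fraction) tuples.

    Any line that is not exactly two fields (an integer and a float)
    is skipped, which drops the header and blank lines.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            rows.append((int(fields[0]), float(fields[1])))
        except ValueError:
            continue
    return rows


sample = """Position        Percent G+C 1 .. 184666
500     0.471
600     0.485
700     0.482
"""
rows = parse_isochore_output(sample)
# Window with the highest GC fraction in the sample rows.
peak = max(rows, key=lambda row: row[1])
print(peak)
```

   For a real run, the text would come from the output file, for example
   open("af129756.iso").read().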

  Graphics File: isochore.ps

   [isochore results]

Data files

   None.

Notes

   The nuclear genomes of vertebrates are mosaics of isochores: very long
   stretches (>300 kb) of DNA that are homogeneous in base composition and
   are compositionally correlated with the coding sequences embedded in
   them. Isochores can be partitioned into a small number of families that
   cover a range of GC levels (GC is the molar ratio of guanine+cytosine
   in DNA). This range is narrow in cold-blooded vertebrates but broad in
   warm-blooded vertebrates.

References

    1. Bernardi G. Isochores and the evolutionary genomics of vertebrates.
       Gene 2000 Jan 4;241(1):3-17.
    2. Pesole G, Bernardi G, Saccone C. Isochore specificity of AUG
       initiator context of human genes. FEBS Lett 1999 Dec
       24;464(1-2):60-2.
    3. Bernardi G. The human genome: organization and evolutionary
       history. Annu Rev Genet 1995;29:445-76.

Warnings

   None.

Diagnostic Error Messages

   None.

Exit status

   It always exits with a status of 0.

Known bugs

   None.

See also

   Program name                           Description
   banana       Plot bending and curvature data for B-DNA
   btwisted     Calculate the twisting in a B-DNA sequence
   chaos        Draw a chaos game representation plot for a nucleotide sequence
   compseq      Calculate the composition of unique words in sequences
   dan          Calculates nucleic acid melting temperature
   density      Draw a nucleic acid density plot
   freak        Generate residue/base frequency table or plot
   sirna        Finds siRNA duplexes in mRNA
   wordcount    Count and extract unique words in DNA sequence(s)

Author(s)

   Peter Rice (pmr  ebi.ac.uk)
   Informatics Division, European Bioinformatics Institute, Wellcome
   Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

History

Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.

Comments

   None.
