Analysis of Col1a1 promoter and partial intron 1 sequences in Homo sapiens (Hs), Bos taurus (Bt) and Cervus elaphus (Ce) orthologues
Table of Contents
Osx motifs
Promoter analysis with Fuzznuc program
Homo sapiens
Bos taurus
Cervus elaphus
Promoter analysis with Dialign program
Partial intronic sequence analysis with Fuzznuc program
Homo sapiens
Bos taurus
Cervus elaphus
Partial intronic sequence analysis with Dialign program
Osx motifs:
Symbol Binding site Transfac ID
Promoter
########################################
# Program: fuzznuc
# Rundate: Sun 1 Feb 2010 18:29:40
# Commandline: fuzznuc
# -filter
# [-sequence] ../fuzznuc/source/HsPrE1COL1A1_1kE1.fasta
# -complement
# -pattern @SP1_1.pat
# Report_format: seqtable
# Report_file: stdout
########################################
#======
#
# Sequence: HsPrE1COL1A1 from: 1 to: 1223
# HitCount: 31
#
# Pattern_name Mismatch Pattern
# V$SP1_Q4_01 2 NNGGGGCGGGGNN
#
# Complement: Yes
#
#======
Start End Pattern_name Mismatch Sequence
779 791 V$SP1_Q4_01 2 GCTGGGCTGGGGG
780 792 V$SP1_Q4_01 2 CTGGGCTGGGGGG
787 799 V$SP1_Q4_01 1 GGGGGGCTGGGGA
788 800 V$SP1_Q4_01 2 GGGGGCTGGGGAG
877 889 V$SP1_Q4_01 1 TGGGGGCCGGGCC
878 890 V$SP1_Q4_01 2 GGGGGCCGGGCCA
906 918 V$SP1_Q4_01 2 CTGGGGCACGGGC
937 949 V$SP1_Q4_01 1 GAGGGGCAGGGTT
979 991 V$SP1_Q4_01 2 AAGGGGCCCGGGC
980 992 V$SP1_Q4_01 2 AGGGGCCCGGGCC
1022 1034 V$SP1_Q4_01 2 CGGGGTCGGAGCA
145 133 V$SP1_Q4_01 2 CAGAGGTGGGGAT
168 156 V$SP1_Q4_01 2 AGGGGTGGGGGAT
169 157 V$SP1_Q4_01 1 TAGGGGTGGGGGA
247 235 V$SP1_Q4_01 2 GAGGTGCAGGGTG
263 251 V$SP1_Q4_01 2 ATGGAGTGGGGAG
373 361 V$SP1_Q4_01 2 TGAGGGAGGGGTG
545 533 V$SP1_Q4_01 2 TAAGGGAGGGGCA
602 590 V$SP1_Q4_01 2 TGGGGAGGGGGTC
603 591 V$SP1_Q4_01 1 TTGGGGAGGGGGT
617 605 V$SP1_Q4_01 2 AAGGGTTGGGGGT
644 632 V$SP1_Q4_01 2 GCTGGGTGGGGAG
649 637 V$SP1_Q4_01 2 GGGGAGCTGGGTG
717 705 V$SP1_Q4_01 2 GTGGGTAGGGGTG
856 844 V$SP1_Q4_01 2 AGGGGGAGGAGGA
862 850 V$SP1_Q4_01 2 ATGGAGAGGGGGA
938 926 V$SP1_Q4_01 2 TCGGAGAGGGGGA
1087 1075 V$SP1_Q4_01 2 TGGGGAGGGGGTT
1088 1076 V$SP1_Q4_01 1 CTGGGGAGGGGGT
1094 1082 V$SP1_Q4_01 2 TTGTGGCTGGGGA
1175 1163 V$SP1_Q4_01 2 GGAGGGCGGTGGC
#------
#------
#------
# Total_sequences: 1
# Total_hitcount: 31
#------
########################################
# Program: fuzznuc
# Rundate: Sun 1 Feb 2010 18:29:58
# Commandline: fuzznuc
# -filter
# [-sequence] ../fuzznuc/source/BtPrE1COL1A1_1kE1.fasta
# -complement
# -pattern @SP1_1.pat
# Report_format: seqtable
# Report_file: stdout
########################################
#======
#
# Sequence: BtPrE1COL1A1 from: 1 to: 1223
# HitCount: 22
#
# Pattern_name Mismatch Pattern
# V$SP1_Q4_01 2 NNGGGGCGGGGNN
#
# Complement: Yes
#
#======
Start End Pattern_name Mismatch Sequence
782 794 V$SP1_Q4_01 1 GGGGGGCTGGGGA
783 795 V$SP1_Q4_01 2 GGGGGCTGGGGAG
877 889 V$SP1_Q4_01 1 TGGGGGCCGGGCC
878 890 V$SP1_Q4_01 2 GGGGGCCGGGCCA
907 919 V$SP1_Q4_01 2 TGGGGGACGGGCA
908 920 V$SP1_Q4_01 2 GGGGGACGGGCAG
937 949 V$SP1_Q4_01 1 CGGGGGCAGGGTT
979 991 V$SP1_Q4_01 2 AAGGGGCCCGGGC
980 992 V$SP1_Q4_01 2 AGGGGCCCGGGCC
986 998 V$SP1_Q4_01 2 CCGGGCCGGTGGT
1022 1034 V$SP1_Q4_01 2 CGGGGTCGGAGCA
131 119 V$SP1_Q4_01 2 AAGGGCCGTGGTC
447 435 V$SP1_Q4_01 2 GTGTGGCTGGGGT
667 655 V$SP1_Q4_01 2 TCAGGGAGGGGAC
712 700 V$SP1_Q4_01 2 GTGGGTAGGGGCG
856 844 V$SP1_Q4_01 2 AGGGGGAGGAGGA
938 926 V$SP1_Q4_01 2 CGGGAGAGGGGGA
963 951 V$SP1_Q4_01 2 GGAGAGCGGGGAG
1087 1075 V$SP1_Q4_01 2 TGGGGTGGGGGTT
1088 1076 V$SP1_Q4_01 1 CTGGGGTGGGGGT
1094 1082 V$SP1_Q4_01 2 TTGCGGCTGGGGT
1175 1163 V$SP1_Q4_01 2 GGAGGGCGGTGGC
#------
#------
#------
# Total_sequences: 1
# Total_hitcount: 22
#------
########################################
# Program: fuzznuc
# Rundate: Sun 1 Feb 2010 18:30:24
# Commandline: fuzznuc
# -filter
# [-sequence] ../fuzznuc/source/CePrE1COL1A1_1k_Pr_rev.fasta
# -complement
# -pattern @SP1_1.pat
# Report_format: seqtable
# Report_file: stdout
########################################
#======
#
# Sequence: CePrE1COL1A1_1k_Pr_rev from: 1 to: 1175
# HitCount: 25
#
# Pattern_name Mismatch Pattern
# V$SP1_Q4_01 2 NNGGGGCGGGGNN
#
# Complement: Yes
#
#======
Start End Pattern_name Mismatch Sequence
716 728 V$SP1_Q4_01 1 GGGGGGCTGGGGA
717 729 V$SP1_Q4_01 2 GGGGGCTGGGGAG
811 823 V$SP1_Q4_01 1 TGGGGGCCGGGCC
812 824 V$SP1_Q4_01 2 GGGGGCCGGGCCA
841 853 V$SP1_Q4_01 2 TGGGGGACGGGCA
842 854 V$SP1_Q4_01 2 GGGGGACGGGCAG
871 883 V$SP1_Q4_01 1 CGGGGGCAGGGTT
913 925 V$SP1_Q4_01 2 AAGGGGCCCGGGC
914 926 V$SP1_Q4_01 2 AGGGGCCCGGGCC
920 932 V$SP1_Q4_01 2 CCGGGCCGGTGGT
976 988 V$SP1_Q4_01 2 CGGGGTCGGAGCA
54 42 V$SP1_Q4_01 2 AAGGGCCGTGGTG
203 191 V$SP1_Q4_01 2 GGGGTGAGGGGGG
205 193 V$SP1_Q4_01 2 ATGGGGTGAGGGG
381 369 V$SP1_Q4_01 2 GTGTGGCTGGGGT
473 461 V$SP1_Q4_01 2 CAGGGATGGGGAT
600 588 V$SP1_Q4_01 2 TCAGGGAGGGGAC
646 634 V$SP1_Q4_01 2 GTGGGTAGGGGCG
687 675 V$SP1_Q4_01 2 CTGGGGTGCGGCT
790 778 V$SP1_Q4_01 2 AGGGGGAGGAGGA
872 860 V$SP1_Q4_01 2 CGGGAGAGGGGGA
897 885 V$SP1_Q4_01 2 GGAGAGCGGGGAG
1039 1027 V$SP1_Q4_01 2 TGGGGTGGGGGTT
1040 1028 V$SP1_Q4_01 1 TTGGGGTGGGGGT
1127 1115 V$SP1_Q4_01 2 GGAGGGCGGTGGC
#------
#------
#------
# Total_sequences: 1
# Total_hitcount: 25
#------
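The hit rows in the three promoter reports above can be loaded into records for comparison between species. The following is a minimal sketch, not part of the original analysis: the names `Hit` and `parse_seqtable_rows` are hypothetical, and reading rows with Start greater than End as reverse-strand hits is an assumption based on the fact that such rows appear only in the runs made with -complement.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    start: int
    end: int
    pattern_name: str
    mismatches: int
    sequence: str
    strand: str  # "+" when start <= end, "-" otherwise (assumed convention)


def parse_seqtable_rows(text: str) -> List[Hit]:
    """Extract hit rows from a fuzznuc seqtable report given as plain text."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        # Hit rows have five fields: two integers, a name, an integer, a sequence.
        # Comment lines ("#...") and the column header fail this check and are skipped.
        if len(fields) != 5:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit() and fields[3].isdigit()):
            continue
        start, end = int(fields[0]), int(fields[1])
        strand = "+" if start <= end else "-"
        hits.append(Hit(start, end, fields[2], int(fields[3]), fields[4], strand))
    return hits


# Four rows copied from the Homo sapiens promoter report.
sample = """\
Start End Pattern_name Mismatch Sequence
787 799 V$SP1_Q4_01 1 GGGGGGCTGGGGA
877 889 V$SP1_Q4_01 1 TGGGGGCCGGGCC
169 157 V$SP1_Q4_01 1 TAGGGGTGGGGGA
1175 1163 V$SP1_Q4_01 2 GGAGGGCGGTGGC
"""
hits = parse_seqtable_rows(sample)
reverse_hits = [h for h in hits if h.strand == "-"]
```

Once parsed, hits from the three species can be grouped by strand or by mismatch count before they are compared with the alignment.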
DIALIGN 2.2.1
*************
Program code written by Burkhard Morgenstern and Said Abdeddaim
e-mail contact:
Published research assisted by DIALIGN 2 should cite:
Burkhard Morgenstern (1999).
DIALIGN 2: improvement of the segment-to-segment
approach to multiple sequence alignment.
Bioinformatics 15, 211 - 218.
For more information, please visit the DIALIGN home page at
http://bibiserv.techfak.uni-bielefeld.de/dialign/
program call: dialign2-2 -n Hs_Bt_Ce_PrE1_1k_20100112.dfasta
Aligned sequences: length:
======
1) HsPrE1COL1A1 1223
2) BtPrE1COL1A1 1223
3) CePrE1COL1A1 1175
Average seq. length: 1207.0
Please note that only upper-case letters are considered to be aligned.
Alignment (DIALIGN format):
======
HsPrE1COL1A1 1 cc------
BtPrE1COL1A1 1 aagtgctctt caatacagtt ctctccagtt tgactgtgct gggtagaagg
CePrE1COL1A1 1 ------
0000000000 0000000000 0000000000 0000000000 0000000000
HsPrE1COL1A1 3 ------C TGGCCACAGC CATGGC------AAACAAAACT
BtPrE1COL1A1 51 gtgtctaaaC AGGCCATGAC CATGGCcACG ACAGACCCTA ACACAAGACC
CePrE1COL1A1 1 ------ACG ACAGACCCTA ACACAAGACC
0000000000 0000000000 0000000222 2222222222 3333333333
HsPrE1COL1A1 30 CTTCTCTAAG TCACCAATGA TCACAGGCCT cccactaaaa ATACTTCCCA
BtPrE1COL1A1 101 CTTTTCTAAA TCACCAGTGA CCACGGCCCT TCCGTGC--- AAACTTACTG
CePrE1COL1A1 24 CTTTTCTAAG TCACGAGTCA CCACGGCCCT TCCGTGC--- AAACTTACTG
3333333333 3333133333 3333333333 2222222000 2222222222
HsPrE1COL1A1 80 ACTCTGGGGT GGAAGAGTtt gggggatgaa tttttagggg attgcaagcc
BtPrE1COL1A1 148 CCCCTAGGga g------GC------
CePrE1COL1A1 71 CTCCTGGGGT GAAAAAATcc taGC------
2222222200 0000000000 0022000000 0000000000 0000000000
HsPrE1COL1A1 130 ccaatcccca cctctgtgtc cctagAATCC CCCACCCCTA CCTTgGCTGC
BtPrE1COL1A1 161 ------AGTTT TCCACCACCA CCCT-GCTGC
CePrE1COL1A1 95 ------AATTT CCCACCACCA CCCT-GCTGC
0000000000 0000000000 0000022222 2222222222 2222022222
HsPrE1COL1A1 180 TCCATCACCC AACcACCAAA GCTTTCTTCT GCAGAGGCCA CCTAGTCAtg
BtPrE1COL1A1 185 CCCATCACTC AAC-ACCAAA GCTTCCTTCT GGCGAGACCG CATACTCAAA
CePrE1COL1A1 119 CCCATCACTC AAC-ACTAAA GCGTCCTTCT GGAGAGACCG CATACTCAAA
2222222222 2220311333 3333333333 3333333333 3333333322
HsPrE1COL1A1 230 TTTCTCACCC TGCACCTCAG CCTCCCCACT C------CaT CTCTCAATCA
BtPrE1COL1A1 234 TTTTTCACTC AGTGCCTCAG CACCCCTCAT CACCCCATCT tTTTCAATCA
CePrE1COL1A1 168 TTTTTCACTC AGTGCCTCAG CAGCCCCCCT CACCCCATCT CTTTCAATCC
2222222111 1111111111 1111111111 1111111111 0222222222
HsPrE1COL1A1 274 TGcCTAGGGT TTGGAGGAAG GCATTTGATT CTGTTCTGGA GCacagcaga
BtPrE1COL1A1 284 TG-CTAGGGT TTAGGGGAGA GTATTTGAGT CTATCCTGGA GCCTCAGGCA
CePrE1COL1A1 218 TG-CTAGGGT TTAGGGGAGA GTATTTGAAT CTATCCTGGA GCCTCAGGCA
2204444444 4444444444 4444444414 4444444444 4433333333
HsPrE1COL1A1 324 ----AGAATT GACATCCTCA AAATTAAAAC tcccttgcct gcacccctcc
BtPrE1COL1A1 333 TGGCAGAATT GACATCCTCA AAACAAAAAC CCACCCTAAG G------
CePrE1COL1A1 267 TGGCAGAATT GACATCCTCA AAACAAAAAC CCACCCTAAG G------
3333555555 5555555555 5555555555 3333333333 3000000000
HsPrE1COL1A1 370 ctcagatATC TGATTCTTAA TGTCTAGAAA GGAATCTgta aaTTGTTCCC
BtPrE1COL1A1 374 ------ATC TGATTCTTAA CGTCTATAAA TGAATCTATC --TTGTTCCC
CePrE1COL1A1 308 ------ATC TGATTCTTAA CGTCTATAAA TGAATCTATG --TTGTTCCC
0000000555 5555555555 5555544444 4444444222 0044444444
HsPrE1COL1A1 420 CAAATATTCC TAAGCTCCAT CCCCtAGCCA CACCAGAAGA CACCCCCAAA
BtPrE1COL1A1 415 CAAATATTCC TAAGTTCCAT ACCCCAGCCA CACCAGAAGA CACCCCTAAA
CePrE1COL1A1 349 CAAATATTCC TACATTCCAT ACCCCAGCCA CACCAGAAGA CACCCCTAAA
4444444444 4422555555 5555388888 8888888888 8888888888
HsPrE1COL1A1 470 CAGGCACATC TTTTT-AATT CCCAGCTTCC TCTGTTTTGG AGAGGTCCTC
BtPrE1COL1A1 465 CAGGCACATC TTTTTAAATT CTCTGGTTCT CCTGCTTTAA AGTGGTCCCC
CePrE1COL1A1 399 CAGGCGCATC TTTTTAAATT CTCTGGTCCT CCCGCTTTAA AGTGGTCCCC
8888556666 6666612222 2222222222 2203333333 3333333333
HsPrE1COL1A1 519 AGCaTGCCTC TTTATGCCCC TCCCTTAGCT CTTGCca-GG ATATCAGagg
BtPrE1COL1A1 515 AGC-TGCCTC TCCATACCCA TCCCTGAGCT CTTGCTCTGG ATGTCAGGAG
CePrE1COL1A1 449 AGC-TGCCTC TCCATCCCCA TCCCTGAGCT CTTGCTCTGG ATATCAGGAG
3330444444 4444444444 4444333333 3333322222 2222222222
HsPrE1COL1A1 568 gtgactgggG -CACAGCCAG GAGGACCCCC TCCCCAACAC CCccaac--C
BtPrE1COL1A1 564 GGTGAACAAG -CAGGGCGAG CAGGACCCTT TCCCAATCAC TCaaAGAGTC
CePrE1COL1A1 498 GGTGAACAAG aCATGGCGGG CAGGACCCTC TCCCAATCAC CC--AGAGTC
2222222222 0111111111 1111111111 1111111111 0000222224
HsPrE1COL1A1 615 CTTCCACCTT TGGAAGTCTC CCCACCCAGC TCCCCAGTTc cccagttcca
BtPrE1COL1A1 613 CTTCCATCTT TGAAAGCTTC CCTAACCAGC TCCCCAATTC CAGTCCCCTC
CePrE1COL1A1 546 CTTCCGTCTT TGAAAGCCTC CCTAACCAGC TCCACAATTC CAGTCCCCTC
4444444444 4444444444 4444444444 4442444441 1111111111
HsPrE1COL1A1 665 cttcttctaG ATTGGAGG-T CCCAGGAAGA GAGCAG-AGG GGCACCCCTA
BtPrE1COL1A1 663 CCTGA----G ACTGGGGG-T CCAAGAAAGA AAGCCGAAGA GGCGCCCCTA
CePrE1COL1A1 596 CCTGA----G ACTGGGGGgT CCTAGAAAGA AAGCCGAAGA GGCGCCCCTA
1111000022 2222222200 0003333333 3333333444 4444444444
HsPrE1COL1A1 713 CCCACTGGTT AGCCCACgcc attcTGAGGA CCCAGCTGCA CCCCtaccaC
BtPrE1COL1A1 708 CCCACTATTT AGCCGACAGT ATGTTGAGGA CCTAGCTGCA CCCCAGCGGC
CePrE1COL1A1 642 CCCACTATTT AGCCGACAGT ATGTTGAGGA CCTAGCCGCA CCCCAGCGGC
4444444444 4444444222 2222333333 3333333333 3333222225
HsPrE1COL1A1 763 AGCACCTCTG GCCCAGGCTG GGCTGGGGGG CTGGGGAGGC AGAGCTGCGA
BtPrE1COL1A1 758 AGCATCTCTG GCCCAGACTA AACTGGGGGG CTGGGGAGGC AGAGTTGCGA
CePrE1COL1A1 692 AGCATCTCTG GCCCAGACTG AACTGGGGGG CTGGGGAGGC AGAGCCGCGA
5555555555 5555555555 5555555555 5555666666 6664447777
HsPrE1COL1A1 813 AGAGGGGAGA TGTGGGGTGG ACTCCC---- -TTCCCTCCT CCTCCCCCTC
BtPrE1COL1A1 808 AGGGGGGAGA TGTGCGGTGG ACTCCCTTTC CTTCCCTCCT CCTCCCCCTC
CePrE1COL1A1 742 AGGGGGGAGA TGTGCGGTGG ACTCCCTTTC CTTCCCTCCT CCTCCCCCTC
7777777777 7777777777 7777553333 3666666666 6666666666
HsPrE1COL1A1 858 TCCATTCCAA CTCCCAAATT GGGGGCCGGG CCAGGCAGCT CTGATTGGCT
BtPrE1COL1A1 858 TCGGTTCCGA CTCCCAAATT GGGGGCCGGG CCAGGCAACT CTGATTGGCT
CePrE1COL1A1 792 TCGGTTCCGA CTCCCAAATT GGGGGCCGGG CCAGGCAACT CTGATTGGCT
6655777777 7777777777 7777777777 7777777577 7777777777
HsPrE1COL1A1 908 GGGGCACGGG CGGCCGGCTC CCCCTCTCCG AGGGGCAGGG TTCCTCCCTG
BtPrE1COL1A1 908 GGGGGACGGG CAGCCGGCTC CCCCTCTCCC GGGGGCAGGG TTCCTCCCCG
CePrE1COL1A1 842 GGGGGACGGG CAGCCGGCTC CCCCTCTCCC GGGGGCAGGG TTCCTCCCCG
7777777777 7577777777 7777777744 4666666666 6666666646
+1
HsPrE1COL1A1 958 CTCTCCATCA GGACAGTATA AAAGGGGCCC GGGCCAGTCG TCGGAGC---
BtPrE1COL1A1 958 CTCTCCATCA GGATGGTATA AAAGGGGCCC GGGCCGGTGG TCGGAGC---
CePrE1COL1A1 892 CTCTCCGTCA GGATGGTATA AAAGGGGCCC GGGCCGGTGG TCGGAGCaga
6666663666 6664466666 6666666666 6666666647 7777777000
HsPrE1COL1A1 1005 ------AGA CGGGAGTTTC TCCTCGGGGT CGGAGCAGGA
BtPrE1COL1A1 1005 ------AGA CGGGAGTTTC TCCTCGGGGT CGGAGCAGGA
CePrE1COL1A1 942 cgggagtttc tcctcgcAGA CGGGAGTTTC TCCTCGGGGT CGGAGCAGGA
0000000000 0000000555 9999999999 9999999999 9999999999
HsPrE1COL1A1 1038 GGCACGCGGA GTGTGAGGCC ACGCATGAGC GGACGCTAAC CCCCTCCCCA
BtPrE1COL1A1 1038 GGCACGCGGA GTGTGAGGCC ACGCATGAGC GGACGCTAAC CCCCACCCCA
CePrE1COL1A1 992 GGCACGCGGA --GTGAGGCC ACGCATGAGC GGACGCTAAC CCCCACCCCA
9999999555 2288888888 8888888888 8888888887 7777777777
HsPrE1COL1A1 1088 GCCACAAAGA GTCTACATGT CTAGGGTCTA GACATGTTCA GCTTTGTGGA
BtPrE1COL1A1 1088 GCCGCAAAGA GTCTACATGT CTAGGGTCTA GACATGTTCA GCTTTGTGGA
CePrE1COL1A1 1040 aCCGCAAAGA GTCTACATGT CTAGGGTCTA GACATGTTCA GCTTTGTGGA
2555888888 8888888888 8888888999 9999999999 9999999999
HsPrE1COL1A1 1138 CCTCCGGCTC CTGCTCCTCT TAGCGGCCAC CGCCCTCCTG ACGCACGGCC
BtPrE1COL1A1 1138 CCTCCGGCTC CTGCTCCTCT TAGCGGCCAC CGCCCTCCTG ACGCACGGCC
CePrE1COL1A1 1090 CCTCCGGCTC CTGCTCCTCT TAGCGGCCAC CGCCCTCCTG ACGCACGGCC
9999999999 9999999999 9999999999 9888666666 6666666666
HsPrE1COL1A1 1188 AAGAGGAAGG CCAAGTCGAG GGCCAAGACG AAGACA
BtPrE1COL1A1 1188 AAGAGGAGGG CCAGGAAGAA GGCCAAGAAG AAGACA
CePrE1COL1A1 1140 AAGAGGAGGG CCAAGAAGAA GGCCAAGAAG AAGACA
6666666344 4442311222 2222222222 222222
Alignment (FASTA format):
======
>HsPrE1COL1A1
cc------
------CTGGCCACAGCCATGGC------AAACAAAACT
CTTCTCTAAGTCACCAATGATCACAGGCCTcccactaaaaATACTTCCCA
ACTCTGGGGTGGAAGAGTttgggggatgaatttttaggggattgcaagcc
ccaatccccacctctgtgtccctagAATCCCCCACCCCTACCTTgGCTGC
TCCATCACCCAACcACCAAAGCTTTCTTCTGCAGAGGCCACCTAGTCAtg
TTTCTCACCCTGCACCTCAGCCTCCCCACTCCa------TCTCTCAATCA
TGcCTAGGGTTTGGAGGAAGGCATTTGATTCTGTTCTGGAGCacagcaga
----AGAATTGACATCCTCAAAATTAAAACtcccttgcctgcacccctcc
ctcagatATCTGATTCTTAATGTCTAGAAAGGAATCTgtaaaTTGTTCCC
CAAATATTCCTAAGCTCCATCCCCtAGCCACACCAGAAGACACCCCCAAA
CAGGCACATCTTTTT-AATTCCCAGCTTCCTCTGTTTTGGAGAGGTCCTC
AGCaTGCCTCTTTATGCCCCTCCCTTAGCTCTTGCca-GGATATCAGagg
gtgactgggG-CACAGCCAGGAGGACCCCCTCCCCAACACCCccaac--C
CTTCCACCTTTGGAAGTCTCCCCACCCAGCTCCCCAGTTccccagttcca
cttcttctAGATTGGAGG-TCCCAGGAAGAGAGCAG-AGGGGCACCCCTA
CCCACTGGTTAGCCCACgccattcTGAGGACCCAGCTGCACCCCtaccaC
AGCACCTCTGGCCCAGGCTGGGCTGGGGGGCTGGGGAGGCAGAGCTGCGA
AGAGGGGAGATGTGGGGTGGACTCCC-----TTCCCTCCTCCTCCCCCTC
TCCATTCCAACTCCCAAATTGGGGGCCGGGCCAGGCAGCTCTGATTGGCT
GGGGCACGGGCGGCCGGCTCCCCCTCTCCGAGGGGCAGGGTTCCTCCCTG
CTCTCCATCAGGACAGTATAAAAGGGGCCCGGGCCAGTCGTCGGAGC---
------AGACGGGAGTTTCTCCTCGGGGTCGGAGCAGGA
GGCACGCGGAGTGTGAGGCCACGCATGAGCGGACGCTAACCCCCTCCCCA
GCCACAAAGAGTCTACATGTCTAGGGTCTAGACATGTTCAGCTTTGTGGA
CCTCCGGCTCCTGCTCCTCTTAGCGGCCACCGCCCTCCTGACGCACGGCC
AAGAGGAAGGCCAAGTCGAGGGCCAAGACGAAGACA
>BtPrE1COL1A1
aagtgctcttcaatacagttctctccagtttgactgtgctgggtagaagg
gtgtctaaaCAGGCCATGACCATGGCcACGACAGACCCTAACACAAGACC
CTTTTCTAAATCACCAGTGACCACGGCCCTTCCGTGC---AAACTTACTG
CCCCTAGGgag------GC------
------AGTTTTCCACCACCACCCT-GCTGC
CCCATCACTCAAC-ACCAAAGCTTCCTTCTGGCGAGACCGCATACTCAAA
TTTTTCACTCAGTGCCTCAGCACCCCTCATCACCCCATCTtTTTCAATCA
TG-CTAGGGTTTAGGGGAGAGTATTTGAGTCTATCCTGGAGCCTCAGGCA
TGGCAGAATTGACATCCTCAAAACAAAAACCCACCCTAAGG------
------ATCTGATTCTTAACGTCTATAAATGAATCTATC--TTGTTCCC
CAAATATTCCTAAGTTCCATACCCCAGCCACACCAGAAGACACCCCTAAA
CAGGCACATCTTTTTAAATTCTCTGGTTCTCCTGCTTTAAAGTGGTCCCC
AGC-TGCCTCTCCATACCCATCCCTGAGCTCTTGCTCTGGATGTCAGGAG
GGTGAACAAG-CAGGGCGAGCAGGACCCTTTCCCAATCACTCaaAGAGTC
CTTCCATCTTTGAAAGCTTCCCTAACCAGCTCCCCAATTCCAGTCCCCTC
CCTG----AGACTGGGGG-TCCAAGAAAGAAAGCCGAAGAGGCGCCCCTA
CCCACTATTTAGCCGACAGTATGTTGAGGACCTAGCTGCACCCCAGCGGC
AGCATCTCTGGCCCAGACTAAACTGGGGGGCTGGGGAGGCAGAGTTGCGA
AGGGGGGAGATGTGCGGTGGACTCCCTTTCCTTCCCTCCTCCTCCCCCTC
TCGGTTCCGACTCCCAAATTGGGGGCCGGGCCAGGCAACTCTGATTGGCT
GGGGGACGGGCAGCCGGCTCCCCCTCTCCCGGGGGCAGGGTTCCTCCCCG
CTCTCCATCAGGATGGTATAAAAGGGGCCCGGGCCGGTGGTCGGAGC---
------AGACGGGAGTTTCTCCTCGGGGTCGGAGCAGGA
GGCACGCGGAGTGTGAGGCCACGCATGAGCGGACGCTAACCCCCACCCCA
GCCGCAAAGAGTCTACATGTCTAGGGTCTAGACATGTTCAGCTTTGTGGA
CCTCCGGCTCCTGCTCCTCTTAGCGGCCACCGCCCTCCTGACGCACGGCC
AAGAGGAGGGCCAGGAAGAAGGCCAAGAAGAAGACA
>CePrE1COL1A1
------
------ACGACAGACCCTAACACAAGACC
CTTTTCTAAGTCACGAGTCACCACGGCCCTTCCGTGC---AAACTTACTG
CTCCTGGGGTGAAAAAATcctaGC------
------AATTTCCCACCACCACCCT-GCTGC
CCCATCACTCAAC-ACTAAAGCGTCCTTCTGGAGAGACCGCATACTCAAA
TTTTTCACTCAGTGCCTCAGCAGCCCCCCTCACCCCATCTCTTTCAATCC
TG-CTAGGGTTTAGGGGAGAGTATTTGAATCTATCCTGGAGCCTCAGGCA
TGGCAGAATTGACATCCTCAAAACAAAAACCCACCCTAAGG------
------ATCTGATTCTTAACGTCTATAAATGAATCTATG--TTGTTCCC
CAAATATTCCTACATTCCATACCCCAGCCACACCAGAAGACACCCCTAAA
CAGGCGCATCTTTTTAAATTCTCTGGTCCTCCCGCTTTAAAGTGGTCCCC
AGC-TGCCTCTCCATCCCCATCCCTGAGCTCTTGCTCTGGATATCAGGAG
GGTGAACAAGaCATGGCGGGCAGGACCCTCTCCCAATCACCC--AGAGTC
CTTCCGTCTTTGAAAGCCTCCCTAACCAGCTCCACAATTCCAGTCCCCTC
CCTG----AGACTGGGGGgTCCTAGAAAGAAAGCCGAAGAGGCGCCCCTA
CCCACTATTTAGCCGACAGTATGTTGAGGACCTAGCCGCACCCCAGCGGC
AGCATCTCTGGCCCAGACTGAACTGGGGGGCTGGGGAGGCAGAGCCGCGA
AGGGGGGAGATGTGCGGTGGACTCCCTTTCCTTCCCTCCTCCTCCCCCTC
TCGGTTCCGACTCCCAAATTGGGGGCCGGGCCAGGCAACTCTGATTGGCT
GGGGGACGGGCAGCCGGCTCCCCCTCTCCCGGGGGCAGGGTTCCTCCCCG
CTCTCCGTCAGGATGGTATAAAAGGGGCCCGGGCCGGTGGTCGGAGCaga
cgggagtttctcctcgcAGACGGGAGTTTCTCCTCGGGGTCGGAGCAGGA
GGCACGCGGA--GTGAGGCCACGCATGAGCGGACGCTAACCCCCACCCCA
aCCGCAAAGAGTCTACATGTCTAGGGTCTAGACATGTTCAGCTTTGTGGA
CCTCCGGCTCCTGCTCCTCTTAGCGGCCACCGCCCTCCTGACGCACGGCC
AAGAGGAGGGCCAAGAAGAAGGCCAAGAAGAAGACA
Sequence tree:
======
Tree constructed using UPGMA
(HsPrE1COL1A1:0.001846,
(BtPrE1COL1A1:0.000979,
CePrE1COL1A1:0.000979):0.000867);
Intron
Fuzznuc
########################################
# Program: fuzznuc
# Rundate: Sun 5 Feb 2010 15:47:16
# Commandline: fuzznuc
# -filter
# [-sequence] HsI1_part_COL1A1.fas
# -pattern @../../SP1_1.pat
# Report_format: seqtable
# Report_file: stdout
########################################
#======
#
# Sequence: HsI1_part_COL1A1.seq from: 1 to: 172
# HitCount: 2
#
# Pattern_name Mismatch Pattern
# V$SP1_Q4_01 2 NNGGGGCGGGGNN
#
# Complement: No
#
#======
Start End Pattern_name Mismatch Sequence
133 145 V$SP1_Q4_01 2 ATGGGGGCGGGAT
134 146 V$SP1_Q4_01 1 TGGGGGCGGGATG
#------
#------
#------
# Total_sequences: 1
# Total_hitcount: 2
#------
########################################
# Program: fuzznuc
# Rundate: Sun 5 Feb 2010 15:46:55
# Commandline: fuzznuc
# -filter
# [-sequence] BtI1_part_COL1A1.fas
# -pattern @../../SP1_1.pat
# Report_format: seqtable
# Report_file: stdout
########################################
#======
#
# Sequence: BtI1_part_COL1A1.seq from: 1 to: 173
# HitCount: 2
#
# Pattern_name Mismatch Pattern
# V$SP1_Q4_01 2 NNGGGGCGGGGNN
#
# Complement: No
#
#======
Start End Pattern_name Mismatch Sequence
59 71 V$SP1_Q4_01 2 TCTGGGCGGGATC
135 147 V$SP1_Q4_01 2 ATGGGGCGGAATC
#------
#------
#------
# Total_sequences: 1
# Total_hitcount: 2
#------
########################################
# Program: fuzznuc
# Rundate: Sun 5 Feb 2010 15:47:56
# Commandline: fuzznuc
# -filter
# [-sequence] CeI1COL1A1.fas
# -pattern @../../SP1_1.pat
# Report_format: seqtable
# Report_file: stdout
########################################
#======
#
# Sequence: CeI1COL1A1.seq from: 1 to: 173
# HitCount: 2
#
# Pattern_name Mismatch Pattern
# V$SP1_Q4_01 2 NNGGGGCGGGGNN
#
# Complement: No
#
#======
Start End Pattern_name Mismatch Sequence
59 71 V$SP1_Q4_01 2 TCTGGGCGGGATC
135 147 V$SP1_Q4_01 2 ATGGGGCGGAATC
#------
#------
#------
# Total_sequences: 1
# Total_hitcount: 2
#------
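The Mismatch column in the reports above can be reproduced by sliding the 13-base pattern along a sequence and counting disagreements at the non-N positions. The following sketch illustrates that idea only and is not fuzznuc's implementation. The names `count_mismatches` and `scan_pattern` are hypothetical, only N is treated as a wildcard (other IUPAC codes are not handled), and only the forward strand is scanned, as in the intron runs (Complement: No). The test fragment is taken from the human row of the intron alignment, with its single lower-case g upper-cased.

```python
from typing import List, Tuple


def count_mismatches(pattern: str, window: str) -> int:
    """Count positions where window differs from pattern; 'N' matches any base."""
    return sum(1 for p, w in zip(pattern, window) if p != "N" and p != w)


def scan_pattern(sequence: str, pattern: str,
                 max_mismatches: int) -> List[Tuple[int, int, int, str]]:
    """Return (start, end, mismatches, window) for each forward-strand hit, 1-based."""
    seq = sequence.upper()
    width = len(pattern)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        mismatches = count_mismatches(pattern, window)
        if mismatches <= max_mismatches:
            hits.append((i + 1, i + width, mismatches, window))
    return hits


# Fragment of the human intron 1 sequence (coordinates below are fragment-local).
fragment = "CCAGGGAATgGGGGCGGGATGAG"
intron_hits = scan_pattern(fragment, "NNGGGGCGGGGNN", max_mismatches=2)
# Under these assumptions the scan yields two overlapping windows,
# ATGGGGGCGGGAT (2 mismatches) and TGGGGGCGGGATG (1 mismatch).
```

The two windows and their mismatch counts agree with the two rows of the Homo sapiens intron report. If the fragment starts at position 126 of the 172-base sequence, as the alignment numbering suggests, local positions 8 and 9 correspond to the reported starts 133 and 134.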
Dialign
DIALIGN 2.2.1
*************
Program code written by Burkhard Morgenstern and Said Abdeddaim
e-mail contact:
Published research assisted by DIALIGN 2 should cite:
Burkhard Morgenstern (1999).
DIALIGN 2: improvement of the segment-to-segment
approach to multiple sequence alignment.
Bioinformatics 15, 211 - 218.
For more information, please visit the DIALIGN home page at
http://bibiserv.techfak.uni-bielefeld.de/dialign/
program call: dialign2-2 -n Hs_Bt_Ce_col1a1_intron.dfasta
Aligned sequences: length:
======
1) HsI1_part_CO 172
2) BtI1_part_CO 173
3) CeI1COL1A1.s 173
Average seq. length: 172.7
Please note that only upper-case letters are considered to be aligned.
Alignment (DIALIGN format):
======
HsI1_part_CO 1 aAGATGTCTA GGTGCTGGAG GTTAGGGTGT CTCCTAATTT TgagGTACAT
BtI1_part_CO 1 GAGATGTCTG GGCGCCGGAG GTTAGGGCGT ACCCTATTTT TACCGTACAT
CeI1COL1A1.s 1 GAGATGTCTG GGCGCCGGAG GTTAGGGCGT ACCCTATTTT TACCGTATAT
4888888888 8888888888 8888888888 8888888888 6222444444
HsI1_part_CO 51 TTCAAGTCTT GGGGGGGCCT CCCtt-CCAA TCAGCCGCTC CCatt-CTCC
BtI1_part_CO 51 TTCAGGTCTC TGGGCGGGAT CCCACGCCAA TCAGCCCCAC CCCATCCTCT
CeI1COL1A1.s 51 TTCAAGTCTC TGGGCGGGAT CtCACGCCAA TCAGCCCCAC CCCATCCTCT
4444444444 4444444444 4043334444 4444444444 4433337777
*1245
HsI1_part_CO 99 TAGCCCCGCC CCCGCCACCC CACCTGCCCA GGGAATgGGG GCGGGATGAG
BtI1_part_CO 101 TAGCCCCGCC CACGCCGTCC CACCTGCCCC GGGAAT-GGG GCGGAATCTG
CeI1COL1A1.s 101 TAGCGCCGCC CACTCCATCC CACCTGCCCC GGGAAT-GGG GCGGAATCTG
7777466666 6666666666 6666666666 6669990888 8888888888
HsI1_part_CO 149 GGCTGGACCT CCCTTCTCTC CTCC
BtI1_part_CO 150 GGTTGAACCT CCCATCTCTC CTCC
CeI1COL1A1.s 150 GGTTGAACCT CCCATCTCTC CTCC
8888888888 8888888888 8888
Alignment (FASTA format):
======
>HsI1_part_CO
aAGATGTCTAGGTGCTGGAGGTTAGGGTGTCTCCTAATTTTgagGTACAT
TTCAAGTCTTGGGGGGGCCTCCCtt-CCAATCAGCCGCTCCCatt-CTCC
TAGCCCCGCCCCCGCCACCCCACCTGCCCAGGGAATgGGGGCGGGATGAG
GGCTGGACCTCCCTTCTCTCCTCC
>BtI1_part_CO
GAGATGTCTGGGCGCCGGAGGTTAGGGCGTACCCTATTTTTACCGTACAT
TTCAGGTCTCTGGGCGGGATCCCACGCCAATCAGCCCCACCCCATCCTCT
TAGCCCCGCCCACGCCGTCCCACCTGCCCCGGGAAT-GGGGCGGAATCTG
GGTTGAACCTCCCATCTCTCCTCC
>CeI1COL1A1.s
GAGATGTCTGGGCGCCGGAGGTTAGGGCGTACCCTATTTTTACCGTATAT
TTCAAGTCTCTGGGCGGGATCtCACGCCAATCAGCCCCACCCCATCCTCT
TAGCGCCGCCCACTCCATCCCACCTGCCCCGGGAAT-GGGGCGGAATCTG
GGTTGAACCTCCCATCTCTCCTCC
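DIALIGN notes that only upper-case letters are considered aligned, so a pairwise identity computed from these rows should skip columns in which either residue is lower-case or a gap. The following is a minimal sketch of that calculation. The function `aligned_identity` is a hypothetical helper, and it assumes the two rows come from the same alignment block and have equal length.

```python
def aligned_identity(row_a: str, row_b: str) -> float:
    """Fraction of identical residues over columns where both rows are upper-case."""
    compared = 0
    identical = 0
    for a, b in zip(row_a, row_b):
        # Gaps ("-") and lower-case (unaligned) residues fail isupper() and are skipped.
        if a.isupper() and b.isupper():
            compared += 1
            if a == b:
                identical += 1
    return identical / compared if compared else 0.0


# Final 24-column block of the intron alignment above.
hs = "GGCTGGACCTCCCTTCTCTCCTCC"
bt = "GGTTGAACCTCCCATCTCTCCTCC"
ce = "GGTTGAACCTCCCATCTCTCCTCC"

hs_bt = aligned_identity(hs, bt)  # 21 of 24 columns identical
bt_ce = aligned_identity(bt, ce)  # all 24 columns identical
```

In this block the human row differs from the bovine and deer rows at three columns, while the bovine and deer rows are identical, which is consistent with the grouping of Bt and Ce in the sequence tree.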
Sequence tree:
======
Tree constructed using UPGMA
(HsI1_part_CO:0.012235,
(BtI1_part_CO:0.005867,
CeI1COL1A1.s:0.005867):0.006367);