TABLE 1. Oligonucleotides used in the RT-PCR, PCR, Northern blot, and 5´RACE.

______

Gene       RT-primer sequence 5´-3´    Forward primer sequence 5´-3´  Reverse primer 5´-3´

______

asr0689    -                           tggttagttgtagccacgctta         tag
asr0690    -                           cggcgaggttatttttgaag           tag
alr0691    tag + ctggggtggtcaatcaagtt  ttggcgaggataaccgatag           tag
alr0692    tag + cttgggataacgtcaaagtagaaaaaagggcgattgaggattt          tag
alr0693    tag + tacccttagcattctgtgggacaaaggaatttttcg                 tag
hypF.1     tag + ccttgatatggtgtcctgatcgactgaggaaattcgagtg             tag
hypF.2     tag + ttcatcaccacataccaagggaattggtggttttca                 tag
hypC       tag + gcggcgatttcttgtaataattcaccgacattaaccacaaa            tag
hypD       -                           ccaattgttgtttccggttt           tag
hypE       tag + tcgtccttttggatgagatttattttaattgcccgtggtga            tag
hypA       -                           tgaatatgctcaaggctcaaaa         tag
hypB       tag + tggaagcatccaaatgacaagggaacaggttgtcatttgg             tag
asr0701    tag + gctgggaaagcttgatatatacgctattacatcaaattct             tag
alr0702    tag + taaaccgctattggggtcagtggtgaagtgattgggatga             tag

NORTHERN BLOT

asr0689    tggttagttgtagccacgcttacccagacccaaaacaatagc
alr0693    tgggacaaaggaatttttcgaaaacttttctgccccaaca
hypF       aagggaattggtggttttcatcaaatgatgcaaaggcgta
hypA-hypB  tgaatatgctcaaggctcaaaatgtgggggtatttgattggt

IDENTIFICATION OF TRANSCRIPTION START POINTS, 5´RACE

asr0689    taagcgtggctacaactaacca
alr0693    tcaacaaccgacgattacca
alr0694    cagcatccacatcatccaac
asr0695    agcccaaagcgaagaataca

______

tag = CACACCACAACCACACGAC
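
The "tag +" notation in Table 1 can be made concrete with a short sketch. It assumes that "tag + sequence", written 5´-3´, means the tag sits at the 5´ end of the gene-specific part. The helper names are hypothetical, and only sequences that appear unambiguously in Table 1 are used. The second check compares the asr0689 5´RACE primer with the reverse complement of the asr0689 forward primer.

```python
# Sequences copied from Table 1; helper names are illustrative only.
TAG = "CACACCACAACCACACGAC"

# Translation table for complementing DNA bases (both cases).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tagged_rt_primer(gene_specific: str, tag: str = TAG) -> str:
    """Assumed layout: 5' tag followed by the gene-specific sequence."""
    return tag + gene_specific


# RT primer for alr0691 as listed in Table 1 (tag + gene-specific part).
alr0691_rt = tagged_rt_primer("ctggggtggtcaatcaagtt")
print(alr0691_rt)

# asr0689: forward primer (RT-PCR section) and 5'RACE primer.
asr0689_forward = "tggttagttgtagccacgctta"
asr0689_race = "taagcgtggctacaactaacca"
print(reverse_complement(asr0689_forward) == asr0689_race)  # True
```

Under this reading, the asr0689 5´RACE primer anneals to the same 22 nt as the forward primer, but on the opposite strand.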

TABLE 2. Annotation and functional domains of the open reading frames (ORFs) found in the extended hyp-operon and the closest homologue in Anabaena variabilis ATCC 29413 (Fig. 1A).

ORFᵃ     Annotationᵃ                               Functional domain/-sᵃ                                 Closest homologues in            Referencesᵇ,ᶜ
                                                                                                         Anabaena variabilis ATCC 29413ᵃ
                                                                                                         (% sequence identity)

asr0689  unknown protein                           2 transmembrane regions                               ava_4597 (94.3)                  UniProtKB/TrEMBL-Q8YZ01, IPR003439
asr0690  unknown protein                           2 transmembrane regions                               ava_4598 (98.6)                  UniProtKB/TrEMBL-Q8YZ00, IPR003439
alr0691  hypothetical protein                      TPR, Protein prenyltransferase                        ava_4599 (97.5)                  UniProtKB/TrEMBL-Q9WWQ1, IPR001440/PF00515, IPR008940
alr0692  similar to NifU protein                   NifU (C-terminal), Thioredoxin-like domains           ava_4600 (95.6)                  UniProtKB/TrEMBL-Q8YYZ9, PF01106, COG0694
alr0693  unknown protein                           NHL and TPR repeats                                   ava_4601 (96.9)                  UniProtKB/TrEMBL-Q8YYZ8, Q9WWQ1, IPR001258/PF01436
hypF     hydrogenase maturation protein            Hydrogenase expression/formation protein (HUPF/HYPC)  ava_4602 (91.5)
hypC     hydrogenase expression/formation protein  Hydrogenase expression/formation protein (HUPF/HYPC)  ava_4603 (96.3)
hypD     hydrogenase expression/formation protein  Hydrogenase formation HypD protein                    ava_4604 (95.6)
asr0697  probable 4-oxalocrotonate tautomerase     4-oxalocrotonate tautomerase                          ava_4605 (98.6)
hypE     hydrogenase expression/formation protein  Hydrogenase expression/formation protein HypE         ava_4606 (98.1)
hypA     hydrogenase expression/formation protein  Hydrogenase expression/synthesis protein, HypA        ava_4607 (89.4)
hypB     hydrogenase expression/formation protein  [NiFe]-hydrogenase/urease maturation factor,          ava_4608 (88.9)
                                                   Ni(2+)-binding GTPase
asr0701  unknown protein                           1 transmembrane region                                ava_4178 (44.0) alr1571 (46.0)ᵈ
alr0702  serine proteinase                         Serine proteinase                                     ava_4610 (96.9)

ᵃ Cyanobase (
ᵇ UniProt Knowledgebase (Swiss-Prot and TrEMBL) (
ᶜ Pfam (
ᵈ The orthologue, alr1571, present in Nostoc PCC 7120 has a sequence identity of 46% to asr0701 and a sequence identity of 99% to ava_4178.

TABLE 3. Conserved sequences in the intergenic regions of the extended hyp-operon in Nostoc PCC 7120 (Fig. 1A and Fig. 4).

______

Intergenic regions     Conserved sequences       Length  Appearances in the genome  Hairpin formation energy  Distance from translational
                                                 (nt)                               ΔG (kcal mol⁻¹)           start point (nt)

______

R1.1  hupS-asr0689     ttgtcagttgtcagttgtca      20      44                         no hairpin                476
R1.2                   ttgttagttgttag            14      11                         no hairpin                455
R1.3                   ccactgactactgac           15      14                         no hairpin                391
R2    alr0691-0692     agacgcgatt(c/a)atcgcgtct  20      34                         -10.5                     49
R3.1  alr0693-hypF     ggagggtttccctcc           15      19                         -6.1                      81
R3.2                   aacttttcaaga              12      21                         no hairpin                60
R4    hypF-hypC        attgcgaattg               11      79 + 26 (alpha plasmid)    -1.3                      49
R5.1  asr0701-alr0702  aaatccctatcagggattgaaac   23      32                         -5.8                      256
R5.2                   aaatccctatcagggattgaaac   23      32                         -5.8                      186
R6    alr0702-all0703  taggggtgtaggggt           15      61                         no hairpin                ᵃ

______

ᵃ Intergenic region downstream of the two ORFs alr0702 and all0703.
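
The hairpin entries in Table 3 can be related to the sequences with a minimal sketch that searches for the longest perfect inverted repeat (stem) around a loop. This is not a free-energy calculation and is not claimed to be the method behind the ΔG values, which the table does not specify. The function name, the Watson-Crick-only pairing, and the minimum loop length of 3 nt are assumptions made for illustration.

```python
# Watson-Crick pairs only (no G-U wobble); a deliberately simple model.
PAIR = {"a": "t", "t": "a", "g": "c", "c": "g"}


def longest_stem(seq: str, min_loop: int = 3):
    """Return (stem_length, five_prime_arm, loop, three_prime_arm) for the
    longest perfect inverted repeat whose loop has at least min_loop bases."""
    seq = seq.lower()
    n = len(seq)
    best = (0, "", "", "")
    for i in range(n):                  # outermost 5' base of the stem
        for j in range(n - 1, i, -1):   # outermost 3' base of the stem
            k = 0
            # Extend inward while bases pair and the loop stays long enough.
            while ((j - k) - (i + k) - 1 >= min_loop
                   and PAIR.get(seq[i + k]) == seq[j - k]):
                k += 1
            if k > best[0]:
                best = (k, seq[i:i + k], seq[i + k:j - k + 1],
                        seq[j - k + 1:j + 1])
    return best


# R3.1 (hairpin reported) and R1.1 (no hairpin reported) from Table 3.
print(longest_stem("ggagggtttccctcc"))
print(longest_stem("ttgtcagttgtcagttgtca"))
```

For R3.1 the sketch finds a 6 bp stem (ggaggg/ccctcc) around a ttt loop, whereas for R1.1 the longest stem is 2 bp. This is consistent with the "no hairpin" entry, although a stem count alone does not predict folding energy.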

TABLE 4. Orthologues of the five ORFs asr0689-alr0693 with conserved genomic arrangement in filamentous nitrogen-fixing cyanobacteria (Fig. 3).

______

Nostoc sp. strain PCC 7120        asr0689    asr0690    alr0691    alr0692    alr0693    identical genomic arrangement
Anabaena variabilis ATCC 29413    ava_4597   ava_4598   ava_4599   ava_4600   ava_4601
Nostoc punctiforme PCC 73102      NpR0366    NpR0365    NpR0367    NpR0364    NpR0363    identical genomic arrangement
Nodularia spumigena CCY9414       ORF2       ORF3       ORF1       ORF4       ORF5
Lyngbya majuscula CCAP 1446/4     ORF2       ORF3       ORF1       ORF4       ORF5       identical genomic arrangement as the subgroup above, with the exception of
Trichodesmium erythraeum IMS101   tery_3363  tery_3362  tery_3364  tery_3360  ᵃ          additional ORFs within the cluster

______

ᵃ The orthologue of alr0693, tery_0790, is positioned directly upstream of hypF in the hyp-gene cluster, located approximately 3.8 Mb away from hupSL and the putative maturation gene cluster of the small subunit.