The potential of nanopores for single-molecule, ultra-rapid sequencing of DNA

Bionanotechnology option -

Molecular pores in nanotechnology

Candidate no. : 22518

Word count : 3123

Introduction

Sequencing the human genome was a project that started 15 years ago using techniques pioneered by Sanger et al (1) and Maxam and Gilbert (2). The estimated cost of sequencing a similar mammalian genome five years ago was US$300 million (3), using equipment that can sequence ~30,000 bases per instrument per day at a cost of ~US$0.50 per nucleotide (4). Today, another human genome could be sequenced in about six months, at a cost of ~US$30 million (5). However, work is under way to investigate whether nanopores could in future provide a cheaper way to determine DNA sequences, at rates between 1000 and 10,000 bases per second (4).
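To put the quoted rates side by side, the following is a rough back-of-envelope sketch. It assumes a genome of about 3 × 10^9 bases read once with no redundancy; that figure and the function name are assumptions made for illustration and are not taken from the sources cited above.

```python
# Back-of-envelope comparison of the sequencing rates quoted in the text.
GENOME_BASES = 3_000_000_000   # ASSUMPTION: ~3e9 bases, not stated in the text
SECONDS_PER_DAY = 86_400


def days_to_sequence(bases, bases_per_second):
    """Days needed to read `bases` once at a constant rate (no redundancy)."""
    return bases / bases_per_second / SECONDS_PER_DAY


# Rates quoted in the text, converted to bases per second.
instrument_rate = 30_000 / SECONDS_PER_DAY   # ~30,000 bases per instrument per day
nanopore_slow = 1_000                        # lower nanopore estimate
nanopore_fast = 10_000                       # upper nanopore estimate

if __name__ == "__main__":
    for label, rate in [
        ("conventional instrument", instrument_rate),
        ("nanopore, 1000 bases/s", nanopore_slow),
        ("nanopore, 10,000 bases/s", nanopore_fast),
    ]:
        print(f"{label}: {days_to_sequence(GENOME_BASES, rate):,.1f} days")
```

Under these assumptions a single conventional instrument would need on the order of 100,000 days, whereas a single nanopore would need roughly 35 days at the lower rate and about 3.5 days at the higher rate.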

The fundamental principle behind nanopore DNA sequencing is that the nanopores behave as ‘Coulter counters’: macromolecules carrying a net electrical charge are electrophoretically driven through the nanopore by an electric potential applied across the pore. Ions flowing through this pore cause a detectable current. When macromolecules enter the pore they partially block it and reduce the flow of ions. This causes measurable transient drops in the current that can be monitored to determine characteristics of that macromolecule (6). The hypothesis is that as the DNA is driven through the pore, each nucleotide will have a different effect on the current. Thus one can determine the sequence from the changes in the current as the DNA passes through the pore. In order to do this, single nucleotide resolution is required: it must be possible to distinguish one nucleotide from the next in the sequence.
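The Coulter-counter idea can be illustrated with a minimal sketch that scans a sampled current trace for transient drops below a threshold. The function name, the 50% threshold and the sample values are all hypothetical, chosen only to show the principle; they do not describe the analysis used in any of the cited experiments.

```python
def find_blockades(trace, open_current, threshold=0.5):
    """Find transient current drops in a sampled trace.

    trace        -- list of current magnitudes (arbitrary units)
    open_current -- current magnitude of the unobstructed pore
    threshold    -- ASSUMED cut-off: a sample counts as blocked when it
                    falls below threshold * open_current

    Returns a list of (start_index, n_samples, mean_fractional_blockade).
    """
    events = []
    start = None
    for i, current in enumerate(trace):
        blocked = current < threshold * open_current
        if blocked and start is None:
            start = i                      # a blockade begins
        elif not blocked and start is not None:
            segment = trace[start:i]       # the blockade has just ended
            mean_current = sum(segment) / len(segment)
            events.append((start, i - start, 1.0 - mean_current / open_current))
            start = None
    # A blockade still in progress at the end of the trace is ignored,
    # because its duration is unknown.
    return events


if __name__ == "__main__":
    # Hypothetical trace: open pore at 100 units with two blockades.
    trace = [100, 100, 12, 12, 12, 100, 100, 5, 5, 100]
    for start, n, depth in find_blockades(trace, open_current=100.0):
        print(f"start={start} samples={n} blockade={depth:.0%}")
```

Each event is summarised by its duration and its depth, which are the two quantities used throughout the experiments discussed below.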

α-Hemolysin

The first stage in nanopore DNA sequencing was to find an appropriate nanopore. An obvious example was the α-Hemolysin (α-HL) protein from Staphylococcus aureus (figure 1). α-HL has properties that make it an ideal choice for experimental use: it is wide enough to accommodate passage of a single-stranded (ss) polynucleotide, and it can remain open for a long period of time (up to 24 hours) whilst remaining stable under a variety of ionic strengths, at temperatures approaching 100 °C (7) and at up to 65 °C in denaturing detergents (8). α-HL is also useful for future investigations because it tolerates radical alterations in its amino acid sequence. This tolerance could be used to engineer a pore that is better equipped for DNA sequencing; indeed, α-HL has already been modified to detect divalent metal ions (9) and small organic molecules (10).

Figure 1 (from (11)). The crystal structure of α-HL is known to 1.9 Å resolution (12) and has an aperture that is wide enough to accommodate DNA or RNA. α-HL is a heptameric protein that self-assembles in the lipid bilayer. It comprises three domains: the cap and the rim (comprising the cis ‘head’), and the (membrane-spanning, trans) stem. The trans side of the protein spans the membrane, whilst the cis side is the mushroom-shaped ‘head’ of the protein. The protein is ~10 nm high, with the pore running the length of the protein. This aqueous channel ranges in diameter from 1.5 to 4.6 nm, with the cis entrance being 2.6 nm and the trans entrance being 2 nm. Double-stranded (ds) polynucleic acids can enter the cis vestibule of the protein, but only ss-polymers can traverse the 1.5 nm diameter constriction and thread through the narrow stem region.

The first experiment

Kasianowicz et al (13) showed for the first time that a polynucleotide could be driven through a nanopore. It was reasoned that it should be possible for the polyanionic polynucleotide to be drawn through a continuously open channel by an applied trans-membrane voltage (indeed, the initial aim of the experiment was to prove this, with characterisation of the polymer almost an afterthought). Furthermore, it was postulated that due to the dimensions of the channel (figure 1), the polynucleotide would have to pass through as an extended linear chain. This passage of the polynucleotide could be detected by the partial blockage that it caused to the ion flow.

The system was set up with a solvent-free bilayer of diphytanoyl-phosphatidylcholine separating two buffer-filled compartments (1.5 ml of 1 M KCl and 5 mM Hepes at pH 7.5). Less than 1 μg α-HL protein was added to one compartment, which then reconstituted into the bilayer. After a channel had formed, the compartment was washed with fresh buffer to prevent further channel formation.

Upon application of a potential of -120 mV (with the cis side negative), a steady single-channel current ensued. With the addition of poly[U], the current dropped by 85-100% in transient blockages that lasted for hundreds to thousands of microseconds. Events on a similar timescale were seen for poly[C], poly[dT], poly[dC], and poly[dA,dT,dC]. The blockage times were proportional to polynucleotide length and inversely proportional to the applied voltage.

A profile of the duration of blockage revealed that the events fell into three groups (figure 2), one of which was of very short duration. The other two groups were attributed to the polynucleotide being threaded through the pore in different orientations, i.e. 5’ first or 3’ first (at the time there was no physical basis to explain this, but subsequent models (14) speculate that it is due to the polymers acting in a ratchet-like fashion).

Figure 2 (from (13)). The lifetimes of the blockages fall into three groups, the first of which (peak 1) was very short and was explained as the polynucleotide transiently blocking the entrance to the pore before dissociating again. The other two groups (peaks 2 and 3) represent polynucleotide passage through the pore, possibly in different orientations.

Deamer and Akeson (4) were sceptical of the notion that a seemingly small applied potential could capture the ends of individual nucleic acid molecules and draw them through a 1.5 nm pore. In order to show that it was indeed polynucleotide translocation that was causing the blockages, a further experiment was carried out comparing ssDNA to dsDNA (13). The dsDNA was shown to cause indefinitely long blockages (suggesting that it enters the cis vestibule but not the stem), whereas the ssDNA caused transient decreases in current. PCR analysis of the buffer compartments showed that the predicted amount of ssDNA (and none of the dsDNA) had translocated from the cis to the trans compartment.

Kasianowicz et al (13) then postulated that characteristics of the DNA could be determined, and that eventually sequencing could occur.

Characterisation of DNA

Earlier experiments had shown that polymers passing through pores could be characterised and could also help to reveal information about the pore (15) (16), and now work was being carried out to characterise polynucleic acids. The first experiments set out to distinguish different homopolynucleotides (14) (17) (18). It was seen that simply measuring the speed of translocation could be used to distinguish between some nucleotide species (poly[U] traverses the pore ~20 times faster than poly[dA] (14), and about ten times faster than poly[A] (17)). In a seminal experiment, Akeson et al (18) were the first to show they could use α-HL to distinguish different polynucleotides from their current blockade characteristics (length of blockage and degree of blockage). They found that the blockages caused by poly[A] were smaller than those caused by poly[C] (~85% decrease compared to ~95%), and were also longer than those caused by poly[C] (22 ± 6 μs compared to 5 ± 2 μs per nucleotide). The current amplitudes for poly[A] and poly[U] were virtually indistinguishable, but poly[U] blockades were typically shorter. It was noted that some of the results went against prior conceptions based upon the size of the purine/pyrimidine. For example, cytosine is a much smaller moiety than adenine, but causes a larger blockage. This anomaly was attributed to the secondary structure of the RNA. Poly[A] and poly[C] form helices with diameters of 2.1 and 1.3 nm respectively. Thus the poly[A] helix is too big to fit through the α-HL 1.5 nm aperture, so has to unwind (which explains why poly[A] translocation takes longer). The helical poly[C] therefore obscures a larger part of the channel than the extended-chain poly[A]. Another possibility was that cytosine actually interacted with α-HL in some way. This theory was disproved by also testing poly[dC]. Poly[dC] cannot form a helix, and thus should cause a smaller decrease in the current, which was observed.

An additional experiment was performed using a polynucleotide that contained 30 adenine bases and 70 cytosine bases. It was hoped that the method would produce a bilevel current that would indicate the passage from one nucleotide species to the other. This was indeed the case (figure 3): a ~95% blockage (solid arrow) that reduced to a ~85% blockage (dashed arrow). This also shows that the poly[C] end (3’) entered the pore first, as would be expected due to its narrow secondary-structure helix (5’-end-first translocations were also seen, but resulted in permanent blockages). Akeson et al (18) were the first of many to realise that if traversal time could be slowed, then single nucleotide detection may be achievable.

Figure 3 (adapted from (18)). The current profile of the experiment carried out using the A(30)C(70)Gp polynucleotide. Several bilevel blockage events can be seen, with the two levels occurring at ~95 and ~85% blockage. The change in level indicates the transition from poly[C] to poly[A].
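As a rough illustration of why a bilevel signal is expected, the sketch below combines the per-nucleotide durations and blockade depths quoted above for the poly[A] and poly[C] homopolymers. It assumes, purely for illustration, that each segment of the block copolymer behaves like the corresponding homopolymer; the resulting durations are estimates under that assumption and are not values reported by Akeson et al (18). The names used are hypothetical.

```python
# Values quoted in the text for the homopolymers (18).
HOMOPOLYMER = {
    "A": {"us_per_nt": 22.0, "blockade": 0.85},
    "C": {"us_per_nt": 5.0, "blockade": 0.95},
}


def expected_levels(segments):
    """Estimate the blockade level and duration of each segment.

    segments -- list of (base, count) in the order they enter the pore.
    ASSUMPTION: each segment behaves like its homopolymer.
    Returns a list of (base, fractional_blockade, duration_us).
    """
    return [
        (base, HOMOPOLYMER[base]["blockade"], count * HOMOPOLYMER[base]["us_per_nt"])
        for base, count in segments
    ]


if __name__ == "__main__":
    # 3' (poly[C]) end first, as observed for the bilevel events.
    for base, depth, duration in expected_levels([("C", 70), ("A", 30)]):
        print(f"poly[{base}] segment: ~{depth:.0%} blockade for ~{duration:.0f} us")
```

On this simple estimate the poly[C] segment would contribute about 350 μs at the deeper level and the poly[A] segment about 660 μs at the shallower level, which is consistent in order with the ~95% to ~85% transition described above.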

Characterisation of the same bases has also been carried out in DNA (19). Here the two species are distinguished using three parameters: the most probable translocation current, the most probable translocation duration, and the characteristic dispersal values for individual translocation durations. The translocation durations were plotted against blockade current to produce ‘event diagrams’, in which poly[dA] and poly[dC] fall into two distinct groups with only 1% overlap. This could also be used to distinguish poly[dAdC]50 and poly[dA50dC50].

These experimental results prove that detailed characterisation of polynucleotides is possible, and that further developments could lead to single nucleotide resolution.

Single nucleotide resolution

Single nucleotide resolution is the ultimate goal that will lead to sequencing. The first of several experimental procedures in which it was claimed that polynucleotides with only one base difference were distinguishable was carried out by Howorka et al (20). Pores were engineered with an oligonucleotide linked to a cysteine residue on α-HL. The result is a nanopore with a piece of ssDNA covalently attached within the cis vestibule of α-HL. It was shown that lengths of DNA that were complementary to the tethered oligonucleotide could be distinguished from those with a single-base mis-match by observing their current trace. Furthermore, an oligonucleotide (of nine bases) in which the final three bases (positions seven, eight, and nine) were unknown was tethered inside the α-HL, and the final codon was sequenced by using an array of oligonucleotides (seven bases long) that each had a different base at position seven. Whichever one of these produced a trace indicative of a complementary sequence would reveal which base occupied position seven. The same was repeated for positions eight and nine. Thus one could say that α-HL was used for sequencing of DNA, though it was not ‘single-molecule’, and in no way ‘ultra-rapid’. This technique used the duplex-formation properties of the oligonucleotides, rather than the current-disruption properties of different bases, to determine the sequence. Indeed, the authors proposed that the modified nanopore would be a useful tool with which to study DNA duplex formation in detail.

A similar approach, again based on duplex-formation properties, was carried out by Deamer and Branton (21), who again demonstrated ‘single-base resolution’. This time the DNA duplex was in the form of a blunt-ended hairpin. The hairpin was used in an attempt to keep the DNA in the α-HL pore for longer, and thus slow down the translocation. Hairpins were designed so that only intramolecular interactions occurred (22), and they initially used a six-base-pair stem with a four-T loop (23). As the ds-hairpin is too wide to pass through the α-HL 1.5 nm aperture, translocation can only occur upon spontaneous dissociation of all the hydrogen bonds (caused by a force of ~20 pN exerted by an applied voltage of 125 mV). Blunt-ended DNA hairpins of stem length varying from three to eight bases were used, and it was observed that with each base addition the size of blockage was increased. The duration of the blockage also increased with stem length, and correlated well with the free energy of hairpin formation. Hairpins containing a mis-match were produced, and the blockage duration was decreased from ~1 s to 10 ms. Thus, theoretically this technique can distinguish two DNA molecules that differ in only one nucleotide; however, the authors admit that these results do not lead to a method of sequencing. Further investigations into DNA hairpins (24) (25) have found that the ionic current signature whilst the hairpin is in the cis vestibule depends on the number of hydrogen bonds within the terminal base pair, the stacking between the terminal base pair and its nearest neighbour, and the 5’ vs 3’ orientation (24). Thus all four combinations of base pairs can be distinguished. Recently, non-blunt-end hairpins have been investigated (25). These showed that hairpin unzipping times decrease as follows: 8 bp hairpin > 8 bp hairpin with a single mis-match > 7 bp hairpin. The results show agreement with dissociation timescales of hairpins in bulk solution (26), which suggests that hairpin stability is not affected by possible DNA-pore or electrostatic interactions.

Nakane et al (27) also report results that, as they describe it, demonstrate a ‘proof-of-concept’ for a single-molecule oligonucleotide sensor capable of distinguishing short oligonucleotides with single base pair resolution. This involved a piece of ssDNA biotinylated on the 5’ end to prevent it from passing completely through the pore (figure 4). This probe is driven through the α-HL pore to a group of target ssDNAs on the other side of the membrane. The two form a duplex, and then the potential is reversed, causing the withdrawal of the biotinylated DNA back through the pore. The time taken for the probe to withdraw depends upon the strength of the duplex interactions. Thus, again, matched and mis-matched DNA can be discriminated. Although the authors have ‘proved-their-concept’, it offers little in the advancement towards sequencing. Furthermore, the potential for target DNA segments to move back across α-HL (trans to cis) is not accounted for or even mentioned.

Figure 4 (adapted from (27)). The biotinylated ssDNA is used to probe the pieces of ssDNA on the other side of the membrane. Upon the reversal of the potential, the probe DNA moves back through the pore, but is hindered by the duplex that is formed. Complete probe DNA withdrawal can only occur when the duplex has dissociated, thus a completely complementary sequence will require a longer time to dissociate than a sequence that contains a mis-match.

The authors did, however, proclaim that higher specificity had been achieved in nanopore-based sensors by incorporating a probe molecule that is permanently tethered to the interior of the pore (a concept that has since been used in further experiments (28) (29)). This counteracts an underlying problem in α-HL pore transduction: the fact that the DNA is driven through the pore too quickly. Several authors (18) (30) (11) (4) have recognised that in order to obtain single nucleotide detection the traversal time needs to be slowed. It has been shown both through Molecular Dynamics (MD) simulations (31) and experiments (32) that the dynamics of DNA translocation are sensitive to the magnitude of the applied electric field. However, one cannot simply reduce the voltage, as it still needs to be kept high enough to overcome backward movement of the polynucleotide. The reason for this problem is that the number of ions involved in the transition between one base and the next is only ~100 ions per microsecond, and the time interval for a measurement is just a few microseconds, so the difference is lost in the noise (4). One approach to slowing down translocation was to use DNA that formed a hairpin, causing it to remain inside α-HL, as described previously (21) (24) (25). The other, more recent approach is to use DNA that remains permanently within the pore (28) (29) by forming a rotaxane (figure 5) (28) or a pseudorotaxane (29).
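The scale of the signal can be made concrete with a short conversion from ion flux to current. This sketch reads the ~100 ions per microsecond as a difference in ion flux between neighbouring bases and assumes monovalent ions (one elementary charge each); the 3 μs window is a hypothetical stand-in for "a few microseconds". None of these assumptions comes from the cited work.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per monovalent ion


def ions_per_us_to_picoamps(ions_per_us):
    """Convert an ion flux (ions per microsecond) to a current in pA.

    ASSUMPTION: every ion carries exactly one elementary charge.
    """
    ions_per_second = ions_per_us * 1e6
    amps = ions_per_second * ELEMENTARY_CHARGE_C
    return amps * 1e12


def ions_in_window(ions_per_us, window_us):
    """Number of ions that make up the signal within one measurement window."""
    return ions_per_us * window_us


if __name__ == "__main__":
    flux = 100.0    # ions per microsecond, from the text
    window = 3.0    # microseconds, HYPOTHETICAL measurement window
    print(f"current difference: {ions_per_us_to_picoamps(flux):.1f} pA")
    print(f"ions per window:    {ions_in_window(flux, window):.0f}")
```

Under these assumptions the difference between neighbouring bases corresponds to roughly 16 pA, carried by only a few hundred ions per measurement, which illustrates why slowing translocation (and so lengthening the measurement window) is seen as essential.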

Figure 5 (adapted from (29) and (28)). The ssDNA was prevented from leaving the pore in either direction, through addition of streptavidin to the PEG region on the trans side and the formation of a stable DNA hairpin on the cis side (right). Ashkenasy et al (29) produced a pseudorotaxane by engineering DNA with a stable terminal hairpin that holds the polynucleotide in the α-HL pore (left).

The goal of Ashkenasy et al (29) was to determine which part of the DNA encased in the α-HL pore gave rise to the signature ionic current. In order to do this, the typical current of poly[dA] (residual current ‘IR’ of ~22%) and poly[dC] (IR ~31%) was measured. They knew that the specific region ‘recognised’ by α-HL was ~20 nucleotides away from the end of the hairpin, so poly[dC] was produced with a single A that varied in position from 18 to 22. Multiple current readings were then taken, each event was characterised as either ‘A-type’ (IR < 27%) or ‘C-type’ (IR > 27%), and the percentage of each type was calculated. For A-positions 18, 19, 21, and 22, the C-type events were in the large majority (> 65%); for position 20, however, there was a majority A-type signal. Thus, it is at position 20 where nucleobase ‘sensing’ occurs. This recognition site is near the trans opening of α-HL, and is consistent with results where tethered DNA was used to probe the interior of the pore (16) and with MD simulations (33). It is worth noting that this recognition site is not the 1.5 nm limiting aperture at the trans end of the stem, as one might expect. These results indicate that if the translocation rate of the DNA can be slowed (in this case stopped), then single base discrimination and thus sequencing may be possible.
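The event classification described above can be sketched in a few lines. The 27% threshold is taken from the text; the function names and the sample readings are hypothetical, and the treatment of a reading of exactly 27% (assigned here to C-type) is an arbitrary choice, since the text does not define it.

```python
from collections import Counter

A_TYPE_THRESHOLD = 27.0  # percent residual current, threshold quoted in the text


def classify_event(ir_percent):
    """Label one event by its residual current (IR, in percent).

    IR < 27% is 'A-type'; anything else is 'C-type'. Exactly 27% is
    assigned to 'C-type' here as an ARBITRARY tie-break.
    """
    return "A-type" if ir_percent < A_TYPE_THRESHOLD else "C-type"


def type_percentages(ir_values):
    """Percentage of A-type and C-type events in a list of IR readings."""
    if not ir_values:
        raise ValueError("at least one event is required")
    counts = Counter(classify_event(v) for v in ir_values)
    total = len(ir_values)
    return {label: 100.0 * counts[label] / total for label in ("A-type", "C-type")}


if __name__ == "__main__":
    # HYPOTHETICAL readings for one A-position; not experimental data.
    readings = [21.0, 23.5, 30.0, 32.0]
    print(type_percentages(readings))
```

Repeating this for each A-position and comparing the percentages is, in outline, how a majority A-type signal at one position would stand out from the C-type majority at its neighbours.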