London Calling 2021
2021/05/20
James Clarke
- Designed with flexibility in mind; do not have to batch samples to run (a deliberate feature of the technology)
- Run 48 flow cells as a benchmark; last done in late November, managed 10 Tb from a single PromethION instrument
- median 208 Gb, maximum 242 Gb
- People use the PromethION in different ways, may want to add sample to flow cell and walk away
- Benchmarked that as well, 10kb N50 library, 220 Gb over 96 hours (single flow cell)
- Ran 30kb N50, 220 Gb output, with interaction (2 washes)
- For some people, ultra-long reads are essential
- 100 Gb from a single flow cell (100kb N50), 4 washes
- New flow cell record (245 Gb), beating ONT's internal record
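As a rough sanity check on the benchmark figures above, 48 flow cells at the median output of 208 Gb come to roughly 10 Tb. The sketch below assumes every flow cell yields the median (a simplification) and uses a 3.1 Gb human genome size, which is an assumption and not a figure from the talk.

```python
FLOW_CELLS = 48          # flow cells on one PromethION (from the notes)
MEDIAN_GB = 208          # median output per flow cell (from the notes)
HUMAN_GENOME_GB = 3.1    # assumption: approximate human genome size


def total_output_tb(flow_cells, gb_per_cell):
    """Total output in Tb if every flow cell produced gb_per_cell."""
    return flow_cells * gb_per_cell / 1000


def genomes_at_depth(total_gb, depth, genome_gb=HUMAN_GENOME_GB):
    """Number of genomes that total_gb covers at the given depth."""
    return total_gb / (depth * genome_gb)


total_tb = total_output_tb(FLOW_CELLS, MEDIAN_GB)
print(f"Estimated total: {total_tb:.2f} Tb")
print(f"Human genomes at 30x: {genomes_at_depth(total_tb * 1000, 30):.0f}")
```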
PromethION / Accuracy
- Over last 2 years, have seen a real jump in raw read accuracy (92% -> 98%)
- Noticed when started seeing accuracy improvements that MinION was doing better than PromethION (shouldn't be happening)
- Found some noise from instrument was propagating onto the chip
- Re-laid out the chip, removed the ceiling of accuracy, returned to levels comparable to MinION
- Algorithms group have pushed raw read accuracy up again
- Absolute top-end accuracies, will start to see a slight wall
- PromethION is better than MinION in terms of signal-to-noise
- Now getting raw read modal accuracies of 99.3%
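The accuracy percentages above map onto the Phred "Q" scale used elsewhere in these notes (Q20, Q30) through the standard relation Q = -10 * log10(error rate). A minimal sketch of that conversion, with function names chosen here for illustration:

```python
import math


def accuracy_to_phred(accuracy):
    """Convert a fractional accuracy (e.g. 0.98) to a Phred quality score."""
    error = 1.0 - accuracy
    if not 0.0 < error <= 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10.0 * math.log10(error)


def phred_to_accuracy(q):
    """Convert a Phred quality score to a fractional accuracy."""
    return 1.0 - 10 ** (-q / 10)


# Accuracies quoted in the notes above
for acc in (0.92, 0.98, 0.993):
    print(f"{acc:.1%} accuracy is about Q{accuracy_to_phred(acc):.1f}")
```

On this scale, Q20 corresponds to 99% accuracy and Q30 to 99.9%.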
Flongle
- Take all the expensive bits, put them into an external instrument, so flow cell can be as cheap as possible
- Engineering has worked on improving process cycle times, a lot of inspection, improve consistency of flow cells
- Unfinished business: cheapness of flow cell
- Flow cell still contains a silicon sensor chip
- Team has been pushing over the last 6-7 months, can now go to plastic instead of silicon
- Flongle chip is the same format as the silicon, so will fit into existing adapters
- Flongle adapter will take a silicon or plastic flow cell
- Have been able to get 1 Gb out of a single Flongle
- Have been looking at accuracy; plastic is better than silicon
- Have reduced sensor metal, should also improve accuracies
- When moving to Q20 chemistries and algorithms, will see accuracies improve again
Flongle Flush
- Want to remove the flush
- Electrochemistry relies on platinum, needs mediator to work with that
- New material: silver
- Silver can hold a stable potential without using a mediator
- Early data is looking quite good
- With silver, wells are running longer, have got to 30 hrs
- In terms of accuracy, silver is looking pretty good
- Mediators / wells hold the technology back
- Voltage sensing is a paradigm shift, looking at a local perturbation to a field
- MinION, max theoretical output 100,000 channels, 3.9 Tb / day from a single flow cell
- Assuming 60% pore utilisation, 58 Gb / hr
- Optimised at 600 b/s & 80% pores, 138 Gb / hr
- Single human at 30x in 60 mins
- $10 human genome, from a flow cell perspective
- Very strong six months; last time had built an instrument, a bit noisy, but achieved squiggle
- Now taking it all the way through to basecalling
- Need overhang structures (getting rid of wells)
- Need a middle layer, a fluidic resistor, looking to integrate those with the rest of the chip
- Got prototype ASIC in December, learning functionality and layout
- Next 6 months, take layers, integrate together, make breadboard instrument, use MinKNOW
- Train model, use trained model for basecalling
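The voltage-sensing throughput figures above can be approximated with an idealised model: channels x utilisation x bases per second x time. In the sketch below, the 450 b/s speed behind the theoretical maximum and the 3.1 Gb human genome size are assumptions, not figures from the talk. With those assumptions the model reproduces the 3.9 Tb / day figure; the per-hour figures quoted in the talk are lower than this idealised model gives, presumably because they include overheads not recorded in these notes.

```python
def throughput_gb_per_hour(channels, bases_per_second, utilisation=1.0):
    """Idealised output in Gb/hr: every active channel sequences continuously."""
    return channels * utilisation * bases_per_second * 3600 / 1e9


def hours_for_coverage(gb_per_hour, depth, genome_gb=3.1):
    """Hours needed to reach `depth` coverage (genome_gb is an assumption)."""
    return depth * genome_gb / gb_per_hour


CHANNELS = 100_000      # from the notes
ASSUMED_SPEED = 450     # assumption: bases per second per channel

per_day_tb = throughput_gb_per_hour(CHANNELS, ASSUMED_SPEED) * 24 / 1000
print(f"Theoretical maximum: {per_day_tb:.1f} Tb / day")

# Using the 138 Gb / hr figure quoted in the talk
minutes = hours_for_coverage(138, depth=30) * 60
print(f"30x human genome at 138 Gb / hr: about {minutes:.0f} minutes")
```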
Stuart Reid
- This is what you can do now
- Thinking about nanopore signal, 1D time series of current measurements
- Employ RNN to decode signal
- Latest basecallers now hit raw read accuracy in excess of 99% single molecule
- Just basecallers alone, R9 from 90% to 98% (5-fold reduction in error rate in two years)
- Now have 3 models: fast (keeps up on all ONT devices, 96%); high (97.8%); super (98.3%)
- R9 and R10 both achieve those accuracies
- Still working on R10
- With nanopore you can directly see modifications
- Also get per-read long-range connections
- Working on standards and genome browsers
- Screenshot of JBrowse prototype, can see long-range connections
Consensus
- Taking a step back, looking at assemblies themselves
- Can get fully-accurate assemblies
- Offer consensus-polishing models for all base callers
- A lot of genomics is about variation: SNPs, INDELs, structural variants, base modifications
- With Nanopore, can get all variant types from a single run
- Can do better with DeepVariant (all the 9s)
- Believe we are the gold-standard in methylation detection
- Combine Nanopore long-read with Pore-C, get fully haplotype-resolved assemblies
- Barcoding with Nanopore is very accurate; can mix samples together with confidence
Q20 chemistry
- Relatively small improvement, E8.1 motor, compatible with R9 and R10 flow cells
- Now getting raw read modal accuracies > 99.3%
- Has been with developers for a few months now
- Q20 data base calls are now at the level that downstream tools just work
- LongShot SNP caller getting 99.7% F1
- Q50 down to 10X depth
Usability / software features
- A lot of efforts focused on bioinformatics
- Workflows delivered by Epi2Me Cloud
- Integrating Epi2Me with MinKNOW
- Bioinformatics is as much about data exploration as data analysis
- MinKNOW is designed to integrate with existing lab environment
- Offer a simpler spreadsheet-based version, can assign alias to each barcode
- Adaptive sampling, can get up to 10 times enrichment, a function of read length, proportion
- Enrichment / depletion is fully-released on GridION and MinION
- Can roll-your-own analysis
- ReadFish (basecall driven), UNCALLED (signal driven)
- Base calling choices, can be made at runtime, can choose in UI what you want
- Super accuracy 3X more intensive than high-accuracy basecalling
- Now re-enabling cloud base-calling, will make it easier to run research basecallers
- Working on updating laptop specification
- Until we release that, I highly recommend reading anything Miles Benton has written on the subject
- Trying to get Guppy working with whatever GPU you have
- Working on network storage, security
Rosemary Sinclair Dokos
- Products: broken into 3 areas - preparation, sequencing, analysis
- Enjoyed working with all of you over the last few years to come up with solutions
- Have a really great core technology enabling new applications
- Want capital-free technology with fair pricing
- Want to really engage with the community
- Scalable devices, available as starter packs
- All users use tech for different things: amplicon sequencing, whole-genome HTS, ultra-long
- Have system that has high data output capacity
- Irrespective of where you are, getting very competitive price per Gb per flow cell
- $3 / Gb pricing, very competitive
- Continuous upgrade of devices / platforms
- Biggest difference comes from the sequencing chemistry, kits, and software
- Someone who bought a Mk1b in 2018 is getting much better throughput and accuracies than they did then
- For people who have built teams: you normalise and get a really well-performing team at the end
- In the shop, we will be labelling products as: developer access only, early access, release phase, or fully released
- Regarding communication and feedback: have updated the Feature Request pinboard
- Improved backend that communicates with the development teams
- All existing requests have been migrated over
Sample
- Huge thank you to everyone who has been working on COVID over the last year
- Have been making sure that you have everything you need, everything is in stock
- Brought out AMX2 expansion pack
- MinIONs and GridIONs are all in stock, so new users can start quickly as well
- Continuing to support ARTIC classic methods, but now with a 96-barcode protocol
- Much simpler user journey with Epi2ME ARTIC protocol
- There will soon be RT-PCR expansion packs (have RT-PCR enzymes)
- Have continued to work and develop VolTRAX for SARS-CoV-2
- In use by developers in field, evaluating to make sure it runs with all the needs and requirements
- VolTRAX V2b: PCR-enabled upgrade
- Upgraded blue cartridge, calibrated for PCR
- Works initially for SARS-CoV-2, but will work with others later
- Ultra-long read kit earlier this year, excited to see everything the community has been doing with this
- Updating our nanopore documentation centre
- Other kits in the pipeline: Native barcoding kits (kit 10)
- Will be bringing 24 barcodes into 1 kit, combining with the sequencing module, + Ampure beads
- After 24, 96 will follow. Moving to a single-use plate
- Q20 chemistry, very exciting; essential to get 99%+ accuracy
- Has Q20 motor, all other ligation sequencing kit components
- Focusing on R9.4.1 with Kit 10
- Focusing on R10.3 with Q20
- Gordon mentioned an increasing number of large genome projects
- Want automation and HT workflows
- Want to start sequencing two genomes in a single flow cell
- Combining 96 barcodes with XL kits
- On automation side, team developing on OpenTrons, Hamilton, Agilent
- Engineering team working on automated priming
Sequence
- Flongle - very versatile platform
- Now seeing improved robustness in the field
- Glass vials improved results
- Moved to a pack of 12 flow cells
- Reconfigured all flow cells, product now fully in stock
- High-output PromethION platform; a PromethION with current hardware
- R10 implementations need a hardware upgrade (>99%), need PromethION upgrade for Q20 chemistries
- Needs up-to-date software license
- Language matters for allowing anyone to sequence anywhere
- Next release of MinKNOW will have a Chinese UI
Analysis
- Now quite a few configurations to run
- ONT ship all different models on all software
- Fast basecallers are now touching 96% accuracy
- Users who want high-accuracy models
- Pairing up with NVIDIA for DGX Station A100 (2.5 PFLOPS of compute) for $100k list price
- Epi2ME Labs came out about 1 year ago
- All tutorials about how to handle the data
- Some users really like results and want to scale
- Epi2ME team working on Nextflow workflows
- Not a black box, still understand what's happening on the data
- Can push to the cloud
- Epi2ME basecalling; we see reams and reams of tweets from people who don't have the right hardware
- Will be providing a lot more information about fair usage policies at launch
- ONT have come another step further
- Can do high outputs, can do low outputs
- Release plan:
- Guppy 5 May/June (Bonito CRF)
- MinKNOW G Barcode balancing July / August
- Midnight expansion kit May / June
- Q20 kit (Early access) June / July
- Multiplex ligation (research) August / September
- v10 cDNA kit August / September
- 24 cDNA barcodes September
Clive Brown - Nobody Expects the Strandish Exposition
New / Interesting / Novel statements [from David Eccles' perspective]
- 98.3% modal accuracy with super-accuracy base caller from existing R9.4 reads (e.g. from 2017 onwards)
- Q30 modal duplex reads with some samples using a trained Bonito basecaller
- trans-side molecule lock improves efficiency / sensitivity 200 times, increases duplex proportions to 40-50% (from 1-4% with current sequencing systems)
- Alternate "outie" DNA sequencing which has the sequencing adapter on the distal end, unzips prior to sequencing, allows approximate length to be determined prior to sequencing (e.g. for adaptive sequencing that filters by length)
- outie sequencing also involves a motor stall prior to ejection; this can be exploited to sequence forward and backwards many times to improve base call accuracy
- Flongle will be shifting to a silver chloride chemistry that doesn't need priming, and has no substantial ionic depletion over time
- Flongle will be shifting to a plastic adapter, giving slightly better accuracy than the existing silicon (as well as a cheaper flow cell)
- P2 PromethION - two PromethION flow cells on a single device; there's a P2 [standard] that includes a compute unit ($60k USD, with 48 flow cells included), and a P2 solo that will plug into a GridION or desktop computer (no price at the moment).
Accuracy
- Signal from ions moving through the pore; research effort in basecalling that pushes the frontier.
- Single-molecule readout, a lot of development has gone into extracting the signal.
- Significant advances on the software side that have driven up accuracy.
- From 2020, moved to CTC-style methods that learn to label as the signal is processed.
- More recently, moving onto self-supervised methods.
- A lot of significant developments with algorithms team and machine learning team.
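To illustrate what "CTC-style" means for decoding, here is a toy greedy decoder: pick the best-scoring label at each time step, collapse consecutive repeats, then drop blanks. This is a generic sketch of CTC greedy decoding on made-up scores, not ONT's basecaller; the alphabet ordering, the `one_hot` helper, and the example path are all assumptions for illustration.

```python
ALPHABET = ["-", "A", "C", "G", "T"]  # index 0 is the CTC blank (assumed ordering)


def ctc_greedy_decode(scores):
    """Decode a (time x labels) score matrix with the CTC collapse rule."""
    # Best label index at each time step
    best = [max(range(len(row)), key=row.__getitem__) for row in scores]
    decoded = []
    previous = None
    for label in best:
        # Emit only when the label changes and is not the blank
        if label != previous and label != 0:
            decoded.append(ALPHABET[label])
        previous = label
    return "".join(decoded)


def one_hot(path):
    """Hypothetical helper: build a toy score matrix from a label path."""
    return [[1.0 if symbol == s else 0.0 for symbol in ALPHABET] for s in path]


# The blank between the A's is what allows a repeated base to be called
print(ctc_greedy_decode(one_hot("AA-ACC-G")))
```

Real basecallers use a beam search or CRF decoding rather than this greedy rule, but the collapse step conveys how a per-sample labelling becomes a base sequence.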
Bonito
- Where developments are showcased
- On existing chemistry, baseline 98.3-98.4%
- Unfiltered data aligned, R9.4.1; getting better alignment with R10
- Chemistry just coming out now called Q20+; with further software improvements, expect more improvements.
- Highly performant; will improve further when methods are tweaked. Working on improving homopolymer performance with R10
- Modal 99.3%, working on getting that better
- People have been able to replicate this performance in field
- As of today, can register interest in Q20+; early access in June
- Fuel fix; high capture adapter
- Tweak adapter
- Change to enzyme
- Duplex capable
- Other ligation components
- Have worked on improving PromethION electronics; upgrading PromethION boxes
Duplex reads
- Early on talked about sequencing both strands
- ONT story is about circling back to something until it works
- Original version was 2D; had some good features, not performant in field
- Worked on 2DC (not released); ditched 2D due to secondary structures on trans side; moved to 1D^2, canned that as well
- Dragged someone into the office, looked at 1D "follow on", second strand will follow on through the hole 1-4%; trying now to get the complement to flow through.
- First strand captured. When it's through, the second strand will follow it
- Sick and tired of talking about single pass, etc. Will change to "simplex" / "duplex"
- Need to get fully symmetrical ends; need to heal any nicks that will terminate translocation
- One innovation being worked on is making sure that the sample only tethers to the membrane
- Now have a system that has lock molecules on the trans side of the pore, 200X improvement in efficiency, much higher sensitivity
- Large uplift in the duplex pairs on the trans tether; now getting about 40% duplex, 50% as highest yield
- See very consistent region of both strands all the way through the run, a nice clockwork architecture
- Another innovation: upgraded enzyme to E8.1; substantially improved the movement quality on the second strand; have to get both movements very good
- Complement chemistry is almost the same as the forward chemistry
- Other side effect: amount of improvement; duplex rate goes up as less is put on the pore
- Standard ligation prep kit; performs very well with long fragments
- Ultra-long kit, not optimal, but consistent performance with fragment length; probably down to symmetrical ligation
- Would be good to work with other companies to improve the efficiency
- Longest duplex pair to date is 442 kb; no degradation with fragment length
Duplex base calling
- Joint decoder developed from first strand and second strand; derivative implemented in bonito, touching Q30 modal duplex accuracy, with 1/3 reads above Q30, most reads above Q20. Now overlapping the other long read platform, and pushing the market leading platform
- Not from 30 copies (like PacBio); not from thousands (like Illumina)
- Everybody feels these graphs are going to shift to the right with software improvements
- Other key differentiator: length of duplex read doesn't degrade quality
- Longest duplex Q30: 156 kb
- Bioinformatics group have made a pileup, looking at Q30 is nice, able to see somatic low-frequency calling; should significantly boost a lot of the assembly work that's happening
- I don't like comparative marketing; but did it anyway comparing with another platform; can't see any differences
- 2 copies, don't have concomitant loss of throughput
- Anticipating quite an aggressive launch with duplex over the summer
Moving at any size
- Going back to 2012; without motor DNA zips through at a tremendous speed
- Relationship between time taken for molecule to go through the pore and the size of the molecule; unzip speed and fragment size are correlated
- Two ways to go through a pore; method at the moment is "inny" - DNA caught, motor on front, motor brakes DNA as it goes through the pore
- Other way is to have the motor on the back end; unzip first, catch motor on the end
- Can sequence both directions
- Some while back, started looking at "outy" again
- DNA on distal end; motor destalls and jams
- After a pause, motor drags DNA out of the pore; will stop and sit when finished
- Have to decide to eject molecule, quite a lot of complicated software required
- Signal; can time how long strand takes to unzip; green section can see the motoring
- From the length of the grey section, know how long the molecule is before deciding to sequence; can eject a molecule based on its length
- Time taken to size molecule is a few seconds; can go through 10-20 molecules to find one that's long enough; quite efficient to adaptively sample the molecule
- With a known fragment length distribution, quite efficient to find long molecules; size select fragments at run time
- At the moment the accuracy on this chemistry is not good; main reason is that it's not optimised for the chemistry
- Currently looking for pore variants to be performant on the accuracy side
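The "10-20 molecules" figure above fits a simple geometric model: if a fraction p of molecules in the library meets the length cutoff, the expected number of molecules sized before one is accepted is 1/p. The sketch below assumes molecules arrive independently; the example library, the cutoff, and the 3-second sizing time are hypothetical values (the notes only say "a few seconds").

```python
def fraction_above(lengths, cutoff):
    """Fraction of molecules at least `cutoff` bases long."""
    return sum(1 for length in lengths if length >= cutoff) / len(lengths)


def expected_molecules_sized(p):
    """Expected number of molecules sized until one passes (geometric mean 1/p)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the range (0, 1]")
    return 1 / p


def expected_search_seconds(p, sizing_seconds):
    """Expected time spent sizing, counting the accepted molecule too."""
    return expected_molecules_sized(p) * sizing_seconds


# Hypothetical library: fragment lengths in bases
library = [5_000, 8_000, 10_000, 12_000, 15_000,
           20_000, 25_000, 30_000, 60_000, 120_000]
p = fraction_above(library, cutoff=100_000)
print(f"Fraction above cutoff: {p:.2f}")
print(f"Expected molecules sized: {expected_molecules_sized(p):.0f}")
print(f"Expected search time: {expected_search_seconds(p, 3):.0f} s")
```

This is why the shape of the input fragment distribution matters: halving the fraction of long molecules doubles the expected search time.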
Adaptive accuracy
- Another architectural feature; pause holds the DNA because motor is clipped onto the DNA, will let go after some time period
- Strand will slip back onto the pore, and start sequencing again; keep looping indefinitely
- Signal from second read will have the same characteristics
- Can re-read a fragment indefinitely in the loop; adaptive accuracy, can basecall to a point where you're happy with the accuracy
- Creating joint-signal basecaller; signal-space medaka
- Would like to increase the distance of drop-back (20-30 kb); getting perfect re-reads; looks like it's got long legs
- Ultra-longs work; longest have been seen in the megabase range
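The idea of re-reading until the accuracy is acceptable can be illustrated with an idealised model: if each re-read calls a base correctly with the same probability and errors are independent between re-reads, a majority vote over n reads is correct with a binomial probability. Both assumptions are optimistic (real nanopore errors are partly systematic), and the 95% per-read figure below is hypothetical, so treat this as a sketch of the stopping rule rather than a prediction.

```python
from math import comb


def majority_vote_accuracy(per_read_accuracy, n_reads):
    """Probability that more than half of n independent reads are correct."""
    p = per_read_accuracy
    needed = n_reads // 2 + 1
    return sum(
        comb(n_reads, k) * p**k * (1 - p) ** (n_reads - k)
        for k in range(needed, n_reads + 1)
    )


def reads_needed(per_read_accuracy, target_accuracy, max_reads=99):
    """Smallest odd number of re-reads reaching the target, or None."""
    for n in range(1, max_reads + 1, 2):
        if majority_vote_accuracy(per_read_accuracy, n) >= target_accuracy:
            return n
    return None


# Hypothetical per-read accuracy of 95%, targeting Q20 (99%) and Q30 (99.9%)
print(reads_needed(0.95, 0.99))
print(reads_needed(0.95, 0.999))
```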
Adaptive sampling
- API provided for adaptive sampling; check if molecule is in the region of interest. People have written software to do targeted sequencing
- Another way to do size selection besides the "outy" method: it can be done with the "inny" method
- Ligate a hairpin onto one end of the molecule, move motor onto the hairpin end
- Unzip DNA until it jams and destalls the motor
- Again, from inward unzip, only get size estimation; only get sequence from complement
- Because all inny, Q20 chemistry straight away
- Not architecturally as good as outy, but usable straight away
- Can shift fragment distribution up to the right, still some shorter molecules in there, can size-select on platform with the inny method right now
- Outy method size selection doesn't look as good because it needs more development
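The region-of-interest check at the heart of adaptive sampling can be sketched as an interval lookup. This is not the MinKNOW or Read Until API; the function names and the "continue" / "unblock" labels are hypothetical, and the sketch assumes the first chunk of a read has already been basecalled and mapped to a (contig, position) pair, with non-overlapping half-open target regions.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(regions):
    """Index (contig, start, end) target regions for fast lookup."""
    by_contig = defaultdict(list)
    for contig, start, end in regions:
        by_contig[contig].append((start, end))
    index = {}
    for contig, intervals in by_contig.items():
        intervals.sort()
        index[contig] = ([s for s, _ in intervals], [e for _, e in intervals])
    return index


def in_target(index, contig, position):
    """True if position falls inside a target region on contig."""
    if contig not in index:
        return False
    starts, ends = index[contig]
    i = bisect_right(starts, position) - 1  # last region starting at or before position
    return i >= 0 and position < ends[i]


def decide(index, contig, position):
    """Enrichment policy: keep on-target reads, eject everything else."""
    return "continue" if in_target(index, contig, position) else "unblock"


targets = build_index([("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 0, 50)])
print(decide(targets, "chr1", 150))
print(decide(targets, "chr3", 10))
```

A depletion policy would simply invert the decision; how to treat unmapped reads (here, passing `None` as the contig ejects them) is a policy choice.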
Adaptive finishing
- Early days of genome sequencing, would find gaps, and have to do targeted cloning to span the bits of the genome that you got wrong
- Can do long-read genome with ultra-long, have assembly ambiguities
- Idea here is to do adaptive sampling with reads that start in contig ends; do adaptive sampling to target length cutoff that will span ambiguous parts of the assembly
- Can take flow cell off device, park in the fridge, come back to it and re-sample
- If really crafty, could be building the assembly graph as sequencing
- Can then think about doing both region and length targeted sequencing
- Conjecture: if we can adaptively select the target region, and adaptively select length, and adaptively target haplotypes (from first 400-1000 bases), should be able to determine which chromosome the fragments come from
- If this can be done, any genome can be assembled and cloned - nanopore-only genomes
New and notable
- Stick with flongle; quite a few users
- Used as a test platform for next generation of chips
- Now building cheap, floppy flongle flow cells
- Better data quality, better signal to noise
- Moving back to silver chloride chemistry, can get rid of the faff of processing flow cells
- Want a just-add-sample flow cell, will come first on flongle
- Looking to integrate sample prep with the just-add flow cells
- Have been developing electronic sample preps, coming along quite nicely, but too early
- For a lot of applications, all you should need is just a swab
Other applications
- A little bit delayed, all require a new ASIC
- Completely designed by Nanopore, much lower power development, cheaper, fully disposable
- Will be showing data from that in the meeting later in the year, going back to original concept
- No point in having a device like SmidgION if needs vortexer and other lab equipment
PromethION
- Largest customer has 30 P48s, moving to 80 at end of this year
- Showing 10 Tb output per PromethION
- Best at the moment still using only about 60% of what's available; highest performance sequencer out there
- Lab with 80 will be by far the world's largest sequencing facility
- A lot of it is about the front-end automation; sample-optimised workflows
- All now looking good; dramatically improving performance
P2 - new product
- Doesn't use any new bits; just an outline specification
- Integrated version with everything
- Two PromethION flow cells - P2
- Hoping can sell more granular flow cells
- Showing hopefully at NCM, targeting Q1 next year
- 2 * PromethION flow cells keeps well ahead of other sequencers
- Target 48 flow cells for $60k
Circular thing
- Cabled into another box, e.g. GridION
- Latest GridION version will be able to run all flow cells, including PromethION flow cells
- Will bring back 4-channel PromethION flow cells
- If you've made that investment in Nanopore, we'll keep you ahead
- Can register interest in both products right now
Summary
- Haven't spoken about integrated chemistry on flow cell
- Idea about writing information on DNA
- Sequencing proteins, saved for the future, not immediately relevant to anybody (> 6 months)
ORG.one
- Enable reference-quality sequencing of critically endangered species
- Giving free flow cells
- Pushing to enable scientists living in the country to own the data and have a handle on what's next
- Existing situation: sequencing / assembly gets pushed off to rich countries; organised chaos
- Encouraging people to collaborate with each other, share data as soon as possible
- Some overlap with other projects that are similar to this one
- In pilot phase, will shake down any issues
- Some of these animals are quite difficult to sequence, will force us to quickly improve our technology
- Will be a test-bed for new Nanopore tech
Questions
- Q20 kit available for Flongle in Q2
- Higher capture rate - more sensitive
- Symmetrical Y-adapter on both ends should have more follow-on
- Should be able to turn-off follow-on with adaptive sampling; want follow-on to be a default behaviour that is switched off
- Overall throughput on Q20 is a little bit lower
- Size selection of library is quite difficult
- Follow-on will probably not work with outy
- Pore blocking: non-reversible block is one of the motivations for wanting to look at outy again; factors are probably not present with outy; catastrophic blocking should be lower
- Adaptive sampling should work with barcoded samples
- Bonito is duplex-ready
- Time taken to look at 10-20 fragments is quite small compared to sequencing time; shape of the input distribution is key
- Modified bases on Q20 - too complicated
- As long as basecaller incorporates HxM-C calling, should work fine
- Relatively small improvement in simplex has a very big payout in duplex calling; predict a big breakthrough on homopolymers
- Developing R10 replacements that have a much longer readout, should close off a lot of the biases
- Mk1C won't have enough compute power to drive the PromethION flow cells