The success of SMRTbell library sequencing depends on the quality of the starting material. DNA damage, carry-over contaminants, and improper handling can lower yields, shorten read lengths, and compromise data quality. This article describes how to troubleshoot the quality of the starting material and gives detailed recommendations for optimizing SMRTbell library sequencing performance.

 

DNA Damage

  • Causes: DNA damage during library preparation can arise from various sources, including mechanical shearing, suboptimal handling, and chemical or enzymatic reactions, resulting in nicked DNA and the formation of abasic sites.
  • Impact on Performance: DNA damage reduces the number of intact templates available for sequencing and impedes the generation of long reads.
  • Recommendations: To effectively address the impact of DNA damage, researchers should employ gentle extraction techniques during genomic DNA (gDNA) isolation to minimize the occurrence of nicking and strand breaks. Additionally, implementing appropriate DNA damage repair methods, such as enzymatic or chemical repair, can successfully restore DNA integrity before initiating library preparation. The integration of purification techniques, such as phenol-chloroform extraction or utilization of commercial purification kits, plays a critical role in effectively removing contaminants that contribute to DNA damage.

 

Carry-Over Contaminants

  • Causes: Carry-over contaminants include residual reagents or buffers from previous library preparation steps, as well as polysaccharides and organic substances from the starting organism. They are a significant challenge in SMRTbell library sequencing.
  • Impact on Performance: Carry-over contaminants hinder library preparation, introduce artifacts and background noise, and reduce the accuracy and integrity of the resulting sequencing data.
  • Recommendations: To address carry-over contaminants, researchers should use stringent purification methods, such as phenol-chloroform extraction or column-based purification kits, to remove residual reagents or buffers that could interfere with library preparation. Additional purification steps, such as ethanol precipitation or magnetic bead-based purification, can further improve contaminant removal and sequencing performance. Proper storage, including low-temperature storage and minimal exposure to external contaminants, helps prevent contamination during sample handling.

 

Sample Conditions

  • Causes: Inadequate sample conditions can result from exposure to extreme temperatures or pH values, improper DNA quantification, and the presence of insoluble material.
  • Impact on Performance: Suboptimal sample conditions can lead to compromised DNA integrity, affecting library preparation and sequencing performance.
  • Recommendations: To maintain optimal sample conditions, researchers should avoid exposing DNA samples to high temperatures (>65°C) or extreme pH values (<6 or >9), which can damage or denature DNA. An OD260/280 ratio between 1.8 and 2.0 indicates DNA largely free of protein contamination, and an OD260/230 ratio between 2.0 and 2.2 indicates minimal contamination from chemicals or other organic substances. Samples should also be free of RNA and insoluble material and should not have been exposed to UV light or intercalating fluorescent dyes; quality control measures such as gel electrophoresis and fluorometric assays help detect RNA contamination and degraded DNA.
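
The thresholds in the recommendations above lend themselves to a simple automated check. The following Python sketch is illustrative only: the `SampleQC` record and `qc_flags` function are hypothetical names rather than part of any vendor software, and the only values used are the ranges quoted above.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the sample-condition recommendations in this article.
MAX_TEMP_C = 65.0
PH_RANGE = (6.0, 9.0)
OD260_280_RANGE = (1.8, 2.0)
OD260_230_RANGE = (2.0, 2.2)


@dataclass
class SampleQC:
    """Hypothetical record of the measurements taken for one DNA sample."""
    name: str
    od260_280: float
    od260_230: float
    max_temp_c: float  # highest temperature the sample was exposed to
    ph: float          # pH of the buffer the sample is stored in


def qc_flags(sample: SampleQC) -> List[str]:
    """Return one warning per failed check; an empty list means no flags."""
    flags = []
    if not OD260_280_RANGE[0] <= sample.od260_280 <= OD260_280_RANGE[1]:
        flags.append("OD260/280 outside 1.8-2.0")
    if not OD260_230_RANGE[0] <= sample.od260_230 <= OD260_230_RANGE[1]:
        flags.append("OD260/230 outside 2.0-2.2")
    if sample.max_temp_c > MAX_TEMP_C:
        flags.append("exposed to temperature above 65 C")
    if not PH_RANGE[0] <= sample.ph <= PH_RANGE[1]:
        flags.append("pH outside 6-9")
    return flags


if __name__ == "__main__":
    sample = SampleQC("gDNA-01", od260_280=1.6, od260_230=2.1,
                      max_temp_c=25.0, ph=8.0)
    for flag in qc_flags(sample):
        print(sample.name, "->", flag)
```

A check like this only screens the recorded measurements; it cannot detect RNA, UV exposure, or intercalating dyes, which still require the handling practices and assays described above.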

 

Fragmented DNA Size Distribution

  • Causes: An unsuitable fragment size distribution and loading bias during library preparation can impair sequencing performance.
  • Impact on Performance: An uneven fragment size distribution and loading bias can restrict the generation of long reads and bias the representation of DNA fragments in the sequencing library, leading to preferential loading of shorter templates and uneven coverage.
  • Recommendations: To optimize DNA fragment size distribution, researchers can tailor the DNA shearing process to target larger insert sizes, maximizing the generation of long reads and improving subread lengths. Size selection, using gel-based or bead-based methods, can remove shorter fragments that may bias the sequencing results. Effective AMPure PB purification eliminates short insert contaminants and adapter dimers. When pooling amplicons or libraries, aiming for similar sizes minimizes loading bias and achieves more uniform coverage across the genome.
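
As a rough illustration of the pooling advice above, the sketch below compares the insert sizes in a planned pool. The function names and the `max_fold_difference` tolerance of 1.5 are assumptions chosen for the example, not a published cutoff; a suitable value depends on the experiment.

```python
from typing import Sequence


def size_spread(insert_sizes_bp: Sequence[float]) -> float:
    """Return the ratio of the largest to the smallest insert size in a pool."""
    if not insert_sizes_bp:
        raise ValueError("pool is empty")
    if min(insert_sizes_bp) <= 0:
        raise ValueError("insert sizes must be positive")
    return max(insert_sizes_bp) / min(insert_sizes_bp)


def pool_is_size_matched(insert_sizes_bp: Sequence[float],
                         max_fold_difference: float = 1.5) -> bool:
    """True if all inserts are within the given fold difference of each other.

    max_fold_difference is a hypothetical tolerance used for illustration.
    """
    return size_spread(insert_sizes_bp) <= max_fold_difference


if __name__ == "__main__":
    pool = [2000, 2500, 2200]  # amplicon sizes in bp (made-up values)
    print("fold spread:", size_spread(pool))
    print("size matched:", pool_is_size_matched(pool))
```

Pools that fail such a check could be split into separate libraries of similar size, since the shorter members would otherwise be expected to load preferentially.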

 

Characterization of Fragment Size

Understanding the characteristics of the fragment size distribution is essential for optimizing sequencing performance:

  • Approximately 30-40% of sheared DNA falls within the desired size range, which is crucial for generating long reads.
  • While larger fragments do not provide an advantage in loading, they can impact library quantitation and fragment representation.
  • The small-size fraction within a shear exhibits a higher loading advantage, leading to reduced subread lengths but potentially higher sequencing yields.
  • Shorter insert sizes contain a higher number of individual molecules in a given quantity compared to larger inserts, affecting the complexity and representation of the library.
  • For accurate sizing of fragments larger than 17 kb, consider pulsed-field gel electrophoresis (PFGE), which determines size more precisely in this range.
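
The point that shorter inserts contain more molecules per unit mass can be made concrete with a mass-to-molar conversion. The sketch below assumes an average mass of about 660 g/mol per base pair of double-stranded DNA, a common approximation that is not stated in this article, and the function name `dsdna_fmol` is hypothetical.

```python
# Approximate average molar mass of one base pair of dsDNA (an assumption;
# values of 650-660 g/mol are commonly used).
AVG_BP_MASS_G_PER_MOL = 660.0


def dsdna_fmol(mass_ng: float, length_bp: float) -> float:
    """Convert a mass of dsDNA of uniform length into femtomoles of molecules."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return moles * 1e15


if __name__ == "__main__":
    # Same mass, different insert sizes: the molecule count scales as 1/length.
    for length_bp in (1_000, 5_000, 10_000, 20_000):
        print(f"100 ng of {length_bp} bp inserts = "
              f"{dsdna_fmol(100, length_bp):.1f} fmol")
```

Under this approximation, 100 ng of 10 kb inserts is roughly 15 fmol, while the same mass of 1 kb inserts holds ten times as many molecules, which is why a small short-fragment fraction by mass can account for a large share of the loaded templates.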

 

Short Insert Contaminants

  • Causes: Inadequate removal of short insert contaminants, such as adapter dimers and short inserts themselves, can compromise sequencing performance.
  • Impact on Performance: Short insert contaminants reduce the efficiency of polymerase binding, decrease the average subread length, and distort the representation of the target DNA.
  • Recommendations: To mitigate the impact of short insert contaminants, researchers should use purification methods during loading that selectively remove short inserts and adapter dimers, improving overall sequencing quality. Thorough AMPure PB purification with multiple ethanol washes (3x) removes contaminants efficiently and enhances library quality. Maintaining the correct ratio of adapters to inserts during library preparation helps prevent the formation of adapter dimers. If adapter dimers persist, consider an alternative approach such as A-tailing with T-overhang adapter ligation (A/T ligation).

 

Purification

  • Causes: Carry-over contaminants from the AMPure XP purification step can negatively impact library quality and sequencing performance.
  • Impact on Performance: Inadequate AMPure PB purification leads to lower SMRTbell library yields, decreased sequencing yields, and shorter subread lengths.
  • Recommendations: To optimize AMPure PB purification and enhance sequencing performance, researchers should transition from AMPure XP to AMPure PB beads, specifically designed for SMRTbell library purification, to improve purification efficiency and remove contaminants effectively. Performing additional rounds of SMRTbell purification using AMPure PB beads, ensuring proper bead-to-sample ratios and following manufacturer guidelines, achieves thorough purification and minimizes carry-over contaminants. Consider incorporating MagBeads or other magnetic bead-based purification methods during loading to enhance purification efficiency and improve library quality.

 

Inaccurate Quantification

  • Causes: Variability in quantification tools and the presence of contaminants that affect readings can lead to inaccurate library quantification.
  • Impact on Performance: Inaccurate quantification results in incorrect determination of sample concentrations, leading to overloading or underloading of samples in sequencing reactions, and discrepancies in read length, accuracy, and sequencing yield.
  • Recommendations: To ensure accurate quantification and reliable library preparation, researchers should use a reliable, standardized quantification system, such as Qubit fluorometry, to measure DNA concentration. Comparing results from more than one method, such as spectrophotometry and a fluorometric assay, cross-validates the measurements. Additional extraction or purification steps that remove contaminants which interfere with quantification, such as RNA, short inserts, or residual reagents, further improve accuracy.
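
The cross-validation suggested above can be sketched as a comparison of two readings for the same sample. Spectrophotometry measures everything that absorbs at 260 nm, including RNA and free nucleotides, whereas a dsDNA-specific fluorometric assay does not, so a spectrophotometric reading well above the fluorometric one hints at contamination. The function names and the `max_ratio` cutoff of 1.5 below are assumptions made for illustration only.

```python
def quant_discrepancy(fluorometric_ng_ul: float, spectro_ng_ul: float) -> float:
    """Ratio of the spectrophotometric reading to the fluorometric reading."""
    if fluorometric_ng_ul <= 0:
        raise ValueError("fluorometric reading must be positive")
    return spectro_ng_ul / fluorometric_ng_ul


def needs_cleanup(fluorometric_ng_ul: float, spectro_ng_ul: float,
                  max_ratio: float = 1.5) -> bool:
    """Flag a sample whose two readings disagree by more than max_ratio.

    max_ratio is a hypothetical tolerance, not a vendor-specified limit.
    """
    return quant_discrepancy(fluorometric_ng_ul, spectro_ng_ul) > max_ratio


if __name__ == "__main__":
    # Made-up readings in ng/uL for two samples.
    readings = {"sample A": (50.0, 55.0), "sample B": (50.0, 120.0)}
    for name, (fluor, spectro) in readings.items():
        status = "re-purify" if needs_cleanup(fluor, spectro) else "consistent"
        print(f"{name}: ratio {quant_discrepancy(fluor, spectro):.2f} ({status})")
```

When the two readings disagree, the fluorometric value is generally the safer basis for loading calculations, and the sample is a candidate for the additional purification described above.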

 
