- Split View
-
Views
-
Cite
Cite
Gustav Johansson, Melita Kaltak, Cristiana Rîmniceanu, Avadhesh K Singh, Jan Lycke, Clas Malmeström, Michael Hühn, Outi Vaarala, Susanna Cardell, Anders Ståhlberg, Ultrasensitive DNA Immune Repertoire Sequencing Using Unique Molecular Identifiers, Clinical Chemistry, , hvaa159, https://doi.org/10.1093/clinchem/hvaa159
-
Share
Abstract
Immune repertoire sequencing of the T-cell receptor can identify clonotypes that have expanded as a result of antigen recognition or hematological malignancies. However, current sequencing protocols display limitations with nonuniform amplification and polymerase-induced errors during sequencing. Here, we developed a sequencing method that overcame these issues and applied it to γδ T cells, a cell type that plays a unique role in immunity, autoimmunity, homeostasis of intestine, skin, adipose tissue, and cancer biology.
The ultrasensitive immune repertoire sequencing method used PCR-introduced unique molecular identifiers. We constructed a 32-panel assay that captured the full diversity of the recombined T-cell receptor delta loci in γδ T cells. The protocol was validated on synthetic reference molecules and blood samples of healthy individuals.
The 32-panel assay displayed wide dynamic range, high reproducibility, and analytical sensitivity with single-nucleotide resolution. The method corrected for sequencing-depended quantification bias and polymerase-induced errors and could be applied to both enriched and nonenriched cells. Healthy donors displayed oligoclonal expansion of γδ T cells and similar frequencies of clonotypes were detected in both enrichment and nonenriched samples.
Ultrasensitive immune repertoire sequencing strategy enables quantification of individual and specific clonotypes in a background that can be applied to clinical as well as basic application areas. Our approach is simple, flexible, and can easily be implemented in any molecular laboratory.
Introduction
T and B lymphocytes undergo clonal expansion upon activation. The monitoring of clonality has immense importance in the diagnosis and follow-up of hematological diseases (1–3,) and is a fundamental tool for the understanding of antigen specificity in immunity to infections, cancer, and in autoimmunity. Today, a comprehensive assessment of clonality is feasible using next-generation sequencing (NGS)-based approaches (3, 4,). Clinically, there is a need for improved methodological specificity and sensitivity to determine the immune repertoire, for example, to detect acute lymphoblastic leukemia and to monitor minimal residual disease (1, 5). Moreover, in studies of immune disorders, such as autoimmunity and tumor immunity, improved immune repertoire analysis will facilitate the identification of involved immune cells.
T cells are divided into two major subtypes based on their antigen receptor, the αβ and γδ T cells. The gene loci encoding the γ and δ chains of the γδ T-cell receptor (TCR) display fewer gene segments for recombination, but the potential repertoire of γδ T cells is larger than that of αβ T cells, due to possible usage of two D-segments in the TCRδ chain. Despite intense research, only a few ligands and their recognition by the γδ TCR are known (6, 7). Thus, there is a need for improved understanding of factors that drives γδ T-cell activation and clonal expansion.
The immune repertoire can be determined by sequencing the third complementary determining region (CDR3), by targeted amplification of either DNA (8,) or mRNA (4, 9,). A challenge when analyzing cell clonality with these standard NGS-based approaches is that the exact number and size of clones cannot be determined due to uneven amplification, and further that clonotypes with similar sequences cannot be reliably distinguished from sequence errors introduced during library construction and sequencing (2,). Several methods have been developed to overcome these limitations, including the use of synthetic spiked molecules to correct for quantification biases (10,) and bioinformatical approaches to reduce PCR-induced errors (11).
Another strategy to correct for sequencing errors and to quantify molecules is ultrasensitive sequencing using unique molecular identifiers (UMIs) (12,). To date, UMIs have mostly been applied to mRNA when profiling the immune receptor repertoire of B and T cells to remove PCR duplicates and improving sequence accuracy (12, 13,). However, analysis of clonal size based on mRNA has the disadvantage that the number of transcripts per cell is not constant (14–16,). The reverse transcription efficiency is also variable between sequences (17,) and 100 to 1000 times more prone to errors than high fidelity DNA polymerases (18–20). Therefore, to enable quantitative clonality analysis, DNA is preferred over mRNA.
UMIs can be added to DNA by either ligation- or PCR-based approaches (21, 22,). Recently, Chovanec et al. reported a ligation-based approach for immunoglobulin repertoire sequencing (23,). Ligation-based UMI approaches require that target DNA be captured before the analysis. Another limitation is that molecules are lost due to limited ligation efficiency (24,). In comparison, PCR-based UMI approaches are experimentally simpler and potentially also more sensitive, since they do not suffer from ineffective capture and ligation steps. Ligation-based UMI approaches were first developed because it is challenging to introduce UMIs in PCR primers due to a massive formation of nonspecific PCR products caused by the random nucleotide sequence of UMIs. This problem is solved by methods that either shield the UMI in secondary structures (25,) or by an extra purification step enriching DNA with incorporated UMIs (21). However, no PCR-based UMI approach is currently available for immune repertoire profiling.
Here, we developed an ultrasensitive immune repertoire sequencing approach, sequencing the rearranged T-cell receptor delta (TRD) locus in γδ T cells. To minimize the formation of nonspecific PCR products, we used the concept of SiMSen-Seq (26) that protects the UMI inside a hairpin loop that opens and closes its secondary structure in a temperature-dependent manner.
Material and Methods
Primer Design and Assay Validation
To capture the full diversity of the T-cell receptor δ repertoire, we designed target-specific primers for all 8 variable (TRDV) and 4 joining (TRDJ) genes associated with the TRD locus in the International immunogenetics information system gene database (27,) (Fig. 1A). Primers were designed using National Center for Biotechnology Information’s tool Primer-Blast (28,) as described (25). Forward primers targeted the downstream part of the TRDV genes, and the reverse primer targeted the downstream part of the TRDJ genes (Fig. 1B). Primers were designed to bind the nonrearranged parts of each segment amplifying the CDR3 region (Table 1). Each assay was validated using quantitative PCR and fragment analysis to ensure high efficiency (90% to 110%) and analytical specificity, respectively. Full details on assay validation of target specific primers by using synthetic reference material can be found in Supplemental Material and Method.
Primer name . | Target specific sequence 5′->3′ . | Strand . | Starta . | Stopa . |
---|---|---|---|---|
Target V-region forward primer | ||||
TRDV1 | GCGAAATCCGTCGCCTTAAC | Forward | 22096543 | 22096562 |
TRDV2 | ACTTGCACCATCAGAGAGAGATG | Forward | 22422991 | 22423013 |
TRDV3 | TCCAGTAAGGACTGAAGACAGTG | Reverse | 22469085 | 22469063 |
TRDV4/TRAV14 | CCAGAAGGCAAGAAAATCCGC | Forward | 21924565 | 21924585 |
TRDV5/TRAV29 | CTTAAACAAAAGTGCCAAGCACC | Forward | 22163785 | 22163807 |
TRDV6/TRAV23 | GCAGTTCTCATCGCATATCATGG | Forward | 22086894 | 22086916 |
TRDV7/TRAV36 | AGACCGGAGACTCGGCCAT | Forward | 22227216 | 22227234 |
TRDV8/TRAV38-2 | GGGGATGCCGCGATGTAT | Forward | 22281712 | 22281729 |
Target J-region reverse primers | ||||
TRDJ1 | CACAGTCACACGGGTTCCTT | Reverse | 22450132 | 22450113 |
TRDJ2 | CGATGAGTTGTGTTCCCTTTCCAA | Reverse | 22456733 | 22456710 |
TRDJ3 | AGTTTGATGCCAGTTCCGAAA | Reverse | 22459142 | 22459122 |
TRDJ4 | GTTGTACCTCCAGATAGGTTCCT | Reverse | 22455293 | 22455271 |
Barcode forward primer | ||||
5′-GGACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNATGGGAAAGAGTGTCC-V-region forward primer-3′ | ||||
Barcode reverse primer | ||||
5′-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-J-region reverse primer-3′ |
Primer name . | Target specific sequence 5′->3′ . | Strand . | Starta . | Stopa . |
---|---|---|---|---|
Target V-region forward primer | ||||
TRDV1 | GCGAAATCCGTCGCCTTAAC | Forward | 22096543 | 22096562 |
TRDV2 | ACTTGCACCATCAGAGAGAGATG | Forward | 22422991 | 22423013 |
TRDV3 | TCCAGTAAGGACTGAAGACAGTG | Reverse | 22469085 | 22469063 |
TRDV4/TRAV14 | CCAGAAGGCAAGAAAATCCGC | Forward | 21924565 | 21924585 |
TRDV5/TRAV29 | CTTAAACAAAAGTGCCAAGCACC | Forward | 22163785 | 22163807 |
TRDV6/TRAV23 | GCAGTTCTCATCGCATATCATGG | Forward | 22086894 | 22086916 |
TRDV7/TRAV36 | AGACCGGAGACTCGGCCAT | Forward | 22227216 | 22227234 |
TRDV8/TRAV38-2 | GGGGATGCCGCGATGTAT | Forward | 22281712 | 22281729 |
Target J-region reverse primers | ||||
TRDJ1 | CACAGTCACACGGGTTCCTT | Reverse | 22450132 | 22450113 |
TRDJ2 | CGATGAGTTGTGTTCCCTTTCCAA | Reverse | 22456733 | 22456710 |
TRDJ3 | AGTTTGATGCCAGTTCCGAAA | Reverse | 22459142 | 22459122 |
TRDJ4 | GTTGTACCTCCAGATAGGTTCCT | Reverse | 22455293 | 22455271 |
Barcode forward primer | ||||
5′-GGACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNATGGGAAAGAGTGTCC-V-region forward primer-3′ | ||||
Barcode reverse primer | ||||
5′-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-J-region reverse primer-3′ |
Nucleotide position from GRCh38/hg38, chromosome 14 (40).
Primer name . | Target specific sequence 5′->3′ . | Strand . | Starta . | Stopa . |
---|---|---|---|---|
Target V-region forward primer | ||||
TRDV1 | GCGAAATCCGTCGCCTTAAC | Forward | 22096543 | 22096562 |
TRDV2 | ACTTGCACCATCAGAGAGAGATG | Forward | 22422991 | 22423013 |
TRDV3 | TCCAGTAAGGACTGAAGACAGTG | Reverse | 22469085 | 22469063 |
TRDV4/TRAV14 | CCAGAAGGCAAGAAAATCCGC | Forward | 21924565 | 21924585 |
TRDV5/TRAV29 | CTTAAACAAAAGTGCCAAGCACC | Forward | 22163785 | 22163807 |
TRDV6/TRAV23 | GCAGTTCTCATCGCATATCATGG | Forward | 22086894 | 22086916 |
TRDV7/TRAV36 | AGACCGGAGACTCGGCCAT | Forward | 22227216 | 22227234 |
TRDV8/TRAV38-2 | GGGGATGCCGCGATGTAT | Forward | 22281712 | 22281729 |
Target J-region reverse primers | ||||
TRDJ1 | CACAGTCACACGGGTTCCTT | Reverse | 22450132 | 22450113 |
TRDJ2 | CGATGAGTTGTGTTCCCTTTCCAA | Reverse | 22456733 | 22456710 |
TRDJ3 | AGTTTGATGCCAGTTCCGAAA | Reverse | 22459142 | 22459122 |
TRDJ4 | GTTGTACCTCCAGATAGGTTCCT | Reverse | 22455293 | 22455271 |
Barcode forward primer | ||||
5′-GGACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNATGGGAAAGAGTGTCC-V-region forward primer-3′ | ||||
Barcode reverse primer | ||||
5′-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-J-region reverse primer-3′ |
Primer name . | Target specific sequence 5′->3′ . | Strand . | Starta . | Stopa . |
---|---|---|---|---|
Target V-region forward primer | ||||
TRDV1 | GCGAAATCCGTCGCCTTAAC | Forward | 22096543 | 22096562 |
TRDV2 | ACTTGCACCATCAGAGAGAGATG | Forward | 22422991 | 22423013 |
TRDV3 | TCCAGTAAGGACTGAAGACAGTG | Reverse | 22469085 | 22469063 |
TRDV4/TRAV14 | CCAGAAGGCAAGAAAATCCGC | Forward | 21924565 | 21924585 |
TRDV5/TRAV29 | CTTAAACAAAAGTGCCAAGCACC | Forward | 22163785 | 22163807 |
TRDV6/TRAV23 | GCAGTTCTCATCGCATATCATGG | Forward | 22086894 | 22086916 |
TRDV7/TRAV36 | AGACCGGAGACTCGGCCAT | Forward | 22227216 | 22227234 |
TRDV8/TRAV38-2 | GGGGATGCCGCGATGTAT | Forward | 22281712 | 22281729 |
Target J-region reverse primers | ||||
TRDJ1 | CACAGTCACACGGGTTCCTT | Reverse | 22450132 | 22450113 |
TRDJ2 | CGATGAGTTGTGTTCCCTTTCCAA | Reverse | 22456733 | 22456710 |
TRDJ3 | AGTTTGATGCCAGTTCCGAAA | Reverse | 22459142 | 22459122 |
TRDJ4 | GTTGTACCTCCAGATAGGTTCCT | Reverse | 22455293 | 22455271 |
Barcode forward primer | ||||
5′-GGACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNATGGGAAAGAGTGTCC-V-region forward primer-3′ | ||||
Barcode reverse primer | ||||
5′-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-J-region reverse primer-3′ |
Nucleotide position from GRCh38/hg38, chromosome 14 (40).
Library Construction and Sequencing
DNA were barcoded in a 10 µL reaction, containing 0.2 U Phusion Hot start II DNA polymerase, 1x Phusion High-Fidelity Buffer (both #F549S, Thermo Fisher Scientific), 0.2 mM dNTP (#D7295, Sigma-Aldrich), 0.5 M L-carnitine inner salt (#C0158, Sigma-Aldrich), 40 nM of each barcode primer (polyacrylamide gel electrophoresis purified, Integrated DNA Technologies) (Table 1) and target DNA. The following temperature program was used on a T100 Thermal cycler (Bio-Rad Laboratories): 98 °C for 30 sec, 3 cycles of amplification (98 °C for 10 sec, 62 °C for 6 min, 72 °C for 30 sec, all ramping rates were 4 °C/sec), and 65 °C for 15 min and 95 °C for 15 min. Twenty microliters of 45 ng/μL Streptomyces griseus protease (#P5147, Sigma-Aldrich) dissolved in 10 mM Tris, 1 mM EDTA-buffer (pH 8.0, #AM9849, Thermo Fisher) was added to each well at the start of the 15 min incubation step at 65 °C to degrade the polymerase, reducing the formation of nonspecific PCR products. Illumina adapters were added in a second PCR step. Ten microliters of barcoded PCR product was amplified in a 40 μL reaction containing, 1x Q5 Hot Start High-Fidelity Master Mix (#M0494, New England BioLabs) and 400 nM of each Illumina Adaptor index primer (desalted, Sigma-Aldrich, Supplemental Table 1) using the following thermal program on a T100 Thermal cycler; 98 °C for 3 min, 30 to 40 cycles of amplification (98 °C for 10 sec, 80 °C for 1 sec, 72 °C for 30 sec, 76 °C for 30 sec, all ramping rate of 0.2 °C/sec). Final reaction concentrations are shown for each PCR. A quality control to quantify the amount of barcoded DNA using qPCR was preformed and described in the Supplemental Material and Methods.
Libraries were purified using the Agencourt AMPure XP system (#A63881, Beckman Coulter), according to the manufacturer’s instructions with a bead to sample ratio of 1:1. Library quality and quantification were assessed on a Fragment Analyzer using the HS NGS Fragment kit (#DNF-474, Agilent) and analyzed using the PROsize software version 3.0 (Agilent), according to the manufacturer’s instructions.
Final quantification of library pool was performed by qPCR using NEBNext Library Quant Kit (#E7630, New England Biolabs), according to manufacturer’s instructions. Sequencing was performed on a Miniseq using midi output reagent kit with 20% added Phix control v3 (#FC-420-1004 and #FC-110-3001, Illumina) and 1 pM library. We performed 150 cycles of paired-end sequencing.
Data Analysis
Sequencing data were processed using the molecular identifier guided error correction (MIGEC) pipeline, version 1.2.9, using only paired-end reads and overlap-max-offset set to 100 (12). In summary, UMIs were extracted from raw sequencing reads. Data were then grouped according to their UMIs and assembled to consensus reads with applied error correction. In the nomenclature used here, all reads amplified with the same UMI sequence form a UMI family, and all UMI families with identical TRD sequence add up to a clonotype. Post-processing was performed using modified scripts from VDJtools, version 1.2.1 (30) and TcR R programming package, version v2.2.4.1 (31). More information on data analysis can be found in Supplemental Material and Methods.
Data Availability
All sequencing data can be found in the Sequence Read Archive database with the accession number PRJNA596380. Processed data, information about figure generation and details about specific analysis are available at Mendeley data (32).
Results
Development of a Sequencing Approach Targeting the δ-Chain in γδ T Cells Using Unique Molecular Identifiers
To develop an ultrasensitive sequencing approach of the rearranged TRD locus in γδ T cells (Fig. 1A), we applied targeted sequencing using UMIs. Target primers were designed for 8 TRDV genes and 4 TRDJ genes (Table 1). The 12 nucleotides long UMI was incorporated between the target primer and the adapter sequence. The method consisted of 2 rounds of PCR. In the first PCR step, all target DNA was barcoded in 3 cycles of amplification. In the second PCR, the barcoded DNA was amplified with Illumina adapter primers (Figs 1, B and C and Supplemental Table 1).
To validate the efficiency of each target primer combination, we used 32 synthetic gBlock DNA molecules containing the target gene segment of each primer combination (Fig. 1D and Supplemental Table 2). All target primer pairs showed 90% to 110% PCR efficiency using quantitative PCR (qPCR) (Supplemental Table 3). The analytical specificity of all combinations was also tested on 20 ng of genomic DNA from breast cancer cell line T47D that had the TRD locus in germline configuration. None of the primer pair combinations produced any specific PCR products using Fragment Analyzer analysis (data not shown).
Next, we tested the same target primers, but with hairpin protected UMI added (Fig. 1B). We analyzed the formation of barcoded PCR-products of each primer pair combination by qPCR, followed by melting curve analysis using a standard curve, ranging from 10 000 to 16 molecules with 5-fold dilution steps. Quantitative PCR analysis showed that the dynamic range of all assays spanned the entire range, and the melting curve analysis indicated that specific PCR products were formed (Supplemental Fig. 1, A and B). The presence of correct library size was also validated by Fragment Analyzer analysis (Supplemental Fig. 1C). At lower DNA concentrations, the amount of nonspecific PCR products increased, but specific PCR products were still generated.
Validation of a Simple, Robust and Fast Sequencing Protocol Targeting the TRD Genes
To determine amplification efficiency, sensitivity, and reproducibility of the final 32-plex assay targeting TRD gene rearrangements, we performed standard curves of synthetic gBlock molecules, ranging from 2 × 107 to 20 molecules per target sequence. To test the overall PCR efficiency and dynamic range of the 32-plex assay, we first performed barcoding PCR and then quantified the barcoded PCR product with qPCR using the Illumina adapter primers. The overall PCR efficiency for the 32-plex assay was 101% (Fig. 2A). Next, we sequenced the libraries generated from 2000, 200, and 20 synthetic reference DNA standard molecules to evaluate the performance of each primer pair combination (Fig. 2B). The sequencing efficiency ranged between 98% and 117% for all assays (Supplemental Table 4). Because different TRDV and TRDJ genes share sequence homology, we tested for off-target amplification. Synthetic gBlock DNA standards were matched by its unique template specific sequence to the primer that amplified respective molecule (Supplemental Fig. 2). All TRDV primers showed high specificity where only 7 out of 175 103 (0.004%) barcoded families were amplified by the wrong TRDV target primer. For the TRDJ primers, the corresponding number was 705 out of 175 961 (0.4%) barcoded families, which was expected due to higher sequence similarity between J genes compared to V genes.
To further determine the analytical sensitivity to detect rare TRD clones, we generated pools of synthetic gBlock standards where we sequenced 10 standard molecules per assay for 16 assays (160 molecules in total) in a background of 1000 standard molecules per assay for the remaining 16 individual assays (16 000 molecules in total). Hence, the analytical sensitivity of each primer pair combination was tested at a ratio of 1:1616 (∼0.06%). We quantified the pools of synthetic gBlock molecules by qPCR (Supplemental Fig. 3) and then by sequencing (Fig. 2C). Sequencing data showed that 1000, as well as 10 molecules, were reproducibly detected for all assays. Some variations between the assays may be explained by dilution artifacts as observed by qPCR (Supplemental Fig. 3).
Immune Repertoire Sequencing of TRD Using Enriched γδ T-Cells
To validate our 32-plex assay on human samples, we analyzed the genomic DNA extracted from γδ T-cells enriched from buffy coats using immunomagnetic cell separation. Sequencing libraries were constructed from 500, 100, and 25 ng γδ T-cell DNA and then sequenced. The total number of productive TRD molecules for each combination of rearranged TRD locus showed a linear correlation with the amount of analyzed DNA (Fig. 3A). The 500, 100, and 25 ng libraries generated mean 97.1%, 93.2%, and 76.1% on target reads, respectively. Non-aligned reads were almost exclusively generated from primer-dimers (mean 99.8%).
Supplemental Fig. 4A displays the 10 most commonly produced clonotypes. One advantage of using UMIs is that it could correct for amplification biases between molecules. Supplemental Fig. 4B shows the non-uniform amplification of different UMI families, which is agreement with other approaches using UMIs (21, 26). Several specific UMIs were only observed once, while some specific UMIs were detected more than 100 times. This quantitative bias was corrected since all reads with the same UMI originated from the same molecule and were collapsed into one molecule. Figure 3B illustrates the errors when not using UMI. For example, individual clonotypes with a single UMI sometimes displayed more than 10-fold variability in frequency when analyzed with raw sequencing reads. Figure 3C shows the reproducibility when detecting distinct number of molecules using UMIs. The coefficient of variation increased when detecting rare clonotypes, which agreed with sampling ambiguity when detecting few molecules. Furthermore, we detected no correlations between reads per UMI and amplicon length but a small decrease of reads per UMI for high GC-content (Fig. 3, D–G), supporting that our quantitative sequencing approach was largely independent of sequence context.
Immune Repertoire Sequencing of TRD in Healthy Donors
To characterize individual TRD immune repertoires, we analyzed γδ T-cells, enriched by fluorescence-activated cell sorting, from peripheral blood mononuclear cells (PBMC) of 10 healthy individuals. The number of productive CDR3 sequences detected by sequencing correlated with the estimated amount of productive CDR3 γδ T-cell molecules loaded (Fig. 4A). The TRD repertoire diversity was different between the 10 individuals (Fig. 4B). Donors 1 and 10 displayed no single clonotype with a frequency above 10%, while donors 3, 4, 6, 7, 8, and 9 were highly oligoclonal, where the top 5 clonotypes represented more than 50% of all productive TRD molecules detected. The distributions of clonotype sizes are shown in Supplemental Fig. 5. The observed variation in TRD repertoire diversity of healthy donors agreed with reported data (33, 34). The most common TRD rearrangement used among the different clonotypes was between TRDV2 and TRDJ1 (Supplemental Fig. 6). To validate our genomic approach at the cellular level, we determined the frequencies of TRDV1 and TRDV2 using FACS (Fig. 4C). The frequencies detected by FACS correlated significantly (P
Immune Repertoire Sequencing Using Non-Enriched Cells
One major advantage of targeted PCR is that cell enrichment is potentially not needed. To compare sequencing from non-enriched PBMC and enriched γδ T cells, PBMC from a buffy coat were isolated and split into 2 equal aliquots: one sample was enriched for γδ T cells using negative selection with magnetic beads, while the other sample was analyzed directly without further cell enrichment. We sequenced 100 ng DNA of the γδ T-cell enriched sample and 1 μg of the non-enriched sample. In total, 9925 and 4836 productive TRD molecules were identified in the γδ T-cell enriched and the non-enriched samples, respectively (Fig. 5, A and B). The clonotype frequencies between non-enriched and enriched cells correlated linearly (R2 = 0.80). However, 3 outlier clonotypes were identified. These clonotypes were abundantly expressed in non-enriched cells, while their relative frequencies in enriched cells were low. A drawback of analyzing non-enriched γδ T cells is that partially rearranged loci and unproductively rearranged alleles of TRD in αβ T cells may be amplified. Indeed, the number of out-of-frame sequences was higher in non-enriched cells (44.5%) compared to enriched γδ T cells (15.6%).
Discussion
The TCR repertoires of γδ T cells are the least explored among B and T lymphocytes. Notably, among all immune cells investigated, infiltrating γδ T cells show the strongest association with favorable outcome in a meta-analysis of 25 different malignancies (35, 36,). Here, we developed an ultrasensitive method for immune repertoire sequencing that increases the quantification accuracy by multiple order of magnitudes for low-frequency clones. By utilizing UMIs, we increased sequence accuracy enabling individual clonotypes to be reliable detected and quantified in a background of similar clonotypes (12,). We also show that sequence libraries could be reliably generated from 15 ng enriched γδ T cells DNA and 640 synthetic molecules, equivalent to 4.6 ng enriched γδ T cells DNA. Improved quantification accuracy was achieved by counting the number of UMIs, which related to the original number of DNA molecules and, therefore, the number of cells. Consequently, PCR-introduced amplification biases were avoided. Furthermore, current bioinformatics, such as MIGEC, can also handle PCR-introduced errors in the UMI, avoiding an overestimation of the clonotype size (37). The method minimizes sequencing errors by the construction of consensus sequences. Compared to ultrasensitive detection of allele variants, such as mutation analysis, immune repertoire sequencing is experimentally more challenging since the sequences of individual target DNA molecules are different from each other, including variable GC-content and length. Allele variant analysis regularly only includes the detection of 2 different sequences, often with a single nucleotide variant. Hence, assay optimization and amplification performance are fundamental for reliable immune repertoire sequencing. The 32-plex assay showed a small GC-content bias, but this error was corrected using UMIs as long as the original molecule is amplified.
True clonal variation is difficult to separate from sequencing errors (11, 38,). The use of UMIs addressed this issue, enabling accurate immune repertoire sequencing. The PCR-introduced UMIs corrected for all DNA polymerase-induced errors, except errors that occurred in the first barcoding PCR step (21,) and non-polymerase induced errors, such as chemically modified bases (39). With a PCR-based approach, there is also no need for target cell enrichment. We showed a linear correlation between the frequencies of clonotypes detected in enriched and non-enriched samples. Hence, immune repertoire DNA sequencing approaches without cell enrichments may be an interesting option, since it is both simpler and allows more sample types to be analyzed. Interestingly, we detected 3 clonotypes as outliers with more than a 10-fold higher frequency in the non-enriched sample compared to the enriched sample. The reason for this is unknown and needs to be further investigated, but one possibility may be that antibodies used in the depletion cocktail for negative selection also reacted to and depleted a subset of γδ T cells. However, to determine systematic differences between various sample types and enrichment techniques need further experiments.
The approach to add UMI by targeted PCR is both simpler and more efficient than ligation-based methods. Sequence libraries can be generated within 4 h, with limited hands-on time. Another advantage of PCR-based approaches is the possibility to choose a subset of target primers for specific applications. For example, in minimal residual disease in lymphoid malignancies where the immunoreceptor clonotype is known, and the detection of other immunoreceptor recombinations is of limited value, it is possible to monitor relevant clones longitudinally in a cost-effective manner. However, in exploratory studies, PCR-based approaches will not detect novel sequences that are not targeted by the primer sequences used. Here, a more unspecific ligation-based approach will be more suitable.
Single-cell analysis is another emerging method that enables detailed profiling of different clonotypes that may also provide additional information about transcriptomics, receptor chain pairing, and antigen affinity (40,). However, single-cell sequencing is experimentally and analytically more labor-intensive and requires a single-cell suspension that is not always feasible to generate. Therefore, our approach and single-cell analysis are complementary to each other. An immune repertoire sequencing approach can also be combined with approaches for nonrecombined DNA sequences targeting somatic mutations (21, 25). Combined immune repertoire and somatic mutation analysis is useful in both T and B cell malignancies, but also in solid tumor entities. Here, cell-free tumor DNA purified from the plasma as well as cellular immune cell DNA from the buffy coat could be analyzed in parallel.
Author Contributions
All authors confirmed they have contributed to the intellectual content of this paper and have met the following 4 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; (c) final approval of the published article; and (d) agreement to be accountable for all aspects of the article thus ensuring that questions related to the accuracy or integrity of any part of the article are appropriately investigated and resolved.
A.K. Singh, statistical analysis, provision of study material or patients; J. Lycke, financial support, provision of study material or patients; S. Cardell, financial support.
Authors’ Disclosures or Potential Conflicts of Interest: Upon manuscript submission, all authors completed the author disclosure form. Disclosures and/or potential conflicts of interest:
Employment or Leadership
G. Johansson, AstraZeneca; M. Hühn, AstraZeneca; O. Vaarala, AstraZeneca; A. Ståhlberg, Iscaff Pharma, SiMSen Diagnostics.
Consultant or Advisory Role
J. Lycke, Almirall, Teva, Biogen, Novartis, Genzyme/SanofiAventis; A. Ståhlberg, Iscaff Pharma, TATAA Biocenter, SiMSen Diagnostics.
Stock Ownership
M. Hühn, AstraZeneca; G. Johansson, AstraZeneca; O. Vaarala, AstraZeneca; A. Ståhlberg, Iscaff Pharma, TATAA Biocenter, SiMSen Diagnostics.
Honoraria
J. Lycke, travel support and/or lecture honoraria from Biogen, Novartis, Teva, Genzyme/SanofiAventis.
Research Funding
Johan Jansson Foundation for Cancer Research, Knut and Alice Wallenberg Foundation, Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden, Swedish Cancer Society (19-0306), Swedish Research Council (2017-01392), Swedish Childhood Cancer Foundation (2017-0043), the Swedish state under the agreement between the Swedish government and the county councils, the ALF-agreement (716321), VINNOVA.
Patents
A. Ståhlberg, United States Patent No. 10,557,134, Swedish patent application No. 2050673-9.
Role of Sponsor
The funding organizations played no role in the design of study, choice of enrolled patients, review and interpretation of data, preparation of manuscript, or final approval of manuscript.
References
Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, et al The human genome browser at UCSC. Genome Res 2002;
Nonstandard Abbreviations
-
NGS
next-generation sequencing
-
TCR
T-cell receptor
-
CDR3
third complementary determining region
-
UMI
unique molecular identifier
-
PBMC
peripheral blood mononuclear cell
-
qPCR
quantitative PCR
-
FACS
fluorescence-activated cell sorting
-
Cq-Value
cycle of quantification value
Human Genes
-
TRD
T cell receptor delta locus;
-
TRA
T cell receptor alpha locus
-
TRDV
T cell receptor variable region
-
TRDJ
T cell receptor joining region