Scaligner has been used to analyze NGS data, following the sequence-based discovery platform described by Trinklein et al.
OmniFlic rats were immunized, and lymphocytes were harvested from the relevant draining lymph nodes. The samples were sequenced, yielding on average 1 million DNA sequences per sample.
Framework regions of the VH were identified with Scaligner.
The agglomerative clustering algorithm developed in Scaligner was then used to cluster the full set of CDR3 protein sequences for each sample at an 80% similarity threshold, and the total number of reads in each cluster was recorded for clonotypes represented by five or more paired sequence reads. In the paper, a clonotype is defined as a group of CDR3 protein sequences clustered at 80% similarity.
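The clustering step can be illustrated with a minimal Python sketch. It is not Scaligner's implementation: the similarity measure (1 minus Levenshtein distance divided by the longer sequence length), the single-linkage merging rule, and the names `levenshtein`, `similarity`, and `cluster_cdr3` are all assumptions made for illustration. The all-against-all comparison is only practical for small inputs.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed similarity: 1 - edit distance / length of the longer sequence."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def cluster_cdr3(read_counts, threshold=0.8, min_reads=5):
    """Single-linkage agglomerative clustering of CDR3 protein sequences.

    read_counts maps each distinct CDR3 sequence to its read count.
    Two clusters are merged whenever any pair of their members reaches the
    similarity threshold. Returns (members, total_reads) for each cluster
    whose total read count is at least min_reads.
    """
    seqs = list(read_counts)
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find root lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if similarity(seqs[i], seqs[j]) >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    clusters = {}
    for i, seq in enumerate(seqs):
        clusters.setdefault(find(i), []).append(seq)

    result = []
    for members in clusters.values():
        total = sum(read_counts[s] for s in members)
        if total >= min_reads:
            result.append((sorted(members), total))
    return result


if __name__ == "__main__":
    # Made-up sequences and counts, for illustration only.
    example = {
        "ARDYYGSGSYFDY": 10,
        "ARDYYGSGSYFDV": 3,
        "AKGGW": 2,
        "ARWELLRGAFDI": 7,
    }
    for members, total in cluster_cdr3(example):
        print(total, members)
```

In this sketch the two sequences that differ by a single residue fall into one clonotype, and the sequence with only two reads is dropped by the five-read filter.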
The polarization of CDR3 clonotypes was measured by calculating the percentage of a sample's total reads contained in each CDR3 clonotype. The clonotypes were then ranked by abundance within each sample, and the most highly represented ones were prioritized for synthesis and functional screening.
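The polarization and ranking calculation can be sketched in a few lines of Python. The function name `rank_clonotypes`, its arguments, and the clonotype labels are hypothetical and only illustrate the arithmetic. The sketch assumes the denominator is the total read count of the sample, which may be larger than the sum of the reads in the retained clonotypes.

```python
def rank_clonotypes(cluster_reads, total_reads=None):
    """Rank clonotypes by abundance within one sample.

    cluster_reads maps a clonotype label to its read count.
    total_reads is the total number of reads in the sample; if omitted,
    the sum of the clonotype read counts is used instead (an assumption).
    Returns (label, reads, percent_of_sample) tuples, most abundant first.
    """
    if total_reads is None:
        total_reads = sum(cluster_reads.values())
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    rows = [
        (label, reads, 100.0 * reads / total_reads)
        for label, reads in cluster_reads.items()
    ]
    # Sort by descending read count; ties are broken by label for stability.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    sample = {"clonotype_A": 50, "clonotype_B": 300, "clonotype_C": 150}
    for label, reads, percent in rank_clonotypes(sample, total_reads=1000):
        print(f"{label}\t{reads}\t{percent:.1f}%")
```

A sample in which a few clonotypes account for a large share of the reads is strongly polarized, and those top-ranked clonotypes are the ones that would be prioritized.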