Biotechnology and Bioengineering, Vol.111, No.4, 770-781, 2014
Exploring the Transcriptome Space of a Recombinant BHK Cell Line Through Next Generation Sequencing
Baby Hamster Kidney (BHK) cell lines are used in the production of veterinary vaccines and recombinant proteins. To facilitate transcriptome analysis of BHK cell lines, we embarked on an effort to sequence, assemble, and annotate transcript sequences from a recombinant BHK cell line and Syrian hamster liver and brain. RNA-seq data were supplemented with 6,170 Sanger ESTs from parental and recombinant BHK lines to generate 221,583 contigs. Annotation by homology to other species, primarily mouse, yielded more than 15,000 unique Ensembl mouse gene IDs with high coverage of KEGG canonical pathways. High coverage of enzymes and isoforms was seen for cell metabolism and N-glycosylation pathways, areas of highest interest for biopharmaceutical production. With the high sequencing depth in RNA-seq data, we set out to identify single-nucleotide variants in the transcripts. A majority of the high-confidence variants detected in both hamster tissue libraries occurred at a frequency of 50%, indicating their origin as heterozygous germline variants. In contrast, the cell line libraries' variants showed a wide range of occurrence frequency, indicating the presence of a heterogeneous population in cultured cells. The extremely high coverage of transcripts of highly abundant genes in RNA-seq enabled us to identify low-frequency variants. Experimental verification through Sanger sequencing confirmed the presence of two variants in the cDNA of a highly expressed gene in the BHK cell line. Furthermore, we detected seven potential missense mutations in the genes of the growth signaling pathways that may have arisen during the cell line derivation process. The development and characterization of a BHK reference transcriptome will facilitate future efforts to understand, monitor, and manipulate BHK cells. 
Our study of sequence variants is crucial for improving the understanding of errors inherent in high-throughput sequencing and for increasing the accuracy of variant calling in BHK and other systems. Biotechnol. Bioeng. 2014;111: 770-781. (c) 2013 Wiley Periodicals, Inc.
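The abstract's reasoning about variant frequency can be illustrated with a small sketch. In a diploid tissue, a heterozygous germline variant is expected on roughly half of the reads covering a site, whereas a variant carried by only a subpopulation of cultured cells can appear at almost any frequency. The function names, the tolerance of 0.1, and the read counts below are hypothetical choices made for illustration; they are not taken from the paper and do not represent the authors' variant-calling pipeline.

```python
def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def consistent_with_heterozygous(vaf: float, tolerance: float = 0.1) -> bool:
    """True if the frequency lies within `tolerance` of 0.5.

    The tolerance is an assumed, illustrative cutoff, not a published threshold.
    """
    return abs(vaf - 0.5) <= tolerance


# Hypothetical read counts, for illustration only.
tissue_vaf = variant_allele_frequency(alt_reads=48, total_reads=100)
cell_line_vaf = variant_allele_frequency(alt_reads=50, total_reads=1000)

print(tissue_vaf, consistent_with_heterozygous(tissue_vaf))
print(cell_line_vaf, consistent_with_heterozygous(cell_line_vaf))
```

In this sketch the first site falls near 50% and is consistent with a heterozygous germline variant, while the second sits at a low frequency that is only resolvable when coverage is deep, as is the case for highly abundant transcripts.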