Electrophoresis, Vol.39, No.21, 2806-2814, 2018
DNA methylation assay based on pyrosequencing for determination of smoking status
The goal of this study was to use pyrosequencing to identify CpG sites indicative of tobacco smoking by examining the DNA sequences surrounding ten frequently reported smoking-related CpGs. Initially, six genetic loci (AHRR, 2q37, 6p21.33, GFI1, F2RL3, and MYO1G) were investigated in order to detect novel CpG sites associated with tobacco smoking. The methylation data revealed a set of 23 consecutive CpG sites in blood (Chr5:373,115-Chr5:373,653) that were significantly hypomethylated in current smokers. In addition, 10 of these 23 CpGs were also significantly hypomethylated in the saliva of current smokers. The most significant CpG sites were located at Chr5:373,490 in blood and Chr5:373,476 in saliva, with decreases in methylation in current smokers of 42.3% and 21.3%, respectively. In the model-building steps of this study, a rapid 4-CpG assay was developed, consisting of the top-ranked CpG sites in blood and saliva. The assay was applied in a leave-one-out approach to test its ability to infer an individual's self-reported smoking history. A multinomial logistic regression (MLR) model containing all 4 CpG sites gave the most accurate results in both blood and saliva. In blood, the model correctly predicted 90.0% of current smokers, 66.7% of former smokers, and 84.9% of never smokers. In saliva, the MLR model correctly predicted 86.9% of current smokers, 54.5% of former smokers, and 77.8% of never smokers. These results demonstrate that this pyrosequencing-based assay can provide an effective tool for identifying individuals who smoke tobacco, particularly when using epigenetic markers in blood.
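The leave-one-out evaluation of a multinomial logistic regression model described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how the model was fitted, so the sketch uses plain softmax regression trained by batch gradient descent. The learning rate, epoch count, and penalty are arbitrary choices. All function names are hypothetical, and the methylation values are randomly generated placeholders rather than study data.

```python
import math
import random

CLASSES = ["current", "former", "never"]  # smoking-status labels from the abstract

def softmax(scores):
    """Convert raw class scores to probabilities (shifted for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fit_mlr(X, y, n_classes, lr=0.5, epochs=200, l2=1e-3):
    """Fit softmax (multinomial logistic) regression by batch gradient descent.
    Hyperparameters are illustrative assumptions, not values from the study."""
    n_cols = len(X[0]) + 1  # one weight per CpG plus an intercept
    W = [[0.0] * n_cols for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_cols for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            row = xi + [1.0]
            probs = softmax([sum(w * v for w, v in zip(Wk, row)) for Wk in W])
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == yi else 0.0)
                for j, v in enumerate(row):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_cols):
                W[k][j] -= lr * (grad[k][j] / len(X) + l2 * W[k][j])
    return W

def predict(W, xi):
    """Return the index of the highest-scoring class for one sample."""
    row = xi + [1.0]
    scores = [sum(w * v for w, v in zip(Wk, row)) for Wk in W]
    return scores.index(max(scores))

def leave_one_out(X, y, n_classes):
    """Hold out each sample in turn, refit on the rest, predict the held-out one."""
    preds = []
    for i in range(len(X)):
        W = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_classes)
        preds.append(predict(W, X[i]))
    return preds

def per_class_accuracy(y, preds, n_classes):
    """Fraction of each true class that was predicted correctly."""
    out = []
    for k in range(n_classes):
        idx = [i for i, label in enumerate(y) if label == k]
        hits = sum(preds[i] == k for i in idx)
        out.append(hits / len(idx) if idx else float("nan"))
    return out

def make_synthetic(n_per_class=10, seed=0):
    """Made-up methylation fractions at 4 CpGs. NOT data from the study."""
    rng = random.Random(seed)
    means = [0.45, 0.70, 0.85]  # hypothetical class means (current, former, never)
    X, y = [], []
    for k, mu in enumerate(means):
        for _ in range(n_per_class):
            X.append([min(1.0, max(0.0, rng.gauss(mu, 0.08))) for _ in range(4)])
            y.append(k)
    return X, y

if __name__ == "__main__":
    X, y = make_synthetic()
    preds = leave_one_out(X, y, len(CLASSES))
    accs = per_class_accuracy(y, preds, len(CLASSES))
    for name, acc in zip(CLASSES, accs):
        print(f"{name}: {acc:.1%} correctly classified")
```

Reporting accuracy per class, as the abstract does, matters here because the three groups are not equally easy to separate: a single overall accuracy would hide the weaker performance on former smokers. The percentages printed by this sketch reflect only the placeholder data and say nothing about the assay itself.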