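Society |
Korean Institute of Chemical Engineers (한국화학공학회) |
Conference |
Fall 2017 (October 25 to 27, Daejeon Convention Center) |
Volume/Issue |
Vol. 23, No. 2, p. 1799 |
Presentation field |
Energy and Environment |
Title |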
Text Mining Metal-Organic Framework Papers |
Abstract |
We have developed a simple text mining algorithm that identifies the surface areas and pore volumes of metal-organic frameworks (MOFs) using manuscript HTML files as inputs. The algorithm searches for the common units associated with these two quantities (e.g. m²/g, cm³/g) to locate candidate values. On a training set of over 200 MOFs, the algorithm correctly identified 90% and 88.8% of the surface area and pore volume values, respectively. Further application to a test set of randomly chosen MOF HTML files yielded accuracies of 73.2% and 85.1% for the two respective quantities. Most of the errors stem from unorthodox sentence structures that made it difficult to identify the correct data, as well as from bolded notations of MOFs (e.g. 1a) that made it difficult to identify the MOF's real name. Tools of this type will become useful for discovering structure-property relationships among MOFs and for collecting large data sets for reference. |
Authors |
박상훈1, 김백준1, 최시훈1, 김지한1, Peter Boyd2, Berend Smit3 |
Affiliation |
1KAIST, 2College of Chemistry, 3UC Berkeley and EPFL |
Keywords |
High efficiency |
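
The unit-based search described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the authors' implementation, so everything here is an assumption made for illustration: the function names (`html_to_text`, `extract_values`), the regular expressions, and the sample sentence are all hypothetical. The sketch strips the HTML tags, then looks for numbers immediately followed by a surface area unit (m2/g) or a pore volume unit (cm3/g). It does not attempt to resolve bolded labels such as 1a to a MOF name, which the abstract names as one source of errors.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining without a separator turns "m<sup>2</sup>/g" into "m2/g".
    return "".join(parser.chunks)


# A number such as 1250, 1,250 or 0.45 (captured as group 1).
NUMBER = r"(\d+(?:,\d{3})*(?:\.\d+)?)"
# "per gram" written either as "/g" or as "g-1" (ASCII or Unicode minus).
PER_GRAM = r"\s*(?:/\s*g|g\s*[-−]\s*1)"

# Assumed unit spellings; real manuscripts use many more variants.
SURFACE_AREA = re.compile(NUMBER + r"\s*m[2²]" + PER_GRAM)
PORE_VOLUME = re.compile(NUMBER + r"\s*cm[3³]" + PER_GRAM)


def _to_float(number_text):
    """Convert a matched number to float, dropping thousands separators."""
    return float(number_text.replace(",", ""))


def extract_values(html):
    """Return every number that is directly followed by one of the units."""
    text = html_to_text(html)
    return {
        "surface_area_m2_per_g": [
            _to_float(m.group(1)) for m in SURFACE_AREA.finditer(text)
        ],
        "pore_volume_cm3_per_g": [
            _to_float(m.group(1)) for m in PORE_VOLUME.finditer(text)
        ],
    }


# Hypothetical sample sentence, not taken from any real manuscript.
sample = (
    "<p>The BET surface area of <b>1a</b> is 1,250 m<sup>2</sup>/g "
    "with a pore volume of 0.45 cm<sup>3</sup>/g.</p>"
)
print(extract_values(sample))
```

Anchoring the search on the unit rather than on the quantity name keeps the patterns short, but it also returns every value carrying that unit, so a fuller pipeline would still need to decide which value belongs to which MOF.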