Computer-assisted chemical structure searching plays a critical role in efficient structure screening in cheminformatics. We designed a high-performance chemical structure & data search engine called DCAIKU, built on the CouchDB and ElasticSearch engines. DCAIKU converts the chemical structure similarity search problem into a general text search problem so that it can use off-the-shelf full-text search engines. DCAIKU also supports flexible document structures and heterogeneous datasets with the help of a schema-less document database. Our evaluations show that DCAIKU can handle both keyword search and structural search against millions of records with both high accuracy and low latency. We expect that DCAIKU will lay the foundation for large-scale and cost-effective structural search in materials science and chemistry research.
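The idea of recasting similarity search as text search can be illustrated with a minimal sketch. It is not DCAIKU's implementation: the abstract does not specify the fingerprint or the query form, so the sketch assumes character n-grams of a SMILES string as stand-in "words", a small in-memory inverted index in place of ElasticSearch, and Tanimoto overlap as the score. The names `tokens`, `build_index`, and `search` are hypothetical.

```python
from collections import defaultdict


def tokens(smiles: str, n: int = 3) -> set[str]:
    """Toy 'fingerprint' (an assumption): the set of character n-grams."""
    if len(smiles) < n:
        return {smiles}
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def build_index(records: dict[str, str]) -> dict[str, set[str]]:
    """Inverted index as used by full-text engines: token -> record ids."""
    index: dict[str, set[str]] = defaultdict(set)
    for rec_id, smiles in records.items():
        for tok in tokens(smiles):
            index[tok].add(rec_id)
    return index


def search(query: str, records: dict[str, str],
           index: dict[str, set[str]], top_k: int = 3) -> list[tuple[str, float]]:
    """Rank records by Tanimoto similarity of their token sets to the query."""
    q = tokens(query)
    # Count shared tokens per candidate by walking the posting lists.
    shared: dict[str, int] = defaultdict(int)
    for tok in q:
        for rec_id in index.get(tok, ()):
            shared[rec_id] += 1
    scored = []
    for rec_id, common in shared.items():
        size = len(tokens(records[rec_id]))
        # Tanimoto = |A and B| / |A or B|
        scored.append((rec_id, common / (len(q) + size - common)))
    # Highest score first; ties broken by record id for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Illustrative records only; not data from the paper.
records = {
    "ethanol": "CCO",
    "propanol": "CCCO",
    "benzene": "c1ccccc1",
    "phenol": "c1ccccc1O",
}
index = build_index(records)
hits = search("c1ccccc1O", records, index)
```

Only records that share at least one token with the query are ever scored, which is the property that lets an inverted index avoid scanning every record. A real deployment would presumably use a chemically meaningful fingerprint rather than raw string n-grams.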
A high-performance and flexible chemical structure & data search engine built on CouchDB & ElasticSearch
Ren-zhi Li, Bo-jie Li, Guo-zhen Zhang, Jun Jiang, Yi Luo; A high-performance and flexible chemical structure & data search engine built on CouchDB & ElasticSearch. Chin. J. Chem. Phys. 1 June 2018; 31 (3): 341–349. https://doi.org/10.1063/1674-0068/31/cjcp1711202