This repository archives a high-quality, hand-curated lipophilicity dataset containing the chemical structures (SMILES) of 1,130 organic compounds and their n-octanol/buffer solution distribution coefficients at pH 7.4 (logD7.4), originally curated for our paper (PDF).
Lipophilicity (logD7.4) influences several ADME (absorption, distribution, metabolism, and excretion) properties, which makes it a key physical property in the development of small-molecule oral drugs. This dataset can be used to benchmark methods in regression modeling, cheminformatics, and chemometrics research.
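As a rough starting point for such benchmarking, the sketch below computes a cross-validated RMSE for a trivial predictor that always returns the training-set mean. Any regression model evaluated on this dataset should beat that floor. This is an illustration only and not the method used in the paper. The file name `logd74.csv` and the column names `SMILES` and `logD7.4` are assumptions; adjust them to match the actual data file in this repository.

```python
import csv
import math
import random


def load_logd(path, smiles_col="SMILES", logd_col="logD7.4"):
    """Read (SMILES, logD7.4) pairs from a CSV file.

    The column names are assumptions, not taken from the dataset itself;
    change them to match the real file header.
    """
    with open(path, newline="") as handle:
        return [(row[smiles_col], float(row[logd_col]))
                for row in csv.DictReader(handle)]


def kfold_baseline_rmse(values, k=5, seed=0):
    """Cross-validated RMSE of a predictor that always returns the training mean."""
    indices = list(range(len(values)))
    random.Random(seed).shuffle(indices)
    # Split the shuffled indices into k interleaved folds.
    folds = [indices[i::k] for i in range(k)]
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [values[i] for i in indices if i not in held]
        mean = sum(train) / len(train)
        squared_errors.extend((values[i] - mean) ** 2 for i in held_out)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Example usage (hypothetical file name):
# records = load_logd("logd74.csv")
# print(kfold_baseline_rmse([logd for _, logd in records]))
```

The fixed seed makes the fold assignment reproducible, so different models can be compared on identical splits.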
If you find this dataset useful in your research, please cite our paper:
Formatted citation:
Wang, J-B., D-S. Cao, M-F. Zhu, Y-H. Yun, N. Xiao, Y-Z. Liang (2015). In silico evaluation of logD7.4 and comparison with other prediction methods. Journal of Chemometrics, 29(7), 389-398.
BibTeX entry:
@article{logd2015,
title={\textit{In silico} evaluation of $\text{logD}_{7.4}$ and comparison with other prediction methods},
author={Wang, Jian-Bing and Cao, Dong-Sheng and Zhu, Min-Feng and Yun, Yong-Huan and Xiao, Nan and Liang, Yi-Zeng},
journal={Journal of Chemometrics},
volume={29},
number={7},
pages={389--398},
year={2015},
publisher={Wiley Online Library}
}