Ensemble of Deep Learning Approaches for ATC Classification

[abstract]

Anatomical Therapeutic Chemical (ATC) classification of unknown compounds is essential for drug development and research. In this paper, we propose a multi-label classifier system for ATC prediction based on convolutional neural networks (CNNs) and long short-term memory networks (LSTMs). The CNN approach extracts a 1D feature vector from each compound, using information about its chemical-chemical interactions and its structural and fingerprint similarities to other compounds belonging to the ATC classes. The 1D vector is then reshaped into a 2D matrix. A CNN is trained on the matrix and used to extract new features. An LSTM is trained on the 1D vector and likewise used to extract features. These features are then used to train two general-purpose classifiers designed for multi-label classification, and their results are fused. Rigorous experimental evaluation demonstrates the superiority of our method compared to other state-of-the-art approaches.

Keywords: ATC classification, Deep learning, Convolutional neural networks, Long Short-Term Memory Networks.
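
The reshaping and fusion steps outlined in the abstract can be illustrated with a minimal sketch. The abstract does not specify the matrix dimensions, the padding scheme, the fusion rule, or the decision threshold, so everything below is an assumption made for illustration: the helper names (`reshape_to_matrix`, `fuse_scores`, `predict_labels`) are hypothetical, zero-padding is assumed for the reshape, and fusion is assumed to be a plain average of per-class scores. It is not the authors' implementation.

```python
import math
from typing import List, Sequence


def reshape_to_matrix(vector: Sequence[float], n_cols: int) -> List[List[float]]:
    """Reshape a 1D feature vector into a 2D matrix with n_cols columns.

    Assumption: the last row is zero-padded when len(vector) is not a
    multiple of n_cols. The source does not describe the padding scheme.
    """
    if n_cols <= 0:
        raise ValueError("n_cols must be positive")
    n_rows = math.ceil(len(vector) / n_cols)
    padded = list(vector) + [0.0] * (n_rows * n_cols - len(vector))
    return [padded[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def fuse_scores(scores_a: Sequence[float], scores_b: Sequence[float]) -> List[float]:
    """Fuse per-class scores from two classifiers.

    Assumption: fusion is a simple average; the source only says the
    results are fused.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both classifiers must score the same classes")
    return [(a + b) / 2.0 for a, b in zip(scores_a, scores_b)]


def predict_labels(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Multi-label decision: keep every class whose score reaches the threshold.

    Assumption: a fixed threshold of 0.5 is used.
    """
    return [i for i, s in enumerate(scores) if s >= threshold]


if __name__ == "__main__":
    # A toy 7-dimensional feature vector reshaped into rows of 3 values.
    matrix = reshape_to_matrix([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], n_cols=3)
    print(matrix)

    # Toy per-class scores from two multi-label classifiers.
    fused = fuse_scores([0.75, 0.25, 0.5], [0.25, 0.25, 1.0])
    print(predict_labels(fused))
```

Because the task is multi-label, `predict_labels` can return several class indices for one compound, or none at all. In practice, the threshold would need to be tuned on validation data.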

[Full Pre-print Paper]