K-talysticFlow (KAST) Documentation

K-talysticFlow (KAST)

K-atalystic Automated Screening Taskflow: Automated Deep Learning for Molecular Bioactivity Prediction



What is KAST?

K-talysticFlow (KAST) is an open-source pipeline that democratizes the use of deep learning for molecular bioactivity prediction in drug discovery and virtual screening workflows. KAST was developed at the Laboratory of Molecular Modeling (LMM-UEFS) to provide researchers with a reproducible, end-to-end solution, from data preparation to prediction, without requiring deep expertise in machine learning infrastructure.

The pipeline is built on DeepChem and TensorFlow, using Morgan/ECFP fingerprints as molecular descriptors and a MultitaskClassifier neural network trained from scratch on user-provided bioactivity data.

What can you use KAST for? Here are some examples:

  • Predict the bioactivity of small drug-like molecules against a biological target

  • Rank large compound libraries by predicted probability of activity

  • Train a custom deep learning model using your own active/inactive dataset

  • Evaluate model quality with ROC-AUC, enrichment factor, and cross-validation

  • Export ranked candidate lists for downstream experimental validation
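To make the evaluation metrics in the list above concrete, here is a minimal pure-Python sketch of ROC-AUC and enrichment factor. It is an illustration only, not KAST's own implementation: the function names `roc_auc` and `enrichment_factor` are hypothetical, labels are assumed to be 1 (active) or 0 (inactive), and the enrichment factor is assumed to follow the common definition of the active rate in the top-scoring fraction divided by the active rate in the whole set.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney U) formulation.

    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of items sharing the same score.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one active and one inactive")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


def enrichment_factor(labels, scores, fraction=0.01):
    """Active rate in the top-scoring fraction divided by the overall active rate."""
    n = len(labels)
    total_actives = sum(labels)
    if n == 0 or total_actives == 0:
        raise ValueError("enrichment factor needs at least one active")

    # Indices sorted from highest to lowest predicted score.
    order = sorted(range(n), key=lambda idx: scores[idx], reverse=True)
    n_top = max(1, int(fraction * n))  # always inspect at least one molecule
    hits_top = sum(labels[idx] for idx in order[:n_top])
    return (hits_top / n_top) / (total_actives / n)


# Toy data: 3 actives among 10 molecules, scores are made up for illustration.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
example_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(roc_auc(example_labels, example_scores))
print(enrichment_factor(example_labels, example_scores, fraction=0.5))
```

An enrichment factor above 1 means the model places actives near the top of the ranking more often than random selection would. This matters when only a small part of a screened library can be tested experimentally.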

KAST is a machine learning training and inference tool: it learns from your data and builds a target-specific model. It does not ship with pre-trained models for arbitrary targets.


Quick Start

The fastest way to get started is to set up the Conda environment and launch the interactive menu:

conda env create -f environment.yml
conda activate ktalysticflow
python main.py

Then follow the step-by-step pipeline:

[1] Prepare Data       → Clean and organize your SMILES dataset
[2] Featurize          → Generate Morgan/ECFP fingerprints
[3] Train Model        → Build your deep learning model from scratch
[4] Evaluate           → ROC-AUC, cross-validation, enrichment factor
[5] Predict            → Screen new molecules and export ranked results
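
As a rough illustration of what step [5] produces, the sketch below ranks molecules by predicted probability of activity and writes them as CSV. It is not KAST's actual export code: the function names, the column names (`rank`, `smiles`, `probability`), and the example probabilities are all assumptions made for this example.

```python
import csv
import io


def rank_predictions(smiles, probabilities):
    """Return rows sorted by predicted probability, highest first."""
    if len(smiles) != len(probabilities):
        raise ValueError("smiles and probabilities must have the same length")
    ranked = sorted(zip(smiles, probabilities), key=lambda pair: pair[1], reverse=True)
    return [
        {"rank": position, "smiles": smi, "probability": prob}
        for position, (smi, prob) in enumerate(ranked, start=1)
    ]


def write_ranked_csv(rows, handle):
    """Write ranked rows to an open text handle as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=["rank", "smiles", "probability"])
    writer.writeheader()
    writer.writerows(rows)


# Made-up predictions for three small molecules.
rows = rank_predictions(["CCO", "c1ccccc1", "CC(=O)O"], [0.12, 0.87, 0.45])
buffer = io.StringIO()  # stands in for a file opened with newline=""
write_ranked_csv(rows, buffer)
print(buffer.getvalue())
```

Writing to a file handle rather than a fixed path keeps the sketch easy to test; in practice the handle would come from `open(path, "w", newline="")`.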

About

KAST is developed and maintained at the Laboratory of Molecular Modeling (LMM-UEFS) by Késsia Souza Santos. Contributions, issues, and suggestions are welcome via the GitHub repository.

Funding: This project was developed with support from CNPq (undergraduate research scholarship, PIBIC/IC) and is currently continued under a CAPES graduate research scholarship (MSc).