9th Advanced Drug Design workshop challenge - Olomouc 2026

DYRK1B Virtual Screening Challenge

The challenge is over

However, submissions can still be made, and the limit on the number of submissions has been removed. Questions regarding the challenge can be sent to the Discord channel.

Introduction

DYRK1B (dual-specificity tyrosine-phosphorylation-regulated kinase 1B, UniProt Q9Y463) is a serine/threonine kinase involved in the regulation of cell cycle progression, cellular quiescence, metabolism, and stress responses. It has emerged as a therapeutically relevant target in multiple diseases.

In cancer, including pancreatic, ovarian, and colorectal malignancies, DYRK1B supports tumor cell survival and therapy resistance, making its inhibition a promising strategy to enhance chemosensitivity. DYRK1B has also been implicated in metabolic disorders such as insulin resistance and obesity, as well as in fibrotic and inflammatory pathways.

Registration

Participants may work individually or form teams of up to three members. Each team must register to obtain a submission token. Please use valid email addresses, as participants may be contacted during post-challenge evaluation.

Task Description

A library of 3,365 structures is provided, including several validated DYRK1B inhibitors with IC₅₀ values below 1 µM.

Your task is to select exactly 100 structures that are most likely to be active DYRK1B inhibitors. The primary goal is to maximize the number of true active hits.

Participants may use any computational strategy, including predictive modeling and virtual screening, to prioritize candidates.

Data

Protein Target

A single X-ray crystal structure of DYRK1B in complex with the inhibitor AZ191 (PDB ID: 8C2Z) is currently available. The protein is provided in two forms: a docking-ready PDBQT file compatible with Vina, Smina, and Gnina (all of which can be run, for example, through EasyDock), and the source PDB file containing all hydrogens and assigned protonation states, which may be used in other programs or to calculate protein-ligand fingerprints.

The docking grid was centered on the reference ligand AZ191, with an additional 6 Å margin. The grid size may require adjustment for larger ligands.

8C2Z_protein.pdb (protonated at physiological pH)
8C2Z_protein.pdbqt
grid.txt
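
If the box needs to be enlarged for bigger ligands, it can be recomputed from ligand coordinates. The sketch below is only an illustration: it assumes the 6 Å margin is added on every side of the ligand's axis-aligned bounding box, and the coordinates are made-up placeholders rather than AZ191 atoms. The layout of grid.txt is not described here, so the provided file remains the reference.

```python
def docking_box(coords, margin=6.0):
    """Return (center, size) of an axis-aligned box around coords.

    Assumption: `margin` (in Å) is added on each side of the bounding box,
    so every edge grows by 2 * margin.
    """
    xs, ys, zs = zip(*coords)
    mins = (min(xs), min(ys), min(zs))
    maxs = (max(xs), max(ys), max(zs))
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * margin for lo, hi in zip(mins, maxs))
    return center, size


# Hypothetical ligand heavy-atom coordinates in Å (not taken from 8C2Z).
ligand = [(10.0, 4.0, -2.0), (14.0, 6.0, 1.0), (12.0, 9.0, 0.0)]
center, size = docking_box(ligand)

# Print the box using the option names of a Vina-style configuration file.
for axis, c, s in zip("xyz", center, size):
    print(f"center_{axis} = {c:.3f}")
    print(f"size_{axis} = {s:.3f}")
```

A larger margin value can be passed when a candidate ligand is noticeably longer than the reference ligand.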

ChEMBL Compounds

Compounds tested against DYRK1B (ChEMBL target ID CHEMBL5543) were extracted from the latest ChEMBL release. Salts were removed and canonical SMILES were generated to identify duplicates and aggregate activity values.

Stereochemistry was not enforced, so stereoisomers and undefined stereocenters may coexist in the dataset.

known_compounds.zip

The archive contains SMILES and SDF files, grouped by activity type (IC₅₀, K_i, K_d, inhibition). Datasets may overlap.

Field	Description
Molecule ChEMBL ID	Structure identifier
Standard Type	Activity type within the dataset
activity	Average activity value for identical structures
Standard Units	Units of activity measurement
activity_sd	Standard deviation (NA if single measurement)
p_activity	Log-scaled activity value for QSAR modeling
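
The fields above can be reproduced roughly as follows. This is a minimal sketch under stated assumptions, not the organizers' pipeline: it assumes activity values are in nM, that replicate values are averaged on the raw (not logarithmic) scale, and that p_activity is the negative decadic logarithm of the molar value. The SMILES keys and numbers are invented examples.

```python
import math
import statistics
from collections import defaultdict


def aggregate(records):
    """Group (structure_key, value_nM) pairs and summarize each structure.

    Returns {key: (mean_nM, sd_or_None, p_activity)}; None stands in for NA
    when a structure has a single measurement.
    """
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)

    summary = {}
    for key, values in grouped.items():
        mean = statistics.mean(values)
        sd = statistics.stdev(values) if len(values) > 1 else None
        # Assumption: -log10 of the molar value; 1 nM = 1e-9 M.
        p_activity = 9.0 - math.log10(mean)
        summary[key] = (mean, sd, p_activity)
    return summary


# Invented example rows: (canonical SMILES, IC50 in nM).
rows = [("CCO", 100.0), ("CCO", 1000.0), ("c1ccccc1", 10.0)]
summary = aggregate(rows)
```

Because the averaging scale is an assumption, it is worth comparing a few recomputed values against the p_activity column before relying on this for QSAR modeling.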

Files chembl_all_class.sdf / chembl_all_class.smi:

Compounds were labeled active if IC₅₀/K_i/K_d ≤ 1 µM or inhibition ≥ 90%. Compounds with conflicting activity annotations were removed. This dataset is suitable for classification modeling.

Field	Description
Molecule ChEMBL ID	Structure identifier
activity_class	Binary label (0 = inactive, 1 = active)
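
The labeling rule can be sketched as below. The thresholds come from the description above; how conflicts were actually detected is not stated, so this sketch assumes a compound is conflicting when its individual measurements give different calls. The function name and the input layout are hypothetical.

```python
POTENCY_TYPES = {"IC50", "Ki", "Kd"}


def label(measurements):
    """Return 1 (active), 0 (inactive), or None (conflicting or no usable data).

    measurements: list of (standard_type, value) with potencies in nM
    and inhibition in percent.
    """
    calls = set()
    for kind, value in measurements:
        if kind in POTENCY_TYPES:
            calls.add(1 if value <= 1000.0 else 0)  # 1 µM = 1000 nM
        elif kind == "Inhibition":
            calls.add(1 if value >= 90.0 else 0)
    if len(calls) != 1:
        return None  # assumed to be dropped from the classification set
    return calls.pop()


print(label([("IC50", 500.0), ("Kd", 200.0)]))
print(label([("IC50", 5000.0), ("Inhibition", 95.0)]))
```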

chembl_all_class.zip

This file contains precomputed LigandScout databases for the actives and inactives as defined above. They were generated with the icon-fast method (LigandScout's iCon conformer generator with FAST settings, up to 25 conformers per compound). For compounds with undefined stereocenters, LigandScout assigned arbitrary configurations.

Blind Set

The blind dataset contains 3,365 structures in multiple formats. Some molecules may have undefined stereochemistry, which should be considered when applying 3D-based methods.

blind_set.zip (SMILES and SDF)
blind_set.ldb.zip (LigandScout format; for compounds with undefined stereocenters, LigandScout assigned arbitrary configurations)

Additional Data and Tools

Participants may use any publicly available data sources and software tools, including proprietary software, to support model development and screening.

Submissions

Each submission must contain exactly 100 unique compound IDs. IDs may be uploaded as a text file or pasted directly into the submission field, with one ID per line. If fewer than 100 compounds are selected, remaining lines may be filled with random IDs.
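
A submission list that satisfies these rules can be assembled as in the sketch below. The ID format (MOL0001 and so on) and the function name are hypothetical; only the constraints (exactly 100 unique IDs, one per line, random padding allowed) come from the text above.

```python
import random


def build_submission(ranked_ids, all_ids, n=100, seed=0):
    """Return exactly n unique IDs: top-ranked picks first, random padding after."""
    picked = list(dict.fromkeys(ranked_ids))[:n]  # drop duplicates, keep order
    if len(picked) < n:
        pool = sorted(set(all_ids) - set(picked))
        # sample() raises ValueError if the pool is too small to pad to n.
        picked += random.Random(seed).sample(pool, n - len(picked))
    return picked


# Hypothetical blind-set IDs and a short, partly duplicated ranking.
all_ids = [f"MOL{i:04d}" for i in range(1, 3366)]
ranked = ["MOL0042", "MOL0007", "MOL0042", "MOL1234"]

submission = build_submission(ranked, all_ids)
submission_text = "\n".join(submission)  # one ID per line
```

The resulting text can be saved to a file or pasted into the submission field.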

Evaluation

Rankings are based on the proportion of correctly identified active compounds (recall). Each team may make up to 10 submissions; only the best result will be considered.
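
For local testing on known compounds, recall can be computed as in the sketch below. The IDs and the number of actives are invented; the true actives of the blind set are not disclosed here.

```python
def recall(selected_ids, active_ids):
    """Fraction of all actives that appear among the selected IDs."""
    actives = set(active_ids)
    if not actives:
        raise ValueError("recall is undefined without any actives")
    return len(set(selected_ids) & actives) / len(actives)


# Invented example: 40 actives, 12 of them among the 100 selected compounds.
actives = [f"A{i}" for i in range(40)]
selected = actives[:12] + [f"D{i}" for i in range(88)]
score = recall(selected, actives)

# With several submissions, only the best score counts.
best = max([0.10, score, 0.25])
```

Since every submission contains the same number of compounds, ranking by recall is equivalent to ranking by the number of true actives found.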

Leaderboard

The leaderboard is updated only after submissions. Use the Refresh button to view the latest results.

Prizes

Teams are ranked by score, with submission time used as a tie-breaker. On-site and online teams are evaluated separately.
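
The ranking rule can be illustrated with a short sort. This sketch assumes that a higher score is better and that the earlier submission wins a tie; the tie-break direction is not stated above. Team names, scores, and timestamps are invented.

```python
from datetime import datetime


def rank(teams):
    """Sort by score (descending), then by submission time (earlier first)."""
    return sorted(teams, key=lambda t: (-t["score"], t["submitted"]))


teams = [
    {"name": "alpha", "category": "online", "score": 0.30, "submitted": datetime(2026, 1, 20, 15, 0)},
    {"name": "beta", "category": "online", "score": 0.35, "submitted": datetime(2026, 1, 21, 9, 0)},
    {"name": "gamma", "category": "online", "score": 0.30, "submitted": datetime(2026, 1, 20, 11, 0)},
    {"name": "delta", "category": "on-site", "score": 0.20, "submitted": datetime(2026, 1, 20, 12, 0)},
]

# On-site and online teams are ranked separately.
online = [t["name"] for t in rank(teams) if t["category"] == "online"]
on_site = [t["name"] for t in rank(teams) if t["category"] == "on-site"]
```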

The top three teams in each category will receive RowanSci credits and will be invited to give short presentations describing their solutions.