Help & Documentation

1. General Description

LINKER-Pred Web Server is an integrated web platform for predicting Disordered Flexible Linkers (DFLs): unstructured regions that connect protein domains and facilitate molecular flexibility and communication. The web server offers two predictors, LINKER-Pred2 and LINKER-Pred-Lite.

Both models are based on convolutional neural networks (CNNs) trained on the DLD dataset and DisProt annotations. LINKER-Pred2 integrates ProtTrans and MSA-Transformer embeddings and achieves state-of-the-art performance on the CAID2 and CAID3 benchmarks, while LINKER-Pred-Lite is a fast, lightweight alternative that omits MSA-based features yet maintains strong predictive accuracy.

The LINKER-Pred Web Server enables large-scale, residue-level prediction of linker regions across proteomes, providing a practical tool for studying protein disorder and modularity.

2. Authors and Research Groups

Authors:
Di Meng1, Juliana Glavina2,3, Gianluca Pollastri1, and Lucía Beatriz Chemes2,3

1 School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland
2 Instituto de Investigaciones Biotecnológicas (IIB-INTECH), Universidad Nacional de San Martín (UNSAM) – Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
3 Escuela de Bio y Nanotecnologías (EByN), Universidad Nacional de San Martín

About the Group

University College Dublin:
Di Meng, Gianluca Pollastri

Universidad Nacional de San Martín:
Lucía B. Chemes, Juliana Glavina

Contact: dimeng093@gmail.com

3. How to Cite

If you use LINKER-Pred Web Server in your research, please cite the following publications:

[1] Meng, D., Glavina, J., Pollastri, G., & Chemes, L.B. (2025). LINKER-Pred web server: A Public Web Server for Accurate Prediction of Disordered Flexible Linkers in Proteins. Submitted.

[2] Meng, D., Glavina, J., Pollastri, G., & Chemes, L.B. (2025). Expanding the Landscape of Disordered Flexible Linkers: A Structural and Computational Framework for DLD Dataset Assembly. Submitted.

4. Result File Format

After a successful prediction, results are provided as a downloadable ZIP archive named [TASK_ID]_results.zip.

When unzipped, the archive contains one CSV file per input sequence, named [sequence_name].csv. Each CSV file has four columns:

Column | Description | Example
Residue Index | Position of the residue in the sequence (starting from 1) | 1, 2, 3, ...
Amino Acid | Single-letter code of the residue | A, G, L, K, ...
Predicted Linker Score | Continuous probability between 0 and 1 representing the likelihood that the residue is a linker residue | 0.023, 0.741, 0.998, ...
Predicted Label | Binary classification (1 = linker, 0 = non-linker), obtained by applying the user-selected threshold (default: 0.15) | 0, 1, 0, 0, 1, ...
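
The four columns can be post-processed with a few lines of Python, for instance to re-apply a different threshold or to group linker residues into contiguous segments. The following is only a minimal sketch, not part of the server: it assumes that the CSV header row uses exactly the column names listed above, that residue indices are consecutive, and that a residue is labelled as linker when its score is greater than or equal to the threshold (the server may use a strict comparison). The function names and the sample values are hypothetical.

```python
import csv
import io

# Assumed header names, copied from the column descriptions; real files may differ.
COL_INDEX = "Residue Index"
COL_AA = "Amino Acid"
COL_SCORE = "Predicted Linker Score"
COL_LABEL = "Predicted Label"


def read_predictions(handle):
    """Return a list of (index, amino_acid, score, label) tuples from one CSV."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append((int(row[COL_INDEX]), row[COL_AA],
                     float(row[COL_SCORE]), int(row[COL_LABEL])))
    return rows


def linker_segments(rows, threshold=0.15):
    """Group consecutive residues with score >= threshold into (start, end) pairs."""
    segments = []
    start = None
    prev = None
    for index, _aa, score, _label in rows:
        if score >= threshold:
            if start is None:
                start = index      # open a new segment
            prev = index           # extend the current segment
        elif start is not None:
            segments.append((start, prev))  # close the segment
            start = None
    if start is not None:
        segments.append((start, prev))      # segment runs to the last residue
    return segments


# Illustrative data only; these scores are not real predictions.
example = """Residue Index,Amino Acid,Predicted Linker Score,Predicted Label
1,M,0.023,0
2,G,0.741,1
3,S,0.998,1
4,L,0.050,0
5,K,0.300,1
"""
rows = read_predictions(io.StringIO(example))
print(linker_segments(rows))       # [(2, 3), (5, 5)]
print(linker_segments(rows, 0.5))  # [(2, 3)]
```

Raising the threshold gives fewer, higher-confidence linker segments; lowering it gives more permissive ones.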

For example, if a protein sequence named “P12345” is submitted, its results are found in P12345.csv inside [TASK_ID]_results.zip.
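
When many sequences are submitted, the per-sequence files can be read directly from the archive without unzipping it. The sketch below uses Python's standard zipfile module and is only an illustration under stated assumptions: the files are assumed to be UTF-8 encoded, the sequence name is taken from the file name without its .csv extension, and the function name iter_results is hypothetical. An in-memory archive stands in for [TASK_ID]_results.zip so the example runs on its own.

```python
import io
import zipfile
from pathlib import PurePosixPath


def iter_results(archive):
    """Yield (sequence_name, csv_text) for every CSV file in the archive.

    `archive` may be a path to the downloaded ZIP or a seekable file object.
    """
    with zipfile.ZipFile(archive) as zf:
        for member in zf.namelist():
            if member.lower().endswith(".csv"):
                # Strip any folder prefix and the .csv suffix to get the name.
                name = PurePosixPath(member).stem
                yield name, zf.read(member).decode("utf-8")  # UTF-8 is assumed


# Build a small stand-in archive in memory; the content is illustrative only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        "P12345.csv",
        "Residue Index,Amino Acid,Predicted Linker Score,Predicted Label\n"
        "1,M,0.023,0\n",
    )
buffer.seek(0)

results = dict(iter_results(buffer))
print(sorted(results))  # ['P12345']
```

With a real download, the path to the ZIP file would be passed to iter_results in place of the in-memory buffer.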