Download the DLD Dataset

The DLD dataset contains information on 2646 protein sequences, including labels for independent domain linkers (IDL), dependent domain linkers (DDL), intra-domain loops (L), structural domains (D), and termini (T). Below are the download options for both the JSON and CSV versions of the dataset and its associated files.

Main Dataset

domainLinker.chain: The main file containing all 2649 PDB chains with labels (reference). The label codes are 1 = IDL, 2 = DDL, 3 = L, 4 = D, and 5 = T.
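
When working with the labels programmatically, it can help to translate the numeric codes listed above into their short names. The following is a minimal Python sketch. The code-to-name mapping is taken from the list above, but the helper names (decode_labels, label_counts) are hypothetical, and the sketch assumes the codes have already been read from the file as a sequence of integers or digit characters. The on-disk layout of domainLinker.chain is not described here, so no parsing is attempted.

```python
from collections import Counter

# Label codes as listed for domainLinker.chain.
LABEL_NAMES = {
    1: "IDL",  # independent domain linker
    2: "DDL",  # dependent domain linker
    3: "L",    # intra-domain loop
    4: "D",    # structural domain
    5: "T",    # terminus
}


def decode_labels(codes):
    """Translate label codes (ints or digit characters) into short names.

    Assumption: `codes` is an iterable such as [5, 4, 4, 1] or "5441".
    Raises KeyError for any code outside 1-5.
    """
    return [LABEL_NAMES[int(code)] for code in codes]


def label_counts(codes):
    """Count how often each label name occurs in a sequence of codes."""
    return Counter(decode_labels(codes))


if __name__ == "__main__":
    example = [5, 4, 4, 1, 4, 5]  # made-up codes, for illustration only
    print(decode_labels(example))
    print(label_counts(example))
```

Raising an error on unknown codes, rather than silently skipping them, makes it easier to notice if a file uses a different encoding than the one assumed here.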

Region Files

These files contain data on specific regions of the protein sequences (more detailed information):

Download All Files as a ZIP

To download everything at once, click the button below to get a ZIP file containing all the dataset files.

Download All Files (ZIP)