Description of DroID - the Drosophila Interactions Database (Version 4.0)


The Drosophila Interactions Database (DroID) assembles gene or protein interaction data from a variety of sources into one location. All of the data in DroID can be accessed and downloaded in part or whole at the DroID home page, http://www.droidb.org. The data also can be searched, integrated, graphed, and downloaded using IM Browser.


This database currently includes gene-gene and protein-protein interactions. Although a gene may encode multiple proteins, the methods used to detect protein interactions rarely record which protein variant from a gene was used. Thus, protein interactions are represented here by pairs of genes. The precise way to interpret a protein interaction represented as "gene 1 - gene 2" is that one or more proteins encoded by gene 1 interact with one or more proteins encoded by gene 2. The gene identifiers used in this database are Flybase Gene Numbers, FBgn.

DroID is updated periodically. The current version is described in this document. Previous versions are described on the version history page and are available to download.

Version 4.0 - We updated DroID on July 3rd, 2008. In this version, we made an effort to ensure that every Flybase gene ID (FBgn) used in the database corresponded to a protein-coding gene (according to Flybase) at the time of the update, and to remove FBgn's that are possibly ambiguous. If an old FBgn had been split into two new primary FBgn's, we deleted the records involving it. As a result, some data sets may have fewer interactions than in the previous version of DroID. Refer to the Flybase documentation for more information about primary and secondary FBgn's.
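The cleanup rule described above can be sketched roughly as follows. This is a minimal illustration, not the code used to build DroID: the function names, the dictionary of secondary-to-primary IDs, the set of protein-coding genes, and all FBgn values shown are hypothetical.

```python
def resolve_fbgn(gene, secondary_to_primary, protein_coding):
    """Return a current protein-coding FBgn for `gene`, or None to drop it."""
    if gene in protein_coding:
        return gene
    primaries = secondary_to_primary.get(gene, set())
    # An old ID that maps to zero or to several primary IDs is treated
    # as unresolvable or ambiguous (assumed reading of the rule above).
    if len(primaries) != 1:
        return None
    (primary,) = primaries
    return primary if primary in protein_coding else None


def clean_interactions(interactions, secondary_to_primary, protein_coding):
    """Keep only interactions whose two genes both resolve cleanly."""
    cleaned = []
    for gene_a, gene_b in interactions:
        new_a = resolve_fbgn(gene_a, secondary_to_primary, protein_coding)
        new_b = resolve_fbgn(gene_b, secondary_to_primary, protein_coding)
        if new_a is not None and new_b is not None:
            cleaned.append((new_a, new_b))
    return cleaned


# Made-up identifiers, for illustration only.
protein_coding = {"FBgn0000001", "FBgn0000002", "FBgn0000005",
                  "FBgn0000006", "FBgn0000007"}
secondary_to_primary = {
    "FBgn0000003": {"FBgn0000005"},                 # simple rename
    "FBgn0000004": {"FBgn0000006", "FBgn0000007"},  # split: ambiguous
}
interactions = [
    ("FBgn0000001", "FBgn0000002"),
    ("FBgn0000003", "FBgn0000001"),
    ("FBgn0000004", "FBgn0000002"),
]
cleaned = clean_interactions(interactions, secondary_to_primary, protein_coding)
```

In this sketch the third record is dropped because its first gene split into two primary IDs, which is why a data set can shrink between versions.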

Summary of DroID v4.0 Data

Data set                        Number of interactions   Number of genes
Curagen yeast two-hybrid        20182                    6875
Finley Lab yeast two-hybrid     2915                     1225
Hybrigenics yeast two-hybrid    1856                     1282
Other physical interactions     897                      628
Human interologs                40548                    3996
Yeast interologs                64407                    2668
Worm interologs                 2383                     1432
Genetic interactions            5350                     1644

Below is a brief description of the various data sets. Definitions of the fields in each data set can be found further below.

Protein-protein interactions

Finley YTH - Includes protein interaction data generated in the Finley laboratory using the LexA yeast two-hybrid system, mostly from high throughput screens. The project is described here. Data versions are as follows.
Finley YTH v1.0 - 08/01/2004 - 423 interactions detected in a pilot screen using randomly selected Drosophila "bait" BD proteins. A list of the BD proteins used is here. (Zhong, Patel, Zhang, Mangiola, Stanyon, Finley, unpublished).
Finley YTH v2.5 - 12/10/2004 - Added 1,814 interactions detected in screens with 152 proteins related to cell cycle regulators. This data is described in Stanyon et al., 2004, Genome Biology, 5(12):R96. PMID: 15575970
Finley YTH v2.6 - 2/16/2007 - Secondary FBgn's mapped to primary FBgn's. Ambiguous FBgn's removed.
Finley YTH v3.0 - 7/2/2008 - Added results from a Y2H screen that tested computationally predicted protein-protein interactions. Those records are marked by 'Schwartz, Yu, Gardenour, Finley, Ideker, submitted' in the REFERENCE field.

Curagen YTH - Protein interactions detected in a high throughput yeast two-hybrid screen conducted at Curagen (New Haven, CT). The current version (V3.0) contains 20,182 interactions involving 6,875 proteins, or nearly half of the proteome. All of the interactions were assigned confidence scores, with roughly one quarter of them falling into the high confidence set (scores >0.5). This data was described in Giot et al., 2003, Science 302, 1727-1736. PMID: 14605208
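As a small illustration of the high confidence cutoff mentioned above, the following sketch selects records with a score greater than 0.5. The record layout and the "confidence" key are assumptions made for this example and are not the actual DroID column names; the gene IDs and scores are invented.

```python
def high_confidence(records, threshold=0.5):
    """Return records whose confidence score is strictly above the threshold."""
    return [rec for rec in records if rec["confidence"] > threshold]


# Invented records, for illustration only.
records = [
    {"gene_a": "FBgn0000001", "gene_b": "FBgn0000002", "confidence": 0.82},
    {"gene_a": "FBgn0000001", "gene_b": "FBgn0000003", "confidence": 0.50},
    {"gene_a": "FBgn0000004", "gene_b": "FBgn0000005", "confidence": 0.12},
]
selected = high_confidence(records)
```

Note that the comparison is strict, so a score of exactly 0.5 is not included, matching the ">0.5" wording above.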

Hybrigenics YTH - Protein interactions detected in high throughput yeast two-hybrid screens conducted at Hybrigenics (Paris, France). They used 102 bait proteins to detect >2,300 interactions, and assigned 710 of these to a high confidence group. This data was described in Formstecher et al., 2005, Genome Research 15, 376-384. PMID: 15710747. Hybrigenics provides interaction data based on internal coding sequence IDs, some of which could not be mapped to protein coding FBgn's. As a result, DroID V4 has 1,856 interactions involving 1,282 genes for this data set.

Other physical protein-protein interactions - These are experimentally derived physical interactions other than those from the three major YTH data sets above. These interactions are collected from the large databases (BioGRID, IntAct, MINT). The original database source and information is available for each interaction.

Genetic interactions

Genetic Interactions - Includes gene-gene interactions downloaded from Flybase in June 2008. These represent interactions between two gene alleles. For example, an allele of one gene may enhance or suppress the phenotype of an allele in another gene. Alternatively, the combination of two alleles may result in a "synthetic" phenotype not observed for either of the individual alleles.

Interolog data

Predicted interactions between Drosophila proteins based on experimental evidence for interactions between orthologous proteins in other species. We collected and integrated interactions for yeast, worm, and human from online interaction databases. Proteins in the resulting interaction sets were then mapped to Fly orthologs using InParanoid (version 6.0, August 2007), which is an orthology mapping algorithm. The dates on which the original data was downloaded are noted in each table.

Yeast Interologs - Yeast interactions were downloaded from BioGRID, IntAct, MINT, and MIPS in June 2008. The integrated interaction set was then mapped to Fly interologs using InParanoid, see above. For each interolog, IM Browser lists the source databases containing the original yeast interaction and the associated PubMed IDs.

Worm Interologs - Worm interactions were downloaded from BioGRID, IntAct, and MINT in June 2008. The integrated interaction set was then mapped to Fly interologs using InParanoid, see above. For each interolog, IM Browser lists the source databases containing the original worm interaction and the associated PubMed IDs.

Human Interologs - Human interactions were downloaded from BioGRID, HPRD, IntAct, MINT, Reactome and PDZBase in June 2008. The integrated interaction set was then mapped to Fly interologs using InParanoid, see above. For each interolog, IM Browser lists the source databases containing the original human interaction and the associated PubMed IDs.
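The interolog mapping described in this section can be sketched as follows: each interaction from another species is projected onto every pair of Fly orthologs of its two proteins, and the source databases are recorded for each predicted pair. This is only a sketch under stated assumptions. The function name, the input layout, and all identifiers are hypothetical, the ortholog table is assumed to have been built separately (for example from InParanoid output), and the order of each pair is simply kept as it appears in the source interaction.

```python
from itertools import product


def predict_interologs(source_interactions, orthologs):
    """Project interactions from another species onto Fly genes.

    source_interactions: iterable of (protein_a, protein_b, source_db)
    orthologs: dict mapping a source-species protein to a set of Fly FBgn's
    Returns a dict mapping (fly_gene_a, fly_gene_b) to a set of source_db names.
    """
    predicted = {}
    for prot_a, prot_b, source_db in source_interactions:
        fly_a_genes = orthologs.get(prot_a, ())
        fly_b_genes = orthologs.get(prot_b, ())
        # Proteins without a Fly ortholog yield no predicted pairs.
        for fly_a, fly_b in product(fly_a_genes, fly_b_genes):
            predicted.setdefault((fly_a, fly_b), set()).add(source_db)
    return predicted


# Invented identifiers, for illustration only.
orthologs = {
    "yeastA": {"FBgn0000001"},
    "yeastB": {"FBgn0000002", "FBgn0000003"},
}
source_interactions = [
    ("yeastA", "yeastB", "BioGRID"),
    ("yeastA", "yeastB", "MINT"),
    ("yeastA", "yeastC", "IntAct"),  # yeastC has no Fly ortholog here
]
interologs = predict_interologs(source_interactions, orthologs)
```

Because one protein can have several Fly orthologs, a single source interaction may produce more than one predicted Fly interaction, as with the hypothetical yeastB above.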

Table Definitions

The Drosophila Interactions Database contains two types of tables. Most tables store interaction data; one table stores Drosophila gene attribute data. Table column names (used in downloaded text files), their short descriptive names (used in IM Browser when right clicking an interaction and choosing 'Edge attributes'), and their explanations are provided below for reference.

Finley Yeast Two Hybrid Data

Curagen Yeast Two Hybrid Data

 

Hybrigenics Yeast Two Hybrid Data

Genetic Interactions

Yeast Interologs

Worm Interologs

Human Interologs

Gene Attributes