Automatically constructing a directory of molecular biology databases

Luciano Barbosa, Sumit Tandon, Juliana Freire

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

There has been an explosion in the volume of biology-related information that is available in online databases. But finding the right information can be challenging. Not only is this information spread over multiple sources, but often, it is hidden behind form interfaces of online databases. There are several ongoing efforts that aim to simplify the process of finding, integrating and exploring these data. However, existing approaches are not scalable and require substantial manual input. Notable examples include the NCBI databases and the NAR database compilation. As an important step towards a scalable solution to this problem, we describe a new infrastructure that automates, to a large extent, the process of locating and organizing online databases. We show how this infrastructure can be used to automate the construction and maintenance of a Molecular Biology database collection. We also provide an evaluation which shows that the infrastructure is scalable and effective: it is able to efficiently locate and accurately identify the relevant online databases.

Original language: English (US)
Title of host publication: Data Integration in the Life Sciences - 4th International Workshop, DILS 2007, Proceedings
Pages: 6-16
Number of pages: 11
Volume: 4544 LNBI
State: Published - 2007
Event: 4th International Workshop on Data Integration in the Life Sciences, DILS 2007 - Philadelphia, PA, United States
Duration: Jun 27 2007 to Jun 29 2007

Other

Other: 4th International Workshop on Data Integration in the Life Sciences, DILS 2007
Country: United States
City: Philadelphia, PA
Period: 6/27/07 to 6/29/07

Fingerprint

Chemical Databases
Directories
Molecular biology
Databases
Infrastructure
Explosions
Compilation
Biology
Maintenance
Simplify
Evaluation

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Barbosa, L., Tandon, S., & Freire, J. (2007). Automatically constructing a directory of molecular biology databases. In Data Integration in the Life Sciences - 4th International Workshop, DILS 2007, Proceedings (Vol. 4544 LNBI, pp. 6-16).

@inproceedings{0c9ae2f6766c4549a3e6531ab7934be8,
title = "Automatically constructing a directory of molecular biology databases",
abstract = "There has been an explosion in the volume of biology-related information that is available in online databases. But finding the right information can be challenging. Not only is this information spread over multiple sources, but often, it is hidden behind form interfaces of online databases. There are several ongoing efforts that aim to simplify the process of finding, integrating and exploring these data. However, existing approaches are not scalable and require substantial manual input. Notable examples include the NCBI databases and the NAR database compilation. As an important step towards a scalable solution to this problem, we describe a new infrastructure that automates, to a large extent, the process of locating and organizing online databases. We show how this infrastructure can be used to automate the construction and maintenance of a Molecular Biology database collection. We also provide an evaluation which shows that the infrastructure is scalable and effective: it is able to efficiently locate and accurately identify the relevant online databases.",
author = "Luciano Barbosa and Sumit Tandon and Juliana Freire",
year = "2007",
language = "English (US)",
isbn = "3540732543",
volume = "4544 LNBI",
pages = "6--16",
booktitle = "Data Integration in the Life Sciences - 4th International Workshop, DILS 2007, Proceedings",

}

TY - GEN

T1 - Automatically constructing a directory of molecular biology databases

AU - Barbosa, Luciano

AU - Tandon, Sumit

AU - Freire, Juliana

PY - 2007

Y1 - 2007

AB - There has been an explosion in the volume of biology-related information that is available in online databases. But finding the right information can be challenging. Not only is this information spread over multiple sources, but often, it is hidden behind form interfaces of online databases. There are several ongoing efforts that aim to simplify the process of finding, integrating and exploring these data. However, existing approaches are not scalable and require substantial manual input. Notable examples include the NCBI databases and the NAR database compilation. As an important step towards a scalable solution to this problem, we describe a new infrastructure that automates, to a large extent, the process of locating and organizing online databases. We show how this infrastructure can be used to automate the construction and maintenance of a Molecular Biology database collection. We also provide an evaluation which shows that the infrastructure is scalable and effective: it is able to efficiently locate and accurately identify the relevant online databases.

UR - http://www.scopus.com/inward/record.url?scp=34547468874&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34547468874&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:34547468874

SN - 3540732543

SN - 9783540732549

VL - 4544 LNBI

SP - 6

EP - 16

BT - Data Integration in the Life Sciences - 4th International Workshop, DILS 2007, Proceedings

ER -