TACITuS: Transcriptomic data collector, integrator, and selector on big data platform

Salvatore Alaimo, Antonio Di Maria, Dennis Shasha, Alfredo Ferro, Alfredo Pulvirenti

Research output: Contribution to journal › Article

Abstract

Background: Several large public repositories of microarray and RNA-seq data are available; two prominent examples are ArrayExpress and NCBI GEO. Unfortunately, there is no easy way to import and manipulate data from such resources, because the data is stored in large files that require high bandwidth to download and special-purpose data manipulation tools to extract the subsets relevant to a specific analysis.

Results: TACITuS is a web-based system that supports rapid query access to high-throughput microarray and NGS repositories. The system is equipped with modules capable of managing large files, storing them in a cloud environment, and extracting subsets of data easily and efficiently. It can also import data into Galaxy for further analysis.

Conclusions: TACITuS automates most of the pre-processing needed to analyze high-throughput microarray and NGS data from large publicly available repositories. The system implements several modules to manage large files easily and efficiently. Furthermore, it can interface with the Galaxy environment, allowing users to analyze data through a user-friendly interface.

Original language: English (US)
Article number: 366
Journal: BMC Bioinformatics
Volume: 20
DOI: 10.1186/s12859-019-2912-4
State: Published - Nov 22 2019

Keywords

  • Cloud storage and management
  • Galaxy
  • RNA-Seq

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

TACITuS: Transcriptomic data collector, integrator, and selector on big data platform. / Alaimo, Salvatore; Di Maria, Antonio; Shasha, Dennis; Ferro, Alfredo; Pulvirenti, Alfredo.

In: BMC Bioinformatics, Vol. 20, 366, 22.11.2019.

@article{03472b6b2262498186a8841f6004e877,
title = "TACITuS: Transcriptomic data collector, integrator, and selector on big data platform",
abstract = "Background: Several large public repositories of microarray and RNA-seq data are available; two prominent examples are ArrayExpress and NCBI GEO. Unfortunately, there is no easy way to import and manipulate data from such resources, because the data is stored in large files that require high bandwidth to download and special-purpose data manipulation tools to extract the subsets relevant to a specific analysis. Results: TACITuS is a web-based system that supports rapid query access to high-throughput microarray and NGS repositories. The system is equipped with modules capable of managing large files, storing them in a cloud environment, and extracting subsets of data easily and efficiently. It can also import data into Galaxy for further analysis. Conclusions: TACITuS automates most of the pre-processing needed to analyze high-throughput microarray and NGS data from large publicly available repositories. The system implements several modules to manage large files easily and efficiently. Furthermore, it can interface with the Galaxy environment, allowing users to analyze data through a user-friendly interface.",
keywords = "Cloud storage and management, Galaxy, RNA-Seq",
author = "Salvatore Alaimo and {Di Maria}, Antonio and Dennis Shasha and Alfredo Ferro and Alfredo Pulvirenti",
year = "2019",
month = "11",
day = "22",
doi = "10.1186/s12859-019-2912-4",
language = "English (US)",
volume = "20",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - TACITuS

T2 - Transcriptomic data collector, integrator, and selector on big data platform

AU - Alaimo, Salvatore

AU - Di Maria, Antonio

AU - Shasha, Dennis

AU - Ferro, Alfredo

AU - Pulvirenti, Alfredo

PY - 2019/11/22

Y1 - 2019/11/22

AB - Background: Several large public repositories of microarray and RNA-seq data are available; two prominent examples are ArrayExpress and NCBI GEO. Unfortunately, there is no easy way to import and manipulate data from such resources, because the data is stored in large files that require high bandwidth to download and special-purpose data manipulation tools to extract the subsets relevant to a specific analysis. Results: TACITuS is a web-based system that supports rapid query access to high-throughput microarray and NGS repositories. The system is equipped with modules capable of managing large files, storing them in a cloud environment, and extracting subsets of data easily and efficiently. It can also import data into Galaxy for further analysis. Conclusions: TACITuS automates most of the pre-processing needed to analyze high-throughput microarray and NGS data from large publicly available repositories. The system implements several modules to manage large files easily and efficiently. Furthermore, it can interface with the Galaxy environment, allowing users to analyze data through a user-friendly interface.

KW - Cloud storage and management

KW - Galaxy

KW - RNA-Seq

UR - http://www.scopus.com/inward/record.url?scp=85075438928&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85075438928&partnerID=8YFLogxK

U2 - 10.1186/s12859-019-2912-4

DO - 10.1186/s12859-019-2912-4

M3 - Article

C2 - 31757212

AN - SCOPUS:85075438928

VL - 20

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 366

ER -