Design and implementation of contextual information portals

Jay Chen, Russell Power, Lakshminarayanan Subramanian, Jonathan Ledlie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a system for enabling offline web use to satisfy the information needs of disconnected communities. We describe the design, implementation, evaluation, and pilot deployment of an automated mechanism to construct Contextual Information Portals (CIPs). CIPs are large searchable information repositories of web pages tailored to the information needs of a target population. We combine an efficient classifier with a focused crawler to gather the web pages for the portal for any given topic. Given a set of topics of interest, our system constructs a CIP containing the most relevant pages from the web across these topics. Using several secondary school course syllabi, we demonstrate the effectiveness of our system for constructing CIPs for use as an education resource. We evaluate our system across several metrics: classification accuracy, crawl scalability, crawl accuracy and harvest rate. We describe the utility and usability of our system based on a preliminary deployment study at an after-school program in India, and also outline our ongoing larger-scale pilot deployment at five schools in Kenya.
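The abstract's pairing of a classifier with a focused crawler, and its harvest-rate metric, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy link graph `TOY_WEB`, the keyword-based `relevance` scorer standing in for a trained classifier, the `threshold` value, and the function name `focused_crawl` are all assumptions made for illustration. Harvest rate is computed here in its usual sense, as the fraction of fetched pages judged relevant.

```python
import heapq

# Hypothetical toy "web": URL -> (page text, outgoing links). Not real data.
TOY_WEB = {
    "seed": ("photosynthesis in plants biology", ["a", "b"]),
    "a": ("chlorophyll and photosynthesis light reactions", ["c"]),
    "b": ("celebrity gossip and sports scores", ["d"]),
    "c": ("plant cell biology notes", []),
    "d": ("more sports scores", []),
}

# Assumed topic vocabulary; a real system would use a trained classifier.
TOPIC_TERMS = {"photosynthesis", "biology", "chlorophyll", "plant", "plants"}


def relevance(text):
    """Stand-in classifier: fraction of words that are topic terms."""
    words = text.split()
    return sum(w in TOPIC_TERMS for w in words) / len(words) if words else 0.0


def focused_crawl(seed, budget, threshold=0.25):
    """Best-first crawl: follow links from the most relevant pages first."""
    frontier = [(-1.0, seed)]  # min-heap used as max-heap via negated priority
    seen = {seed}
    fetched, kept = [], []
    while frontier and len(fetched) < budget:
        _, url = heapq.heappop(frontier)
        text, links = TOY_WEB[url]
        score = relevance(text)
        fetched.append(url)
        if score >= threshold:
            kept.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link's priority is the relevance of the page that cites it.
                heapq.heappush(frontier, (-score, link))
    harvest_rate = len(kept) / len(fetched) if fetched else 0.0
    return kept, harvest_rate


kept, rate = focused_crawl("seed", budget=4)
print(kept, rate)
```

With a fetch budget of four pages, the crawl reaches the relevant page `c` before the off-topic page `d`, because links found on relevant pages are prioritized over links found on irrelevant ones.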

Original language: English (US)
Title of host publication: Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011
Pages: 453-462
Number of pages: 10
DOIs: https://doi.org/10.1145/1963192.1963359
State: Published - 2011
Event: 20th International Conference Companion on World Wide Web, WWW 2011 - Hyderabad, India
Duration: Mar 28 2011 → Apr 1 2011

Other

Other: 20th International Conference Companion on World Wide Web, WWW 2011
Country: India
City: Hyderabad
Period: 3/28/11 → 4/1/11

Fingerprint

World Wide Web
Websites
Scalability
Classifiers
Education
Web crawler

Keywords

  • document classification
  • focused crawling
  • offline
  • web portal

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems

Cite this

Chen, J., Power, R., Subramanian, L., & Ledlie, J. (2011). Design and implementation of contextual information portals. In Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011 (pp. 453-462) https://doi.org/10.1145/1963192.1963359

Design and implementation of contextual information portals. / Chen, Jay; Power, Russell; Subramanian, Lakshminarayanan; Ledlie, Jonathan.

Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011. 2011. p. 453-462.

Chen, J, Power, R, Subramanian, L & Ledlie, J 2011, Design and implementation of contextual information portals. in Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011. pp. 453-462, 20th International Conference Companion on World Wide Web, WWW 2011, Hyderabad, India, 3/28/11. https://doi.org/10.1145/1963192.1963359
Chen J, Power R, Subramanian L, Ledlie J. Design and implementation of contextual information portals. In Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011. 2011. p. 453-462 https://doi.org/10.1145/1963192.1963359
Chen, Jay ; Power, Russell ; Subramanian, Lakshminarayanan ; Ledlie, Jonathan. / Design and implementation of contextual information portals. Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011. 2011. pp. 453-462
@inproceedings{0a896d367fd24607bad4dded679b0b54,
title = "Design and implementation of contextual information portals",
abstract = "This paper presents a system for enabling offline web use to satisfy the information needs of disconnected communities. We describe the design, implementation, evaluation, and pilot deployment of an automated mechanism to construct Contextual Information Portals (CIPs). CIPs are large searchable information repositories of web pages tailored to the information needs of a target population. We combine an efficient classifier with a focused crawler to gather the web pages for the portal for any given topic. Given a set of topics of interest, our system constructs a CIP containing the most relevant pages from the web across these topics. Using several secondary school course syllabi, we demonstrate the effectiveness of our system for constructing CIPs for use as an education resource. We evaluate our system across several metrics: classification accuracy, crawl scalability, crawl accuracy and harvest rate. We describe the utility and usability of our system based on a preliminary deployment study at an after-school program in India, and also outline our ongoing larger-scale pilot deployment at five schools in Kenya.",
keywords = "document classification, focused crawling, offline, web portal",
author = "Jay Chen and Russell Power and Lakshminarayanan Subramanian and Jonathan Ledlie",
year = "2011",
doi = "10.1145/1963192.1963359",
language = "English (US)",
isbn = "9781450305181",
pages = "453--462",
booktitle = "Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011",

}

TY - GEN

T1 - Design and implementation of contextual information portals

AU - Chen, Jay

AU - Power, Russell

AU - Subramanian, Lakshminarayanan

AU - Ledlie, Jonathan

PY - 2011

Y1 - 2011

N2 - This paper presents a system for enabling offline web use to satisfy the information needs of disconnected communities. We describe the design, implementation, evaluation, and pilot deployment of an automated mechanism to construct Contextual Information Portals (CIPs). CIPs are large searchable information repositories of web pages tailored to the information needs of a target population. We combine an efficient classifier with a focused crawler to gather the web pages for the portal for any given topic. Given a set of topics of interest, our system constructs a CIP containing the most relevant pages from the web across these topics. Using several secondary school course syllabi, we demonstrate the effectiveness of our system for constructing CIPs for use as an education resource. We evaluate our system across several metrics: classification accuracy, crawl scalability, crawl accuracy and harvest rate. We describe the utility and usability of our system based on a preliminary deployment study at an after-school program in India, and also outline our ongoing larger-scale pilot deployment at five schools in Kenya.

KW - document classification

KW - focused crawling

KW - offline

KW - web portal

UR - http://www.scopus.com/inward/record.url?scp=79955153090&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79955153090&partnerID=8YFLogxK

U2 - 10.1145/1963192.1963359

DO - 10.1145/1963192.1963359

M3 - Conference contribution

AN - SCOPUS:79955153090

SN - 9781450305181

SP - 453

EP - 462

BT - Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011

ER -