Image analysis and length estimation of biomolecules using AFM

Andrew Sundstrom, Silvio Cirrone, Salvatore Paxia, Carlin Hsueh, Rachel Kjolby, James K. Gimzewski, Jason Reed, Bhubaneswar Mishra

Research output: Contribution to journal, Article

Abstract

There are many examples of problems in pattern analysis for which it is often possible to obtain systematic characterizations, if in addition a small number of useful features or parameters of the image are known a priori or can be estimated reasonably well. Often, the relevant features of a particular pattern analysis problem are easy to enumerate, as when statistical structures of the patterns are well understood from the knowledge of the domain. We study a problem from molecular image analysis, where such a domain-dependent understanding may be lacking to some degree and the features must be inferred via machine-learning techniques. In this paper, we propose a rigorous, fully automated technique for this problem. We are motivated by an application of atomic force microscopy (AFM) image processing needed to solve a central problem in molecular biology, aimed at obtaining the complete transcription profile of a single cell, a snapshot that shows which genes are being expressed and to what degree. Reed et al. (Single molecule transcription profiling with AFM, Nanotechnology, vol. 18, no. 4, 2007) showed that the transcription profiling problem reduces to making high-precision measurements of biomolecule backbone lengths, correct to within 20-25 bp (6-7.5 nm). Here, we present an image processing and length estimation pipeline using AFM that comes close to achieving these measurement tolerances. In particular, we develop a biased length estimator on trained coefficients of a simple linear regression model, biweighted by a Beaton-Tukey function, whose feature universe is constrained by James-Stein shrinkage to avoid overfitting. In terms of extensibility and addressing the model selection problem, this formulation subsumes the models we studied.
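To make the phrase "simple linear regression model, biweighted by a Beaton-Tukey function" more concrete, the following is a minimal sketch of a one-feature linear fit with Beaton-Tukey biweights, solved by iteratively reweighted least squares. It is not the authors' estimator: the function names, the tuning constant `c = 4.685`, the median-based residual scale, and the toy data are all assumptions chosen for illustration, and the James-Stein shrinkage step mentioned in the abstract is omitted.

```python
from statistics import median


def tukey_biweight(u, c=4.685):
    """Beaton-Tukey biweight: (1 - (u/c)^2)^2 for |u| < c, and 0 otherwise."""
    if abs(u) >= c:
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


def weighted_line_fit(xs, ys, ws):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def biweight_line_fit(xs, ys, c=4.685, iters=20):
    """Refit the line repeatedly, down-weighting points with large residuals."""
    a, b = weighted_line_fit(xs, ys, [1.0] * len(xs))  # ordinary least squares start
    for _ in range(iters):
        res = [y - (a + b * x) for x, y in zip(xs, ys)]
        # Robust residual scale (assumed choice): median absolute residual / 0.6745.
        scale = median(abs(r) for r in res) / 0.6745
        if scale < 1e-12:  # residuals are essentially zero; nothing left to reweight
            break
        ws = [tukey_biweight(r / scale, c) for r in res]
        a, b = weighted_line_fit(xs, ys, ws)
    return a, b


# Hypothetical toy data: x is a measured contour length, y the known true length.
# Nine points lie on y = 1 + 2x; the last one is a gross outlier.
xs = [float(i) for i in range(10)]
ys = [1.0 + 2.0 * x for x in xs]
ys[9] += 50.0
a, b = biweight_line_fit(xs, ys)
```

On this toy data the outlier pulls the initial least-squares slope well away from the inlier line, and the reweighting steps then suppress that point so the fit returns to the line followed by the other nine. A real pipeline would train on several image-derived features rather than one.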

Original language: English (US)
Article number: 6228535
Pages (from-to): 1200-1207
Number of pages: 8
Journal: IEEE Transactions on Information Technology in Biomedicine
Volume: 16
Issue number: 6
DOI: https://doi.org/10.1109/TITB.2012.2206819
State: Published - 2012

Fingerprint

Atomic Force Microscopy
Biomolecules
Transcription
Image analysis
Linear Models
Image processing
Nanotechnology
Molecular biology
Linear regression
Learning systems
Research Design
Pipelines
Genes
Molecules

Keywords

  • Atomic force microscopy (AFM)
  • Beaton-Tukey
  • biased estimation
  • biomolecule
  • biweight
  • cDNA
  • digital contour
  • DNA
  • image processing
  • length estimation
  • linear regression
  • machine learning
  • RNA
  • single molecule
  • supervised learning

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Biotechnology
  • Computer Science Applications
  • Medicine (all)

Cite this

Sundstrom, A., Cirrone, S., Paxia, S., Hsueh, C., Kjolby, R., Gimzewski, J. K., ... Mishra, B. (2012). Image analysis and length estimation of biomolecules using AFM. IEEE Transactions on Information Technology in Biomedicine, 16(6), 1200-1207. [6228535]. https://doi.org/10.1109/TITB.2012.2206819

@article{bd1cb27b864b4ec7a5dad5e4e8609303,
title = "Image analysis and length estimation of biomolecules using AFM",
abstract = "There are many examples of problems in pattern analysis for which it is often possible to obtain systematic characterizations, if in addition a small number of useful features or parameters of the image are known a priori or can be estimated reasonably well. Often, the relevant features of a particular pattern analysis problem are easy to enumerate, as when statistical structures of the patterns are well understood from the knowledge of the domain. We study a problem from molecular image analysis, where such a domain-dependent understanding may be lacking to some degree and the features must be inferred via machine-learning techniques. In this paper, we propose a rigorous, fully automated technique for this problem. We are motivated by an application of atomic force microscopy (AFM) image processing needed to solve a central problem in molecular biology, aimed at obtaining the complete transcription profile of a single cell, a snapshot that shows which genes are being expressed and to what degree. Reed et al. (Single molecule transcription profiling with AFM, Nanotechnology, vol. 18, no. 4, 2007) showed that the transcription profiling problem reduces to making high-precision measurements of biomolecule backbone lengths, correct to within 20-25 bp (6-7.5 nm). Here, we present an image processing and length estimation pipeline using AFM that comes close to achieving these measurement tolerances. In particular, we develop a biased length estimator on trained coefficients of a simple linear regression model, biweighted by a Beaton-Tukey function, whose feature universe is constrained by James-Stein shrinkage to avoid overfitting. In terms of extensibility and addressing the model selection problem, this formulation subsumes the models we studied.",
keywords = "Atomic force microscopy (AFM), Beaton-Tukey, biased estimation, biomolecule, biweight, cDNA, digital contour, DNA, image processing, length estimation, linear regression, machine learning, RNA, single molecule, supervised learning",
author = "Andrew Sundstrom and Silvio Cirrone and Salvatore Paxia and Carlin Hsueh and Rachel Kjolby and Gimzewski, {James K.} and Jason Reed and Bhubaneswar Mishra",
year = "2012",
doi = "10.1109/TITB.2012.2206819",
language = "English (US)",
volume = "16",
pages = "1200--1207",
journal = "IEEE Journal of Biomedical and Health Informatics",
issn = "2168-2194",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "6",

}

TY - JOUR

T1 - Image analysis and length estimation of biomolecules using AFM

AU - Sundstrom, Andrew

AU - Cirrone, Silvio

AU - Paxia, Salvatore

AU - Hsueh, Carlin

AU - Kjolby, Rachel

AU - Gimzewski, James K.

AU - Reed, Jason

AU - Mishra, Bhubaneswar

PY - 2012

Y1 - 2012

AB - There are many examples of problems in pattern analysis for which it is often possible to obtain systematic characterizations, if in addition a small number of useful features or parameters of the image are known a priori or can be estimated reasonably well. Often, the relevant features of a particular pattern analysis problem are easy to enumerate, as when statistical structures of the patterns are well understood from the knowledge of the domain. We study a problem from molecular image analysis, where such a domain-dependent understanding may be lacking to some degree and the features must be inferred via machine-learning techniques. In this paper, we propose a rigorous, fully automated technique for this problem. We are motivated by an application of atomic force microscopy (AFM) image processing needed to solve a central problem in molecular biology, aimed at obtaining the complete transcription profile of a single cell, a snapshot that shows which genes are being expressed and to what degree. Reed et al. (Single molecule transcription profiling with AFM, Nanotechnology, vol. 18, no. 4, 2007) showed that the transcription profiling problem reduces to making high-precision measurements of biomolecule backbone lengths, correct to within 20-25 bp (6-7.5 nm). Here, we present an image processing and length estimation pipeline using AFM that comes close to achieving these measurement tolerances. In particular, we develop a biased length estimator on trained coefficients of a simple linear regression model, biweighted by a Beaton-Tukey function, whose feature universe is constrained by James-Stein shrinkage to avoid overfitting. In terms of extensibility and addressing the model selection problem, this formulation subsumes the models we studied.

KW - Atomic force microscopy (AFM)

KW - Beaton-Tukey

KW - biased estimation

KW - biomolecule

KW - biweight

KW - cDNA

KW - digital contour

KW - DNA

KW - image processing

KW - length estimation

KW - linear regression

KW - machine learning

KW - RNA

KW - single molecule

KW - supervised learning

UR - http://www.scopus.com/inward/record.url?scp=84870947621&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84870947621&partnerID=8YFLogxK

U2 - 10.1109/TITB.2012.2206819

DO - 10.1109/TITB.2012.2206819

M3 - Article

VL - 16

SP - 1200

EP - 1207

JO - IEEE Journal of Biomedical and Health Informatics

JF - IEEE Journal of Biomedical and Health Informatics

SN - 2168-2194

IS - 6

M1 - 6228535

ER -