Getting Started with CellProfiler

Mark-Anthony Bray, Ph.D., Imaging Platform, Broad Institute, Cambridge, Massachusetts, USA

Upload: riya-kidner

Post on 28-Mar-2015



Software Overview

• Available from www.cellprofiler.org
• Free, open source (Python)
• Software available for Windows, Mac and Linux

Image Analysis & Quantification

Image-centric Data Analysis

CellProfiler: Overview

• Processes large sets of images
• Identifies and measures objects
• Exports data for further analysis
• Goal: Provide powerful image analysis methods with a user-friendly interface
• Philosophy: Measure everything, ask questions later...
• Support data analysis based on individual cells

Typical CellProfiler Pipeline Workflow

• For image-based assays, the basic objective is always to
  – Identify cells/organisms
  – Measure feature(s) of interest
• The uniqueness of each assay comes in
  – Deciding what compartments to identify and how to identify them
  – Determining which measure(s) are most useful to identify interesting samples

The CellProfiler Interface

• Pipeline panel: Displays modules in pipeline
  – Modules executed in order from top to bottom
  – Change module position
  – Add or remove modules
  – Module help

The CellProfiler Interface

• File panel: Displays files in default image folder
  – Load pipeline by double-clicking on it
  – View images by double-clicking on the filename

The CellProfiler Interface

• The figure window has additional menu options
• Toolbar menu: Pan, zoom in/out
• CellProfiler Image Tools
  – Image Tool (also displayed by clicking on image)
  – Interactive zoom
  – Show pixel data (location, intensity)

The CellProfiler Interface

• Folder panel: Change default input and output directories
  – Usually these should be separate folders
  – Input folder: Contains images to be analyzed
  – Output folder: Contains the output file plus exported data and images

The CellProfiler Interface

• Settings panel: View and change settings for each module
  – Clicking on a different module updates the settings view

Module Categories

• File processing: Image input, file output
• Image processing: Often used for pre-processing prior to object identification
• Object processing: Identification, modification of objects of interest
• Measurement: Collection of measurements from objects of interest
• Data Tools: Measurement exploration, measurement output

The First Module: LoadImages

• Loads an “image set” which is a group of related images, in preparation for further processing
• Related how? Depending on the imaging device, one file may represent
  – One channel at one imaging location
  – Multiple channels at one imaging location
  – Multiple channels at multiple locations
  – Etc.

[Figure: example image set with DNA and GFP channels]

The First Module: LoadImages

• Can use text matching to define the difference between images in a set
  – All images stained for GFP have the text Channel1- in the name
  – Same for DNA images (Channel2-)
  – Assign each image a meaningful name for downstream reference
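
The text-matching idea can be sketched outside CellProfiler in a few lines of Python. This is only an illustration, not the LoadImages implementation: the function name `group_image_sets`, the tag-to-name mapping, and the file names are hypothetical, and the sketch assumes that removing the channel tag from a file name leaves a key shared by all images of the same set.

```python
from collections import defaultdict

# Hypothetical mapping from filename text to a meaningful image name,
# mirroring the Channel1-/Channel2- example on the slide.
CHANNEL_TAGS = {"Channel1-": "GFP", "Channel2-": "DNA"}


def group_image_sets(filenames, channel_tags=CHANNEL_TAGS):
    """Group files into image sets keyed by the filename with the tag removed."""
    image_sets = defaultdict(dict)
    for name in sorted(filenames):
        for tag, channel in channel_tags.items():
            if tag in name:
                site = name.replace(tag, "")  # what remains identifies the set
                image_sets[site][channel] = name
                break
    return dict(image_sets)


files = [
    "Channel1-01-A-01.tif", "Channel2-01-A-01.tif",
    "Channel1-02-A-02.tif", "Channel2-02-A-02.tif",
]
sets = group_image_sets(files)
for site, channels in sets.items():
    print(site, channels)
```

Each resulting image set maps the names GFP and DNA to one file, which is the pairing that downstream modules would refer to.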

What Is An “Image”?

Images from Carolina Wählby

Object Identification

• Once the images are loaded, how do you find objects of interest?
• Step 1: Distinguish the foreground from the background by picking a good threshold
• Step 2: Identify objects as regions brighter than the threshold
• Step 3: Cut and join objects to “improve” their shape
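
Steps 1 and 2 can be illustrated with a small pure-Python sketch: pixels above a threshold are foreground, and each 4-connected foreground region becomes one labeled object. The function name `label_objects`, the toy image, and the choice of 4-connectivity are assumptions for illustration; this is not CellProfiler's code, and Step 3 (cutting and joining) is not covered.

```python
from collections import deque


def label_objects(image, threshold):
    """Label 4-connected regions of pixels brighter than the threshold.

    Returns (labels, count): labels has the image's shape, with 0 for
    background and 1..count for objects.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > threshold and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # flood-fill the region containing (r, c)
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] > threshold
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


image = [
    [0.1, 0.9, 0.8, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1, 0.6],
    [0.1, 0.1, 0.1, 0.1, 0.9],
]
labels, n_objects = label_objects(image, threshold=0.5)
print(n_objects)
```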

Primary Object Identification

• Many options for thresholding, cut and join methods, etc.

Thresholding

• Definition: Division of the image into background and foreground
• What is the best threshold value for dividing the intensity histogram into foreground and background pixels?

[Figure: intensity histogram (x-axis: pixel values, y-axis: frequency) with two candidate thresholds marked “Here?” and “Or here?”]

• Method: Pick the method that provides the best results
  – Otsu: Default. Good for readily identifiable foreground / background
  – Background, RobustBackground: Good for images in which most of the image consists of background
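
To make the histogram question concrete, here is a minimal sketch of Otsu's method, which picks the level that maximizes the between-class variance of background and foreground. It assumes integer pixel values in the range 0 to `levels - 1`; the function name `otsu_threshold` and the toy pixel list are illustrative, and CellProfiler's own implementation may differ in detail.

```python
def otsu_threshold(pixels, levels=256):
    """Return the level t that maximizes between-class variance.

    Pixels with value <= t count as background, > t as foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]            # number of background pixels at this cut
        sum_bg += t * hist[t]      # sum of background intensities
        w_fg = total - w_bg
        if w_bg == 0:
            continue
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy image: many dim background pixels and fewer bright foreground pixels.
pixels = [10] * 50 + [12] * 50 + [200] * 20 + [210] * 20
t = otsu_threshold(pixels)
print(t)
```

For this toy histogram the chosen level falls in the gap between the dim and bright populations.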

Thresholding

• Correction factor
  – Multiplication factor applied to threshold
  – Adjusts threshold stringency/leniency
  – Setting this factor is empirical
• Upper/lower bounds
  – Set safety limits on automatic threshold to guard against false positives
  – Helpful for unexpected images: Empty wells, images with dramatic artifacts, etc.
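
A minimal sketch of how a correction factor and bounds might combine, assuming the factor is applied first and the result is then clamped to the bounds. The function name `adjust_threshold`, the default values, and that ordering are assumptions for illustration rather than a statement of CellProfiler's internals.

```python
def adjust_threshold(auto_threshold, correction=1.0, lower=0.0, upper=1.0):
    """Scale an automatic threshold, then clamp it to [lower, upper]."""
    corrected = auto_threshold * correction
    return min(max(corrected, lower), upper)


# A stricter threshold for a normal image (factor > 1 raises the threshold).
normal = adjust_threshold(0.2, correction=1.5)

# An empty well: the automatic threshold is near zero, so the lower bound
# keeps background noise from being called foreground.
empty_well = adjust_threshold(0.001, correction=1.5, lower=0.05)

print(normal, empty_well)
```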

Object Separation

• Once the foreground objects have been identified, we need to distinguish multiple objects contained in the same “clump”

Images from Carolina Wählby

Object Separation

• Two-step process in “de-clumping”
  1. Identification of the objects in a clump
  2. Drawing boundaries between the clumped objects
• Adjust settings to “de-clump” objects

Object Separation

• Clump identification: Two options
  – Intensity: Works best if objects are brighter at center, dimmer at edges
  – Shape: Works best if objects have indentations where clumps touch (esp. if objects are round)

[Figure: clumped objects with intensity peaks and shape indentations labeled]

Object Separation

• Drawing boundaries: Two options
  – Distance: Draws boundary lines midway between object centers
  – Intensity: Draws boundary lines at dimmest line between objects
• Test mode allows users to view results of all setting combinations

Object Separation

• Additional separation settings: Adjust these settings if objects are being incorrectly split into pieces or merged together
• Smoothing: Increase to reduce intensity irregularities which produce over-segmentation of objects

[Figure: original image; smoothing filter size = 4; smoothing filter size = 8]

Object Separation

• Suppress Local Maxima
  – Minimum allowed distance between intensity peaks; peaks closer together than this are treated as one object rather than a clump
  – Decrease to reduce improper merging of objects in clumps

[Figure: original image; maxima with distance = 4; maxima with distance = 8]
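
The effect of the maxima suppression distance can be sketched on a one-dimensional intensity profile: candidate peaks are visited from brightest to dimmest, and a peak is kept only if no already-kept peak lies closer than the minimum distance. The function name `find_peaks`, the toy profile, and the greedy strategy are illustrative assumptions, not CellProfiler's algorithm.

```python
def find_peaks(profile, min_distance):
    """Return indices of local maxima that survive distance-based suppression."""
    candidates = [
        i for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
    ]
    kept = []
    # Brightest peaks win; dimmer peaks too close to a kept peak are dropped.
    for i in sorted(candidates, key=lambda idx: profile[idx], reverse=True):
        if all(abs(i - j) >= min_distance for j in kept):
            kept.append(i)
    return sorted(kept)


profile = [0, 2, 5, 3, 4, 1, 0, 3, 6, 2, 0]
for distance in (1, 4, 8):
    print(distance, find_peaks(profile, distance))
```

A small distance keeps all three peaks (three objects), while larger distances merge neighbors into fewer objects, which matches the advice to decrease the setting when objects are improperly merged.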

Object Separation

• The proper settings are usually a matter of trial and error
  – The automatic settings are a good starting point, though
• However, adjusting these parameters can produce more improper segmentation than it solves

[Figure: original image; smoothing filter size = 4; smoothing filter size = 8]

Filtering Invalid Objects

• Discard objects that fail size criterion or touch the image border
• See FilterObjects module for more advanced filtering options
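
A short sketch of the two filters, working on a labeled grid (0 = background, positive integers = objects). It assumes size is measured as pixel count, which is a simplification; the function name `filter_objects` and the toy grid are hypothetical.

```python
def filter_objects(labels, min_size, max_size):
    """Return labels of objects within the size range that avoid the border."""
    rows, cols = len(labels), len(labels[0])
    sizes, touching = {}, set()
    for r in range(rows):
        for c in range(cols):
            lab = labels[r][c]
            if lab == 0:
                continue
            sizes[lab] = sizes.get(lab, 0) + 1
            if r in (0, rows - 1) or c in (0, cols - 1):
                touching.add(lab)  # object has a pixel on the image border
    return sorted(
        lab for lab, size in sizes.items()
        if min_size <= size <= max_size and lab not in touching
    )


labels = [
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 2, 2, 0],
    [0, 0, 0, 2, 2, 0],
    [0, 3, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
# Object 1 touches the border, object 3 is too small, object 2 is valid.
valid = filter_objects(labels, min_size=2, max_size=10)
print(valid)
```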

Primary Object Identification

• Colors used to label each segmented object
  – Shows if each object has been identified and separated properly
• Outlines highlight valid objects
  – Green: Valid
  – Yellow: Invalid (touching border)
  – Red: Invalid (size criterion)
• Gives object count as a measurement

Secondary Object Identification

• Goal: Identify individual cell boundaries by “growing” primary objects using a staining channel
• Segment nuclei first, then use segmented nuclei to start cell segmentation
  – Nuclei typically more uniform in shape, more easily separated than cells

Secondary Object Identification

• Methods
  – Distance-N: Ignores image information
    • Useful in cases where no cell stain is present
  – Watershed, Propagation, Distance-B: Use image information
    • Find dividing lines between objects and background / neighbors
• Test mode allows user to view results of all methods

[Figure: results of Propagation and Distance-N]

Secondary Object Identification

• Regularization: Controls the precise dividing line between cells that touch each other
  – Performed by balancing between intensity and distance
  – Usually not adjusted
• Correction factor, lower/upper bounds on threshold: Same purpose as in IdentifyPrimaryObjects

[Figure: Regularization = 0 vs. Regularization = ∞]

Tertiary Object Identification

• Goal: Identify tertiary objects by removing the primary objects from secondary objects
  – “Subtract” the nuclei objects from cell objects to obtain cytoplasm

Cells - Nuclei = Cytoplasm
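
If each object is represented as a set of pixel coordinates, the subtraction is a plain set difference. The representation, the function name `subtract_objects`, and the toy shapes below are assumptions made for illustration.

```python
def subtract_objects(cell_pixels, nucleus_pixels):
    """Return the cell pixels that are not part of the nucleus."""
    return cell_pixels - nucleus_pixels


# A 5x5 cell containing a 3x3 nucleus in its center.
cell = {(r, c) for r in range(5) for c in range(5)}
nucleus = {(r, c) for r in range(1, 4) for c in range(1, 4)}
cytoplasm = subtract_objects(cell, nucleus)
print(len(cytoplasm))
```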

Measurement Modules: Object Morphology

• Select the objects to measure

Module: MeasureObjectAreaShape

• Goal: Measure morphological features such as
  – Area
  – Perimeter
  – Eccentricity
  – MajorAxisLength
  – MinorAxisLength
  – Orientation
  – FormFactor: Compactness measure, circle = 1, line = 0
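
FormFactor is commonly computed as 4π·Area / Perimeter², which gives 1 for a circle and approaches 0 for a very thin shape. The sketch below uses ideal geometric areas and perimeters rather than pixel-based estimates, and the function name `form_factor` is illustrative.

```python
import math


def form_factor(area, perimeter):
    """Compactness: 4*pi*area / perimeter**2 (1 for a circle)."""
    return 4.0 * math.pi * area / perimeter ** 2


radius = 10.0
circle = form_factor(math.pi * radius ** 2, 2 * math.pi * radius)

# A 100 x 0.01 rectangle stands in for a nearly line-like object.
thin = form_factor(100.0 * 0.01, 2 * (100.0 + 0.01))

print(circle, thin)
```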

Measurement Modules: Object Intensity

• Select the image to measure from
• Select the objects to measure

Module: MeasureObjectIntensity

• Goal: Measure object intensity features such as
  – Integrated intensity: Sum of the pixel intensities within an object
  – Mean, median, standard deviation intensities
  – Maximal and minimal pixel intensities
  – Lower/Upper quartile
• The object intensity may be obtained from any image, not just the image used to identify the object
  – Example: Ph3 intensity may be measured using the nuclei objects
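
These features are ordinary summary statistics over the pixels inside one object. The sketch below assumes the object's pixel intensities have already been collected into a list; the function name `intensity_features`, the dictionary keys, the use of the population standard deviation, and the "inclusive" quartile method are choices made for illustration.

```python
import statistics


def intensity_features(pixels):
    """Summarize the intensities of the pixels belonging to one object."""
    lower_q, median, upper_q = statistics.quantiles(
        pixels, n=4, method="inclusive"
    )
    return {
        "integrated": sum(pixels),
        "mean": statistics.mean(pixels),
        "median": median,
        "std": statistics.pstdev(pixels),
        "min": min(pixels),
        "max": max(pixels),
        "lower_quartile": lower_q,
        "upper_quartile": upper_q,
    }


# Intensities could come from any channel, e.g. a Ph3 image sampled
# at the pixel positions of a nucleus object.
nucleus_pixels = [10, 20, 30, 40, 50]
features = intensity_features(nucleus_pixels)
print(features)
```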

Measurement Modules: Object Texture

• Select the image to measure from
• Select the objects to measure
• Select the spatial scale

MeasureObjectTexture

• Goal: Determine whether the staining pattern is smooth on a particular scale
• Selection of the appropriate texture scale is essentially empirical
  – A higher number measures larger patterns of texture
  – Smaller numbers measure more localized (finer) patterns of texture
• Can also add several texture modules to the pipeline, each measuring a different texture scale

Other Measurement Modules

• CalculateMath: Arithmetic operations for measurements
• CalculateStatistics: Assay quality (V and Z' factors) and dose response data (EC50) for all measurements
• Image-based measures
  – MeasureImageAreaOccupied
  – MeasureImageGranularity
  – MeasureImageIntensity
• Object-based measures
  – MeasureCorrelation
  – MeasureObjectNeighbors
  – MeasureRadialDistribution
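
As an illustration of the assay-quality statistics mentioned for CalculateStatistics, the Z' factor is conventionally defined as 1 - 3(σ_pos + σ_neg) / |μ_pos - μ_neg| over the positive and negative controls. The sketch below uses the sample standard deviation; the function name `z_prime` and the control values are hypothetical, and this is not the module's own code.

```python
import statistics


def z_prime(positive, negative):
    """Z' factor computed from positive and negative control measurements."""
    mu_p, mu_n = statistics.mean(positive), statistics.mean(negative)
    sd_p, sd_n = statistics.stdev(positive), statistics.stdev(negative)
    return 1.0 - 3.0 * (sd_p + sd_n) / abs(mu_p - mu_n)


positive_controls = [98, 100, 102]
negative_controls = [9, 10, 11]
z = z_prime(positive_controls, negative_controls)
print(z)
```

Values near 1 indicate well-separated controls; values near or below 0 indicate overlapping controls.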

Data Export Modules

• User may output images or image measurements
• Select the objects to export

Measurement Display

• The average measurements for all objects in the image are displayed in the figure window
• However, the individual measurements for each object are stored in the output file

Data Export Modules

• Goal: Retain images of intermediate image processing steps for quality control, or save measurements for later analysis and exploration
• SaveImages: Writes an image to a file
  – Intermediate images in the pipeline are not saved unless requested
  – Choice of many image formats to write → module can be used as an image format converter
• ExportToSpreadsheet: Export measurements as a comma-separated file readable by spreadsheet programs
• ExportToDatabase: Export measurements as per-image and per-object tables plus a configuration file for upload to a MySQL database
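
For orientation, a comma-separated per-object table can be produced with Python's standard `csv` module as sketched below. The column names and values are illustrative only, the function name `export_measurements` is hypothetical, and the actual files written by ExportToSpreadsheet contain many more columns.

```python
import csv
import io


def export_measurements(rows, fieldnames):
    """Serialize per-object measurements as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"ImageNumber": 1, "ObjectNumber": 1, "AreaShape_Area": 412},
    {"ImageNumber": 1, "ObjectNumber": 2, "AreaShape_Area": 388},
]
text = export_measurements(rows, ["ImageNumber", "ObjectNumber", "AreaShape_Area"])
print(text)
```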

Illumination Correction

• The physical limitations of any microscope produce nonuniformities in the optical path of the sample, microscope, and/or camera
• Example: Tiling raw images shows that there is uneven illumination from left to right in each image
  – This heterogeneity can lead to inaccurate intensity measurements
  – A cell located at (a) is brighter than one at (b) even if the cells have the same amount of fluorescent material

[Figure: tiled raw images with locations (a) and (b) marked]

Carpenter et al, Genome Biology 2006, 7:R100

Illumination Correction

• Illumination correction ensures that object segmentation and measurements (e.g. DNA content) are more accurate

Carpenter et al, Genome Biology 2006, 7:R100

Illumination Correction

• Two modules
  – CorrectIlluminationCalculate: Creates an illumination correction function
  – CorrectIlluminationApply: Applies the function to your images
• Available options
  – Correct each image individually, or all images together as an ensemble?
  – Calculate the illumination function by using foreground pixels or background pixels?
  – Apply the function using division or subtraction?
• Additional considerations
  – Create a new illumination correction function if you image on a different microscope or change plates
  – Correct each channel since absolute illumination intensities may differ between channels
  – First, create and save the function from the image set, then load and apply it prior to identification
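
The "division or subtraction" option can be sketched as a pixel-wise operation between an image and an illumination function of the same shape. The function name `apply_illumination`, the nested-list image representation, and the toy values are assumptions for illustration; how the illumination function itself is estimated is not shown.

```python
def apply_illumination(image, illum, method="divide"):
    """Correct an image pixel by pixel with an illumination function."""
    if method == "divide":
        return [[p / f for p, f in zip(row, frow)]
                for row, frow in zip(image, illum)]
    if method == "subtract":
        return [[p - f for p, f in zip(row, frow)]
                for row, frow in zip(image, illum)]
    raise ValueError("method must be 'divide' or 'subtract'")


# Two cells with the same amount of stain, but the right side of the
# field is illuminated twice as strongly as the left side.
raw = [[0.2, 0.4]]
illumination = [[1.0, 2.0]]
corrected = apply_illumination(raw, illumination, method="divide")
print(corrected)
```

After division the two pixels have the same value, which is the behavior wanted for cells at positions such as (a) and (b) in the tiled example.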

Cluster Computing

• If processing time is too great on a single computer, then run the pipeline on a cluster
  – Download and install CellProfiler on a computing cluster
  – Add the ExportToDatabase module
  – Add the CreateBatchFiles module to the end of the pipeline and configure it appropriately
  – Run the first image cycle locally
  – Submit the batches to your cluster for processing
  – Check the progress of processing
• For really big screens, it is necessary to process images in batches on a computing cluster.

Data Analysis

• At the end of a pipeline, you may have 500+ features per cell
  – Size, shape, staining intensity, texture (smoothness), etc.
• Remember our Philosophy: “Measure everything, ask questions later...”

Data Analysis

• What does this data set look like?
• Cytological profile, or Cytoprofile
• Shows all the measurements acquired
  – For each individual cell
  – In every image
  – In the entire experiment

[Figure: cytoprofile of cell #6111617 on a scale from -1 to +1, with example feature values -0.2, 0.7, -0.1, 0, 0.2, -0.9]

CellProfiler Analyst: Overview

• Explore data from large sets of images
• Identify interesting subpopulations and see the original images
• Identify interesting phenotypes automatically
• Goal: Provide the user with a powerful suite of image exploration and machine learning methods

The CellProfiler Analyst Interface

• CellProfiler Analyst (CPA) allows you to explore the data with a variety of tools
• Upon startup, CPA requests a properties file which contains
  – Locations of the measurement tables
  – How the images are referenced
  – Other assorted information

Plate Viewer

• Displays data in plate layout
  – 96- or 384-well format
  – Measurements are shown as color-coded wells or mouse tooltips
  – Right-clicking on well reveals list of images to display

Image Viewer

• Displays an image referenced by number
• Color display
  – Colors are assigned to each channel of image data
  – Shown as a merged color image
  – Toggle channel visibility and color scaling

Plotting Tools

• Various plotting tools allow user to explore and sift through the measurements and make discoveries

Data Analysis

• Why make so many measurements?
  – For many screens, only a few measurements are necessary to obtain the phenotype

[Figure: plot with x-axis: DNA content, y-axis: phospho-H3 staining]

Data Analysis

• Unfortunately, for other phenotypes, the proper features are not so simple to find…

[Figure panels: Wild-type HT29 cells; Cells on the move; Crescent-shaped nuclei; Peas in a pod; Crooked projections; Actin dots at junctions; Long projections; Hyphae-like projections]

Data Analysis

• Concentrating on single cells allows us to avoid problems of heterogeneous populations, and to detect rare events (such as mitosis)
• However, determining which combinations of features and values are appropriate for a phenotype is tedious and impractical
• We have included a machine learning classification tool to automatically choose the features and values required to score a rare or subtle phenotype

Automated Cell Image Processing

• Cytoprofile of 500+ features measured for each cell
• 10^4 images, 10^3 cells in each: Total of 10^7 cells/experiment

[Figure labels: Thousands of wells; Each cell with cytoprofile]

Iterative Machine Learning

• System presents ~500 cells to biologists for scoring
• System defines rule based on cytoprofile of scored cells

[Figure: iterative loop with labels Yes, No, Rule, Iteration]

Iterative Machine Learning

• Scored cells are sorted by well: Identify samples with a high proportion of positive cells

[Figure labels: 10^7 cells; Rule; Scored]
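
Sorting scored cells by well amounts to computing, per well, the fraction of cells the rule called positive, and then ranking the wells. The sketch below assumes each scored cell is a (well, is_positive) pair; the function name `positive_fraction_by_well` and the well names are hypothetical.

```python
from collections import Counter


def positive_fraction_by_well(scored_cells):
    """Rank wells by the fraction of cells scored positive, highest first."""
    totals, positives = Counter(), Counter()
    for well, is_positive in scored_cells:
        totals[well] += 1
        if is_positive:
            positives[well] += 1
    fractions = {well: positives[well] / totals[well] for well in totals}
    return sorted(fractions.items(), key=lambda item: item[1], reverse=True)


scored = [
    ("A01", False), ("A01", False), ("A01", True), ("A01", False),
    ("B07", True), ("B07", True), ("B07", False), ("B07", True),
]
ranking = positive_fraction_by_well(scored)
print(ranking)
```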

Final Notes

• Where to get help
  – Access help from the CellProfiler main window
  – Ask for help on the CellProfiler.org forum

The Team

Image assay development: Apply image analysis methods to biological questions

Algorithm development & software engineering: Develop & test new image analysis and data mining methods and create open-source software tools

IT/Administration

Director

Mark Bray, Anne Carpenter, David Logan, Peggy (Margaret) Anthony, Kate Madden, Ray Jones, Vebjørn Ljoså, Auguste Genovesio (begins 2010), Adam Fraser, Carolina Wählby, Lee Kamentsky