CARMEN and spike detection and sorting
Leslie S. SmithUniversity of Stirling, Scotland, UK
http://www.carmen.org.uk
INCF June 17 2012
Contents
• CARMEN architecture and project
• Neural Data Format (NDF)
• Workflows (?)
• Project status
  – Where we are now
• Some reflections
CARMEN ‘Cloud’ (CAIRN)
[Architecture diagram] The CARMEN ‘Cloud’ comprises:
• Web portal and rich clients, with raw-signal data search & visualisation
• Security: policies controlling access to data & code
• Workflow enactment engine: enactment of scientific analysis processes
• Registry service: search for data & analysis code
• Repository: a structured metadata store (enabling search & annotation), an analysis code store, and a raw & derived data store
• A compute cluster on which services are dynamically deployed
CARMEN project status
• Initially a 4-year UK e-Science project
  – From September 2006 to March 2011
• Extended with a BBSRC Tools and Techniques grant
  – To September 2014
• Major work in the last 2 years has been
  – Improving the user interface
  – NDF implementation (working)
  – Workflow implementation (nearly there!)
CARMEN and spikes
Note that CARMEN also provides other services, including higher-level ones.
Data, services and workflows
• CARMEN supports
  – Data and metadata
  – Services, which process data, and
  – Workflows (almost): concatenations of services
• Initially:
  – We allowed more or less any data format
  – Services processed one data format and produced a different data format
• …but to develop workflows (and to enable interoperability between services)
  – We now strongly recommend using our Neural Data Format (NDF)
Neural Data Format (NDF)
• An NDF dataset consists of an XML configuration file containing metadata and references to the associated host data files.
• A special XML element, History, is included within the header file to record data-processing history. This element contains the full history (recording chain) of previous processing steps.
• The NDF API has been implemented as a C library. It translates the XML tree/nodes into C-style data structures and insulates clients from the data structures within the binary data file.
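The NDF API itself is a C library, but the idea of reading metadata out of an NDF configuration file can be sketched with the Python standard library. This is only an illustration, not the real API: the field names follow the example configuration file shown later in this deck, and the helper name `read_ndf_header` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Default namespace used by NDF configuration files (from the example slide)
NS = {"ndf": "http://www.carmen.org.uk"}

# A minimal, abridged NDF configuration, adapted from the example in this deck
NDF_XML = """<?xml version="1.0" encoding="UTF-8"?>
<ndtfDataCfg xmlns="http://www.carmen.org.uk">
  <Version>1.0.1</Version>
  <NdtfDataID>897A9272-4E6F-4F32-8A63-89C699F99120</NdtfDataID>
  <GeneralInfo>
    <Description>NDF Spike Detector Service (COB) Version 2 - SNN</Description>
    <Laboratory>Carmen VLE</Laboratory>
    <CreateDate>2012-06-13</CreateDate>
  </GeneralInfo>
</ndtfDataCfg>"""

def read_ndf_header(xml_text):
    """Return selected metadata fields from an NDF configuration file."""
    root = ET.fromstring(xml_text)
    return {
        "version": root.findtext("ndf:Version", namespaces=NS),
        "dataset_id": root.findtext("ndf:NdtfDataID", namespaces=NS),
        "description": root.findtext("ndf:GeneralInfo/ndf:Description",
                                     namespaces=NS),
    }

header = read_ndf_header(NDF_XML)
print(header["version"], "-", header["description"])
```

Because all elements live in the `http://www.carmen.org.uk` default namespace, each path component needs the namespace prefix when queried.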
NDF: Supported datatypes
[Table of supported data types]
NDF XML file

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<ndtfDataCfg xmlns="http://www.carmen.org.uk"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.carmen.org.uk ndtfDataCfg.xsd">
  <Version>1.0.1</Version>
  <NdtfDataID>897A9272-4E6F-4F32-8A63-89C699F99120</NdtfDataID>
  <GeneralInfo>
    <Description>NDF Spike Detector Service (COB) Version 2 - SNN</Description>
    <Laboratory>Carmen VLE</Laboratory>
    <CreateDate>2012-06-13</CreateDate>
    <CreateTime>16:29:03</CreateTime>
    <RecordID>N/A</RecordID>
  </GeneralInfo>
  <DataSet>…
NDF XML file cont’d

  <History>
    <Processor>
      <ProcessingDateTime StartDateTime="2012-06-01T16:16:08"/>
      <CommandLine>mat2ndf (m0192_all.mat,m0192_all.ndf)</CommandLine>
    </Processor>
    <Processor>
      <ProcessingDateTime StartDateTime="2012-06-13T17:05:58"/>
      <CommandLine>Spike detector COB NDF m0192_all.ndf, 256, 0.002, 15, 30, no</CommandLine>
      <ProcessingSettings>Spike Detector (COB) for NDF data</ProcessingSettings>
    </Processor>
  </History>
</ndtfDataCfg>
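Each processing step appends one Processor element to History, so the full provenance chain accumulates in the header. A hedged Python sketch of that mechanism follows; the element and attribute names come from the example above, but the function name `append_processor` and the in-memory approach are illustrative assumptions (the real NDF API is a C library).

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.carmen.org.uk"
ET.register_namespace("", NS_URI)  # keep NDF's default namespace on output

def append_processor(root, start_datetime, command_line, settings=None):
    """Record one processing step in the History element (creating it if absent)."""
    history = root.find(f"{{{NS_URI}}}History")
    if history is None:
        history = ET.SubElement(root, f"{{{NS_URI}}}History")
    proc = ET.SubElement(history, f"{{{NS_URI}}}Processor")
    ET.SubElement(proc, f"{{{NS_URI}}}ProcessingDateTime",
                  StartDateTime=start_datetime)
    ET.SubElement(proc, f"{{{NS_URI}}}CommandLine").text = command_line
    if settings is not None:
        ET.SubElement(proc, f"{{{NS_URI}}}ProcessingSettings").text = settings
    return proc

# Example: record a detection step (parameters copied from the slide above)
root = ET.fromstring(f'<ndtfDataCfg xmlns="{NS_URI}"/>')
append_processor(root, "2012-06-13T17:05:58",
                 "Spike detector COB NDF m0192_all.ndf, 256, 0.002, 15, 30, no",
                 "Spike Detector (COB) for NDF data")
print(ET.tostring(root, encoding="unicode"))
```

A downstream service can then reconstruct exactly how a derived dataset was produced by walking the Processor elements in order.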
NDF-based services
• Filtering services:
  – HPF, LPF, BPF
• Spike detectors:
  – Single- or multiple-channel signal
  – Simple thresholding (positive/negative/both-sided), NEO (Teager energy operator), Cepstrum of Bispectrum
• Spike sorters:
  – K-means
  – Waveclus (superparamagnetic clustering)
• We can add new spike detectors and new spike sorters reasonably easily
  – Wrapping services
• The user interface allows specific channels and sections of the dataset to be selected
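To give a feel for one of the detector families listed above, here is a minimal NEO (Teager energy operator) detector. This is the generic textbook form of the operator, not the CARMEN service's code, and the threshold scaling constant `c` is an assumption.

```python
def neo_spike_indices(x, c=8.0):
    """Detect candidate spikes with the Nonlinear Energy Operator (NEO).

    psi[n] = x[n]^2 - x[n-1]*x[n+1]; samples where psi exceeds
    c * mean(psi) are flagged as spike candidates.  The scaling
    constant c is a tunable assumption, not a CARMEN parameter.
    """
    psi = [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
    threshold = c * sum(psi) / len(psi)
    # indices are shifted by 1 because psi[0] corresponds to x[1]
    return [n + 1 for n, p in enumerate(psi) if p > threshold]

# A flat trace with one sharp transient around sample 50
trace = [0.0] * 100
trace[49], trace[50], trace[51] = 0.3, 1.0, 0.2
print(neo_spike_indices(trace))
```

NEO emphasises signals that are simultaneously large in amplitude and high in frequency, which is why it often separates spikes from low-frequency background better than plain amplitude thresholding.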
Workflows
• Currently at alpha-stage testing:
  – Can create workflows (graphically), generate scripts, store them, and apply security and sharing appropriately; execution of workflows is almost ready.
• Workflows will be generable either graphically or using a scripting language.
Workflow graphical interface
Where are we now? On the cluster
• Can run single services:
  – But joining them together requires user intervention
  – (NDF services do read each other’s data correctly)
  – New NDF services can be “wrapped”
• Have run workshops on this
• Can turn a variety of formats into NDF
  – mcd, nev, nex, plx, map, smr, abf, abf2
• Spike detectors of three sorts
  – Can process multi-electrode data in one service
• Spike sorters of two sorts (K-means and Waveclus)
• But no workflows yet (promised soon!)
  – …also not really enough public datasets
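The K-means sorter mentioned above clusters detected spikes by their waveform features. A self-contained pure-Python sketch of that idea follows; the choice of peak amplitude and spike width as features, and the function name `kmeans`, are illustrative assumptions rather than details of the CARMEN service.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means: assign each point to its nearest centroid, then
    recompute centroids, for a fixed number of iterations."""
    def nearest(p, cents):
        return min(range(len(cents)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(p, cents[i])))
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep old centroid if a cluster empties
                centroids[i] = tuple(sum(d) / len(members)
                                     for d in zip(*members))
    return [nearest(p, centroids) for p in points], centroids

# Hypothetical 2-D features per detected spike: (peak amplitude, width in samples)
features = [(1.0, 5), (1.1, 6), (0.9, 5),      # putative unit A
            (3.0, 12), (3.2, 11), (2.9, 13)]   # putative unit B
labels, _ = kmeans(features, k=2)
print(labels)
```

K-means needs the number of units `k` in advance, which is one reason superparamagnetic clustering (as in Waveclus) is offered as an alternative: it does not fix the cluster count up front.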
Where are we now? Local systems
• NDF toolbox is available for Matlab (downloadable)
  – Runs on recent versions of Matlab: PC, Mac, Linux
• Services which run on the cluster are, or will shortly be, available to run locally under Matlab
  – Not really the intent of the project, but it does enable service running and testing (and debugging!) to be carried out locally
• Environment is essentially the same as on the cluster
• Local workflows enabled through writing XML files and simple(-ish) scripts
  – Can test the wrapping of scripts locally
CARMEN and validation
• Validation of services
• Testing services on multiple datasets
• Testing multiple services on datasets
  – Locally
  – On the Portal
Why is this so difficult? Why has it taken so long? What lessons can we learn?
• Initially allowed users to write services for their own data types
  – Ties services to specific types: not easily shareable
• NDF proved complex to implement
  – Generality, multiple-language support
• User interface for the portal was difficult
  – Wanted to support non-technical users
• Existing software proved difficult to use
  – Software developed for R&D systems proved not to be robust; expecting it to be was overly optimistic
• Insufficient development staff
  – Underestimated programming/development requirements
  – Supporting early development projects (“low-hanging fruit”) took a lot of time
CARMEN consortium