Friday 14th of April 2017, 9H30–18H00
Centre de Recherches Interdisciplinaires (CRI) - Faculté de Médecine, site Cochin Port-Royal, aile sud, 2ème étage - Université Paris Descartes - 24, rue du Faubourg Saint Jacques - 75014 Paris
Nozha Boujemaa - Research Director In Big Data
Defense Master's Theses
PhD Poster Session
Escape Game
Free Breakfast - Free Buffet - Free Drinks - Free Entrance
Program
Amphithéâtre LUTON - 24 rue du Faubourg Saint Jacques, 75014 Paris
14th of April 2017
9H00 Breakfast
10H00 Keynote Speaker: Nozha Boujemaa, Research Director, Advisor to the Chairman & CEO of Inria in Big Data
10H55 Coffee break
11H20 Thibault Corneloup - Spatial statistics for the analysis of the organization of P-bodies
11H45 Charles Bernard - Assessing the phenotypic heterogeneity of a cell population via single cell RNA-Seq data analysis
12H10 Ahmed Saadawi - Integrative modeling of the TREX complex
12H35 Eléonore Bellot - Emergence of complexity in dance choreography
13H00 Lunch
14H00 Sebastián Sosa-Carrillo - Parameters searching for a synthetic toggle switch
14H25 Mislav Acman - Evaluation of monotonic regression method for the prediction of complex phenotypes from transcriptomic data
14H50 Presentation of CRI data science club: Urszula Czerwinska
15H05 Escape Game
16H05 PhD poster presentation, with coffee served at the same time
Olga Seminck: Predicting Processing Cost of Anaphora Resolution
Aamir Abbasi: Towards a fast brain-machine interface integrating artificial somatosensory feedback
Christophe Coste: Trait Level Analysis of Multitrait Population Projection Matrices
16H45 Uri Barenholz - Invited Speaker
17H30 Buffet
Spatial statistics for the analysis of the organization of P-bodies
Thibault Corneloup
Centre de Recherches Interdisciplinaires
Intern at Modeling and image analysis team, morphogenesis, signalization, modelling department, INRA-Versailles
P-bodies are membrane-less ribonucleoprotein complexes located in the cytoplasm of eukaryotic cells and involved in the degradation and translational repression of mRNA. Within the framework of the ARNPbodyStruc project (D. Weil), the IJPB modelling team (P. Andrey) aims to develop a 3D model of the architecture of P-bodies based on transmission electron microscopy (TEM) data. A prerequisite is to develop tools for comparing the model's predictions with TEM observations on the basis of the distributions of gold particles which, coupled with antibodies, are used to reveal the spatial location of proteins involved in the assembly of P-bodies.
Because immuno-TEM data are point-like in nature, we adopted a statistical approach based on the analysis of point processes to make this comparison. The approach consists in characterizing the distributions of gold particles by confronting them with models of spatial distribution, and in quantifying the gap between the observations and the spatial model by means of distance functions (Andrey et al. 2010).
Using random models and classical distance functions, we have already established the feasibility and value of this statistical approach, and shown that the protein DDX6 follows a non-random, clustered distribution, which probably corresponds to the association of DDX6 with mRNA (Ernoult-Lange et al. 2012). However, new distance functions and new spatial models must be developed to characterize the distributions of gold particles at a finer resolution. The objective will be the analysis of radial distributions, implemented in C++, the analysis of alignments, and the analysis of multivariate patterns for the comparison of the distributions of different proteins such as DDX6, LSM14 and GE1.
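The comparison between an observed particle pattern and a random spatial model can be sketched as follows. This is only an illustration under stated assumptions: a unit-square observation window, complete spatial randomness (CSR) as the reference model, and the mean nearest-neighbour distance as the distance-based summary. The function names are hypothetical and this is not the team's C++ implementation.

```python
import numpy as np

def mean_nn_distance(points):
    """Mean distance from each point to its nearest neighbour."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # ignore self-distances
    return dist.min(axis=1).mean()

def csr_rank_test(observed, n_sim=199, seed=0):
    """Monte Carlo test against CSR in the unit square (assumed window).

    Returns the fraction of patterns (simulated plus observed) whose mean
    nearest-neighbour distance is at most the observed one. A small value
    suggests that the observed particles are clustered.
    """
    rng = np.random.default_rng(seed)
    obs_stat = mean_nn_distance(observed)
    sims = np.array([mean_nn_distance(rng.random(observed.shape))
                     for _ in range(n_sim)])
    return (1 + np.sum(sims <= obs_stat)) / (n_sim + 1)

def clustered_pattern(n_clusters=5, per_cluster=10, spread=0.01, seed=1):
    """Synthetic clustered pattern, used here in place of real gold-particle data."""
    rng = np.random.default_rng(seed)
    centres = rng.random((n_clusters, 2))
    pts = centres[:, None, :] + rng.normal(0, spread, (n_clusters, per_cluster, 2))
    return np.clip(pts.reshape(-1, 2), 0, 1)

p_clustered = csr_rank_test(clustered_pattern())
```

A small `p_clustered` indicates that the synthetic pattern is more tightly packed than the random model predicts. Other summaries, such as radial distributions, would plug into the same simulate-and-rank scheme.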
References:
Andrey P, Kiêu K, Kress C, Lehmann G, Tirichine L, et al. (2010) Statistical Analysis of 3D Images Detects Regular Spatial Distributions of Centromeres and Chromocenters in Animal and Plant Nuclei. PLoS Comput Biol 6(7): e1000853. doi:10.1371/journal.pcbi.1000853
Ernoult-Lange M, Baconnais S, Harper M, et al. (2012) Multiple binding of repressed mRNAs by the P-body protein Rck/p54. RNA 18: 1702-1715. doi:10.1261/rna.034314.112
Arpon J, Gaudin P, Andrey P. (2017) A method for testing random spatial model on nuclear objects distributions. In review.
Assessing the phenotypic heterogeneity of a cell population via single cell RNA-Seq data analysis
Charles Bernard
Stress & Cancer team (Fatima Mechta-Grigoriou) at Institut Curie, Paris
Previous research led by my group revealed that in breast and ovarian cancers, at least four different types of Cancer Associated Fibroblasts (CAF) can be characterized by FACS, IHC and RNA-Seq. The team is now tackling the heterogeneity existing within each of these four CAF subpopulations, using single-cell RNA-seq.
The first part of the presentation will be dedicated to the analysis of this cellular heterogeneity, applying the SPADE algorithm (Spanning-tree Progression Analysis of Density-normalized Events) to these scRNAseq data. SPADE allows a hierarchical visualization of cellular subtypes within a population, according to the similarity of their expression patterns. The output of SPADE is similar to a phylogenetic tree and can help one to infer a hierarchy between cells of similar profiles, where branching points on the tree reflect major cell fate decisions.
Another issue raised by scRNAseq, due especially to the low detection of a wide range of genes, is the identification of the relevant associations which could exist between genes, in the frame of pairwise comparisons of gene expression. Previous studies have indeed shown that "classical" correlation tests fail to capture a number of biologically meaningful associations. In the second part of the presentation, I will therefore introduce statistical methods which succeed at capturing associations between the expression of two genes, which can be non-linear, non-monotonic or which can't even be modelled by a mathematical function.
Finally, I will try to show how these detections of novel associations can be combined with the output of SPADE hierarchical clustering to confirm the biological relevance of potential marker genes used to characterize the different subtypes identified in each of the CAF subpopulations.
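To illustrate how an association can escape a classical correlation test, here is a minimal sketch using distance correlation. The abstract does not name the methods presented in the talk, so distance correlation is swapped in here purely as one well-known example of a dependence measure that reacts to non-linear and non-monotonic relationships. The toy "expression" values are synthetic.

```python
import numpy as np

def _centred_distances(v):
    """Pairwise absolute distances, double-centred (rows, columns, grand mean)."""
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation between two 1-D arrays (0 when either is constant)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a, b = _centred_distances(x), _centred_distances(y)
    dcov2 = (a * b).mean()
    dvar2 = (a * a).mean() * (b * b).mean()
    if dvar2 == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / np.sqrt(dvar2)))

# Synthetic example: gene B depends on gene A through a symmetric, non-monotonic curve.
gene_a = np.linspace(-1.0, 1.0, 201)
gene_b = gene_a ** 2

pearson = np.corrcoef(gene_a, gene_b)[0, 1]   # close to zero: no linear trend
dcor = distance_correlation(gene_a, gene_b)   # clearly above zero
```

Pearson's coefficient misses the parabolic dependence while the distance correlation does not, which is the kind of discrepancy the second part of the talk addresses.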
Integrative modeling of the TREX complex
Ahmed Saadawi
Structural bioinformatics unit (Michael Nilges group), Institut Pasteur, Paris
The process of nuclear export of mRNAs is sophisticated and entails different steps. The functional organization of such a process is pivotal for the fidelity and maintenance of gene expression. One of the core components required for successful mRNA nuclear export is the TREX protein complex. The TREX (TRanscription-EXport) complex is an evolutionarily conserved multiprotein complex that plays a major role in the functional coupling of different steps during mRNA biogenesis, including mRNA transcription, processing, decay, and nuclear export. Although the function of TREX is relatively well characterized, its structure and/or architecture has not yet been fully resolved. Herein, we aim at determining the architecture of the TREX macromolecular complex using integrative modeling, in particular by utilizing the Integrative Modeling Platform (IMP). The modeling process consists of a four-stage computational cycle of 1) gathering sparse data such as electron microscopy density maps, SAXS, cross-linking, sequence information (FASTA), X-ray crystallography (PDBs), etc.; 2) representation of the subunits into an initial model and translation of the data into spatial restraints; 3) sampling of different conformations using Monte Carlo simulation; 4) analyzing the data by clustering the sampled models to determine high-probability configurations or best-scoring models that satisfy the experimental data. The end point would be to obtain a reasonable structural resolution of TREX.
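Stages 2 and 3 of the cycle above (restraints, then Monte Carlo sampling) can be sketched in a few lines. This is a toy illustration in plain NumPy and does not use the IMP API. The bead representation, the restraint list and every parameter value are assumptions made for the example.

```python
import numpy as np

# Hypothetical cross-link restraints: (bead i, bead j, maximum allowed distance).
RESTRAINTS = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 1.5)]

def score(coords, restraints=RESTRAINTS):
    """Sum of squared violations of the upper-bound distance restraints."""
    total = 0.0
    for i, j, d_max in restraints:
        d = np.linalg.norm(coords[i] - coords[j])
        total += max(0.0, d - d_max) ** 2
    return total

def metropolis_sample(n_beads=4, n_steps=5000, step=0.5, temperature=0.1, seed=0):
    """Metropolis Monte Carlo over bead positions; returns the best-scoring model."""
    rng = np.random.default_rng(seed)
    coords = rng.uniform(-10, 10, size=(n_beads, 3))  # random initial model
    current = score(coords)
    best_coords, best = coords.copy(), current
    for _ in range(n_steps):
        trial = coords.copy()
        k = rng.integers(n_beads)                  # move one bead at a time
        trial[k] += rng.normal(0, step, size=3)
        s = score(trial)
        # Accept improvements always, worse moves with Boltzmann probability.
        if s <= current or rng.random() < np.exp((current - s) / temperature):
            coords, current = trial, s
            if current < best:
                best_coords, best = coords.copy(), current
    return best_coords, best

best_model, best_score = metropolis_sample()
```

In a real study each subunit would carry many beads, several data types would contribute to the score, and the sampled models would then be clustered (stage 4) rather than reduced to a single best one.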
References:
1) Katahira, Jun. "mRNA export and the TREX complex." Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 1819.6 (2012): 507-513.
2) Russel, Daniel, et al. "Putting the pieces together: integrative modeling platform software for structure determination of macromolecular assemblies." PLoS Biol 10.1 (2012): e1001244.
3) Fernandez-Martinez, Javier, et al. "Structure and Function of the Nuclear Pore Complex Cytoplasmic mRNA Export Platform." Cell 167.5 (2016): 1215-1228.
Emergence of complexity in dance choreography
Éléonore Bellot
MobileLab, CRI (Centre de Recherches Interdisciplinaires)
Complex systems are systems composed of many interacting components, exhibiting global properties that emerge from the local non-linear interactions of these entities, so that the system is not reducible to one level of description. Or, put more briefly: 'the whole is more than the sum of its parts'. The field addresses concepts such as self-organization, collective motion and the emergence of patterns.
The collective motion of flocks of birds is a good example: each bird acts according to its near neighbours, and scale effects lead to complex patterns, such as spiraling without any central coordination, that can be fully modeled with the tools of statistical physics.
The field of complex systems cuts across many disciplines such as physics, biology, social sciences, economics, management... and it also relates to some choreographic works in which the resulting behavior is produced by interaction rules, explicit or not, between the dancers. This can be seen quite easily in folk dances, for example: quite simple local rules create nice dynamical patterns.
These choreographic works (in a broad sense that includes improvisation) are the subject of study of this internship.
The idea is first to try to confront the formalism of complex systems with dance works, to see to what extent it can be relevant. Then, helped by interviews with choreographers whose works relate to the field, to get an idea of their approach and understanding of it, and also of how they share this information with the dancers. Lastly, the idea would be to develop, in collaboration with the artists, a choreographic toolbox of complex systems tricks, freely usable and editable as an open inspiration for dancers, to broaden the use they can make of this field in their projects.
Parameters searching for a synthetic toggle switch
Sebastián Sosa-Carrillo
Centre de Recherches Interdisciplinaires
Intern at Lifeware team, INRIA Saclay
Nowadays the use of models to describe biological phenomena is becoming more and more common. One of the most complicated tasks in this approach is to infer the model's parameters from the real experimental data. Several methods have been developed to achieve this goal. One of them is the so-called CMA-ES, which aims to find good model parameters by randomly searching the parameter space in order to minimize the mismatch between the simulated data and the real experimental data; this mismatch is called "the cost". Some of the limitations of this method stem from the high-dimensional parameter space and the fact that the cost function can present multiple local minima, so there is no guarantee that the method returns the best parameters. In addition, more constraints can be imposed by the observed behavior of the system in different experimental conditions, as well as by the fact that the model parameters need a biological meaning that must be coherent with what is known about the system. My goal in this internship is to constrain the parameter search of our toggle switch model in order to find parameters which achieve a good fit of the experimental data, while also being able to reproduce the system behavior in different experimental conditions, and of course maintaining a biological meaning which is in agreement with current knowledge.
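The idea of a bounded, cost-driven random search can be sketched as below. Several things here are assumptions for illustration: the two-variable toggle switch equations follow the general form of Gardner et al. (2000), the "experimental" data are synthetic, the bounds stand in for biological constraints, and the optimizer is a plain elitist evolution strategy rather than CMA-ES itself (no covariance adaptation).

```python
import random

# Assumed search bounds for (a1, a2, beta, gamma); stand-ins for biological constraints.
BOUNDS = [(0.1, 20.0), (0.1, 20.0), (1.0, 4.0), (1.0, 4.0)]

def simulate(params, u0=1.0, v0=2.0, dt=0.05, n_steps=200):
    """Euler integration of du/dt = a1/(1+v^beta) - u, dv/dt = a2/(1+u^gamma) - v."""
    a1, a2, beta, gamma = params
    u, v, traj = u0, v0, []
    for _ in range(n_steps):
        du = a1 / (1.0 + v ** beta) - u
        dv = a2 / (1.0 + u ** gamma) - v
        u, v = u + dt * du, v + dt * dv
        traj.append((u, v))
    return traj

def cost(params, data):
    """Sum of squared differences between simulated and observed trajectories."""
    sim = simulate(params)
    return sum((su - du) ** 2 + (sv - dv) ** 2
               for (su, sv), (du, dv) in zip(sim, data))

def clip(params):
    """Project a candidate back into the allowed parameter box."""
    return [min(max(p, lo), hi) for p, (lo, hi) in zip(params, BOUNDS)]

def evolution_search(data, n_gen=40, pop=20, n_parents=5, sigma=0.5, seed=0):
    """Elitist evolution strategy; returns the best parameters and the cost history."""
    rng = random.Random(seed)
    parents = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(n_parents)]
    best = min(parents, key=lambda p: cost(p, data))
    history = [cost(best, data)]
    for _ in range(n_gen):
        children = [clip([x + rng.gauss(0.0, sigma) for x in rng.choice(parents)])
                    for _ in range(pop)]
        ranked = sorted(children + [best], key=lambda p: cost(p, data))
        parents, best = ranked[:n_parents], ranked[0]
        history.append(cost(best, data))
        sigma *= 0.95  # slowly narrow the search
    return best, history

true_params = [10.0, 8.0, 2.0, 2.0]      # hypothetical "ground truth"
data = simulate(true_params)             # synthetic stand-in for experimental data
best, history = evolution_search(data)
```

Because the cost landscape of a bistable model can have several local minima, the returned parameters depend on the seed and are not guaranteed to match `true_params`. Additional experimental conditions could be added as extra terms in `cost`.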
References:
Gardner, T. S., Cantor, C. R., & Collins, J. J. (2000). Construction of a genetic toggle switch in Escherichia coli. Nature, 403(6767), 339-342.
Hansen, N., & Ostermeier, A. (2001). Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2), 159-195.
Lugagne, J. B. (2016). Contrôle temps-réel d'une bascule génétique (Doctoral dissertation, Université Paris 7).
Evaluation of monotonic regression method for the prediction of complex phenotypes from transcriptomic data
Mislav Acman
Systems Biology group (Schwikowski group), Institut Pasteur, Paris
Two-dimensional monotonic regression is a machine learning classification method which can associate quantitative states of biomolecules, such as mRNA abundance, with quantitative phenotypes of interest. The approach is based on monotonic regression [1], and its main advantage is fast browsing through a wide range of linear and non-linear relationships between pairs of predictors. First, it fits the best monotonic function to all possible pairs of predictors (both true and inverse values) in the training dataset. Then, by utilizing leave-one-out cross-validation (LOOCV), the pairs are assessed for their predictive performance. Top-scoring pairs are selected. They constitute a signature that the method uses for further predictions.
Thus far, monotonic regression has been successfully used to analyse a whole blood transcriptome dataset of dengue patients in order to identify patients who are likely to develop a severe form of the disease, which requires hospitalization. However, before presenting this tool to the bioinformatics community, further evaluation of the method is needed. For this, several independent datasets were selected from the Gene Expression Omnibus database (GEO, https://www.ncbi.nlm.nih.gov/geo/) [2,3,4]. The datasets were then used to compare the predictive power and flexibility of monotonic regression to published results and other commonly used classification methods.
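To show what "fitting the best monotonic function" means, here is a minimal sketch of the one-dimensional case using the pool-adjacent-violators algorithm. The two-dimensional method described in the abstract is more involved; the one-dimensional version is swapped in only as an illustration, and the function name is hypothetical.

```python
def isotonic_fit(y):
    """Least-squares non-decreasing fit to the sequence y (pool adjacent violators).

    The values in y are assumed to be ordered by increasing predictor value.
    """
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)  # every point in a block gets the block mean
    return fit

# Phenotype values for samples sorted by the expression of one gene (toy numbers).
fitted = isotonic_fit([1, 3, 2, 4])
```

Here the out-of-order pair (3, 2) is pooled into its mean, so `fitted` is non-decreasing. In a leave-one-out setting, the fit would be repeated with each sample held out in turn and the held-out sample predicted from the fitted curve.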
References:
1) Stout, Quentin F. "Isotonic regression for multiple independent variables." Algorithmica 71.2 (2015): 450-470.
2) Berry, Matthew PR, et al. "An interferon-inducible neutrophil-driven blood transcriptional signature in human tuberculosis." Nature 466.7309 (2010): 973-977.
3) Zak, Daniel E., et al. "A blood RNA signature for tuberculosis disease risk: a prospective cohort study." The Lancet 387.10035 (2016): 2312-2322.
4) Kong, Sek Won, et al. "Characteristics and predictive value of blood transcriptome signature in males with autism spectrum disorders." PLoS One 7.12 (2012): e49475.
Predicting Processing Cost of Anaphora Resolution
Olga Seminck
Université Paris Diderot, Laboratoire de Linguistique Formelle, Ecole Doctorale Frontières du Vivant
Anaphora resolution, for human speakers, can be more or less costly depending on various factors like ambiguity, syntactic complexity and semantic plausibility. The variation of cost has been measured by many studies in psycholinguistics, through experimental paradigms like self-paced reading or eye-tracking. Our project aims at devising a system, inspired by current NLP coreference resolution systems, that can predict a processing cost for anaphora resolution, which can be evaluated by running our system on human data coming from psycholinguistic experiments, or eye-tracking corpora, e.g. the Dundee Corpus (Kennedy et al. 2003). Inspired by surprisal theory (Hale 2001) and the entropy reduction hypothesis (Hale 2006), we propose a continuous, incremental measure that assigns processing cost to anaphora. Our measure reflects how certain a probabilistic anaphora resolution system is about its decisions. To do so, with a simple anaphora resolution tool, we compute a probability distribution over all antecedent candidates of an anaphor and calculate entropy over it. We hypothesize that the entropy over this distribution can be seen as the processing cost of the resolution of the anaphor: the smaller the entropy, the lower the predicted processing cost. A first study we conducted on two biases that were discovered by psycholinguists (Subject Assignment Strategy and Parallel Function Hypothesis (e.g. Crawley et al. 1990)) showed that our model was able to simulate human performance in these matters: it assigned the pronouns in a way comparable to human participants, and the cost it predicted corresponded to reading times recorded in self-paced reading experiments.
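The entropy-based cost described above can be written down directly. This is a minimal sketch: the candidate scores are invented for the example, and the real system's scoring of antecedent candidates is not shown.

```python
import math

def resolution_entropy(scores):
    """Shannon entropy (in bits) of the distribution over antecedent candidates.

    `scores` maps each candidate to a non-negative score; scores are
    normalised into probabilities before the entropy is computed.
    """
    total = sum(scores.values())
    probs = [s / total for s in scores.values() if s > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical outputs of a resolver for the pronoun "he" in two sentences.
confident = {"the lawyer": 0.9, "the client": 0.1}
ambiguous = {"the lawyer": 0.5, "the client": 0.5}

cost_confident = resolution_entropy(confident)
cost_ambiguous = resolution_entropy(ambiguous)
```

Under the stated hypothesis, the ambiguous case (two equally likely antecedents, one bit of entropy) is predicted to be costlier to process than the confident one.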
References:
Rosalind A Crawley et al. (1990). The use of heuristic strategies in the interpretation of pronouns. In: Journal of Psycholinguistic Research 19.4, 245–264
John Hale (2001). A probabilistic Earley parser as a psycholinguistic model. In: Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies. Association for Computational Linguistics, 1–8
John Hale (2006). Uncertainty about the rest of the sentence. In: Cognitive Science 30.4, 643–672
Alan Kennedy et al. (2003). The Dundee corpus. In: Proceedings of the 12th European conference on eye movement
Towards a fast brain-machine interface integrating artificial somatosensory feedback
Aamir Abbasi
Unité de Neuroscience, Information et Complexité (UNIC), CNRS, Gif-sur-Yvette.
In this project, we will investigate the impact of somatosensory feedback on learning a motor task, using a Brain Machine Interface (BMI) set-up. We will combine electrophysiology, optogenetics and an external prosthesis to implement a closed loop sensorimotor BMI in the mouse. On the motor side, the activity of single units in the primary motor cortex will drive a prosthesis carrying a reward. On the sensory side, in order to investigate rules of neural coding, we will perform patterned stimulation directly in the barrel field of the primary somatosensory cortex representing the whiskers of the mouse. Several patterns of sensory stimulation will be tested. Biomimetic patterns will consist of activating the regions representing the whiskers in a pattern corresponding to the movement of an object along the snout, thus mimicking the movement of the prosthesis in space. Another approach will be to apply arbitrary fixed rules of stimulation, and use the adaptive properties of the network to link the spatiotemporal patterns of stimulation to the information about the position of the prosthesis in space. Overall, with our investigation we aim to study the potential of biomimicry of natural somatosensory inputs in developing efficient BMIs, both in terms of prosthesis learning and reliability.
References:
Arduin P.-J., Fregnac Y., Shulz D.E. & Ego-Stengel V. (2014) Bidirectional control of a one-dimensional robotic actuator by operant conditioning of a single unit in rat motor cortex. Front Neurosci 8, 206.
Arduin P.-J., Fregnac Y., Shulz D.E. & Ego-Stengel V. (2013) "Master" neurons induced by operant conditioning in rat motor cortex during a brain-machine interface task. J Neurosci 33(19), 8308-8320.
Bensmaia S.J. & Miller L.E. (2014) Restoring sensorimotor function through intracortical interfaces: progress and looming challenges. Nat Rev Neurosci 15(5), 313-325.
Trait Level Analysis of Multitrait Population Projection Matrices
Christophe Coste
Laboratoire d'Eco-anthropologie et Ethnobiologie / UMR 7206, Equipe "Anthropologie Evolutive", MNHN, Musée de l'Homme
In most matrix population projection models, individuals are characterized according to usually one or two traits, such as age, stage, size or location. A broad theory of multitrait population projection matrices (MPPMs) incorporating a larger number of traits was long held back by time and space computational complexity issues. As a consequence, no study has yet focused on the influence of the structure of traits describing a life-cycle on population dynamics and life-history evolution.
We present here a novel vector-based MPPM building methodology that allows computationally efficient modeling of populations characterized by numerous and large traits, and extend sensitivity analyses to these models. We then present a new method, the trait level analysis, which consists in folding an MPPM on any of its traits to create a matrix with an alternative trait structure but similar asymptotic properties. By adding or removing one or several traits to the MPPM, and analyzing the resulting changes in spectral properties, this allows investigating the influence of the trait structure on the evolution of traits.
We illustrate this by modeling a 3-trait (age, parity and fecundity) population designed to investigate the implications of parity-fertility trade-offs in a context of fecundity heterogeneity in humans. The trait level analysis, comparing models of the same population modeled with different traits, demonstrates that the sensitivity of fitness to age-specific fertility differs between cases with or without fertility-parity trade-offs. Moreover, it shows that age-specific fertility has very different evolutionary significance depending on whether heterogeneity is accounted for. This is because trade-offs can vary strongly in strength and even direction depending on the trait structure used to model the population.
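For readers unfamiliar with projection matrices, the spectral quantities mentioned above (asymptotic growth rate and its sensitivities) can be sketched for the simplest single-trait case, an age-structured Leslie matrix. This is not the vector-based MPPM method of the abstract. The fertility and survival values are made up, and the sensitivity formula used is the standard one based on left and right eigenvectors.

```python
import numpy as np

def leslie_matrix(fertility, survival):
    """Age-structured projection matrix: fertilities on the first row,
    survival probabilities on the sub-diagonal."""
    n = len(fertility)
    mat = np.zeros((n, n))
    mat[0, :] = fertility
    mat[np.arange(1, n), np.arange(n - 1)] = survival
    return mat

def growth_rate_and_sensitivity(a):
    """Dominant eigenvalue lam and sensitivities d(lam)/d(a_ij) = v_i * w_j / <v, w>."""
    vals, right = np.linalg.eig(a)
    k = np.argmax(vals.real)
    lam = vals[k].real
    w = np.abs(right[:, k].real)                     # stable age distribution
    vals_left, left = np.linalg.eig(a.T)
    v = np.abs(left[:, np.argmax(vals_left.real)].real)  # reproductive values
    return lam, np.outer(v, w) / (v @ w)

# Toy life cycle with three age classes (illustrative numbers only).
A = leslie_matrix(fertility=[0.0, 1.0, 2.0], survival=[0.5, 0.5])
lam, sens = growth_rate_and_sensitivity(A)
```

With these numbers the population is exactly stationary (`lam` equals 1), and `sens[0, 1]` gives the sensitivity of the growth rate to fertility in the second age class. Comparing such sensitivities across matrices built with different trait structures is the kind of question the trait level analysis addresses.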
Uri Barenholz from Weizmann Institute
Design principles of autocatalytic cycles constrain enzyme kinetics and force low substrate saturation at flux branch points
A set of chemical reactions that require a metabolite to synthesize more of that metabolite is an autocatalytic cycle. We find that most of the reactions in the core of central carbon metabolism are part of compact autocatalytic cycles. Our analysis shows that such metabolic designs must meet specific conditions to support stable fluxes, hence avoiding depletion of intermediate metabolites. Autocatalytic cycles are therefore subjected to constraints that may seem counter-intuitive: the enzymes of branch reactions out of the cycle must be overexpressed and the affinity of these enzymes to their substrates must be relatively weak.
We use recent quantitative proteomics and fluxomics measurements to show that the above conditions hold for functioning cycles in central carbon metabolism of E. coli. Our work demonstrates that the topology of a metabolic network can shape the kinetic parameters of enzymes and lead to seemingly wasteful enzyme usage.
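The two counter-intuitive constraints can be illustrated with a toy model. The model is an assumption made for this sketch and is not necessarily the one analysed in the talk: a single metabolite X is produced by an autocatalytic reaction and drained by a branch reaction, both with Michaelis-Menten kinetics, so that dX/dt = v_a X/(k_a + X) - v_b X/(k_b + X). All names and parameter values are illustrative.

```python
def steady_state(v_a, k_a, v_b, k_b):
    """Non-zero steady state of the toy model, or None if there is no positive one."""
    if v_a == v_b:
        return None
    x = (v_b * k_a - v_a * k_b) / (v_a - v_b)
    return x if x > 0 else None

def has_stable_positive_state(v_a, k_a, v_b, k_b):
    """Branch enzyme must have the higher maximal rate (v_b > v_a) but the lower
    efficiency at low substrate (v_b / k_b < v_a / k_a), i.e. a weak affinity."""
    return v_b > v_a and v_b / k_b < v_a / k_a

def simulate(x0, v_a, k_a, v_b, k_b, dt=0.01, n_steps=20000):
    """Euler integration of the toy model from initial concentration x0."""
    x = x0
    for _ in range(n_steps):
        x += dt * (v_a * x / (k_a + x) - v_b * x / (k_b + x))
    return x

# Branch enzyme abundant but with weak affinity: the cycle settles at a stable level.
stable_end = simulate(0.1, v_a=1.0, k_a=1.0, v_b=2.0, k_b=4.0)
# Branch enzyme abundant and with strong affinity: the intermediate is depleted.
depleted_end = simulate(0.1, v_a=1.0, k_a=1.0, v_b=2.0, k_b=1.0)
```

In the first case the concentration converges to the value returned by `steady_state`. In the second case the branch reaction outcompetes the cycle at every concentration and the metabolite runs out.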