![Page 1: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/1.jpg)
NASA Ames Data Sciences Group
www.nasa.gov
Nikunj C. Oza, Ph.D.
Leader, Data Sciences Group
![Page 2: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/2.jpg)
The Data Sciences Group at NASA Ames
Group Members
Ilya Avrekh
Kamalika Das, Ph.D.
Dave Iverson
Vijay Janakiraman, Ph.D.
Rodney Martin, Ph.D.
Bryan Matthews
David Nielsen
Nikunj Oza, Ph.D.
Veronica Phillips
John Stutz
Hamed Valizadegan, Ph.D.
+ summer students
Team Members are NASA Employees, Contractors, and Students.
Funding Sources
• Science Mission Directorate: AIST and CMAC programs
• NASA Aeronautics Research Mission Directorate - ATD, SMART-NAS, SASO Project
• NASA Engineering and Safety Center
• Exploration Systems Mission Directorate, Exploration Technology Development Program
• Non-NASA: DARPA, DoD
Data Mining Research and Development (R&D) for application to NASA problems (Aeronautics, Earth Science, Space Exploration, Space Science)
![Page 3: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/3.jpg)
Example Data Mining Problems
• Aeronautics: Anomaly Detection, Precursor Identification, text mining (classification, topic identification)
• Earth Science: Filling in missing measurements, anomaly detection, teleconnections, climate understanding
• Space Science: Kepler planet candidates
• Space Exploration: system health management, vascular structure identification
![Page 4: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/4.jpg)
Four V’s of Big Tough, Sleep Depriving Data
[Diagram labels: Amazing Algorithm, Intuitive Reports]
• Volume:
– Radar Tracks: 47 facilities (1 year), ~423 GB (compressed), ~3.2 TB (CSV)
– Weather and Forecast (Entire NAS): CIWS ~2.8 TB
• Velocity
– Radar Tracks: 47 facilities, ~35 GB/month (compressed), ~268 GB/month (uncompressed)
– Weather and Forecast (Entire NAS): CIWS ~233 GB/month
• Veracity
– Data dropouts
– Duplicate tracks
– Track ending in midair
– Reused flight identifiers
• Variety
– Numerical (continuous/binary)
– Weather (forecast/actual)
– Radar/Airport metadata
– ATC Voice
– ASRS text reports (Pilot/Controller)
![Page 5: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/5.jpg)
Aeronautics Data Mining Problems
• Anomaly Detection
– Anomaly Discovery over large set of variables
– Particular variable of interest, for example, fuel burn
  • Determine expected instantaneous fuel burn given current state of aircraft
  • Compare with actual instantaneous fuel burn
  • Where difference is high, problem may be occurring
• Precursor Identification
– Given undesirable effect (e.g., go-around), identify precursors (e.g., overtake situation, high speed approach)
• Text mining
– Text classification, topic identification
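The fuel burn bullets describe a residual check: model the expected value, compare it with the observed value, and flag large differences. The sketch below illustrates that idea with a single-feature least-squares fit. The feature (thrust setting), the numbers, and the threshold are hypothetical, and the group's actual models are not described in these slides.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def flag_fuel_burn_anomalies(state, actual, a, b, threshold):
    """Return indices where |actual - expected| exceeds the threshold."""
    flagged = []
    for i, (x, y) in enumerate(zip(state, actual)):
        expected = a + b * x  # expected fuel burn given the current state
        if abs(y - expected) > threshold:
            flagged.append(i)
    return flagged

# Hypothetical nominal data: thrust setting (%) vs. fuel flow (kg/h).
nominal_thrust = [50, 60, 70, 80, 90]
nominal_fuel = [1000, 1200, 1400, 1600, 1800]
a, b = fit_line(nominal_thrust, nominal_fuel)

# Hypothetical new flight: the third sample burns far more than expected.
new_thrust = [55, 65, 75, 85]
new_fuel = [1100, 1310, 1750, 1700]
anomalies = flag_fuel_burn_anomalies(new_thrust, new_fuel, a, b, threshold=100)
print(anomalies)
```

In practice the "current state of aircraft" would span many variables, so the single-feature line here would be replaced by a multivariate regressor; the residual comparison stays the same.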
![Page 6: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/6.jpg)
Topic Extraction Example
[Word clouds; the terms recovered for each topic are listed below.]
TOPIC 1: autoplt, acft, spd, capture, moderate, level, engaged, leveloff, vert, ctl, disconnected, selected, fpm, light, clb, pitch, manually, warning, pwr
TOPIC 2: time, day, leg, contributing, factors, hrs, crew, factor, fatigue, night, trip, rest, duty, flying, long, late, previous, incident, lack, alerter
TOPIC 3: apch, rwy, visual, ils, twr, lndg, loc, arpt, final, missed, clred, msl, intercept, vectored, sight, gar, terrain, field, uneventful, ctl
Other examples of ‘fatigue’
• Altitude Deviation
• Spatial Deviation
• Ramp Excursion
• Landing without clearance
• Runway Incursion
• Unstable Approach
![Page 7: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/7.jpg)
Aeronautics Anomaly Detection: Current Methods
Exceedance-Based Methods
• Known anomalies
• Conditions over 2-3 variables (e.g., speed > 250 knots, altitude = 1000 ft, landing)
• Cannot identify unknown anomalies
• Low false positive rate, high false negative (missed detection) rate.
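An exceedance check of this kind is a fixed rule over a few variables. The following is a minimal sketch under stated assumptions: the `Sample` record, its field names, and the reading of the altitude condition as "at or below 1000 ft" are illustrative choices and do not come from the slides.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    speed_kts: float
    altitude_ft: float
    phase: str  # flight phase label, e.g. "landing" or "cruise"

def high_speed_low_altitude(s: Sample) -> bool:
    """Hypothetical rule: faster than 250 knots at or below 1000 ft while landing."""
    return s.phase == "landing" and s.altitude_ft <= 1000 and s.speed_kts > 250

def exceedances(samples):
    """Return the indices of samples that trip the rule."""
    return [i for i, s in enumerate(samples) if high_speed_low_altitude(s)]

flight = [
    Sample(300, 30000, "cruise"),   # fast, but not in the landing phase
    Sample(260, 1000, "landing"),   # trips the rule
    Sample(180, 800, "landing"),    # normal approach speed
]
hits = exceedances(flight)
print(hits)
```

Because the rule only fires on conditions someone wrote down in advance, it cannot surface anomalies nobody anticipated, which is the limitation the slide points out.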
![Page 8: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/8.jpg)
Data-Driven Methods
• DISCOVER anomalies by
– learning statistical properties of the data
– finding which data points do not fit (e.g., far away, low probability)
– no background knowledge on anomalies needed
• Complementary to existing methods
– Low false negative (missed detection) rate
– Higher false positive rate (identified points/flights unusual, but not always operationally significant)
• Data-driven methods -> insights -> modification of exceedance detection
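One simple way to find points that are "far away" is to score each point by its distance to its nearest neighbors. The sketch below uses that k-nearest-neighbor score as a stand-in; it is not the group's MKAD algorithm, and the feature values are hypothetical.

```python
import math

def knn_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbors."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def top_anomalies(points, k=2, n=1):
    """Return the indices of the n points with the highest scores."""
    scores = knn_scores(points, k)
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return order[:n]

# Hypothetical (speed in knots, altitude in ft) samples; the last one is unusual.
flights = [(250, 3000), (252, 3010), (249, 2990), (251, 3005), (320, 1500)]
print(top_anomalies(flights))
```

No labeled anomalies are needed, which matches the "no background knowledge" bullet. Features on different scales would normally be standardized first, and a high score only means "unusual", not "operationally significant".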
![Page 9: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/9.jpg)
Example: High Speed Go-Around
• Overshoots Extended Runway Centerline (ERC) by over 1 SM
• Over 250 Kts @ 2500 Ft.
• Angle of intercept > 40°
• Overshoots 2nd approach
Bryan Matthews, David Nielsen, John Schade, Kennis Chan, and Mike Kiniry, Automated Discovery of Flight Track Anomalies, 33rd Digital Avionics Systems Conference, 2014
![Page 10: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/10.jpg)
Providing Domain Expert Feedback
[Diagram: Active learning strategy. Labels: Input Features, MKAD, Anomalies, Nominals, SME, Operationally significant anomalies, Uninteresting anomalies]
[Diagram: Active learning with rationales framework. Labels: Input Features, MKAD, Training, Anomalies, Nominals, Testing, Rationale features, 2-class classification/ranking algorithm]
Manali Sharma, Kamalika Das, Mustafa Bilgic, Bryan Matthews, David Nielsen, and Nikunj Oza, Active Learning with Rationales for Identifying Operationally Significant Anomalies in Aviation, European Conference on Machine Learning and Principles and Practices of Knowledge Discovery (ECML-PKDD), 2016
![Page 11: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/11.jpg)
Earth Science Example
• Understand relationships between ecosystem dynamics and climatic factors
• Model as a regression analysis problem
• 3 science questions
– Magnitude and extent of ecosystem exposure, sensitivity and resilience to the 2005 and 2010 Amazon droughts
– Understand human-induced and other attribution as causes of vegetation anomalies
– How learned dependency model varies across eco-climatic zones and geographical regions on a global scale
NASA ESTO AIST-14 project, Uncovering Effects of Climate Variables on Global Vegetation (PI: Kamalika Das, Ph.D.)
![Page 12: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/12.jpg)
Problem Formulation
• Point-to-point regression analysis (Genetic Programming based Symbolic Regression)
• Estimate spatio-temporal dependency of forest ecosystems on climate variables
$$V_{ij}^{t} = f\left(LC_{ij},\ CV_{ij}^{t},\ CV_{nb}^{t},\ CV_{ij}^{t-1},\ CV_{nb}^{t-1},\ \ldots,\ CV_{ij}^{t-k},\ CV_{nb}^{t-k}\right)$$
V: vegetation
i, j: pixel location indices
LC: land cover type
t: time index
CV: climate variable(s)
nb: spatial neighborhood of index i, j
k: temporal dependency
Open challenges:
1. Estimating function f
2. Estimating best choices for k, nb
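To make the inputs of f concrete, the sketch below assembles the argument list for one pixel and one time step: the land cover value, then the climate variable at the pixel and over its neighborhood for lags 0 through k. Summarizing the neighborhood by its mean, the square neighborhood of a given radius, and all function names are assumptions made for illustration.

```python
def neighborhood_mean(grid, i, j, radius=1):
    """Mean of grid values around (i, j), excluding the centre pixel."""
    rows, cols = len(grid), len(grid[0])
    vals = [
        grid[a][b]
        for a in range(max(0, i - radius), min(rows, i + radius + 1))
        for b in range(max(0, j - radius), min(cols, j + radius + 1))
        if (a, b) != (i, j)
    ]
    return sum(vals) / len(vals)

def feature_vector(lc, cv, i, j, t, k, radius=1):
    """Build [LC_ij, CV_ij^t, CV_nb^t, ..., CV_ij^(t-k), CV_nb^(t-k)].

    lc is a 2D grid of land cover codes; cv is a list of 2D grids, one per time step.
    """
    if t - k < 0:
        raise ValueError("not enough history for the requested lag k")
    feats = [lc[i][j]]
    for lag in range(k + 1):
        frame = cv[t - lag]
        feats.append(frame[i][j])                             # CV at the pixel
        feats.append(neighborhood_mean(frame, i, j, radius))  # CV over nb
    return feats

# Tiny hypothetical example: two time steps on a 2x2 grid.
lc = [[1, 1], [2, 2]]
cv = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
x = feature_vector(lc, cv, i=0, j=0, t=1, k=1)
print(x)
```

The vector has 1 + 2(k + 1) entries, so the two open challenges map directly onto code: choosing `k` and `radius` fixes the inputs, and estimating f is the regression fitted on these vectors.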
![Page 13: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/13.jpg)
Data Pipeline
Inputs
• NDVI: Resolution 250 m, Projection Sinusoidal
• LST: Resolution 1 km, Projection Sinusoidal
• TRMM (Ver 6): Resolution 25 km, Projection WGS84
Reproject and resample data
• NDVI, TRMM, LST: Resolution 1 km, Projection WGS84
Filter data based on land cover
• 2000 – 2010 monthly data
Time-Series: Change to seasonal (Monthly -> Seasonal)
• 4 seasons/year, 2000 – 2010 seasonal data
• Season 1: March – May
• Season 2: June – Sep
• Season 3: Oct
• Season 4: Nov – Feb
Windowing: Smoothing over 25x25 size window
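The monthly-to-seasonal step can be sketched as a grouping by the four seasons listed on the slide. The sketch averages the months within each season and assigns January and February to the season that started the previous November; both choices are assumptions, since the slide does not say how months are combined or how the year boundary is handled.

```python
from collections import defaultdict

# Month number -> season number, following the slide's season definitions.
SEASON_OF_MONTH = {
    3: 1, 4: 1, 5: 1,          # Season 1: March to May
    6: 2, 7: 2, 8: 2, 9: 2,    # Season 2: June to September
    10: 3,                     # Season 3: October
    11: 4, 12: 4, 1: 4, 2: 4,  # Season 4: November to February
}

def season_key(year, month):
    """Return (season_year, season) for a calendar month."""
    season = SEASON_OF_MONTH[month]
    # Assumption: Jan/Feb belong to the Nov-Feb season begun in the previous year.
    season_year = year - 1 if month in (1, 2) else year
    return season_year, season

def monthly_to_seasonal(monthly):
    """Average {(year, month): value} into {(season_year, season): mean}."""
    buckets = defaultdict(list)
    for (year, month), value in monthly.items():
        buckets[season_key(year, month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}

# Hypothetical monthly values for a single pixel.
monthly = {
    (2000, 3): 1.0, (2000, 4): 2.0, (2000, 5): 3.0,
    (2000, 11): 4.0, (2000, 12): 4.0, (2001, 1): 6.0, (2001, 2): 6.0,
}
seasonal = monthly_to_seasonal(monthly)
print(seasonal)
```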
![Page 14: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/14.jpg)
Results for 2004-2010
Mean Squared Error

| Year | Ridge Regression | LASSO | SVR | Symbolic Regression |
|------|------------------|-------|-------|---------------------|
| 2004 | 0.284 | 0.284 | 0.280 | 0.262 |
| 2005 | 0.289 | 0.289 | 0.288 | 0.278 |
| 2006 | 0.426 | 0.426 | 0.430 | 0.321 |
| 2007 | 0.374 | 0.374 | 0.370 | 0.318 |
| 2008 | 0.308 | 0.308 | 0.310 | 0.336 |
| 2009 | 0.353 | 0.353 | 0.360 | 0.328 |
| 2010 | 0.546 | 0.547 | 0.540 | 0.479 |

Marcin Szubert, Anuradha Kodali, Sangram Ganguly, Kamalika Das, and Josh C. Bongard, Reducing Antagonism between Behavioral Diversity and Fitness in Semantic Genetic Programming, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), pp. 797-804, 2016.
![Page 15: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/15.jpg)
Ongoing and Future Work
• Experiment with different combinations of temporal lookback and/or spatial effects
• Introduce additional regressors (radiation, forest fire maps, deforestation maps)
• Study the effect of different regressors on different Amazon tiles
• Derive nonlinear GP models on Amazon tiles
• Given appropriate historical data, have the ability to predict: “Under what conditions does vegetation not recover within a certain time frame.”
• Do global scale analysis in parallel
![Page 16: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/16.jpg)
VESsel GENeration (VESGEN) Analysis
Patricia Parsons-Wingerter, PhD, NASA Chief Innovator/POC
NASA Ames 2016 Innovation Fund Award, Chief Technologist’s Office
• VESGEN 2D maps and quantifies vascular remodeling for a wide variety of quasi-2D vascularized biomedical tissue applications.
• Working on transforming to VESGEN 3D, in line with most vascularized organs and tissues in humans and vertebrates.
• Vascular-dependent diseases include cancer, diabetes, coronary vessel disease, and major astronaut health challenges in the space microgravity and radiation environments, especially for long-duration missions.
• One key component is binarization: conversion of grayscale images to black/white vascular branching patterns.
– Takes 10-25 hours of human effort.
– Exploring pattern recognition, matched filtering, vessel tracking/tracing, mathematical morphology, multiscale approaches, and model-based algorithms.
![Page 17: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/17.jpg)
Otsu Thresholding
![Page 18: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/18.jpg)
Otsu vs. Adaptive Thresholding
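For readers without the slide images, the two approaches can be sketched in plain Python. Otsu's method picks one global threshold that maximizes the between-class variance of the gray-level histogram, while adaptive thresholding compares each pixel with a statistic of its local window. The local-mean variant, the window radius, and the offset below are illustrative assumptions, and the slides do not say which implementation was used.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value > t are treated as foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue            # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break               # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def adaptive_mean_binarize(img, radius=1, offset=0):
    """Mark a pixel as 1 when it exceeds the mean of its local window minus offset."""
    rows, cols = len(img), len(img[0])
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            window = [
                img[a][b]
                for a in range(max(0, i - radius), min(rows, i + radius + 1))
                for b in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            row.append(1 if img[i][j] > local_mean - offset else 0)
        out.append(row)
    return out

# Hypothetical data: six gray values and a 3x3 image with one bright pixel.
t = otsu_threshold([10, 12, 11, 200, 210, 205])
mask = adaptive_mean_binarize([[10, 10, 10], [10, 50, 10], [10, 10, 10]])
print(t, mask)
```

A global threshold tends to struggle when illumination varies across the image, which is the usual motivation for comparing it with a local method.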
![Page 19: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/19.jpg)
Future Work
• Work in progress: exploring more preprocessing and post-processing techniques
• Each step of preprocessing and post-processing has some input parameters
– The result is sensitive to these parameters
– We aim to make the parameter selection either automated (machine learning) or semi-automated (user can choose the right parameter)
• Machine Learning to learn the binarization
– Given the manual labels, perform supervised or semi-supervised learning
– Each pixel and its class label (foreground or background) is the training example
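The per-pixel training setup can be sketched as follows: each pixel becomes a feature vector paired with its manual label, and a classifier is fitted on those pairs. The two features (intensity and local mean) and the nearest-centroid classifier are placeholders chosen to keep the example short; the slides do not name the features or the learner.

```python
import math

def pixel_features(img, i, j, radius=1):
    """Feature vector for one pixel: its intensity and the mean of its local window."""
    rows, cols = len(img), len(img[0])
    window = [
        img[a][b]
        for a in range(max(0, i - radius), min(rows, i + radius + 1))
        for b in range(max(0, j - radius), min(cols, j + radius + 1))
    ]
    return (float(img[i][j]), sum(window) / len(window))

def train_centroids(img, labels):
    """Average the features per class; labels[i][j] is 1 (foreground) or 0 (background)."""
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}
    counts = {0: 0, 1: 0}
    for i, row in enumerate(labels):
        for j, label in enumerate(row):
            f = pixel_features(img, i, j)
            sums[label][0] += f[0]
            sums[label][1] += f[1]
            counts[label] += 1
    return {
        c: (sums[c][0] / counts[c], sums[c][1] / counts[c])
        for c in sums
        if counts[c] > 0
    }

def binarize(img, centroids):
    """Assign each pixel the class whose centroid is closest in feature space."""
    out = []
    for i in range(len(img)):
        row = []
        for j in range(len(img[0])):
            f = pixel_features(img, i, j)
            row.append(min(centroids, key=lambda c: math.dist(f, centroids[c])))
        out.append(row)
    return out

# Hypothetical manually labeled image: a bright vertical vessel in the last column.
train_img = [[10, 10, 200], [10, 10, 200], [10, 10, 200]]
train_labels = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
centroids = train_centroids(train_img, train_labels)
predicted = binarize(train_img, centroids)
print(predicted)
```

A real system would evaluate on images held out from training and would likely use richer features, but the data layout (one training example per pixel) is the part this sketch is meant to show.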
![Page 20: NASA Ames Data Sciences Group - Amazon Web Services · • NASA Engineering and Safety Center • Exploration Systems Mission Directorate, Exploration Technology Development Program](https://reader033.vdocuments.us/reader033/viewer/2022050519/5fa28c236d82fc405128800e/html5/thumbnails/20.jpg)
How do we get the Word Out?
DASHlink
disseminate. collaborate. innovate.
https://dashlink.ndc.nasa.gov/
DASHlink is a collaborative website designed to promote:
• Sustainability
• Reproducibility
• Dissemination
• Community building
Users can create profiles
• Share papers, upload and download open source algorithms
• Find NASA data sets.