Automating programming via concept mining, probabilistic reasoning over semantic knowledge base of SE domain by Max Talanov

Upload: donald-hunter

Post on 18-Dec-2015

A lot of trivial tasks that could be automated:
  Add/remove field Patronymic on Customer page
  Add dropdown list on the form...

A lot of not so trivial solutions that should be reused but are not:
  How-tos
  Libraries...

Approximately 60% of a developer's time in outsourcing is spent solving problems of this kind.

</ Problem

Template-based code generators:
  IDEs: Visual Studio, IDEA, ...
  Template-based generators: Maven Archetypes

CASE tools:
  Rational Rose
  ArgoUML

</ Current Solutions

Once generated, a solution is hard to maintain and requires a significant budget for further support.

Example: a developer generates the solution from the DB structure, then adds some functionality to it, and then the customer wants to change the DB structure. The developer has to regenerate the solution and merge the later changes back in.

This happens for one reason: the generator does not understand what it is doing.

</ They cannot solve the problem

In 2005 the MIT Media Lab published the article “Feasibility Studies for Programming in Natural Language”.

Metafor is a program that creates the skeleton of Python classes from a shallow English description.

</ MIT Metafor

Input: shallow English description

Output: scaffolding Python classes.

Metafor used the natural language processor MontyLingua, the common sense KB ConceptNet, and a programmatic interpreter.

</ MIT Metafor

System should:

Operate on the changes to be applied (How-tos), not on the static structure of the target application.

Use a domain knowledge model to map inbound requirements into acceptance criteria, which are then used to create the solution.

Use both trained data and generation to create the solution.

Use several abstraction layers of the target application.

</ Key Ideas
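
One way to read the first idea is that a How-to is a reusable change operation rather than a snapshot of the application. The sketch below is only an illustration: the `AddField` class, the dictionary-based model and the field names are assumptions and are not taken from the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddField:
    """Hypothetical How-to: add a field to an entity of the application model."""
    entity: str
    field: str
    field_type: str = "String"

    def apply(self, model: dict) -> dict:
        # Return an updated copy so the original model stays untouched.
        updated = {name: dict(fields) for name, fields in model.items()}
        updated.setdefault(self.entity, {})[self.field] = self.field_type
        return updated


# Assumed toy model: entity name -> {field name -> type}.
model = {"Customer": {"firstName": "String", "lastName": "String"}}
change = AddField(entity="Customer", field="patronymic")
new_model = change.apply(model)
```

Because the change is a value of its own, it can be stored, compared and replayed after the underlying structure changes, which is what a one-shot template generator cannot do.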

System has to operate with knowledge.

System has to understand what it is doing:
  Architecture of the target application.
  Methods to change the architecture.
  Domain specific information.
  Requirements for changes.

System has to understand the human operator:
  Communicate in natural language.

</ Requirements

Knowledge base.

Linguistic component.

Perceiving component.

Solution generator.

Communication component.

</ Key Components

[Diagram: Linguistic, Perceiving, Solution generator, Communicator, KB; labels: Request, Req, Updated app]

</ Collaboration Diagram

The KB is the main storage for the data used by the system.

The KB is an RDF store holding OWL data.

The KB is used to store the semantic information:
  Target application architecture
  Domain specific knowledge (How-tos)
  Common sense information
  Predicates generated based on requirements text
  Acceptance criteria for generated solution
  ...

</ Knowledge Base
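
To make the triple-based storage concrete, here is a minimal in-memory stand-in for an RDF store. It is a sketch, not the actual KB: the `TripleStore` class and the `app:`, `arch:`, `se:` and `howto:` names are invented for illustration, and a real deployment would use an RDF/OWL library instead.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """Toy stand-in for an RDF store of (subject, predicate, object) triples."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        self._triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Triple]:
        # None acts as a wildcard, like a variable in a triple pattern.
        for triple in sorted(self._triples):
            if all(q is None or q == v for q, v in zip((s, p, o), triple)):
                yield triple


kb = TripleStore()
# Target application architecture (hypothetical vocabulary).
kb.add("app:Customer", "rdf:type", "arch:Entity")
kb.add("app:Customer", "arch:hasField", "app:firstName")
# Domain specific knowledge: a How-to and what it applies to.
kb.add("howto:AddField", "rdf:type", "se:HowTo")
kb.add("howto:AddField", "se:appliesTo", "arch:Entity")

entity_facts = list(kb.match(s="app:Customer"))
howtos = list(kb.match(p="rdf:type", o="se:HowTo"))
```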

A human expert specifies the requirements; the linguistic component generates a set of predicates for further processing.

Inbound: CR (change request), bug report or FRS (functional requirements specification) according to the SE standard SPICE.

Outbound: set of predicates.

The Stanford Parser creates the set of predicates, which are treated as inbound knowledge.

[Diagram: Linguistic; label: Requirements]

</ Linguistic Component
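
The Stanford Parser can print typed dependencies as lines of the form `relation(head-index, dependent-index)`. As a rough sketch of the "text to predicates" step, the code below turns such lines into predicate tuples. The `Predicate` type and `to_predicates` helper are hypothetical, and the dependency lines are hand-written for the example sentence, so real parser output may differ.

```python
import re
from typing import List, NamedTuple


class Predicate(NamedTuple):
    relation: str
    head: str
    dependent: str


# Matches lines such as "dobj(Add-1, field-2)".
_DEP_LINE = re.compile(r"^(\w+)\((.+)-\d+, (.+)-\d+\)$")


def to_predicates(dependency_lines: List[str]) -> List[Predicate]:
    predicates = []
    for line in dependency_lines:
        m = _DEP_LINE.match(line.strip())
        if m:  # skip anything that is not a dependency relation
            predicates.append(Predicate(m.group(1), m.group(2), m.group(3)))
    return predicates


# Hand-written dependencies for "Add field Patronymic on Customer page".
parsed = [
    "root(ROOT-0, Add-1)",
    "dobj(Add-1, field-2)",
    "dep(field-2, Patronymic-3)",
    "prep_on(Add-1, page-6)",
    "nn(page-6, Customer-5)",
]
predicates = to_predicates(parsed)
```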

The perceiving module maps inbound predicates to the domain model in the knowledge base, using trained data and stochastic search generation. In case of failure it invokes the Communicator to generate a clarification request.

Inbound: set of predicates and domain knowledge model.

Outbound: predicates mapped to domain knowledge model.

[Diagram: Linguistic, Perceiving, Solution generator; labels: Predicates, Updated model]

</ Perceiving Component
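
The mapping-with-fallback behaviour can be sketched as follows. Note that this example swaps in plain fuzzy string matching from Python's `difflib` in place of the trained data and stochastic search described above; the `DOMAIN_CONCEPTS` table, the concept identifiers and the wording of the clarification request are all assumptions.

```python
import difflib
from typing import Dict, List, Tuple

# Assumed domain model: lower-cased label -> concept identifier.
DOMAIN_CONCEPTS: Dict[str, str] = {
    "field": "arch:Field",
    "page": "arch:Page",
    "customer": "app:Customer",
    "dropdown list": "ui:DropDownList",
}


def perceive(terms: List[str], cutoff: float = 0.8) -> Tuple[Dict[str, str], List[str]]:
    """Map terms to domain concepts; return (mapping, clarification requests)."""
    mapping: Dict[str, str] = {}
    clarifications: List[str] = []
    for term in terms:
        hits = difflib.get_close_matches(
            term.lower(), list(DOMAIN_CONCEPTS), n=1, cutoff=cutoff)
        if hits:
            mapping[term] = DOMAIN_CONCEPTS[hits[0]]
        else:
            # Failure case: this question would be handed to the Communicator.
            clarifications.append(f"What does '{term}' refer to?")
    return mapping, clarifications


mapping, questions = perceive(["field", "Customers", "page", "Patronymic"])
```

Here "Customers" is still matched to the customer concept, while "Patronymic" is unknown to the toy model and produces a clarification request instead of a guess.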

Selects or generates a solution for the acceptance criteria specified in the updated model. Presents the updated application to the human expert together with a confirmation request.

Inbound: acceptance criteria in KB

Outbound: solution in actual code.

[Diagram: Perceiving, Solution generator; labels: Updated model, Updated app]

</ Solution Generator Components

Reasoner.

Reasoner interface.

Genetic generator.

Solution checker.

Trainer.

Associator.

Generalizer.

Analogy detector.

Target language translator.

</ Solution Generator Components

</ Solution Generator Activities

[Activity diagram]

Activities:
  Analogy detector: retrieves solution for similar acceptance criteria
  Solution checker: returns solution assessment
  Genetic generator: generates new solution
  Communicator: sends confirmation request to human expert
  Communicator: analyses reply of human expert
  Trainer: run

Decisions (each with yes/no branches):
  Solution found?
  Solution ok?
  Solution ok?
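
The activities above can be sketched as a single control loop. The exact branch order of the diagram is not recoverable from the transcript, so the flow below is an assumed reading: retrieve by analogy, otherwise generate, then check, then ask the expert, then train. The `solve` function, its parameters and the threshold are hypothetical.

```python
from typing import Callable, List, Optional

Solution = List[str]  # assumed: a solution is a sequence of How-to names


def solve(criteria: str,
          retrieve: Callable[[str], Optional[Solution]],
          generate: Callable[[str], Solution],
          check: Callable[[str, Solution], float],
          confirm: Callable[[Solution], bool],
          train: Callable[[str, Solution], None],
          threshold: float = 0.8,
          max_attempts: int = 10) -> Optional[Solution]:
    solution = retrieve(criteria)                   # Analogy detector
    for _ in range(max_attempts):
        if solution is None:                        # "Solution found?" -> no
            solution = generate(criteria)           # Genetic generator
        if check(criteria, solution) < threshold:   # "Solution ok?" -> no
            solution = None
            continue
        if confirm(solution):                       # Communicator asks the expert
            train(criteria, solution)               # Trainer: run
            return solution
        solution = None                             # expert rejected the solution
    return None


# Usage with trivial stand-ins for each component.
learned = {}
result = solve(
    "customer page has patronymic field",
    retrieve=learned.get,
    generate=lambda criteria: ["AddField"],
    check=lambda criteria, solution: 1.0 if solution == ["AddField"] else 0.0,
    confirm=lambda solution: True,
    train=learned.__setitem__,
)
```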

To make logical inferences over accepted alternatives in an environment with possible contradictions and several probable variants, we decided to use the probabilistic reasoner NARS.

NARS main features:
  Deduction
  Induction
  Analogy
  ...

</ Reasoner
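
NARS attaches a truth value of two numbers, frequency and confidence, to every statement, and each inference rule has a function that combines the truth values of its premises. As a small illustration, the deduction rule is sketched below. The formula is written from memory of the Non-Axiomatic Logic literature and should be treated as an assumption to verify against the OpenNARS sources; the `Truth` type and the example beliefs are invented.

```python
from typing import NamedTuple


class Truth(NamedTuple):
    frequency: float   # share of positive evidence, in [0, 1]
    confidence: float  # weight of the evidence, in [0, 1)


def deduction(a: Truth, b: Truth) -> Truth:
    """From S->M with truth a and M->P with truth b, estimate the truth of S->P."""
    f = a.frequency * b.frequency
    c = f * a.confidence * b.confidence
    return Truth(f, c)


# Assumed beliefs: "this request is a field change" and
# "a field change is solved by the AddField How-to".
conclusion = deduction(Truth(0.9, 0.9), Truth(0.8, 0.9))
```

The conclusion is always less confident than either premise, which is how uncertain or partly contradictory knowledge is kept from being treated as fact.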

Inbound: Acceptance criteria.

Outbound: Solution in form of How-tos sequences.

The generator creates sequences of How-tos according to the acceptance criteria. Inference is produced by NARS.

This could be interpreted as a mechanism of human imagination.

</ Genetic Generator
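
A minimal sketch of generating How-to sequences with a genetic algorithm follows. It is not the described implementation: the fitness function here is a simple position-matching score standing in for the NARS-based assessment, and the How-to names, the `evolve` function and all parameters are assumptions.

```python
import random
from typing import Callable, List, Sequence

HowToSeq = List[str]


def evolve(howtos: Sequence[str],
           fitness: Callable[[HowToSeq], float],
           length: int = 3,
           population_size: int = 20,
           generations: int = 30,
           seed: int = 0) -> HowToSeq:
    rng = random.Random(seed)
    population = [[rng.choice(howtos) for _ in range(length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:population_size // 2]   # keep the fitter half
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, length)            # one-point crossover
            child = mother[:cut] + father[cut:]
            if rng.random() < 0.2:                    # point mutation
                child[rng.randrange(length)] = rng.choice(howtos)
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Hypothetical How-tos and a stand-in for the acceptance criteria.
HOWTOS = ["AddField", "AddColumn", "AddValidator", "AddDropdown"]
TARGET = ["AddColumn", "AddField", "AddValidator"]


def score(seq: HowToSeq) -> float:
    return sum(a == b for a, b in zip(seq, TARGET)) / len(TARGET)


best = evolve(HOWTOS, score)
```

Keeping the fitter half of each generation means the best score never decreases from one generation to the next.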

Logically infers, as a percentage, how well a generated solution meets the acceptance criteria.

The solution checker relies mainly on the NARS probabilistic mechanisms, but it collects all relevant information from the KB for the reasoner to process.

</ Solution Checker

A machine learning component used to detect associations, and to infer generic associations, between inbound acceptance criteria and approved solutions.

The analogy detector is used to retrieve previously learned associations that could be used for the specified acceptance criteria.

</ Trainer
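
Retrieval of a previously approved solution for similar criteria could look like the sketch below. The similarity measure (Jaccard overlap between sets of criteria) is an assumption chosen for brevity, not the method used by the described system, and the `AnalogyDetector` class and criteria strings are hypothetical.

```python
from typing import FrozenSet, List, Optional, Tuple


class AnalogyDetector:
    """Stores approved (criteria, solution) pairs and retrieves the closest one."""

    def __init__(self) -> None:
        self._memory: List[Tuple[FrozenSet[str], List[str]]] = []

    def learn(self, criteria: FrozenSet[str], solution: List[str]) -> None:
        self._memory.append((criteria, solution))

    def retrieve(self, criteria: FrozenSet[str],
                 min_similarity: float = 0.5) -> Optional[List[str]]:
        best, best_score = None, 0.0
        for known, solution in self._memory:
            union = known | criteria
            # Jaccard similarity: shared criteria over all criteria.
            score = len(known & criteria) / len(union) if union else 0.0
            if score > best_score:
                best, best_score = solution, score
        return best if best_score >= min_similarity else None


detector = AnalogyDetector()
detector.learn(
    frozenset({"entity:Customer", "has:field", "field:patronymic"}),
    ["AddColumn", "AddField"],
)
hit = detector.retrieve(
    frozenset({"entity:Customer", "has:field", "field:middleName"}))
miss = detector.retrieve(frozenset({"entity:Order", "has:dropdown"}))
```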

Translates the knowledge representation of the architecture into actual files in the target language, based on syntax previously described in the KB.

</ Target Language Translator
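
As a small illustration of this step, the sketch below renders an entity from the model as source text using templates. Java as the target language, the two templates and the `translate` helper are assumptions; in the described system the syntax description would come from the KB rather than from constants in code.

```python
from string import Template
from typing import Dict

# Assumed syntax description for the target language (Java here).
JAVA_CLASS = Template("public class $name {\n$fields}\n")
JAVA_FIELD = Template("    private $type $name;\n")


def translate(entity: str, fields: Dict[str, str]) -> str:
    """Render one entity (field name -> type) as a Java class skeleton."""
    body = "".join(JAVA_FIELD.substitute(type=field_type, name=field_name)
                   for field_name, field_type in fields.items())
    return JAVA_CLASS.substitute(name=entity, fields=body)


source = translate("Customer", {"firstName": "String", "patronymic": "String"})
```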

The component generates requests to the human expert and analyses the expert's replies.

[Diagram: Perceiving, Solution generator, Communicator; label: Request]

</ Communicator

</ Feedback Loops

[Diagram: Linguistic, Perceiving, Solution generator, Communicator, KB; labels: Request, Req, Updated app]

</ Current Implementation

[Diagram: Solution generator, Communicator, KB; labels: Acceptance criteria, Request, Updated app]

Add evolution mechanism.

Add inbound information analysis.

Add architectural analysis.

Add self optimization and self improvement.

Extend perceiving algorithm to use encyclopaedic resources to extend domain knowledge.

</ Future Plans

Metafor: http://web.media.mit.edu/~lieber/Publications/Feasibility-Nat-Lang-Prog.pdf

Maven: http://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html

Stanford Parser: http://nlp.stanford.edu/software/lex-parser.shtml

Open NARS: http://code.google.com/p/open-nars/

Menta: http://code.google.com/p/menta/

</ References