Using Python to Facilitate Code Refactoring
Ben Christenson
Associate Scientist – Process Optimization
Engineering Sciences – Core R&D
Dow Chemical Company
7/14/2011
Process Automation Legacy Migration
PALM
Yahya Nazer
M&E Consultant
Engineering Solutions Technology Center
Dow Chemical Company
Background
Code Refactoring is a systematic way of restructuring code without changing the intent of the code.
• Historically, code refactoring has been used for:
  • Compilers
  • Military
  • Finance
  • Communication
  • Process Control
• PALM worked on Process Control for:
  • Translation
  • Simulation
  • Transition Logic
  • Data Mining
  • Code Analysis
  • Migration
Background
Process Control, within Dow, refers to using computers to control the equipment for the purpose of manufacturing chemicals.
• Dow has been developing the MOD process control systems since 1960
• The MOD system is a state-based system that looks similar to FORTRAN
• The MOD system is cutting edge in how it manages process automation
  • http://www.controlglobal.com/articles/2006/029.html
Unfortunately, the MOD system also:
• Is not object oriented
• Contains hardware tricks for math and logic
• Is based on the VAX system
EM
This will make the EM Finder that is used to specify the EM and enablement logic.
Inputs: *.dtn
Outputs: *.em.xlsb
Routines:
Remove_Non_Ascii
Parse_Out_Glossary
Remove_Line_Number
Setup_Database
Add_Const_Calc
Add_Sequence
Parse_Out_Comments
Combine_Continued_Lines
Associative_Term
Truncate_Constants
DK_AK_ZERO
Digsum
FNG
NOT_in_ALARM
Replace_Strings
Associative_Variables
DEV_Function
IMPORT_Function
PFS_Function
ABS_Function
-Bit_Shift_Function
Convert_Variables
Negative_Numbers
Convert_Not
Basic_IF
Remove_Irregular_text
AND_Prescedence
Trim_Variables
Irrelevant_Parentheses
Timer_Function
Adjust_Scale_Factor
Unravel_Formula
Dividing_By_Zero
VB_Syntax
SST_NOT
Overflow_Function
Floating_Point
-XOR_Syntax
…
• PALM, as a research project, investigated 30 scenarios containing over 200 transformations.
• Each of these scenarios constituted a mode in the PALM engine, which was built from the listed transformation routines.
• A transformation refers to a code refactor that handles a specific function.
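One way to picture a mode is as an ordered list of routines that each take the code lines and return a rewritten copy. The sketch below is only an illustration in modern Python: the routine bodies, the `MODES` table, and `run_mode` are assumptions, not the PALM engine's code, and the line-number format (leading digits) is a guess.

```python
import re

def remove_non_ascii(lines):
    """Drop any character outside the ASCII range (assumed behaviour)."""
    return [line.encode("ascii", "ignore").decode("ascii") for line in lines]

def remove_line_number(lines):
    """Strip a leading run of digits; the real line-number format is an assumption."""
    return [re.sub(r"^\s*\d+\s+", "", line) for line in lines]

# Hypothetical mode table: each mode is an ordered list of routines.
MODES = {"EM": [remove_non_ascii, remove_line_number]}

def run_mode(mode, lines):
    """Apply every routine of the mode, in order, to the code lines."""
    for routine in MODES[mode]:
        lines = routine(lines)
    return lines

cleaned = run_mode("EM", ["0010 AC_2010 = AC_2010 + 2\u00b0", "0020 ; comment"])
```

Keeping each routine small and order-dependent is what lets one engine serve many scenarios: a new mode is just a different list.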
def Associative_Term(code):
    """ This will replace TERM(NNN) = cond
        with STEP(NNN+1) = cond + " and STEP(NNN) or STEP(NNN+1)"
    """
    try:
        for i in xrange(len(code)):
            line = code[i].text
            if(line.startswith("TERM")):
                index = line[5:].split(')',1)[0]
                term = line[:5+len(index)+1].strip()
                current_step = 'STEP('+index+')'
                new_step = 'STEP('+str(evl(index)+1)+')'
                line = line.replace(term,new_step)
                if(line.find(current_step) == -1):
                    line += " AND " + current_step
                code[i].text = line + " OR "+new_step
            # end if
        # end for
    except: exception()
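The effect of this transformation is easier to see on a concrete line. The following is a standalone sketch in modern Python, not the slide's code: it works on plain strings instead of objects with a `.text` attribute, uses a regular expression in place of the slicing, and the sample statements are made up.

```python
import re

TERM_RE = re.compile(r"^TERM\((\d+)\)")

def associative_term(lines):
    """Rewrite TERM(N) = cond as STEP(N+1) = cond AND STEP(N) OR STEP(N+1)."""
    out = []
    for line in lines:
        match = TERM_RE.match(line)
        if match:
            n = int(match.group(1))
            current_step = "STEP(%d)" % n
            new_step = "STEP(%d)" % (n + 1)
            # Swap the TERM(N) target for STEP(N+1).
            line = new_step + line[match.end():]
            # Only add the current step if the condition does not mention it.
            if current_step not in line:
                line += " AND " + current_step
            line += " OR " + new_step
        out.append(line)
    return out

# Hypothetical input statements.
converted = associative_term(["TERM(5) = TEMP > 100", "AC_2010 = 1"])
```

Here `converted[0]` becomes `STEP(6) = TEMP > 100 AND STEP(5) OR STEP(6)`, while the second line passes through unchanged.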
Data Mining Transformations
def add_Signal(glos,um_dtn,links):
    """ This will add the signal type
        If I6B / I4B / I5B / I10B, then put 4..20mA
        V5B = 0..5 VDC
        V10B = 0..10 VDC
        V20B = 0..20 VDC
    """
    try:
        global signal_types
        add_column(glos,'Signal Type')
        # voltage ranges follow the docstring (0..5, 0..10, 0..20 VDC)
        signal_types = {'I4B':'4-20mA','I5B':'4-20mA','I6B':'4-20mA','I10B':'4-20mA', \
                        'V5B':'0-5 VDC','V10B':'0-10 VDC','V20B':'0-20 VDC', \
                        'N2FB':'200 C nickel RTD','N4FB':'400 C nickel RTD', \
                        'P2DB':'200 C platinum RTD','P4DB':'400 C platinum RTD', \
                        'P8DB':'800 C platinum RTD','P8HD':'800 C platinum RTD', \
                        'P8HDB':'800 C platinum RTD', \
                        'T2JB':'200 C type J','T8JB':'400 C type J', \
                        'T2EB':'200 C type E TC','T4EB':'400 C type E TC', \
                        'T8EB':'800 C type E TC','T10KB':'1000 C type K TC', \
                        'T13KB':'1300 C type E TC','T15SB':'1500 C type S TC', \
                        'T4TB':'400 C type T TC'}
        for k in signal_types.keys():
            lines = line_find(um_dtn,'; CALL '+k+'(','Signal Type')
            for line in lines:
                ln = line.split(';')[0].strip()
                for arg in parse_args(line):
                    head,index = arg.split('_')
                    stype = signal_types[k]+' ('+k+')'
                    # snap the index to the first channel of its block of ten
                    index = evl(index) - 1
                    index = index - index%10 + 1
                    for i in range(index,index+10):
                        var = head+'_'+('0000'+str(i))[-4:]
                        glos[var][-1] = stype
                        links.append(['Signal Type',var,ln])
                    # end for
                # end for
            # end for
    except: exception()
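The index arithmetic in the inner loop snaps any variable to the first channel of its block of ten and then labels all ten channels. A minimal sketch of just that step, with a hypothetical helper name (`card_channels`) and a made-up variable name, assuming the `HEAD_NNNN` naming seen in the code:

```python
def card_channels(var, block=10, width=4):
    """Return every variable name in the same block of `block` channels as `var`."""
    head, index = var.split("_")
    # Same arithmetic as the slide: (n - 1) rounded down to a multiple of 10, plus 1.
    first = (int(index) - 1) // block * block + 1
    return ["%s_%0*d" % (head, width, i) for i in range(first, first + block)]

# Hypothetical variable name; channel 13 sits in the block 11..20.
channels = card_channels("AI_0013")
```

So a single `CALL` that mentions `AI_0013` would tag `AI_0011` through `AI_0020` with the same signal type.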
Transition Logic
• In MOD, global variables indicate which mode the process control is in.
• For migration, these mode assignment statements need to be converted into transition logic between any two given modes.
• Transition logic was created by:
  • algebraic expansion and replacement of the logic
  • removing any logical paradoxes
  • breaking the logic up into transitions
  • compressing the logic
[Diagram: T3400 mode transitions]
Modes: T3400_MW (Maintenance Wait), T3400_PW (Process Wait), T3400_FILL (Fill), T3400_HEAT (Heat), T3400_RUN (Run), T3400_CSHDN (Control ShutDown)
Transitions: T3400_MW_PW, T3400_PW_MW, T3400_PW_FILL, T3400_FILL_PW, T3400_FILL_HEAT, T3400_HEAT_RUN, T3400_HEAT_CSHDN, T3400_RUN_CSHDN, T3400_CSHDN_PW
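The transition names on the diagram appear to follow a UNIT_FROM_TO pattern. Assuming that reading is right, the allowed moves between modes can be collected into a lookup table; the sketch below is only an illustration, and `build_transition_table` and `can_move` are hypothetical helpers, not part of PALM.

```python
from collections import defaultdict

# Transition names as they appear on the slide.
TRANSITIONS = [
    "T3400_MW_PW", "T3400_PW_MW", "T3400_PW_FILL", "T3400_FILL_PW",
    "T3400_FILL_HEAT", "T3400_HEAT_RUN", "T3400_HEAT_CSHDN",
    "T3400_RUN_CSHDN", "T3400_CSHDN_PW",
]

def build_transition_table(names):
    """Map each source mode to the modes it may move to (assumes UNIT_FROM_TO names)."""
    table = defaultdict(list)
    for name in names:
        _unit, source, target = name.split("_")
        table[source].append(target)
    return dict(table)

def can_move(table, source, target):
    """True when a transition from `source` to `target` is defined."""
    return target in table.get(source, [])

table = build_transition_table(TRANSITIONS)
```

With this table, `can_move(table, "HEAT", "RUN")` holds, while a direct move from `RUN` to `FILL` does not, because no such transition is listed.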
COM & Excel
Using win32com and pythoncom, controlling Excel from Python was straightforward.
• http://sourceforge.net/projects/pywin32/files/pywin32/Build%20214/pywin32-214.win32-py2.3.exe/download
• http://snippets.dzone.com/posts/show/2036
• This creates a COM connection to Excel:
def connect(self,visible=0):
    """ This will simply connect to the excel already running """
    try:
        if(self.excel == None):
            cc('importing com')
            from win32com.client import Dispatch
            from win32com.client import GetActiveObject
            import pythoncom
            cc('Initializing python com')
            pythoncom.CoInitialize()
            cc('Dispatching Excel')
            self.excel = Dispatch("Excel.Application")
            self.excel.DisplayAlerts = False
        if(DEBUG > 3 or visible): self.excel.Visible = 1
        else: self.excel.Visible = 0
    except: exception()
COM & Excel
• This opens an Excel workbook:
def open_excel(self,file,ws='',clear=1):
    """ This will open a com interface to excel and open the file
        it will then store that connection in a global variable
        clear will close all worksheet already open in excel
    """
    try:
        local_echo = 100
        self.connect()
        if(clear): self.clear()
        self.sheets = []
        self.modules = {}
        self.created_macro = []
        self.file = os.path.basename(file)
        self.working_directory = os.path.split(file)[0]+'\\'
        cc('Excel Opening file '+file,local_echo)
        self.name = os.path.basename(file)
        self.workbook = self.get_workbook(self.name)
        if(self.workbook == None):
            self.excel.Workbooks.Open(file)
            self.workbook = self.get_workbook(self.name)
        if(ws != ''):
            self.active_sheet(ws)
        self.excel.Visible = 1
        self.excel.DisplayAlerts = False
        self.reconnect()
    except: exception({'file':file})
COM & Excel
• This will create a VBA macro within Excel:
def create_macro(self,module,code):
    """ This will create a VBA excel macro within the given module name """
    try:
        i = 0
        code = code.replace('function ','Sub ').replace('Function ','Sub ').replace('sub ','Sub ')
        for sub in code.split('Sub'):
            if(sub != ''): self.created_macro.append(sub.split('(')[0].strip())
        if(not self.modules.has_key(module)):
            for j in xrange(self.workbook.VBProject.VBComponents.Count):
                wb = self.read_workbook().VBProject.VBComponents(j+1)
                if(wb.name == module):
                    self.modules[module] = wb
                    break
                # end if
            else:
                # assumed intent: add a new standard module (type 1) only when none matched
                self.modules[module] = self.read_workbook().VBProject.VBComponents.Add(1)
                self.modules[module].name = module
            # end for
        # the slide does not show the string being split into lines;
        # a newline split is assumed here so the list operations below work
        code = code.split('\n')
        while(code[0].strip() == ''): code = code[1:]
        buffer = len(code[0]) - len(code[0].lstrip())
        if(buffer < 0): buffer = 0
        cc("removing buffer of size "+str(buffer),40)
        for i in range(len(code)):
            assert(code[i][:buffer].strip() == '')
            code[i] = code[i][buffer:]
        code.append('')
        # add the code in blocks of 500 lines
        for i in range(0,len(code),500):
            code_block = '\n'.join(code[i:i+500]).replace('\t',' ')
            self.modules[module].CodeModule.AddFromString(code_block)
    except: exception({'module':module,'i':i})
[Figure: simulation workbook screenshot. Callouts: Variable Watch Table; VBA Translation Simulation; Simulated Variable value with Original Code; PipeTran Simulation; Predefined Simulation Inputs; Historical Plots of both Inputs and Outputs. The watch table shows AC_2010 rising from 100.000 at Clock 1 to 134.000 at Clock 18, in steps of 2.000.]
Code Analysis
• Process automation refers to the enablement logic that turns the process control on or off.
• By clustering the inputs and outputs and then tracing the genealogy of the inputs to the outputs, the enablement logic can be identified.
[Figure labels: Genealogical Trace; Enablement Logic]
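The genealogical trace can be thought of as a walk over a dependency graph: start at an output and follow assignments backwards until only raw inputs remain. The sketch below shows that idea only (the clustering step is not covered); the `depends_on` mapping, the variable names, and `trace_inputs` are all hypothetical.

```python
def trace_inputs(depends_on, output):
    """Return the raw inputs (variables with no recorded parents) that feed `output`."""
    inputs = set()
    seen = set()
    stack = [output]
    while stack:
        var = stack.pop()
        if var in seen:
            continue  # guards against feedback loops in state-based logic
        seen.add(var)
        parents = depends_on.get(var)
        if parents:
            stack.extend(parents)
        else:
            inputs.add(var)
    return inputs

# Hypothetical dependency map: variable -> variables used to compute it.
depends_on = {
    "VALVE_OPEN": ["PERMISSIVE", "OPERATOR_CMD"],
    "PERMISSIVE": ["LEVEL_OK", "TEMP_OK", "VALVE_OPEN"],
    "LEVEL_OK": ["LEVEL_PV"],
    "TEMP_OK": ["TEMP_PV"],
}
ancestors = trace_inputs(depends_on, "VALVE_OPEN")
```

In this made-up example, the inputs behind `VALVE_OPEN` are `OPERATOR_CMD`, `LEVEL_PV`, and `TEMP_PV`; the intermediate variables in between are candidates for the enablement logic.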
Migrating to Object Oriented
• The process control code will need to be broken into blocks of code.
• Then one or more blocks can be replaced with an object.
• Translated code for the blocks can then be sent to the commercial vendor for mapping to their objects.
Next Step
• More research in:
  • Code clone detection
  • Control Module (object) refactoring
  • Equipment Module (object with localized state-based variables)
  • Logic verification on migration
• The PALM project has researched the proof of concept far enough to begin implementation of a commercial product.
Lessons Learned
• Python
  • Development was fast and readable
  • String manipulation was very helpful
  • Speed of execution was not a problem
  • UI was not created
  • Py2exe was problematic
• Win32COM
  • Versatile and easy to use
  • Reliable, minus Excel limitations
  • Slow
  • Not a substitute for a Python UI
• PyGraphViz
  • Easy to use
  • Graphs were not very helpful
• Beyond Compare
  • Great at comparing text files
  • Bad at comparing Excel files
Python Tricks 1
• Replacing Visual Basic with Python
  • Excel calls Python as a shell command.
  • Python then creates a COM interface back into Excel and executes whatever script the user asks for.
def main():
    """ This will call the first argument as a function with the
        remaining arguments as the argument for the function """
    try:
        import sys
        if(len(sys.argv) == 1): return
        command = sys.argv[1]+'('+','.join(sys.argv[2:])+')'
        exec(command)
    except: error()
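Because `exec` runs whatever text arrives on the command line, a common alternative is to look the command up in a table of allowed functions. The sketch below swaps in that technique and is not the slide's approach: `COMMANDS`, `dispatch`, and the two sample functions are hypothetical, and arguments arrive as plain strings instead of being evaluated as Python expressions.

```python
def greet(name):
    """Sample command: return a greeting."""
    return "hello " + name

def add(a, b):
    """Sample command: arguments arrive as strings, so convert them explicitly."""
    return int(a) + int(b)

# Only functions listed here can be called from the command line.
COMMANDS = {"greet": greet, "add": add}

def dispatch(argv):
    """Call argv[1] as a function with argv[2:] as its string arguments."""
    if len(argv) < 2:
        return None
    name, args = argv[1], argv[2:]
    if name not in COMMANDS:
        raise SystemExit("unknown command: " + name)
    return COMMANDS[name](*args)

total = dispatch(["tool.py", "add", "2", "3"])
```

In a script launched from Excel, `dispatch(sys.argv)` would take the place of the `exec` call, at the cost of having to register each function.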
Python Tricks 2
ProxyController.sourceforge.net
Python Pitfalls
• Lists or dictionaries as default arguments
  • def function(arg=[]):
• Setting the Python script time stamp to 0
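The default-argument pitfall comes from Python evaluating the default once, when the function is defined, so every call that omits the argument shares the same list. A minimal demonstration, with made-up function names, alongside the usual `None` sentinel fix:

```python
def append_shared(item, bucket=[]):
    """Pitfall: the default list is created once and reused on every call."""
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    """Usual fix: use None as the sentinel and build a new list per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_shared("a")
second = append_shared("b")       # same list object as `first`
fresh_first = append_fresh("a")
fresh_second = append_fresh("b")  # independent list
```

After these calls, `first` and `second` are the same object holding both items, while the two `append_fresh` results each hold one.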
XKCD.com
EPOCHFAIL