sas, java & excelsas, java & excel: a ménage à trois / october 2016 / dana vinzelberg •...

23
PhUSE Annual Conference / October 2016 / Dana Vinzelberg A ménage à trois SAS, Java & Excel

Upload: others

Post on 12-Aug-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

PhUSE Annual Conference / October 2016 / Dana Vinzelberg

A ménage à trois SAS, Java & Excel

Page 2: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

PhUSE Annual Conference / October 2016 / Dana Vinzelberg

A ménage à trois SAS, Java & Excel

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Page 3: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  Introduction

•  Communication

•  Obstacles

•  Improvement

•  Conclusion

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Outline Introduction

Introduction

Page 4: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  Excel commonly known

•  for data exchange

•  Excel data accessible in SAS

•  process and output

Excel

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Introduction

Page 5: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

Communication

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Introduction Communication

Page 6: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  many different ways

•  PROC IM-/EXPORT

•  DBMS=EXCEL only for Windows

•  DBMS=XLS differently

•  ODS EXCEL

•  not on native Excel format

•  LIBNAME EXCEL

•  since SAS 9.3

•  lock on Excel file whilst active

SAS and Excel

LIBNAME EXCEL

PROC IM-/EXPORT

ODS EXCEL

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Communication

Page 7: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  intermediate step: Java

•  many free packages

•  Apache POI HSSF/XSSF

•  Access to Excel file

Excel and Java

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

HSSF-/XSSFWorkbook

HSSF-/XSSFSheet

HSSF-/XSSFRow

HSSF-/XSSFCell

Communication

Page 8: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

Java and SAS

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

•  utilize Apache POI HSSF/XSSF

•  Java class for reading Excel

•  pack into JAR (Java ARchive)

Communication

Page 9: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

%Excel2SAS

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

•  utilize Apache POI HSSF/XSSF

•  Java class for reading Excel

•  pack into JAR (Java ARchive)

•  accessible in SAS

•  JavaObj statement in

Data Step

•  create Dataset(s)

Ø  Import Excel data to SAS

%MACRO(); %add_to_CLASSPATH; ... DATA; ATTRIB passing_value LENGTH=$200; DECLARE JavaObj wb (‘workbook’, Excel); wb.callStringMethod(‘get_cell_value’,

2, 4, passing_value);

RUN; ... %MEND;

Communication

Page 10: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

%SAS2Excel

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

•  Export Dataset(s) to Excel

•  get Dataset

•  utilize Apache for Writing

•  create Excel file in Java

•  pass Dataset values %MACRO(); ... %add_to_CLASSPATH; ... DATA; DECLARE JavaObj(); RUN; ... %MEND;

Communication

Page 11: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

Communication

Obstacles

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Obstacles

Page 12: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  different amount of data types: 9

•  only two of them can match

•  Java interprets numeric as double and character as string

•  Java methods for communication with SAS:

•  only arguments of types string and double (or none)

Data Types vs. 2

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Obstacles

Page 13: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  other way around

•  SAS functions for each Java data type

Data Types

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Obstacles

TYPE JAVAOBJ METHODS void callVoidMethod double callDoubleMethod String callStringMethod char callCharMethod Boolean callBooleanMethod short callShortMethod byte callByteMethod long callLongMethod float callFloatMethod int callIntMethod

Page 14: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

GB

Writing (aka The Fight with the Heap Space)

Large Files

•  problem with large datasets

•  java.lang.OutOfMemoryError: Java heap space

•  Java working space = Heap Space

•  Apache provides SXSSF

•  Writing in parts (stream)

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Obstacles

Page 15: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

Reading Large Files

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Obstacles

•  Apache provides nothing for faster reading

•  reading with SAX parser in Java

•  not on native Excel file but XML version

•  adapted to provide same functionality as Apache

Page 16: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  unevenly filled tables

•  filled up with empty cells

•  format length

Specialties of the Liaison

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

$200. $203. $207. $212. $218. $223. $227. $231. $236. $245.

Obstacles

Page 17: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

Improvement

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Obstacles Improvement

Page 18: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  through Parameter

•  %SAS2Excel

•  exclude variables

•  raw values

or formatted values

•  %Excel2SAS

sheet range

eference data

Enhanced Functionality

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Improvement

Page 19: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  through Parameter

•  %SAS2Excel

•  exclude variables

•  raw values

or formatted values

•  %Excel2SAS

•  sheet range

•  reference dataset

Enhanced Functionality

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Improvement

Page 20: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  SAS: statistics and big data

•  Java: platform independence

•  Excel: commonly known

•  all advantages

together in two macros:

•  %SAS2Excel

•  %Excel2SAS

The Power of Three

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Improvement

Page 21: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

•  ménage à trois great tools

•  implementation easy for

SAS programmers

•  with interest in Java

Conclusion

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Improvement Conclusion

Page 22: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

Questions?

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Conclusion

Page 23: SAS, Java & ExcelSAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg • utilize Apache POI HSSF/XSSF • Java class for reading Excel • pack into JAR (Java ARchive)

Thank You!

SAS, Java & Excel: A ménage à trois / October 2016 / Dana Vinzelberg

Conclusion