Data Science Academy Student Demo Day: Peggy Sobolewski, Analyzing Transportation Equities Using R
DESCRIPTION
Data Science Academy, Student Demo Day, Data Science by R, Vivian S. Zhang. See www.nycdatascience.com for more details.
TRANSCRIPT
Analyzing Transportation Equities using R
with Peggy Sobolewski
NYC Data Science Academy
Student Demo Day 07-21-2014
R005: Data Science by R (Beginner level)
Talking Points:
› Previous work on these factors
› How is this useful?
› Difficulties I faced during the process
› How I overcame them
› What I learned throughout the class and the project
Getting the data…
Sys.setenv(JAVA_HOME='C:\\Program Files\\Java\\jre7')
install.packages("Rbbg", repos="http://r.findata.org")
library(Rbbg)
# establish a connection to the Bloomberg API
conn <- blpConnect()
securities <- c("ALK US Equity", "DAL US Equity", "JBLU US Equity", "LUV US Equity", "SAVE US Equity", "UAL US Equity", "CHRW US Equity", "EXPD US Equity", "FDX US Equity", "HUBG US Equity", "UPS US Equity", "UTIW US Equity",
"XPO US Equity", "CSX US Equity", "KSU US Equity", "NSC US Equity", "UNP US Equity", "CAR US Equity", "CNW US Equity", "HTZ US Equity",
"JBHT US Equity")
fields <- c("PX_LAST", "TOT_MKT_VAL", "VOLATILITY_90D", "EQY_SH_OUT", "VOLUME")
allsecurities <- bdh(conn, securities, fields, Sys.Date()-730, always.display.tickers=TRUE, include.non.trading.days=FALSE,
dates.as.row.names=FALSE)
Got the data… now what?
Returns: c(NA, diff(log(maindata$PX_LAST)))
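One thing to watch with the return calculation above: when several tickers are stacked in one data frame, diff() runs over the whole price column, so the first return of each ticker is computed against the last price of the previous ticker. A minimal sketch of a per-ticker variant using base R's ave() follows. The toy data frame and its prices are invented for illustration, the column names follow the slides, and the rows are assumed to be sorted by date within each ticker.

```r
# Toy stand-in for maindata: two tickers stacked, sorted by date within ticker.
# The tickers and prices are invented purely for illustration.
toy <- data.frame(
  ticker  = rep(c("AAA US Equity", "BBB US Equity"), each = 3),
  date    = rep(as.Date("2014-07-01") + 0:2, times = 2),
  PX_LAST = c(10, 11, 12.1, 50, 55, 60.5),
  stringsAsFactors = FALSE
)

# Log return of a single price series; the first observation has no return
log_return <- function(p) c(NA, diff(log(p)))

# Apply the calculation separately within each ticker
toy$returns <- ave(toy$PX_LAST, toy$ticker, FUN = log_return)

print(toy)
```

With this approach the first row of every ticker gets NA rather than a spurious jump between two different securities.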
Examine data:
› head(maindata)
› tail(maindata)
› dim(maindata) # 13440 by 8
› summary(maindata)
› str(maindata)
› sapply(maindata, class) # had to fix for date
› names(maindata)
"ticker", "date", "PX_LAST", "TOT_MKT_VAL", "VOLATILITY_90D", "EQY_SH_OUT", "VOLUME", "returns"
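The slides do not say what the date fix was. A common case is a date column that arrives as text, so the sketch below assumes that situation and converts it with as.Date(). The toy data frame and the format string are assumptions for illustration, not the presenter's actual data or fix.

```r
# Toy stand-in for maindata; assumes (hypothetically) that dates arrived as text
toy <- data.frame(
  ticker = c("AAA US Equity", "AAA US Equity"),
  date   = c("2014-07-18", "2014-07-21"),
  stringsAsFactors = FALSE
)

# Inspect the class of every column; date is "character" at this point
print(sapply(toy, class))

# Convert to Date; the format string is an assumption about how the text looks
toy$date <- as.Date(toy$date, format = "%Y-%m-%d")

# date is now of class "Date", so date arithmetic works
print(sapply(toy, class))
print(diff(toy$date))
```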
Correlations (GICS sub-industries)
freight_logistics <- c("CHRW US Equity", "EXPD US Equity", "FDX US Equity", "HUBG US Equity", "UPS US Equity", "UTIW US Equity",
"XPO US Equity")
frlo <- bdh(conn, freight_logistics, fields, Sys.Date()- 730, always.display.tickers=TRUE,
include.non.trading.days=FALSE, dates.as.row.names=FALSE)
frloreturns <- c(NA,diff(log(frlo$PX_LAST)))
freightlogistics <- transform(frlo, returns=frloreturns)
head(freightlogistics)
library(reshape) # provides melt() and cast()
fl.data <- melt(freightlogistics,id=c("ticker","date"))
unique(fl.data$variable)
rfl.data <- cast(subset(fl.data,variable=="returns"),date~ticker, sum)
summary(rfl.data)
library(PerformanceAnalytics) # provides chart.Correlation()
chart.Correlation(rfl.data)
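To read the numbers behind a correlation chart, the same wide layout can be passed to base R's cor(). The sketch below is a rough base R counterpart of the melt/cast step followed by a correlation matrix. The toy tickers and returns are invented, and tapply() stands in for cast(..., date ~ ticker, sum).

```r
# Toy long-format returns (invented numbers), one row per ticker and date
long <- data.frame(
  ticker  = rep(c("AAA", "BBB", "CCC"), each = 4),
  date    = rep(as.Date("2014-07-01") + 0:3, times = 3),
  returns = c( 0.01, -0.02,  0.03, 0.00,
               0.02, -0.04,  0.06, 0.00,
              -0.01,  0.02, -0.03, 0.00)
)

# Wide matrix: one row per date, one column per ticker
wide <- tapply(long$returns, list(as.character(long$date), long$ticker), sum)

# Pairwise correlations; rows with NA returns are skipped pair by pair
cor_mat <- cor(wide, use = "pairwise.complete.obs")
print(round(cor_mat, 2))
```

In this toy data BBB moves exactly with AAA and CCC moves exactly against it, which makes the matrix easy to sanity-check by eye.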
All 4 GICS Sub-Industry Correlation Charts
Freight and Logistics
Railroads
Trucking
Airlines
Market Cap
library(ggplot2)
mktcap <- ggplot(data=maindata, aes(x=ticker, y=TOT_MKT_VAL, colour=ticker)) +
  geom_point() + theme_bw() +
  theme(panel.grid.major = element_blank(), panel.background = element_blank(),
        legend.background=element_rect(fill="white", colour="white")) +
  labs(title="Total Market Cap for Each Security for the Last 3 Years", x="Ticker",
       y="Total Market Cap")
print(mktcap)
Closing Price
price <- ggplot(data=maindata, aes(x=ticker, y=PX_LAST, colour=ticker)) +
  geom_point() + theme_bw() +
  theme(panel.grid.major = element_blank(), panel.background = element_blank(),
        legend.background=element_rect(fill="white", colour="white")) +
  labs(title="Price for Each Security for the Last 3 Years", x="Ticker", y="Last Price")
print(price)
Shares Outstanding
shares <- ggplot(data=maindata, aes(x=ticker, y=EQY_SH_OUT, colour=ticker)) +
  geom_point() + theme_bw() +
  theme(panel.grid.major = element_blank(), panel.background = element_blank(),
        legend.background=element_rect(fill="white", colour="white")) +
  labs(title="Amount of Shares for Each Security for the Last 3 Years", x="Ticker",
       y="Amount of Shares Outstanding")
print(shares)
Volume
volume <- ggplot(data=maindata, aes(x=ticker, y=VOLUME, colour=ticker)) +
  geom_point() + theme_bw() +
  theme(panel.grid.major = element_blank(), panel.background = element_blank(),
        legend.background=element_rect(fill="white", colour="white")) +
  labs(title="Volume for Each Security for the Last 3 Years", x="Ticker",
       y="Volume per Day")
print(volume)
Delta (DAL) Volume
DAL <- subset(maindata, ticker=="DAL US Equity")
DALvol <- ggplot(data=DAL, aes(x=date, y=VOLUME, colour=ticker)) +
  geom_point() + theme_bw() +
  theme(panel.grid.major = element_blank(), panel.background = element_blank(),
        legend.background=element_rect(fill="white", colour="white")) +
  labs(title="Delta's (DAL US Equity) Daily Volume for the Last 3 Years", x="Date", y="Volume per Day")
print(DALvol)