Higher-Performance R via C++
Part 1: Introduction

Dirk Eddelbuettel
UZH/ETH Zürich R Courses
June 24-25, 2015

Overview

What Are We Doing Today and Tomorrow?

High-level motivation: Three main questions

· Why? Several reasons discussed next
· How? Rcpp details, usage, tips, …
· What? We will cover examples.

Focus on R and C++

· R: Our starting point
· C++: Our extension approach
· why, how, tricks, …

Before the Why/How/What

Maybe some mutual introductions?

· Your background (academic, industry, …)
· R experience (beginner, intermediate, advanced, …)
· Created / modified any R packages?
· C and/or C++ experience?
· Main interest in Rcpp: speed, extensions, …?
· Following rcpp-devel or r-devel?

Overview: Why R?

Why R? Programming with Data

· Chambers. Computational Methods for Data Analysis. Wiley, 1977.
· Becker, Chambers, and Wilks. The New S Language. Chapman & Hall, 1988.
· Chambers and Hastie. Statistical Models in S. Chapman & Hall, 1992.
· Chambers. Programming with Data. Springer, 1998.
· Chambers. Software for Data Analysis: Programming with R. Springer, 2008.

Thanks to John Chambers for sending me high-resolution scans of the covers of his books.

A Simple Example

    xx <- faithful[,"eruptions"]
    fit <- density(xx)
    plot(fit)

A Simple Example

[Figure: density.default(x = xx); N = 272, Bandwidth = 0.3348; y-axis: Density]

A Simple Example - Refined

    xx <- faithful[,"eruptions"]
    fit1 <- density(xx)
    fit2 <- replicate(10000, {
        x <- sample(xx, replace=TRUE)
        density(x, from=min(fit1$x), to=max(fit1$x))$y
    })
    fit3 <- apply(fit2, 1, quantile, c(0.025, 0.975))
    plot(fit1, ylim=range(fit3))
    polygon(c(fit1$x, rev(fit1$x)), c(fit3[1,], rev(fit3[2,])),
            col="grey", border=F)
    lines(fit1)

A Simple Example - Refined

[Figure: density.default(x = xx); N = 272, Bandwidth = 0.3348; y-axis: Density]

So Why R?

R enables us to

· work interactively
· explore and visualize data
· access, retrieve and/or generate data
· summarize and report into pdf, html, …

making it the key language for statistical computing, and a preferred environment for many data analysts.

So Why R?

R has always been extensible via

· C via a bare-bones interface described in Writing R Extensions
· Fortran which is also used internally by R
· Java via rJava by Simon Urbanek
· C++ but essentially at the bare-bones level of C

So while in theory this always worked, it was tedious in practice.

Why Extend R?

Chambers (2008) opens Chapter 11, "Interfaces I: Using C and Fortran":

    Since the core of R is in fact a program written in the C language, it's not surprising that the most direct interface to non-R software is for code written in C, or directly callable from C. All the same, including additional C code is a serious step, with some added dangers and often a substantial amount of programming and debugging required. You should have a good reason.

Why Extend R?

Chambers proceeds with this rough map of the road ahead:

· Against:
    · It's more work
    · Bugs will bite
    · Potential platform dependency
    · Less readable software
· In Favor:
    · New and trusted computations
    · Speed
    · Object references

Why Extend R?

The Why? boils down to:

· speed: Often a good enough reason for us … and a focus for us in this workshop.
· new things: We can bind to libraries and tools that would otherwise be unavailable in R.
· references: Chambers' quote from 2008 foreshadowed the work on the new Reference Classes now in R and built upon via Rcpp Modules, Rcpp Classes (and also RcppR6).

Overview: Why C++?

Why C++?

· Asking Google leads to about 52 million hits.
· Wikipedia: "C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose, powerful programming language."
· C++ is industrial-strength, vendor-independent, widely-used, and still evolving.
· In science & research, one of the most frequently-used languages: If there is something you want to use / connect to, it probably has a C/C++ API.
· As a widely used language it also has good tool support (debuggers, profilers, code analysis).

Why C++?

Scott Meyers: View C++ as a federation of languages

· C provides a rich inheritance and interoperability, as Unix, Windows, … are all built on C.
· Object-Oriented C++ (maybe just to provide endless discussions about exactly what OO is or should be).
· Templated C++, which is mighty powerful; template metaprogramming unequalled in other languages.
· The Standard Template Library (STL) is a specific template library which is powerful but has its own conventions.
· C++11 (and C++14 and beyond) add enough to be called a fifth language.

NB: Meyers' original list of four languages appeared years before C++11.

Why C++?

· Mature yet current
· Strong performance focus:
    · You don't pay for what you don't use
    · Leave no room for another language between the machine level and C++
· Yet also powerfully abstract and high-level
· C++11 is a big deal, giving us new language features
· While there are complexities, Rcpp users are mostly shielded

Overview: Vision

[Figure: Bell Labs, May 1976]

Interface Vision

R offers us the best of both worlds:

· Compiled code with
    · Access to proven libraries and algorithms in C/C++/Fortran
    · Extremely high performance (in both serial and parallel modes)
· Interpreted code with
    · An accessible high-level language made for Programming with Data
    · An interactive workflow for data analysis
    · Support for rapid prototyping, research, and experimentation

Why Rcpp?

· Easy to learn as it really does not have to be that complicated; we will see numerous examples
· Easy to use as it avoids build and OS system complexities thanks to the R infrastructure
· Expressive as it allows for vectorised C++ using Rcpp Sugar
· Seamless access to all R objects: vector, matrix, list, S3/S4/RefClass, Environment, Function, …
· Speed gains for a variety of tasks; Rcpp excels precisely where R struggles: loops, function calls, …
· Extensions: greatly facilitates access to external libraries using e.g. Rcpp modules

Overview: Speed

Speed Example 1 (due to Christian Robert)

Five different ways to compute 1/(1 + x):

    f <- function(n, x=1) for(i in 1:n) x <- 1/(1+x)
    g <- function(n, x=1) for(i in 1:n) x <- (1/(1+x))
    h <- function(n, x=1) for(i in 1:n) x <- (1+x)^(-1)
    j <- function(n, x=1) for(i in 1:n) x <- {1/{1+x}}
    k <- function(n, x=1) for(i in 1:n) x <- 1/{1+x}
    library(rbenchmark)
    N <- 1e5
    benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1), order="relative")[,1:4]

Speed Example 1 (due to Christian Robert)

         test replications elapsed relative
    5 k(N, 1)          100   6.435    1.000
    1 f(N, 1)          100   6.609    1.027
    2 g(N, 1)          100   7.757    1.205
    4 j(N, 1)          100   7.882    1.225
    3 h(N, 1)          100  11.766    1.828

Speed Example 1 (due to Christian Robert)

Adding a C++ variant is easy:

    cppFunction("double m(int n, double x) {
        for (int i=0; i<n; i++)
            x = 1 / (1+x);
        return x;
    }")

(We will learn more about cppFunction() later.)

Speed Example 1 (due to Christian Robert)

         test replications elapsed relative
    6 m(N, 1)          100   0.170    1.000
    1 f(N, 1)          100   6.854   40.318
    5 k(N, 1)          100   7.811   45.947
    4 j(N, 1)          100   9.183   54.018
    2 g(N, 1)          100   9.489   55.818
    3 h(N, 1)          100  11.725   68.971

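The loop inside m() can also be compiled and checked outside of R. The following is a minimal sketch in plain C++ without Rcpp; the names iterate_reciprocal and reciprocal_fixed_point are hypothetical and not from the slides. Repeating x = 1/(1+x) converges to the positive root of x^2 + x - 1 = 0, which gives a simple correctness check.

```cpp
#include <cmath>

// Hypothetical standalone counterpart of m() from the slide:
// apply x = 1/(1+x) n times and return the final value.
double iterate_reciprocal(int n, double x) {
    for (int i = 0; i < n; i++)
        x = 1.0 / (1.0 + x);
    return x;
}

// The iteration converges to the positive root of x^2 + x - 1 = 0,
// i.e. (sqrt(5) - 1) / 2, roughly 0.618.
double reciprocal_fixed_point() {
    return (std::sqrt(5.0) - 1.0) / 2.0;
}
```

Comparing iterate_reciprocal(100000, 1.0) against reciprocal_fixed_point() is a cheap sanity check before timing anything.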
Speed Example 2 (due to StackOverflow)

Consider a function defined as

$$
f(n) = \begin{cases}
n & \text{when } n < 2 \\
f(n-1) + f(n-2) & \text{when } n \geq 2
\end{cases}
$$

Speed Example 2 (due to StackOverflow)

R implementation and use:

    f <- function(n) {
        if (n < 2) return(n)
        return(f(n-1) + f(n-2))
    }
    ## Using it on first 11 arguments
    sapply(0:10, f)

    [1] 0 1 1 2 3 5 8 13 21 34 55

Speed Example 2 (due to StackOverflow)

Timing:

    library(rbenchmark)
    benchmark(f(10), f(15), f(20))[,1:4]

       test replications elapsed relative
    1 f(10)          100   0.030    1.000
    2 f(15)          100   0.335   11.167
    3 f(20)          100   3.517  117.233

Speed Example 2 (due to StackOverflow)

    int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2));
    }

deployed as

    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    sapply(0:10, g)

    [1] 0 1 1 2 3 5 8 13 21 34 55

Speed Example 2 (due to StackOverflow)

Timing:

    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    library(rbenchmark)
    benchmark(f(20), g(20), order="relative")[,1:4]

       test replications elapsed relative
    2 g(20)          100   0.011    1.000
    1 f(20)          100   3.776  343.273

A nice gain of a few orders of magnitude.

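Both f() and g() implement the same doubly recursive definition, so the amount of work grows exponentially with n in either language. A small sketch that counts calls makes this visible; the counter and the names fib_counted and count_calls are illustrative additions, not part of the slides.

```cpp
// Recursive Fibonacci as in g(), with an illustrative call counter.
// 'calls' is incremented once per invocation.
int fib_counted(int n, long &calls) {
    ++calls;
    if (n < 2) return n;
    return fib_counted(n - 1, calls) + fib_counted(n - 2, calls);
}

// Convenience wrapper: number of calls needed to evaluate fib(n).
long count_calls(int n) {
    long calls = 0;
    fib_counted(n, calls);
    return calls;
}
```

Here count_calls(10) is 177 and count_calls(20) is 21891, a ratio of about 124. That is in line with the roughly 117-fold jump in elapsed time between f(10) and f(20) in the R timing.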
Another Angle on Speed

Run-time performance is just one example.

Time to code is another metric.

We feel quite strongly that Rcpp helps you code more succinctly, leading to fewer bugs and faster development.

A good environment helps. RStudio integrates R and C++ development quite nicely (e.g. the compiler error message parsing is very helpful) and also helps with package building.

What Next?

Hands-on

Programming with C++

· C++ Basics
· Debugging
· Best Practices

and then on to Rcpp itself.

C++ Basics

Compiled, not Interpreted

Need to compile and link:

    #include <cstdio>

    int main(void) {
        printf("Hello, world!\n");
        return 0;
    }

Compiled, not Interpreted

Or streams output rather than printf:

    #include <iostream>

    int main(void) {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }

Compiled, not Interpreted

g++ -o will compile and link.

We will now look at an example with explicit linking.

Compiled, not Interpreted

    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }

Compiled, not Interpreted

We may need to supply:

· header location via -I,
· library location via -L,
· library via -llibraryname

    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath

Statically Typed

· R is dynamically typed: x <- 3.14; x <- "foo" is valid.
· In C++, each variable must be declared before first use.
· Common types are int and long (possibly with unsigned), float and double, bool, as well as char.
· No built-in string type, though std::string from the standard library comes close.
· All these variable types are scalars, which is fundamentally different from R where everything is a vector.
· class (and struct) allow creation of composite types; classes add behaviour to data to form objects.
· Variables need to be declared, and their type cannot change.

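A short sketch of what static typing means in practice. The function names are made up for illustration; std::vector is used here as the closest analogue of an R numeric vector, which is an assumption about how one would bridge the two views rather than something the slide states.

```cpp
#include <string>
#include <vector>

// Each variable has one declared type for its whole lifetime.
double scalar_demo() {
    int    i = 3;        // integer scalar
    double x = 3.14;     // floating-point scalar
    // x = "foo";        // would not compile: x is a double, not a string
    return i + x;        // the int is converted to double before adding
}

// A container is needed to hold more than one value.
std::vector<double> vector_demo() {
    std::vector<double> v(3, 1.5);   // three elements, each 1.5
    v[0] = 2.0;                      // indexing starts at 0, not 1
    return v;
}

// std::string comes from the standard library, not the core language.
std::string string_demo() {
    std::string s = "foo";
    s += "bar";
    return s;
}
```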
C++ Basics: A Better C

C++ is a Better C

· control structures similar to what R offers: for, while, if, switch
· functions are similar too, but note the difference in positional-only matching; also same function name but different arguments allowed in C++
· pointers and memory management: very different, but lots of issues people had with C can be avoided via STL (which is something Rcpp promotes too)
· sometimes still useful to know what a pointer is …

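The two points about functions can be sketched as follows. The names describe and scale_by are hypothetical and chosen only for illustration.

```cpp
#include <string>

// Same name, different argument types: the compiler selects the
// overload from the type of the argument at the call site.
std::string describe(int)                { return "int"; }
std::string describe(double)             { return "double"; }
std::string describe(const std::string&) { return "string"; }

// Arguments are matched by position only; there is no equivalent of
// R's scale_by(by = 3, x = 10). Trailing parameters may have defaults.
double scale_by(double x, double by = 2.0) { return x * by; }
```

A call such as scale_by(10.0) uses the default and scale_by(10.0, 3.0) overrides it, but the order of the arguments is fixed.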
Object-Oriented

This is a second key feature of C++, and it does it differently from S3 and S4.

    struct Date {
        unsigned int year;
        unsigned int month;
        unsigned int day;
    };

    struct Person {
        char firstname[20];
        char lastname[20];
        struct Date birthday;
        unsigned long id;
    };

Object-Oriented

Object-orientation in the C++ sense matches data with code operating on it:

    class Date {
    private:
        unsigned int year;
        unsigned int month;
        unsigned int date;
    public:
        void setDate(int y, int m, int d);
        int getDay();
        int getMonth();
        int getYear();
    };

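The slide only declares the member functions. One possible way to fill them in is sketched below; the function bodies, the default constructor, the const qualifiers and the member name day are assumptions made for this example, not the author's code.

```cpp
// A sketch completing the Date class with assumed definitions.
class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int day;
public:
    Date() : year(1970), month(1), day(1) {}   // arbitrary default value
    void setDate(int y, int m, int d) { year = y; month = m; day = d; }
    int getDay() const   { return day; }
    int getMonth() const { return month; }
    int getYear() const  { return year; }
};

// Usage: the private state is only reachable through the public
// member functions.
int year_of_course() {
    Date d;
    d.setDate(2015, 6, 24);
    return d.getYear();
}
```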
Generic Programming and the STL

The STL promotes generic programming.

For example, the sequence container types vector, deque, and list all support

· push_back() to insert at the end
· pop_back() to remove from the end
· begin() returning an iterator to the first element
· end() returning an iterator to just after the last element
· size() for the number of elements

but only deque and list also have push_front() and pop_front(); vector does not.

Other useful containers: set, multiset, map and multimap.

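Because the sequence containers share member function names, one piece of code can serve all of them. A minimal sketch follows; fill_and_trim and front_after_push_front are hypothetical names.

```cpp
#include <cstddef>
#include <deque>
#include <list>
#include <vector>

// Generic helper: works for any sequence container offering
// push_back(), pop_back() and size().
template <typename Container>
std::size_t fill_and_trim(Container &c) {
    c.push_back(1);
    c.push_back(2);
    c.push_back(3);
    c.pop_back();        // removes the 3 at the end
    return c.size();     // two elements remain
}

// push_front() exists for deque and list, but not for vector.
int front_after_push_front() {
    std::deque<int> d;
    d.push_back(2);
    d.push_front(1);     // d now holds 1, 2
    return *d.begin();   // dereference the iterator to the first element
}
```

The same fill_and_trim() call compiles unchanged for std::vector<int>, std::deque<int> and std::list<int>.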
Generic Programming and the STL

Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():

    std::vector<double>::const_iterator si;
    for (si=s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;

Generic Programming and the STL

Another key STL part are algorithms:

    // initial value 0.0 (not 0) so that the sum is accumulated as double
    double sum = accumulate(s.begin(), s.end(), 0.0);

Some other STL algorithms are

· find finds the first element equal to the supplied value
· count counts the number of matching elements
· transform applies a supplied function to each element
· for_each sweeps over all elements, does not alter
· inner_product inner product of two vectors

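A few of these algorithms applied to a std::vector<double>, as a minimal sketch. The wrapper names sum_of, count_of, position_of and dot are made up for this example; the algorithms themselves are the standard ones from <algorithm> and <numeric>.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Sum with a double initial value so the accumulation is done in double.
double sum_of(const std::vector<double> &s) {
    return std::accumulate(s.begin(), s.end(), 0.0);
}

// Number of elements equal to 'value'.
long count_of(const std::vector<double> &s, double value) {
    return static_cast<long>(std::count(s.begin(), s.end(), value));
}

// Zero-based position of the first element equal to 'value', or -1.
long position_of(const std::vector<double> &s, double value) {
    std::vector<double>::const_iterator it =
        std::find(s.begin(), s.end(), value);
    return it == s.end() ? -1 : static_cast<long>(it - s.begin());
}

// Inner product of two equally long vectors.
double dot(const std::vector<double> &a, const std::vector<double> &b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}
```

Note that an integer initial value in accumulate() or inner_product() would make the running total an int and silently truncate fractional parts.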
Template Programming

Template programming provides a 'language within C++': code gets evaluated during compilation.

One of the simplest template examples is

    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }

This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.

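To illustrate "any admissible type", the sketch below applies the same template to a user-defined struct. The template is renamed my_min (a hypothetical name) to avoid clashing with std::min, and the Version type with its operator<() is invented for this example.

```cpp
#include <string>

// Same template as on the slide, under a name that cannot be
// confused with std::min.
template <typename T>
const T& my_min(const T& x, const T& y) {
    return y < x ? y : x;
}

// A user-defined type becomes admissible once operator<() exists.
struct Version {
    int major_no;
    int minor_no;
};

// Order by major number first, then by minor number.
bool operator<(const Version &a, const Version &b) {
    if (a.major_no != b.major_no) return a.major_no < b.major_no;
    return a.minor_no < b.minor_no;
}
```

The compiler generates a separate instance of my_min for int, double, std::string and Version as each one is used.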
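Template Programming

Another template example is a class squaring its argument:

    // note: std::unary_function was deprecated in C++11 and removed in C++17
    template <typename T>
    class square : public std::unary_function<T,T> {
    public:
        T operator()(T t) const {
            return t*t;
        }
    };

which can be used along with STL algorithms:

    // third argument: where to write the results; fourth: a functor instance
    transform(x.begin(), x.end(), x.begin(), square<double>());
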
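A version of the functor that also compiles under newer standards is sketched below. It drops the std::unary_function base class, which std::transform() does not require; the helper square_all is a hypothetical name.

```cpp
#include <algorithm>
#include <vector>

// Function object squaring its argument, without the
// std::unary_function base class (removed in C++17).
template <typename T>
class square {
public:
    T operator()(T t) const { return t * t; }
};

// Square every element in place: the third argument is the output
// iterator, the fourth an instance of the function object.
void square_all(std::vector<double> &x) {
    std::transform(x.begin(), x.end(), x.begin(), square<double>());
}
```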
Further Reading

Books by Meyers are excellent.

I also like the (free) C++ Annotations.

C++ FAQ

Resources on StackOverflow such as

· general info and its links, e.g.
· booklist

Debugging

Some tips:

· Generally painful, old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
· Extra tools such as valgrind helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++

Best Practices

Version control: highly recommended to become familiar with git or svn.

Editor: in the long-run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
![Page 3: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/3.jpg)
What Are We Doing Today and Tomorrow
High-level motivation Three main questions
middot Why Several reasons discussed nextmiddot How Rcpp details usage tips hellipmiddot What We will cover examples
258
Focus on R and C++
middot R Our starting pointmiddot C++ Our extension approachmiddot why how tricks hellip
358
Before the WhyHowWhat
Maybe some mutual introductions
middot Your background (academic industry hellip)middot R experience (beginner intermediate advanced hellip)middot Created modified any R packages middot C andor C++ experience middot Main interest in Rcpp speed extensions hellip middot Following rcpp-devel or r-devel
458
Overview Why R
Why R Programming with Data
ChambersComputationalMethods for DataAnalysis Wiley1977
Becker Chambersand Wilks TheNew S LanguageChapman amp Hall1988
Chambers andHastie StatisticalModels in SChapman amp Hall1992
ChambersProgramming withData Springer1998
ChambersSoftware for DataAnalysisProgramming withR Springer 2008
Thanks to John Chambers for sending me high-resolution scans of the covers of his books
658
A Simple Example
xx lt- faithful[eruptions]fit lt- density(xx)plot(fit)
758
A Simple Example
1 2 3 4 5 6
00
01
02
03
04
05
densitydefault(x = xx)
N = 272 Bandwidth = 03348
Den
sity
858
A Simple Example - Refined
xx lt- faithful[eruptions]fit1 lt- density(xx)fit2 lt- replicate(10000
x lt- sample(xxreplace=TRUE)density(x from=min(fit1$x) to=max(fit1$x))$y
)fit3 lt- apply(fit2 1 quantilec(00250975))plot(fit1 ylim=range(fit3))polygon(c(fit1$xrev(fit1$x)) c(fit3[1]rev(fit3[2]))
col=grey border=F)lines(fit1)
958
A Simple Example - Refined
1 2 3 4 5 6
00
01
02
03
04
05
densitydefault(x = xx)
N = 272 Bandwidth = 03348
Den
sity
1058
So Why R
R enables us to
middot work interactivelymiddot explore and visualize datamiddot access retrieve andor generate datamiddot summarize and report into pdf html hellip
making it the key language for statistical computing and apreferred environment for many data analysts
1158
So Why R
R has always been extensible via
middot C via a bare-bones interface described in Writing RExtensions
middot Fortran which is also used internally by Rmiddot Java via rJava by Simon Urbanekmiddot C++ but essentially at the bare-bones level of C
So while in theory this always worked ndash it was tedious inpractice
1258
Why Extend R
Chambers (2008) opens Chapter 11, "Interfaces I: Using C and Fortran":
"Since the core of R is in fact a program written in the C language, it's not surprising that the most direct interface to non-R software is for code written in C, or directly callable from C. All the same, including additional C code is a serious step, with some added dangers and often a substantial amount of programming and debugging required. You should have a good reason."
Why Extend R
Chambers proceeds with this rough map of the road ahead:
· Against:
  · It's more work
  · Bugs will bite
  · Potential platform dependency
  · Less readable software
· In Favor:
  · New and trusted computations
  · Speed
  · Object references
Why Extend R
The Why boils down to:
· speed: Often a good enough reason for us … and a focus for us in this workshop.
· new things: We can bind to libraries and tools that would otherwise be unavailable in R.
· references: The Chambers quote from 2008 foreshadowed the work on the new Reference Classes now in R and built upon via Rcpp Modules, Rcpp Classes (and also RcppR6).
Overview: Why C++
Why C++
· Asking Google leads to about 52 million hits.
· Wikipedia: "C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose, powerful programming language."
· C++ is industrial-strength, vendor-independent, widely-used, and still evolving.
· In science & research, one of the most frequently-used languages: if there is something you want to use or connect to, it probably has a C/C++ API.
· As a widely used language it also has good tool support (debuggers, profilers, code analysis).
Why C++
Scott Meyers: View C++ as a federation of languages
· C provides a rich inheritance and interoperability, as Unix, Windows, … are all built on C.
· Object-Oriented C++ (maybe just to provide endless discussions about exactly what OO is or should be).
· Templated C++, which is mighty powerful; template meta programming is unequalled in other languages.
· The Standard Template Library (STL) is a specific template library which is powerful but has its own conventions.
· C++11 (and C++14 and beyond) add enough to be called a fifth language.
NB: Meyers' original list of four languages appeared years before C++11.
Why C++
· Mature yet current
· Strong performance focus:
  · You don't pay for what you don't use
  · Leave no room for another language between the machine level and C++
· Yet also powerfully abstract and high-level
· C++11 is a big deal giving us new language features
· While there are complexities, Rcpp users are mostly shielded
Overview: Vision
[Figure: Bell Labs, May 1976]
Interface Vision
R offers us the best of both worlds:
· Compiled code with
  · Access to proven libraries and algorithms in C/C++/Fortran
  · Extremely high performance (in both serial and parallel modes)
· Interpreted code with
  · An accessible high-level language made for Programming with Data
  · An interactive workflow for data analysis
  · Support for rapid prototyping, research, and experimentation
Why Rcpp
· Easy to learn as it really does not have to be that complicated; we will see numerous examples
· Easy to use as it avoids build and OS system complexities thanks to the R infrastructure
· Expressive as it allows for vectorised C++ using Rcpp Sugar
· Seamless access to all R objects: vector, matrix, list, S3/S4/RefClass, Environment, Function, …
· Speed gains for a variety of tasks: Rcpp excels precisely where R struggles (loops, function calls, …)
· Extensions: greatly facilitates access to external libraries using e.g. Rcpp modules
Overview: Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1/(1 + x):

    f <- function(n, x=1) for(i in 1:n) x <- 1/(1+x)
    g <- function(n, x=1) for(i in 1:n) x <- (1/(1+x))
    h <- function(n, x=1) for(i in 1:n) x <- (1+x)^(-1)
    j <- function(n, x=1) for(i in 1:n) x <- {1/{1+x}}
    k <- function(n, x=1) for(i in 1:n) x <- 1/{1+x}
    library(rbenchmark)
    N <- 1e5
    benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1), order="relative")[,1:4]
Speed Example 1 (due to Christian Robert)

         test replications elapsed relative
    5 k(N, 1)          100   6.435    1.000
    1 f(N, 1)          100   6.609    1.027
    2 g(N, 1)          100   7.757    1.205
    4 j(N, 1)          100   7.882    1.225
    3 h(N, 1)          100  11.766    1.828
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy:

    cppFunction("
        double m(int n, double x) {
            for (int i=0; i<n; i++)
                x = 1 / (1+x);
            return x;
        }")

(We will learn more about cppFunction() later.)
Speed Example 1 (due to Christian Robert)

         test replications elapsed relative
    6 m(N, 1)          100   0.170    1.000
    1 f(N, 1)          100   6.854   40.318
    5 k(N, 1)          100   7.811   45.947
    4 j(N, 1)          100   9.183   54.018
    2 g(N, 1)          100   9.489   55.818
    3 h(N, 1)          100  11.725   68.971
Speed Example 2 (due to StackOverflow)
Consider a function defined as

$$
f(n) = \begin{cases} n & \text{when } n < 2 \\ f(n-1) + f(n-2) & \text{when } n \geq 2 \end{cases}
$$
Speed Example 2 (due to StackOverflow)
R implementation and use:

    f <- function(n) {
        if (n < 2) return(n)
        return(f(n-1) + f(n-2))
    }
    ## Using it on first 11 arguments
    sapply(0:10, f)
    ## [1] 0 1 1 2 3 5 8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing:

    library(rbenchmark)
    benchmark(f(10), f(15), f(20))[,1:4]

       test replications elapsed relative
    1 f(10)          100   0.030    1.000
    2 f(15)          100   0.335   11.167
    3 f(20)          100   3.517  117.233
Speed Example 2 (due to StackOverflow)

    int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2));
    }

deployed as

    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    sapply(0:10, g)
    ## [1] 0 1 1 2 3 5 8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing:

    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    library(rbenchmark)
    benchmark(f(20), g(20), order="relative")[,1:4]

       test replications elapsed relative
    2 g(20)          100   0.011    1.000
    1 f(20)          100   3.776  343.273

A nice gain of a few orders of magnitude.
Another Angle on Speed
Run-time performance is just one example.
Time to code is another metric.
We feel quite strongly that Rcpp helps you code more succinctly, leading to fewer bugs and faster development.
A good environment helps: RStudio integrates R and C++ development quite nicely (e.g. the compiler error message parsing is very helpful) and also helps with package building.
What Next
Hands-on
Programming with C++
· C++ Basics
· Debugging
· Best Practices
and then on to Rcpp itself
C++ Basics
Compiled not Interpreted
Need to compile and link:

    #include <cstdio>

    int main(void) {
        printf("Hello, world!\n");
        return 0;
    }
Compiled not Interpreted
Or streams output rather than printf:

    #include <iostream>

    int main(void) {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
Compiled not Interpreted
g++ -o will compile and link.
We will now look at an example with explicit linking.
Compiled not Interpreted

    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }
Compiled not Interpreted
We may need to supply:
· header location via -I,
· library location via -L,
· library via -llibraryname

    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
Statically Typed
· R is dynamically typed: x <- 3.14; x <- "foo" is valid.
· In C++, each variable must be declared before first use.
· Common types are int and long (possibly with unsigned), float and double, bool, as well as char.
· No standard string type, though std::string is close.
· All these variable types are scalars, which is fundamentally different from R where everything is a vector.
· class (and struct) allow creation of composite types; classes add behaviour to data to form objects.
· Variables need to be declared, and their type cannot change.
C++ Basics: A Better C
C++ is a Better C
· control structures similar to what R offers: for, while, if, switch
· functions are similar too, but note the difference in positional-only matching; also, same function name but different arguments allowed in C++
· pointers and memory management: very different, but lots of issues people had with C can be avoided via STL (which is something Rcpp promotes too)
· sometimes still useful to know what a pointer is …
Object-Oriented
This is a second key feature of C++, and it does it differently from S3 and S4.

    struct Date {
        unsigned int year;
        unsigned int month;
        unsigned int day;
    };

    struct Person {
        char firstname[20];
        char lastname[20];
        struct Date birthday;
        unsigned long id;
    };
Object-Oriented
Object-orientation in the C++ sense matches data with code operating on it:

    class Date {
    private:
        unsigned int year;
        unsigned int month;
        unsigned int date;
    public:
        void setDate(int y, int m, int d);
        int getDay();
        int getMonth();
        int getYear();
    };
Generic Programming and the STL
The STL promotes generic programming.
For example, the sequence container types vector, deque, and list all support
· push_back() to insert at the end;
· pop_back() to remove from the end;
· begin() returning an iterator to the first element;
· end() returning an iterator to just after the last element;
· size() for the number of elements;
but only deque and list have push_front() and pop_front().
Other useful containers: set, multiset, map and multimap.
Generic Programming and the STL
Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():

    std::vector<double>::const_iterator si;
    for (si=s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
Generic Programming and the STL
Another key part of the STL is its algorithms:

    // use 0.0 (not 0) so that the sum is accumulated as a double
    double sum = std::accumulate(s.begin(), s.end(), 0.0);

Some other STL algorithms are
· find finds the first element equal to the supplied value
· count counts the number of matching elements
· transform applies a supplied function to each element
· for_each sweeps over all elements, does not alter
· inner_product inner product of two vectors
Template Programming
Template programming provides a 'language within C++': code gets evaluated during compilation.
One of the simplest template examples is

    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }

This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
Template Programming
Another template example is a class squaring its argument:

    template <typename T>
    class square : public std::unary_function<T,T> {
    public:
        T operator()(T t) const {
            return t*t;
        }
    };

which can be used along with STL algorithms:

    std::transform(x.begin(), x.end(), x.begin(), square<double>());
Further Reading
Books by Meyers are excellent.
I also like the (free) C++ Annotations.
C++ FAQ
Resources on StackOverflow such as
· general info and its links, e.g.
· booklist
Debugging
Debugging
Some tips:
· Generally painful, old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
· Extra tools such as valgrind helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices
Best Practices
Version control: highly recommended to become familiar with git or svn.
Editor: in the long-run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
![Page 5: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/5.jpg)
Before the WhyHowWhat
Maybe some mutual introductions
middot Your background (academic industry hellip)middot R experience (beginner intermediate advanced hellip)middot Created modified any R packages middot C andor C++ experience middot Main interest in Rcpp speed extensions hellip middot Following rcpp-devel or r-devel
458
Overview Why R
Why R Programming with Data
ChambersComputationalMethods for DataAnalysis Wiley1977
Becker Chambersand Wilks TheNew S LanguageChapman amp Hall1988
Chambers andHastie StatisticalModels in SChapman amp Hall1992
ChambersProgramming withData Springer1998
ChambersSoftware for DataAnalysisProgramming withR Springer 2008
Thanks to John Chambers for sending me high-resolution scans of the covers of his books
658
A Simple Example
xx lt- faithful[eruptions]fit lt- density(xx)plot(fit)
758
A Simple Example
1 2 3 4 5 6
00
01
02
03
04
05
densitydefault(x = xx)
N = 272 Bandwidth = 03348
Den
sity
858
A Simple Example - Refined
xx lt- faithful[eruptions]fit1 lt- density(xx)fit2 lt- replicate(10000
x lt- sample(xxreplace=TRUE)density(x from=min(fit1$x) to=max(fit1$x))$y
)fit3 lt- apply(fit2 1 quantilec(00250975))plot(fit1 ylim=range(fit3))polygon(c(fit1$xrev(fit1$x)) c(fit3[1]rev(fit3[2]))
col=grey border=F)lines(fit1)
958
A Simple Example - Refined
1 2 3 4 5 6
00
01
02
03
04
05
densitydefault(x = xx)
N = 272 Bandwidth = 03348
Den
sity
1058
So Why R
R enables us to
middot work interactivelymiddot explore and visualize datamiddot access retrieve andor generate datamiddot summarize and report into pdf html hellip
making it the key language for statistical computing and apreferred environment for many data analysts
1158
So Why R
R has always been extensible via
middot C via a bare-bones interface described in Writing RExtensions
middot Fortran which is also used internally by Rmiddot Java via rJava by Simon Urbanekmiddot C++ but essentially at the bare-bones level of C
So while in theory this always worked ndash it was tedious inpractice
1258
Why Extend R
Chambers (2008) opens Chapter 11 Interfaces I Using C andFortran
Since the core of R is in fact a program written in the Clanguage itrsquos not surprising that the most direct interfaceto non-R software is for code written in C or directlycallable from C All the same including additional C codeis a serious step with some added dangers and often asubstantial amount of programming and debuggingrequired You should have a good reason
1358
Why Extend R
Chambers (2008) opens Chapter 11 Interfaces I Using C andFortran
Since the core of R is in fact a program written in the Clanguage itrsquos not surprising that the most direct interfaceto non-R software is for code written in C or directlycallable from C All the same including additional C codeis a serious step with some added dangers and often asubstantial amount of programming and debuggingrequired You should have a good reason
1458
Why Extend R
Chambers proceeds with this rough map of the road ahead
middot Againstmiddot Itrsquos more workmiddot Bugs will bitemiddot Potential platform dependencymiddot Less readable software
middot In Favormiddot New and trusted computationsmiddot Speedmiddot Object references
1558
Why Extend R
The Why boils down to
middot speed Often a good enough reason for us hellip and a focus forus in this workshop
middot new things We can bind to libraries and tools that wouldotherwise be unavailable in R
middot references Chambers quote from 2008 foreshadowed thework on the new Reference Classes now in R and built uponvia Rcpp Modules Rcpp Classes (and also RcppR6)
1658
Overview Why C++
Why C++
middot Asking Google leads to about 52 million hitsmiddot Wikipedia C++ is a statically typed free-form
multi-paradigm compiled general-purpose powerfulprogramming language
middot C++ is industrial-strength vendor-independent widely-usedand still evolving
middot In science amp research one of the most frequently-usedlanguages If there is something you want to use connectto it probably has a CC++ API
middot As a widely used language it also has good tool support(debuggers profilers code analysis)
1858
Why C++
Scott Meyers View C++ as a federation of languages
middot C provides a rich inheritance and interoperability as UnixWindows hellip are all build on C
middot Object-Oriented C++ (maybe just to provide endlessdiscussions about exactly what OO is or should be)
middot Templated C++ which is mighty powerful template metaprogramming unequalled in other languages
middot The Standard Template Library (STL) is a specific templatelibrary which is powerful but has its own conventions
middot C++11 (and C++14 and beyond) add enough to be calleda fifth language
NB Meyers original list of four languages appeared years before C++11
1958
Why C++
middot Mature yet currentmiddot Strong performance focus
middot You donrsquot pay for what you donrsquot usemiddot Leave no room for another language between the machine
level and C++
middot Yet also powerfully abstract and high-levelmiddot C++11 is a big deal giving us new language featuresmiddot While there are complexities Rcpp users are mostly shielded
2058
Overview Vision
Bell Labs May 1976
2258
Interface Vision
R offers us the best of both worlds
middot Compiled code withmiddot Access to proven libraries and algorithms in CC++Fortranmiddot Extremely high performance (in both serial and parallel modes)
middot Interpreted code withmiddot An accessible high-level language made for Programming with
Datamiddot An interactive workflow for data analysismiddot Support for rapid prototyping research and experimentation
2358
Why Rcpp
· Easy to learn, as it really does not have to be that complicated; we will see numerous examples
· Easy to use, as it avoids build and OS system complexities thanks to the R infrastructure
· Expressive, as it allows for vectorised C++ using Rcpp Sugar
· Seamless access to all R objects: vector, matrix, list, S3/S4/RefClass, Environment, Function, …
· Speed gains for a variety of tasks: Rcpp excels precisely where R struggles (loops, function calls, …)
· Extensions: greatly facilitates access to external libraries using e.g. Rcpp modules
Overview: Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1/(1 + x):
    f <- function(n, x=1) for(i in 1:n) x <- 1/(1+x)
    g <- function(n, x=1) for(i in 1:n) x <- (1/(1+x))
    h <- function(n, x=1) for(i in 1:n) x <- (1+x)^(-1)
    j <- function(n, x=1) for(i in 1:n) x <- {1/{1+x}}
    k <- function(n, x=1) for(i in 1:n) x <- 1/{1+x}
    library(rbenchmark)
    N <- 1e5
    benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1), order="relative")[,1:4]
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    5 k(N, 1)          100   6.435    1.000
    1 f(N, 1)          100   6.609    1.027
    2 g(N, 1)          100   7.757    1.205
    4 j(N, 1)          100   7.882    1.225
    3 h(N, 1)          100  11.766    1.828
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy:
    cppFunction('double m(int n, double x) {
        for (int i=0; i<n; i++)
            x = 1 / (1+x);
        return x;
    }')
(We will learn more about cppFunction() later)
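The body handed to cppFunction() is ordinary C++, so the same loop can be compiled on its own. The following is a minimal sketch; the names iterate_reciprocal and reciprocal_fixed_point are illustrative and not part of Rcpp or the slides. It also records the value the iteration settles on, the positive root of x^2 + x - 1 = 0.

```cpp
#include <cmath>

// Hypothetical stand-alone counterpart of m(): apply x <- 1 / (1 + x) n times.
double iterate_reciprocal(int n, double x) {
    for (int i = 0; i < n; i++) {
        x = 1.0 / (1.0 + x);
    }
    return x;
}

// Fixed point of the iteration: (sqrt(5) - 1) / 2, roughly 0.618.
double reciprocal_fixed_point() {
    return (std::sqrt(5.0) - 1.0) / 2.0;
}
```

Starting from x = 1, a single step gives 0.5, and the value is indistinguishable from the fixed point long before the N = 1e5 iterations used in the benchmark are done.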
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    6 m(N, 1)          100   0.170    1.000
    1 f(N, 1)          100   6.854   40.318
    5 k(N, 1)          100   7.811   45.947
    4 j(N, 1)          100   9.183   54.018
    2 g(N, 1)          100   9.489   55.818
    3 h(N, 1)          100  11.725   68.971
Speed Example 2 (due to StackOverflow)
Consider a function f(n) defined as
$$ f(n) = \begin{cases} n & \text{when } n < 2 \\ f(n-1) + f(n-2) & \text{when } n \geq 2 \end{cases} $$
Speed Example 2 (due to StackOverflow)
R implementation and use:
    f <- function(n) {
        if (n < 2) return(n)
        return(f(n-1) + f(n-2))
    }
    ## Using it on first 11 arguments
    sapply(0:10, f)
    [1]  0  1  1  2  3  5  8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing:
    library(rbenchmark)
    benchmark(f(10), f(15), f(20))[,1:4]
       test replications elapsed relative
    1 f(10)          100   0.030    1.000
    2 f(15)          100   0.335   11.167
    3 f(20)          100   3.517  117.233
Speed Example 2 (due to StackOverflow)
    int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2));
    }
deployed as
    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    sapply(0:10, g)
    [1]  0  1  1  2  3  5  8 13 21 34 55
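The recursive definition recomputes the same values many times, so its cost grows exponentially in n in any language. As a point of comparison, here is a minimal sketch of the same definition in plain C++ next to an iterative variant. The names fib_recursive and fib_iterative are illustrative and do not come from the slides.

```cpp
// Same definition as g() above: f(n) = n for n < 2, else f(n-1) + f(n-2).
int fib_recursive(int n) {
    if (n < 2) return n;
    return fib_recursive(n - 1) + fib_recursive(n - 2);
}

// Iterative variant: a single pass that keeps only the last two values.
// Assumes 0 <= n <= 45 so that every intermediate sum fits in a 32-bit int.
int fib_iterative(int n) {
    int prev = 0, curr = 1;          // f(0) and f(1)
    for (int i = 0; i < n; i++) {
        int next = prev + curr;
        prev = curr;
        curr = next;
    }
    return prev;                     // after n steps, prev holds f(n)
}
```

Both functions return 55 for n = 10, matching the sapply() output above; the iterative one needs n additions rather than an exponentially growing number of calls.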
Speed Example 2 (due to StackOverflow)
Timing:
    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    library(rbenchmark)
    benchmark(f(20), g(20), order="relative")[,1:4]
       test replications elapsed relative
    2 g(20)          100   0.011    1.000
    1 f(20)          100   3.776  343.273
A nice gain of more than two orders of magnitude (roughly 340-fold).
Another Angle on Speed
Run-time performance is just one example.
Time to code is another metric.
We feel quite strongly that Rcpp helps you code more succinctly, leading to fewer bugs and faster development.
A good environment helps: RStudio integrates R and C++ development quite nicely (e.g. the compiler error message parsing is very helpful) and also helps with package building.
What Next
Hands-on
Programming with C++
· C++ Basics
· Debugging
· Best Practices
and then on to Rcpp itself
C++ Basics
Compiled, not Interpreted
Need to compile and link:
    #include <cstdio>

    int main(void) {
        printf("Hello, world!\n");
        return 0;
    }
Compiled, not Interpreted
Or streams output rather than printf:
    #include <iostream>

    int main(void) {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
Compiled, not Interpreted
g++ -o will compile and link.
We will now look at an example with explicit linking.
Compiled, not Interpreted
    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }
Compiled, not Interpreted
We may need to supply:
· header location via -I,
· library location via -L,
· library via -llibraryname
    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
Statically Typed
· R is dynamically typed: x <- 3.14; x <- "foo" is valid.
· In C++, each variable must be declared before first use.
· Common types are int and long (possibly with unsigned), float and double, bool, as well as char.
· No built-in string type, though std::string from the standard library is close.
· All these variable types are scalars, which is fundamentally different from R where everything is a vector.
· class (and struct) allow creation of composite types; classes add behaviour to data to form objects.
· Variables need to be declared, and their type cannot change afterwards.
C++ Basics: A Better C
C++ is a Better C
· control structures similar to what R offers: for, while, if, switch
· functions are similar too, but note the difference in positional-only matching; also, the same function name with different arguments is allowed in C++
· pointers and memory management: very different, but lots of issues people had with C can be avoided via the STL (which is something Rcpp promotes too)
· sometimes still useful to know what a pointer is …
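Two of the points above, overloading and letting the STL manage memory, can be sketched as follows. The function names describe, power and squares are made up for this illustration.

```cpp
#include <string>
#include <vector>

// Same function name, different argument types: the compiler picks the match.
std::string describe(int) { return "int"; }
std::string describe(double) { return "double"; }
std::string describe(const std::string&) { return "string"; }

// Arguments are matched by position only; a call such as
// power(exponent = 2, base = 3) has no C++ equivalent.
double power(double base, int exponent) {
    double result = 1.0;
    for (int i = 0; i < exponent; i++) result *= base;
    return result;
}

// A std::vector owns its storage: no malloc()/free() or new/delete is needed,
// and the memory is released when the vector goes out of scope.
std::vector<double> squares(int n) {
    std::vector<double> out;
    for (int i = 1; i <= n; i++) out.push_back(power(i, 2));
    return out;
}
```

Calling describe(1) selects the int version and describe(1.0) the double version, while squares(3) returns a vector holding 1, 4 and 9 without any explicit allocation.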
Object-Oriented
This is a second key feature of C++, and it does it differently from S3 and S4.
    struct Date {
        unsigned int year;
        unsigned int month;
        unsigned int day;
    };

    struct Person {
        char firstname[20];
        char lastname[20];
        struct Date birthday;
        unsigned long id;
    };
Object-Oriented
Object-orientation in the C++ sense matches data with code operating on it:
    class Date {
    private:
        unsigned int year;
        unsigned int month;
        unsigned int day;
    public:
        void setDate(int y, int m, int d);
        int getDay();
        int getMonth();
        int getYear();
    };
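The class above only declares its member functions. A minimal sketch of how they might be defined follows; the default constructor, its 1970-01-01 starting value and the const qualifiers are assumptions added for illustration, and the arguments are not validated.

```cpp
class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int day;
public:
    // Assumed default so that a Date always starts in a defined state.
    Date() : year(1970), month(1), day(1) {}

    // The only way to change the private data is through this member function.
    void setDate(int y, int m, int d) {
        year = y;
        month = m;
        day = d;
    }

    // Read-only accessors; const promises they do not modify the object.
    int getDay() const { return day; }
    int getMonth() const { return month; }
    int getYear() const { return year; }
};
```

Code outside the class can call setDate(2015, 6, 24) and read the parts back through the getters, but it cannot touch year, month or day directly.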
Generic Programming and the STL
The STL promotes generic programming.
For example, the sequence container types vector, deque, and list all support:
· push_back() to insert at the end
· pop_back() to remove from the end
· begin() returning an iterator to the first element
· end() returning an iterator to just after the last element
· size() for the number of elements
but only deque and list also have push_front() and pop_front().
Other useful containers: set, multiset, map and multimap.
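Because the sequence containers share these member functions, one template can serve all of them. The sketch below uses two made-up helper names, fill_and_trim and push_front_and_peek, to show the common interface and the one difference.

```cpp
#include <cstddef>
#include <deque>
#include <list>
#include <vector>

// Works for vector, deque and list alike: only push_back(), pop_back()
// and size() are used.
template <typename Container>
std::size_t fill_and_trim(Container& c) {
    c.push_back(1.0);
    c.push_back(2.0);
    c.push_back(3.0);
    c.pop_back();              // removes 3.0, the last element
    return c.size();
}

// Compiles for deque and list, but not for vector, which lacks push_front().
template <typename Container>
double push_front_and_peek(Container& c, double value) {
    c.push_front(value);
    return c.front();
}
```

Passing a std::vector<double> to push_front_and_peek() is rejected at compile time, which is how generic code finds out that a container does not support an operation.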
Generic Programming and the STL
Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():
    std::vector<double>::const_iterator si;
    for (si=s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
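The same loop works unchanged for any container that provides begin() and end(). A minimal sketch, assuming a helper named print_all (not from the slides) that collects the output in a string so it can be inspected:

```cpp
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Visit every element between begin() and end(), writing one value per line.
template <typename Container>
std::string print_all(const Container& s) {
    std::ostringstream out;
    typename Container::const_iterator si;
    for (si = s.begin(); si != s.end(); ++si) {
        out << *si << '\n';      // dereference the iterator to reach the element
    }
    return out.str();
}
```

The template accepts a std::vector<double> as readily as a std::list<int>; only the iterator type behind Container::const_iterator changes.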
Generic Programming and the STL
Another key part of the STL are algorithms:
    double sum = std::accumulate(s.begin(), s.end(), 0.0);
Some other STL algorithms are:
· find finds the first element equal to the supplied value
· count counts the number of matching elements
· transform applies a supplied function to each element
· for_each sweeps over all elements, does not alter
· inner_product inner product of two vectors
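Several of these algorithms can be tried on one vector. The sketch below is illustrative: the struct AlgorithmResults and the functions halve and run_algorithms are made-up names, and for_each is left out to keep it short.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Plain function handed to std::transform below.
double halve(double v) { return v / 2.0; }

struct AlgorithmResults {
    double sum;                  // accumulate
    long ones;                   // count of elements equal to 1.0
    bool found_two;              // whether find located 2.0
    double dot;                  // inner_product of s with itself
    std::vector<double> halves;  // transform output
};

AlgorithmResults run_algorithms(const std::vector<double>& s) {
    AlgorithmResults r;
    // The initial value 0.0 keeps the sum in double; an int 0 would sum as int.
    r.sum = std::accumulate(s.begin(), s.end(), 0.0);
    r.ones = std::count(s.begin(), s.end(), 1.0);
    r.found_two = std::find(s.begin(), s.end(), 2.0) != s.end();
    r.dot = std::inner_product(s.begin(), s.end(), s.begin(), 0.0);
    r.halves.resize(s.size());
    std::transform(s.begin(), s.end(), r.halves.begin(), halve);
    return r;
}
```

For the input 1, 2, 1, 4 this yields a sum of 8, two elements equal to 1, a successful find of 2, and an inner product of 22.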
Template Programming
Template programming provides a 'language within C++': code gets evaluated during compilation.
One of the simplest template examples is:
    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }
This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
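To see the template instantiated for several types, the sketch below wraps it in a namespace called demo, a hypothetical name chosen only to avoid a clash with std::min.

```cpp
#include <string>

namespace demo {   // hypothetical namespace, keeps this min apart from std::min

template <typename T>
const T& min(const T& x, const T& y) {
    return y < x ? y : x;
}

}  // namespace demo

// The compiler generates one version of min per type actually used.
int smaller_int() { return demo::min(3, 7); }
double smaller_double() { return demo::min(2.5, -1.0); }

// std::string qualifies because it provides operator<() (lexicographic order).
std::string smaller_string() {
    return demo::min(std::string("pear"), std::string("apple"));
}
```

Mixing types, as in demo::min(3, 2.5), does not compile, because T cannot be deduced as both int and double.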
Template Programming
Another template example is a class squaring its argument:
    template <typename T>
    class square : public std::unary_function<T,T> {
    public:
        T operator()(T t) const {
            return t*t;
        }
    };
which can be used along with STL algorithms:
    transform(x.begin(), x.end(), x.begin(), square<double>());
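A complete version of this example is sketched below. It drops the std::unary_function base class, which was deprecated in C++11 and removed in C++17, and it assumes a wrapper named square_all that is not part of the slides.

```cpp
#include <algorithm>
#include <vector>

// Function object squaring its argument; works for any T with operator*.
template <typename T>
class square {
public:
    T operator()(T t) const { return t * t; }
};

// std::transform needs the input range, the start of the output range,
// and an instance of the function object (note the trailing parentheses).
void square_all(std::vector<double>& x) {
    std::transform(x.begin(), x.end(), x.begin(), square<double>());
}
```

Writing the result back to x.begin() squares the vector in place; a different output iterator would leave the input untouched.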
Further Reading
Books by Meyers are excellent.
I also like the (free) C++ Annotations.
C++ FAQ
Resources on StackOverflow such as:
· general info and its links, e.g.
· booklist
Debugging
Debugging
Some tips:
· Generally painful; old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++, lldb for the clang / llvm family
· Extra tools such as valgrind are helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices
Best Practices
Version control: highly recommended to become familiar with git or svn.
Editor: in the long run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
Why R Programming with Data
ChambersComputationalMethods for DataAnalysis Wiley1977
Becker Chambersand Wilks TheNew S LanguageChapman amp Hall1988
Chambers andHastie StatisticalModels in SChapman amp Hall1992
ChambersProgramming withData Springer1998
ChambersSoftware for DataAnalysisProgramming withR Springer 2008
Thanks to John Chambers for sending me high-resolution scans of the covers of his books
658
A Simple Example
xx lt- faithful[eruptions]fit lt- density(xx)plot(fit)
758
A Simple Example
1 2 3 4 5 6
00
01
02
03
04
05
densitydefault(x = xx)
N = 272 Bandwidth = 03348
Den
sity
858
A Simple Example - Refined
xx lt- faithful[eruptions]fit1 lt- density(xx)fit2 lt- replicate(10000
x lt- sample(xxreplace=TRUE)density(x from=min(fit1$x) to=max(fit1$x))$y
)fit3 lt- apply(fit2 1 quantilec(00250975))plot(fit1 ylim=range(fit3))polygon(c(fit1$xrev(fit1$x)) c(fit3[1]rev(fit3[2]))
col=grey border=F)lines(fit1)
958
A Simple Example - Refined
1 2 3 4 5 6
00
01
02
03
04
05
densitydefault(x = xx)
N = 272 Bandwidth = 03348
Den
sity
1058
So Why R
R enables us to
middot work interactivelymiddot explore and visualize datamiddot access retrieve andor generate datamiddot summarize and report into pdf html hellip
making it the key language for statistical computing and apreferred environment for many data analysts
1158
So Why R
R has always been extensible via
middot C via a bare-bones interface described in Writing RExtensions
middot Fortran which is also used internally by Rmiddot Java via rJava by Simon Urbanekmiddot C++ but essentially at the bare-bones level of C
So while in theory this always worked ndash it was tedious inpractice
1258
Why Extend R
Chambers (2008) opens Chapter 11 Interfaces I Using C andFortran
Since the core of R is in fact a program written in the Clanguage itrsquos not surprising that the most direct interfaceto non-R software is for code written in C or directlycallable from C All the same including additional C codeis a serious step with some added dangers and often asubstantial amount of programming and debuggingrequired You should have a good reason
1358
Why Extend R
Chambers (2008) opens Chapter 11 Interfaces I Using C andFortran
Since the core of R is in fact a program written in the Clanguage itrsquos not surprising that the most direct interfaceto non-R software is for code written in C or directlycallable from C All the same including additional C codeis a serious step with some added dangers and often asubstantial amount of programming and debuggingrequired You should have a good reason
1458
Why Extend R
Chambers proceeds with this rough map of the road ahead
middot Againstmiddot Itrsquos more workmiddot Bugs will bitemiddot Potential platform dependencymiddot Less readable software
middot In Favormiddot New and trusted computationsmiddot Speedmiddot Object references
1558
Why Extend R
The Why boils down to
middot speed Often a good enough reason for us hellip and a focus forus in this workshop
middot new things We can bind to libraries and tools that wouldotherwise be unavailable in R
middot references Chambers quote from 2008 foreshadowed thework on the new Reference Classes now in R and built uponvia Rcpp Modules Rcpp Classes (and also RcppR6)
1658
Overview Why C++
Why C++
middot Asking Google leads to about 52 million hitsmiddot Wikipedia C++ is a statically typed free-form
multi-paradigm compiled general-purpose powerfulprogramming language
middot C++ is industrial-strength vendor-independent widely-usedand still evolving
middot In science amp research one of the most frequently-usedlanguages If there is something you want to use connectto it probably has a CC++ API
middot As a widely used language it also has good tool support(debuggers profilers code analysis)
1858
Why C++
Scott Meyers View C++ as a federation of languages
middot C provides a rich inheritance and interoperability as UnixWindows hellip are all build on C
middot Object-Oriented C++ (maybe just to provide endlessdiscussions about exactly what OO is or should be)
middot Templated C++ which is mighty powerful template metaprogramming unequalled in other languages
middot The Standard Template Library (STL) is a specific templatelibrary which is powerful but has its own conventions
middot C++11 (and C++14 and beyond) add enough to be calleda fifth language
NB Meyers original list of four languages appeared years before C++11
1958
Why C++
middot Mature yet currentmiddot Strong performance focus
middot You donrsquot pay for what you donrsquot usemiddot Leave no room for another language between the machine
level and C++
middot Yet also powerfully abstract and high-levelmiddot C++11 is a big deal giving us new language featuresmiddot While there are complexities Rcpp users are mostly shielded
2058
Overview Vision
Bell Labs May 1976
2258
Interface Vision
R offers us the best of both worlds
middot Compiled code withmiddot Access to proven libraries and algorithms in CC++Fortranmiddot Extremely high performance (in both serial and parallel modes)
middot Interpreted code withmiddot An accessible high-level language made for Programming with
Datamiddot An interactive workflow for data analysismiddot Support for rapid prototyping research and experimentation
2358
Why Rcpp
middot Easy to learn as it really does not have to be thatcomplicated ndash we will see numerous few examples
middot Easy to use as it avoids build and OS system complexitiesthanks to the R infrastrucure
middot Expressive as it allows for vectorised C++ using Rcpp Sugarmiddot Seamless access to all R objects vector matrix list
S3S4RefClass Environment Function hellipmiddot Speed gains for a variety of tasks Rcpp excels precisely
where R struggles loops function calls hellipmiddot Extensions greatly facilitates access to external libraries
using eg Rcpp modules
2458
Overview Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1(1 + x)
f lt- function(n x=1) for(i in 1n) x lt- 1(1+x)g lt- function(n x=1) for(i in 1n) x lt- (1(1+x))h lt- function(n x=1) for(i in 1n) x lt- (1+x)^(-1)j lt- function(n x=1) for(i in 1n) x lt- 11+xk lt- function(n x=1) for(i in 1n) x lt- 11+xlibrary(rbenchmark)N lt- 1e5benchmark(f(N1)g(N1)h(N1)j(N1)k(N1)order=relative)[14]
2658
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 5 k(N 1) 100 6435 1000 1 f(N 1) 100 6609 1027 2 g(N 1) 100 7757 1205 4 j(N 1) 100 7882 1225 3 h(N 1) 100 11766 1828
2758
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy
cppFunction(double m(int n double x)
for (int i=0 iltn i++)x = 1 (1+x)
return x
)
(We will learn more about cppFunction() later)
2858
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 6 m(N 1) 100 0170 1000 1 f(N 1) 100 6854 40318 5 k(N 1) 100 7811 45947 4 j(N 1) 100 9183 54018 2 g(N 1) 100 9489 55818 3 h(N 1) 100 11725 68971
2958
Speed Example 2 (due to StackOverflow)
Consider a function defined as
f(n) such that n when n lt 2
f(n minus 1) + f(n minus 2) when n ge 2
3058
Speed Example 2 (due to StackOverflow)
R implementation and use
f lt- function(n) if (n lt 2) return(n)return(f(n-1) + f(n-2))
Using it on first 11 argumentssapply(010 f)
[1] 0 1 1 2 3 5 8 13 21 34 55
3158
Speed Example 2 (due to StackOverflow)
Timing
library(rbenchmark)benchmark(f(10) f(15) f(20))[14]
test replications elapsed relative 1 f(10) 100 0030 1000 2 f(15) 100 0335 11167 3 f(20) 100 3517 117233
3258
Speed Example 2 (due to StackOverflow)
int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2))
deployed as
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
sapply(010 g)
[1] 0 1 1 2 3 5 8 13 21 34 55
3358
Speed Example 2 (due to StackOverflow)
Timing
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
library(rbenchmark)benchmark(f(20) g(20) order=relative)[14]
test replications elapsed relative 2 g(20) 100 0011 1000 1 f(20) 100 3776 343273
A nice gain of a few orders of magnitude3458
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that helps you code more succinctlyleading to fewer bugs and faster development
A good environment helps RStudio integrates R and C++development quite nicely (eg the compiler error messageparsing is very helpful) and also helps with package building
3558
What Next
Hands-on
Programming with C++
middot C++ Basicsmiddot Debuggingmiddot Best Practices
and then on to Rcpp itself
3758
C++ Basics
Compiled not Interpreted
Need to compile and link
include ltcstdiogt
int main(void) printf(Hello worldn)return 0
3958
Compiled not Interpreted
Or streams output rather than printf
include ltiostreamgt
int main(void) stdcout ltlt Hello world ltlt stdendlreturn 0
4058
Compiled not Interpreted
g++ -o will compile and link
We will now look at an examples with explicit linking
4158
Compiled not Interpreted
include ltcstdiogt
define MATHLIB_STANDALONEinclude ltRmathhgt
int main(void) printf(N(01) 95th percentile 98fn
qnorm(095 00 10 1 0))
4258
Compiled not Interpreted
We may need to supply
middot header location via -Imiddot library location via -Lmiddot library via -llibraryname
g++ -Iusrinclude -c qnorm_rmathcppg++ -o qnorm_rmath qnorm_rmatho -Lusrlib -lRmath
4358
Statically Typed
middot R is dynamically typed x lt- 314 x lt- foo is validmiddot In C++ each variable must be declared before first usemiddot Common types are int and long (possibly with unsigned)float and double bool as well as char
middot No standard string type though stdstring is closemiddot All these variables types are scalars which is fundamentally
different from R where everything is a vectormiddot class (and struct) allow creation of composite types
classes add behaviour to data to form objectsmiddot Variables need to be declared cannot change
4458
C++ Basics A Better C
C++ is a Better C
middot control structures similar to what R offers for while ifswitch
middot functions are similar too but note the difference inpositional-only matching also same function name butdifferent arguments allowed in C++
middot pointers and memory management very different but lotsof issues people had with C can be avoided via STL (whichis something Rcpp promotes too)
middot sometimes still useful to know what a pointer is hellip
Object-Oriented
This is a second key feature of C++, and it does it differently from S3 and S4.
    struct Date {
        unsigned int year;
        unsigned int month;
        unsigned int day;
    };

    struct Person {
        char firstname[20];
        char lastname[20];
        struct Date birthday;
        unsigned long id;
    };
Object-Oriented
Object-orientation in the C++ sense matches data with code operating on it:
    class Date {
    private:
        unsigned int year;
        unsigned int month;
        unsigned int date;
    public:
        void setDate(int y, int m, int d);
        int getDay();
        int getMonth();
        int getYear();
    };
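The slide only declares the member functions. One possible way to fill them in is sketched below; the default constructor, the const qualifiers, the member name day and the helper demo_year are assumptions added for this illustration, not part of the original class.

```cpp
#include <cassert>

// Sketch of a Date class with its member functions defined inline.
class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int day;
public:
    Date() : year(1970), month(1), day(1) {}      // arbitrary default date
    void setDate(int y, int m, int d) {
        year = y; month = m; day = d;             // no validation in this sketch
    }
    int getDay() const   { return day; }
    int getMonth() const { return month; }
    int getYear() const  { return year; }
};

// Usage: the data can only be reached through the public interface.
int demo_year() {
    Date d;
    d.setDate(2015, 6, 24);
    // d.year = 2016;   // would not compile: 'year' is private
    return d.getYear();
}
```

Keeping the fields private means any later change, such as adding range checks to setDate(), stays local to the class.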
Generic Programming and the STL
The STL promotes generic programming.
For example, the sequence container types vector, deque, and list all support
· push_back() to insert at the end
· pop_back() to remove from the end
· begin() returning an iterator to the first element
· end() returning an iterator to just after the last element
· size() for the number of elements
but only deque and list (not vector) have push_front() and pop_front().
Other useful containers: set, multiset, map and multimap.
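A small sketch of why the shared interface matters: one function template can work with any of the three sequence containers. The names fill_and_trim and front_demo are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <list>
#include <vector>

// Works for any sequence container offering push_back(), pop_back() and size().
template <typename Container>
std::size_t fill_and_trim(Container& c) {
    c.push_back(1.0);
    c.push_back(2.0);
    c.push_back(3.0);
    c.pop_back();          // removes the last element (3.0)
    return c.size();       // two elements remain
}

double front_demo() {
    std::list<double> l;
    fill_and_trim(l);      // l is now {1.0, 2.0}
    l.push_front(0.5);     // exists for std::list and std::deque, not std::vector
    return l.front();
}
```

The same fill_and_trim() call compiles unchanged for std::vector<double>, std::deque<double> and std::list<double>, which is the generic programming idea in miniature.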
Generic Programming and the STL
Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():
    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
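For reference, a self-contained variant of the traversal, plus the range-based for loop that C++11 offers for the same job. The function names print_all and total are made up here, and the output goes to a string rather than std::cout so the result is easy to inspect.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Writes each element on its own line using an explicit const_iterator.
std::string print_all(const std::vector<double>& s) {
    std::ostringstream out;
    for (std::vector<double>::const_iterator si = s.begin(); si != s.end(); ++si)
        out << *si << '\n';    // *si dereferences the iterator
    return out.str();
}

// Since C++11 the same traversal can be written as a range-based for loop,
// which calls begin() and end() behind the scenes.
double total(const std::vector<double>& s) {
    double sum = 0.0;
    for (double value : s) sum += value;
    return sum;
}
```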
Generic Programming and the STL
Algorithms are another key part of the STL:
    // 0.0 rather than 0: the type of the initial value is the accumulator type
    double sum = std::accumulate(s.begin(), s.end(), 0.0);
Some other STL algorithms are
· find finds the first element equal to the supplied value
· count counts the number of matching elements
· transform applies a supplied function to each element
· for_each sweeps over all elements, does not alter
· inner_product inner product of two vectors
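The algorithms listed above can be tried on a small vector as in this sketch. The struct AlgoResult and the functions halve and algo_demo are hypothetical helpers introduced only to collect the results.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

double halve(double v) { return v / 2.0; }

struct AlgoResult {
    double sum;         // from std::accumulate
    long   twos;        // from std::count
    bool   found;       // from std::find
    double dot;         // from std::inner_product
    double first_half;  // first element after std::transform
};

AlgoResult algo_demo() {
    std::vector<double> s;
    s.push_back(1.0); s.push_back(2.0); s.push_back(2.0); s.push_back(4.5);

    AlgoResult r;
    // An initial value of 0.0 keeps the accumulator a double; a plain 0 would truncate.
    r.sum   = std::accumulate(s.begin(), s.end(), 0.0);
    r.twos  = std::count(s.begin(), s.end(), 2.0);
    r.found = std::find(s.begin(), s.end(), 4.5) != s.end();
    r.dot   = std::inner_product(s.begin(), s.end(), s.begin(), 0.0);  // s dot s
    std::vector<double> h(s.size());
    std::transform(s.begin(), s.end(), h.begin(), halve);              // h = s / 2
    r.first_half = h[0];
    return r;
}
```

Note that accumulate and inner_product live in the header numeric, while find, count and transform come from algorithm.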
Template Programming
Template programming provides a 'language within C++': code gets evaluated during compilation.
One of the simplest template examples is
    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }
This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
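A short sketch of the min template in use with three different types. The namespace sketch and the wrapper functions are assumptions made for this example; the namespace simply avoids any clash with std::min.

```cpp
#include <cassert>
#include <string>

namespace sketch {
    // The min template from the slide, placed in its own namespace.
    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }
}

// The compiler generates one version of min per type used.
int smallest_int()            { return sketch::min(3, 7); }
double smallest_double()      { return sketch::min(2.5, -1.0); }
std::string smallest_string() { return sketch::min(std::string("pear"), std::string("apple")); }

// sketch::min(3, 2.5) would not compile: T cannot be both int and double.
```

std::string works because it provides operator<, which compares lexicographically.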
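Template Programming
Another template example is a class squaring its argument:
    // note: std::unary_function was deprecated in C++11 and removed in C++17
    template <typename T>
    class square : public std::unary_function<T, T> {
    public:
        T operator()(T t) const {
            return t * t;
        }
    };
which can be used along with STL algorithms:
    // assuming x is a std::vector<double>; results are written back into x
    std::transform(x.begin(), x.end(), x.begin(), square<double>());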
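A self-contained sketch of the squaring function object that also compiles under newer standards, because it does not derive from std::unary_function. The helper function squared is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Function object squaring its argument; no base class is needed.
template <typename T>
class square {
public:
    T operator()(T t) const { return t * t; }
};

// Returns a copy of x with every element squared.
std::vector<double> squared(std::vector<double> x) {
    // square<double>() creates a temporary function object;
    // the third argument says where the results are written.
    std::transform(x.begin(), x.end(), x.begin(), square<double>());
    return x;
}
```

The object can also be called directly, as in square<int>()(3).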
Further Reading
Books by Meyers are excellent.
I also like the (free) C++ Annotations.
C++ FAQ
Resources on StackOverflow such as
· general info and its links, e.g.
· booklist
Debugging
Some tips:
· Generally painful, old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
· Extra tools such as valgrind helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
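As a rough sketch of how these tips fit together: the comments show typical build lines, with demo.cpp as a placeholder file name, and the function safe_get (a made-up helper) combines a bounds-checked access with an old-school trace message.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <vector>

// Build with debug symbols for gdb / lldb / valgrind, e.g.:
//   g++ -g -O0 demo.cpp -o demo
// Build with the sanitizers (recent g++ or clang++), e.g.:
//   g++ -g -fsanitize=address,undefined demo.cpp -o demo

// Returns v[i], or the fallback value if i is out of range.
double safe_get(const std::vector<double>& v, std::size_t i, double fallback) {
    try {
        return v.at(i);              // at() checks bounds and throws; v[i] does not check
    } catch (const std::out_of_range& e) {
        // printf-style trace written to stderr
        std::fprintf(stderr, "debug: index %lu out of range (%s)\n",
                     static_cast<unsigned long>(i), e.what());
        return fallback;
    }
}
```

With plain v[i] an out-of-range read is undefined behaviour; that is the kind of error valgrind or the address sanitizer is meant to catch.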
Best Practices
Version control: highly recommended to become familiar with git or svn.
Editor: in the long run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
A good environment helps RStudio integrates R and C++development quite nicely (eg the compiler error messageparsing is very helpful) and also helps with package building
3558
What Next
Hands-on
Programming with C++
middot C++ Basicsmiddot Debuggingmiddot Best Practices
and then on to Rcpp itself
3758
C++ Basics
Compiled not Interpreted
Need to compile and link
include ltcstdiogt
int main(void) printf(Hello worldn)return 0
3958
Compiled not Interpreted
Or streams output rather than printf
include ltiostreamgt
int main(void) stdcout ltlt Hello world ltlt stdendlreturn 0
4058
Compiled not Interpreted
g++ -o will compile and link
We will now look at an examples with explicit linking
4158
Compiled not Interpreted
include ltcstdiogt
define MATHLIB_STANDALONEinclude ltRmathhgt
int main(void) printf(N(01) 95th percentile 98fn
qnorm(095 00 10 1 0))
4258
Compiled not Interpreted
We may need to supply
middot header location via -Imiddot library location via -Lmiddot library via -llibraryname
g++ -Iusrinclude -c qnorm_rmathcppg++ -o qnorm_rmath qnorm_rmatho -Lusrlib -lRmath
4358
Statically Typed
middot R is dynamically typed x lt- 314 x lt- foo is validmiddot In C++ each variable must be declared before first usemiddot Common types are int and long (possibly with unsigned)float and double bool as well as char
middot No standard string type though stdstring is closemiddot All these variables types are scalars which is fundamentally
different from R where everything is a vectormiddot class (and struct) allow creation of composite types
classes add behaviour to data to form objectsmiddot Variables need to be declared cannot change
4458
C++ Basics A Better C
C++ is a Better C
middot control structures similar to what R offers for while ifswitch
middot functions are similar too but note the difference inpositional-only matching also same function name butdifferent arguments allowed in C++
middot pointers and memory management very different but lotsof issues people had with C can be avoided via STL (whichis something Rcpp promotes too)
middot sometimes still useful to know what a pointer is hellip
4658
Object-Oriented
This is a second key feature of C++ and it does it differentlyfrom S3 and S4
struct Date unsigned int yearunsigned int monthunsigned int day
struct Person char firstname[20]char lastname[20]struct Date birthdayunsigned long id
4758
Object-Oriented
Object-orientation in the C++ sense matches data with codeoperating on it
class Date private
unsigned int yearunsigned int monthunsigned int date
publicvoid setDate(int y int m int d)int getDay()int getMonth()int getYear()
4858
Generic Programming and the STL
The STL promotes generic programming
For example the sequence container types vector dequeand list all support
middot push_back() to insert at the endmiddot pop_back() to remove from the frontmiddot begin() returning an iterator to the first elementmiddot end() returning an iterator to just after the last elementmiddot size() for the number of elements
but only list has push_front() and pop_front()
Other useful containers set multiset map and multimap4958
Generic Programming and the STL
Traversal of containers can be achieved via iterators whichrequire suitable member functions begin() and end()
stdvectorltdoublegtconst_iterator sifor (si=sbegin() si = send() si++)
stdcout ltlt si ltlt stdendl
5058
Generic Programming and the STL
Another key STL part are algorithms
double sum = accumulate(sbegin() send() 0)
Some other STL algorithms are
middot find finds the first element equal to the supplied valuemiddot count counts the number of matching elementsmiddot transform applies a supplied function to each elementmiddot for_each sweeps over all elements does not altermiddot inner_product inner product of two vectors
5158
Template Programming
Template programming provides a lsquolanguage within C++rsquocode gets evaluated during compilation
One of the simplest template examples is
template lttypename Tgtconst Tamp min(const Tamp x const Tamp y)
return y lt x y x
This can now be used to compute the minimum between twoint variables or double or in fact any admissible typeproviding an operatorlt() for less-than comparison
5258
Template Programming
Another template example is a class squaring its argument
template lttypename Tgtclass square public stdunary_functionltTTgt public
T operator()(T t) const return tt
which can be used along with STL algorithms
transform(xbegin() xend() square)
5358
Further Reading
Books by Meyers are excellent
I also like the (free) C++ Annotations
C++ FAQ
Resources on StackOverflow such as
middot general info and its links egmiddot booklist
5458
Debugging
Debugging
Some tips
middot Generally painful old-school printf() still pervasivemiddot Debuggers go along with compilers gdb for gcc and g++lldb for the clang llvm family
middot Extra tools such as valgrind helpful for memory debuggingmiddot ldquoSanitizerrdquo (ASANUBSAN) in newer versions of g++ andclang++
5658
Best Practices
Best Practices
Version control highly recommended to become familiar withgit or svn
Editor in the long-run recommended to learn productivitytricks for one editor emacs vi eclipse RStudio hellip
5858
- Overview
-
- Why R
- Why C++
- Vision
- Speed
-
- What Next
- C++ Basics
-
- A Better C
-
- Debugging
- Best Practices
-
![Page 9: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/9.jpg)
A Simple Example
1 2 3 4 5 6
00
01
02
03
04
05
densitydefault(x = xx)
N = 272 Bandwidth = 03348
Den
sity
858
A Simple Example - Refined
xx lt- faithful[eruptions]fit1 lt- density(xx)fit2 lt- replicate(10000
x lt- sample(xxreplace=TRUE)density(x from=min(fit1$x) to=max(fit1$x))$y
)fit3 lt- apply(fit2 1 quantilec(00250975))plot(fit1 ylim=range(fit3))polygon(c(fit1$xrev(fit1$x)) c(fit3[1]rev(fit3[2]))
col=grey border=F)lines(fit1)
958
A Simple Example - Refined
1 2 3 4 5 6
00
01
02
03
04
05
densitydefault(x = xx)
N = 272 Bandwidth = 03348
Den
sity
1058
So Why R
R enables us to
middot work interactivelymiddot explore and visualize datamiddot access retrieve andor generate datamiddot summarize and report into pdf html hellip
making it the key language for statistical computing and apreferred environment for many data analysts
1158
So Why R
R has always been extensible via
middot C via a bare-bones interface described in Writing RExtensions
middot Fortran which is also used internally by Rmiddot Java via rJava by Simon Urbanekmiddot C++ but essentially at the bare-bones level of C
So while in theory this always worked ndash it was tedious inpractice
1258
Why Extend R
Chambers (2008) opens Chapter 11 Interfaces I Using C andFortran
Since the core of R is in fact a program written in the Clanguage itrsquos not surprising that the most direct interfaceto non-R software is for code written in C or directlycallable from C All the same including additional C codeis a serious step with some added dangers and often asubstantial amount of programming and debuggingrequired You should have a good reason
1358
Why Extend R
Chambers (2008) opens Chapter 11 Interfaces I Using C andFortran
Since the core of R is in fact a program written in the Clanguage itrsquos not surprising that the most direct interfaceto non-R software is for code written in C or directlycallable from C All the same including additional C codeis a serious step with some added dangers and often asubstantial amount of programming and debuggingrequired You should have a good reason
1458
Why Extend R
Chambers proceeds with this rough map of the road ahead
middot Againstmiddot Itrsquos more workmiddot Bugs will bitemiddot Potential platform dependencymiddot Less readable software
middot In Favormiddot New and trusted computationsmiddot Speedmiddot Object references
1558
Why Extend R
The Why boils down to
middot speed Often a good enough reason for us hellip and a focus forus in this workshop
middot new things We can bind to libraries and tools that wouldotherwise be unavailable in R
middot references Chambers quote from 2008 foreshadowed thework on the new Reference Classes now in R and built uponvia Rcpp Modules Rcpp Classes (and also RcppR6)
1658
Overview Why C++
Why C++
middot Asking Google leads to about 52 million hitsmiddot Wikipedia C++ is a statically typed free-form
multi-paradigm compiled general-purpose powerfulprogramming language
middot C++ is industrial-strength vendor-independent widely-usedand still evolving
middot In science amp research one of the most frequently-usedlanguages If there is something you want to use connectto it probably has a CC++ API
middot As a widely used language it also has good tool support(debuggers profilers code analysis)
1858
Why C++
Scott Meyers View C++ as a federation of languages
middot C provides a rich inheritance and interoperability as UnixWindows hellip are all build on C
middot Object-Oriented C++ (maybe just to provide endlessdiscussions about exactly what OO is or should be)
middot Templated C++ which is mighty powerful template metaprogramming unequalled in other languages
middot The Standard Template Library (STL) is a specific templatelibrary which is powerful but has its own conventions
middot C++11 (and C++14 and beyond) add enough to be calleda fifth language
NB Meyers original list of four languages appeared years before C++11
1958
Why C++
middot Mature yet currentmiddot Strong performance focus
middot You donrsquot pay for what you donrsquot usemiddot Leave no room for another language between the machine
level and C++
middot Yet also powerfully abstract and high-levelmiddot C++11 is a big deal giving us new language featuresmiddot While there are complexities Rcpp users are mostly shielded
2058
Overview Vision
Bell Labs May 1976
2258
Interface Vision
R offers us the best of both worlds
middot Compiled code withmiddot Access to proven libraries and algorithms in CC++Fortranmiddot Extremely high performance (in both serial and parallel modes)
middot Interpreted code withmiddot An accessible high-level language made for Programming with
Datamiddot An interactive workflow for data analysismiddot Support for rapid prototyping research and experimentation
2358
Why Rcpp
middot Easy to learn as it really does not have to be thatcomplicated ndash we will see numerous few examples
middot Easy to use as it avoids build and OS system complexitiesthanks to the R infrastrucure
middot Expressive as it allows for vectorised C++ using Rcpp Sugarmiddot Seamless access to all R objects vector matrix list
S3S4RefClass Environment Function hellipmiddot Speed gains for a variety of tasks Rcpp excels precisely
where R struggles loops function calls hellipmiddot Extensions greatly facilitates access to external libraries
using eg Rcpp modules
2458
Overview Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1(1 + x)
f lt- function(n x=1) for(i in 1n) x lt- 1(1+x)g lt- function(n x=1) for(i in 1n) x lt- (1(1+x))h lt- function(n x=1) for(i in 1n) x lt- (1+x)^(-1)j lt- function(n x=1) for(i in 1n) x lt- 11+xk lt- function(n x=1) for(i in 1n) x lt- 11+xlibrary(rbenchmark)N lt- 1e5benchmark(f(N1)g(N1)h(N1)j(N1)k(N1)order=relative)[14]
2658
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 5 k(N 1) 100 6435 1000 1 f(N 1) 100 6609 1027 2 g(N 1) 100 7757 1205 4 j(N 1) 100 7882 1225 3 h(N 1) 100 11766 1828
2758
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy
cppFunction(double m(int n double x)
for (int i=0 iltn i++)x = 1 (1+x)
return x
)
(We will learn more about cppFunction() later)
2858
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 6 m(N 1) 100 0170 1000 1 f(N 1) 100 6854 40318 5 k(N 1) 100 7811 45947 4 j(N 1) 100 9183 54018 2 g(N 1) 100 9489 55818 3 h(N 1) 100 11725 68971
2958
Speed Example 2 (due to StackOverflow)
Consider a function defined as
f(n) such that n when n lt 2
f(n minus 1) + f(n minus 2) when n ge 2
3058
Speed Example 2 (due to StackOverflow)
R implementation and use
f lt- function(n) if (n lt 2) return(n)return(f(n-1) + f(n-2))
Using it on first 11 argumentssapply(010 f)
[1] 0 1 1 2 3 5 8 13 21 34 55
3158
Speed Example 2 (due to StackOverflow)
Timing
library(rbenchmark)benchmark(f(10) f(15) f(20))[14]
test replications elapsed relative 1 f(10) 100 0030 1000 2 f(15) 100 0335 11167 3 f(20) 100 3517 117233
3258
Speed Example 2 (due to StackOverflow)
int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2))
deployed as
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
sapply(010 g)
[1] 0 1 1 2 3 5 8 13 21 34 55
3358
Speed Example 2 (due to StackOverflow)
Timing
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
library(rbenchmark)benchmark(f(20) g(20) order=relative)[14]
test replications elapsed relative 2 g(20) 100 0011 1000 1 f(20) 100 3776 343273
A nice gain of a few orders of magnitude3458
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that helps you code more succinctlyleading to fewer bugs and faster development
A good environment helps RStudio integrates R and C++development quite nicely (eg the compiler error messageparsing is very helpful) and also helps with package building
3558
What Next
Hands-on
Programming with C++
middot C++ Basicsmiddot Debuggingmiddot Best Practices
and then on to Rcpp itself
3758
C++ Basics
Compiled not Interpreted
Need to compile and link
include ltcstdiogt
int main(void) printf(Hello worldn)return 0
3958
Compiled not Interpreted
Or streams output rather than printf
include ltiostreamgt
int main(void) stdcout ltlt Hello world ltlt stdendlreturn 0
4058
Compiled not Interpreted
g++ -o will compile and link
We will now look at an examples with explicit linking
4158
Compiled not Interpreted
include ltcstdiogt
define MATHLIB_STANDALONEinclude ltRmathhgt
int main(void) printf(N(01) 95th percentile 98fn
qnorm(095 00 10 1 0))
4258
Compiled not Interpreted
We may need to supply
middot header location via -Imiddot library location via -Lmiddot library via -llibraryname
g++ -Iusrinclude -c qnorm_rmathcppg++ -o qnorm_rmath qnorm_rmatho -Lusrlib -lRmath
4358
Statically Typed
middot R is dynamically typed x lt- 314 x lt- foo is validmiddot In C++ each variable must be declared before first usemiddot Common types are int and long (possibly with unsigned)float and double bool as well as char
middot No standard string type though stdstring is closemiddot All these variables types are scalars which is fundamentally
different from R where everything is a vectormiddot class (and struct) allow creation of composite types
classes add behaviour to data to form objectsmiddot Variables need to be declared cannot change
4458
C++ Basics A Better C
C++ is a Better C
middot control structures similar to what R offers for while ifswitch
middot functions are similar too but note the difference inpositional-only matching also same function name butdifferent arguments allowed in C++
middot pointers and memory management very different but lotsof issues people had with C can be avoided via STL (whichis something Rcpp promotes too)
middot sometimes still useful to know what a pointer is hellip
4658
Object-Oriented
This is a second key feature of C++ and it does it differentlyfrom S3 and S4
struct Date unsigned int yearunsigned int monthunsigned int day
struct Person char firstname[20]char lastname[20]struct Date birthdayunsigned long id
4758
Object-Oriented
Object-orientation in the C++ sense matches data with codeoperating on it
class Date private
unsigned int yearunsigned int monthunsigned int date
publicvoid setDate(int y int m int d)int getDay()int getMonth()int getYear()
4858
Generic Programming and the STL
The STL promotes generic programming
For example the sequence container types vector dequeand list all support
middot push_back() to insert at the endmiddot pop_back() to remove from the frontmiddot begin() returning an iterator to the first elementmiddot end() returning an iterator to just after the last elementmiddot size() for the number of elements
but only list has push_front() and pop_front()
Other useful containers set multiset map and multimap4958
Generic Programming and the STL
Traversal of containers can be achieved via iterators whichrequire suitable member functions begin() and end()
stdvectorltdoublegtconst_iterator sifor (si=sbegin() si = send() si++)
stdcout ltlt si ltlt stdendl
5058
Generic Programming and the STL
Another key STL part are algorithms
double sum = accumulate(sbegin() send() 0)
Some other STL algorithms are
middot find finds the first element equal to the supplied valuemiddot count counts the number of matching elementsmiddot transform applies a supplied function to each elementmiddot for_each sweeps over all elements does not altermiddot inner_product inner product of two vectors
5158
Template Programming
Template programming provides a lsquolanguage within C++rsquocode gets evaluated during compilation
One of the simplest template examples is
template lttypename Tgtconst Tamp min(const Tamp x const Tamp y)
return y lt x y x
This can now be used to compute the minimum between twoint variables or double or in fact any admissible typeproviding an operatorlt() for less-than comparison
5258
Template Programming
Another template example is a class squaring its argument
template lttypename Tgtclass square public stdunary_functionltTTgt public
T operator()(T t) const return tt
which can be used along with STL algorithms
transform(xbegin() xend() square)
5358
Further Reading
Books by Meyers are excellent
I also like the (free) C++ Annotations
C++ FAQ
Resources on StackOverflow such as
middot general info and its links egmiddot booklist
5458
Debugging
Debugging
Some tips
middot Generally painful old-school printf() still pervasivemiddot Debuggers go along with compilers gdb for gcc and g++lldb for the clang llvm family
middot Extra tools such as valgrind helpful for memory debuggingmiddot ldquoSanitizerrdquo (ASANUBSAN) in newer versions of g++ andclang++
5658
Best Practices
Best Practices
Version control highly recommended to become familiar withgit or svn
Editor in the long-run recommended to learn productivitytricks for one editor emacs vi eclipse RStudio hellip
5858
- Overview
-
- Why R
- Why C++
- Vision
- Speed
-
- What Next
- C++ Basics
-
- A Better C
-
- Debugging
- Best Practices
-
![Page 10: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/10.jpg)
A Simple Example - Refined
xx lt- faithful[eruptions]fit1 lt- density(xx)fit2 lt- replicate(10000
x lt- sample(xxreplace=TRUE)density(x from=min(fit1$x) to=max(fit1$x))$y
)fit3 lt- apply(fit2 1 quantilec(00250975))plot(fit1 ylim=range(fit3))polygon(c(fit1$xrev(fit1$x)) c(fit3[1]rev(fit3[2]))
col=grey border=F)lines(fit1)
958
A Simple Example - Refined
1 2 3 4 5 6
00
01
02
03
04
05
densitydefault(x = xx)
N = 272 Bandwidth = 03348
Den
sity
1058
So Why R
R enables us to
middot work interactivelymiddot explore and visualize datamiddot access retrieve andor generate datamiddot summarize and report into pdf html hellip
making it the key language for statistical computing and apreferred environment for many data analysts
1158
So Why R
R has always been extensible via
middot C via a bare-bones interface described in Writing RExtensions
middot Fortran which is also used internally by Rmiddot Java via rJava by Simon Urbanekmiddot C++ but essentially at the bare-bones level of C
So while in theory this always worked ndash it was tedious inpractice
1258
Why Extend R
Chambers (2008) opens Chapter 11 Interfaces I Using C andFortran
Since the core of R is in fact a program written in the Clanguage itrsquos not surprising that the most direct interfaceto non-R software is for code written in C or directlycallable from C All the same including additional C codeis a serious step with some added dangers and often asubstantial amount of programming and debuggingrequired You should have a good reason
1358
Why Extend R
Chambers proceeds with this rough map of the road ahead
· Against:
    · It’s more work
    · Bugs will bite
    · Potential platform dependency
    · Less readable software
· In Favor:
    · New and trusted computations
    · Speed
    · Object references
Why Extend R
The Why boils down to
· speed: Often a good enough reason for us … and a focus for us in this workshop.
· new things: We can bind to libraries and tools that would otherwise be unavailable in R.
· references: Chambers’ quote from 2008 foreshadowed the work on the new Reference Classes now in R and built upon via Rcpp Modules, Rcpp Classes (and also RcppR6).
Overview Why C++
Why C++
· Asking Google leads to about 52 million hits.
· Wikipedia: C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose, powerful programming language.
· C++ is industrial-strength, vendor-independent, widely-used, and still evolving.
· In science & research, one of the most frequently-used languages: if there is something you want to use / connect to, it probably has a C/C++ API.
· As a widely used language it also has good tool support (debuggers, profilers, code analysis).
Why C++
Scott Meyers: View C++ as a federation of languages
· C provides a rich inheritance and interoperability as Unix, Windows, … are all built on C.
· Object-Oriented C++ (maybe just to provide endless discussions about exactly what OO is or should be).
· Templated C++ which is mighty powerful; template metaprogramming unequalled in other languages.
· The Standard Template Library (STL) is a specific template library which is powerful but has its own conventions.
· C++11 (and C++14 and beyond) add enough to be called a fifth language.
NB: Meyers’ original list of four languages appeared years before C++11.
Why C++
· Mature yet current
· Strong performance focus:
    · You don’t pay for what you don’t use
    · Leave no room for another language between the machine level and C++
· Yet also powerfully abstract and high-level
· C++11 is a big deal giving us new language features
· While there are complexities, Rcpp users are mostly shielded
Overview Vision
Bell Labs May 1976
Interface Vision
R offers us the best of both worlds
· Compiled code with
    · Access to proven libraries and algorithms in C/C++/Fortran
    · Extremely high performance (in both serial and parallel modes)
· Interpreted code with
    · An accessible high-level language made for Programming with Data
    · An interactive workflow for data analysis
    · Support for rapid prototyping, research, and experimentation
Why Rcpp
· Easy to learn as it really does not have to be that complicated; we will see numerous examples
· Easy to use as it avoids build and OS system complexities thanks to the R infrastructure
· Expressive as it allows for vectorised C++ using Rcpp Sugar
· Seamless access to all R objects: vector, matrix, list, S3/S4/RefClass, Environment, Function, …
· Speed gains for a variety of tasks: Rcpp excels precisely where R struggles (loops, function calls, …)
· Extensions: greatly facilitates access to external libraries using e.g. Rcpp modules
Overview Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1/(1 + x):
    f <- function(n, x=1) for(i in 1:n) x <- 1/(1+x)
    g <- function(n, x=1) for(i in 1:n) x <- (1/(1+x))
    h <- function(n, x=1) for(i in 1:n) x <- (1+x)^(-1)
    j <- function(n, x=1) for(i in 1:n) x <- {1/{1+x}}
    k <- function(n, x=1) for(i in 1:n) x <- 1/{1+x}
    library(rbenchmark)
    N <- 1e5
    benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1), order="relative")[,1:4]
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    5 k(N, 1)          100   6.435    1.000
    1 f(N, 1)          100   6.609    1.027
    2 g(N, 1)          100   7.757    1.205
    4 j(N, 1)          100   7.882    1.225
    3 h(N, 1)          100  11.766    1.828
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy
    cppFunction('double m(int n, double x) {
        for (int i=0; i<n; i++)
            x = 1 / (1+x);
        return x;
    }')
(We will learn more about cppFunction() later.)
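For readers who want to try the loop outside of R, the same body can be sketched as plain C++. The function name m mirrors the slide; the helper goldenRatioConjugate() is an assumption added here only to give the result something to be compared against, since repeatedly applying x -> 1/(1+x) settles at (sqrt(5) - 1)/2, roughly 0.618.

```cpp
#include <cmath>

// Same loop as the cppFunction() body: apply x -> 1/(1+x) n times.
double m(int n, double x) {
    for (int i = 0; i < n; i++)
        x = 1 / (1 + x);
    return x;
}

// Hypothetical helper (not in the slides): the fixed point of x -> 1/(1+x).
double goldenRatioConjugate() {
    return (std::sqrt(5.0) - 1.0) / 2.0;
}
```

Calling m(100000, 1.0) should agree with goldenRatioConjugate() up to floating-point rounding, which is a quick sanity check that the compiled loop computes the same thing as the R variants.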
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    6 m(N, 1)          100   0.170    1.000
    1 f(N, 1)          100   6.854   40.318
    5 k(N, 1)          100   7.811   45.947
    4 j(N, 1)          100   9.183   54.018
    2 g(N, 1)          100   9.489   55.818
    3 h(N, 1)          100  11.725   68.971
Speed Example 2 (due to StackOverflow)
Consider a function defined as
    f(n) = n                      when n < 2
    f(n) = f(n − 1) + f(n − 2)    when n ≥ 2
Speed Example 2 (due to StackOverflow)
R implementation and use
    f <- function(n) {
        if (n < 2) return(n)
        return(f(n-1) + f(n-2))
    }
    ## Using it on first 11 arguments
    sapply(0:10, f)
    ##  [1]  0  1  1  2  3  5  8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing
    library(rbenchmark)
    benchmark(f(10), f(15), f(20))[,1:4]
       test replications elapsed relative
    1 f(10)          100   0.030    1.000
    2 f(15)          100   0.335   11.167
    3 f(20)          100   3.517  117.233
Speed Example 2 (due to StackOverflow)
    int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2));
    }
deployed as
    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    ## Using it on first 11 arguments
    sapply(0:10, g)
    ##  [1]  0  1  1  2  3  5  8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing
    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    library(rbenchmark)
    benchmark(f(20), g(20), order="relative")[,1:4]
       test replications elapsed relative
    2 g(20)          100   0.011    1.000
    1 f(20)          100   3.776  343.273
A nice gain of a few orders of magnitude.
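One way to see why this toy function is such a demanding test of function-call overhead is to count the calls. The sketch below is an illustration and not part of the slides: the counter calls and the functions gCounted() and countCalls() are hypothetical names introduced here.

```cpp
// Global call counter; a hypothetical addition for illustration only.
static long calls = 0;

// Same recursion as g() above, but every invocation is counted.
int gCounted(int n) {
    ++calls;
    if (n < 2) return n;
    return gCounted(n - 1) + gCounted(n - 2);
}

// Reset the counter, evaluate gCounted(n), and report how many calls it took.
long countCalls(int n) {
    calls = 0;
    gCounted(n);
    return calls;
}
```

Under these assumptions countCalls(20) reports 21891 calls for a result of 6765, so the benchmark mostly measures the cost of a function call, which is where the compiled version gains.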
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that Rcpp helps you code more succinctly, leading to fewer bugs and faster development.
A good environment helps: RStudio integrates R and C++ development quite nicely (e.g. the compiler error message parsing is very helpful) and also helps with package building.
What Next
Hands-on
Programming with C++
· C++ Basics
· Debugging
· Best Practices
and then on to Rcpp itself.
C++ Basics
Compiled not Interpreted
Need to compile and link
    #include <cstdio>

    int main(void) {
        printf("Hello, world!\n");
        return 0;
    }
Compiled not Interpreted
Or streams output rather than printf
    #include <iostream>

    int main(void) {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
Compiled not Interpreted
g++ -o will compile and link
We will now look at an example with explicit linking.
Compiled not Interpreted
    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }
Compiled not Interpreted
We may need to supply
· header location via -I
· library location via -L
· library via -llibraryname
    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
Statically Typed
· R is dynamically typed: x <- 3.14; x <- "foo" is valid.
· In C++, each variable must be declared before first use.
· Common types are int and long (possibly with unsigned), float and double, bool, as well as char.
· No built-in string type, though std::string from the standard library comes close.
· All these variable types are scalars, which is fundamentally different from R where everything is a vector.
· class (and struct) allow creation of composite types; classes add behaviour to data to form objects.
· Variables need to be declared, and their type cannot change afterwards.
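The points on static typing can be sketched in a few lines of C++. All function names below (scaledPi, greet, squares) are hypothetical and chosen only for this illustration.

```cpp
#include <string>
#include <vector>

// A declared variable keeps its type for its whole lifetime.
double scaledPi(int factor) {
    double x = 3.14;      // x is a double from here on
    // x = "foo";         // unlike in R, this line would not compile
    return x * factor;
}

// std::string from the standard library stands in for a built-in string type.
std::string greet(const std::string& name) {
    return "Hello, " + name;
}

// A double is a single scalar; std::vector<double> is the closest analogue
// to an R numeric vector.
std::vector<double> squares(unsigned int n) {
    std::vector<double> out;
    for (unsigned int i = 1; i <= n; i++)
        out.push_back(static_cast<double>(i) * i);
    return out;
}
```

Note how the return type and every argument type is spelled out; the compiler rejects mismatches before the program ever runs, whereas R would only notice at run time.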
C++ Basics A Better C
C++ is a Better C
· control structures similar to what R offers: for, while, if, switch
· functions are similar too, but note the difference in positional-only matching; also, the same function name with different arguments is allowed in C++ (overloading)
· pointers and memory management: very different, but lots of issues people had with C can be avoided via the STL (which is something Rcpp promotes too)
· sometimes still useful to know what a pointer is …
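Two of these points, overloading and avoiding raw pointers via the STL, can be illustrated with a small sketch. The names twice, sumRaw and sumVec are hypothetical and introduced only for this example.

```cpp
#include <cstddef>
#include <vector>

// Overloading: one name, several argument lists; the compiler picks by type.
int    twice(int x)    { return 2 * x; }
double twice(double x) { return 2.0 * x; }

// C style: a raw pointer plus a separately tracked length.
double sumRaw(const double* p, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}

// STL style: the container knows its own size and releases its own memory.
double sumVec(const std::vector<double>& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); i++)
        total += v[i];
    return total;
}
```

Both sum functions compute the same value; the difference is that sumRaw trusts the caller to pass a correct length, while sumVec cannot run past the end of its data by construction.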
Object-Oriented
This is a second key feature of C++, and it does it differently from S3 and S4.
    struct Date {
        unsigned int year;
        unsigned int month;
        unsigned int day;
    };

    struct Person {
        char firstname[20];
        char lastname[20];
        struct Date birthday;
        unsigned long id;
    };
Object-Oriented
Object-orientation in the C++ sense matches data with code operating on it:
    class Date {
    private:
        unsigned int year;
        unsigned int month;
        unsigned int day;
    public:
        void setDate(int y, int m, int d);
        int getDay();
        int getMonth();
        int getYear();
    };
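The slide shows only the declarations. A minimal sketch of how the member functions might be defined follows; the definitions are an assumption made here, not taken from the slides, and the third field is called day in this sketch.

```cpp
// Declaration in the spirit of the slide.
class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int day;
public:
    void setDate(int y, int m, int d);
    int getDay();
    int getMonth();
    int getYear();
};

// Member function definitions (assumed): the private data can only be
// reached through these public functions.
void Date::setDate(int y, int m, int d) {
    year  = static_cast<unsigned int>(y);
    month = static_cast<unsigned int>(m);
    day   = static_cast<unsigned int>(d);
}
int Date::getDay()   { return static_cast<int>(day); }
int Date::getMonth() { return static_cast<int>(month); }
int Date::getYear()  { return static_cast<int>(year); }
```

A caller would write Date d; d.setDate(2015, 6, 24); and then query d.getDay(). Direct access such as d.day is rejected by the compiler because the field is private.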
Generic Programming and the STL
The STL promotes generic programming
For example, the sequence container types vector, deque, and list all support
· push_back() to insert at the end
· pop_back() to remove from the end
· begin() returning an iterator to the first element
· end() returning an iterator to just after the last element
· size() for the number of elements
but only deque and list have push_front() and pop_front().
Other useful containers: set, multiset, map and multimap.
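Because the sequence containers share these member functions, a single template can serve all of them. The following is a minimal sketch; fillUpTo and total are hypothetical helper names introduced for this illustration.

```cpp
#include <deque>
#include <list>
#include <vector>

// Generic helper: works for any sequence container offering push_back().
template <typename Container>
void fillUpTo(Container& c, int n) {
    for (int i = 1; i <= n; i++)
        c.push_back(i);
}

// Sum via iterators, again independent of the concrete container type.
template <typename Container>
int total(const Container& c) {
    int sum = 0;
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        sum += *it;
    return sum;
}
```

The same two functions can be called with a std::vector<int>, a std::deque<int> or a std::list<int>. Calling push_front() on the deque or the list also compiles, whereas the same call on the vector is a compile-time error.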
Generic Programming and the STL
Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():
    std::vector<double>::const_iterator si;
    for (si=s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
Generic Programming and the STL
Algorithms are another key part of the STL:
    // start value 0.0, not 0, so that the sum is accumulated as a double
    double sum = std::accumulate(s.begin(), s.end(), 0.0);
Some other STL algorithms are
· find finds the first element equal to the supplied value
· count counts the number of matching elements
· transform applies a supplied function to each element
· for_each sweeps over all elements, does not alter
· inner_product inner product of two vectors
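A few of these algorithms are sketched below on a std::vector<double>. The wrapper names (sumOf, countOf, indexOf, dot) are hypothetical; the std:: calls inside them are the standard algorithms from <algorithm> and <numeric>.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// accumulate: sum of all elements; the 0.0 start value keeps the sum a double.
double sumOf(const std::vector<double>& s) {
    return std::accumulate(s.begin(), s.end(), 0.0);
}

// count: number of elements equal to value.
std::ptrdiff_t countOf(const std::vector<double>& s, double value) {
    return std::count(s.begin(), s.end(), value);
}

// find: index of the first element equal to value, or s.size() if absent.
std::size_t indexOf(const std::vector<double>& s, double value) {
    return static_cast<std::size_t>(
        std::find(s.begin(), s.end(), value) - s.begin());
}

// inner_product: sum of element-wise products of two equally long vectors.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}
```

For a vector holding 1, 2, 2, 3 these return a sum of 8, a count of 2 for the value 2, index 3 for the value 3, and an inner product of 18 with itself.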
Template Programming
Template programming provides a ‘language within C++’: code gets evaluated during compilation.
One of the simplest template examples is
    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }
This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
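To make the "any admissible type" point concrete, here is a minimal sketch. Placing the template in a namespace called sketch is an assumption made here so that it cannot collide with std::min; the Version type and its fields are hypothetical.

```cpp
#include <string>

namespace sketch {
// The min template from the slide, wrapped in a namespace.
template <typename T>
const T& min(const T& x, const T& y) {
    return y < x ? y : x;
}
}

// Hypothetical user-defined type: all it needs is an operator<.
struct Version {
    int release;
    int patch;
};

bool operator<(const Version& a, const Version& b) {
    return a.release < b.release ||
           (a.release == b.release && a.patch < b.patch);
}
```

With these definitions sketch::min(3, 7), sketch::min(2.5, 1.5), a comparison of two std::string values and a comparison of two Version objects all compile from the same template; the compiler generates one instance per type.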
Template Programming
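Another template example is a class squaring its argument:
    // the slide derived this class from std::unary_function<T,T>, which was
    // deprecated in C++11 and removed in C++17; it is not needed here
    template <typename T>
    class square {
    public:
        T operator()(T t) const { return t*t; }
    };
which can be used along with STL algorithms:
    // assuming x is a std::vector<double>; the result overwrites x
    std::transform(x.begin(), x.end(), x.begin(), square<double>());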
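A complete, compilable version of this idea might look as follows. The wrapper functions squared() and squaredLambda() are hypothetical names; the second one assumes a C++11 compiler and uses a lambda in place of the function object.

```cpp
#include <algorithm>
#include <vector>

// Function object squaring its argument.
template <typename T>
class square {
public:
    T operator()(T t) const { return t * t; }
};

// Square every element of a copy of x using the function object.
std::vector<double> squared(std::vector<double> x) {
    std::transform(x.begin(), x.end(), x.begin(), square<double>());
    return x;
}

// The same operation with a C++11 lambda instead of a named class.
std::vector<double> squaredLambda(std::vector<double> x) {
    std::transform(x.begin(), x.end(), x.begin(),
                   [](double t) { return t * t; });
    return x;
}
```

Both functions take their argument by value, so the caller's vector is left untouched and the squared copy is returned.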
Further Reading
Books by Meyers are excellent
I also like the (free) C++ Annotations
C++ FAQ
Resources on StackOverflow such as
· general info and its links, e.g.
· booklist
Debugging
Debugging
Some tips
· Generally painful, old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
· Extra tools such as valgrind helpful for memory debugging
· “Sanitizer” (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices
Best Practices
Version control: highly recommended to become familiar with git or svn.
Editor: in the long-run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
- Overview
    - Why R
    - Why C++
    - Vision
    - Speed
- What Next
- C++ Basics
    - A Better C
- Debugging
- Best Practices
![Page 12: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/12.jpg)
So Why R
R enables us to
middot work interactivelymiddot explore and visualize datamiddot access retrieve andor generate datamiddot summarize and report into pdf html hellip
making it the key language for statistical computing and apreferred environment for many data analysts
1158
So Why R
R has always been extensible via
middot C via a bare-bones interface described in Writing RExtensions
middot Fortran which is also used internally by Rmiddot Java via rJava by Simon Urbanekmiddot C++ but essentially at the bare-bones level of C
So while in theory this always worked ndash it was tedious inpractice
1258
Why Extend R
Chambers (2008) opens Chapter 11 Interfaces I Using C andFortran
Since the core of R is in fact a program written in the Clanguage itrsquos not surprising that the most direct interfaceto non-R software is for code written in C or directlycallable from C All the same including additional C codeis a serious step with some added dangers and often asubstantial amount of programming and debuggingrequired You should have a good reason
1358
Why Extend R
Chambers (2008) opens Chapter 11 Interfaces I Using C andFortran
Since the core of R is in fact a program written in the Clanguage itrsquos not surprising that the most direct interfaceto non-R software is for code written in C or directlycallable from C All the same including additional C codeis a serious step with some added dangers and often asubstantial amount of programming and debuggingrequired You should have a good reason
1458
Why Extend R
Chambers proceeds with this rough map of the road ahead
middot Againstmiddot Itrsquos more workmiddot Bugs will bitemiddot Potential platform dependencymiddot Less readable software
middot In Favormiddot New and trusted computationsmiddot Speedmiddot Object references
1558
Why Extend R
The Why boils down to
middot speed Often a good enough reason for us hellip and a focus forus in this workshop
middot new things We can bind to libraries and tools that wouldotherwise be unavailable in R
middot references Chambers quote from 2008 foreshadowed thework on the new Reference Classes now in R and built uponvia Rcpp Modules Rcpp Classes (and also RcppR6)
1658
Overview Why C++
Why C++
middot Asking Google leads to about 52 million hitsmiddot Wikipedia C++ is a statically typed free-form
multi-paradigm compiled general-purpose powerfulprogramming language
middot C++ is industrial-strength vendor-independent widely-usedand still evolving
middot In science amp research one of the most frequently-usedlanguages If there is something you want to use connectto it probably has a CC++ API
middot As a widely used language it also has good tool support(debuggers profilers code analysis)
1858
Why C++
Scott Meyers View C++ as a federation of languages
middot C provides a rich inheritance and interoperability as UnixWindows hellip are all build on C
middot Object-Oriented C++ (maybe just to provide endlessdiscussions about exactly what OO is or should be)
middot Templated C++ which is mighty powerful template metaprogramming unequalled in other languages
middot The Standard Template Library (STL) is a specific templatelibrary which is powerful but has its own conventions
middot C++11 (and C++14 and beyond) add enough to be calleda fifth language
NB Meyers original list of four languages appeared years before C++11
1958
Why C++
middot Mature yet currentmiddot Strong performance focus
middot You donrsquot pay for what you donrsquot usemiddot Leave no room for another language between the machine
level and C++
middot Yet also powerfully abstract and high-levelmiddot C++11 is a big deal giving us new language featuresmiddot While there are complexities Rcpp users are mostly shielded
2058
Overview Vision
Bell Labs May 1976
2258
Interface Vision
R offers us the best of both worlds
middot Compiled code withmiddot Access to proven libraries and algorithms in CC++Fortranmiddot Extremely high performance (in both serial and parallel modes)
middot Interpreted code withmiddot An accessible high-level language made for Programming with
Datamiddot An interactive workflow for data analysismiddot Support for rapid prototyping research and experimentation
2358
Why Rcpp
middot Easy to learn as it really does not have to be thatcomplicated ndash we will see numerous few examples
middot Easy to use as it avoids build and OS system complexitiesthanks to the R infrastrucure
middot Expressive as it allows for vectorised C++ using Rcpp Sugarmiddot Seamless access to all R objects vector matrix list
S3S4RefClass Environment Function hellipmiddot Speed gains for a variety of tasks Rcpp excels precisely
where R struggles loops function calls hellipmiddot Extensions greatly facilitates access to external libraries
using eg Rcpp modules
2458
Overview Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1(1 + x)
f lt- function(n x=1) for(i in 1n) x lt- 1(1+x)g lt- function(n x=1) for(i in 1n) x lt- (1(1+x))h lt- function(n x=1) for(i in 1n) x lt- (1+x)^(-1)j lt- function(n x=1) for(i in 1n) x lt- 11+xk lt- function(n x=1) for(i in 1n) x lt- 11+xlibrary(rbenchmark)N lt- 1e5benchmark(f(N1)g(N1)h(N1)j(N1)k(N1)order=relative)[14]
2658
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 5 k(N 1) 100 6435 1000 1 f(N 1) 100 6609 1027 2 g(N 1) 100 7757 1205 4 j(N 1) 100 7882 1225 3 h(N 1) 100 11766 1828
2758
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy
cppFunction(double m(int n double x)
for (int i=0 iltn i++)x = 1 (1+x)
return x
)
(We will learn more about cppFunction() later)
2858
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 6 m(N 1) 100 0170 1000 1 f(N 1) 100 6854 40318 5 k(N 1) 100 7811 45947 4 j(N 1) 100 9183 54018 2 g(N 1) 100 9489 55818 3 h(N 1) 100 11725 68971
2958
Speed Example 2 (due to StackOverflow)
Consider a function defined as
f(n) such that n when n lt 2
f(n minus 1) + f(n minus 2) when n ge 2
3058
Speed Example 2 (due to StackOverflow)
R implementation and use
f lt- function(n) if (n lt 2) return(n)return(f(n-1) + f(n-2))
Using it on first 11 argumentssapply(010 f)
[1] 0 1 1 2 3 5 8 13 21 34 55
3158
Speed Example 2 (due to StackOverflow)
Timing
library(rbenchmark)benchmark(f(10) f(15) f(20))[14]
test replications elapsed relative 1 f(10) 100 0030 1000 2 f(15) 100 0335 11167 3 f(20) 100 3517 117233
3258
Speed Example 2 (due to StackOverflow)
int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2))
deployed as
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
sapply(010 g)
[1] 0 1 1 2 3 5 8 13 21 34 55
3358
Speed Example 2 (due to StackOverflow)
Timing
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
library(rbenchmark)benchmark(f(20) g(20) order=relative)[14]
test replications elapsed relative 2 g(20) 100 0011 1000 1 f(20) 100 3776 343273
A nice gain of a few orders of magnitude3458
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that helps you code more succinctlyleading to fewer bugs and faster development
A good environment helps RStudio integrates R and C++development quite nicely (eg the compiler error messageparsing is very helpful) and also helps with package building
3558
What Next
Hands-on
Programming with C++
middot C++ Basicsmiddot Debuggingmiddot Best Practices
and then on to Rcpp itself
3758
C++ Basics
Compiled not Interpreted
Need to compile and link
include ltcstdiogt
int main(void) printf(Hello worldn)return 0
3958
Compiled not Interpreted
Or streams output rather than printf
include ltiostreamgt
int main(void) stdcout ltlt Hello world ltlt stdendlreturn 0
4058
Compiled not Interpreted
g++ -o will compile and link
We will now look at an examples with explicit linking
4158
Compiled not Interpreted
    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }
Compiled not Interpreted
We may need to supply:
· header location via -I,
· library location via -L,
· library via -llibraryname
    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
Statically Typed
· R is dynamically typed: x <- 3.14; x <- "foo" is valid.
· In C++, each variable must be declared before first use.
· Common types are int and long (possibly with unsigned), float and double, bool, as well as char.
· No built-in string type, though std::string from the standard library comes close.
· All these variable types are scalars, which is fundamentally different from R where everything is a vector.
· class (and struct) allow creation of composite types; classes add behaviour to data to form objects.
· Variables need to be declared, and their type cannot change.
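The typing rules above can be illustrated with a minimal sketch. The function names (scaledCount, greeting, threeValues) are hypothetical and chosen only for this example; they do not come from the slides.

```cpp
#include <string>
#include <vector>

// Each variable has a declared type that stays fixed for its lifetime.
double scaledCount() {
    int count = 3;          // integer scalar
    double x = 3.14;        // floating-point scalar
    // x = "foo";           // would not compile: a double cannot hold text
    return x * count;       // count is converted to double for the product
}

// std::string comes from the standard library rather than the core language.
std::string greeting() {
    std::string s = "foo";
    s += "bar";             // strings can grow, but s remains a std::string
    return s;
}

// A std::vector<double> is the closest analogue to an R numeric vector.
std::vector<double> threeValues() {
    std::vector<double> v;
    v.push_back(1.0);
    v.push_back(2.0);
    v.push_back(3.0);
    return v;
}
```

Uncommenting the assignment of "foo" to x should produce a compile-time error, which is the practical meaning of static typing here.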
C++ Basics A Better C
C++ is a Better C
· control structures similar to what R offers: for, while, if, switch
· functions are similar too, but note the difference in positional-only matching; also, same function name but different arguments allowed in C++
· pointers and memory management: very different, but lots of issues people had with C can be avoided via STL (which is something Rcpp promotes too)
· sometimes still useful to know what a pointer is …
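A short sketch of the two points about functions: overloading (same name, different argument types) and positional-only argument matching. The names describe and power are hypothetical illustrations, not part of the slides.

```cpp
#include <string>

// Two functions share one name; the compiler selects by argument type.
std::string describe(int)    { return "int"; }
std::string describe(double) { return "double"; }

// Arguments are matched by position only. Unlike R, there is no way to
// write power(exponent = 2, base = 3.0) to reorder them by name.
double power(double base, int exponent) {
    double result = 1.0;
    for (int i = 0; i < exponent; i++)
        result *= base;
    return result;
}
```

Calling describe(1) picks the int version and describe(1.0) the double version; the choice is made at compile time from the static types.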
Object-Oriented
This is a second key feature of C++, and it does it differently from S3 and S4.
    struct Date {
        unsigned int year;
        unsigned int month;
        unsigned int day;
    };

    struct Person {
        char firstname[20];
        char lastname[20];
        struct Date birthday;
        unsigned long id;
    };
Object-Oriented
Object-orientation in the C++ sense matches data with code operating on it:
    class Date {
    private:
        unsigned int year;
        unsigned int month;
        unsigned int day;
    public:
        void setDate(int y, int m, int d);
        int getDay();
        int getMonth();
        int getYear();
    };
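The slide only declares the member functions. One possible way to fill them in is sketched below; the default constructor, its 1970-01-01 starting value, the const qualifiers, and the field name day for the day of the month are assumptions added for this example.

```cpp
class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int day;
public:
    // Hypothetical default so that a fresh object holds defined values.
    Date() : year(1970), month(1), day(1) {}

    // The only way to change the private fields from outside the class.
    void setDate(int y, int m, int d) { year = y; month = m; day = d; }

    // Read-only accessors; const promises they do not modify the object.
    int getDay()   const { return day; }
    int getMonth() const { return month; }
    int getYear()  const { return year; }
};
```

Because the fields are private, code outside the class has to go through setDate() and the getters. This sketch does no validation of the values passed in.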
Generic Programming and the STL
The STL promotes generic programming
For example, the sequence container types vector, deque, and list all support
· push_back() to insert at the end;
· pop_back() to remove from the end;
· begin() returning an iterator to the first element;
· end() returning an iterator to just after the last element;
· size() for the number of elements;
but only deque and list have push_front() and pop_front().
Other useful containers: set, multiset, map and multimap.
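As a small sketch of this shared interface, the template below fills any of the three sequence containers through push_back(). The helper names (firstThree, lastAfterPop, frontAfterPush) are hypothetical. push_front() is shown on a std::deque, which provides it as well.

```cpp
#include <deque>
#include <list>
#include <vector>

// Works for vector, deque and list alike: all three offer push_back().
template <typename Container>
Container firstThree() {
    Container c;
    c.push_back(1.0);
    c.push_back(2.0);
    c.push_back(3.0);
    return c;
}

// pop_back() removes the last element; back() then shows the new last one.
double lastAfterPop() {
    std::vector<double> v = firstThree<std::vector<double> >();
    v.pop_back();           // drops 3.0
    return v.back();        // now 2.0
}

// push_front() is not available on std::vector.
double frontAfterPush() {
    std::deque<double> d = firstThree<std::deque<double> >();
    d.push_front(0.0);
    return d.front();       // 0.0
}
```

Generic code such as firstThree() is written once against the common member functions and instantiated per container type.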
Generic Programming and the STL
Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():
    std::vector<double>::const_iterator si;
    for (si=s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
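The traversal loop can be wrapped into a complete function, sketched here. Writing into a std::ostringstream instead of std::cout is a choice made for this example so that the result can be inspected; the name printAll is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Walk the vector from begin() to end() and write one element per line.
std::string printAll(const std::vector<double>& s) {
    std::ostringstream out;
    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); ++si)
        out << *si << '\n';     // *si dereferences the iterator
    return out.str();
}
```

The iterator behaves much like a pointer: ++si advances it, *si reads the element, and s.end() marks the position one past the last element.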
Generic Programming and the STL
Another key part of the STL are algorithms:
    double sum = accumulate(s.begin(), s.end(), 0.0);
(The start value 0.0 makes the summation run in double; an int 0 would truncate each partial sum.)
Some other STL algorithms are
· find: finds the first element equal to the supplied value
· count: counts the number of matching elements
· transform: applies a supplied function to each element
· for_each: sweeps over all elements, does not alter
· inner_product: inner product of two vectors
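A few of the algorithms listed above are sketched below on std::vector<double>. The wrapper names (total, howMany, firstIndexOf, dot) are hypothetical; the std:: calls are the standard ones from the algorithm and numeric headers.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Sum of all elements; the 0.0 start value keeps the arithmetic in double.
double total(const std::vector<double>& s) {
    return std::accumulate(s.begin(), s.end(), 0.0);
}

// Number of elements equal to value.
long howMany(const std::vector<double>& s, double value) {
    return std::count(s.begin(), s.end(), value);
}

// Zero-based position of the first match, or -1 if value is absent.
long firstIndexOf(const std::vector<double>& s, double value) {
    std::vector<double>::const_iterator it =
        std::find(s.begin(), s.end(), value);
    return it == s.end() ? -1 : static_cast<long>(it - s.begin());
}

// Inner product of two vectors; b must be at least as long as a.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}
```

All four take iterator ranges rather than containers, which is why the same algorithms also work on deque, list, or plain arrays.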
Template Programming
Template programming provides a 'language within C++': code gets evaluated during compilation.
One of the simplest template examples is
    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }
This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
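To try the template with several types, a minimal sketch follows. The enclosing namespace sketch is an assumption added here so that the function does not clash with std::min.

```cpp
#include <string>

namespace sketch {  // hypothetical namespace, avoids a clash with std::min

// One definition; the compiler generates a version per type T used.
template <typename T>
const T& min(const T& x, const T& y) {
    return y < x ? y : x;
}

}  // namespace sketch
```

sketch::min(3, 7) instantiates the int version, sketch::min(2.5, 1.5) the double version, and passing two std::string values works too because std::string provides operator<. Both arguments must have the same type, so sketch::min(1, 2.5) would not compile without an explicit sketch::min<double>(1, 2.5).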
Template Programming
Another template example is a class squaring its argument:
    template <typename T>
    class square : public std::unary_function<T,T> {
    public:
        T operator()(T t) const {
            return t*t;
        }
    };
which can be used along with STL algorithms:
    transform(x.begin(), x.end(), x.begin(), square<double>());
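A self-contained version of the squaring functor is sketched below. It leaves out the std::unary_function base class, which was deprecated in C++11 and removed in C++17, so that the code also builds under newer standards; the helper squareAll is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Function object: instances can be called like a function.
template <typename T>
class square {
public:
    T operator()(T t) const { return t * t; }
};

// Square every element of x in place. std::transform needs an output
// iterator (here x.begin() again) and an instance of the functor.
void squareAll(std::vector<double>& x) {
    std::transform(x.begin(), x.end(), x.begin(), square<double>());
}
```

Note that std::transform takes an object, square<double>(), not the bare class name.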
Further Reading
Books by Meyers are excellent
I also like the (free) C++ Annotations
C++ FAQ
Resources on StackOverflow such as
· general info and its links, e.g.
  · booklist
Debugging
Debugging
Some tips
· Generally painful, old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
· Extra tools such as valgrind helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices
Best Practices
Version control: highly recommended to become familiar with git or svn.
Editor: in the long run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
So Why R
R has always been extensible via
· C via a bare-bones interface described in Writing R Extensions
· Fortran which is also used internally by R
· Java via rJava by Simon Urbanek
· C++ but essentially at the bare-bones level of C
So while in theory this always worked, it was tedious in practice.
Why Extend R
Chambers (2008) opens Chapter 11, Interfaces I: Using C and Fortran:
    Since the core of R is in fact a program written in the C language, it's not surprising that the most direct interface to non-R software is for code written in C, or directly callable from C. All the same, including additional C code is a serious step, with some added dangers and often a substantial amount of programming and debugging required. You should have a good reason.
Why Extend R
Chambers proceeds with this rough map of the road ahead:
· Against:
  · It's more work
  · Bugs will bite
  · Potential platform dependency
  · Less readable software
· In Favor:
  · New and trusted computations
  · Speed
  · Object references
Why Extend R
The Why boils down to:
· speed: Often a good enough reason for us … and a focus for us in this workshop.
· new things: We can bind to libraries and tools that would otherwise be unavailable in R.
· references: Chambers' quote from 2008 foreshadowed the work on the new Reference Classes now in R and built upon via Rcpp Modules, Rcpp Classes (and also RcppR6).
Overview Why C++
Why C++
· Asking Google leads to about 52 million hits.
· Wikipedia: C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose, powerful programming language.
· C++ is industrial-strength, vendor-independent, widely-used, and still evolving.
· In science & research, one of the most frequently-used languages: if there is something you want to use / connect to, it probably has a C/C++ API.
· As a widely used language it also has good tool support (debuggers, profilers, code analysis).
Why C++
Scott Meyers: View C++ as a federation of languages
· C provides a rich inheritance and interoperability, as Unix, Windows, … are all built on C.
· Object-Oriented C++ (maybe just to provide endless discussions about exactly what OO is or should be).
· Templated C++ which is mighty powerful; template meta programming unequalled in other languages.
· The Standard Template Library (STL) is a specific template library which is powerful but has its own conventions.
· C++11 (and C++14 and beyond) add enough to be called a fifth language.
NB: Meyers' original list of four languages appeared years before C++11.
Why C++
· Mature yet current
· Strong performance focus:
  · You don't pay for what you don't use
  · Leave no room for another language between the machine level and C++
· Yet also powerfully abstract and high-level
· C++11 is a big deal giving us new language features
· While there are complexities, Rcpp users are mostly shielded
Overview Vision
Bell Labs, May 1976
Interface Vision
R offers us the best of both worlds:
· Compiled code with
  · Access to proven libraries and algorithms in C/C++/Fortran
  · Extremely high performance (in both serial and parallel modes)
· Interpreted code with
  · An accessible high-level language made for Programming with Data
  · An interactive workflow for data analysis
  · Support for rapid prototyping, research, and experimentation
Why Rcpp
· Easy to learn as it really does not have to be that complicated; we will see numerous examples
· Easy to use as it avoids build and OS system complexities thanks to the R infrastructure
· Expressive as it allows for vectorised C++ using Rcpp Sugar
· Seamless access to all R objects: vector, matrix, list, S3/S4/RefClass, Environment, Function, …
· Speed gains for a variety of tasks: Rcpp excels precisely where R struggles: loops, function calls, …
· Extensions: greatly facilitates access to external libraries using e.g. Rcpp modules
Overview Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1/(1 + x):
    f <- function(n, x=1) for(i in 1:n) x <- 1/(1+x)
    g <- function(n, x=1) for(i in 1:n) x <- (1/(1+x))
    h <- function(n, x=1) for(i in 1:n) x <- (1+x)^(-1)
    j <- function(n, x=1) for(i in 1:n) x <- {1/{1+x}}
    k <- function(n, x=1) for(i in 1:n) x <- 1/{1+x}
    library(rbenchmark)
    N <- 1e5
    benchmark(f(N,1),g(N,1),h(N,1),j(N,1),k(N,1),order="relative")[,1:4]
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    5 k(N, 1)          100   6.435    1.000
    1 f(N, 1)          100   6.609    1.027
    2 g(N, 1)          100   7.757    1.205
    4 j(N, 1)          100   7.882    1.225
    3 h(N, 1)          100  11.766    1.828
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy
    cppFunction("double m(int n, double x) {
        for (int i=0; i<n; i++)
            x = 1 / (1+x);
        return x;
    }")
(We will learn more about cppFunction() later.)
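For readers who want to check what the loop computes outside of R, here is the same function as plain C++. Repeatedly applying x → 1/(1 + x) from a positive start converges to the fixed point (√5 − 1)/2 ≈ 0.618; the helper distanceToFixedPoint is a hypothetical addition for this sketch.

```cpp
#include <cmath>

// Same body as the string passed to cppFunction(): n updates of x.
double m(int n, double x) {
    for (int i = 0; i < n; i++)
        x = 1 / (1 + x);
    return x;
}

// Absolute gap between m(n, 1) and the fixed point of x = 1/(1 + x).
double distanceToFixedPoint(int n) {
    const double fixedPoint = (std::sqrt(5.0) - 1.0) / 2.0;
    return std::fabs(m(n, 1.0) - fixedPoint);
}
```

With N = 1e5 iterations as in the benchmark, the result has long since settled at the fixed point, so the timings measure loop overhead rather than any change in the value.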
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    6 m(N, 1)          100   0.170    1.000
    1 f(N, 1)          100   6.854   40.318
    5 k(N, 1)          100   7.811   45.947
    4 j(N, 1)          100   9.183   54.018
    2 g(N, 1)          100   9.489   55.818
    3 h(N, 1)          100  11.725   68.971
Speed Example 2 (due to StackOverflow)
Consider a function defined as
f(n) such that
    n                      when n < 2
    f(n − 1) + f(n − 2)    when n ≥ 2
![Page 15: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/15.jpg)
Why Extend R
Chambers (2008) opens Chapter 11 Interfaces I Using C andFortran
Since the core of R is in fact a program written in the Clanguage itrsquos not surprising that the most direct interfaceto non-R software is for code written in C or directlycallable from C All the same including additional C codeis a serious step with some added dangers and often asubstantial amount of programming and debuggingrequired You should have a good reason
1458
Why Extend R
Chambers proceeds with this rough map of the road ahead
middot Againstmiddot Itrsquos more workmiddot Bugs will bitemiddot Potential platform dependencymiddot Less readable software
middot In Favormiddot New and trusted computationsmiddot Speedmiddot Object references
1558
Why Extend R
The Why boils down to
middot speed Often a good enough reason for us hellip and a focus forus in this workshop
middot new things We can bind to libraries and tools that wouldotherwise be unavailable in R
middot references Chambers quote from 2008 foreshadowed thework on the new Reference Classes now in R and built uponvia Rcpp Modules Rcpp Classes (and also RcppR6)
1658
Overview Why C++
Why C++
middot Asking Google leads to about 52 million hitsmiddot Wikipedia C++ is a statically typed free-form
multi-paradigm compiled general-purpose powerfulprogramming language
middot C++ is industrial-strength vendor-independent widely-usedand still evolving
middot In science amp research one of the most frequently-usedlanguages If there is something you want to use connectto it probably has a CC++ API
middot As a widely used language it also has good tool support(debuggers profilers code analysis)
1858
Why C++
Scott Meyers View C++ as a federation of languages
middot C provides a rich inheritance and interoperability as UnixWindows hellip are all build on C
middot Object-Oriented C++ (maybe just to provide endlessdiscussions about exactly what OO is or should be)
middot Templated C++ which is mighty powerful template metaprogramming unequalled in other languages
middot The Standard Template Library (STL) is a specific templatelibrary which is powerful but has its own conventions
middot C++11 (and C++14 and beyond) add enough to be calleda fifth language
NB Meyers original list of four languages appeared years before C++11
1958
Why C++
· Mature yet current
· Strong performance focus:
    · You don't pay for what you don't use
    · Leave no room for another language between the machine level and C++
· Yet also powerfully abstract and high-level
· C++11 is a big deal, giving us new language features
· While there are complexities, Rcpp users are mostly shielded
Overview Vision
[Figure: Bell Labs, May 1976]
Interface Vision
R offers us the best of both worlds:
· Compiled code with
    · Access to proven libraries and algorithms in C/C++/Fortran
    · Extremely high performance (in both serial and parallel modes)
· Interpreted code with
    · An accessible high-level language made for Programming with Data
    · An interactive workflow for data analysis
    · Support for rapid prototyping, research, and experimentation
Why Rcpp
· Easy to learn, as it really does not have to be that complicated; we will see numerous examples
· Easy to use, as it avoids build and OS complexities thanks to the R infrastructure
· Expressive, as it allows for vectorised C++ using Rcpp Sugar
· Seamless access to all R objects: vector, matrix, list, S3/S4/RefClass, Environment, Function, …
· Speed gains for a variety of tasks: Rcpp excels precisely where R struggles (loops, function calls, …)
· Extensions: Rcpp greatly facilitates access to external libraries, using e.g. Rcpp modules
Overview Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1/(1 + x):
    f <- function(n, x=1) for (i in 1:n) x <- 1/(1+x)
    g <- function(n, x=1) for (i in 1:n) x <- (1/(1+x))
    h <- function(n, x=1) for (i in 1:n) x <- (1+x)^(-1)
    # j and k use curly braces in place of parentheses
    j <- function(n, x=1) for (i in 1:n) x <- {1/{1+x}}
    k <- function(n, x=1) for (i in 1:n) x <- 1/{1+x}
    library(rbenchmark)
    N <- 1e5
    benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1), order="relative")[,1:4]
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    5 k(N, 1)          100   6.435    1.000
    1 f(N, 1)          100   6.609    1.027
    2 g(N, 1)          100   7.757    1.205
    4 j(N, 1)          100   7.882    1.225
    3 h(N, 1)          100  11.766    1.828
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy:
    cppFunction('
        double m(int n, double x) {
            for (int i=0; i<n; i++)
                x = 1 / (1+x);
            return x;
        }')
(We will learn more about cppFunction() later.)
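Stripped of the cppFunction() wrapper, the loop is ordinary C++ and can be checked on its own. The following is a minimal sketch, not code from the slides: the names `iterate_reciprocal` and `golden_ratio_conjugate` are illustrative. It relies on the fact that repeatedly applying x -> 1/(1+x) converges to (sqrt(5) - 1)/2, which gives a simple sanity check for any of the variants.

```cpp
#include <cmath>

// Illustrative standalone version of the loop passed to cppFunction();
// the function name is hypothetical, not part of Rcpp.
double iterate_reciprocal(int n, double x) {
    for (int i = 0; i < n; i++)
        x = 1 / (1 + x);   // same update as in the R variants f, g, h, j, k
    return x;
}

// Fixed point of x = 1/(1+x): the positive root of x^2 + x - 1 = 0.
double golden_ratio_conjugate() {
    return (std::sqrt(5.0) - 1.0) / 2.0;
}
```

Comparing `iterate_reciprocal(100000, 1.0)` against `golden_ratio_conjugate()` within a small tolerance confirms that the compiled loop computes the same quantity as the R functions.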
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    6 m(N, 1)          100   0.170    1.000
    1 f(N, 1)          100   6.854   40.318
    5 k(N, 1)          100   7.811   45.947
    4 j(N, 1)          100   9.183   54.018
    2 g(N, 1)          100   9.489   55.818
    3 h(N, 1)          100  11.725   68.971
Speed Example 2 (due to StackOverflow)
Consider a function defined as
$$
f(n) = \begin{cases}
n & \text{when } n < 2 \\
f(n-1) + f(n-2) & \text{when } n \ge 2
\end{cases}
$$
Speed Example 2 (due to StackOverflow)
R implementation and use:
    f <- function(n) {
        if (n < 2) return(n)
        return(f(n-1) + f(n-2))
    }
    ## Using it on first 11 arguments
    sapply(0:10, f)
    [1]  0  1  1  2  3  5  8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing:
    library(rbenchmark)
    benchmark(f(10), f(15), f(20))[,1:4]

       test replications elapsed relative
    1 f(10)          100   0.030    1.000
    2 f(15)          100   0.335   11.167
    3 f(20)          100   3.517  117.233
Speed Example 2 (due to StackOverflow)
    int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2));
    }
deployed as
    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    sapply(0:10, g)
    [1]  0  1  1  2  3  5  8 13 21 34 55
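Because g() is plain C++, it can also be compiled and checked outside of R. A minimal sketch under that assumption; `first_values` is a hypothetical helper that mirrors what sapply(0:10, g) does on the R side.

```cpp
#include <vector>

// The recursive function from the slide, as ordinary C++.
int g(int n) {
    if (n < 2) return n;
    return g(n - 1) + g(n - 2);
}

// Hypothetical helper: evaluates g(0), g(1), ..., g(count - 1),
// similar in spirit to sapply(0:10, g) in R.
std::vector<int> first_values(int count) {
    std::vector<int> out;
    for (int i = 0; i < count; ++i)
        out.push_back(g(i));
    return out;
}
```

Calling `first_values(11)` yields the same eleven values that the R session prints.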
Speed Example 2 (due to StackOverflow)
Timing:
    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    library(rbenchmark)
    benchmark(f(20), g(20), order="relative")[,1:4]

       test replications elapsed relative
    2 g(20)          100   0.011    1.000
    1 f(20)          100   3.776  343.273
A nice gain of more than two orders of magnitude.
Another Angle on Speed
Run-time performance is just one example.
Time to code is another metric.
We feel quite strongly that Rcpp helps you code more succinctly, leading to fewer bugs and faster development.
A good environment helps: RStudio integrates R and C++ development quite nicely (e.g. the compiler error message parsing is very helpful) and also helps with package building.
What Next
Hands-on
Programming with C++
· C++ Basics
· Debugging
· Best Practices
and then on to Rcpp itself.
C++ Basics
Compiled not Interpreted
Need to compile and link:
    #include <cstdio>

    int main(void) {
        printf("Hello, world!\n");
        return 0;
    }
Compiled not Interpreted
Or streams output rather than printf:
    #include <iostream>

    int main(void) {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
Compiled not Interpreted
g++ -o will compile and link.
We will now look at an example with explicit linking.
Compiled not Interpreted
    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }
Compiled not Interpreted
We may need to supply:
· header location via -I,
· library location via -L,
· library via -llibraryname
    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
Statically Typed
· R is dynamically typed: x <- 3.14; x <- "foo" is valid.
· In C++, each variable must be declared before first use.
· Common types are int and long (possibly with unsigned), float and double, bool, as well as char.
· No standard string type, though std::string is close.
· All these variable types are scalars, which is fundamentally different from R where everything is a vector.
· class (and struct) allow creation of composite types; classes add behaviour to data to form objects.
· Variables need to be declared, and their type cannot change afterwards.
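The points above can be made concrete with a few declarations. This is a minimal sketch; the helper names `sum_of` and `typed_scalars` are made up for illustration. It shows scalar variables with fixed types, std::string as the practical string type, and std::vector as the closest analogue to an R vector.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: a vector must be traversed explicitly,
// since a C++ double is a scalar and not a length-one vector as in R.
double sum_of(const std::vector<double>& values) {
    double total = 0.0;                        // declared with its type before use
    for (std::vector<double>::size_type i = 0; i < values.size(); ++i)
        total += values[i];
    return total;
}

// Hypothetical demo of the common scalar types listed above.
std::string typed_scalars() {
    int count = 3;                 // signed integer
    unsigned long id = 42UL;       // unsigned variant of long
    double x = 3.14;               // floating point
    bool ok = true;                // logical
    char initial = 'R';            // single character
    std::string label = "foo";     // string class from the standard library
    // x = "foo";                  // unlike in R, this would not compile
    if (ok && count == 3 && id == 42UL && x > 3.0)
        label += initial;          // appends one char to the string
    return label;
}
```

The commented-out assignment is the C++ counterpart of `x <- 3.14; x <- "foo"`: the compiler rejects it because x was declared as a double.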
C++ Basics A Better C
C++ is a Better C
· control structures similar to what R offers: for, while, if, switch
· functions are similar too, but note the difference in positional-only matching; also, same function name but different arguments allowed in C++
· pointers and memory management: very different, but lots of issues people had with C can be avoided via STL (which is something Rcpp promotes too)
· sometimes still useful to know what a pointer is …
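Two of these points lend themselves to a short illustration: overloading (same function name, different arguments) and letting an STL container manage memory. The sketch below uses made-up names (`area`, `squares_up_to`) and is only one way to show the idea.

```cpp
#include <vector>

// Same function name, different arguments: the compiler selects by signature.
double area(double side) { return side * side; }                       // square
double area(double width, double height) { return width * height; }    // rectangle

// Arguments are matched by position only; there is no equivalent of
// R's area(height = 3, width = 2).

// Hypothetical helper: std::vector owns its storage, so there is
// no malloc()/free() or new/delete to get wrong.
std::vector<double> squares_up_to(int n) {
    std::vector<double> out;
    for (int i = 1; i <= n; ++i)
        out.push_back(area(static_cast<double>(i)));
    return out;   // memory is released automatically when the vector is destroyed
}
```

A call such as `area(2.0, 3.0)` resolves to the two-argument version, while `area(3.0)` resolves to the one-argument version.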
Object-Oriented
This is a second key feature of C++, and it does it differently from S3 and S4.
    struct Date {
        unsigned int year;
        unsigned int month;
        unsigned int day;
    };

    struct Person {
        char firstname[20];
        char lastname[20];
        struct Date birthday;
        unsigned long id;
    };
Object-Oriented
Object-orientation in the C++ sense matches data with code operating on it:
    class Date {
    private:
        unsigned int year;
        unsigned int month;
        unsigned int date;
    public:
        void setDate(int y, int m, int d);
        int getDay();
        int getMonth();
        int getYear();
    };
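The slide only declares the member functions. One possible way to fill them in is sketched below; the default constructor and its initial values are assumptions added for illustration, and the third field is called `day` here for readability.

```cpp
// A sketch of a complete Date class: private data, public accessors.
class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int day;
public:
    // Assumed default (not on the slide) so that a Date always starts initialised.
    Date() : year(1970), month(1), day(1) {}

    // The only way to modify the private fields from outside the class.
    void setDate(int y, int m, int d) {
        year = y;
        month = m;
        day = d;
    }

    // Read-only accessors; const means they do not modify the object.
    int getDay() const { return day; }
    int getMonth() const { return month; }
    int getYear() const { return year; }
};
```

Usage follows the pattern `Date d; d.setDate(2015, 6, 24); d.getDay();`, with the data reachable only through the member functions.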
Generic Programming and the STL
The STL promotes generic programming.
For example, the sequence container types vector, deque, and list all support
· push_back() to insert at the end;
· pop_back() to remove from the end;
· begin() returning an iterator to the first element;
· end() returning an iterator to just after the last element;
· size() for the number of elements;
but only deque and list have push_front() and pop_front().
Other useful containers: set, multiset, map and multimap.
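The shared interface is what makes generic code possible: one function template can work with any of the three sequence containers. A minimal sketch, with hypothetical helper names `fill_and_trim` and `front_after_push_front`:

```cpp
#include <deque>
#include <list>
#include <vector>

// Hypothetical generic helper: works unchanged for vector, deque and list
// because all three provide push_back(), pop_back() and size().
template <typename Container>
int fill_and_trim(Container& c) {
    c.push_back(1);
    c.push_back(2);
    c.push_back(3);
    c.pop_back();                        // removes the 3 at the end
    return static_cast<int>(c.size());   // two elements remain
}

// push_front() exists for list (and deque), but not for vector.
int front_after_push_front() {
    std::list<int> l;
    l.push_back(2);
    l.push_front(1);                     // 1 is now the first element
    return l.front();
}
```

Passing a `std::vector<int>`, a `std::deque<int>` or a `std::list<int>` to `fill_and_trim` compiles to three separate instantiations of the same source code.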
Generic Programming and the STL
Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():
    std::vector<double>::const_iterator si;
    for (si=s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
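Wrapped in a function, the traversal loop becomes a complete, compilable unit. A minimal sketch; `print_all` is a hypothetical name, and the return value is added only to make the behaviour easy to check.

```cpp
#include <iostream>
#include <vector>

// Hypothetical helper: prints every element of s, one per line,
// and returns the number of elements visited.
int print_all(const std::vector<double>& s) {
    int visited = 0;
    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); ++si) {
        std::cout << *si << std::endl;   // *si dereferences the iterator
        ++visited;
    }
    return visited;
}
```

The loop stops when the iterator equals `s.end()`, which points just past the last element and is never dereferenced.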
Generic Programming and the STL
Algorithms are another key part of the STL:
    // 0.0 rather than 0, so that the sum is accumulated as a double
    double sum = accumulate(s.begin(), s.end(), 0.0);
Some other STL algorithms are
· find: finds the first element equal to the supplied value
· count: counts the number of matching elements
· transform: applies a supplied function to each element
· for_each: sweeps over all elements, does not alter
· inner_product: inner product of two vectors
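Several of the listed algorithms can be tried on one small vector. The sketch below wraps each call in a tiny helper; all helper names (`sample_values`, `total`, `count_of`, `first_index_of`, `halves`, `dot`) are made up for this illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical sample data: 1, 2, 2, 3
std::vector<double> sample_values() {
    std::vector<double> s;
    s.push_back(1.0); s.push_back(2.0); s.push_back(2.0); s.push_back(3.0);
    return s;
}

double half(double x) { return x / 2.0; }

// accumulate: sum of all elements (0.0 keeps the arithmetic in double)
double total(const std::vector<double>& s) {
    return std::accumulate(s.begin(), s.end(), 0.0);
}

// count: number of elements equal to value
long count_of(const std::vector<double>& s, double value) {
    return static_cast<long>(std::count(s.begin(), s.end(), value));
}

// find: zero-based position of the first match (s.size() if absent)
long first_index_of(const std::vector<double>& s, double value) {
    return static_cast<long>(std::find(s.begin(), s.end(), value) - s.begin());
}

// transform: applies half() to each element, writing into a new vector
std::vector<double> halves(const std::vector<double>& s) {
    std::vector<double> out(s.size());
    std::transform(s.begin(), s.end(), out.begin(), half);
    return out;
}

// inner_product: sum of element-wise products of two equally long vectors
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}
```

All of these take iterator pairs rather than containers, which is why the same calls work for vector, deque and list alike.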
Template Programming
Template programming provides a 'language within C++': code gets evaluated during compilation.
One of the simplest template examples is
    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }
This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
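To see the template instantiated for several types, here is a minimal sketch. The namespace `demo` and the three wrapper functions are hypothetical; the namespace is used only so that this min() cannot be confused with std::min.

```cpp
#include <string>

namespace demo {   // hypothetical namespace, keeps this min() apart from std::min

template <typename T>
const T& min(const T& x, const T& y) {
    return y < x ? y : x;   // requires only operator<() on T
}

}  // namespace demo

// One source definition, three compiler-generated instantiations:
int smaller_int() { return demo::min(3, 7); }               // T = int
double smaller_double() { return demo::min(2.5, -1.0); }    // T = double
std::string smaller_string() {                              // T = std::string
    return demo::min(std::string("pear"), std::string("apple"));
}
```

For std::string the comparison is lexicographic, so the same template orders text without any extra code.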
Template Programming
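Another template example is a class squaring its argument:
    // note: std::unary_function was deprecated in C++11 and removed in C++17
    template <typename T>
    class square : public std::unary_function<T,T> {
    public:
        T operator()(T t) const {
            return t*t;
        }
    };
which can be used along with STL algorithms:
    // assuming x is a std::vector<double>; the results overwrite x
    transform(x.begin(), x.end(), x.begin(), square<double>());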
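A self-contained variant of the functor that also compiles with newer standards is sketched below. It is written without std::unary_function, which was deprecated in C++11 and removed in C++17; the helper `squared` is a hypothetical name.

```cpp
#include <algorithm>
#include <vector>

// Function object ("functor") squaring its argument.
template <typename T>
class square {
public:
    T operator()(T t) const { return t * t; }
};

// Hypothetical helper: takes a copy of x, squares each element in place
// via std::transform, and returns the result.
std::vector<double> squared(std::vector<double> x) {
    std::transform(x.begin(), x.end(), x.begin(), square<double>());
    return x;
}
```

Note that std::transform needs an output iterator (here `x.begin()`, overwriting the input) and an instance of the functor, `square<double>()`, rather than the bare template name.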
Further Reading
Books by Meyers are excellent.
I also like the (free) C++ Annotations.
C++ FAQ.
Resources on StackOverflow such as
· general info and its links, e.g.
· booklist
Debugging
Debugging
Some tips:
· Generally painful, old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
· Extra tools such as valgrind helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices
Best Practices
Version control: highly recommended to become familiar with git or svn.
Editor: in the long-run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
- Overview
- Why R
- Why C++
- Vision
- Speed
- What Next
- C++ Basics
- A Better C
- Debugging
- Best Practices
![Page 18: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/18.jpg)
Overview Why C++
Why C++
middot Asking Google leads to about 52 million hitsmiddot Wikipedia C++ is a statically typed free-form
multi-paradigm compiled general-purpose powerfulprogramming language
middot C++ is industrial-strength vendor-independent widely-usedand still evolving
middot In science amp research one of the most frequently-usedlanguages If there is something you want to use connectto it probably has a CC++ API
middot As a widely used language it also has good tool support(debuggers profilers code analysis)
1858
Why C++
Scott Meyers: View C++ as a federation of languages
· C provides a rich inheritance and interoperability, as Unix, Windows, … are all built on C
· Object-Oriented C++ (maybe just to provide endless discussions about exactly what OO is or should be)
· Templated C++, which is mighty powerful; template metaprogramming is unequalled in other languages
· The Standard Template Library (STL) is a specific template library which is powerful but has its own conventions
· C++11 (and C++14, and beyond) add enough to be called a fifth language
NB: Meyers' original list of four languages appeared years before C++11.
Why C++
· Mature yet current
· Strong performance focus:
  · You don't pay for what you don't use
  · Leave no room for another language between the machine level and C++
· Yet also powerfully abstract and high-level
· C++11 is a big deal, giving us new language features
· While there are complexities, Rcpp users are mostly shielded
Overview: Vision
Bell Labs, May 1976
Interface Vision
R offers us the best of both worlds:
· Compiled code with
  · Access to proven libraries and algorithms in C/C++/Fortran
  · Extremely high performance (in both serial and parallel modes)
· Interpreted code with
  · An accessible high-level language made for Programming with Data
  · An interactive workflow for data analysis
  · Support for rapid prototyping, research, and experimentation
Why Rcpp
· Easy to learn, as it really does not have to be that complicated; we will see numerous examples
· Easy to use, as it avoids build and OS system complexities thanks to the R infrastructure
· Expressive, as it allows for vectorised C++ using Rcpp Sugar
· Seamless access to all R objects: vector, matrix, list, S3/S4/RefClass, Environment, Function, …
· Speed gains for a variety of tasks: Rcpp excels precisely where R struggles (loops, function calls, …)
· Extensions: greatly facilitates access to external libraries, using e.g. Rcpp modules
Overview: Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1/(1 + x):
    f <- function(n, x=1) for (i in 1:n) x <- 1/(1+x)
    g <- function(n, x=1) for (i in 1:n) x <- (1/(1+x))
    h <- function(n, x=1) for (i in 1:n) x <- (1+x)^(-1)
    j <- function(n, x=1) for (i in 1:n) x <- {1/{1+x}}
    k <- function(n, x=1) for (i in 1:n) x <- 1/{1+x}
    library(rbenchmark)
    N <- 1e5
    benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1), order="relative")[,1:4]
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    5 k(N, 1)          100   6.435    1.000
    1 f(N, 1)          100   6.609    1.027
    2 g(N, 1)          100   7.757    1.205
    4 j(N, 1)          100   7.882    1.225
    3 h(N, 1)          100  11.766    1.828
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy:
    cppFunction('double m(int n, double x) {
        for (int i=0; i<n; i++)
            x = 1 / (1+x);
        return x;
    }')
(We will learn more about cppFunction() later.)
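For readers who want to try the loop outside of R, here is a minimal standalone sketch of the same iteration in plain C++. The function name m follows the slide; the helper fixed_point() is hypothetical and only added to show that the iteration settles at (sqrt(5) - 1) / 2, the positive solution of x = 1/(1+x).

```cpp
#include <cmath>

// Same loop as the body passed to cppFunction() on the slide, as plain C++.
double m(int n, double x) {
    for (int i = 0; i < n; i++)
        x = 1 / (1 + x);
    return x;
}

// Hypothetical helper (not in the slides): positive root of x*x + x - 1 = 0,
// which is the value the iteration converges to.
double fixed_point() {
    return (std::sqrt(5.0) - 1.0) / 2.0;
}
```

Calling m(100000, 1.0) mirrors the m(N, 1) call in the benchmark; comparing the result with fixed_point() is a quick sanity check that the compiled loop does what the R versions do.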
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    6 m(N, 1)          100   0.170    1.000
    1 f(N, 1)          100   6.854   40.318
    5 k(N, 1)          100   7.811   45.947
    4 j(N, 1)          100   9.183   54.018
    2 g(N, 1)          100   9.489   55.818
    3 h(N, 1)          100  11.725   68.971
Speed Example 2 (due to StackOverflow)
Consider a function defined as
$$
f(n) = \begin{cases} n & \text{when } n < 2 \\ f(n-1) + f(n-2) & \text{when } n \geq 2 \end{cases}
$$
Speed Example 2 (due to StackOverflow)
R implementation and use:
    f <- function(n) {
        if (n < 2) return(n)
        return(f(n-1) + f(n-2))
    }
    ## Using it on first 11 arguments
    sapply(0:10, f)
    [1] 0 1 1 2 3 5 8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing:
    library(rbenchmark)
    benchmark(f(10), f(15), f(20))[,1:4]
       test replications elapsed relative
    1 f(10)          100   0.030    1.000
    2 f(15)          100   0.335   11.167
    3 f(20)          100   3.517  117.233
Speed Example 2 (due to StackOverflow)
    int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2));
    }
deployed as
    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    sapply(0:10, g)
    [1] 0 1 1 2 3 5 8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing:
    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')
    library(rbenchmark)
    benchmark(f(20), g(20), order="relative")[,1:4]
       test replications elapsed relative
    2 g(20)          100   0.011    1.000
    1 f(20)          100   3.776  343.273
A nice gain of a few orders of magnitude.
Another Angle on Speed
Run-time performance is just one example.
Time to code is another metric.
We feel quite strongly that Rcpp helps you code more succinctly, leading to fewer bugs and faster development.
A good environment helps: RStudio integrates R and C++ development quite nicely (e.g. the compiler error message parsing is very helpful) and also helps with package building.
What Next
Hands-on
Programming with C++
· C++ Basics
· Debugging
· Best Practices
and then on to Rcpp itself
C++ Basics
Compiled not Interpreted
Need to compile and link:
    #include <cstdio>

    int main(void) {
        printf("Hello, world!\n");
        return 0;
    }
Compiled not Interpreted
Or streams output rather than printf:
    #include <iostream>

    int main(void) {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
Compiled not Interpreted
g++ -o will compile and link
We will now look at an example with explicit linking.
Compiled not Interpreted
    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }
Compiled not Interpreted
We may need to supply:
· header location via -I
· library location via -L
· library via -llibraryname
    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
Statically Typed
· R is dynamically typed: x <- 3.14; x <- "foo" is valid
· In C++, each variable must be declared before first use
· Common types are int and long (possibly with unsigned), float and double, bool, as well as char
· No built-in string type, though std::string from the standard library comes close
· All these variable types are scalars, which is fundamentally different from R where everything is a vector
· class (and struct) allow creation of composite types; classes add behaviour to data to form objects
· Variables need to be declared, and their type cannot change
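A small sketch of what static typing means in practice. The function and variable names are illustrative and not from the slides.

```cpp
#include <string>

// Each variable is declared with a type before first use, and that type
// stays fixed for the lifetime of the variable.
double circle_area(double radius) {
    const double pi = 3.14;      // a scalar double, not a vector as in R
    return pi * radius * radius;
}

// Unlike R's  x <- 3.14; x <- "foo"  a C++ variable cannot switch type;
// a second variable of a different type is needed instead.
std::string describe() {
    double x = 3.14;
    // x = "foo";                            // would not compile: x is a double
    std::string y = "foo";                   // std::string lives in the standard library
    int truncated = static_cast<int>(x);     // explicit conversion, yields 3
    return y + std::to_string(truncated);    // "foo3"
}
```

The commented-out assignment is the C++ counterpart of the R snippet in the first bullet: the compiler rejects it instead of silently rebinding x.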
C++ Basics: A Better C
C++ is a Better C
· control structures similar to what R offers: for, while, if, switch
· functions are similar too, but note the difference in positional-only matching; also, the same function name with different arguments is allowed in C++
· pointers and memory management: very different, but lots of issues people had with C can be avoided via STL (which is something Rcpp promotes too)
· sometimes still useful to know what a pointer is …
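The three points about functions, memory management and pointers can be sketched as follows. All names here (twice, first_squares, via_pointer) are hypothetical and chosen only for illustration.

```cpp
#include <vector>

// Same function name, different argument types: allowed in C++ (overloading).
int    twice(int x)    { return 2 * x; }
double twice(double x) { return 2.0 * x; }

// A std::vector manages its own memory, so no explicit new/delete
// (or malloc/free) is needed; the storage is released automatically
// when the vector goes out of scope.
std::vector<double> first_squares(int n) {
    std::vector<double> v;
    for (int i = 1; i <= n; i++)
        v.push_back(static_cast<double>(i) * i);
    return v;
}

// Still useful to know what a pointer is: it holds the address of an object.
double via_pointer(std::vector<double>& v) {
    double* p = &v[0];   // address of the first element (v must be non-empty)
    return *p;           // dereference to read the value stored there
}
```

Note that arguments are matched by position and type only; there is no equivalent of R's matching by argument name.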
Object-Oriented
This is a second key feature of C++, and it does it differently from S3 and S4.
    struct Date {
        unsigned int year;
        unsigned int month;
        unsigned int day;
    };

    struct Person {
        char firstname[20];
        char lastname[20];
        struct Date birthday;
        unsigned long id;
    };
Object-Oriented
Object-orientation in the C++ sense matches data with code operating on it:
    class Date {
    private:
        unsigned int year;
        unsigned int month;
        unsigned int date;
    public:
        void setDate(int y, int m, int d);
        int getDay();
        int getMonth();
        int getYear();
    };
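The slide only declares the member functions. A minimal sketch of how they might be defined is given here; the default constructor, the const qualifiers and the member name day (instead of date) are assumptions added for this illustration.

```cpp
// Sketch of the Date class with its member functions defined inline.
class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int day;    // named 'day' here to match getDay(); an assumption
public:
    Date() : year(1970), month(1), day(1) {}   // arbitrary default, an assumption
    void setDate(int y, int m, int d) {
        // No validation in this sketch; a real class would check the ranges.
        year = y; month = m; day = d;
    }
    int getDay() const   { return day; }
    int getMonth() const { return month; }
    int getYear() const  { return year; }
};
```

Because the data members are private, code outside the class can only read or change a date through these functions, which is the "data matched with code" idea from the slide.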
Generic Programming and the STL
The STL promotes generic programming.
For example, the sequence container types vector, deque, and list all support
· push_back() to insert at the end
· pop_back() to remove from the end
· begin() returning an iterator to the first element
· end() returning an iterator to just after the last element
· size() for the number of elements
but only deque and list have push_front() and pop_front().
Other useful containers: set, multiset, map and multimap.
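The shared interface can be sketched with one function template that works unchanged for all three sequence containers. The names fill_and_trim and front_after_push are hypothetical.

```cpp
#include <deque>
#include <list>
#include <vector>

// The same member function names work across the sequence containers,
// so one template covers std::vector, std::deque and std::list.
template <typename Container>
int fill_and_trim(Container& c) {
    c.push_back(1);     // insert at the end
    c.push_back(2);
    c.push_back(3);
    c.pop_back();       // remove the last element (the 3)
    return static_cast<int>(c.size());
}

// push_front() is provided by std::deque and std::list, but not by std::vector.
int front_after_push(std::deque<int>& d) {
    d.push_front(0);
    return d.front();
}
```

Passing a std::vector to a function that calls push_front() would fail at compile time, which is how such interface differences surface in generic code.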
Generic Programming and the STL
Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():
    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
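A self-contained variant of the traversal, as a sketch: the function name join is hypothetical, and the values are written to a string rather than std::cout so that the result is easy to inspect.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Walk a vector with a const_iterator, from begin() up to
// (but not including) end().
std::string join(const std::vector<double>& s) {
    std::ostringstream out;
    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); ++si)
        out << *si << ' ';      // *si dereferences the iterator
    return out.str();
}
```

The loop condition uses != rather than <, since not every container's iterators support ordering comparisons.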
Generic Programming and the STL
Another key STL part are algorithms:
    double sum = std::accumulate(s.begin(), s.end(), 0.0);
Some other STL algorithms are
· find: finds the first element equal to the supplied value
· count: counts the number of matching elements
· transform: applies a supplied function to each element
· for_each: sweeps over all elements, does not alter
· inner_product: inner product of two vectors
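A short sketch wrapping several of these algorithms in small functions. The wrapper names (total, how_many, contains, halve, dot) and the helper half are hypothetical; the std:: calls are the standard ones from the algorithm and numeric headers.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

double half(double x) { return x / 2.0; }

// accumulate: the type of the initial value decides the type of the sum,
// so 0.0 (not 0) is used when adding up doubles.
double total(const std::vector<double>& s) {
    return std::accumulate(s.begin(), s.end(), 0.0);
}

// count: number of elements equal to the supplied value.
long how_many(const std::vector<double>& s, double value) {
    return static_cast<long>(std::count(s.begin(), s.end(), value));
}

// find: returns an iterator to the first match, or end() if there is none.
bool contains(const std::vector<double>& s, double value) {
    return std::find(s.begin(), s.end(), value) != s.end();
}

// transform: apply half() to each element, writing the results back in place.
void halve(std::vector<double>& s) {
    std::transform(s.begin(), s.end(), s.begin(), half);
}

// inner_product: sum of element-wise products (b must be at least as long as a).
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}
```

All of these take iterator pairs rather than containers, which is why the same calls work for vector, deque and list alike.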
Template Programming
Template programming provides a 'language within C++': code gets evaluated during compilation.
One of the simplest template examples is
    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }
This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
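A usage sketch for the template with three different types. The template is renamed my_min here (a hypothetical name) so that calls with standard library types such as std::string cannot be confused with std::min; the three wrapper functions are likewise only for illustration.

```cpp
#include <string>

// Same body as the min template on the slide, under a different name
// to avoid any clash with std::min from the standard library.
template <typename T>
const T& my_min(const T& x, const T& y) {
    return y < x ? y : x;
}

// The compiler generates one version of my_min per type it is used with.
int         smaller_int()    { return my_min(3, 7); }
double      smaller_double() { return my_min(2.5, -1.0); }
std::string smaller_string() {
    // std::string provides operator<, comparing lexicographically
    return my_min(std::string("pear"), std::string("apple"));
}
```

Mixing types, as in my_min(3, 2.5), does not compile because T cannot be deduced as both int and double; the type would have to be given explicitly, for example my_min<double>(3, 2.5).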
Template Programming
Another template example is a class squaring its argument:
    template <typename T>
    class square : public std::unary_function<T, T> {
    public:
        T operator()(T t) const { return t * t; }
    };
which can be used along with STL algorithms:
    // assuming x holds doubles; results are written back into x
    transform(x.begin(), x.end(), x.begin(), square<double>());
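A compilable sketch of the squaring function object. It leaves out the std::unary_function base class, which was deprecated in C++11 and removed in C++17 and is not needed by std::transform; the wrapper name square_in_place is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// A function object: operator() lets an instance be called like a function.
template <typename T>
class square {
public:
    T operator()(T t) const { return t * t; }
};

// square<double>() creates a temporary function object; the third argument
// of std::transform says where the results go (here: back into x).
void square_in_place(std::vector<double>& x) {
    std::transform(x.begin(), x.end(), x.begin(), square<double>());
}
```

The template argument has to match the element type of the container, so a std::vector<int> would use square<int>() instead.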
Further Reading
Books by Meyers are excellent.
I also like the (free) C++ Annotations.
C++ FAQ
Resources on StackOverflow such as
· general info and its links, e.g.
· booklist
Debugging
Debugging
Some tips:
· Generally painful; old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
· Extra tools such as valgrind are helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices
Best Practices
Version control: highly recommended to become familiar with git or svn.
Editor: in the long run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
- Overview
-
- Why R
- Why C++
- Vision
- Speed
-
- What Next
- C++ Basics
-
- A Better C
-
- Debugging
- Best Practices
-
![Page 19: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/19.jpg)
Why C++
middot Asking Google leads to about 52 million hitsmiddot Wikipedia C++ is a statically typed free-form
multi-paradigm compiled general-purpose powerfulprogramming language
middot C++ is industrial-strength vendor-independent widely-usedand still evolving
middot In science amp research one of the most frequently-usedlanguages If there is something you want to use connectto it probably has a CC++ API
middot As a widely used language it also has good tool support(debuggers profilers code analysis)
1858
Why C++
Scott Meyers View C++ as a federation of languages
middot C provides a rich inheritance and interoperability as UnixWindows hellip are all build on C
middot Object-Oriented C++ (maybe just to provide endlessdiscussions about exactly what OO is or should be)
middot Templated C++ which is mighty powerful template metaprogramming unequalled in other languages
middot The Standard Template Library (STL) is a specific templatelibrary which is powerful but has its own conventions
middot C++11 (and C++14 and beyond) add enough to be calleda fifth language
NB Meyers original list of four languages appeared years before C++11
1958
Why C++
middot Mature yet currentmiddot Strong performance focus
middot You donrsquot pay for what you donrsquot usemiddot Leave no room for another language between the machine
level and C++
middot Yet also powerfully abstract and high-levelmiddot C++11 is a big deal giving us new language featuresmiddot While there are complexities Rcpp users are mostly shielded
2058
Overview Vision
Bell Labs May 1976
2258
Interface Vision
R offers us the best of both worlds
middot Compiled code withmiddot Access to proven libraries and algorithms in CC++Fortranmiddot Extremely high performance (in both serial and parallel modes)
middot Interpreted code withmiddot An accessible high-level language made for Programming with
Datamiddot An interactive workflow for data analysismiddot Support for rapid prototyping research and experimentation
2358
Why Rcpp
middot Easy to learn as it really does not have to be thatcomplicated ndash we will see numerous few examples
middot Easy to use as it avoids build and OS system complexitiesthanks to the R infrastrucure
middot Expressive as it allows for vectorised C++ using Rcpp Sugarmiddot Seamless access to all R objects vector matrix list
S3S4RefClass Environment Function hellipmiddot Speed gains for a variety of tasks Rcpp excels precisely
where R struggles loops function calls hellipmiddot Extensions greatly facilitates access to external libraries
using eg Rcpp modules
2458
Overview Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1(1 + x)
f lt- function(n x=1) for(i in 1n) x lt- 1(1+x)g lt- function(n x=1) for(i in 1n) x lt- (1(1+x))h lt- function(n x=1) for(i in 1n) x lt- (1+x)^(-1)j lt- function(n x=1) for(i in 1n) x lt- 11+xk lt- function(n x=1) for(i in 1n) x lt- 11+xlibrary(rbenchmark)N lt- 1e5benchmark(f(N1)g(N1)h(N1)j(N1)k(N1)order=relative)[14]
2658
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 5 k(N 1) 100 6435 1000 1 f(N 1) 100 6609 1027 2 g(N 1) 100 7757 1205 4 j(N 1) 100 7882 1225 3 h(N 1) 100 11766 1828
2758
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy
cppFunction(double m(int n double x)
for (int i=0 iltn i++)x = 1 (1+x)
return x
)
(We will learn more about cppFunction() later)
2858
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 6 m(N 1) 100 0170 1000 1 f(N 1) 100 6854 40318 5 k(N 1) 100 7811 45947 4 j(N 1) 100 9183 54018 2 g(N 1) 100 9489 55818 3 h(N 1) 100 11725 68971
2958
Speed Example 2 (due to StackOverflow)
Consider a function defined as
f(n) such that n when n lt 2
f(n minus 1) + f(n minus 2) when n ge 2
3058
Speed Example 2 (due to StackOverflow)
R implementation and use
f lt- function(n) if (n lt 2) return(n)return(f(n-1) + f(n-2))
Using it on first 11 argumentssapply(010 f)
[1] 0 1 1 2 3 5 8 13 21 34 55
3158
Speed Example 2 (due to StackOverflow)
Timing
library(rbenchmark)benchmark(f(10) f(15) f(20))[14]
test replications elapsed relative 1 f(10) 100 0030 1000 2 f(15) 100 0335 11167 3 f(20) 100 3517 117233
3258
Speed Example 2 (due to StackOverflow)
int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2))
deployed as
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
sapply(010 g)
[1] 0 1 1 2 3 5 8 13 21 34 55
3358
Speed Example 2 (due to StackOverflow)
Timing
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
library(rbenchmark)benchmark(f(20) g(20) order=relative)[14]
test replications elapsed relative 2 g(20) 100 0011 1000 1 f(20) 100 3776 343273
A nice gain of a few orders of magnitude3458
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that helps you code more succinctlyleading to fewer bugs and faster development
A good environment helps RStudio integrates R and C++development quite nicely (eg the compiler error messageparsing is very helpful) and also helps with package building
3558
What Next
Hands-on
Programming with C++
middot C++ Basicsmiddot Debuggingmiddot Best Practices
and then on to Rcpp itself
3758
C++ Basics
Compiled not Interpreted
Need to compile and link
include ltcstdiogt
int main(void) printf(Hello worldn)return 0
3958
Compiled not Interpreted
Or streams output rather than printf
include ltiostreamgt
int main(void) stdcout ltlt Hello world ltlt stdendlreturn 0
4058
Compiled not Interpreted
g++ -o will compile and link
We will now look at an examples with explicit linking
4158
Compiled not Interpreted
include ltcstdiogt
define MATHLIB_STANDALONEinclude ltRmathhgt
int main(void) printf(N(01) 95th percentile 98fn
qnorm(095 00 10 1 0))
4258
Compiled not Interpreted
We may need to supply
middot header location via -Imiddot library location via -Lmiddot library via -llibraryname
g++ -Iusrinclude -c qnorm_rmathcppg++ -o qnorm_rmath qnorm_rmatho -Lusrlib -lRmath
4358
Statically Typed
middot R is dynamically typed x lt- 314 x lt- foo is validmiddot In C++ each variable must be declared before first usemiddot Common types are int and long (possibly with unsigned)float and double bool as well as char
middot No standard string type though stdstring is closemiddot All these variables types are scalars which is fundamentally
different from R where everything is a vectormiddot class (and struct) allow creation of composite types
classes add behaviour to data to form objectsmiddot Variables need to be declared cannot change
4458
C++ Basics A Better C
C++ is a Better C
middot control structures similar to what R offers for while ifswitch
middot functions are similar too but note the difference inpositional-only matching also same function name butdifferent arguments allowed in C++
middot pointers and memory management very different but lotsof issues people had with C can be avoided via STL (whichis something Rcpp promotes too)
middot sometimes still useful to know what a pointer is hellip
4658
Object-Oriented
This is a second key feature of C++ and it does it differentlyfrom S3 and S4
struct Date unsigned int yearunsigned int monthunsigned int day
struct Person char firstname[20]char lastname[20]struct Date birthdayunsigned long id
4758
Object-Oriented
Object-orientation in the C++ sense matches data with codeoperating on it
class Date private
unsigned int yearunsigned int monthunsigned int date
publicvoid setDate(int y int m int d)int getDay()int getMonth()int getYear()
4858
Generic Programming and the STL
The STL promotes generic programming
For example the sequence container types vector dequeand list all support
middot push_back() to insert at the endmiddot pop_back() to remove from the frontmiddot begin() returning an iterator to the first elementmiddot end() returning an iterator to just after the last elementmiddot size() for the number of elements
but only list has push_front() and pop_front()
Other useful containers set multiset map and multimap4958
Generic Programming and the STL
Traversal of containers can be achieved via iterators whichrequire suitable member functions begin() and end()
stdvectorltdoublegtconst_iterator sifor (si=sbegin() si = send() si++)
stdcout ltlt si ltlt stdendl
5058
Generic Programming and the STL
Another key STL part are algorithms
double sum = accumulate(sbegin() send() 0)
Some other STL algorithms are
middot find finds the first element equal to the supplied valuemiddot count counts the number of matching elementsmiddot transform applies a supplied function to each elementmiddot for_each sweeps over all elements does not altermiddot inner_product inner product of two vectors
5158
Template Programming
Template programming provides a lsquolanguage within C++rsquocode gets evaluated during compilation
One of the simplest template examples is
template lttypename Tgtconst Tamp min(const Tamp x const Tamp y)
return y lt x y x
This can now be used to compute the minimum between twoint variables or double or in fact any admissible typeproviding an operatorlt() for less-than comparison
5258
Template Programming
Another template example is a class squaring its argument
template lttypename Tgtclass square public stdunary_functionltTTgt public
T operator()(T t) const return tt
which can be used along with STL algorithms
transform(xbegin() xend() square)
5358
Further Reading
Books by Meyers are excellent
I also like the (free) C++ Annotations
C++ FAQ
Resources on StackOverflow such as
middot general info and its links egmiddot booklist
5458
Debugging
Debugging
Some tips
middot Generally painful old-school printf() still pervasivemiddot Debuggers go along with compilers gdb for gcc and g++lldb for the clang llvm family
middot Extra tools such as valgrind helpful for memory debuggingmiddot ldquoSanitizerrdquo (ASANUBSAN) in newer versions of g++ andclang++
5658
Best Practices
Best Practices
Version control highly recommended to become familiar withgit or svn
Editor in the long-run recommended to learn productivitytricks for one editor emacs vi eclipse RStudio hellip
5858
- Overview
-
- Why R
- Why C++
- Vision
- Speed
-
- What Next
- C++ Basics
-
- A Better C
-
- Debugging
- Best Practices
-
![Page 20: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/20.jpg)
Why C++
Scott Meyers View C++ as a federation of languages
middot C provides a rich inheritance and interoperability as UnixWindows hellip are all build on C
middot Object-Oriented C++ (maybe just to provide endlessdiscussions about exactly what OO is or should be)
middot Templated C++ which is mighty powerful template metaprogramming unequalled in other languages
middot The Standard Template Library (STL) is a specific templatelibrary which is powerful but has its own conventions
middot C++11 (and C++14 and beyond) add enough to be calleda fifth language
NB Meyers original list of four languages appeared years before C++11
1958
Why C++
middot Mature yet currentmiddot Strong performance focus
middot You donrsquot pay for what you donrsquot usemiddot Leave no room for another language between the machine
level and C++
middot Yet also powerfully abstract and high-levelmiddot C++11 is a big deal giving us new language featuresmiddot While there are complexities Rcpp users are mostly shielded
2058
Overview Vision
Bell Labs May 1976
2258
Interface Vision
R offers us the best of both worlds
middot Compiled code withmiddot Access to proven libraries and algorithms in CC++Fortranmiddot Extremely high performance (in both serial and parallel modes)
middot Interpreted code withmiddot An accessible high-level language made for Programming with
Datamiddot An interactive workflow for data analysismiddot Support for rapid prototyping research and experimentation
2358
Why Rcpp
middot Easy to learn as it really does not have to be thatcomplicated ndash we will see numerous few examples
middot Easy to use as it avoids build and OS system complexitiesthanks to the R infrastrucure
middot Expressive as it allows for vectorised C++ using Rcpp Sugarmiddot Seamless access to all R objects vector matrix list
S3S4RefClass Environment Function hellipmiddot Speed gains for a variety of tasks Rcpp excels precisely
where R struggles loops function calls hellipmiddot Extensions greatly facilitates access to external libraries
using eg Rcpp modules
2458
Overview Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1(1 + x)
f lt- function(n x=1) for(i in 1n) x lt- 1(1+x)g lt- function(n x=1) for(i in 1n) x lt- (1(1+x))h lt- function(n x=1) for(i in 1n) x lt- (1+x)^(-1)j lt- function(n x=1) for(i in 1n) x lt- 11+xk lt- function(n x=1) for(i in 1n) x lt- 11+xlibrary(rbenchmark)N lt- 1e5benchmark(f(N1)g(N1)h(N1)j(N1)k(N1)order=relative)[14]
2658
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 5 k(N 1) 100 6435 1000 1 f(N 1) 100 6609 1027 2 g(N 1) 100 7757 1205 4 j(N 1) 100 7882 1225 3 h(N 1) 100 11766 1828
2758
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy
cppFunction(double m(int n double x)
for (int i=0 iltn i++)x = 1 (1+x)
return x
)
(We will learn more about cppFunction() later)
2858
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 6 m(N 1) 100 0170 1000 1 f(N 1) 100 6854 40318 5 k(N 1) 100 7811 45947 4 j(N 1) 100 9183 54018 2 g(N 1) 100 9489 55818 3 h(N 1) 100 11725 68971
2958
Speed Example 2 (due to StackOverflow)
Consider a function defined as
f(n) such that n when n lt 2
f(n minus 1) + f(n minus 2) when n ge 2
3058
Speed Example 2 (due to StackOverflow)
R implementation and use
f lt- function(n) if (n lt 2) return(n)return(f(n-1) + f(n-2))
Using it on first 11 argumentssapply(010 f)
[1] 0 1 1 2 3 5 8 13 21 34 55
3158
Speed Example 2 (due to StackOverflow)
Timing
library(rbenchmark)benchmark(f(10) f(15) f(20))[14]
test replications elapsed relative 1 f(10) 100 0030 1000 2 f(15) 100 0335 11167 3 f(20) 100 3517 117233
3258
Speed Example 2 (due to StackOverflow)
int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2))
deployed as
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
sapply(010 g)
[1] 0 1 1 2 3 5 8 13 21 34 55
3358
Speed Example 2 (due to StackOverflow)
Timing
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
library(rbenchmark)benchmark(f(20) g(20) order=relative)[14]
test replications elapsed relative 2 g(20) 100 0011 1000 1 f(20) 100 3776 343273
A nice gain of a few orders of magnitude3458
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that helps you code more succinctlyleading to fewer bugs and faster development
A good environment helps RStudio integrates R and C++development quite nicely (eg the compiler error messageparsing is very helpful) and also helps with package building
3558
What Next
Hands-on
Programming with C++
middot C++ Basicsmiddot Debuggingmiddot Best Practices
and then on to Rcpp itself
3758
C++ Basics
Compiled not Interpreted
Need to compile and link
include ltcstdiogt
int main(void) printf(Hello worldn)return 0
3958
Compiled not Interpreted
Or streams output rather than printf
include ltiostreamgt
int main(void) stdcout ltlt Hello world ltlt stdendlreturn 0
4058
Compiled not Interpreted
g++ -o will compile and link
We will now look at an examples with explicit linking
4158
Compiled not Interpreted
include ltcstdiogt
define MATHLIB_STANDALONEinclude ltRmathhgt
int main(void) printf(N(01) 95th percentile 98fn
qnorm(095 00 10 1 0))
4258
Compiled not Interpreted
We may need to supply
middot header location via -Imiddot library location via -Lmiddot library via -llibraryname
g++ -Iusrinclude -c qnorm_rmathcppg++ -o qnorm_rmath qnorm_rmatho -Lusrlib -lRmath
4358
Statically Typed
middot R is dynamically typed x lt- 314 x lt- foo is validmiddot In C++ each variable must be declared before first usemiddot Common types are int and long (possibly with unsigned)float and double bool as well as char
middot No standard string type though stdstring is closemiddot All these variables types are scalars which is fundamentally
different from R where everything is a vectormiddot class (and struct) allow creation of composite types
classes add behaviour to data to form objectsmiddot Variables need to be declared cannot change
4458
C++ Basics A Better C
C++ is a Better C
middot control structures similar to what R offers for while ifswitch
middot functions are similar too but note the difference inpositional-only matching also same function name butdifferent arguments allowed in C++
middot pointers and memory management very different but lotsof issues people had with C can be avoided via STL (whichis something Rcpp promotes too)
middot sometimes still useful to know what a pointer is hellip
4658
Object-Oriented
This is a second key feature of C++ and it does it differentlyfrom S3 and S4
struct Date unsigned int yearunsigned int monthunsigned int day
struct Person char firstname[20]char lastname[20]struct Date birthdayunsigned long id
4758
Object-Oriented
Object-orientation in the C++ sense matches data with codeoperating on it
class Date private
unsigned int yearunsigned int monthunsigned int date
publicvoid setDate(int y int m int d)int getDay()int getMonth()int getYear()
4858
Generic Programming and the STL
The STL promotes generic programming
For example the sequence container types vector dequeand list all support
middot push_back() to insert at the endmiddot pop_back() to remove from the frontmiddot begin() returning an iterator to the first elementmiddot end() returning an iterator to just after the last elementmiddot size() for the number of elements
but only list has push_front() and pop_front()
Other useful containers set multiset map and multimap4958
Generic Programming and the STL
Traversal of containers can be achieved via iterators whichrequire suitable member functions begin() and end()
stdvectorltdoublegtconst_iterator sifor (si=sbegin() si = send() si++)
stdcout ltlt si ltlt stdendl
5058
Generic Programming and the STL
Another key STL part are algorithms
double sum = accumulate(sbegin() send() 0)
Some other STL algorithms are
middot find finds the first element equal to the supplied valuemiddot count counts the number of matching elementsmiddot transform applies a supplied function to each elementmiddot for_each sweeps over all elements does not altermiddot inner_product inner product of two vectors
5158
Template Programming
Template programming provides a lsquolanguage within C++rsquocode gets evaluated during compilation
One of the simplest template examples is
template lttypename Tgtconst Tamp min(const Tamp x const Tamp y)
return y lt x y x
This can now be used to compute the minimum between twoint variables or double or in fact any admissible typeproviding an operatorlt() for less-than comparison
5258
Template Programming
Another template example is a class squaring its argument
template lttypename Tgtclass square public stdunary_functionltTTgt public
T operator()(T t) const return tt
which can be used along with STL algorithms
transform(xbegin() xend() square)
5358
Further Reading
Books by Meyers are excellent
I also like the (free) C++ Annotations
C++ FAQ
Resources on StackOverflow such as
middot general info and its links egmiddot booklist
5458
Debugging
Debugging
Some tips
middot Generally painful old-school printf() still pervasivemiddot Debuggers go along with compilers gdb for gcc and g++lldb for the clang llvm family
middot Extra tools such as valgrind helpful for memory debuggingmiddot ldquoSanitizerrdquo (ASANUBSAN) in newer versions of g++ andclang++
5658
Best Practices
Best Practices
Version control highly recommended to become familiar withgit or svn
Editor in the long-run recommended to learn productivitytricks for one editor emacs vi eclipse RStudio hellip
5858
- Overview
-
- Why R
- Why C++
- Vision
- Speed
-
- What Next
- C++ Basics
-
- A Better C
-
- Debugging
- Best Practices
-
![Page 21: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/21.jpg)
Why C++
middot Mature yet currentmiddot Strong performance focus
middot You donrsquot pay for what you donrsquot usemiddot Leave no room for another language between the machine
level and C++
middot Yet also powerfully abstract and high-levelmiddot C++11 is a big deal giving us new language featuresmiddot While there are complexities Rcpp users are mostly shielded
2058
Overview Vision
Bell Labs May 1976
2258
Interface Vision
R offers us the best of both worlds
middot Compiled code withmiddot Access to proven libraries and algorithms in CC++Fortranmiddot Extremely high performance (in both serial and parallel modes)
middot Interpreted code withmiddot An accessible high-level language made for Programming with
Datamiddot An interactive workflow for data analysismiddot Support for rapid prototyping research and experimentation
2358
Why Rcpp
middot Easy to learn as it really does not have to be thatcomplicated ndash we will see numerous few examples
middot Easy to use as it avoids build and OS system complexitiesthanks to the R infrastrucure
middot Expressive as it allows for vectorised C++ using Rcpp Sugarmiddot Seamless access to all R objects vector matrix list
S3S4RefClass Environment Function hellipmiddot Speed gains for a variety of tasks Rcpp excels precisely
where R struggles loops function calls hellipmiddot Extensions greatly facilitates access to external libraries
using eg Rcpp modules
2458
Overview Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1/(1 + x):

f <- function(n, x=1) for(i in 1:n) x <- 1/(1+x)
g <- function(n, x=1) for(i in 1:n) x <- (1/(1+x))
h <- function(n, x=1) for(i in 1:n) x <- (1+x)^(-1)
j <- function(n, x=1) for(i in 1:n) x <- {1/{1+x}}
k <- function(n, x=1) for(i in 1:n) x <- 1/{1+x}
library(rbenchmark)
N <- 1e5
benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1), order="relative")[,1:4]
Speed Example 1 (due to Christian Robert)
     test replications elapsed relative
5 k(N, 1)          100   6.435    1.000
1 f(N, 1)          100   6.609    1.027
2 g(N, 1)          100   7.757    1.205
4 j(N, 1)          100   7.882    1.225
3 h(N, 1)          100  11.766    1.828
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy
cppFunction('double m(int n, double x) {
    for (int i=0; i<n; i++)
        x = 1 / (1+x);
    return x;
}')
(We will learn more about cppFunction() later)
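For readers without an R session at hand, the same loop can be sketched as plain C++. This is only an illustration and not part of the original slides; the name iterate_inverse is a hypothetical stand-in for m().

```cpp
// Plain C++ sketch of the loop handed to cppFunction().
// The name iterate_inverse is hypothetical (the slides call it m).
double iterate_inverse(int n, double x) {
    for (int i = 0; i < n; i++) {
        x = 1.0 / (1.0 + x);   // same update as the R variants f, g, h, j, k
    }
    return x;
}
```

Starting from x = 1, the iteration settles near (sqrt(5) - 1) / 2, roughly 0.618, which gives a simple value to check every variant against.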
Speed Example 1 (due to Christian Robert)
     test replications elapsed relative
6 m(N, 1)          100   0.170    1.000
1 f(N, 1)          100   6.854   40.318
5 k(N, 1)          100   7.811   45.947
4 j(N, 1)          100   9.183   54.018
2 g(N, 1)          100   9.489   55.818
3 h(N, 1)          100  11.725   68.971
Speed Example 2 (due to StackOverflow)
Consider a function defined as

$$
f(n) = \begin{cases} n & \text{when } n < 2 \\ f(n-1) + f(n-2) & \text{when } n \geq 2 \end{cases}
$$
Speed Example 2 (due to StackOverflow)
R implementation and use
f <- function(n) {
    if (n < 2) return(n)
    return(f(n-1) + f(n-2))
}

## Using it on first 11 arguments
sapply(0:10, f)

[1] 0 1 1 2 3 5 8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing
library(rbenchmark)
benchmark(f(10), f(15), f(20))[,1:4]

   test replications elapsed relative
1 f(10)          100   0.030    1.000
2 f(15)          100   0.335   11.167
3 f(20)          100   3.517  117.233
Speed Example 2 (due to StackOverflow)
int g(int n) {
    if (n < 2) return(n);
    return(g(n-1) + g(n-2));
}

deployed as

Rcpp::cppFunction('int g(int n) {
    if (n < 2) return(n);
    return(g(n-1) + g(n-2)); }')
sapply(0:10, g)

[1] 0 1 1 2 3 5 8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing
Rcpp::cppFunction('int g(int n) {
    if (n < 2) return(n);
    return(g(n-1) + g(n-2)); }')
library(rbenchmark)
benchmark(f(20), g(20), order="relative")[,1:4]

   test replications elapsed relative
2 g(20)          100   0.011    1.000
1 f(20)          100   3.776  343.273

A nice gain of a few orders of magnitude.
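Part of the gap likely comes from the sheer number of function calls the naive recursion makes, which is one of the areas where R struggles. A small sketch, assuming a hypothetical fib_counted() helper with an added call counter that is not part of the slides:

```cpp
// Same recursion as g(), with a counter recording every call.
// The counter argument is an illustrative addition.
int fib_counted(int n, long &calls) {
    ++calls;                       // one increment per invocation
    if (n < 2) return n;
    return fib_counted(n - 1, calls) + fib_counted(n - 2, calls);
}
```

For n = 20 this makes 21891 calls to arrive at the value 6765, so the per-call overhead dominates the run time.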
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that Rcpp helps you code more succinctly, leading to fewer bugs and faster development

A good environment helps: RStudio integrates R and C++ development quite nicely (e.g. the compiler error message parsing is very helpful) and also helps with package building
What Next
Hands-on
Programming with C++
· C++ Basics
· Debugging
· Best Practices

and then on to Rcpp itself
C++ Basics
Compiled not Interpreted
Need to compile and link
#include <cstdio>

int main(void) {
    printf("Hello, world!\n");
    return 0;
}
Compiled not Interpreted
Or streams output rather than printf
#include <iostream>

int main(void) {
    std::cout << "Hello, world!" << std::endl;
    return 0;
}
Compiled not Interpreted
g++ -o will compile and link
We will now look at an example with explicit linking
Compiled not Interpreted
#include <cstdio>

#define MATHLIB_STANDALONE
#include <Rmath.h>

int main(void) {
    printf("N(0,1) 95th percentile %9.8f\n",
           qnorm(0.95, 0.0, 1.0, 1, 0));
}
Compiled not Interpreted
We may need to supply
· header location via -I
· library location via -L
· library via -llibraryname

g++ -I/usr/include -c qnorm_rmath.cpp
g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
Statically Typed
· R is dynamically typed: x <- 3.14; x <- "foo" is valid
· In C++, each variable must be declared before first use
· Common types are int and long (possibly with unsigned), float and double, bool, as well as char
· No standard string type, though std::string is close
· All these variable types are scalars, which is fundamentally different from R where everything is a vector
· class (and struct) allow creation of composite types; classes add behaviour to data to form objects
· Variables need to be declared, and their type cannot change afterwards
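A minimal sketch of what static typing and scalar variables look like in practice. The function names mean_of and integer_half are hypothetical and chosen only for this illustration.

```cpp
#include <vector>

// A scalar double accumulates the total; holding several values
// requires an explicit container such as std::vector.
double mean_of(const std::vector<double> &v) {
    double total = 0.0;            // declared once as double, stays double
    for (std::size_t i = 0; i < v.size(); i++) total += v[i];
    return v.empty() ? 0.0 : total / v.size();
}

// The declared types decide the arithmetic: int / int stays int.
int integer_half(int n) {
    return n / 2;                  // 7 / 2 is 3, not 3.5
}
```

Assigning a string to total would be rejected by the compiler, whereas the equivalent reassignment in R is perfectly valid.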
C++ Basics A Better C
C++ is a Better C
· control structures similar to what R offers: for, while, if, switch
· functions are similar too, but note the difference in positional-only matching; also same function name but different arguments allowed in C++
· pointers and memory management: very different, but lots of issues people had with C can be avoided via STL (which is something Rcpp promotes too)
· sometimes still useful to know what a pointer is …
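The point about same function name but different arguments can be sketched as follows. The function name area and its two variants are hypothetical examples, not taken from the slides.

```cpp
// Overloading: one name, two argument lists. The compiler picks the
// variant from the number and types of the arguments at the call site.
double area(double radius) {                  // circle
    return 3.141592653589793 * radius * radius;
}

double area(double width, double height) {    // rectangle
    return width * height;
}
```

Arguments are matched by position only, so area(2.0, 3.0) always means width 2 and height 3; there is no equivalent of R's area(height = 3, width = 2).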
Object-Oriented
This is a second key feature of C++, and it does it differently from S3 and S4

struct Date {
    unsigned int year;
    unsigned int month;
    unsigned int day;
};

struct Person {
    char firstname[20];
    char lastname[20];
    struct Date birthday;
    unsigned long id;
};
Object-Oriented
Object-orientation in the C++ sense matches data with code operating on it

class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int date;
public:
    void setDate(int y, int m, int d);
    int getDay();
    int getMonth();
    int getYear();
};
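The slide only declares the member functions. A minimal sketch of how they might be filled in follows; the function bodies are assumptions made for illustration, and no validation of the date is attempted.

```cpp
// Date class with the declared members given simple bodies.
// The bodies are assumed; the slide shows declarations only.
class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int date;
public:
    void setDate(int y, int m, int d) {
        year  = static_cast<unsigned int>(y);
        month = static_cast<unsigned int>(m);
        date  = static_cast<unsigned int>(d);
    }
    int getDay()   { return static_cast<int>(date); }
    int getMonth() { return static_cast<int>(month); }
    int getYear()  { return static_cast<int>(year); }
};
```

Code outside the class can only reach the private fields through the public functions, which is the sense in which data is matched with the code operating on it.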
Generic Programming and the STL
The STL promotes generic programming
For example, the sequence container types vector, deque, and list all support

· push_back() to insert at the end
· pop_back() to remove from the end
· begin() returning an iterator to the first element
· end() returning an iterator to just after the last element
· size() for the number of elements

but only deque and list have push_front() and pop_front()

Other useful containers: set, multiset, map and multimap
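A short sketch of these member functions on two of the containers. The helper names vector_demo and list_demo are hypothetical; the comments trace the container contents after each call.

```cpp
#include <list>
#include <vector>

// vector: grows and shrinks at the end.
std::vector<int> vector_demo() {
    std::vector<int> v;
    v.push_back(1);    // v: 1
    v.push_back(2);    // v: 1 2
    v.push_back(3);    // v: 1 2 3
    v.pop_back();      // v: 1 2   (removes the last element)
    return v;
}

// list: can also grow and shrink at the front.
std::list<int> list_demo() {
    std::list<int> l;
    l.push_back(2);    // l: 2
    l.push_front(1);   // l: 1 2
    l.push_front(0);   // l: 0 1 2
    l.pop_front();     // l: 1 2   (removes the first element)
    return l;
}
```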
Generic Programming and the STL
Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end()

std::vector<double>::const_iterator si;
for (si=s.begin(); si != s.end(); si++)
    std::cout << *si << std::endl;
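To try the traversal in isolation, the loop can be wrapped in a small function. This is a sketch; the name to_lines is hypothetical, and it writes into a string stream rather than std::cout so the result can be inspected.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Walks s from begin() to end() with an explicit const_iterator and
// collects the elements, one per line.
std::string to_lines(const std::vector<double> &s) {
    std::ostringstream os;
    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); ++si)
        os << *si << '\n';     // *si dereferences the iterator
    return os.str();
}
```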
Generic Programming and the STL
Another key part of the STL are its algorithms

double sum = std::accumulate(s.begin(), s.end(), 0.0);

Some other STL algorithms are

· find finds the first element equal to the supplied value
· count counts the number of matching elements
· transform applies a supplied function to each element
· for_each sweeps over all elements, does not alter
· inner_product inner product of two vectors
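A few of these algorithms in use, as a hedged sketch. The wrapper names (sum_of, sum_of_squares, doubled, position_of, twice) are hypothetical; only the std:: calls come from the standard library.

```cpp
#include <algorithm>   // find, transform
#include <numeric>     // accumulate, inner_product
#include <vector>

// Helper applied by std::transform; the name is illustrative only.
double twice(double x) { return 2.0 * x; }

// The initial value 0.0 (not 0) keeps the running total a double.
double sum_of(const std::vector<double> &s) {
    return std::accumulate(s.begin(), s.end(), 0.0);
}

// Inner product of a vector with itself.
double sum_of_squares(const std::vector<double> &s) {
    return std::inner_product(s.begin(), s.end(), s.begin(), 0.0);
}

// Returns a doubled copy of s, leaving s untouched.
std::vector<double> doubled(const std::vector<double> &s) {
    std::vector<double> out(s.size());
    std::transform(s.begin(), s.end(), out.begin(), twice);
    return out;
}

// Index of the first element equal to value, or -1 if absent.
long position_of(const std::vector<double> &s, double value) {
    std::vector<double>::const_iterator it =
        std::find(s.begin(), s.end(), value);
    return it == s.end() ? -1 : static_cast<long>(it - s.begin());
}
```

Note that std::accumulate takes its result type from the initial value, so passing an integer 0 over a vector of doubles would truncate each partial sum.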
Template Programming
Template programming provides a 'language within C++': code gets evaluated during compilation

One of the simplest template examples is

template <typename T>
const T& min(const T& x, const T& y) {
    return y < x ? y : x;
}

This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison
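A sketch of the template applied to several types. It is renamed smaller_of here (a hypothetical name) to avoid clashing with std::min, and first_alphabetically is an equally hypothetical wrapper showing that std::string qualifies because it provides operator<.

```cpp
#include <string>

// Same body as the slide's min(), under a different name.
template <typename T>
const T& smaller_of(const T& x, const T& y) {
    return y < x ? y : x;
}

// std::string works without any extra code: the compiler
// instantiates smaller_of<std::string> on demand.
std::string first_alphabetically(const std::string &a,
                                 const std::string &b) {
    return smaller_of(a, b);
}
```

The compiler generates a separate version of the function for each type it is used with, so smaller_of(3, 7) and smaller_of(2.5, 1.5) run as ordinary int and double code.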
Template Programming
Another template example is a class squaring its argument
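template <typename T>
class square : public std::unary_function<T,T> {
public:
    T operator()(T t) const {
        return t*t;
    }
};

which can be used along with STL algorithms

// squares each element of x in place, assuming x holds double values
transform(x.begin(), x.end(), x.begin(), square<double>());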
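A self-contained sketch of the same idea. It differs from the slide in one labeled respect: the class (named Square here, a hypothetical name) does not derive from std::unary_function, since that base class was deprecated in C++11 and removed in C++17. The wrapper square_in_place is likewise an illustrative name.

```cpp
#include <algorithm>
#include <vector>

// Function object squaring its argument; no base class is needed
// for std::transform to call it.
template <typename T>
class Square {
public:
    T operator()(T t) const { return t * t; }
};

// Squares every element of x in place by writing the results
// back to the range they were read from.
void square_in_place(std::vector<double> &x) {
    std::transform(x.begin(), x.end(), x.begin(), Square<double>());
}
```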
Further Reading
Books by Meyers are excellent
I also like the (free) C++ Annotations
C++ FAQ
Resources on StackOverflow such as
· general info and its links, e.g.
  · booklist
Debugging
Debugging
Some tips
· Generally painful, old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
· Extra tools such as valgrind helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices
Best Practices
Version control: highly recommended to become familiar with git or svn

Editor: in the long run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
- Overview
- Why R
- Why C++
- Vision
- Speed
- What Next
- C++ Basics
- A Better C
- Debugging
- Best Practices
![Page 24: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/24.jpg)
Interface Vision
R offers us the best of both worlds
middot Compiled code withmiddot Access to proven libraries and algorithms in CC++Fortranmiddot Extremely high performance (in both serial and parallel modes)
middot Interpreted code withmiddot An accessible high-level language made for Programming with
Datamiddot An interactive workflow for data analysismiddot Support for rapid prototyping research and experimentation
2358
Why Rcpp
middot Easy to learn as it really does not have to be thatcomplicated ndash we will see numerous few examples
middot Easy to use as it avoids build and OS system complexitiesthanks to the R infrastrucure
middot Expressive as it allows for vectorised C++ using Rcpp Sugarmiddot Seamless access to all R objects vector matrix list
S3S4RefClass Environment Function hellipmiddot Speed gains for a variety of tasks Rcpp excels precisely
where R struggles loops function calls hellipmiddot Extensions greatly facilitates access to external libraries
using eg Rcpp modules
2458
Overview Speed
Speed Example 1 (due to Christian Robert)
Five different ways to compute 1(1 + x)
f lt- function(n x=1) for(i in 1n) x lt- 1(1+x)g lt- function(n x=1) for(i in 1n) x lt- (1(1+x))h lt- function(n x=1) for(i in 1n) x lt- (1+x)^(-1)j lt- function(n x=1) for(i in 1n) x lt- 11+xk lt- function(n x=1) for(i in 1n) x lt- 11+xlibrary(rbenchmark)N lt- 1e5benchmark(f(N1)g(N1)h(N1)j(N1)k(N1)order=relative)[14]
2658
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 5 k(N 1) 100 6435 1000 1 f(N 1) 100 6609 1027 2 g(N 1) 100 7757 1205 4 j(N 1) 100 7882 1225 3 h(N 1) 100 11766 1828
2758
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy
cppFunction(double m(int n double x)
for (int i=0 iltn i++)x = 1 (1+x)
return x
)
(We will learn more about cppFunction() later)
2858
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 6 m(N 1) 100 0170 1000 1 f(N 1) 100 6854 40318 5 k(N 1) 100 7811 45947 4 j(N 1) 100 9183 54018 2 g(N 1) 100 9489 55818 3 h(N 1) 100 11725 68971
2958
Speed Example 2 (due to StackOverflow)
Consider a function defined as
f(n) such that n when n lt 2
f(n minus 1) + f(n minus 2) when n ge 2
3058
Speed Example 2 (due to StackOverflow)
R implementation and use
f lt- function(n) if (n lt 2) return(n)return(f(n-1) + f(n-2))
Using it on first 11 argumentssapply(010 f)
[1] 0 1 1 2 3 5 8 13 21 34 55
3158
Speed Example 2 (due to StackOverflow)
Timing
library(rbenchmark)benchmark(f(10) f(15) f(20))[14]
test replications elapsed relative 1 f(10) 100 0030 1000 2 f(15) 100 0335 11167 3 f(20) 100 3517 117233
3258
Speed Example 2 (due to StackOverflow)
int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2))
deployed as
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
sapply(010 g)
[1] 0 1 1 2 3 5 8 13 21 34 55
3358
Speed Example 2 (due to StackOverflow)
Timing
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
library(rbenchmark)benchmark(f(20) g(20) order=relative)[14]
test replications elapsed relative 2 g(20) 100 0011 1000 1 f(20) 100 3776 343273
A nice gain of a few orders of magnitude3458
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that helps you code more succinctlyleading to fewer bugs and faster development
A good environment helps RStudio integrates R and C++development quite nicely (eg the compiler error messageparsing is very helpful) and also helps with package building
3558
What Next
Hands-on
Programming with C++
middot C++ Basicsmiddot Debuggingmiddot Best Practices
and then on to Rcpp itself
3758
C++ Basics
Compiled not Interpreted
Need to compile and link
include ltcstdiogt
int main(void) printf(Hello worldn)return 0
3958
Compiled not Interpreted
Or streams output rather than printf
include ltiostreamgt
int main(void) stdcout ltlt Hello world ltlt stdendlreturn 0
4058
Compiled not Interpreted
g++ -o will compile and link
We will now look at an examples with explicit linking
4158
Compiled not Interpreted
include ltcstdiogt
define MATHLIB_STANDALONEinclude ltRmathhgt
int main(void) printf(N(01) 95th percentile 98fn
qnorm(095 00 10 1 0))
4258
Compiled not Interpreted
We may need to supply
middot header location via -Imiddot library location via -Lmiddot library via -llibraryname
g++ -Iusrinclude -c qnorm_rmathcppg++ -o qnorm_rmath qnorm_rmatho -Lusrlib -lRmath
4358
Statically Typed
middot R is dynamically typed x lt- 314 x lt- foo is validmiddot In C++ each variable must be declared before first usemiddot Common types are int and long (possibly with unsigned)float and double bool as well as char
middot No standard string type though stdstring is closemiddot All these variables types are scalars which is fundamentally
different from R where everything is a vectormiddot class (and struct) allow creation of composite types
classes add behaviour to data to form objectsmiddot Variables need to be declared cannot change
4458
C++ Basics A Better C
C++ is a Better C
middot control structures similar to what R offers for while ifswitch
middot functions are similar too but note the difference inpositional-only matching also same function name butdifferent arguments allowed in C++
middot pointers and memory management very different but lotsof issues people had with C can be avoided via STL (whichis something Rcpp promotes too)
middot sometimes still useful to know what a pointer is hellip
4658
Object-Oriented
This is a second key feature of C++ and it does it differentlyfrom S3 and S4
struct Date unsigned int yearunsigned int monthunsigned int day
struct Person char firstname[20]char lastname[20]struct Date birthdayunsigned long id
4758
Object-Oriented
Object-orientation in the C++ sense matches data with codeoperating on it
class Date private
unsigned int yearunsigned int monthunsigned int date
publicvoid setDate(int y int m int d)int getDay()int getMonth()int getYear()
4858
Generic Programming and the STL
The STL promotes generic programming
For example the sequence container types vector dequeand list all support
middot push_back() to insert at the endmiddot pop_back() to remove from the frontmiddot begin() returning an iterator to the first elementmiddot end() returning an iterator to just after the last elementmiddot size() for the number of elements
but only list has push_front() and pop_front()
Other useful containers set multiset map and multimap4958
Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():

    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
Algorithms are another key part of the STL:

    // the initial value 0.0 (not 0) keeps the running sum a double
    double sum = std::accumulate(s.begin(), s.end(), 0.0);

Some other STL algorithms are

· find finds the first element equal to the supplied value
· count counts the number of matching elements
· transform applies a supplied function to each element
· for_each sweeps over all elements, does not alter
· inner_product inner product of two vectors
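A few of these algorithms in use, as a hedged sketch; the wrapper names (`total`, `truncated_total`, `count_equal`, `dot`) are hypothetical. One detail worth knowing about std::accumulate is that the type of the initial value determines the type of the running sum, so an integer 0 silently truncates when summing doubles.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Sum of all elements; the initial value 0.0 makes the accumulator a double.
double total(const std::vector<double>& s) {
    return std::accumulate(s.begin(), s.end(), 0.0);
}

// Same call with an int initial value: the accumulator is an int, so the
// fractional part is dropped after every addition.
int truncated_total(const std::vector<double>& s) {
    return std::accumulate(s.begin(), s.end(), 0);
}

// Number of elements equal to 'value'.
long count_equal(const std::vector<double>& s, double value) {
    return std::count(s.begin(), s.end(), value);
}

// Inner product of two vectors; assumes b has at least as many elements as a.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}
```

For the values 0.5, 0.5, 1.5 the first function gives 2.5 while the second gives 1, which is why the literal 0.0 matters.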
Template Programming

Template programming provides a 'language within C++': code gets evaluated during compilation.

One of the simplest template examples is

    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }

This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
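To illustrate "any admissible type", here is a sketch that applies the same template to a user-defined type. The template is renamed `smaller` to avoid clashing with std::min, and the `Version` struct with its operator< is a hypothetical example, not something from the slides.

```cpp
#include <string>

// Same logic as the min template above, under a different (hypothetical) name.
template <typename T>
const T& smaller(const T& x, const T& y) {
    return y < x ? y : x;
}

// A hypothetical user-defined type with two integer fields.
struct Version {
    int major_no;
    int minor_no;
};

// Supplying operator< is all that is needed to make Version admissible:
// compare the major number first, then the minor number.
bool operator<(const Version& a, const Version& b) {
    if (a.major_no != b.major_no) return a.major_no < b.major_no;
    return a.minor_no < b.minor_no;
}
```

The compiler generates a separate instance of the template for int, double, std::string, and Version at compile time; nothing is looked up at run time.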
Another template example is a class squaring its argument:

    template <typename T>
    class square : public std::unary_function<T, T> {
    public:
        T operator()(T t) const { return t * t; }
    };

which can be used along with STL algorithms:

    // transform needs an output iterator and an instance of the functor
    std::transform(x.begin(), x.end(), x.begin(), square<double>());
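A self-contained sketch of the functor in use, with some assumptions stated up front: the class is named `Square` here, the base class std::unary_function is left out (it was deprecated in C++11 and removed in C++17, and is not needed for std::transform), and the wrapper function `squared` is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Function object: instances can be called like a function.
template <typename T>
class Square {
public:
    T operator()(T t) const { return t * t; }
};

// Applies Square<double> to every element of x and returns the results
// in a new vector, leaving x untouched.
std::vector<double> squared(const std::vector<double>& x) {
    std::vector<double> out(x.size());
    std::transform(x.begin(), x.end(), out.begin(), Square<double>());
    return out;
}
```

Note that std::transform takes four arguments here: the input range, the start of the output range, and an instance of the functor (hence the trailing parentheses after Square<double>).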
Further Reading

Books by Meyers are excellent.

I also like the (free) C++ Annotations.

C++ FAQ

Resources on StackOverflow such as

· general info and its links, e.g.
· booklist
Debugging

Some tips:

· Generally painful; old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
· Extra tools such as valgrind are helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices

Version control: highly recommended to become familiar with git or svn.

Editor: in the long run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
- Overview
- Why R
- Why C++
- Vision
- Speed
- What Next
- C++ Basics
- A Better C
- Debugging
- Best Practices
Why Rcpp

· Easy to learn, as it really does not have to be that complicated; we will see numerous examples
· Easy to use, as it avoids build and OS system complexities thanks to the R infrastructure
· Expressive, as it allows for vectorised C++ using Rcpp Sugar
· Seamless access to all R objects: vector, matrix, list, S3/S4/RefClass, Environment, Function, …
· Speed gains for a variety of tasks: Rcpp excels precisely where R struggles: loops, function calls, …
· Extensions: greatly facilitates access to external libraries using e.g. Rcpp modules
Overview: Speed

Speed Example 1 (due to Christian Robert)

Five different ways to compute 1/(1 + x):

    f <- function(n, x=1) for(i in 1:n) x <- 1/(1+x)
    g <- function(n, x=1) for(i in 1:n) x <- (1/(1+x))
    h <- function(n, x=1) for(i in 1:n) x <- (1+x)^(-1)
    j <- function(n, x=1) for(i in 1:n) x <- {1/{1+x}}
    k <- function(n, x=1) for(i in 1:n) x <- 1/{1+x}
    library(rbenchmark)
    N <- 1e5
    benchmark(f(N,1), g(N,1), h(N,1), j(N,1), k(N,1), order="relative")[,1:4]
         test replications elapsed relative
    5 k(N, 1)          100   6.435    1.000
    1 f(N, 1)          100   6.609    1.027
    2 g(N, 1)          100   7.757    1.205
    4 j(N, 1)          100   7.882    1.225
    3 h(N, 1)          100  11.766    1.828
Adding a C++ variant is easy:

    cppFunction('double m(int n, double x) {
        for (int i=0; i<n; i++)
            x = 1 / (1+x);
        return x;
    }')

(We will learn more about cppFunction() later.)
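As a sanity check on the loop itself, the same iteration can be written as plain C++ without Rcpp. This is only a sketch: the names `iterate_inverse` and `fixed_point` are hypothetical. Repeatedly applying x = 1/(1+x) approaches the positive solution of x = 1/(1+x), which is (sqrt(5) - 1)/2, roughly 0.618.

```cpp
#include <cmath>

// Same loop as the body passed to cppFunction(): apply x = 1/(1+x) n times.
double iterate_inverse(int n, double x) {
    for (int i = 0; i < n; i++) {
        x = 1 / (1 + x);
    }
    return x;
}

// The positive root of x*x + x - 1 = 0, i.e. the value satisfying x = 1/(1+x).
double fixed_point() {
    return (std::sqrt(5.0) - 1.0) / 2.0;
}
```

Comparing iterate_inverse(100, 1.0) against fixed_point() is a cheap way to confirm that a compiled variant computes the same thing as the R versions before timing it.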
         test replications elapsed relative
    6 m(N, 1)          100   0.170    1.000
    1 f(N, 1)          100   6.854   40.318
    5 k(N, 1)          100   7.811   45.947
    4 j(N, 1)          100   9.183   54.018
    2 g(N, 1)          100   9.489   55.818
    3 h(N, 1)          100  11.725   68.971
Speed Example 2 (due to StackOverflow)

Consider a function defined as

$$
f(n) = \begin{cases}
n & \text{when } n < 2 \\
f(n-1) + f(n-2) & \text{when } n \geq 2
\end{cases}
$$
R implementation and use:

    f <- function(n) {
        if (n < 2) return(n)
        return(f(n-1) + f(n-2))
    }

    ## Using it on first 11 arguments
    sapply(0:10, f)

    ## [1] 0 1 1 2 3 5 8 13 21 34 55
Timing:

    library(rbenchmark)
    benchmark(f(10), f(15), f(20))[,1:4]

       test replications elapsed relative
    1 f(10)          100   0.030    1.000
    2 f(15)          100   0.335   11.167
    3 f(20)          100   3.517  117.233
The same function in C++:

    int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2));
    }

deployed as

    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')

    sapply(0:10, g)

    ## [1] 0 1 1 2 3 5 8 13 21 34 55
Timing:

    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')

    library(rbenchmark)
    benchmark(f(20), g(20), order="relative")[,1:4]

       test replications elapsed relative
    2 g(20)          100   0.011    1.000
    1 f(20)          100   3.776  343.273

A nice gain of more than two orders of magnitude (about 343 times faster).
Another Angle on Speed

Run-time performance is just one example.

Time to code is another metric.

We feel quite strongly that Rcpp helps you code more succinctly, leading to fewer bugs and faster development.

A good environment helps: RStudio integrates R and C++ development quite nicely (e.g. the compiler error message parsing is very helpful) and also helps with package building.
What Next

Hands-on

Programming with C++

· C++ Basics
· Debugging
· Best Practices

and then on to Rcpp itself
C++ Basics

Compiled not Interpreted

Need to compile and link:

    #include <cstdio>

    int main(void) {
        printf("Hello, world!\n");
        return 0;
    }
Or streams output rather than printf:

    #include <iostream>

    int main(void) {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
A single g++ -o call (without -c) will compile and link.

We will now look at an example with explicit linking.
    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }
We may need to supply:

· header location via -I
· library location via -L
· library via -llibraryname

    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
![Page 28: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/28.jpg)
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 5 k(N 1) 100 6435 1000 1 f(N 1) 100 6609 1027 2 g(N 1) 100 7757 1205 4 j(N 1) 100 7882 1225 3 h(N 1) 100 11766 1828
2758
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy
cppFunction(double m(int n double x)
for (int i=0 iltn i++)x = 1 (1+x)
return x
)
(We will learn more about cppFunction() later)
2858
Speed Example 1 (due to Christian Robert)
test replications elapsed relative 6 m(N 1) 100 0170 1000 1 f(N 1) 100 6854 40318 5 k(N 1) 100 7811 45947 4 j(N 1) 100 9183 54018 2 g(N 1) 100 9489 55818 3 h(N 1) 100 11725 68971
2958
Speed Example 2 (due to StackOverflow)
Consider a function defined as
f(n) such that n when n lt 2
f(n minus 1) + f(n minus 2) when n ge 2
3058
Speed Example 2 (due to StackOverflow)
R implementation and use
f lt- function(n) if (n lt 2) return(n)return(f(n-1) + f(n-2))
Using it on first 11 argumentssapply(010 f)
[1] 0 1 1 2 3 5 8 13 21 34 55
3158
Speed Example 2 (due to StackOverflow)
Timing
library(rbenchmark)benchmark(f(10) f(15) f(20))[14]
test replications elapsed relative 1 f(10) 100 0030 1000 2 f(15) 100 0335 11167 3 f(20) 100 3517 117233
3258
Speed Example 2 (due to StackOverflow)
int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2))
deployed as
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
sapply(010 g)
[1] 0 1 1 2 3 5 8 13 21 34 55
3358
Speed Example 2 (due to StackOverflow)
Timing
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
library(rbenchmark)benchmark(f(20) g(20) order=relative)[14]
test replications elapsed relative 2 g(20) 100 0011 1000 1 f(20) 100 3776 343273
A nice gain of a few orders of magnitude3458
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that helps you code more succinctlyleading to fewer bugs and faster development
A good environment helps RStudio integrates R and C++development quite nicely (eg the compiler error messageparsing is very helpful) and also helps with package building
3558
What Next
Hands-on
Programming with C++
middot C++ Basicsmiddot Debuggingmiddot Best Practices
and then on to Rcpp itself
3758
C++ Basics
Compiled not Interpreted
Need to compile and link
include ltcstdiogt
int main(void) printf(Hello worldn)return 0
3958
Compiled not Interpreted
Or streams output rather than printf
include ltiostreamgt
int main(void) stdcout ltlt Hello world ltlt stdendlreturn 0
4058
Compiled not Interpreted
g++ -o will compile and link
We will now look at an examples with explicit linking
4158
Compiled not Interpreted
include ltcstdiogt
define MATHLIB_STANDALONEinclude ltRmathhgt
int main(void) printf(N(01) 95th percentile 98fn
qnorm(095 00 10 1 0))
4258
Compiled not Interpreted
We may need to supply
middot header location via -Imiddot library location via -Lmiddot library via -llibraryname
g++ -Iusrinclude -c qnorm_rmathcppg++ -o qnorm_rmath qnorm_rmatho -Lusrlib -lRmath
4358
Statically Typed
middot R is dynamically typed x lt- 314 x lt- foo is validmiddot In C++ each variable must be declared before first usemiddot Common types are int and long (possibly with unsigned)float and double bool as well as char
middot No standard string type though stdstring is closemiddot All these variables types are scalars which is fundamentally
different from R where everything is a vectormiddot class (and struct) allow creation of composite types
classes add behaviour to data to form objectsmiddot Variables need to be declared cannot change
4458
C++ Basics A Better C
C++ is a Better C
middot control structures similar to what R offers for while ifswitch
middot functions are similar too but note the difference inpositional-only matching also same function name butdifferent arguments allowed in C++
middot pointers and memory management very different but lotsof issues people had with C can be avoided via STL (whichis something Rcpp promotes too)
middot sometimes still useful to know what a pointer is hellip
4658
Object-Oriented
This is a second key feature of C++ and it does it differentlyfrom S3 and S4
struct Date unsigned int yearunsigned int monthunsigned int day
struct Person char firstname[20]char lastname[20]struct Date birthdayunsigned long id
4758
Object-Oriented
Object-orientation in the C++ sense matches data with codeoperating on it
class Date private
unsigned int yearunsigned int monthunsigned int date
publicvoid setDate(int y int m int d)int getDay()int getMonth()int getYear()
4858
Generic Programming and the STL
The STL promotes generic programming
For example the sequence container types vector dequeand list all support
middot push_back() to insert at the endmiddot pop_back() to remove from the frontmiddot begin() returning an iterator to the first elementmiddot end() returning an iterator to just after the last elementmiddot size() for the number of elements
but only list has push_front() and pop_front()
Other useful containers set multiset map and multimap4958
Generic Programming and the STL
Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():

    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
Generic Programming and the STL
Another key part of the STL is its algorithms:

    // 0.0 rather than 0: the initial value fixes the accumulator type
    double sum = accumulate(s.begin(), s.end(), 0.0);

Some other STL algorithms are
· find finds the first element equal to the supplied value
· count counts the number of matching elements
· transform applies a supplied function to each element
· for_each sweeps over all elements, does not alter
· inner_product computes the inner product of two vectors
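The algorithms listed above can be sketched on a std::vector<double> as follows. The wrapper names (sum_of, first_index, count_of, doubled, dot) and the helper twice are made up for this illustration; the std:: calls themselves are standard.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper passed to std::transform.
double twice(double x) { return 2.0 * x; }

double sum_of(const std::vector<double>& s) {
    return std::accumulate(s.begin(), s.end(), 0.0);
}

// Index of the first element equal to value, or s.size() if absent.
long first_index(const std::vector<double>& s, double value) {
    return static_cast<long>(std::find(s.begin(), s.end(), value) - s.begin());
}

long count_of(const std::vector<double>& s, double value) {
    return static_cast<long>(std::count(s.begin(), s.end(), value));
}

// transform writes its results through a fourth (output) iterator.
std::vector<double> doubled(const std::vector<double>& s) {
    std::vector<double> out(s.size());
    std::transform(s.begin(), s.end(), out.begin(), twice);
    return out;
}

double dot(const std::vector<double>& a, const std::vector<double>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}
```

All of these take iterator pairs rather than the container itself, which is what lets the same algorithm work across vector, deque and list.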
Template Programming
Template programming provides a ‘language within C++’: code gets evaluated during compilation.

One of the simplest template examples is

    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }

This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
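A short sketch of the template in use. It is renamed my_min here to stay clear of std::min, and the Version type is a made-up example of an "admissible type" that supplies its own operator<.

```cpp
#include <string>

template <typename T>
const T& my_min(const T& x, const T& y) {
    return y < x ? y : x;
}

// Hypothetical user-defined type: only operator< is required.
struct Version {
    int major_no;
    int minor_no;
};

bool operator<(const Version& a, const Version& b) {
    if (a.major_no != b.major_no) return a.major_no < b.major_no;
    return a.minor_no < b.minor_no;
}
```

The compiler generates a separate my_min for each type it is called with (int, double, std::string, Version), and a call with a type lacking operator< fails at compile time rather than at run time.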
Template Programming
Another template example is a class squaring its argument:

    template <typename T>
    class square : public std::unary_function<T, T> {
    public:
        T operator()(T t) const { return t * t; }
    };

which can be used along with STL algorithms (here for a container x of double, writing the results back into x):

    transform(x.begin(), x.end(), x.begin(), square<double>());
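A self-contained sketch of the same functor. It omits the std::unary_function base class, which was deprecated in C++11 and removed in C++17 and is not needed for std::transform to work. The wrapper squared is an illustrative name, not part of the slides.

```cpp
#include <algorithm>
#include <vector>

template <typename T>
class square {
public:
    T operator()(T t) const { return t * t; }
};

// Hypothetical helper: returns a copy of x with every element squared.
std::vector<double> squared(std::vector<double> x) {
    // square<double>() creates a temporary function object; the third
    // argument is the output iterator, here overwriting x in place.
    std::transform(x.begin(), x.end(), x.begin(), square<double>());
    return x;
}
```

Note that the algorithm receives an instance of the class (square<double>()), not the bare template name.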
Further Reading
Books by Meyers are excellent
I also like the (free) C++ Annotations
C++ FAQ
Resources on StackOverflow such as
· general info and its links, e.g.
· booklist
Debugging
Debugging
Some tips
· Generally painful; old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++, lldb for the clang / llvm family
· Extra tools such as valgrind are helpful for memory debugging
· “Sanitizer” (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices
Best Practices
Version control: highly recommended to become familiar with git or svn

Editor: in the long run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
Speed Example 1 (due to Christian Robert)
Adding a C++ variant is easy:

    cppFunction("
    double m(int n, double x) {
        for (int i=0; i<n; i++)
            x = 1 / (1+x);
        return x;
    }")

(We will learn more about cppFunction() later.)
Speed Example 1 (due to Christian Robert)
         test replications elapsed relative
    6 m(N, 1)          100   0.170    1.000
    1 f(N, 1)          100   6.854   40.318
    5 k(N, 1)          100   7.811   45.947
    4 j(N, 1)          100   9.183   54.018
    2 g(N, 1)          100   9.489   55.818
    3 h(N, 1)          100  11.725   68.971
Speed Example 2 (due to StackOverflow)
Consider a function defined as

$$
f(n) = \begin{cases} n & \text{when } n < 2 \\ f(n-1) + f(n-2) & \text{when } n \ge 2 \end{cases}
$$
Speed Example 2 (due to StackOverflow)
R implementation and use:

    f <- function(n) {
        if (n < 2) return(n)
        return(f(n-1) + f(n-2))
    }

    ## Using it on first 11 arguments
    sapply(0:10, f)

     [1]  0  1  1  2  3  5  8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing
    library(rbenchmark)
    benchmark(f(10), f(15), f(20))[,1:4]

       test replications elapsed relative
    1 f(10)          100   0.030    1.000
    2 f(15)          100   0.335   11.167
    3 f(20)          100   3.517  117.233
Speed Example 2 (due to StackOverflow)
    int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2));
    }

deployed as

    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')

    sapply(0:10, g)

     [1]  0  1  1  2  3  5  8 13 21 34 55
Speed Example 2 (due to StackOverflow)
Timing
    Rcpp::cppFunction('int g(int n) {
        if (n < 2) return(n);
        return(g(n-1) + g(n-2)); }')

    library(rbenchmark)
    benchmark(f(20), g(20), order="relative")[,1:4]

       test replications elapsed relative
    2 g(20)          100   0.011    1.000
    1 f(20)          100   3.776  343.273

A nice gain of a few orders of magnitude.
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that Rcpp helps you code more succinctly, leading to fewer bugs and faster development.

A good environment helps: RStudio integrates R and C++ development quite nicely (e.g. the compiler error message parsing is very helpful) and also helps with package building.
What Next
Hands-on
Programming with C++
· C++ Basics
· Debugging
· Best Practices

and then on to Rcpp itself
C++ Basics
Compiled not Interpreted
Need to compile and link:

    #include <cstdio>

    int main(void) {
        printf("Hello, world!\n");
        return 0;
    }
Compiled not Interpreted
Or streams output rather than printf:

    #include <iostream>

    int main(void) {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
Compiled not Interpreted
g++ -o will compile and link
We will now look at an example with explicit linking.
Compiled not Interpreted
    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }
Compiled not Interpreted
We may need to supply:
· header location via -I,
· library location via -L,
· library via -llibraryname

    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
Speed Example 2 (due to StackOverflow)
Timing
library(rbenchmark)benchmark(f(10) f(15) f(20))[14]
test replications elapsed relative 1 f(10) 100 0030 1000 2 f(15) 100 0335 11167 3 f(20) 100 3517 117233
3258
Speed Example 2 (due to StackOverflow)
int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2))
deployed as
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
sapply(010 g)
[1] 0 1 1 2 3 5 8 13 21 34 55
3358
Speed Example 2 (due to StackOverflow)
Timing
RcppcppFunction(int g(int n) if (n lt 2) return(n)return(g(n-1) + g(n-2)) )
library(rbenchmark)benchmark(f(20) g(20) order=relative)[14]
test replications elapsed relative 2 g(20) 100 0011 1000 1 f(20) 100 3776 343273
A nice gain of a few orders of magnitude3458
Another Angle on Speed
Run-time performance is just one example
Time to code is another metric
We feel quite strongly that helps you code more succinctlyleading to fewer bugs and faster development
A good environment helps RStudio integrates R and C++development quite nicely (eg the compiler error messageparsing is very helpful) and also helps with package building
3558
What Next
Hands-on
Programming with C++
middot C++ Basicsmiddot Debuggingmiddot Best Practices
and then on to Rcpp itself
3758
C++ Basics
Compiled not Interpreted
Need to compile and link
include ltcstdiogt
int main(void) printf(Hello worldn)return 0
3958
Compiled not Interpreted

Or streams output rather than printf:

    #include <iostream>

    int main(void) {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
Compiled not Interpreted

g++ -o will compile and link.

We will now look at an example with explicit linking.
Compiled not Interpreted

    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }
Compiled not Interpreted

We may need to supply:

- header location via -I
- library location via -L
- library via -llibraryname

For example:

    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
Statically Typed

- R is dynamically typed: x <- 3.14; x <- "foo" is valid.
- In C++, each variable must be declared before first use.
- Common types are int and long (possibly with unsigned), float and double, bool, as well as char.
- No built-in string type, though std::string from the standard library comes close.
- All these variable types are scalars, which is fundamentally different from R where everything is a vector.
- class (and struct) allow creation of composite types; classes add behaviour to data to form objects.
- Variables need to be declared, and their type cannot change afterwards.
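A small sketch may make the typing rules concrete. The function names circle_area, greet and first_squares are made up for this illustration and are not from the slides.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Every variable is declared with a type before first use, and the type is fixed.
double circle_area(double radius) {
    const double pi = 3.14159265358979;   // a scalar double
    return pi * radius * radius;
}

// std::string comes from the standard library rather than the core language.
std::string greet(const std::string& name) {
    std::string msg = "Hello, ";
    msg += name;
    return msg;
}

// Scalars hold a single value; something R-like needs a container such as std::vector.
std::vector<int> first_squares(int n) {
    assert(n >= 0);
    std::vector<int> out;
    for (int i = 1; i <= n; ++i) out.push_back(i * i);
    return out;
}

// double x = 3.14;
// x = "foo";        // would not compile: x stays a double
```

In contrast to R, first_squares(3) has to spell out both the element type (int) and the container (std::vector) it returns.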
C++ Basics: A Better C

C++ is a Better C

- control structures similar to what R offers: for, while, if, switch
- functions are similar too, but note the difference: arguments are matched by position only; also, the same function name with different arguments is allowed in C++
- pointers and memory management: very different, but lots of issues people had with C can be avoided via the STL (which is something Rcpp promotes too)
- sometimes still useful to know what a pointer is …
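Two of these points, overloading and STL-managed memory, can be sketched in a few lines. The names twice and ramp are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Overloading: one name, several argument types. The compiler selects the
// version from the types of the (positional) arguments at the call site.
int twice(int x) { return 2 * x; }
double twice(double x) { return 2.0 * x; }
std::string twice(const std::string& s) { return s + s; }

// No malloc()/free(): std::vector owns its memory and releases it
// automatically when the vector goes out of scope.
std::vector<double> ramp(int n) {
    assert(n >= 0);
    std::vector<double> v(n);
    for (int i = 0; i < n; ++i) v[i] = 0.5 * i;
    return v;
}
```

Here twice(21) resolves to the int version and twice(1.5) to the double version, with no run-time dispatch involved.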
Object-Oriented

This is a second key feature of C++, and it does it differently from S3 and S4.

    struct Date {
        unsigned int year;
        unsigned int month;
        unsigned int day;
    };

    struct Person {
        char firstname[20];
        char lastname[20];
        struct Date birthday;
        unsigned long id;
    };
Object-Oriented

Object-orientation in the C++ sense matches data with code operating on it:

    class Date {
    private:
        unsigned int year;
        unsigned int month;
        unsigned int day;
    public:
        void setDate(int y, int m, int d);
        int getDay();
        int getMonth();
        int getYear();
    };
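The class above only declares its member functions. One possible way to fill them in is sketched here; the default constructor, the const qualifiers and the coarse range check are assumptions of this example rather than part of the slide.

```cpp
#include <cassert>

class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int day;
public:
    // Assumed default (not from the slide) so that members are never uninitialised.
    Date() : year(1970), month(1), day(1) {}

    void setDate(int y, int m, int d) {
        // Only a coarse range check; a real class would validate days per month.
        assert(y >= 0 && m >= 1 && m <= 12 && d >= 1 && d <= 31);
        year = y;
        month = m;
        day = d;
    }

    // const: the getters promise not to modify the object.
    int getDay() const   { return day; }
    int getMonth() const { return month; }
    int getYear() const  { return year; }
};
```

Code outside the class can only go through setDate() and the getters; the three data members stay private.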
Generic Programming and the STL

The STL promotes generic programming.

For example, the sequence container types vector, deque and list all support

- push_back() to insert at the end
- pop_back() to remove from the end
- begin() returning an iterator to the first element
- end() returning an iterator to just after the last element
- size() for the number of elements

but only deque and list have push_front() and pop_front().

Other useful containers: set, multiset, map and multimap.
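Because the three sequence containers share this interface, one piece of code can work with all of them. A minimal sketch follows, with fill_and_trim and with_front as hypothetical names, assuming a C++11 compiler.

```cpp
#include <cassert>
#include <deque>
#include <list>
#include <vector>

// Works for any sequence container offering push_back(), pop_back() and
// size(): append 1, 2, 3, then drop the last element again.
template <typename Container>
Container fill_and_trim() {
    Container c;
    c.push_back(1);
    c.push_back(2);
    c.push_back(3);
    c.pop_back();              // removes the 3 at the end
    assert(c.size() == 2);     // 1 and 2 remain
    return c;
}

// push_front() is available for deque and list, but not for vector.
std::deque<int> with_front() {
    std::deque<int> d = fill_and_trim<std::deque<int>>();
    d.push_front(0);           // d now holds 0, 1, 2
    return d;
}
```

fill_and_trim<std::vector<int>>() and fill_and_trim<std::list<int>>() compile from the same template, which is the point of generic programming.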
Generic Programming and the STL

Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():

    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
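A self-contained version of such a traversal might look as follows. The function names print_all, nth and sum_all are invented for this sketch, and the range-based for loop assumes C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Walk the vector with a const_iterator and write one element per line.
std::string print_all(const std::vector<double>& s) {
    std::ostringstream out;
    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); ++si)
        out << *si << '\n';                // *si dereferences the iterator
    return out.str();
}

// Element at position n, reached by stepping an iterator forward from begin().
double nth(const std::vector<double>& s, std::size_t n) {
    assert(n < s.size());                  // stay inside [begin(), end())
    std::vector<double>::const_iterator si = s.begin();
    for (std::size_t i = 0; i < n; ++i) ++si;
    return *si;
}

// The same kind of traversal with a C++11 range-based for loop, which uses
// begin() and end() behind the scenes.
double sum_all(const std::vector<double>& s) {
    double total = 0.0;
    for (double x : s) total += x;
    return total;
}
```

Writing to a std::ostringstream instead of std::cout keeps the functions easy to check; swapping in std::cout gives the behaviour of the loop on the slide.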
Generic Programming and the STL

Algorithms are another key part of the STL:

    double sum = std::accumulate(s.begin(), s.end(), 0.0);

(The initial value 0.0 makes the sum accumulate as a double; a plain 0 would accumulate as an int and truncate.)

Some other STL algorithms are

- find: finds the first element equal to the supplied value
- count: counts the number of matching elements
- transform: applies a supplied function to each element
- for_each: sweeps over all elements, does not alter
- inner_product: inner product of two vectors
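Several of these algorithms are shown together in the sketch below. The wrapper names total, how_many, first_index and dot are made up for this example; the std:: calls themselves come from the headers algorithm and numeric.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Sum of all elements; the 0.0 start value keeps the accumulation in double.
double total(const std::vector<double>& s) {
    return std::accumulate(s.begin(), s.end(), 0.0);
}

// Number of elements equal to `value`.
long how_many(const std::vector<double>& s, double value) {
    return static_cast<long>(std::count(s.begin(), s.end(), value));
}

// Index of the first element equal to `value`, or -1 if there is none.
long first_index(const std::vector<double>& s, double value) {
    std::vector<double>::const_iterator it = std::find(s.begin(), s.end(), value);
    return it == s.end() ? -1 : static_cast<long>(it - s.begin());
}

// Inner product of two vectors of equal length.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());   // std::inner_product does not check this itself
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}
```

Each wrapper is a single call on an iterator range, so the same algorithms would work on a deque or list as well.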
Template Programming

Template programming provides a 'language within C++': code gets evaluated during compilation.

One of the simplest template examples is

    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }

This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
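To see the template instantiated for several types, consider the sketch below. The function is renamed my_min here (an assumption of this example) so that it cannot clash with std::min from the standard library.

```cpp
#include <cassert>
#include <string>

// Same body as min() above, renamed to avoid ambiguity with std::min.
template <typename T>
const T& my_min(const T& x, const T& y) {
    return y < x ? y : x;
}

// One template, three instantiations generated by the compiler.
void my_min_demo() {
    assert(my_min(3, 7) == 3);                  // T = int
    assert(my_min(2.5, -1.0) == -1.0);          // T = double
    assert(my_min(std::string("pear"),
                  std::string("apple")) == "apple");   // T = std::string
}
```

Mixing types, as in my_min(3, 2.5), would not compile because T cannot be deduced as both int and double.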
Template Programming

Another template example is a class squaring its argument:

    // note: std::unary_function was deprecated in C++11 and removed in C++17
    template <typename T>
    class square : public std::unary_function<T,T> {
    public:
        T operator()(T t) const {
            return t*t;
        }
    };

which can be used along with STL algorithms:

    std::transform(x.begin(), x.end(), x.begin(), square<double>());
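A complete variant that also compiles under C++17 is sketched here. It drops the std::unary_function base class, which is a deliberate change from the slide, and the names Square, square_into and squared are chosen for this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Function object: an instance can be called like a function.
template <typename T>
struct Square {
    T operator()(T t) const { return t * t; }
};

// Writes x[i] * x[i] into out[i]. std::transform does not grow the
// destination, so out must already have room for every element of x.
void square_into(const std::vector<double>& x, std::vector<double>& out) {
    assert(out.size() >= x.size());
    std::transform(x.begin(), x.end(), out.begin(), Square<double>());
}

// Convenience wrapper returning a new vector of squares.
std::vector<double> squared(const std::vector<double>& x) {
    std::vector<double> out(x.size());
    square_into(x, out);
    return out;
}
```

Note that std::transform takes an instance, Square<double>(), rather than the bare class name, plus an iterator saying where the results should go.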
Further Reading

Books by Meyers are excellent.

I also like the (free) C++ Annotations.

C++ FAQ.

Resources on StackOverflow such as

- general info and its links, e.g.
- booklist
Debugging

Some tips:

- Generally painful, old-school printf() still pervasive
- Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
- Extra tools such as valgrind helpful for memory debugging
- "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices

- Version control: highly recommended to become familiar with git or svn
- Editor: in the long run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
![Page 38: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/38.jpg)
Hands-on
Programming with C++
middot C++ Basicsmiddot Debuggingmiddot Best Practices
and then on to Rcpp itself
3758
C++ Basics
Compiled not Interpreted
Need to compile and link
include ltcstdiogt
int main(void) printf(Hello worldn)return 0
3958
Compiled not Interpreted
Or streams output rather than printf
include ltiostreamgt
int main(void) stdcout ltlt Hello world ltlt stdendlreturn 0
4058
Compiled not Interpreted
g++ -o will compile and link
We will now look at an examples with explicit linking
4158
Compiled not Interpreted
include ltcstdiogt
define MATHLIB_STANDALONEinclude ltRmathhgt
int main(void) printf(N(01) 95th percentile 98fn
qnorm(095 00 10 1 0))
4258
Compiled not Interpreted
We may need to supply
middot header location via -Imiddot library location via -Lmiddot library via -llibraryname
g++ -Iusrinclude -c qnorm_rmathcppg++ -o qnorm_rmath qnorm_rmatho -Lusrlib -lRmath
4358
Statically Typed
middot R is dynamically typed x lt- 314 x lt- foo is validmiddot In C++ each variable must be declared before first usemiddot Common types are int and long (possibly with unsigned)float and double bool as well as char
middot No standard string type though stdstring is closemiddot All these variables types are scalars which is fundamentally
different from R where everything is a vectormiddot class (and struct) allow creation of composite types
classes add behaviour to data to form objectsmiddot Variables need to be declared cannot change
4458
C++ Basics A Better C
C++ is a Better C
middot control structures similar to what R offers for while ifswitch
middot functions are similar too but note the difference inpositional-only matching also same function name butdifferent arguments allowed in C++
middot pointers and memory management very different but lotsof issues people had with C can be avoided via STL (whichis something Rcpp promotes too)
middot sometimes still useful to know what a pointer is hellip
4658
Object-Oriented
This is a second key feature of C++ and it does it differentlyfrom S3 and S4
struct Date unsigned int yearunsigned int monthunsigned int day
struct Person char firstname[20]char lastname[20]struct Date birthdayunsigned long id
4758
Object-Oriented
Object-orientation in the C++ sense matches data with codeoperating on it
class Date private
unsigned int yearunsigned int monthunsigned int date
publicvoid setDate(int y int m int d)int getDay()int getMonth()int getYear()
4858
Generic Programming and the STL
The STL promotes generic programming
For example the sequence container types vector dequeand list all support
middot push_back() to insert at the endmiddot pop_back() to remove from the frontmiddot begin() returning an iterator to the first elementmiddot end() returning an iterator to just after the last elementmiddot size() for the number of elements
but only list has push_front() and pop_front()
Other useful containers set multiset map and multimap4958
Generic Programming and the STL

Traversal of containers can be achieved via iterators, which require suitable member functions begin() and end():

    std::vector<double>::const_iterator si;
    for (si = s.begin(); si != s.end(); si++)
        std::cout << *si << std::endl;
Generic Programming and the STL

Algorithms are another key part of the STL:

    double sum = accumulate(s.begin(), s.end(), 0.0);

Some other STL algorithms are

· find finds the first element equal to the supplied value
· count counts the number of matching elements
· transform applies a supplied function to each element
· for_each sweeps over all elements, does not alter them
· inner_product computes the inner product of two vectors
Template Programming

Template programming provides a 'language within C++': code gets evaluated during compilation.

One of the simplest template examples is

    template <typename T>
    const T& min(const T& x, const T& y) {
        return y < x ? y : x;
    }

This can now be used to compute the minimum between two int variables, or double, or in fact any admissible type providing an operator<() for less-than comparison.
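Template Programming

Another template example is a class squaring its argument:

    // note: std::unary_function was deprecated in C++11 and removed in C++17
    template <typename T>
    class square : public std::unary_function<T, T> {
    public:
        T operator()(T t) const {
            return t * t;
        }
    };

which can be used along with STL algorithms:

    // assuming x is a container of double; the result overwrites x
    transform(x.begin(), x.end(), x.begin(), square<double>());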
Further Reading

Books by Meyers are excellent.

I also like the (free) C++ Annotations.

C++ FAQ

Resources on StackOverflow such as

· general info and its links, e.g.
· booklist
Debugging

Some tips:

· Generally painful; old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++, lldb for the clang / llvm family
· Extra tools such as valgrind are helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices

Version control: highly recommended to become familiar with git or svn.

Editor: in the long run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
C++ Basics

Compiled not Interpreted

Need to compile and link:

    #include <cstdio>

    int main(void) {
        printf("Hello world\n");
        return 0;
    }

Or streams output rather than printf:

    #include <iostream>

    int main(void) {
        std::cout << "Hello world" << std::endl;
        return 0;
    }

g++ -o will compile and link.

We will now look at an example with explicit linking.
Compiled not Interpreted

    #include <cstdio>

    #define MATHLIB_STANDALONE
    #include <Rmath.h>

    int main(void) {
        printf("N(0,1) 95th percentile %9.8f\n",
               qnorm(0.95, 0.0, 1.0, 1, 0));
    }

We may need to supply:

· header location via -I
· library location via -L
· library via -llibraryname

    g++ -I/usr/include -c qnorm_rmath.cpp
    g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath
![Page 44: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/44.jpg)
Compiled not Interpreted
We may need to supply
middot header location via -Imiddot library location via -Lmiddot library via -llibraryname
g++ -Iusrinclude -c qnorm_rmathcppg++ -o qnorm_rmath qnorm_rmatho -Lusrlib -lRmath
4358
Statically Typed
middot R is dynamically typed x lt- 314 x lt- foo is validmiddot In C++ each variable must be declared before first usemiddot Common types are int and long (possibly with unsigned)float and double bool as well as char
middot No standard string type though stdstring is closemiddot All these variables types are scalars which is fundamentally
different from R where everything is a vectormiddot class (and struct) allow creation of composite types
classes add behaviour to data to form objectsmiddot Variables need to be declared cannot change
4458
C++ Basics A Better C
C++ is a Better C
middot control structures similar to what R offers for while ifswitch
middot functions are similar too but note the difference inpositional-only matching also same function name butdifferent arguments allowed in C++
middot pointers and memory management very different but lotsof issues people had with C can be avoided via STL (whichis something Rcpp promotes too)
middot sometimes still useful to know what a pointer is hellip
4658
Object-Oriented
This is a second key feature of C++ and it does it differentlyfrom S3 and S4
struct Date unsigned int yearunsigned int monthunsigned int day
struct Person char firstname[20]char lastname[20]struct Date birthdayunsigned long id
4758
Object-Oriented
Object-orientation in the C++ sense matches data with codeoperating on it
class Date private
unsigned int yearunsigned int monthunsigned int date
publicvoid setDate(int y int m int d)int getDay()int getMonth()int getYear()
4858
Generic Programming and the STL
The STL promotes generic programming
For example the sequence container types vector dequeand list all support
middot push_back() to insert at the endmiddot pop_back() to remove from the frontmiddot begin() returning an iterator to the first elementmiddot end() returning an iterator to just after the last elementmiddot size() for the number of elements
but only list has push_front() and pop_front()
Other useful containers set multiset map and multimap4958
Generic Programming and the STL
Traversal of containers can be achieved via iterators whichrequire suitable member functions begin() and end()
stdvectorltdoublegtconst_iterator sifor (si=sbegin() si = send() si++)
stdcout ltlt si ltlt stdendl
5058
Generic Programming and the STL
Another key STL part are algorithms
double sum = accumulate(sbegin() send() 0)
Some other STL algorithms are
middot find finds the first element equal to the supplied valuemiddot count counts the number of matching elementsmiddot transform applies a supplied function to each elementmiddot for_each sweeps over all elements does not altermiddot inner_product inner product of two vectors
5158
Template Programming
Template programming provides a lsquolanguage within C++rsquocode gets evaluated during compilation
One of the simplest template examples is
template lttypename Tgtconst Tamp min(const Tamp x const Tamp y)
return y lt x y x
This can now be used to compute the minimum between twoint variables or double or in fact any admissible typeproviding an operatorlt() for less-than comparison
5258
Template Programming
Another template example is a class squaring its argument
template lttypename Tgtclass square public stdunary_functionltTTgt public
T operator()(T t) const return tt
which can be used along with STL algorithms
transform(xbegin() xend() square)
5358
Further Reading
Books by Meyers are excellent
I also like the (free) C++ Annotations
C++ FAQ
Resources on StackOverflow such as
middot general info and its links egmiddot booklist
5458
Debugging
Some tips:
· Generally painful; old-school printf() still pervasive
· Debuggers go along with compilers: gdb for gcc and g++; lldb for the clang / llvm family
· Extra tools such as valgrind are helpful for memory debugging
· "Sanitizer" (ASAN/UBSAN) in newer versions of g++ and clang++
Best Practices
Version control: highly recommended to become familiar with git or svn.
Editor: in the long run, recommended to learn productivity tricks for one editor: emacs, vi, eclipse, RStudio, …
Statically Typed
· R is dynamically typed: x <- 3.14; x <- "foo" is valid
· In C++, each variable must be declared before first use
· Common types are int and long (possibly with unsigned), float and double, bool, as well as char
· No built-in string type, though std::string from the standard library comes close
· All these variable types are scalars, which is fundamentally different from R where everything is a vector
· class (and struct) allow creation of composite types; classes add behaviour to data to form objects
· Variables need to be declared and cannot change their type afterwards
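To make the contrast with R concrete, here is a minimal sketch; the function name typesDemo and all values are hypothetical:

```cpp
#include <cassert>
#include <string>
#include <vector>

inline void typesDemo() {
    // Every variable is declared with a type before first use.
    int n = 42;
    double x = 3.14;
    bool flag = true;
    char c = 'R';
    std::string s = "foo";

    // x = "foo";   // would not compile: x stays a double for its lifetime

    x = n;                               // fine: the int value converts to double
    assert(x == 42.0);

    int k = static_cast<int>(3.99);      // conversion to int truncates
    assert(k == 3);

    // Scalars are the default; an R-like vector has to be asked for.
    std::vector<double> v(3, x);         // three elements, each 42.0
    assert(v.size() == 3 && v[2] == 42.0);

    assert(flag && c == 'R' && s.size() == 3);
}
```

The commented-out assignment is the C++ counterpart of the R line x <- "foo": the compiler rejects it instead of silently changing the type.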
C++ Basics: A Better C
C++ is a Better C
· control structures similar to what R offers: for, while, if, switch
· functions are similar too, but note the difference in positional-only matching; also, the same function name with different arguments is allowed in C++
· pointers and memory management: very different, but lots of issues people had with C can be avoided via the STL (which is something Rcpp promotes too)
· sometimes still useful to know what a pointer is …
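The points above can be sketched in a few lines. All function names here (area, power, classify, betterCDemo) are hypothetical and chosen only for the illustration:

```cpp
#include <cassert>
#include <string>

// Overloading: same name, different argument lists.
inline double area(double radius) { return 3.14159265358979 * radius * radius; }
inline double area(double width, double height) { return width * height; }

// Arguments are matched by position only; there is no name matching as in R.
inline int power(int base, int exponent) {
    int result = 1;
    for (int i = 0; i < exponent; ++i)   // for loop, much as in R
        result *= base;
    return result;
}

// switch selects a branch on an integer value.
inline std::string classify(int n) {
    switch (n) {
        case 0:  return "zero";
        case 1:  return "one";
        default: return (n < 0) ? "negative" : "many";
    }
}

inline void betterCDemo() {
    assert(area(2.0, 3.0) == 6.0);                // two-argument version
    assert(area(1.0) > 3.14 && area(1.0) < 3.15); // one-argument version
    assert(power(2, 3) == 8);
    assert(power(3, 2) == 9);                     // swapped positions, different result
    assert(classify(1) == "one");
    assert(classify(-5) == "negative");

    // A pointer holds the address of another variable.
    int value = 10;
    int* p = &value;
    *p = 11;                                      // writes through the pointer
    assert(value == 11);
}
```

With STL containers such as std::vector, raw pointers like p are rarely needed for everyday data handling.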
Object-Oriented
This is a second key feature of C++, and it does it differently from S3 and S4.
    struct Date {
        unsigned int year;
        unsigned int month;
        unsigned int day;
    };
    struct Person {
        char firstname[20];
        char lastname[20];
        struct Date birthday;
        unsigned long id;
    };
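A sketch of how such plain structs might be filled and read: the helper makePerson, the function structDemo and the sample record are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstring>

struct Date {
    unsigned int year;
    unsigned int month;
    unsigned int day;
};

struct Person {
    char firstname[20];
    char lastname[20];
    struct Date birthday;
    unsigned long id;
};

// Hypothetical helper: copies the names into the fixed-size buffers,
// always leaving room for the terminating '\0'.
inline Person makePerson(const char* first, const char* last,
                         unsigned int y, unsigned int m, unsigned int d,
                         unsigned long id) {
    Person p;
    std::strncpy(p.firstname, first, sizeof(p.firstname) - 1);
    p.firstname[sizeof(p.firstname) - 1] = '\0';
    std::strncpy(p.lastname, last, sizeof(p.lastname) - 1);
    p.lastname[sizeof(p.lastname) - 1] = '\0';
    p.birthday.year = y;
    p.birthday.month = m;
    p.birthday.day = d;
    p.id = id;
    return p;
}

inline void structDemo() {
    Person p = makePerson("Ada", "Lovelace", 1815, 12, 10, 1UL);
    assert(std::strcmp(p.lastname, "Lovelace") == 0);
    assert(p.birthday.year == 1815);   // nested struct accessed with '.'
}
```

The struct only groups data; all code operating on it lives in free functions, which is the C style that classes improve on.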
Object-Oriented
Object-orientation in the C++ sense matches data with code operating on it:
    class Date {
    private:
        unsigned int year;
        unsigned int month;
        unsigned int date;
    public:
        void setDate(int y, int m, int d);
        int getDay();
        int getMonth();
        int getYear();
    };
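The slide only declares the member functions. One way the class could be completed is sketched below; the constructor, its default values and the const qualifiers are assumptions added here, not part of the original declaration.

```cpp
#include <cassert>

class Date {
private:
    unsigned int year;
    unsigned int month;
    unsigned int date;      // day of the month, named as on the slide
public:
    // Assumed default so that a fresh object is in a defined state.
    Date() : year(1970), month(1), date(1) {}

    void setDate(int y, int m, int d) {
        year  = static_cast<unsigned int>(y);
        month = static_cast<unsigned int>(m);
        date  = static_cast<unsigned int>(d);
    }
    int getDay() const   { return static_cast<int>(date); }
    int getMonth() const { return static_cast<int>(month); }
    int getYear() const  { return static_cast<int>(year); }
};

inline void dateDemo() {
    Date d;
    d.setDate(2015, 6, 24);
    assert(d.getYear() == 2015 && d.getMonth() == 6 && d.getDay() == 24);
    // d.year = 2016;   // would not compile: the data members are private
}
```

Access to the data goes through the public member functions only, which is the pairing of data and code the slide refers to.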
Debugging
Some tips
middot Generally painful old-school printf() still pervasivemiddot Debuggers go along with compilers gdb for gcc and g++lldb for the clang llvm family
middot Extra tools such as valgrind helpful for memory debuggingmiddot ldquoSanitizerrdquo (ASANUBSAN) in newer versions of g++ andclang++
5658
Best Practices
Best Practices
Version control highly recommended to become familiar withgit or svn
Editor in the long-run recommended to learn productivitytricks for one editor emacs vi eclipse RStudio hellip
5858
- Overview
-
- Why R
- Why C++
- Vision
- Speed
-
- What Next
- C++ Basics
-
- A Better C
-
- Debugging
- Best Practices
-
![Page 58: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/58.jpg)
Best Practices
Best Practices
Version control highly recommended to become familiar withgit or svn
Editor in the long-run recommended to learn productivitytricks for one editor emacs vi eclipse RStudio hellip
5858
- Overview
-
- Why R
- Why C++
- Vision
- Speed
-
- What Next
- C++ Basics
-
- A Better C
-
- Debugging
- Best Practices
-
![Page 59: Higher-Performance R via C++ - Part 1: Introductiondirk.eddelbuettel.com/papers/rcpp_uzuerich_2015_part1_intro.pdf · · C++11 is a big deal giving us new language features · While](https://reader034.vdocuments.us/reader034/viewer/2022042412/5f2bd3a798ca4349551a9aac/html5/thumbnails/59.jpg)
Best Practices
Version control highly recommended to become familiar withgit or svn
Editor in the long-run recommended to learn productivitytricks for one editor emacs vi eclipse RStudio hellip
5858
- Overview
-
- Why R
- Why C++
- Vision
- Speed
-
- What Next
- C++ Basics
-
- A Better C
-
- Debugging
- Best Practices
-