sql server r services: what every sql professional should know

23
Understanding SQL Server R Services Bob Ward Principal Architect, Microsoft Data Group, Tiger Team [email protected] m @bobwardms http://aka.ms/bobsql Want decks and demos now? http://aka.ms/bobwardms Credits to Joe Sack, Arvind Shyamsundar, and the SQL R Team

Upload: bob-ward

Post on 03-Mar-2017

37 views

Category:

Software


0 download

TRANSCRIPT

Page 1: SQL Server R Services: What Every SQL Professional Should Know

Understanding SQL Server R ServicesBob WardPrincipal Architect, MicrosoftData Group, Tiger Team

[email protected]@bobwardmshttp://aka.ms/bobsql

Want decks and demos now? http://aka.ms/bobwardms

Credits to Joe Sack, Arvind

Shyamsundar, and the SQL R

Team

Page 2: SQL Server R Services: What Every SQL Professional Should Know

What will we cover todayWhy SQL Server R Services?

What gets installed?

The SQL Server Extensibility Architecture

R and SQL Server Together

R with SQL Server is Scalable and SecureResource Pools, Best Practices, Monitoring, and Troubleshooting

SQL Server and R at Scale

Page 3: SQL Server R Services: What Every SQL Professional Should Know

Demo: Just Show Us!

Page 4: SQL Server R Services: What Every SQL Professional Should Know

Why SQL Server and R?

Familiar

Scalable

Secure

SQL Server and Microsoft R

T-SQLSQLOSDMVsResource GovernorXEventQuery StoreLOG filesSeconds and ms

R OpenR Client (R Studio)ScaleRR ServerR Data SourcesHours and Days

R Services (In-Database)

Page 5: SQL Server R Services: What Every SQL Professional Should Know

What Gets Installed?SQL

Server R Services

Microsoft R Server

Some differences by edition

Not installed by default in Azure

VM

Page 6: SQL Server R Services: What Every SQL Professional Should Know

What Gets Installed?

A few things to do for Azure

VM

Open Source R Package

Microsoft R Package

The offline experience

You must download both

• CUs and SPs will have new download packages• Don’t forget /IACCEPTROPENLICENSETERMS for unattended installs• Rsetup.exe and rsetup.log

Page 7: SQL Server R Services: What Every SQL Professional Should Know

The SQL Extensibility Architecture

launchpad.exe

sp_execute_external_script

sqlservr.exeNamed pipe

Each SQL instance has a

launchpad

SQLOSXEvent

MSSQLSERVER Service MSSQLLAUNCHPAD Service

“What” and “How” to “launch”

“launcher”

Windows “satellite” processsqlsatellite.d

ll

Windows “satellite” processWindows “satellite” processWindows “satellite” processWindows “satellite” process

Page 8: SQL Server R Services: What Every SQL Professional Should Know

sp_execute_external_scriptexecute sp_execute_external_script @language = N'R' , @script = N' x <- as.matrix(InputDataSet); y <- array(dim1:dim2); OutputDataSet <- as.data.frame(x %*% y);' , @input_data_1 = N' SELECT [Col1] from MyData;', @params = N'@dim1 int, @dim2 int' , @dim1 = 12, @dim2 = 15WITH RESULT SETS (([Col1] int, [Col2] int, [Col3] int, [Col4] int));

Getting started with R docs installed in

R_SERVICES\doc

R is only currently supported script language today

R script. Use a @var or read from

a fileInput data for script. Can

be any T-SQL SELECT.Parameters for script. OUTPUT

supported

Result set bindingMessages can also be returned including

STDOUT and STDERR

Page 9: SQL Server R Services: What Every SQL Professional Should Know

T-SQL and RLessons learned with customers from SQLCATSQL query tuningSome R scripts work better as T-SQL (Ex. Result set aggregation)

Develop, Train, and OperationalizeR Client to develop, explore and experimentTrain a model with sp_execute_external_script and save the result to a tableOperationalize by using sp_execute_external_script to “run” the model

sp_execute_external_script from T-SQL client

SQL Server Compute Context from R client

RODBC data source in R scripts

“Input data” queries traced like any other

query

Encapsulate in stored procedure for SQL clients

Page 10: SQL Server R Services: What Every SQL Professional Should Know

R Integration with SQL Server

sp_execute_external_script

sqlservr.exeMSSQLSERVER Service

launchpad.exe

MSSQLLAUNCHPAD Service

rlauncher.dll

BxlServer.exesqlsatellite.d

ll

rterm.exe conhost.exe

process pool

compile input data querySend message to pipeExecute input queryPush resultsPull results SNI/TCP – Comm technology as

SQLRetrieve input rows and paramsSend back results and output paramsstdout and stderr

R script

pipe CreateProcesspipe

Windows Job Object

CreateProcessSQLOSXEventScaleR“satellite”

process

interleaved

Open R

docs

rxlink.dll

Local User Account

Local User Account

Service SIDService SID

This is all local!

SATELLITE_* wait type

rlauncher.dll

rlauncher.dll

Page 11: SQL Server R Services: What Every SQL Professional Should Know

Demo: Inside SQL Server and R

Page 12: SQL Server R Services: What Every SQL Professional Should Know

R Services (In-Database) is scalableMore efficient than standalone R clientsData does not have to all fit in memoryReduced data transmission over the network

Most R Open functions are single threadedUse the ScaleR APIs for scalable R scripts that are multi-threaded on the SQL Server computer

We can stream data in parallel and batches from SQL Server

Use the power of SQL Server and R Server to develop, train, and executionSQL Server Compute ContextT-SQL queriesColumnstore indexesData compressionParallel query executionStored procedures

Enterprise Edition gives

you the optimum scalability

It’s all about using the compute power of the

server close to the data

Page 13: SQL Server R Services: What Every SQL Professional Should Know

R Services (In-Database) is secureReduced surface area and isolation

‘external scripts enabled’ required

R script execution outside of SQL Server

process space

Script execution requires explicit

permissionsp_execute_external_script requires EXECUTE ANY

EXTERNAL SCRIPT for non-admins

SQL Server login/user required and db/table

access

Satellite processes has limited privileges

Satellite processes run under local user accounts

in the SQLRUserGroup

Each execution is isolated. Different users with different accounts

Windows firewall rules to block outbound traffic

MSSQLSERVR0n

Page 14: SQL Server R Services: What Every SQL Professional Should Know

Best Practices and Performance ConsiderationsComputer with enough cores, memory, and disk speedHigh Performance Power OptionBalance memory needed by SQL Server and external poolLaunchpad needs specific privilegesBe sure SQLRUserGroup has log on local rightsRestart the SQL Server Service not stop/start (Launchpad is dependent)8dot3 notation needs to be enabled. Read more hereRemote ODBC execution requires SQLRUserGroup login20 unique users allowed to execute R scripts concurrently by defaultSQL Server Query and Index design still applyR scripts can often benefit from tuning

docs recommend min 32Gb

Default max

memory is 20% of

RAM

Need to add more?

Page 15: SQL Server R Services: What Every SQL Professional Should Know

SQL Server External Resource Pools internal, default, “user”, and now external

Controls resources for external processes through Launchpad.Default external pool and user external pools. User classifier function supportedThe controls

• MAX_CPU_PERCENT – Max CPU percentage for external processes• MAX_PROCESSES – Max number of external processes• MAX_MEMORY – Max committed memory % for external processes• AFFINITY – Control NODEs or CPUs for external processes

Windows Job Objects

Each pool requires a

separate job object

dynamicMin is 12 due to

process pool. conhost.exe doesn’t count. 0 = unlimited

Page 16: SQL Server R Services: What Every SQL Professional Should Know

Managing, Monitoring, and, TroubleshootingDMVs• dm_external_script_execution_stats – monitor ScaleR API executions• dm_external_script_requests – active script executionPerfmon counters: SQL Server:External ScriptsExtensibility LOGs and R Services LOGs in LOG\ExtensibilityLog directoryXEvents• Events for SQL Server, Launchpad, and external processes• Configuration file needed for Launchpad and external processesprocexp and procmon from sysinternalsSQL Server 2016 CU1 has some important fixesR profiler for performance tuning of R scripts

Page 17: SQL Server R Services: What Every SQL Professional Should Know

The SQL Server R Lifecycle

Install R Services (In Database)

Enable and verify

Develop model

Train and save model

Operationalize the model

Tune and configure for production

sp_configure“hello world

test”

R ClientMigrate to ScaleR

SQL Compute ContextSome R scripts to T-

SQL

Encapsulate in a stored procedure

SQL query tuningR script tuning

Batch sizeResource Governor

sp_execute_external_script

Page 20: SQL Server R Services: What Every SQL Professional Should Know

Bonus Material

Page 21: SQL Server R Services: What Every SQL Professional Should Know

What is installed for R?

R libraries• library

R documentation• doc

R tools• bin

Microsoft ScaleR libraries• library\RevoScaleR

<sql install dir>\Microsoft SQL Server\MSSQL13.<instance>\R_SERVICES

How to install other R packages

SQL specific binaries are installed in MSSQL\BINN

Page 22: SQL Server R Services: What Every SQL Professional Should Know

R Integration with SQL Server – SQL Server Compute Context

sqlservr.exe

BxlServer.exesqlsatellite.d

ll

R Client

rterm.exeOpen R

rxlink.dll

BxlServer.exe

CreateProcesspipe

launchpad.exe

rterm.exe

sqlCompute <- RxInSqlServer(connectionString = sqlConnString, wait = TRUE, consoleOutput = TRUE)rxSetComputeContext("sqlCompute")

sp_execute_external-script

RODBC

Page 23: SQL Server R Services: What Every SQL Professional Should Know

© Copyright Microsoft Corporation. All rights reserved.