Commit d3e9608

Merge pull request MicrosoftDocs#3886 from jeannt/cu3update
updated download info for CU2
2 parents 5ba44de + 1e5dc87 commit d3e9608

6 files changed: +106 -176 lines changed

docs/advanced-analytics/r/installing-ml-components-without-internet-access.md

Lines changed: 7 additions & 2 deletions
@@ -1,7 +1,7 @@
 ---
 title: "Installing machine learning components without internet access | Microsoft Docs"
 ms.custom: ""
-ms.date: "10/31/2017"
+ms.date: "11/30/2017"
 ms.prod:
 - "sql-server-2016"
 - "sql-server-2017"
@@ -30,7 +30,7 @@ Typically, setup of the machine components used in SQL Server 2016 and SQL Serve

 + **If the computer has an internet connection**

-SQL Server locates and download the components for you, then installs them during setup. You must accept the license terms separately for each open source component (R or Python) that you install.
+SQL Server locates and downloads the components for you, and then installs them during setup. Accept the license terms separately for each open source component (R or Python) that you install.

 + **If the computer does not have internet access**

@@ -126,6 +126,11 @@ Microsoft R Open |use previous|
 Microsoft R Server |[SRS_9.2.0.100_1033.cab](https://go.microsoft.com/fwlink/?LinkId=851501)|
 Microsoft Python Open |use previous |
 Microsoft Python Server |[SPS_9.2.0.100_1033.cab](https://go.microsoft.com/fwlink/?LinkId=851500) |
+**SQL Server 2017 CU2** |
+Microsoft R Open |use previous|
+Microsoft R Server |use previous|
+Microsoft Python Open |use previous |
+Microsoft Python Server |use previous|

 ### <a name="bkmk_2016Installers"></a>Downloads for SQL Server 2016

docs/advanced-analytics/tutorials/sqldev-operationalize-the-model.md

Lines changed: 72 additions & 123 deletions
@@ -1,8 +1,10 @@
 ---
 title: "Lesson 6: Operationalize the R model| Microsoft Docs"
 ms.custom: ""
-ms.date: "08/23/2016"
-ms.prod: "sql-server-2016"
+ms.date: "11/10/2017"
+ms.prod:
+- "sql-server-2016"
+- "sql-server-2017"
 ms.reviewer: ""
 ms.suite: ""
 ms.technology:
@@ -18,7 +20,7 @@ ms.assetid: 52b05828-11f5-4ce3-9010-59c213a674d1
 caps.latest.revision: 11
 author: "jeannt"
 ms.author: "jeannt"
-manager: "jhubbard"
+manager: "cgronlund"
 ms.workload: "Inactive"
 ---
 # Lesson 6: Operationalize the R model
@@ -38,40 +40,36 @@ First, let's see how scoring works in general.
 The stored procedure _PredictTip_ illustrates the basic syntax for wrapping a prediction call in a stored procedure.

 ```SQL
-CREATE PROCEDURE [dbo].[PredictTip] @inquery nvarchar(max)
-AS
-BEGIN
-
-DECLARE @lmodel2 varbinary(max) = (SELECT TOP 1 model
-FROM nyc_taxi_models);
-EXEC sp_execute_external_script @language = N'R',
-@script = N'
-mod <- unserialize(as.raw(model));
-print(summary(mod))
-OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet, outData = NULL,
-predVarNames = "Score", type = "response", writeModelVars = FALSE, overwrite = TRUE);
-str(OutputDataSet)
-print(OutputDataSet)
-',
+CREATE PROCEDURE [dbo].[PredictTip] @inquery nvarchar(max)
+AS
+BEGIN
+
+DECLARE @lmodel2 varbinary(max) = (SELECT TOP 1 model FROM nyc_taxi_models);
+EXEC sp_execute_external_script @language = N'R',
+@script = N'
+mod <- unserialize(as.raw(model));
+print(summary(mod))
+OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet, outData = NULL, predVarNames = "Score", type = "response", writeModelVars = FALSE, overwrite = TRUE);
+str(OutputDataSet)
+print(OutputDataSet)
+',
 @input_data_1 = @inquery,
 @params = N'@model varbinary(max)',
-@model = @lmodel2
-WITH RESULT SETS ((Score float));
-
-END
-
+@model = @lmodel2
+WITH RESULT SETS ((Score float));
+END
 GO
 ```

-- The SELECT statement gets the serialized model from the database, and stores the model in the R variable `mod` for further processing using R.
++ The SELECT statement gets the serialized model from the database, and stores the model in the R variable `mod` for further processing using R.

-- The new cases for scoring are obtained from the [!INCLUDE[tsql](../../includes/tsql-md.md)] query specified in `@inquery`, the first parameter to the stored procedure. As the query data is read, the rows are saved in the default data frame, `InputDataSet`. This data frame is passed to the `rxPredict` function in R, which generates the scores.
++ The new cases for scoring are obtained from the [!INCLUDE[tsql](../../includes/tsql-md.md)] query specified in `@inquery`, the first parameter to the stored procedure. As the query data is read, the rows are saved in the default data frame, `InputDataSet`. This data frame is passed to the `rxPredict` function in R, which generates the scores.

-`OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet, outData = NULL, predVarNames = "Score", type = "response", writeModelVars = FALSE, overwrite = TRUE);`
+`OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet, outData = NULL, predVarNames = "Score", type = "response", writeModelVars = FALSE, overwrite = TRUE);`

 Because a data.frame can contain a single row, you can use the same code for batch or single scoring.

-- The value returned by the `rxPredict` function is a **float** that represents the probability that the driver gets a tip of any amount.
++ The value returned by the `rxPredict` function is a **float** that represents the probability that the driver gets a tip of any amount.

 ## Batch scoring

@@ -80,23 +78,16 @@ Now let's see how batch scoring works.
 1. Let's start by getting a smaller set of input data to work with. This query creates a "top 10" list of trips with passenger count and other features needed to make a prediction.

 ```SQL
-SELECT TOP 10 a.passenger_count AS passenger_count,
-a.trip_time_in_secs AS trip_time_in_secs,
-a.trip_distance AS trip_distance,
-a.dropoff_datetime AS dropoff_datetime,
-dbo.fnCalculateDistance(pickup_latitude, pickup_longitude, dropoff_latitude,dropoff_longitude) AS direct_distance
-FROM
-(
-SELECT medallion, hack_license, pickup_datetime, passenger_count,trip_time_in_secs,trip_distance,
-dropoff_datetime, pickup_latitude, pickup_longitude, dropoff_latitude, dropoff_longitude FROM nyctaxi_sample)a
-LEFT OUTERJOIN
-(
-SELECT medallion, hack_license, pickup_datetime
-FROM nyctaxi_sample
-TABLESAMPLE (70 percent) REPEATABLE (98052)
-)b
+SELECT TOP 10 a.passenger_count AS passenger_count, a.trip_time_in_secs AS trip_time_in_secs, a.trip_distance AS trip_distance, a.dropoff_datetime AS dropoff_datetime, dbo.fnCalculateDistance(pickup_latitude, pickup_longitude, dropoff_latitude,dropoff_longitude) AS direct_distance
+
+FROM (SELECT medallion, hack_license, pickup_datetime, passenger_count,trip_time_in_secs,trip_distance, dropoff_datetime, pickup_latitude, pickup_longitude, dropoff_latitude, dropoff_longitude FROM nyctaxi_sample)a
+
+LEFT OUTER JOIN
+
+(SELECT medallion, hack_license, pickup_datetime FROM nyctaxi_sample TABLESAMPLE (70 percent) REPEATABLE (98052) )b
+
 ON a.medallion=b.medallion AND a.hack_license=b.hack_license
-AND a.pickup_datetime=b.pickup_datetime
+AND a.pickup_datetime=b.pickup_datetime
 WHERE b.medallion IS NULL
 ```

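Note on the query in the hunk above: it combines `LEFT OUTER JOIN` with `WHERE b.medallion IS NULL`, an anti-join that keeps only the rows of `a` that have no match in the sampled subset `b` (presumably the rows held out from the 70 percent sample). The pattern can be sketched on its own as follows. The table variables `@trips` and `@training` and their values are hypothetical stand-ins for `nyctaxi_sample`, not part of the tutorial.

```SQL
-- Hypothetical stand-in for nyctaxi_sample: three trips keyed by medallion.
DECLARE @trips TABLE (medallion int PRIMARY KEY, trip_distance float);
INSERT INTO @trips VALUES (1, 0.8), (2, 2.5), (3, 5.1);

-- Hypothetical stand-in for the sampled subset (alias b in the tutorial query).
DECLARE @training TABLE (medallion int PRIMARY KEY);
INSERT INTO @training VALUES (1), (3);

-- Anti-join: the outer join leaves b.medallion NULL for unmatched rows,
-- and the WHERE clause keeps only those unmatched rows.
SELECT a.medallion, a.trip_distance
FROM @trips AS a
LEFT OUTER JOIN @training AS b
    ON a.medallion = b.medallion
WHERE b.medallion IS NULL;
```

With these sample values, only the row for medallion 2 is returned, because it is the only trip without a match in `@training`.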
@@ -118,16 +109,16 @@ Now let's see how batch scoring works.
 CREATE PROCEDURE [dbo].[PredictTipBatchMode] @inquery nvarchar(max)
 AS
 BEGIN
-DECLARE @lmodel2 varbinary(max) = (SELECT TOP 1 model
-FROM nyc_taxi_models);
-EXEC sp_execute_external_script @language = N'R',
-@script = N'
-mod <- unserialize(as.raw(model));
-print(summary(mod))
-OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet, outData = NULL, predVarNames = "Score", type = "response", writeModelVars = FALSE, overwrite = TRUE);
-str(OutputDataSet)
-print(OutputDataSet)
-',
+DECLARE @lmodel2 varbinary(max) = (SELECT TOP 1 model FROM nyc_taxi_models);
+EXEC sp_execute_external_script
+@language = N'R',
+@script = N'
+mod <- unserialize(as.raw(model));
+print(summary(mod))
+OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet, outData = NULL, predVarNames = "Score", type = "response", writeModelVars = FALSE, overwrite = TRUE);
+str(OutputDataSet)
+print(OutputDataSet)
+',
 @input_data_1 = @inquery,
 @params = N'@model varbinary(max)',
 @model = @lmodel2
@@ -140,93 +131,51 @@ Now let's see how batch scoring works.
 ```SQL
 -- Define the input data
 DECLARE @query_string nvarchar(max)
-SET @query_string='
-select top 10 a.passenger_count as passenger_count,
-a.trip_time_in_secs as trip_time_in_secs,
-a.trip_distance as trip_distance,
-a.dropoff_datetime as dropoff_datetime,
-dbo.fnCalculateDistance(pickup_latitude, pickup_longitude, dropoff_latitude,dropoff_longitude) as direct_distance
-from
-select medallion, hack_license, pickup_datetime, passenger_count,trip_time_in_secs,trip_distance,
-dropoff_datetime, pickup_latitude, pickup_longitude, dropoff_latitude, dropoff_longitude
-from nyctaxi_sample
-)a
-LEFT OUTER JOIN
-(
-SELECT medallion, hack_license, pickup_datetime
-FROM nyctaxi_sample
-TABLESAMPLE (70 percent) REPEATABLE (98052)
-)b
-ON a.medallion=b.medallion AND a.hack_license=b.hack_license AND a.pickup_datetime=b.pickup_datetime
-WHERE b.medallion is null'
+SET @query_string='SELECT TOP 10 a.passenger_count as passenger_count, a.trip_time_in_secs AS trip_time_in_secs, a.trip_distance AS trip_distance, a.dropoff_datetime AS dropoff_datetime, dbo.fnCalculateDistance(pickup_latitude, pickup_longitude, dropoff_latitude,dropoff_longitude) AS direct_distance FROM (SELECT medallion, hack_license, pickup_datetime, passenger_count,trip_time_in_secs,trip_distance, dropoff_datetime, pickup_latitude, pickup_longitude, dropoff_latitude, dropoff_longitude FROM nyctaxi_sample )a LEFT OUTER JOIN (SELECT medallion, hack_license, pickup_datetime FROM nyctaxi_sample TABLESAMPLE (70 percent) REPEATABLE (98052))b ON a.medallion=b.medallion AND a.hack_license=b.hack_license AND a.pickup_datetime=b.pickup_datetime WHERE b.medallion is null'

 -- Call the stored procedure for scoring and pass the input data
 EXEC [dbo].[PredictTip] @inquery = @query_string;
 ```

-4. The stored procedure returns a series of values representing the prediction for each of the top ten trips. However, the top trips are also single-passenger trips with a relatively short trip distance, for which the driver is unlikely to get a tip.
+4. The stored procedure returns a series of values representing the prediction for each of the top 10 trips. However, the top trips are also single-passenger trips with a relatively short trip distance, for which the driver is unlikely to get a tip.


 > [!TIP]
 >
-> Rather than returning just the yes-tip/no-tip results, you could also return the probability score for the prediction, and then apply a WHERE clause to the _Score_ column values to categorize the score as "likely to tip" or "unlikely to tip", using a threshold value such as 0.5 or 0.7. This step is not included in the stored procedure but it would be easy to implement.
+> Rather than returning just the "yes-tip" and "no-tip" results, you could also return the probability score for the prediction, and then apply a WHERE clause to the _Score_ column values to categorize the score as "likely to tip" or "unlikely to tip", using a threshold value such as 0.5 or 0.7. This step is not included in the stored procedure but it would be easy to implement.

 ## Single-row scoring

 Sometimes you want to pass in individual values from an application and get a single result based on those values. For example, you could set up an Excel worksheet, web application, or Reporting Services report to call the stored procedure and provide inputs typed or selected by users.

-In this section, you'll learn how to create single predictions using a stored procedure.
+In this section, you learn how to create single predictions using a stored procedure.

 1. Take a minute to review the code of the stored procedure _PredictTipSingleMode_, which was included as part of the download.

 ```SQL
-CREATE PROCEDURE [dbo].[PredictTipSingleMode] @passenger_count int = 0,
-@trip_distance float = 0,
-@trip_time_in_secs int = 0,
-@pickup_latitude float = 0,
-@pickup_longitude float = 0,
-@dropoff_latitude float = 0,
-@dropoff_longitude float = 0
-AS
-BEGIN
-DECLARE @inquery nvarchar(max) = N'
-SELECT * FROM [dbo].[fnEngineerFeatures](@passenger_count,
-@trip_distance,
-@trip_time_in_secs,
-@pickup_latitude,
-@pickup_longitude,
-@dropoff_latitude,
-@dropoff_longitude)
-'
-DECLARE @lmodel2 varbinary(max) = (SELECT TOP 1 model
-FROM nyc_taxi_models);
-EXEC sp_execute_external_script @language = N'R',
-@script = N'
-mod <- unserialize(as.raw(model));
-print(summary(mod))
-OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet, outData = NULL,
-predVarNames = "Score", type = "response", writeModelVars = FALSE, overwrite = TRUE);
-str(OutputDataSet)
-print(OutputDataSet)
-',
-@input_data_1 = @inquery,
-@params = N'@model varbinary(max),@passenger_count int,@trip_distance float,@trip_time_in_secs int ,
-@pickup_latitude float ,@pickup_longitude float ,@dropoff_latitude float ,@dropoff_longitude float',
-@model = @lmodel2,
-@passenger_count =@passenger_count ,
-@trip_distance=@trip_distance,
-@trip_time_in_secs=@trip_time_in_secs,
-@pickup_latitude=@pickup_latitude,
-@pickup_longitude=@pickup_longitude,
-@dropoff_latitude=@dropoff_latitude,
-@dropoff_longitude=@dropoff_longitude
+CREATE PROCEDURE [dbo].[PredictTipSingleMode] @passenger_count int = 0, @trip_distance float = 0, @trip_time_in_secs int = 0, @pickup_latitude float = 0, @pickup_longitude float = 0, @dropoff_latitude float = 0, @dropoff_longitude float = 0
+AS
+BEGIN
+DECLARE @inquery nvarchar(max) = N'SELECT * FROM [dbo].[fnEngineerFeatures](@passenger_count, @trip_distance, @trip_time_in_secs, @pickup_latitude, @pickup_longitude, @dropoff_latitude, @dropoff_longitude)';
+DECLARE @lmodel2 varbinary(max) = (SELECT TOP 1 model FROM nyc_taxi_models);
+EXEC sp_execute_external_script
+@language = N'R',
+@script = N'
+mod <- unserialize(as.raw(model));
+print(summary(mod));
+OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet, outData = NULL, predVarNames = "Score", type = "response", writeModelVars = FALSE, overwrite = TRUE);
+str(OutputDataSet);
+print(OutputDataSet);
+',
+@input_data_1 = @inquery,
+@params = N'@model varbinary(max),@passenger_count int,@trip_distance float,@trip_time_in_secs int , @pickup_latitude float ,@pickup_longitude float ,@dropoff_latitude float ,@dropoff_longitude float', @model = @lmodel2, @passenger_count =@passenger_count, @trip_distance=@trip_distance, @trip_time_in_secs=@trip_time_in_secs, @pickup_latitude=@pickup_latitude, @pickup_longitude=@pickup_longitude, @dropoff_latitude=@dropoff_latitude, @dropoff_longitude=@dropoff_longitude
 WITH RESULT SETS ((Score float));
 END
 ```

 - This stored procedure takes multiple single values as input, such as passenger count, trip distance, and so forth.

-If you call the stored procedure from an external application, make sure that the data matches the requirements of the R model. This might include ensuring that the input data can be cast or converted to an R data type, or validating data type and data length. For more information, see [Working with R Data Types](https://msdn.microsoft.com/library/mt590948.aspx).
+If you call the stored procedure from an external application, make sure that the data matches the requirements of the R model. This might include ensuring that the input data can be cast or converted to an R data type, or validating data type and data length.

 - The stored procedure creates a score based on the stored R model.

@@ -236,12 +185,12 @@ In this section, you'll learn how to create single predictions using a stored pr

 ```
 EXEC [dbo].[PredictTipSingleMode] @passenger_count = 0,
-@trip_distance float = 2.5,
-@trip_time_in_secs int = 631,
-@pickup_latitude float = 40.763958,
-@pickup_longitude float = -73.973373,
-@dropoff_latitude float = 40.782139,
-@dropoff_longitude float = 73.977303
+@trip_distance = 2.5,
+@trip_time_in_secs = 631,
+@pickup_latitude = 40.763958,
+@pickup_longitude = -73.973373,
+@dropoff_latitude = 40.782139,
+@dropoff_longitude = 73.977303
 ```

 Or, use this shorter form supported for [parameters to a stored procedure](https://docs.microsoft.com/sql/relational-databases/stored-procedures/specify-parameters):
@@ -250,7 +199,7 @@ In this section, you'll learn how to create single predictions using a stored pr
 EXEC [dbo].[PredictTipSingleMode] 1, 2.5, 631, 40.763958,-73.973373, 40.782139,-73.977303
 ```

-3. The results indicate that the probability of getting a tip is very low on these top 10 trips, since all are single-passenger trips over a relatively short distance.
+3. The results indicate that the probability of getting a tip is low on these top 10 trips, since all are single-passenger trips over a relatively short distance.

 ## Conclusions

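Note on the `[!TIP]` in this file: it suggests categorizing the _Score_ values against a threshold but leaves the step unimplemented. A minimal sketch of that idea follows, assuming the scores have already been captured in a table. The table variable `@scores`, its `trip_id` column, the sample values, and the 0.5 cutoff are all illustrative assumptions, not output from the tutorial's model.

```SQL
-- Hypothetical scores standing in for the Score column returned by PredictTip.
DECLARE @scores TABLE (trip_id int PRIMARY KEY, Score float);
INSERT INTO @scores VALUES (1, 0.12), (2, 0.55), (3, 0.81);

-- Assumed cutoff; the tip mentions values such as 0.5 or 0.7.
DECLARE @threshold float = 0.5;

-- Label each score instead of filtering, so both categories stay visible.
SELECT trip_id,
       Score,
       CASE WHEN Score >= @threshold THEN 'likely to tip'
            ELSE 'unlikely to tip'
       END AS tip_category
FROM @scores
ORDER BY trip_id;
```

To return only one category, as the tip's wording suggests, a filter such as `WHERE Score >= @threshold` could replace the `CASE` expression.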
docs/advanced-analytics/tutorials/walkthrough-create-graphs-and-plots-using-r.md

Lines changed: 3 additions & 7 deletions
@@ -1,8 +1,6 @@
 ---
 title: "Create graphs and plots using SQL and R (walkthrough) | Microsoft Docs"
-ms.custom:
-- "SQL2016_New_Updated"
-ms.date: "07/03/2017"
+ms.date: "11/10/2017"
 ms.prod: "sql-server-2016"
 ms.reviewer: ""
 ms.suite: ""
@@ -18,12 +16,12 @@ ms.assetid: 5f70f0a6-fd4a-410f-9f44-1605503f77ec
 caps.latest.revision: 16
 author: "jeannt"
 ms.author: "jeannt"
-manager: "jhubbard"
+manager: "cgronlund"
 ms.workload: "On Demand"
 ---
 # Create graphs and plots using SQL and R (walkthrough)

-In this part of the walkthrough, you'll learn techniques for generating plots and maps using R with SQL Server data. You'll create a simple histogram, to get some practice, and then develop a more complex map plot.
+In this part of the walkthrough, you learn techniques for generating plots and maps using R with SQL Server data. You create a simple histogram, to get some practice, and then develop a more complex map plot.

 ### Create a histogram

@@ -51,8 +49,6 @@ In this part of the walkthrough, you'll learn techniques for generating plots an

 Typically, database servers block Internet access. This can be inconvenient when using R packages that need to download maps or other images to generate plots. However, there is a workaround that you might find useful when developing your own applications. Basically, you generate the map representation on the client, and then overlay on the map the points that are stored as attributes in the SQL Server table.

-We'll walk you through it in this lesson.
-
 1. Define the function that creates the R plot object. The custom function *mapPlot* creates a scatter plot that uses the taxi pickup locations, and plots the number of rides that started from each location. It uses the **ggplot2** and **ggmap** packages, which should already be installed and loaded.

 ```R
