In this workshop you'll use a process and various platform components to create a SQL Server Big Data Clusters (BDC) solution you can deploy on premises, in the cloud, or in a hybrid architecture. Each module includes additional references you can follow up on to learn more. Also watch for links within the text - click each one to explore that topic.
(Make sure you check out the prerequisites page before you start. You'll need all of the items loaded there before you can proceed with the workshop.)
You'll cover the following topics in this Module:
Authentication is the process of verifying the identity of a user or service and ensuring they are who they claim to be. Authorization is the granting or denying of access to specific resources based on the requesting user's identity. This step is performed after a user is identified through authentication.
NOTE: Security will change prior to the General Availability (GA) Release. Active Directory integration is planned for production implementations.
There are three endpoints that serve as entry points to the BDC:
Endpoint | Description |
---|---|
HDFS/Spark (Knox) gateway | An HTTPS-based endpoint that proxies other endpoints. The HDFS/Spark gateway is used for accessing services like webHDFS and Livy. Wherever you see references to Knox, this is the endpoint |
Controller endpoint | The endpoint for the BDC management service that exposes REST APIs for managing the cluster. Some tools, such as Azure Data Studio, access the system using this endpoint |
Master Instance | The Tabular Data Stream (TDS) endpoint for the SQL Server master instance. Use this endpoint to connect with tools such as SQL Server Management Studio or Azure Data Studio and run Transact-SQL queries against the cluster |
You can see these endpoints in this diagram:
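The following sketch shows how each of the three endpoints is typically reached. The addresses, ports, and the mssqlctl CLI shown here are assumptions based on pre-GA releases - substitute the values for your own cluster:

```shell
# Placeholder address and port for the HDFS/Spark (Knox) gateway -- assumptions, not fixed values
GATEWAY_HOST=10.0.0.4
KNOX_PORT=30443

# Compose the webHDFS URL proxied through Knox and list the HDFS root directory
WEBHDFS_URL="https://${GATEWAY_HOST}:${KNOX_PORT}/gateway/default/webhdfs/v1/?op=LISTSTATUS"
echo "$WEBHDFS_URL"
# curl -sk -u root:"$KNOX_PASSWORD" "$WEBHDFS_URL"

# Controller endpoint: log in with the cluster admin CLI (uncomment against a live cluster;
# the controller address and port are placeholders)
# mssqlctl login --controller-endpoint "https://<controller-ip>:30080" --controller-username "$CONTROLLER_USERNAME"

# SQL Server master instance: a standard TDS connection (address and port are placeholders)
# sqlcmd -S "<master-ip>,31433" -U sa -P "$MSSQL_SA_PASSWORD" -Q "SELECT @@VERSION"
```

Note that the Knox gateway speaks HTTPS, the controller speaks REST over HTTPS, and the master instance speaks TDS - three different protocols, which is why each needs its own client tool.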
When you create the cluster, a number of logins are created. Some of these logins are for services to communicate with each other, and others are for end users to access the cluster. Non-SQL Server end-user passwords are currently set using environment variables. These are passwords that cluster administrators use to access services:
Use | Variable |
---|---|
Controller username | CONTROLLER_USERNAME=controller_username |
Controller password | CONTROLLER_PASSWORD=controller_password |
SQL Master SA password | MSSQL_SA_PASSWORD=controller_sa_password |
Password for accessing the HDFS/Spark endpoint | KNOX_PASSWORD=knox_password |
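These variables are exported in the shell that runs the deployment tooling before you deploy. A minimal sketch - the values below are illustrative placeholders, not defaults:

```shell
# Set the end-user credentials the deployment will bake into the cluster.
# All values here are placeholders for illustration -- choose strong passwords.
export CONTROLLER_USERNAME=admin
export CONTROLLER_PASSWORD='S0me_Strong_P@ss'
export MSSQL_SA_PASSWORD='S0me_Strong_P@ss'
export KNOX_PASSWORD='S0me_Strong_P@ss'

# Confirm the variables are visible to the deployment process
echo "Controller user: $CONTROLLER_USERNAME"
```

Any tool launched from this shell (and any child process it spawns) inherits these variables, which is how the deployment picks them up.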
Intra-cluster authentication

Upon deployment of the cluster, a number of SQL logins are created:
A special system-managed SQL login with the sysadmin role is created in the Controller SQL instance. The password for this login is captured as a Kubernetes secret. A sysadmin login, owned and managed by the Controller, is created in every SQL instance in the cluster. The Controller requires it to perform administrative tasks, such as HA setup or upgrades, on these instances. These logins are also used for intra-cluster communication between SQL instances, such as the SQL master instance communicating with a data pool.
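Because the Controller login's password is captured as a Kubernetes secret, retrieving it follows the usual kubectl pattern. The secret and namespace names below are assumptions for illustration; Kubernetes stores secret data base64-encoded, so a decode step is needed:

```shell
# Against a live cluster (secret and namespace names are assumptions, substitute your own):
# kubectl get secret <controller-login-secret> -n <cluster-namespace> \
#   -o jsonpath='{.data.password}' | base64 -d

# The base64 decode step itself, demonstrated with sample data:
ENCODED=$(printf 'example-password' | base64)
DECODED=$(printf '%s' "$ENCODED" | base64 -d)
echo "$DECODED"
```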
Note: In the current release, only basic authentication is supported. Fine-grained access control to HDFS objects and to the BDC compute and data pools is not yet available.
Intra-cluster communication with non-SQL services within the BDC, such as Livy to Spark or Spark to the storage pool, is secured using certificates. All SQL Server to SQL Server communication is secured using SQL logins.
Activity: Review Security Endpoints
In this activity, you will review the endpoints exposed on the cluster.
Steps
Open this reference, and read the information in the Service Endpoints section. This shows the addresses and ports exposed to end users.
Congratulations! You have completed this workshop on SQL Server Big Data Clusters architecture. You now have the tools, assets, and processes you need to apply this information to other applications.