Workshop: SQL Server Big Data Clusters - Architecture (CTP 3.2)

A Microsoft workshop from the SQL Server team

Security

In this workshop you'll cover using a Process and various Platform components to create a SQL Server Big Data Clusters (BDC) solution you can deploy on premises, in the cloud, or in a hybrid architecture. In each module you'll get more references, which you should follow up on to learn more. Also watch for links within the text - click on each one to explore that topic.

(Make sure you check out the prerequisites page before you start. You'll need all of the items loaded there before you can proceed with the workshop.)

You'll cover the following topics in this Module:

6.0 Managing BDC Security
6.1 Access
6.2 Authentication and Authorization

Authentication is the process of verifying the identity of a user or service and ensuring they are who they claim to be. Authorization is the granting or denying of access to specific resources based on the requesting user's identity. Authorization is performed after a user has been identified through authentication.
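The ordering of the two steps can be illustrated with a minimal Python sketch. It is not how the BDC implements security: the user store, the permission names, and the `handle_request` helper are all hypothetical and exist only to show that identity is verified first and access is decided second.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory stores, for illustration only.
SALT = os.urandom(16)


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash so that plain-text passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


USERS = {"alice": hash_password("correct horse", SALT)}
PERMISSIONS = {"alice": {"hdfs:read"}}


def authenticate(user: str, password: str) -> bool:
    """Authentication: is the caller who they claim to be?"""
    stored = USERS.get(user)
    if stored is None:
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(stored, hash_password(password, SALT))


def authorize(user: str, action: str) -> bool:
    """Authorization: may this already-identified user perform the action?"""
    return action in PERMISSIONS.get(user, set())


def handle_request(user: str, password: str, action: str) -> str:
    """Authenticate first, then authorize."""
    if not authenticate(user, password):
        return "401 unauthenticated"
    if not authorize(user, action):
        return "403 forbidden"
    return "200 ok"
```

A request with a wrong password stops at the first check, while a valid user asking for an action outside their permissions passes the first check and stops at the second.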

NOTE: Security will change prior to the General Availability (GA) Release. Active Directory integration is planned for production implementations.

The BDC exposes three endpoints as entry points:

| Endpoint | Description |
|---|---|
| HDFS/Spark (Knox) gateway | An HTTPS-based endpoint that proxies other endpoints. The HDFS/Spark gateway is used for accessing services like webHDFS and Livy. Wherever you see references to Knox, this is the endpoint. |
| Controller endpoint | The endpoint for the BDC management service that exposes REST APIs for managing the cluster. Some tools, such as Azure Data Studio, access the system using this endpoint. |
| Master Instance | The Tabular Data Stream (TDS) endpoint that database tools and applications use to connect to the SQL Server master instance in the cluster. |
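As a rough illustration of how a client might address the HDFS/Spark gateway, the following Python sketch builds (but does not send) a webHDFS directory-listing request that carries basic authentication. The host name, the port, the `/gateway/default/webhdfs/v1` path prefix, and the user name are assumptions for illustration; check the values your own cluster reports before relying on them.

```python
import base64
from urllib.parse import quote
from urllib.request import Request


def build_webhdfs_list_request(host: str, port: int, hdfs_path: str,
                               user: str, password: str) -> Request:
    """Build a webHDFS LISTSTATUS request routed through the gateway.

    The "/gateway/default/webhdfs/v1" prefix is an assumed gateway path,
    not something this module confirms.
    """
    url = (
        f"https://{host}:{port}/gateway/default/webhdfs/v1"
        f"{quote(hdfs_path)}?op=LISTSTATUS"
    )
    # HTTP basic authentication: base64("user:password") in the header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})


if __name__ == "__main__":
    # Hypothetical values; nothing is sent over the network here.
    request = build_webhdfs_list_request(
        "gateway.example", 30443, "/data", "example_user", "knox_password"
    )
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. That step is left out because it needs a reachable cluster and a trusted certificate.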

You can see these endpoints in this diagram:




When you create the cluster, a number of logins are created. Some of these logins are for services to communicate with each other, and others are for end users to access the cluster. Passwords for end users other than SQL Server logins are currently set using environment variables. Cluster administrators use these passwords to access services:

| Use | Variable |
|---|---|
| Controller username | CONTROLLER_USERNAME=controller_username |
| Controller password | CONTROLLER_PASSWORD=controller_password |
| SQL Master SA password | MSSQL_SA_PASSWORD=controller_sa_password |
| Password for accessing the HDFS/Spark endpoint | KNOX_PASSWORD=knox_password |
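Because a missing variable is easy to overlook, a small pre-deployment check can help. The sketch below is only a convenience script, not part of any BDC tooling. The variable names come from the table above, and the `missing_variables` helper is hypothetical.

```python
import os

# Variable names taken from the table above.
REQUIRED_VARIABLES = (
    "CONTROLLER_USERNAME",
    "CONTROLLER_PASSWORD",
    "MSSQL_SA_PASSWORD",
    "KNOX_PASSWORD",
)


def missing_variables(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Set these variables before deploying: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```

The function accepts an optional mapping so that it can be exercised without touching the real process environment.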

Intra-cluster authentication

Upon deployment of the cluster, a number of SQL logins are created:

A special, system-managed SQL login with the sysadmin role is created in the Controller SQL instance. The password for this login is captured as a Kubernetes (K8s) secret.

A sysadmin login, which Controller owns and manages, is created in all SQL instances in the cluster. Controller requires it to perform administrative tasks, such as HA setup or upgrade, on these instances. These logins are also used for intra-cluster communication between SQL instances, such as the SQL master instance communicating with a data pool.
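Kubernetes stores the values in a secret's `data` field base64-encoded, so a secret retrieved with `kubectl get secret <name> -o json` has to be decoded before it is readable. The following Python sketch shows that decoding step on a sample document. The secret name `example-login-secret` and the key `password` are hypothetical and do not reflect the names the BDC actually uses.

```python
import base64
import json


def decode_secret_data(secret_json: str) -> dict:
    """Decode the base64-encoded values in a Kubernetes secret's data field."""
    secret = json.loads(secret_json)
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in secret.get("data", {}).items()
    }


if __name__ == "__main__":
    # Hypothetical secret, shaped like the output of
    # `kubectl get secret example-login-secret -o json`.
    sample = json.dumps({
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "example-login-secret"},
        "data": {"password": base64.b64encode(b"S3cret!").decode("ascii")},
    })
    print(decode_secret_data(sample))
```

Base64 is an encoding, not encryption, so anyone who can read the secret object can recover the password. Access to secrets should be restricted accordingly.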

Note: In the current release, only basic authentication is supported. Fine-grained access control to HDFS objects and to the BDC compute and data pools is not yet available.

Intra-cluster communication with non-SQL services within the BDC, such as Livy to Spark or Spark to the storage pool, is secured using certificates. All SQL Server to SQL Server communication is secured using SQL logins.


Activity: Review Security Endpoints


In this activity, you will review the endpoints exposed on the cluster.

Steps

Open this reference and read the Service Endpoints section, which shows the addresses and ports exposed to end users.



For Further Study

Congratulations! You have completed this workshop on SQL Server Big Data Clusters Architecture. You now have the tools, assets, and processes you need to extrapolate this information into other applications.