| IEP | 2 |
| Title | Azure Virtual Networks for Cluster Segregation |
| Author | |
| Status | 📜 Complete |
| Type | Architecture |
| Created | 2016-11-16 |
Currently the only network connecting various infrastructure services is the public internet. Regardless of the security level of a service, if other services must connect to it they must do so over the public internet. This is not an explicit design decision, but rather a consequence of the organic, cross-datacenter manner in which infrastructure grew.
As part of the Azure migration the Jenkins project now has access to "modern" software-defined networking tools which can allow designing and implementing better network topologies than "just the public internet."
This can be accomplished with the Azure Virtual Networks feature, similar to the AWS VPC feature, which allows the definition of software-defined networks into which Azure resources (e.g. VMs, SQL Server DBs, etc.) can be deployed.
This proposal is for the creation of three Virtual Networks for the Jenkins project infrastructure:
- Public Production
- Private Production
- Development
More networks may become necessary in the future, but for now these three seem sufficient to bootstrap the project infrastructure on Azure in a reasonably sane fashion.
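As a planning aid, the three networks can be sketched with candidate address spaces and checked for overlap, since Azure Virtual Network Peering requires the peered networks to have non-overlapping address spaces. The CIDR ranges and network names below are purely hypothetical; this proposal does not specify any address plan.

```python
import ipaddress
from itertools import combinations

# Hypothetical address spaces: the proposal does not define any ranges.
VNETS = {
    "public-production": ipaddress.ip_network("10.0.0.0/16"),
    "private-production": ipaddress.ip_network("10.1.0.0/16"),
    "development": ipaddress.ip_network("10.2.0.0/16"),
}


def overlapping_pairs(vnets):
    """Return the pairs of network names whose address spaces overlap."""
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(vnets.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    # An empty list means every pair of networks could be peered later.
    print(overlapping_pairs(VNETS))
```

Keeping all three ranges disjoint from the start leaves room to peer additional networks later without renumbering.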
The "Public Production" Virtual Network would contain end-user (developer or Jenkins user) facing services such as, but not limited to:
- jenkins.io - Primary (static) website
- ci.jenkins.io - Jenkins-on-Jenkins cluster
- accounts.jenkins.io - Account app
- JIRA and Confluence
Services in this network should be appropriately protected with Network Security Groups that allow only the necessary application ports to be reached, but should otherwise be considered "public."
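The "only the necessary application ports" policy can be illustrated with a small model of how Network Security Group rules are evaluated: rules are processed in priority order (lower numbers first) and the first matching rule decides. The rule set below is a hypothetical example for a web-facing service, not a configuration taken from this proposal, and the model ignores source/destination addresses and protocols for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int          # lower number is evaluated first
    port: Optional[int]    # None matches any port
    allow: bool


# Hypothetical inbound rules for a public web service.
RULES = [
    Rule(priority=100, port=443, allow=True),
    Rule(priority=110, port=80, allow=True),
    Rule(priority=4096, port=None, allow=False),  # explicit catch-all deny
]


def is_allowed(rules, port):
    """Return the decision of the first rule (by priority) matching the port."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port is None or rule.port == port:
            return rule.allow
    return False  # nothing matched: treat as denied in this sketch


if __name__ == "__main__":
    for port in (443, 80, 22):
        print(port, is_allowed(RULES, port))
```

With this rule set, HTTPS and HTTP are reachable while SSH on port 22 falls through to the catch-all deny.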
The "Private Production" network is for services which are internal, or highly sensitive, within the Jenkins project’s infrastructure. These are services which include, but are not limited to:
- Puppet Master - the holder of all secrets for provisioning Public and Private Production-level services
- "trusted.ci" - a behind-the-scenes Jenkins cluster with release/signing keys
This network would also utilize Network Security Groups and will be peered with the Public Production network via Virtual Network Peering which allows the two networks to route between each other via the Azure network backbone using private IP addresses. This peering is required to manage services via Puppet in the Public Production network.
The Private Production network will be locked down and inaccessible from the public internet. There are, however, some contributors, such as board members and team leads, who will need access to services within the Private Production network.
These contributors will need to be granted access via a VPN Gateway in Azure. This creates some minor additional cost and management overhead per-user, but the security provided by the Private Production network enables projects such as fully automated core releases.
The "Development" network is somewhat of a "catch-all" for services which are not yet ready for production usage, as well as for testing and demonstration environments. Services which live in this environment will not have access to the Puppet Master and therefore cannot be fully provisioned in the same manner as "production" services.
ℹ️ At some point in the future, a staging Puppet master may be provisioned in this network, but that is outside the scope of this document.
+---------------------+
| |
+---------------> | Public Production <-------+
| | | |
| +---------------------+ VNet Peering
| |
| +-------------v--------+
+-------------+ | |
The Internet ---------> + VPN Gateway |-| Private Production |
+-------------+ | |
| +----------------------+
|
| +----------------+
| | |
+---------------> | Development |
| |
+----------------+
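The topology in the diagram can also be expressed as a small reachability model, which may be handy for reasoning about (or unit-testing) the intended segregation. This is one reading of the diagram under stated assumptions: the identifiers are hypothetical, "reachable" means a connection path exists before Network Security Groups are applied, and only the single peering described in this proposal is modeled.

```python
PUBLIC = "public-production"
PRIVATE = "private-production"
DEVELOPMENT = "development"

# Networks with services exposed directly to the public internet.
INTERNET_FACING = {PUBLIC, DEVELOPMENT}

# Networks that can only be entered through the VPN Gateway.
VPN_ONLY = {PRIVATE}

# Virtual Network Peerings are symmetric, so each is stored as an unordered pair.
PEERINGS = {frozenset({PUBLIC, PRIVATE})}


def reachable_from_internet(network, via_vpn=False):
    """True if a client on the public internet has a path into the network."""
    if network in INTERNET_FACING:
        return True
    return via_vpn and network in VPN_ONLY


def privately_routable(network_a, network_b):
    """True if the two networks can exchange traffic over private IP addresses."""
    if network_a == network_b:
        return True
    return frozenset({network_a, network_b}) in PEERINGS


if __name__ == "__main__":
    print(reachable_from_internet(PRIVATE))               # False
    print(reachable_from_internet(PRIVATE, via_vpn=True)) # True
    print(privately_routable(PUBLIC, PRIVATE))            # True
    print(privately_routable(DEVELOPMENT, PRIVATE))       # False
```

The last check captures the point made above about the Development network: with no peering to Private Production, services there have no private route to the Puppet Master.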
Structuring the Azure-based infrastructure across the three proposed Virtual Networks will create an additional level of service balkanization which we are currently (pre-Azure) unable to provide. Per our possible infrastructure compromise earlier this year [1], the infrastructure should be more balkanized whenever possible to reduce the impact, or remove the possibility, of incursions into Jenkins project infrastructure.
The three Virtual Networks proposed represent a "minimum" structure to get infrastructure provisioned safely into Azure from a network perspective.
A completely flat network topology was briefly considered. It would make management very easy, but would leave us with little network-based protection against unknown vulnerabilities in some of the non-end-user-facing services. Additionally, it is a requirement for at least "trusted.ci" to exist off the public internet, as it contains signing keys and other highly sensitive secrets.
Trusting Network Security Groups alone may inadvertently leave open holes in our infrastructure, whereas implementing a fundamental layer two [2] separation ensures misconfigurations and/or accidents don’t leave sensitive services in the Jenkins project infrastructure exposed.
The cost of maintenance/implementation of these networks cannot be estimated at this point in time.
Monetary cost is only a factor when routing traffic between two networks, which would be:
| Inbound data transfer | $0.01 per GB |
| Outbound data transfer | $0.01 per GB |
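To get a feel for these rates, the charge for a given amount of cross-network traffic can be computed directly. The sketch below uses only the two per-GB prices listed above; the traffic volumes in the example are hypothetical, since no traffic measurements are available in this proposal.

```python
from decimal import Decimal

# Per-GB rates quoted in this document.
INBOUND_PER_GB = Decimal("0.01")
OUTBOUND_PER_GB = Decimal("0.01")


def peering_cost(inbound_gb, outbound_gb):
    """Return the data transfer charge in USD for the given volumes in GB."""
    inbound = Decimal(str(inbound_gb)) * INBOUND_PER_GB
    outbound = Decimal(str(outbound_gb)) * OUTBOUND_PER_GB
    return inbound + outbound


if __name__ == "__main__":
    # Hypothetical month: 500 GB inbound and 200 GB outbound.
    print(peering_cost(500, 200))  # 7.00
```

At these rates, even several hundred gigabytes of Puppet and management traffic per month would cost only a few dollars.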
As of right now there is no reference implementation of the various Virtual Networks.