Install and configure the Apache NiFi dataflow automation software.
This module will download the Apache NiFi tarball to /var/tmp/.
Please make sure you have space for this file.
The tarball will be unpacked to a subdirectory under /opt/nifi by default,
where it will require about the same disk space. For ease of access, the
symlink /opt/nifi/current will point to the managed NiFi directory.
By default, NiFi stores logs, state and configuration within the installation directory. This module changes this behaviour.
The module will create /var/opt/nifi for persistent storage outside
the software install root. It also configures the following NiFi
properties to use directories under this path:
- nifi.content.repository.directory.default
- nifi.database.directory
- nifi.documentation.working.directory
- nifi.flowfile.repository.directory
- nifi.nar.working.directory
- nifi.provenance.repository.directory.default
- nifi.web.jetty.working.directory
The module will create /var/log/nifi and configure NiFi to write log files
to this directory. NiFi handles log rotation by itself. See Managing
logs for more information.
The module will create /opt/nifi/conf to store Puppet-managed configuration
files. The configuration files generated by NiFi and the flow.xml configuration
archive will also be stored here.
NiFi requires a Java Runtime Environment. NiFi 1.14.0 runs on Java 8 or Java 11.
NiFi is a download of about 1.3 GiB, and needs temporary storage and unpacked
storage as well. Ensure /opt/nifi and /var/tmp have room for the downloaded
and unpacked software.
When installing on local infrastructure, consider downloading the distribution tarballs, validating them with the Apache distribution keys, and storing them in a local repository. Adjust the configuration variables to point to your local repository. The NiFi download page also documents how to verify the integrity and authenticity of the downloaded files.
Add dependency modules to your puppet environment:
- camptocamp/systemd
- puppet/archive
- puppetlabs/inifile
- puppetlabs/stdlib
You need to ensure Java 8 or 11 is installed. If in doubt, use this module:
- puppetlabs/java
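As a minimal sketch, the puppetlabs/java module can install only a runtime rather than a full JDK by using its distribution parameter. Which Java version the resulting package provides depends on your operating system, so check that it is 8 or 11 before relying on it.

```puppet
# Install the operating system's default Java runtime (JRE only).
# Assumption: the default Java package on this platform is Java 8 or 11.
class { 'java':
  distribution => 'jre',
}
```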
By default, NiFi 1.14.0 and later starts with a self-signed TLS certificate,
listens on the lo interface only, and generates a random username and
password for access. You will need to add NiFi properties to override this.
Follow the NiFi administration guide for configuration, or see the example
further down in this README.
To download and install NiFi, include the module. This will download NiFi,
unpack it under /opt/nifi/nifi-<version>, and start the service with default
configuration and storage locations.
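In its simplest form this could look like the sketch below. It assumes Java comes from the puppetlabs/java module listed among the dependencies, and it orders Java before the NiFi service in the same way as the other examples in this README.

```puppet
# Install Java and NiFi with module defaults.
include java
include nifi

# Make sure Java is in place before the NiFi service starts.
Class['java'] -> Class['nifi::service']
```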
By default, NiFi is not available over the network. It will bind to 127.0.0.1
port 8443, using HTTPS with a self-signed certificate. To make NiFi available
over the network, you will need to ensure it listens on an external interface.
Set the property nifi.web.https.host to a hostname or an external IP address.
To change the port number, set nifi.web.https.port.
A minimal manifest for installing Java and NiFi, then making NiFi available over the network, is:
class { 'java': }
class { 'nifi':
  nifi_properties => {
    'nifi.web.https.host' => $trusted['certname'],
  },
}
Class['java'] -> Class['nifi::service']
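The port can be overridden through the same nifi_properties hash. The sketch below is illustrative only: the port number 9443 is an arbitrary choice, and passing it as a string is an assumption about what the module accepts.

```puppet
class { 'java': }
class { 'nifi':
  nifi_properties => {
    # Listen on the node's own name instead of 127.0.0.1
    'nifi.web.https.host' => $trusted['certname'],
    # Arbitrary example port; the NiFi default is 8443
    'nifi.web.https.port' => '9443',
  },
}
Class['java'] -> Class['nifi::service']
```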
This module installs a specific version of NiFi. Once a newer version of NiFi has been released, the older one will generally no longer be downloadable from the Apache download CDN. You will then need to adjust the version and download_checksum parameters:
class { 'nifi':
  version           => 'x.y.z',
  download_checksum => 'abcde...', # sha256 checksum
}
The SHA256 checksum of the NiFi tar.gz is available on the NiFi download page.
NiFi is a big download. Please consider hosting a copy locally for your own
use. To use a local repository, set the download_url, download_checksum and
version parameters.
Example using Puppet manifests:
class { 'nifi':
  version           => '1.14.0',
  download_checksum => '858e12bce1da9bef24edbff8d3369f466dd0c48a4f9892d4eb3478f896f3e68b',
  download_url      => 'https://repo.example.com/nifi/nifi-1.14.0-bin.tar.gz',
}
Example using hieradata:
include nifi
nifi::version: "1.14.0"
nifi::download_checksum: "858e12bce1da9bef24edbff8d3369f466dd0c48a4f9892d4eb3478f896f3e68b"
nifi::download_url: "https://repo.example.com/nifi/nifi-1.14.0-bin.tar.gz"
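Please keep download_url, download_checksum and version in sync: the URL,
the checksum and the version must all describe the same release. A mismatch,
for example a checksum that belongs to a different release, will cause the
Puppet run to fail or behave unexpectedly.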
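One way to reduce the risk of a mismatch is to derive the URL from the version, so only the checksum has to be updated by hand. This is a sketch: the variable name $nifi_version is hypothetical, and repo.example.com is the placeholder repository from the examples above.

```puppet
# Hypothetical local variable: the single place where the version is written.
$nifi_version = '1.14.0'

class { 'nifi':
  version           => $nifi_version,
  # The URL follows the version automatically.
  download_url      => "https://repo.example.com/nifi/nifi-${nifi_version}-bin.tar.gz",
  # The checksum must still be updated by hand for each release.
  download_checksum => '858e12bce1da9bef24edbff8d3369f466dd0c48a4f9892d4eb3478f896f3e68b',
}
```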
To set NiFi properties, like the 'sensitive properties key', add them
to the nifi_properties class parameter. Example:
class { 'nifi':
  nifi_properties => {
    'nifi.sensitive.props.key' => 'keep it secret, keep it safe',
  },
}
(I recommend you use hiera-eyaml to store this somewhat securely.)
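A sketch of how this could be wired up: the secret becomes a parameter of a profile class and is bound in Hiera, where it can be stored as an eyaml-encrypted value. The parameter name sensitive_props_key is hypothetical and not part of this module.

```puppet
class profile::nifi (
  # Hypothetical parameter. Bind it in Hiera as
  # profile::nifi::sensitive_props_key, for example as an
  # eyaml-encrypted value, instead of writing it in the manifest.
  String[1] $sensitive_props_key,
) {
  class { 'nifi':
    nifi_properties => {
      'nifi.sensitive.props.key' => $sensitive_props_key,
    },
  }
}
```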
NiFi can use TLS for traffic encryption as well as authentication. Providing the TLS certificate and key is outside the scope of this module.
This example assumes you have certificates and keys stored under /etc/pki/tls, and use the system CA trust store.
Add the puppetlabs-java_ks module to your environment to manage the Java
keystore used by NiFi.
class profile::nifi (
  Stdlib::Fqdn      $hostname         = $trusted['certname'],
  Sensitive[String] $keystorepassword = Sensitive('changeme'),
) {
  $hostcert    = "/etc/pki/tls/certs/${hostname}.pem"
  $hostprivkey = "/etc/pki/tls/private/${hostname}.pem"
  class { 'java': }
  class { 'nifi':
    nifi_properties => {
      # Web properties
      'nifi.web.https.host'            => $hostname,
      # TLS properties
      'nifi.security.keystore'         => '/opt/nifi/config/keystore.jks',
      'nifi.security.keystoreType'     => 'jks',
      'nifi.security.keystorePasswd'   => $keystorepassword,
      'nifi.security.truststore'       => '/etc/pki/ca-trust/extracted/java/cacerts',
      'nifi.security.truststoreType'   => 'jks',
      'nifi.security.truststorePasswd' => '',
    },
  }
  Package['java'] -> Service['nifi.service']
  # The keystore path must match nifi.security.keystore above
  java_ks { "${hostname}:/opt/nifi/config/keystore.jks":
    ensure      => latest,
    password    => $keystorepassword,
    certificate => $hostcert,
    private_key => $hostprivkey,
    require     => Class['nifi::config'],
    before      => Class['nifi::service'],
  }
}
To create a cluster, set the cluster class parameter to true, and add cluster
members to the cluster_nodes hash. This configures the cluster to use
ZooKeeper for shared state.
NiFi requires you to set a nifi.sensitive.props.key on all cluster nodes.
If you cluster NiFi and also override the authorizers.xml file, ensure you
also include the cluster nodes in this file.
Also, you need to configure TLS:
- Generate TLS certificates
- Set the property nifi.cluster.protocol.is.secure = true
Or continue without TLS:
- Set the property nifi.web.http.port
class profile::nifi {
  class { 'java': }
  class { 'nifi':
    cluster         => true,
    nifi_properties => {
      'nifi.sensitive.props.key' => 'a shared secret for encrypting properties',
    },
    cluster_nodes   => {
      'node1.example.com' => { 'id' => 1 },
      'node2.example.com' => { 'id' => 2 },
      'node3.example.com' => { 'id' => 3 },
    },
  }
  Class['java'] -> Class['nifi::service']
}
In addition to the clustering parameters you need to add TLS certificates from a trusted Certificate Authority for cluster communication.
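A sketch of the TLS variant, combining the cluster parameters with the property named in the list above. It assumes the keystore and truststore properties are set as in the TLS example earlier in this README (elided here), and that the module accepts the string 'true' for this property.

```puppet
class { 'nifi':
  cluster         => true,
  cluster_nodes   => {
    'node1.example.com' => { 'id' => 1 },
    'node2.example.com' => { 'id' => 2 },
    'node3.example.com' => { 'id' => 3 },
  },
  nifi_properties => {
    'nifi.sensitive.props.key'        => 'a shared secret for encrypting properties',
    'nifi.web.https.host'             => $trusted['certname'],
    # Use TLS for node-to-node cluster communication
    'nifi.cluster.protocol.is.secure' => 'true',
    # Assumption: nifi.security.keystore* and nifi.security.truststore*
    # are also set here, as in the TLS example.
  },
}
```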
User authentication is managed using the
nifi.login.identity.providers.configuration.file and
nifi.security.user.login.identity.provider properties. On a fresh install,
NiFi uses the single-user-provider. A random username and password are created
and written to the nifi-app.log file. This is documented at
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user_authentication
This module does not manage login identity provider configuration. If you want to connect NiFi to Active Directory or another LDAP server, you need to set both properties and provide the configuration file yourself.
class profile::nifi {
  $login_identity_providers = '/opt/nifi/conf/custom-login-identity-providers.xml'
  class { 'nifi':
    nifi_properties => {
      'nifi.login.identity.providers.configuration.file' => $login_identity_providers,
      'nifi.security.user.login.identity.provider'       => 'my-custom-identity-provider',
    },
  }
  $template_params = {
    # [...]
  }
  file { $login_identity_providers:
    content => epp('profile/nifi/my-custom-login-identity-providers.epp', $template_params),
    # [...]
  }
}
Authorization is managed using the nifi.authorizer.configuration.file and
nifi.security.user.authorizer properties. This is documented at
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#multi-tenant-authorization
This module manages /opt/nifi/conf/authorizers.xml to support clustering; it
is otherwise similar to the default content.
You can replace the content of this file with your own template. Use a
collector (the File <| ... |> {} syntax) to override the content parameter of
the file managed by the nifi module.
class profile::nifi {
  $authorizers = '/opt/nifi/conf/authorizers.xml'
  class { 'nifi':
    nifi_properties => {
      'nifi.authorizer.configuration.file' => $authorizers, # module default, added for clarity
      'nifi.security.user.authorizer'      => 'my-custom-authorizer-provider',
    },
  }
  $template_params = {
    # [...]
  }
  File <| title == $authorizers |> {
    content => epp('profile/nifi/my-custom-authorizers.epp', $template_params),
  }
}
Note: The example above assumes that the module parameter
nifi::config_directory is left at its default /opt/nifi/conf.
The Upgrade Recommendations list properties which should be set so that NiFi keeps the same configuration and state across upgrades.
The module has defaults for data storage outside the installation
directory. For now, you need to add settings to point to the
config_resource_dir used in the examples above:
- nifi.flow.configuration.file
- nifi.flow.configuration.archive.dir
- nifi.authorizer.configuration.file
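A sketch of what these settings might look like. The directory /opt/nifi/conf is the configuration directory described earlier in this README; the file names flow.xml.gz and archive mirror NiFi's stock defaults and are assumptions, so adjust them to your installation.

```puppet
# Assumption: /opt/nifi/conf is the module's configuration directory, and
# the file names below mirror NiFi's stock defaults.
$conf_dir = '/opt/nifi/conf'

class { 'nifi':
  nifi_properties => {
    'nifi.flow.configuration.file'        => "${conf_dir}/flow.xml.gz",
    'nifi.flow.configuration.archive.dir' => "${conf_dir}/archive",
    'nifi.authorizer.configuration.file'  => "${conf_dir}/authorizers.xml",
  },
}
```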
The NiFi logs are written to $nifi::log_directory (default /var/log/nifi).
The directory prevents access for "other", but the files within are otherwise
readable. You can use ACLs on the directory to permit access to your favourite
log reading program. The puppet-posix_acl module can be used like this:
class profile::nifi (
  $log_directory = '/var/log/nifi',
) {
  class { 'nifi':
    log_directory => $log_directory,
    # [...]
  }
  posix_acl { $log_directory:
    action     => set,
    permission => [ 'user:logreader:r-x' ],
    require    => File[$log_directory],
  }
}
This module configures NiFi to use /opt/nifi/conf/state-management.xml
instead of the ./conf/state-management.xml in the NiFi install directory. The
values in this file are NiFi defaults, apart from the local state management
directory or the cluster state management connect string.
To override this file with your own values, provide a nifi_properties class
parameter which includes nifi.state.management.configuration.file pointing to
your own file.
class profile::nifi (
  $custom_state_management = '/path/to/custom/state-management.xml',
) {
  class { 'nifi':
    nifi_properties => {
      'nifi.state.management.configuration.file' => $custom_state_management,
    },
  }
  file { $custom_state_management:
    notify => Class['nifi::service'],
  }
}
ZooKeeper connection string: The NiFi administration guide says "This should contain a list of all ZooKeeper instances in the ZooKeeper quorum", while the ZooKeeper overview says "a client connects to one node". This module assumes that the NiFi cluster runs its own ZooKeeper and lets any node connect as a client to any other node.
nifi 1          nifi 2          nifi 3
  |               |               |
zookeeper 1 --- zookeeper 2 --- zookeeper 3
Java Keystore: The NiFi administration guide says "JKS is the preferred type", while the "keytool" utility provided by the Java package says "JKS is deprecated, use PKCS12".
This module is under development, and therefore somewhat light on functionality and sensible defaults.
State management: This module configures rudimentary NiFi state management for
local state and with ZooKeeper for cluster state. The redis method is not
managed with this module.
To manage more configuration files, add a file resource of your own, and set
the related property using the nifi_properties class parameter.