See [Managing tenants](/enterprise-tenant/managing-tenants/) for details on subscriptions.
For the offloading of {{< product-c8y-iot >}} data, you need the connection settings and credentials for a cloud data lake service. During offloading, the data is written into a data lake folder named after the tenant.

{{< c8y-admon-info >}}
This section provides instructions on how to configure the data lake so that it is accessible via Dremio. More details can be found in the [Dremio data source documentation](https://docs.dremio.com/current/data-sources/). Note that you must not create the target table, which connects to the data lake, in Dremio; this is done by {{< product-c8y-iot >}} DataHub.
{{< /c8y-admon-info >}}
Note that the account type must be **StorageV2**, and the **Hierarchical namespace**
While the other settings are fixed once the initial configuration has been saved, the **AWS access key** and the **Access secret** can be changed afterwards. Click **Edit**, set new values, and either click **Save credentials** to apply the update or **Cancel** to keep the old values.

{{< c8y-admon-req >}}
An S3 bucket with default settings works. If specific security policies are applied, make sure that the minimum policy requirements listed in [https://docs.dremio.com/current/data-sources/object/s3/](https://docs.dremio.com/current/data-sources/object/s3/) are satisfied.
{{< /c8y-admon-req >}}
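As an illustration of such a policy, the sketch below grants a set of S3 actions commonly needed for reading and writing offloaded data. It is illustrative only: the authoritative minimum set of actions is defined on the Dremio page linked above, and the bucket name `EXAMPLE-BUCKET` is a placeholder.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::EXAMPLE-BUCKET",
        "arn:aws:s3:::EXAMPLE-BUCKET/*"
      ]
    }
  ]
}
```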

**Server-side encryption** is supported, while client-side encryption is not. S3 offers three key management mechanisms: SSE-S3 (keys managed by Amazon S3), SSE-KMS (keys managed via the AWS Key Management Service), and SSE-C (customer-provided keys).
```sql
convert_from(convert_to("_fragments", 'JSON'), 'UTF8') LIKE '%"c8y_IsDevice"%'
```
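To put the predicate above in context, a complete query might look as follows. This is a sketch only: the source and table names (`"DataLake"."inventory"`) and the selected columns are illustrative placeholders, not fixed {{< product-c8y-iot >}} DataHub names.

```sql
-- Illustrative paths: replace source and table names with your actual Dremio paths
SELECT id, name
FROM "DataLake"."inventory"
WHERE convert_from(convert_to("_fragments", 'JSON'), 'UTF8') LIKE '%"c8y_IsDevice"%'
```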

### Querying additional data with {{< product-c8y-iot >}} DataHub {#querying-additional-data-with-datahub}
The main use case of {{< product-c8y-iot >}} DataHub is to offload data from the internal {{< product-c8y-iot >}} database to a data lake and to query the data lake contents afterwards. In some use cases, {{< product-c8y-iot >}} DataHub is required to query additional data which is not kept in the {{< product-c8y-iot >}} platform. For a cloud environment, the
additional data must be provided as Parquet files and must be located in the data lake as configured in the initial configuration of {{< product-c8y-iot >}} DataHub. The Parquet files must not be stored in folders that are used as offloading targets, as this could corrupt the offloading pipelines of {{< product-c8y-iot >}} DataHub (if their schema does not match the schema of the Parquet files created by offloading jobs). In addition, the Parquet files must be compliant with the [Dremio limitations for Parquet files](https://docs.dremio.com/current/developer/data-formats/parquet/).
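As a sketch of how such additional data could then be queried, assume a Parquet file was uploaded to a separate data lake folder and promoted to a dataset in Dremio. All names below (`"DataLake"`, `additional`, `extra.parquet`) are hypothetical placeholders:

```sql
-- Hypothetical paths: "DataLake" is the data lake source,
-- "additional"/"extra.parquet" a user-provided file outside any offloading target folder
SELECT *
FROM "DataLake"."additional"."extra.parquet"
LIMIT 10
```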

For a dedicated environment, the additional data can be located somewhere else, provided it can be accessed via Dremio, for example, in a relational database. For performance and cost reasons, however, data and processing should always be co-located.

To import the selected configurations, click **Import**. Click **Cancel** to cancel the import.

As the export does not include whether a configuration was active, you must manually activate the configurations after an import.

For the specific case of inventory offloadings, their definition may not yet be based on views as described in [Configure inventory collection](#configuring-inventory-collection). When importing such an offloading, it will be configured so that it still reads directly from the inventory collection. It is advisable, however, to change the configuration and use a view instead in order to ensure that only relevant data is offloaded.
The inventory collection keeps track of managed objects. During offloading, the
| Column name | Column type |
| ----------- | ----------- |
| c8y_IsDevice | BOOLEAN |
| c8y_IsDeviceGroup | BOOLEAN |

The inventory collection keeps track of managed objects. Note that {{< product-c8y-iot >}} DataHub automatically filters out internal objects of the {{< product-c8y-iot >}} platform. These internal objects are also not returned when using the {{< product-c8y-iot >}} REST API. As described in [Configure inventory collection](#configuring-inventory-collection), pre-defined views over the inventory collection allow you to confine your offloading to the relevant data. Those views all share the above schema.
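For example, the boolean marker columns from the schema above can be used to restrict a query over offloaded inventory data to devices only. The table name `inventory` is a placeholder for your actual offloading target table:

```sql
-- Placeholder table name; c8y_IsDevice is the BOOLEAN column from the schema above
SELECT id, name
FROM inventory
WHERE c8y_IsDevice
```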

A managed object may change its state over time. The inventory collection also supports updates to incorporate these changes. Therefore an offloading pipeline for the inventory encompasses additional steps:
