KAWA Installation

This installation process is meant for simple on-premises deployments.

1. Prerequisites

1.a General requirements

We currently support Ubuntu 20.04, 22.04 and 24.04 LTS. Other Linux distributions should work but have not been tested.

Here is what you will need:

  • An account with the ability to run sudo on the target machine.

  • Access and credentials for our registry: GitLab registry.

  • A valid KAWA license.
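
The operating system requirement above can be checked before going further. The following is a minimal sketch, not part of the official installer: the helper name `is_supported_ubuntu` is hypothetical, and it assumes the standard `/etc/os-release` file is present on the target machine.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only for the Ubuntu LTS releases listed above.
is_supported_ubuntu() {
  # $1: distribution ID, $2: VERSION_ID (as found in /etc/os-release)
  [ "$1" = "ubuntu" ] || return 1
  case "$2" in
    20.04|22.04|24.04) return 0 ;;
    *) return 1 ;;
  esac
}

if [ -r /etc/os-release ]; then
  # Source the file inside a command substitution so its variables do not leak.
  os_id="$(. /etc/os-release && echo "$ID")"
  os_version="$(. /etc/os-release && echo "$VERSION_ID")"
  if is_supported_ubuntu "$os_id" "$os_version"; then
    echo "Supported: $os_id $os_version"
  else
    echo "Untested distribution: $os_id $os_version"
  fi
fi
```

An "untested" result does not mean the installation will fail, only that the distribution is outside the list above.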

1.b Hardware requirements

RAM

For small amounts of data (up to ~200 GB compressed), it is best to provision as much memory as the volume of data. For larger volumes serving interactive (online) queries, use a reasonable amount of RAM (128 GB or more) so that the hot data subset fits in the page cache. Even for data volumes of ~50 TB per server, 128 GB of RAM significantly improves query performance compared to 64 GB.

CPU

KAWA uses all available CPU cores to maximize performance, so more cores are better. To process hundreds of millions to billions of rows, we recommend at least 64 cores. Only the AMD64 (x86-64) architecture is supported.

Storage Subsystem

SSD is preferred. HDD is the second-best option; 7200 RPM SATA HDDs will do. The required capacity of the storage subsystem depends directly on the target analytics scope.
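
To compare a machine against the hardware guidance above, a quick inventory can help. This is a sketch using standard Linux tools only; the helper name `is_amd64` is hypothetical, and reading `/proc/meminfo` and using `lsblk` assume a typical Linux host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: true when the machine name denotes AMD64 (x86-64).
is_amd64() {
  # $1: machine name as printed by `uname -m`
  [ "$1" = "x86_64" ] || [ "$1" = "amd64" ]
}

arch="$(uname -m)"
if is_amd64 "$arch"; then
  echo "Architecture: $arch (AMD64)"
else
  echo "Architecture: $arch (not AMD64, unsupported)"
fi

# Number of online CPU cores.
echo "CPU cores: $(getconf _NPROCESSORS_ONLN 2>/dev/null || echo unknown)"

# Total RAM, read from /proc/meminfo (Linux only).
if [ -r /proc/meminfo ]; then
  mem_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
  echo "RAM: $(( ${mem_kb:-0} / 1024 / 1024 )) GiB"
fi

# Rotational flag per block device: 0 usually means SSD, 1 usually means HDD.
if command -v lsblk >/dev/null 2>&1; then
  lsblk -d -o NAME,ROTA,SIZE || true
fi
```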

2. Installation procedure

The installation procedure will install all the KAWA components:

  • A PostgreSQL database
  • A ClickHouse data warehouse
  • The KAWA server
  • The KAWA script runner

All these components can be installed separately if you wish.

2.a Installation steps

  1. Clone this repository on the target machine:
git clone https://github.com/kawa-analytics/kawa-install.git
cd kawa-install
  2. Input your token:
echo 'gldt-*******' > configuration/deploy.token
  3. Run the installation script as root:
sudo ./install.sh
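
Before launching the installer, it can be worth confirming that the token file from step 2 is in place. The sketch below is an optional pre-flight check and not part of `install.sh`; the helper name `token_ready` is hypothetical. Run it from the `kawa-install` directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the token file exists and is not empty.
token_ready() {
  # $1: path to the token file
  [ -s "$1" ]
}

token_file="configuration/deploy.token"
if token_ready "$token_file"; then
  echo "Token file found; you can now run: sudo ./install.sh"
else
  echo "Missing or empty $token_file; write your deploy token there first." >&2
fi
```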

⚡ Important: During the installation process, you will be prompted for the password of the system user on ClickHouse. Keep it safe: you will need it later in the installation.

2.b Test login on the web UI

To test the installation, connect to the web server from a web browser. By default, KAWA listens on port 8080.

The default credentials are:

login: setup-admin@kawa.io
password: changeme
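
If no browser is at hand, the port can also be probed from the command line. This is a sketch under assumptions: it supposes the server answers plain HTTP at the root path on the default port, which this README does not state, and the helper names `kawa_url` and `check_kawa` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the URL to probe (assumes plain HTTP, root path).
kawa_url() {
  # $1: host (default localhost), $2: port (default 8080)
  echo "http://${1:-localhost}:${2:-8080}/"
}

# Hypothetical helper: prints the HTTP status code, or 000 if unreachable.
check_kawa() {
  curl --silent --output /dev/null --max-time 5 \
       --write-out '%{http_code}\n' "$(kawa_url "$@")"
}

# Usage, on the machine hosting KAWA:
#   check_kawa localhost 8080
```

Any status code other than 000 indicates that something is listening on the port; the login itself still has to be tested in the browser.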


3. Initial configuration

The initial configuration can be done following the documentation hosted here: KYWY doc github.

Follow the README and then: Initial setup Notebook

4. Operations

Please refer to the full documentation here: https://github.com/kawa-analytics/kawa-docker-install

The KAWA server and the KAWA Python runner both run as the kawa-system user. Both are started as systemd services.

sudo systemctl status kawa
sudo systemctl status kawa-python-runner

You can use stop, start and restart to control the services.
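
As an illustration, the sketch below reports the state of both services in one pass and shows a restart in a comment. It uses only standard `systemctl` subcommands; the helper name `kawa_services` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the two systemd units named in this section.
kawa_services() {
  printf '%s\n' kawa kawa-python-runner
}

if command -v systemctl >/dev/null 2>&1; then
  for svc in $(kawa_services); do
    # is-active prints the unit state (active, inactive, failed, ...).
    state="$(systemctl is-active "$svc" 2>/dev/null || true)"
    echo "$svc: ${state:-unknown}"
  done
fi

# To restart both services after a configuration change:
#   sudo systemctl restart kawa kawa-python-runner
```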

4.a Log files

The log files can be found here: /var/log/kawa.

  • The server writes to kawa-standalone.log.
  • The Python runner writes to kawapythonserver.log.
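
When troubleshooting, the most recent error lines of each log are often the first thing to look at. The sketch below assumes that error lines contain the word "error" in some letter case, which this README does not specify; the helper name `recent_errors` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the last matching lines of a log file.
recent_errors() {
  # $1: log file, $2: maximum number of lines (default 20)
  [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
  grep -i 'error' "$1" | tail -n "${2:-20}"
}

for log in kawa-standalone.log kawapythonserver.log; do
  echo "== $log =="
  recent_errors "/var/log/kawa/$log" || true
done

# To follow the server log live:
#   tail -f /var/log/kawa/kawa-standalone.log
```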

4.b Configuration files

They are located in the /etc/kawa directory. The main parameters are located in the kawa.env file.
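
The sketch below shows one way to inspect kawa.env without printing secret values. It assumes the file uses plain `KEY=VALUE` lines, which is the usual convention for env files but is not confirmed here; the helper name `env_get` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the value of one variable from an env file.
env_get() {
  # $1: variable name, $2: env file (assumed to hold KEY=VALUE lines)
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

env_file="/etc/kawa/kawa.env"
if [ -r "$env_file" ]; then
  # List variable names only, so that secrets are not echoed to the terminal.
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$env_file"
fi

# Keep a backup before editing, then restart the services:
#   sudo cp /etc/kawa/kawa.env /etc/kawa/kawa.env.bak
```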

4.c User data

The /var/lib/kawa directory contains user data such as scripts and uploaded CSV files. Please make sure that its filesystem has enough free space.
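
Free space on that filesystem can be monitored with standard tools. In the sketch below, the 10 GiB threshold is a hypothetical value chosen for illustration, not a KAWA recommendation, and the helper name `free_kb` is also hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: free space, in KiB, on the filesystem holding a directory.
free_kb() {
  # $1: directory
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

data_dir="/var/lib/kawa"
min_free_kb=$((10 * 1024 * 1024))   # hypothetical 10 GiB threshold

if [ -d "$data_dir" ]; then
  avail="$(free_kb "$data_dir")"
  if [ "$avail" -lt "$min_free_kb" ]; then
    echo "Low disk space on $data_dir: $((avail / 1024)) MiB free" >&2
  else
    echo "$data_dir: $((avail / 1024)) MiB free"
  fi
fi
```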

4.d Configuring other warehouses

KAWA is compatible with the following data warehouses/data lakes:

  • ClickHouse
  • Snowflake
  • Trino
  • BigQuery
  • StarRocks

To configure them, please refer to the kawa.env file, which contains more details. Please contact support@kawa.ai for assistance with this configuration.
