Proposal: Adapt Multiple Type of Database for Harbor
Update support-multiDB.md

Basic structure

add png

feat: db proposal initial verion (#2)

Update the figure

Update adapt_multiple_type_of_database.md (#3)

* Update adapt_multiple_type_of_database.md

Some update

* Update adapt_multiple_type_of_database.md

Update authors

Update adapt_multiple_type_of_database.md

Signed-off-by: JinXingYoung <yvonnexyang@tencent.com>
JinXingYoung committed Mar 9, 2022
1 parent 29b42cd commit ff74aa5
Showing 6 changed files with 400 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -19,6 +19,7 @@ Workgroup is a virtual team of aggregating the efforts of all the interested par
|[Multi Architecture](https://github.com/goharbor/community/tree/master/workgroups/wg-multiarch)|The purpose of this working group is to enable Harbor to support multiple architectures, such as x86 arm.|ZhiPeng Yu@Alauda ([yuzp1996](https://github.com/yuzp1996))| Steven Zou@VMware ([steven-zou](https://github.com/steven-zou)) |[8 Members](https://github.com/goharbor/community/tree/master/workgroups/wg-multiarch#members)|
|[Performance](https://github.com/goharbor/community/tree/master/workgroups/wg-performance)|The purpose of this work group is to improve the performance and scalability of harbor in the use case of large data volumes.|ChenYu Zhang@Alauda ([chlins](https://github.com/chlins))| Steven Zou@VMware ([steven-zou](https://github.com/steven-zou))/ Daniel Morinigo@Alauda ([danielfbm](https://github.com/danielfbm)) |[11 Members](https://github.com/goharbor/community/tree/master/workgroups/wg-performance#members)|
|[Image Acceleration](https://github.com/goharbor/community/tree/master/workgroups/wg-image-accel)|The objective of this working group is to aggregate efforts to discuss, design and finally integrate accelerated container image format support in Harbor.|Bo Liu@AlibabaCloud ([liubogithub](https://github.com/liubogithub))|Steven Zou@VMware ([steven-zou](https://github.com/steven-zou))/Wang Yan@VMWare ([wy65701436](https://github.com/wy65701436))|[7 members](https://github.com/goharbor/community/tree/master/workgroups/wg-image-accel#members)|
|[Databases](https://github.com/goharbor/community/tree/master/workgroups/wg-databases)|The objective of this working group is to aggregate efforts to support different databases in Harbor in addition to PostgreSQL.|Yvonne@Tencent[@JinXingYoung](https://github.com/JinXingYoung)|Yiyang Huang@ByteDance[@hyy0322](https://github.com/hyy0322)|[3 members](https://github.com/goharbor/community/tree/master/workgroups/wg-databases#members)|

## Structure

383 changes: 383 additions & 0 deletions proposals/adapt_multiple_type_of_database.md
@@ -0,0 +1,383 @@
# Proposal: `Adapt Multiple Type of Database for Harbor`

Author:

- Yvonne [@JinXingYoung](https://github.com/JinXingYoung)
- Yiyang Huang [@hyy0322](https://github.com/hyy0322)
- De Chen [@cd1989](https://github.com/cd1989)
- Minglu [@ConnieHan2019](18302010059@fudan.edu.cn)

Links:

- Previous Discussion: [goharbor/harbor#6534](https://github.com/goharbor/harbor/issues/6534)
- Related PR from Others Before: [goharbor/harbor#14265](https://github.com/goharbor/harbor/pull/14265)

## Abstract

This proposal adds support for databases other than PostgreSQL in Harbor, starting with MySQL/MariaDB. It introduces an abstract DAO layer so that each database can supply its own driver implementing the interface. Harbor can then adapt to another database as long as a well-tested driver is provided.


## Background

As the previous discussion ([goharbor/harbor#6534](https://github.com/goharbor/harbor/issues/6534)) shows, a number of users (especially in China) lack experience maintaining PostgreSQL for HA, disaster recovery, and so on, and are more familiar with other databases such as MySQL/MariaDB. They would prefer to use MariaDB/MySQL instead of PostgreSQL to keep their production environments stable.

Harbor used MySQL in the past. However, the Clair scanner uses PostgreSQL as its database, so Harbor unified on PostgreSQL to stay consistent with Clair and reduce maintenance effort.

Since Harbor v2.0 uses Trivy as the default scanner instead of Clair, there is no longer a strong requirement to use PostgreSQL. It is therefore now feasible to adapt Harbor to different kinds of databases.

## Proposal

Support databases other than PostgreSQL in Harbor.

### Goals

- Keep using PostgreSQL as the default database. The implementation will be compatible with the current version of Harbor.
- Abstract the DAO layer for different types of databases.
- Support MariaDB (10.5.9) and MySQL (8.0) by implementing the corresponding drivers and resolving SQL compatibility issues.
- Provide a migration tool or guide for users to migrate data from PostgreSQL to MariaDB/MySQL.

### Non-Goals

- Support other types of databases, such as MongoDB or Oracle.
- Implement a MariaDB/MySQL operator for the internal database case.

## Implementation

### Overview

![overview.png](images/multidb/overview.png)

![initdb.png](images/multidb/initdb.png)

![dbdriver.png](images/multidb/dbdriver.png)

### Component Detail

**Common**

Add MySQL config settings: src/lib/config/metadata/metadatalist.go
```go
var (
	// ConfigList - All configure items used in harbor
	// Steps to onboard a new setting
	// 1. Add configure item in metadatalist.go
	// 2. Get/Set config settings by CfgManager
	// 3. CfgManager.Load()/CfgManager.Save() to load/save from configure storage.
	ConfigList = []Item{
		...
		{Name: common.MySQLDatabase, Scope: SystemScope, Group: DatabaseGroup, EnvKey: "MYSQL_DATABASE", DefaultValue: "registry", ItemType: &StringType{}, Editable: false},
		{Name: common.MySQLHOST, Scope: SystemScope, Group: DatabaseGroup, EnvKey: "MYSQL_HOST", DefaultValue: "mysql", ItemType: &StringType{}, Editable: false},
		{Name: common.MySQLPassword, Scope: SystemScope, Group: DatabaseGroup, EnvKey: "MYSQL_PASSWORD", DefaultValue: "root123", ItemType: &PasswordType{}, Editable: false},
		{Name: common.MySQLPort, Scope: SystemScope, Group: DatabaseGroup, EnvKey: "MYSQL_PORT", DefaultValue: "3306", ItemType: &PortType{}, Editable: false},
		{Name: common.MySQLUsername, Scope: SystemScope, Group: DatabaseGroup, EnvKey: "MYSQL_USERNAME", DefaultValue: "root", ItemType: &StringType{}, Editable: false},
		{Name: common.MySQLMaxIdleConns, Scope: SystemScope, Group: DatabaseGroup, EnvKey: "MYSQL_MAX_IDLE_CONNS", DefaultValue: "2", ItemType: &IntType{}, Editable: false},
		{Name: common.MySQLMaxOpenConns, Scope: SystemScope, Group: DatabaseGroup, EnvKey: "MYSQL_MAX_OPEN_CONNS", DefaultValue: "0", ItemType: &IntType{}, Editable: false},
		...
	}
)
```
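To illustrate how these settings resolve at runtime, here is a minimal sketch that reads the same environment keys and falls back to the default values listed above. The helper `loadMySQLFromEnv` and its `getenv` parameter are hypothetical and only for illustration; in Harbor the values are loaded through `CfgManager`, not through a function like this.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// MySQL mirrors the fields of the proposed models.MySQL struct.
type MySQL struct {
	Host, Username, Password, Database string
	Port, MaxIdleConns, MaxOpenConns   int
}

// envOr returns getenv(key), or def when the variable is unset or empty.
func envOr(getenv func(string) string, key, def string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return def
}

// envInt is like envOr but parses the value as a base-10 integer.
func envInt(getenv func(string) string, key string, def int) (int, error) {
	n, err := strconv.Atoi(envOr(getenv, key, strconv.Itoa(def)))
	if err != nil {
		return 0, fmt.Errorf("%s: %w", key, err)
	}
	return n, nil
}

// loadMySQLFromEnv is a hypothetical helper (not Harbor code). It applies the
// EnvKey/DefaultValue pairs from the config list above.
func loadMySQLFromEnv(getenv func(string) string) (*MySQL, error) {
	port, err := envInt(getenv, "MYSQL_PORT", 3306)
	if err != nil {
		return nil, err
	}
	idle, err := envInt(getenv, "MYSQL_MAX_IDLE_CONNS", 2)
	if err != nil {
		return nil, err
	}
	open, err := envInt(getenv, "MYSQL_MAX_OPEN_CONNS", 0)
	if err != nil {
		return nil, err
	}
	return &MySQL{
		Host:         envOr(getenv, "MYSQL_HOST", "mysql"),
		Port:         port,
		Username:     envOr(getenv, "MYSQL_USERNAME", "root"),
		Password:     envOr(getenv, "MYSQL_PASSWORD", "root123"),
		Database:     envOr(getenv, "MYSQL_DATABASE", "registry"),
		MaxIdleConns: idle,
		MaxOpenConns: open,
	}, nil
}

func main() {
	cfg, err := loadMySQLFromEnv(os.Getenv)
	if err != nil {
		fmt.Println("invalid MySQL settings:", err)
		return
	}
	fmt.Printf("mysql at %s:%d, database %q\n", cfg.Host, cfg.Port, cfg.Database)
}
```

Passing `getenv` as a parameter keeps the sketch easy to exercise with a fake environment in tests.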

Add DB type in configs: make/photon/prepare/utils/configs.py
```python
if external_db_configs:
    config_dict['external_database'] = True
    # harbor db
    config_dict['harbor_db_type'] = external_db_configs['harbor']['type']
    config_dict['harbor_db_host'] = external_db_configs['harbor']['host']
    config_dict['harbor_db_port'] = external_db_configs['harbor']['port']
    config_dict['harbor_db_name'] = external_db_configs['harbor']['db_name']
    config_dict['harbor_db_username'] = external_db_configs['harbor']['username']
    ...
```

MySQL struct: src/common/models/database.go
```go
type MySQL struct {
	Host         string `json:"host"`
	Port         int    `json:"port"`
	Username     string `json:"username"`
	Password     string `json:"password,omitempty"`
	Database     string `json:"database"`
	MaxIdleConns int    `json:"max_idle_conns"`
	MaxOpenConns int    `json:"max_open_conns"`
}
```

Get database info for different database types: src/common/dao/base.go
```go
func getDatabase(database *models.Database) (db Database, err error) {
	switch database.Type {
	case "", "postgresql":
		db = NewPGSQL(
			database.PostGreSQL.Host,
			strconv.Itoa(database.PostGreSQL.Port),
			database.PostGreSQL.Username,
			database.PostGreSQL.Password,
			database.PostGreSQL.Database,
			database.PostGreSQL.SSLMode,
			database.PostGreSQL.MaxIdleConns,
			database.PostGreSQL.MaxOpenConns,
		)
	case "mariadb", "mysql":
		db = NewMySQL(
			database.MySQL.Host,
			strconv.Itoa(database.MySQL.Port),
			database.MySQL.Username,
			database.MySQL.Password,
			database.MySQL.Database,
			database.MySQL.MaxIdleConns,
			database.MySQL.MaxOpenConns,
		)
	default:
		err = fmt.Errorf("invalid database: %s", database.Type)
	}
	return
}
```
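The driver selection can be tried in isolation with the self-contained sketch below. The `Database` interface here is a simplified stand-in reduced to two methods, and `connInfo`, `pgsql`, `mysql` and `newDatabase` are hypothetical names, not Harbor's actual types. The connection strings follow the key/value format accepted by common PostgreSQL drivers and the `user:password@tcp(host:port)/dbname` DSN format of go-sql-driver/mysql; the exact parameters a real driver would set are an assumption.

```go
package main

import "fmt"

// Database is a simplified stand-in for the driver abstraction.
type Database interface {
	Name() string   // driver name
	String() string // connection string (DSN)
}

// connInfo groups the connection settings shared by both drivers.
type connInfo struct {
	Host, Port, Username, Password, DBName, SSLMode string
}

type pgsql struct{ connInfo }

func (p pgsql) Name() string { return "postgres" }
func (p pgsql) String() string {
	return fmt.Sprintf("host=%s port=%s user=%s password=%s dbname=%s sslmode=%s",
		p.Host, p.Port, p.Username, p.Password, p.DBName, p.SSLMode)
}

type mysql struct{ connInfo }

func (m mysql) Name() string { return "mysql" }
func (m mysql) String() string {
	return fmt.Sprintf("%s:%s@tcp(%s:%s)/%s?charset=utf8mb4&parseTime=true",
		m.Username, m.Password, m.Host, m.Port, m.DBName)
}

// newDatabase picks a driver by type. An empty type falls back to PostgreSQL,
// and MariaDB shares the MySQL driver, matching the switch in getDatabase.
func newDatabase(dbType string, c connInfo) (Database, error) {
	switch dbType {
	case "", "postgresql":
		return pgsql{c}, nil
	case "mariadb", "mysql":
		return mysql{c}, nil
	default:
		return nil, fmt.Errorf("invalid database: %s", dbType)
	}
}

func main() {
	c := connInfo{Host: "mysql.test.cn", Port: "3306", Username: "root", Password: "root", DBName: "harbor"}
	for _, t := range []string{"", "mariadb", "oracle"} {
		db, err := newDatabase(t, c)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("type %q -> driver %s\n", t, db.Name())
	}
}
```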

MigrateDB with the SQL files for the selected database: src/migration/migration.go
```go
// MigrateDB upgrades DB schema and do necessary transformation of the data in DB
func MigrateDB(database *models.Database) error {
	var migrator *migrate.Migrate
	var err error

	// check the database schema version
	switch database.Type {
	case "", "postgresql":
		migrator, err = dao.NewMigrator(database.PostGreSQL) // migrate db with postgres sql file
	case "mariadb", "mysql":
		migrator, err = dao.NewMysqlMigrator(database.MySQL) // migrate db with mysql sql file
	default:
		// avoid continuing with a nil migrator for an unknown type
		return fmt.Errorf("invalid database: %s", database.Type)
	}

	...
}
```

Migration SQL file layout
```
harbor/
├── make
│ ├── migrations
│ │ ├── mysql
│ │ │ ├── 0001_initial_schema.up.sql // File content may differ from postgresql
│ │ │ ├── ...
│ │ ├── postgresql
│ │ │ ├── 0001_initial_schema.up.sql
│ │ │ ├── ...
```
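Because the two directories are maintained separately, it may be useful to check that every PostgreSQL migration version has a MySQL counterpart. The sketch below is one possible check, assuming file names keep the `{version}_{title}.up.sql` pattern shown in the tree; `version` and `missingVersions` are hypothetical helpers, not part of Harbor.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// version extracts the numeric prefix from a migration file name,
// for example 1 from "0001_initial_schema.up.sql".
func version(name string) (int, error) {
	i := strings.Index(name, "_")
	if i <= 0 {
		return 0, fmt.Errorf("no version prefix in %q", name)
	}
	return strconv.Atoi(name[:i])
}

// missingVersions returns, in ascending order, the versions present in
// reference but absent from candidate.
func missingVersions(reference, candidate []string) ([]int, error) {
	have := make(map[int]bool)
	for _, name := range candidate {
		v, err := version(name)
		if err != nil {
			return nil, err
		}
		have[v] = true
	}
	var missing []int
	for _, name := range reference {
		v, err := version(name)
		if err != nil {
			return nil, err
		}
		if !have[v] {
			missing = append(missing, v)
		}
	}
	sort.Ints(missing)
	return missing, nil
}

func main() {
	pg := []string{"0001_initial_schema.up.sql", "0002_1.7.0_schema.up.sql", "0003_add_replication_op_uuid.up.sql"}
	my := []string{"0001_initial_schema.up.sql", "0003_add_replication_op_uuid.up.sql"}
	missing, err := missingVersions(pg, my)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("versions missing from mysql:", missing)
}
```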

The DAO layer makes different databases compatible. Database support can be extended by adding a MySQL implementation of each DAO interface.

DAO interface in src/pkg/repository/dao/dao.go
```go
// DAO is the data access object interface for repository
type DAO interface {
	// Count returns the total count of repositories according to the query
	Count(ctx context.Context, query *q.Query) (count int64, err error)
	// List repositories according to the query
	List(ctx context.Context, query *q.Query) (repositories []*model.RepoRecord, err error)
	// Get the repository specified by ID
	Get(ctx context.Context, id int64) (repository *model.RepoRecord, err error)
	// Create the repository
	Create(ctx context.Context, repository *model.RepoRecord) (id int64, err error)
	// Delete the repository specified by ID
	Delete(ctx context.Context, id int64) (err error)
	// Update updates the repository. Only the properties specified by "props" will be updated if it is set
	Update(ctx context.Context, repository *model.RepoRecord, props ...string) (err error)
	// AddPullCount increase pull count for the specified repository
	AddPullCount(ctx context.Context, id int64, count uint64) error
	// NonEmptyRepos returns the repositories without any artifact or all the artifacts are untagged.
	NonEmptyRepos(ctx context.Context) ([]*model.RepoRecord, error)
}
```
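One way to add a MySQL implementation without duplicating every method is struct embedding: the MySQL DAO embeds the default one and overrides only the methods whose SQL differs. The sketch below is a hypothetical, trimmed-down illustration. The two-method `DAO` interface, the `New` factory and the SQL strings are assumptions and are not taken from Harbor. It relies on the fact that PostgreSQL's `ILIKE` operator does not exist in MySQL.

```go
package main

import "fmt"

// DAO is a reduced, hypothetical interface that only returns SQL text,
// so the example runs without a database.
type DAO interface {
	CountSQL() string
	SearchByNameSQL() string
}

// dao is the default implementation, written for PostgreSQL.
type dao struct{}

func (dao) CountSQL() string { return "SELECT COUNT(*) FROM repository" }

func (dao) SearchByNameSQL() string {
	// The "?" placeholder assumes an ORM layer that rewrites it per driver.
	return "SELECT repository_id, name FROM repository WHERE name ILIKE ?"
}

// mysqlDAO embeds dao and overrides only what differs on MySQL/MariaDB.
type mysqlDAO struct{ dao }

func (mysqlDAO) SearchByNameSQL() string {
	// MySQL has no ILIKE; whether LIKE is case-insensitive depends on the collation.
	return "SELECT repository_id, name FROM repository WHERE name LIKE ?"
}

// New returns the DAO for the given database type, defaulting to PostgreSQL.
func New(dbType string) DAO {
	if dbType == "mysql" || dbType == "mariadb" {
		return mysqlDAO{}
	}
	return dao{}
}

func main() {
	for _, t := range []string{"postgresql", "mysql"} {
		d := New(t)
		fmt.Printf("%s:\n  %s\n  %s\n", t, d.CountSQL(), d.SearchByNameSQL())
	}
}
```

Methods that work unchanged on both databases, such as `CountSQL` here, are inherited from the embedded struct, which keeps the MySQL-specific surface small.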

**Core**

make/photon/prepare/templates/core/env.jinja:
```
DATABASE_TYPE={{harbor_db_type}}
{% if ( harbor_db_type == "mysql" or harbor_db_type == "mariadb" ) %}
MYSQL_HOST={{harbor_db_host}}
MYSQL_PORT={{harbor_db_port}}
MYSQL_USERNAME={{harbor_db_username}}
MYSQL_PASSWORD={{harbor_db_password}}
MYSQL_DATABASE={{harbor_db_name}}
MYSQL_MAX_IDLE_CONNS={{harbor_db_max_idle_conns}}
MYSQL_MAX_OPEN_CONNS={{harbor_db_max_open_conns}}
{% else %}
POSTGRESQL_HOST={{harbor_db_host}}
POSTGRESQL_PORT={{harbor_db_port}}
POSTGRESQL_USERNAME={{harbor_db_username}}
POSTGRESQL_PASSWORD={{harbor_db_password}}
POSTGRESQL_DATABASE={{harbor_db_name}}
POSTGRESQL_SSLMODE={{harbor_db_sslmode}}
POSTGRESQL_MAX_IDLE_CONNS={{harbor_db_max_idle_conns}}
POSTGRESQL_MAX_OPEN_CONNS={{harbor_db_max_open_conns}}
{% endif %}
```

**Exporter**

make/photon/prepare/templates/exporter/env.jinja
```
HARBOR_DATABASE_TYPE={{harbor_db_type}}
HARBOR_DATABASE_HOST={{harbor_db_host}}
HARBOR_DATABASE_PORT={{harbor_db_port}}
HARBOR_DATABASE_USERNAME={{harbor_db_username}}
HARBOR_DATABASE_PASSWORD={{harbor_db_password}}
HARBOR_DATABASE_DBNAME={{harbor_db_name}}
HARBOR_DATABASE_SSLMODE={{harbor_db_sslmode}}
```

**JobService**

JobService gets its configuration from Core, then calls initDatabase with the database configuration it obtained.
src/pkg/config/manager.go
```go
// GetDatabaseCfg - Get database configurations
func (c *CfgManager) GetDatabaseCfg() *models.Database {
	ctx := context.Background()
	database := &models.Database{}
	database.Type = c.Get(ctx, common.DatabaseType).GetString()

	switch database.Type {
	case "", "postgresql":
		postgresql := &models.PostGreSQL{
			Host:         c.Get(ctx, common.PostGreSQLHOST).GetString(),
			Port:         c.Get(ctx, common.PostGreSQLPort).GetInt(),
			Username:     c.Get(ctx, common.PostGreSQLUsername).GetString(),
			Password:     c.Get(ctx, common.PostGreSQLPassword).GetString(),
			Database:     c.Get(ctx, common.PostGreSQLDatabase).GetString(),
			SSLMode:      c.Get(ctx, common.PostGreSQLSSLMode).GetString(),
			MaxIdleConns: c.Get(ctx, common.PostGreSQLMaxIdleConns).GetInt(),
			MaxOpenConns: c.Get(ctx, common.PostGreSQLMaxOpenConns).GetInt(),
		}
		database.PostGreSQL = postgresql
	case "mariadb", "mysql":
		mysql := &models.MySQL{
			Host:         c.Get(ctx, common.MySQLHOST).GetString(),
			Port:         c.Get(ctx, common.MySQLPort).GetInt(),
			Username:     c.Get(ctx, common.MySQLUsername).GetString(),
			Password:     c.Get(ctx, common.MySQLPassword).GetString(),
			Database:     c.Get(ctx, common.MySQLDatabase).GetString(),
			MaxIdleConns: c.Get(ctx, common.MySQLMaxIdleConns).GetInt(),
			MaxOpenConns: c.Get(ctx, common.MySQLMaxOpenConns).GetInt(),
		}
		database.MySQL = mysql
	}

	return database
}
```

### Migration

#### Write a document using the official migration tool

The MySQL community provides a [tool](https://dev.mysql.com/doc/workbench/en/wb-migration-database-postgresql.html) for migrating data. (Needs further testing.)

#### Write a script to migrate data

1. Define data models for PostgreSQL and MariaDB/MySQL.
2. Read data from PostgreSQL and map it to the PostgreSQL data model.
3. Convert the PostgreSQL data model to the MariaDB/MySQL data model.
4. Write the data to the MariaDB/MySQL database.
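
The four steps above can be sketched as follows. This is a hypothetical outline, not the proposed tool: the `pgRow` and `mysqlRow` models and their fields are invented for illustration, and the read and write steps are plain functions so the example runs without a database. The type choices reflect that MySQL's `SERIAL` is an unsigned bigint and that its `BOOLEAN` is stored as `TINYINT(1)`.

```go
package main

import "fmt"

// Step 1: data models for each side (fields are illustrative only).
type pgRow struct {
	ID      int64
	Name    string
	Deleted bool
}

type mysqlRow struct {
	ID      uint64 // SERIAL in MySQL is BIGINT UNSIGNED
	Name    string
	Deleted int8 // BOOLEAN in MySQL is TINYINT(1)
}

// Step 3: convert one PostgreSQL row to its MySQL form.
func convert(r pgRow) (mysqlRow, error) {
	if r.ID < 0 {
		return mysqlRow{}, fmt.Errorf("row %q: negative id %d", r.Name, r.ID)
	}
	out := mysqlRow{ID: uint64(r.ID), Name: r.Name}
	if r.Deleted {
		out.Deleted = 1
	}
	return out, nil
}

// migrate runs steps 2 to 4: read, convert, write. It returns the row count.
func migrate(read func() ([]pgRow, error), write func([]mysqlRow) error) (int, error) {
	rows, err := read()
	if err != nil {
		return 0, fmt.Errorf("read: %w", err)
	}
	converted := make([]mysqlRow, 0, len(rows))
	for _, r := range rows {
		m, err := convert(r)
		if err != nil {
			return 0, err
		}
		converted = append(converted, m)
	}
	if err := write(converted); err != nil {
		return 0, fmt.Errorf("write: %w", err)
	}
	return len(converted), nil
}

func main() {
	var target []mysqlRow // stands in for the MariaDB/MySQL database
	n, err := migrate(
		func() ([]pgRow, error) {
			return []pgRow{{ID: 1, Name: "library"}, {ID: 2, Name: "old", Deleted: true}}, nil
		},
		func(rows []mysqlRow) error { target = append(target, rows...); return nil },
	)
	if err != nil {
		fmt.Println("migration failed:", err)
		return
	}
	fmt.Printf("migrated %d rows: %+v\n", n, target)
}
```

Nothing is written if any row fails conversion, so a bad source row cannot leave the target half-populated.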

### Database Compatibility Testing

**MySQL 8.0**

We have run some tests for SQL compatibility. The incompatibilities found so far are listed below.

- TRIGGER is not needed in MySQL; use `DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP` on the column instead.
- The `SERIAL` type in MySQL is an alias for `BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE`, so a column that references a `SERIAL` column must be defined as `BIGINT UNSIGNED`.
- Changing a column type in MySQL uses `MODIFY` rather than PostgreSQL's `ALTER COLUMN ... TYPE`.
- Loop syntax differs between MySQL and PostgreSQL.
- `CREATE INDEX` does not support `IF NOT EXISTS` in MySQL.
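
As an illustration of the last point, the sketch below strips the `IF NOT EXISTS` guard from `CREATE INDEX` statements so that the remaining statement parses on MySQL. The helper `stripIndexGuard` is hypothetical and is not how the migration files in this proposal were produced. Note that removing the guard makes the statement fail if the index already exists.

```go
package main

import (
	"fmt"
	"regexp"
)

// createIndexGuard matches "CREATE [UNIQUE] INDEX IF NOT EXISTS", case-insensitively.
var createIndexGuard = regexp.MustCompile(`(?i)\b(CREATE\s+(?:UNIQUE\s+)?INDEX)\s+IF\s+NOT\s+EXISTS\b`)

// stripIndexGuard removes the IF NOT EXISTS clause from CREATE INDEX
// statements and leaves every other statement untouched.
func stripIndexGuard(stmt string) string {
	return createIndexGuard.ReplaceAllString(stmt, "${1}")
}

func main() {
	for _, s := range []string{
		"CREATE INDEX IF NOT EXISTS idx_artifact_repo ON artifact (repository_name);",
		"CREATE TABLE IF NOT EXISTS t (id INT);",
	} {
		fmt.Println(stripIndexGuard(s))
	}
}
```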

| SQL file | Compatibility test | Comment |
|------------|------------|------------|
| 0001_initial_schema.up.sql | Pass | xxx
| 0002_1.7.0_schema.up.sql | Pass | xxx
| 0003_add_replication_op_uuid.up.sql | Pass | xxx
| 0004_1.8.0_schema.up.sql | Pass | xxx
| 0005_1.8.2_schema.up.sql | Pass | xxx
| 0010_1.9.0_schema.up.sql | Pass | xxx
| 0011_1.9.1_schema.up.sql | Pass | xxx
| 0012_1.9.4_schema.up.sql | Pass | xxx
| 0015_1.10.0_schema.up.sql | Pass | xxx
| 0030_2.0.0_schema.up.sql | Pass | xxx
| 0031_2.0.3_schema.up.sql | Pass | xxx
| 0040_2.1.0_schema.up.sql | Pass | xxx
| 0041_2.1.4_schema.up.sql | Pass | xxx
| 0050_2.2.0_schema.up.sql | Pass | xxx
| 0051_2.2.1_schema.up.sql | Pass | xxx
| 0052_2.2.2_schema.up.sql | Pass | xxx
| 0053_2.2.3_schema.up.sql | Pass | xxx
| 0060_2.3.0_schema.up.sql | Pass | xxx
| 0061_2.3.4_schema.up.sql | Pass | xxx
| 0070_2.4.0_schema.up.sql | Pass | xxx
| 0071_2.4.2_schema.up.sql | Pass | xxx
| 0080_2.5.0_schema.up.sql | Pass | xxx

**MariaDB 10.5.9**

| SQL file | Compatibility test | Comment |
|------------|------------|------------|
| 0001_initial_schema.up.sql | Pass | xxx
| 0002_1.7.0_schema.up.sql | Pass | xxx
| 0003_add_replication_op_uuid.up.sql | Pass | xxx
| 0004_1.8.0_schema.up.sql | Pass | xxx
| 0005_1.8.2_schema.up.sql | Pass | xxx
| 0010_1.9.0_schema.up.sql | Pass | xxx
| 0011_1.9.1_schema.up.sql | Pass | xxx
| 0012_1.9.4_schema.up.sql | Pass | xxx
| 0015_1.10.0_schema.up.sql | Pass | xxx
| 0030_2.0.0_schema.up.sql | Pass | xxx
| 0031_2.0.3_schema.up.sql | Pass | xxx
| 0040_2.1.0_schema.up.sql | | xxx
| 0041_2.1.4_schema.up.sql | | xxx
| 0050_2.2.0_schema.up.sql | | xxx
| 0051_2.2.1_schema.up.sql | | xxx
| 0052_2.2.2_schema.up.sql | | xxx
| 0053_2.2.3_schema.up.sql | | xxx
| 0060_2.3.0_schema.up.sql | | xxx
| 0061_2.3.4_schema.up.sql | | xxx
| 0070_2.4.0_schema.up.sql | | xxx
| 0071_2.4.2_schema.up.sql | | xxx
| 0080_2.5.0_schema.up.sql | | xxx

### How To Use

Users can set the database type to MariaDB/MySQL in `external_database` mode. PostgreSQL is used by default.

Set the `type` field for the Harbor database under `external_database`, and the database type under the notary database configuration:
```yaml
external_database:
  harbor:
    # database type, default is postgresql, options include postgresql, mariadb and mysql
    type: mysql
    host: mysql.test.cn
    port: 3306
    db_name: harbor
    username: root
    password: root
    ssl_mode: disable
    max_idle_conns: 2
    max_open_conns: 0
  notary_signer:
    host: mysql.test.cn
    port: 3306
    db_name: notary_signer
    username: root
    password: root
    ssl_mode: disable
  notary_server:
    host: mysql.test.cn
    port: 3306
    db_name: notary_server
    username: root
    password: root
    ssl_mode: disable
```

Binary file added proposals/images/multidb/dbdriver.png
Binary file added proposals/images/multidb/initdb.png
Binary file added proposals/images/multidb/overview.png