@ssalinas
Member

@ssalinas ssalinas commented Jun 17, 2019

The first in a short series of performance PRs. Our mysql setup has some easy wins in efficiency that I will try to take advantage of:

  • We don't support utf8 in IDs anywhere else (it sometimes wreaks havoc in zk), so make the MySQL ID columns ascii too
  • Add missing indexes on (requestId, deployId, updatedAt) and on (host, updatedAt) for task history searching
  • message fields should be utf8mb4, not plain utf8
  • lastTaskStatus, requestState, and deployState should be ENUMs
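
To illustrate why the two composite indexes help task history searches, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL, so the column types are simplified assumptions and the planner output will not match MySQL's `EXPLAIN`; only the index names and column order come from this PR.

```python
import sqlite3

# SQLite stands in for MySQL here; the simplified column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE taskHistory (
        taskId    TEXT PRIMARY KEY,
        requestId TEXT NOT NULL,
        deployId  TEXT,
        host      TEXT,
        updatedAt INTEGER
    );
    CREATE INDEX requestDeployUpdated ON taskHistory (requestId, deployId, updatedAt);
    CREATE INDEX hostUpdated ON taskHistory (host, updatedAt);
    """
)


def query_plan(sql, params):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last field of each row is the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


# Equality on the leading index columns, sorted by the trailing column.
by_deploy = query_plan(
    "SELECT taskId FROM taskHistory "
    "WHERE requestId = ? AND deployId = ? ORDER BY updatedAt DESC",
    ("my-request", "deploy-1"),
)
by_host = query_plan(
    "SELECT taskId FROM taskHistory WHERE host = ? ORDER BY updatedAt DESC",
    ("host-1",),
)
print(by_deploy)
print(by_host)
```

Both plans should name the matching index rather than a full table scan, because each query filters on the leading index columns and sorts by the trailing `updatedAt` column.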

TODO:

  • Decide if it's worth converting our blob columns to real JSON columns. MySQL and Postgres both optimize storage for those, and it would open up new history searches based on fields in the JSON. However, we would have to write a job to backfill from the old bytes columns before dropping them, which could be a pain. Thoughts? (maybe @baconmania @pschoenfelder)
  • Finish adding a way to kick off the sql backfill from bytes -> json columns
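
A rough sketch of what a batched bytes -> json backfill could look like. This is not the job from this PR: it assumes the `bytes` blob holds UTF-8 encoded JSON (the real column may be stored differently, e.g. compressed), it uses sqlite3 as a stand-in for MySQL, and the helper names `backfill_batch` and `backfill` are hypothetical.

```python
import json
import sqlite3


def backfill_batch(conn, table, id_column, batch_size=500):
    """Copy one batch of rows from `bytes` into `json`; return rows updated."""
    rows = conn.execute(
        f"SELECT {id_column}, bytes FROM {table} "
        "WHERE json IS NULL AND bytes IS NOT NULL LIMIT ?",
        (batch_size,),
    ).fetchall()
    for row_id, raw in rows:
        # Assumption: the blob is UTF-8 JSON; parse and re-serialize to validate it.
        text = json.dumps(json.loads(raw.decode("utf-8")))
        conn.execute(
            f"UPDATE {table} SET json = ? WHERE {id_column} = ?", (text, row_id)
        )
    conn.commit()
    return len(rows)


def backfill(conn, table, id_column, batch_size=500):
    """Run batches until no un-migrated rows remain; return the total migrated."""
    total = 0
    while True:
        updated = backfill_batch(conn, table, id_column, batch_size)
        if updated == 0:
            return total
        total += updated


# Demo on an in-memory table with five rows that only have `bytes` populated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taskHistory (taskId TEXT PRIMARY KEY, bytes BLOB, json TEXT)")
conn.executemany(
    "INSERT INTO taskHistory (taskId, bytes) VALUES (?, ?)",
    [(f"task-{i}", json.dumps({"n": i}).encode("utf-8")) for i in range(5)],
)
migrated = backfill(conn, "taskHistory", "taskId", batch_size=2)
remaining = conn.execute(
    "SELECT COUNT(*) FROM taskHistory WHERE json IS NULL"
).fetchone()[0]
print(migrated, remaining)
```

Working in small batches keeps each transaction short, and selecting only rows where `json IS NULL` makes the job safe to stop and restart.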

@ssalinas
Member Author

ssalinas commented Jun 19, 2019

Documenting so I can add it to release docs later. The ptosc (pt-online-schema-change) alter statements look like:

requestHistory table -> --alter "CHARACTER SET ascii COLLATE ascii_bin, MODIFY COLUMN request blob DEFAULT NULL, MODIFY COLUMN requestId varchar(100) CHARACTER SET ascii COLLATE ascii_bin NOT NULL, MODIFY COLUMN requestState ENUM ('CREATED', 'UPDATED', 'DELETING', 'DELETED', 'PAUSED', 'UNPAUSED', 'ENTERED_COOLDOWN', 'EXITED_COOLDOWN', 'FINISHED', 'DEPLOYED_TO_UNPAUSE', 'BOUNCED', 'SCALED', 'SCALE_REVERTED') NOT NULL, MODIFY COLUMN user varchar(100) CHARACTER SET ascii COLLATE ascii_bin DEFAULT NULL, MODIFY COLUMN message varchar(280) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL, ADD COLUMN json JSON DEFAULT NULL"

deployHistory table -> --alter "CHARACTER SET ascii COLLATE ascii_bin, MODIFY COLUMN bytes MEDIUMBLOB DEFAULT NULL, MODIFY COLUMN requestId varchar(100) CHARACTER SET ascii COLLATE ascii_bin NOT NULL, MODIFY COLUMN deployId varchar(100) CHARACTER SET ascii COLLATE ascii_bin NOT NULL, MODIFY COLUMN user varchar(100) CHARACTER SET ascii COLLATE ascii_bin DEFAULT NULL, MODIFY COLUMN message varchar(280) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL, MODIFY COLUMN deployState ENUM ('SUCCEEDED', 'FAILED_INTERNAL_STATE', 'CANCELING', 'WAITING', 'OVERDUE', 'FAILED', 'CANCELED') NOT NULL, ADD COLUMN json JSON DEFAULT NULL"

taskHistory table -> --alter "CHARACTER SET ascii COLLATE ascii_bin, MODIFY COLUMN bytes MEDIUMBLOB DEFAULT NULL, MODIFY COLUMN taskId varchar(200) CHARACTER SET ascii COLLATE ascii_bin NOT NULL, MODIFY COLUMN requestId varchar(100) CHARACTER SET ascii COLLATE ascii_bin NOT NULL, MODIFY COLUMN lastTaskStatus ENUM ('TASK_LAUNCHED', 'TASK_STAGING', 'TASK_STARTING', 'TASK_RUNNING', 'TASK_CLEANING', 'TASK_KILLING', 'TASK_FINISHED', 'TASK_FAILED', 'TASK_KILLED', 'TASK_LOST', 'TASK_LOST_WHILE_DOWN', 'TASK_ERROR', 'TASK_DROPPED', 'TASK_GONE', 'TASK_UNREACHABLE', 'TASK_GONE_BY_OPERATOR', 'TASK_UNKNOWN') NOT NULL, MODIFY COLUMN runId varchar(100) CHARACTER SET ascii COLLATE ascii_bin DEFAULT NULL, MODIFY COLUMN deployId varchar(100) CHARACTER SET ascii COLLATE ascii_bin DEFAULT NULL, ADD COLUMN json JSON DEFAULT NULL, ADD KEY requestDeployUpdated (requestId, deployId, updatedAt), ADD KEY hostUpdated (host, updatedAt)"

taskUsage table -> --alter "CHARACTER SET ascii COLLATE ascii_bin ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8, MODIFY COLUMN requestId varchar(100) CHARACTER SET ascii COLLATE ascii_bin NOT NULL DEFAULT '', MODIFY COLUMN taskId varchar(200) CHARACTER SET ascii COLLATE ascii_bin NOT NULL DEFAULT ''"
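
For the release docs, a small sketch of how one of these `--alter` strings might be turned into a full pt-online-schema-change invocation. The database name `singularity` and the helper `ptosc_command` are assumptions for illustration; connection options (host, user, password) are left out and would need to be added for a real run. It defaults to `--dry-run` so nothing is changed until `--execute` is chosen deliberately.

```python
import shlex


def ptosc_command(database, table, alter, execute=False):
    """Build a pt-online-schema-change argument list for one table."""
    return [
        "pt-online-schema-change",
        "--alter", alter,
        f"D={database},t={table}",  # DSN: D = database, t = table
        "--execute" if execute else "--dry-run",
    ]


# Alter clause copied from the taskUsage statement above.
TASK_USAGE_ALTER = (
    "CHARACTER SET ascii COLLATE ascii_bin ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8, "
    "MODIFY COLUMN requestId varchar(100) CHARACTER SET ascii COLLATE ascii_bin "
    "NOT NULL DEFAULT '', "
    "MODIFY COLUMN taskId varchar(200) CHARACTER SET ascii COLLATE ascii_bin "
    "NOT NULL DEFAULT ''"
)

# "singularity" is a placeholder database name, not taken from this PR.
cmd = ptosc_command("singularity", "taskUsage", TASK_USAGE_ALTER)

# Quote each argument so the printed line can be pasted into a shell.
printed = " ".join(shlex.quote(arg) for arg in cmd)
print(printed)
```

Building the command as an argument list and quoting it at the end avoids hand-escaping the single quotes inside the alter clause.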

@ssalinas
Member Author

Also notes for later release docs since we have the taskUsage migration in there as well. If folks are using ptosc the order will be:

  • db migrate --count 2 (task usage create + alter migrations)
    • alternatively if tracking master branch, run with --count 1 then run the ptosc taskUsage
  • ptosc taskHistory
  • ptosc requestHistory
  • ptosc deployHistory
  • db fast-forward --all
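
The ordering above, including the alternative for folks tracking the master branch, can be written down as a small sketch. The function name `rollout_steps` is hypothetical, and the step strings are just the shorthand used in this comment, not complete command lines.

```python
def rollout_steps(tracking_master=False):
    """Return the migration steps in the order described above."""
    if tracking_master:
        # Only the taskUsage create migration runs via db migrate;
        # the alter is applied with ptosc instead.
        steps = ["db migrate --count 1", "ptosc taskUsage"]
    else:
        # Both taskUsage migrations (create + alter) run via db migrate.
        steps = ["db migrate --count 2"]
    steps += [
        "ptosc taskHistory",
        "ptosc requestHistory",
        "ptosc deployHistory",
        # Mark the remaining migrations as applied without running them.
        "db fast-forward --all",
    ]
    return steps


for number, step in enumerate(rollout_steps(), start=1):
    print(number, step)
```

The final fast-forward matters because the ptosc runs already made the schema changes, so the corresponding migrations only need to be recorded as applied.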

@ssalinas ssalinas added the hs_qa label Jun 20, 2019
@baconmania
Contributor

🚢

@baconmania
Contributor

🚢

@ssalinas ssalinas merged commit 4aa6edb into master Jul 10, 2019
@ssalinas ssalinas deleted the sql_rework branch July 10, 2019 14:48
@ssalinas ssalinas added this to the 0.23.0 milestone Jul 18, 2019

3 participants