[FEATURE] Deployment - Session Upgrade #2334

Open
Tracked by #2089
penghuo opened this issue Oct 19, 2023 · 0 comments
Labels
enhancement New feature or request

penghuo (Collaborator) commented Oct 19, 2023

Requirements:

  • The EMR-S application associated with the session can be upgraded.
  • The upgrade should not impact running or waiting interactive queries.
  • The upgrade should not impact streaming queries.

Opt-1: Client manages the EMR-S job during the BG (Blue-Green) upgrade

  1. The new job starts with the same sessionId as the old job.
    The Spark job updates its state to RUNNING.
    The Spark job records the new jobId and the new appId.
    It keeps updating the heartbeat.
    It finds its statements with the query:
    select * from request_index
    where
    sessionId="sessionId"
    and statementState = "waiting"
    and appId = "newAppId"
    and jobId = "newJobId"
    order by submitTime

  2. The old job keeps running until it has finished all of its tasks.
    It finds its statements with the query:
    select * from request_index
    where
    sessionId="sessionId"
    and statementState = "waiting"
    and appId = "oldAppId"
    and jobId = "oldJobId"
    order by submitTime
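
The two queries above differ only in the appId and jobId they filter on. A minimal sketch of that selection logic follows, using an in-memory list in place of the real request_index. The `Statement` class, its field names, and the `waiting_statements` helper are hypothetical and only mirror the fields named in the queries.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Statement:
    # Hypothetical record shape; fields mirror the query predicates above.
    session_id: str
    app_id: str
    job_id: str
    state: str
    submit_time: int


def waiting_statements(
    statements: List[Statement], session_id: str, app_id: str, job_id: str
) -> List[Statement]:
    """Return waiting statements owned by (app_id, job_id), oldest first."""
    matches = [
        s
        for s in statements
        if s.session_id == session_id
        and s.state == "waiting"
        and s.app_id == app_id
        and s.job_id == job_id
    ]
    # Equivalent of "order by submitTime".
    return sorted(matches, key=lambda s: s.submit_time)
```

Under this sketch, the old job and the new job share one sessionId but never pick up each other's statements, because each passes its own appId and jobId.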

Opt-2: Plugin manages the EMR-S job during the BG (Blue-Green) upgrade

  1. Blue-Green deployment creates newApp.
  2. Update the setting with newAppId and BG=true.
  3. The DP-Deployment monitor is scheduled to run every 30 minutes; it detects BG=true and newAppId.

For each session in Running-Sessions:
If it is a streaming session:

  1. The old job stops, and the Spark job updates its state to DEAD.
    - The SQL in the old job is CREATE SKIPPING INDEX on mys3.default.http_logs
  2. The new job starts, and the Spark job updates its state to RUNNING. It keeps updating the heartbeat.
    - The SQL in the new job is RECOVER INDEX JOB flintJobName

If it is an interactive session:

  1. Start newJob with the same sessionId as oldJob.
    The new job checks whether session.BG=true and myJobId == newJobId; if so, it continues running.
  2. Set session.BG=true and session.jobId = newJobId.
  3. When the old job finishes its existing work, it checks whether session.BG=true and session.jobId != myJobId; if so, oldJob exits.
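
The interactive handoff above comes down to two checks against the shared session record. The sketch below is one reading of those steps, not the plugin's actual code: the `Session` class and both function names are hypothetical, and the session record is assumed to carry only a BG flag and the jobId of the job that should own the session.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical session record: BG flag plus the jobId that owns the session.
    bg: bool = False
    job_id: str = ""


def new_job_may_continue(session: Session, my_job_id: str) -> bool:
    """Step 1: the new job keeps running once BG is set and it owns the session."""
    return session.bg and session.job_id == my_job_id


def old_job_should_exit(session: Session, my_job_id: str) -> bool:
    """Step 3: after finishing existing work, the old job exits if another job owns the session."""
    return session.bg and session.job_id != my_job_id
```

For example, after step 2 sets `Session(bg=True, job_id="newJobId")`, the new job passes the first check and the old job passes the second, so ownership moves without both jobs staying active indefinitely.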

Use cases

  • case-1, the old job fails while processing a query
    opt-1, SessionStateMonitor should not retry the failed job.
    opt-2, oldJob detects BG=true and does not update sessionState.

  • case-2, the old job fails when the CP force-closes the oldApp
    opt-1, SessionStateMonitor should not retry the failed job.
    opt-2, oldJob detects BG=true and does not update sessionState.

  • case-3, too many tasks on the old job
    opt-1, How long should the client wait for oldJob to finish?
    opt-2, At most 10 minutes; after that the session times out.
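
The opt-2 behavior in the cases above can be sketched as two small predicates. This is an illustration under stated assumptions: the function names are hypothetical, times are plain seconds supplied by the caller, and the 600-second default encodes the 10-minute limit from case-3.

```python
def old_job_may_update_session_state(bg: bool) -> bool:
    """Cases 1 and 2 (opt-2): a failing old job leaves sessionState alone when BG=true."""
    return not bg


def drain_timed_out(
    drain_started_at: float, now: float, timeout_seconds: float = 600.0
) -> bool:
    """Case 3 (opt-2): the session times out once the old job has drained for 10 minutes."""
    return now - drain_started_at >= timeout_seconds
```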
