@sperlingxx
Contributor

Add a fixed random seed to the SQL statement that generates the random table used for splitting trainAndValData. With a fixed seed, the split returns a consistent trainAndValDataSet on every run.

Collaborator

@weiguoz weiguoz left a comment

LGTM.

@sperlingxx sperlingxx merged commit b7ce643 into sql-machine-learning:develop Sep 2, 2019
}
// create a table, then split it into train and val tables
- stmt := fmt.Sprintf("CREATE TABLE %s LIFECYCLE %d AS SELECT *, RAND() AS %s FROM (%s) AS %s_ori", target, temporaryTableLifecycle, randomColumn, slct, target)
+ stmt := fmt.Sprintf("CREATE TABLE %s LIFECYCLE %d AS SELECT *, RAND(42) AS %s FROM (%s) AS %s_ori", target, temporaryTableLifecycle, randomColumn, slct, target)
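The code comment in the diff says the table is then split into train and val tables, but the split statements themselves are not shown in this conversation. As a rough sketch only, one common way to do it is to threshold the seeded random column. The helper name `buildSplitStmts`, the table names, and the 0.8 ratio below are all assumptions for illustration, not the project's actual code.

```go
package main

import "fmt"

// buildSplitStmts is a hypothetical helper (not from the PR). It assumes the
// source table already carries a random column with values in [0, 1), and
// routes rows below the ratio to the train table and the rest to the val table.
func buildSplitStmts(source, trainTable, valTable, randomColumn string, ratio float64) (string, string) {
	train := fmt.Sprintf("CREATE TABLE %s AS SELECT * FROM %s WHERE %s < %g",
		trainTable, source, randomColumn, ratio)
	val := fmt.Sprintf("CREATE TABLE %s AS SELECT * FROM %s WHERE %s >= %g",
		valTable, source, randomColumn, ratio)
	return train, val
}

func main() {
	// Example values only; the real names and ratio are not given in the PR.
	train, val := buildSplitStmts("demo_tbl", "demo_train", "demo_val", "rand_col", 0.8)
	fmt.Println(train)
	fmt.Println(val)
}
```

Because both statements filter on the same stored column, a row's assignment depends only on its random value, so a reproducible column gives a reproducible split.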
Collaborator

@sperlingxx Please:

  1. Define the magic number 42 as a const value.
  2. Add a comment explaining why the random seed needs to be fixed, for better readability.

@sperlingxx sperlingxx deleted the xgboost_dev branch September 3, 2019 02:16
3 participants