This repository has been archived by the owner on Dec 17, 2024. It is now read-only.

Feature/2.1.0 #14

Merged
merged 8 commits into from
Sep 23, 2019

Conversation

seddonm1

Change the Lifecycle Hook API and add additional config reader types.

@seddonm1 seddonm1 requested a review from jbruce September 20, 2019 04:26
val leftOutputColumns = leftView.columns.map{columnName => col(s"datasetA.${columnName}")}
val rightOutputColumns = rightView.columns.map{columnName => col(s"datasetB.${columnName}")}

pipelineModel.stages(3).asInstanceOf[MinHashLSHModel]

Consider searching the stages for the MinHashLSHModel rather than accessing it by index, in case its position changes in the future.
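A minimal sketch of the lookup-by-type idea, assuming nothing beyond the standard library. `Transformer`, `Tokenizer` and `MinHashModel` are hypothetical stand-ins for the Spark ML types so that the snippet runs on its own. Since `PipelineModel.stages` is an `Array[Transformer]` in Spark, the same `collectFirst` call should apply there with `MinHashLSHModel` as the matched type.

```scala
object StageLookup {
  // Hypothetical stand-ins for Spark ML's Transformer hierarchy (not the real classes).
  sealed trait Transformer
  final case class Tokenizer(name: String) extends Transformer
  final case class MinHashModel(numHashTables: Int) extends Transformer

  // Return the first stage of the wanted type, wherever it sits in the array.
  // An Option forces the caller to handle a pipeline that has no such stage.
  def findMinHash(stages: Array[Transformer]): Option[MinHashModel] =
    stages.collectFirst { case model: MinHashModel => model }
}
```

Unlike `stages(3).asInstanceOf[...]`, this keeps working if a stage is added or removed ahead of the model, and a missing model surfaces as `None` rather than as a `ClassCastException` or an out-of-bounds error.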

}

// build locality-sensitive hashing model
val minHashLSH = { new MinHashLSH()

Minor style suggestion: remove the surrounding braces, as this doesn't require a block (same for the one above).
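To illustrate the style point, here is a small sketch in which `Hasher` is a hypothetical stand-in for a builder-style class such as `MinHashLSH`. Both definitions produce the same value; the braces only wrap a single expression in a block.

```scala
object BuilderStyle {
  // Hypothetical builder-style class, used only to keep the example self-contained.
  final case class Hasher(numHashTables: Int = 1, inputCol: String = "") {
    def setNumHashTables(n: Int): Hasher = copy(numHashTables = n)
    def setInputCol(c: String): Hasher = copy(inputCol = c)
  }

  // With a surrounding block: legal, but the block contains one expression only.
  val withBlock = { Hasher()
    .setNumHashTables(5)
    .setInputCol("features")
  }

  // Without the block: the method chain is already a single expression.
  val withoutBlock = Hasher()
    .setNumHashTables(5)
    .setInputCol("features")
}
```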


pipelineModel.stages(3).asInstanceOf[MinHashLSHModel]
.approxSimilarityJoin(datasetA, datasetB, (1.0-stage.threshold))
.select((leftOutputColumns ++ rightOutputColumns ++ List((lit(1.0)-col("distCol")).alias("similarity"))):_*)

Do you specifically need a list here or can it just be a generic Seq?
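A small sketch of why a generic `Seq` should be enough here. `select` below is a hypothetical stand-in for a varargs method such as `Dataset.select(cols: Column*)`, and plain strings stand in for columns. The concatenated collection is expanded with `: _*` either way, so the concrete type of the last operand does not affect the result.

```scala
object SelectColumns {
  // Hypothetical stand-in for a varargs API; it just returns the arguments it received.
  def select(cols: String*): List[String] = cols.toList

  // Arrays mirror the result of `view.columns.map(...)` in the diff.
  val left: Array[String] = Array("datasetA.id", "datasetA.name")
  val right: Array[String] = Array("datasetB.id", "datasetB.name")

  // Appending a List or a Seq yields the same elements in the same order.
  val withList: List[String] = select((left ++ right ++ List("similarity")): _*)
  val withSeq: List[String] = select((left ++ right ++ Seq("similarity")): _*)
}
```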

before(head)
val result = processStage(head)
after(head, result, true)
val stage = head._1

It should be possible to do this in the match:

case (stage, index) :: Nil

val result = processStage(head)
after(head, result, false)
val stage = head._1
val index = head._2

It should be possible to do this in the match:

case (stage, index) :: tail
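The destructuring suggested in these two comments can be sketched as follows. `Stage` and `run` are hypothetical names, and the log lines stand in for the `before`, `processStage` and `after` calls in the diff. The `head @ (stage, index)` binder is one way to keep the whole pair available while also naming its parts, which matters if the hooks still take `head`.

```scala
object StageRunner {
  // Hypothetical stand-in for a pipeline stage.
  final case class Stage(name: String)

  // Walk the (stage, index) pairs, destructuring each tuple directly in the pattern
  // instead of reading head._1 and head._2 afterwards.
  @scala.annotation.tailrec
  def run(stages: List[(Stage, Int)], log: Vector[String] = Vector.empty): Vector[String] =
    stages match {
      case Nil =>
        log
      // Last element: `head` is still bound to the whole pair via the @ binder.
      case (head @ (stage, index)) :: Nil =>
        log :+ s"${stage.name}@$index last=${head._2 == index}"
      // Any earlier element: record it and recurse on the tail.
      case (stage, index) :: tail =>
        run(tail, log :+ s"${stage.name}@$index")
    }
}
```

The order of the cases matters: `:: Nil` has to come before `:: tail`, because the second pattern also matches a single-element list.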

@seddonm1 seddonm1 merged commit 0dd6ffe into develop Sep 23, 2019
@seddonm1 seddonm1 deleted the feature/2.1.0 branch September 23, 2019 05:40
@seddonm1 seddonm1 restored the feature/2.1.0 branch September 23, 2019 06:38
@seddonm1 seddonm1 deleted the feature/2.1.0 branch October 8, 2019 10:00

2 participants