Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it easier to get around Tuple22 restrictions with Shapeless/HLists #1236

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions project/Build.scala
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ object ScaldingBuild extends Build {
val scalaTestVersion = "2.2.2"
val scalameterVersion = "0.6"
val scroogeVersion = "3.17.0"
val shapelessVersion = "2.1.0"
val slf4jVersion = "1.6.6"
val thriftVersion = "0.5.0"

Expand Down Expand Up @@ -269,6 +270,16 @@ object ScaldingBuild extends Build {
"org.apache.hadoop" % "hadoop-core" % hadoopVersion % "provided",
"org.slf4j" % "slf4j-api" % slf4jVersion,
"org.slf4j" % "slf4j-log4j12" % slf4jVersion % "provided"
) ++ (
if(isScala210x(scalaVersion.value)) {
Seq(
"com.chuusai" % "shapeless_2.10.4" % shapelessVersion
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this actually compatible with 2.10.3 -> 2.10.5?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding (based on the README) is that shapeless 2.1.0 is only compatible with 2.11.x and 2.10.4, but I might be mistaken. I think the relevant bits are as follows:

Builds are available for Scala 2.11.x and for Scala 2.10.4
...
Note that for Scala 2.10.4 you must provide an explicit Scala version suffix to your shapeless dependency, and it is recommanded to add the macro paradise plugin to your build, for some macros provided by shapeless to work smoothly

For shapeless 2.0.0, they have the following comments:

Builds are available for Scala 2.11.x and for Scala 2.10.4.
...
For Scala 2.10.x you must specify a Scala version of at least 2.10.2, and add either cross CrossVersion.full or provide an explicit Scala version suffix to your shapeless dependency
...
Note that Scala 2.10.x releases are compatible with each other starting from 2.10.2, so a mismatch in minor versions above would be fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shapeless 2.1.0 has full cross-versions for 2.10 starting at 2.10.4 (e.g for 2.10.5), so it'd be better to use this:

"com.chuusai" %% "shapeless" % shapelessVersion cross CrossVersion.full

So that we get the 2.10.5 version when Scalding updates to 2.10.5.

(I don't remember whether you need %% for full cross-versioning—I think either will work, and I find it a little clearer with %%.)

)
} else {
Seq(
"com.chuusai" %% "shapeless" % shapelessVersion
)
}
)
).dependsOn(scaldingArgs, scaldingDate, maple)

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,10 @@ package com.twitter.scalding

import cascading.tuple.TupleEntry
import cascading.tuple.{ Tuple => CTuple }
import shapeless._
import shapeless.ops.nat._

import scala.collection.breakOut
import scala.collection.{GenTraversable, breakOut}

/**
* Typeclass to represent converting from cascading TupleEntry to some type T.
Expand Down Expand Up @@ -51,6 +53,37 @@ trait LowPriorityTupleConverters extends java.io.Serializable {
}
}

/**
* Based on `FromTraversable` from shapeless
*/
trait FromTupleEntry[Out <: HList, I <: Nat] {
def apply(l : TupleEntry) : Option[Out]
}

object FromTupleEntry {
def apply[Out <: HList](implicit from: FromTupleEntry[Out, Nat._0]) = from

implicit def hnilFromTupleEntry[T, I <: Nat](implicit toIntI : ToInt[I]) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

T is unused here.

new FromTupleEntry[HNil, I] {
def apply(te : TupleEntry) =
if(te.size() == toIntI()) Some(HNil) else None
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This forces the TupleEntry to have exactly the same size as the HList. I'm not sure if it needs to be this strict or if it would be ok for the TupleEntry to have more entries in it than the HList.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we run toInt once before creating the instance to guarantee that is only run once? We definitely don't work to run that on each tuple.

}

implicit def hlistFromTupleEntry[OutH, OutT <: HList, I <: Nat]
(implicit
flt : FromTupleEntry[OutT, Succ[I]],
g : TupleGetter[OutH],
toIntI : ToInt[I]) =
new FromTupleEntry[OutH :: OutT, I] {
def apply(te : TupleEntry) : Option[OutH :: OutT] =
if(te.size() <= toIntI()) None
else for(
h <- new Some(g.get(te.getTuple, toIntI()));
t <- flt(te)
) yield h :: t
}
}

object TupleConverter extends GeneratedTupleConverters {
/**
* Treat this TupleConverter as one for a superclass
Expand All @@ -67,6 +100,23 @@ object TupleConverter extends GeneratedTupleConverters {
def arity[T](implicit tc: TupleConverter[T]): Int = tc.arity
def of[T](implicit tc: TupleConverter[T]): TupleConverter[T] = tc

import shapeless.ops.hlist._

implicit def hListConverter[H, T <: HList, N <: Nat]
(implicit
g: TupleGetter[H],
len: Length.Aux[H :: T, N],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still getting a sense of the design here, but any reason you don't constrain the FromTupleEntry instances so that the I (which could be a type member) matches the length of the HList? That should allow you to omit the N type parameter here altogether.

toIntN : ToInt[N],
fl : FromTupleEntry[H :: T, Nat._0]): TupleConverter[H :: T] =
new TupleConverter[H :: T] {
override def apply(te: TupleEntry): H :: T = {
val l : Option[H :: T] = fl(te)
l.get
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not the best to just "get" the Option here. Open to other suggestions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, TupleConverter and TupleSetter are the places where types are added and removed before handing to cascading. It can't be type-safe at that point because cascading is not type-safe. There is really nothing you can do but call .get. That said, in a well typed scalding flow, that .get can never throw, so ultimately it would be safe (even though using the TupleConverter on its own, would not be safe).

}

override def arity: Int = toIntN()
}

/**
* Copies the tupleEntry, since cascading may change it after the end of an
* operation (and it is not safe to assume the consumer has not kept a ref
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,21 @@ object TupleSetter extends GeneratedTupleSetters {
def arity[T](implicit ts: TupleSetter[T]): Int = ts.arity
def of[T](implicit ts: TupleSetter[T]): TupleSetter[T] = ts

import shapeless.ops.hlist._
import shapeless.ops.nat._
import shapeless._

implicit def hListSetter[L <: HList, N <: Nat, T <: Any]
(implicit len: Length.Aux[L, N], ti: ToInt[N], tl: ToList[L, T]) = new TupleSetter[L] {

override def apply(arg: L): CTuple = {
val list = arg.toList[T].map(_.asInstanceOf[Object])
new CTuple(list:_*)
}

override def arity: Int = ti()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be a val, to be sure we only call once.

}

//This is here for handling functions that return cascading tuples:
implicit lazy val CTupleSetter: TupleSetter[CTuple] = new TupleSetter[CTuple] {
override def apply(arg: CTuple) = new CTuple(arg)
Expand Down
41 changes: 41 additions & 0 deletions scalding-core/src/test/scala/com/twitter/scalding/TupleTest.scala
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ package com.twitter.scalding
import cascading.tuple.{ TupleEntry, Tuple => CTuple }

import org.scalatest.{ Matchers, WordSpec }
import shapeless._

class TupleTest extends WordSpec with Matchers {
def get[T](ctup: CTuple)(implicit tc: TupleConverter[T]) = tc(new TupleEntry(ctup))
Expand Down Expand Up @@ -62,6 +63,23 @@ class TupleTest extends WordSpec with Matchers {
arityConvMatches((2, 3), 2) shouldBe true
aritySetMatches((2, 3), 2) shouldBe true
}
"get primitives out of cascading tuples using HLists" in {
val ctup = new CTuple("hey", new java.lang.Long(2), new java.lang.Integer(3))
get[String :: Long :: Int :: HNil](ctup) shouldBe "hey" :: 2L :: 3 :: HNil

roundTrip[Int :: HNil](3 :: HNil) shouldBe true
arityConvMatches(3 :: HNil, 1) shouldBe true
aritySetMatches(3 :: HNil, 1) shouldBe true
roundTrip[Long :: HNil](42L :: HNil) shouldBe true
arityConvMatches(42L :: HNil, 1) shouldBe true
aritySetMatches(42L :: HNil, 1) shouldBe true
roundTrip[String :: HNil]("hey" :: HNil) shouldBe true
arityConvMatches("hey" :: HNil, 1) shouldBe true
aritySetMatches("hey" :: HNil, 1) shouldBe true
roundTrip[Int :: Int :: HNil](4 :: 2 :: HNil) shouldBe true
arityConvMatches(2 :: 3 :: HNil, 2) shouldBe true
aritySetMatches(2 :: 3 :: HNil, 2) shouldBe true
}
"get non-primitives out of cascading tuples" in {
val ctup = new CTuple(None, List(1, 2, 3), 1 -> 2)
get[(Option[Int], List[Int], (Int, Int))](ctup) shouldBe (None, List(1, 2, 3), 1 -> 2)
Expand All @@ -75,6 +93,19 @@ class TupleTest extends WordSpec with Matchers {
arityConvMatches(List(1, 2, 3), 1) shouldBe true
aritySetMatches(List(1, 2, 3), 1) shouldBe true
}
"get non-primitives out of cascading tuples using HLists" in {
val ctup = new CTuple(None, List(1, 2, 3), 1 -> 2)
get[Option[Int] :: List[Int] :: (Int, Int) :: HNil](ctup) shouldBe None :: List(1, 2, 3) :: 1 -> 2 :: HNil

roundTrip[Option[Int] :: List[Int] :: HNil](Some(1) :: List() :: HNil) shouldBe true
arityConvMatches(None :: Nil :: HNil, 2) shouldBe true
aritySetMatches(None :: Nil :: HNil, 2) shouldBe true

arityConvMatches(None :: HNil, 1) shouldBe true
aritySetMatches(None :: HNil, 1) shouldBe true
arityConvMatches(List(1, 2, 3) :: HNil, 1) shouldBe true
aritySetMatches(List(1, 2, 3) :: HNil, 1) shouldBe true
}
"deal with AnyRef" in {
val ctup = new CTuple(None, List(1, 2, 3), 1 -> 2)
get[(AnyRef, AnyRef, AnyRef)](ctup) shouldBe (None, List(1, 2, 3), 1 -> 2)
Expand All @@ -85,5 +116,15 @@ class TupleTest extends WordSpec with Matchers {
arityConvMatches[(AnyRef, AnyRef)](("hey", "you"), 2) shouldBe true
aritySetMatches[(AnyRef, AnyRef)](("hey", "you"), 2) shouldBe true
}
"deal with AnyRef using HLists" in {
val ctup = new CTuple(None, List(1, 2, 3), 1 -> 2)
get[AnyRef :: AnyRef :: AnyRef :: HNil](ctup) shouldBe None :: List(1, 2, 3) :: 1 -> 2 :: HNil
get[AnyRef :: HNil](new CTuple("you")) shouldBe "you" :: HNil

roundTrip[AnyRef :: HNil]("hey" :: HNil) shouldBe true
roundTrip[AnyRef :: AnyRef :: HNil](Nil :: Nil :: HNil) shouldBe true
arityConvMatches[AnyRef :: AnyRef :: HNil]("hey" :: "you" :: HNil, 2) shouldBe true
aritySetMatches[AnyRef :: AnyRef :: HNil]("hey" :: "you" :: HNil, 2) shouldBe true
}
}
}