Sparkling Water: Issue splitting the data into training and testing

Description

Steps from Alex:

I am hacking in Spark right now with Michal, and we are running into this problem at the point where it's time to split the data into training and testing sets.

Though it says it's a type error (error: type mismatch), Michal and I tried a few permutations and none of them seemed to work for me.

Basically, I am hacking his Bicycle demand app and applying it to solving crime in Chicago.

Here is my code
------------------------------

//
// Split into train and test parts
//

val keys = Array[String]("train.hex", "test.hex", "hold.hex").map(Key.make(_))
val ratios = Array[Double](0.8, 0.1, 0.1)
val frs = ShuffleSplitFrame.shuffleSplitFrame(crimeWeatherDF, keys, ratios, 1234567689L)

-------------------------------
HERE is the error (last line of above code)

scala> val frs = ShuffleSplitFrame.shuffleSplitFrame(crimeWeatherDF, keys, ratios, 1234567689L)

<console>:55: error: type mismatch;

found : Array[water.Key[_]]

required: Array[water.Key[_ <: water.Keyed[_ <: water.Keyed[_ <: AnyRef]]]]

Note: water.Key[_] >: water.Key[_ <: water.Keyed[_ <: water.Keyed[_ <: AnyRef]]], but class Array is invariant in type T.

You may wish to investigate a wildcard type such as `_ >: water.Key[_ <: water.Keyed[_ <: water.Keyed[_ <: AnyRef]]]`. (SLS 3.2.10)

val frs = ShuffleSplitFrame.shuffleSplitFrame(crimeWeatherDF, keys, ratios, 1234567689L)
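
Note on the compiler message above: Scala's Array[T] is invariant in T, so an array whose element type was inferred as a wildcard (Key[_]) is not accepted where a different element type is required, even if every element would individually conform. The sketch below reproduces that situation with hypothetical stand-ins (Frame, Key, and splitNames here are simplified placeholders, not the H2O classes) and shows one possible way to get an exact element type: pinning the type argument of Key.make. Whether the same annotation satisfies the real ShuffleSplitFrame.shuffleSplitFrame depends on that method's signature in the H2O version in use, so treat this as an assumption to verify, not as the confirmed fix.

```scala
// Hypothetical stand-ins for water.fvec.Frame and water.Key, so the
// sketch compiles without H2O on the classpath.
final class Frame
final class Key[T](val name: String)

object Key {
  // Mimics a generic factory: without an explicit type argument the
  // compiler has to infer T on its own.
  def make[T](name: String): Key[T] = new Key[T](name)
}

object InvarianceSketch {
  // Stand-in for a splitter that demands one exact element type.
  def splitNames(keys: Array[Key[Frame]]): Seq[String] =
    keys.toSeq.map(_.name)

  // Pinning the type argument makes the element type exactly Key[Frame].
  val keys: Array[Key[Frame]] =
    Array("train.hex", "test.hex", "hold.hex").map(name => Key.make[Frame](name))

  // An Array[Key[_]] would be rejected by splitNames, because Array[T]
  // is invariant in T and Key[_] is not the same type as Key[Frame].

  def main(args: Array[String]): Unit =
    println(splitNames(keys).mkString(", "))
}
```

In the failing snippet, the analogous change would be to write the type argument explicitly in the map call (for example Key.make[Frame](_)), again assuming the splitter in the installed H2O version accepts an array of that element type.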

Fixed

Assignee

Michal Malohlava

Reporter

Neeraja Madabhushi

CustomerVisible

No