Wednesday, October 19, 2016

SPARK and AVRO

SBT reminder




The "Avro Data Source for Apache Spark" package (spark-avro) has examples on its wiki; the examples use an Avro file available for download from the package page:
https://spark-packages.org/package/databricks/spark-avro
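For reference, a minimal read through the spark-avro data source, Spark 1.x style; a sketch, assuming spark-avro is on the classpath (the app name and file path are placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("avro-example"))
val sqlContext = new SQLContext(sc)

// spark-avro registers itself under the "com.databricks.spark.avro" format name.
val df = sqlContext.read
  .format("com.databricks.spark.avro")
  .load("path/to/file.avro")
df.printSchema()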



Fix for the last line of the wiki example:
avroRDD.map(l => l._1.datum.get("username").toString).first
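For context, an avroRDD like the one above can be built with newAPIHadoopFile; a sketch, assuming a SparkContext sc as in the snippet earlier (the input path is a placeholder):

import org.apache.avro.generic.GenericRecord
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyInputFormat
import org.apache.hadoop.io.NullWritable

// Each element is an (AvroKey[GenericRecord], NullWritable) pair,
// so l._1.datum is the Avro record itself.
val avroRDD = sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
  AvroKeyInputFormat[GenericRecord]]("path/to/file.avro")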




SPARK, PARQUET, and KRYO


KryoSerializer

KryoRegistrator
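A sketch of wiring these together (the record and registrator class names here are hypothetical):

import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.{KryoRegistrator, KryoSerializer}

// Hypothetical application class worth registering with Kryo.
case class MyRecord(id: Long, name: String)

// Registering classes up front avoids Kryo writing full class names with each record.
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[MyRecord])
  }
}

val conf = new SparkConf()
  .setAppName("kryo-parquet-example")
  .set("spark.serializer", classOf[KryoSerializer].getName)
  .set("spark.kryo.registrator", classOf[MyRegistrator].getName)

Parquet itself needs no extra package here: sqlContext.read.parquet(...) and df.write.parquet(...) are built into Spark SQL in the 1.x line.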


SPARK AVRO Packages

MISC.

Maven deps


<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.6.1</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.6.1</version>
</dependency>

<dependency>
    <groupId>com.databricks</groupId>
    <artifactId>spark-csv_2.10</artifactId>
    <version>1.4.0</version>
</dependency>
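Since spark-csv is listed above, a minimal read with it (Spark 1.x API, assuming the sqlContext from the Avro example; the path is a placeholder):

val csvDF = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")       // first line contains column names
  .option("inferSchema", "true")  // guess column types from the data
  .load("path/to/data.csv")
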
Submitting apps to a cluster

val sparkVersion = "1.6.1"  // matches the Maven versions above

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10" % sparkVersion withSources(),
  "org.apache.spark" % "spark-sql_2.10" % sparkVersion withSources()
)
Add this to project/assembly.sbt:

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

and this to build.sbt:

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
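With the plugin in place, sbt assembly builds a fat jar that spark-submit can ship to the cluster. A sketch of the submit command (the main class, master URL, and jar path are all placeholders):

spark-submit \
  --class com.example.Main \
  --master spark://master-host:7077 \
  target/scala-2.10/my-app-assembly-1.0.jar

One note: the Spark artifacts themselves are usually marked "provided" in build.sbt so they are not bundled into the assembly, since the cluster already supplies them.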

http://stackoverflow.com/questions/34093715/scala-code-not-compiling-in-sbt
Add the following dependency to your build.sbt:

libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.4.0"

and add the following import to your Scala file:

import org.apache.spark.{SparkConf, SparkContext}
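A minimal sketch putting the import to work (the app name and vector values are placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors

val sc = new SparkContext(new SparkConf().setAppName("mllib-example"))
// spark-mllib supplies the linalg types the compiler was missing.
val v = Vectors.dense(1.0, 2.0, 3.0)
println(v)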