Spark 2 Datasets
https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/8599738367597028/2201444230243967/3601578643761083/latest.html
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-sql-DataFrame.html
http://pandas.pydata.org/pandas-docs/stable/dsintro.html
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Column
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Dataset
https://stackoverflow.com/questions/38137741/how-to-write-a-dataframe-schema-to-file-in-scala
https://www.balabit.com/blog/spark-scala-dataset-tutorial/
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-sql-DataFrame.html
Spark SQL borrowed the concept of DataFrame from pandas' DataFrame and made it immutable, parallel (one machine, perhaps with many processors and cores) and distributed (many machines, perhaps with many processors and cores).
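A minimal sketch of the "immutable" part of that statement, assuming a local SparkSession; the sample rows and the column name age_next_year are made up for illustration. A transformation such as withColumn returns a new DataFrame and leaves the original as it was.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Local session for illustration; "local[*]" uses all cores of one machine.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("immutable-df")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val people = Seq(("Ann", 31), ("Bob", 45)).toDF("name", "age")

// withColumn does not modify `people`; it returns a new DataFrame.
val older = people.withColumn("age_next_year", col("age") + 1)

println(people.columns.mkString(", "))
println(older.columns.mkString(", "))
```

The same code runs unchanged on a cluster; only the master setting differs, which is the "distributed" part of the quote.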
https://stackoverflow.com/questions/38137741/how-to-write-a-dataframe-schema-to-file-in-scala
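import java.io.PrintWriter

// Write the DataFrame's schema (as an indented tree) to a local file.
val filePath = "/tmp/schema_file"
val writer = new PrintWriter(filePath)
try writer.write(df.schema.treeString) finally writer.close()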
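The treeString output is meant for people to read and cannot be parsed back into a schema. If the schema needs to be reloaded later, one option is to save its JSON form instead. A sketch, assuming a local SparkSession; the sample DataFrame and the path /tmp/schema_file.json are made up for illustration.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{DataType, StructType}

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("schema-json")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample DataFrame standing in for `df`.
val df = Seq(("Ann", 31)).toDF("name", "age")

// Save the schema as JSON (hypothetical path).
val jsonPath = Paths.get("/tmp/schema_file.json")
Files.write(jsonPath, df.schema.json.getBytes(StandardCharsets.UTF_8))

// Read the JSON back and rebuild the StructType.
val restored = DataType.fromJson(
  new String(Files.readAllBytes(jsonPath), StandardCharsets.UTF_8)
).asInstanceOf[StructType]
```

The restored schema can then be passed to a reader, for example spark.read.schema(restored), so the schema does not have to be inferred again.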
https://docs.databricks.com/spark/latest/spark-sql/complex-types.html#transform-complex-data-types-scala
https://www.balabit.com/blog/spark-scala-dataset-tutorial/