How to print the contents of an RDD?
https://stackoverflow.com/questions/23173488/how-to-print-the-contents-of-rdd
If you want to view the contents of an RDD, one way is to call collect() and print the returned array on the driver, e.g. myRDD.collect().foreach(println).
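A minimal sketch of the collect() approach, assuming a local SparkSession and a tiny example RDD; the object name, app name, and sample data are made up for illustration:

```scala
import org.apache.spark.sql.SparkSession

object PrintRddWithCollect {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode, used here only so the sketch can run on one machine.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("print-rdd-collect")
      .getOrCreate()

    // Hypothetical sample data standing in for the real RDD.
    val myRDD = spark.sparkContext.parallelize(Seq("a", "b", "c"))

    // collect() copies every element to the driver as an Array,
    // so the println calls run locally and show up on the driver's stdout.
    myRDD.collect().foreach(println)

    spark.stop()
  }
}
```

Calling myRDD.foreach(println) without collect() runs println on the executors, so on a cluster the output ends up in the executors' stdout rather than on the driver.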
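That's not a good idea, though, when the RDD has billions of elements, because collect() pulls the entire dataset into the driver's memory. Use take(n) to fetch just the first few elements and print those, e.g. myRDD.take(n).foreach(println).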
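A minimal sketch of the take() approach, again assuming a local SparkSession; the sample RDD and the choice of 5 elements are arbitrary, for illustration only:

```scala
import org.apache.spark.sql.SparkSession

object PrintRddWithTake {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode, used here only so the sketch can run on one machine.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("print-rdd-take")
      .getOrCreate()

    // Hypothetical stand-in for a large RDD.
    val myRDD = spark.sparkContext.parallelize(1 to 1000000)

    // take(5) returns only the first 5 elements to the driver as an Array,
    // so driver memory use stays small regardless of the RDD's size.
    myRDD.take(5).foreach(println)

    spark.stop()
  }
}
```

Unlike collect(), take(n) only scans as many partitions as it needs to return n elements, which is why it stays cheap on large RDDs.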
I think the best option is to write the RDD to multiple files in HDFS, then use hdfs dfs -getmerge to merge them into a single local file – Oussama Jul 21 '15 at 16:10
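The suggestion in this comment could be sketched roughly as follows. This assumes the job is launched with spark-submit (so no master is set in code) and that the output directory is passed as the first argument; the object name, sample data, and merged file name are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object SaveRddAsText {
  def main(args: Array[String]): Unit = {
    // Assumption: the output directory (e.g. an HDFS path) is the first argument.
    // saveAsTextFile fails if this directory already exists.
    val outputDir = args(0)

    val spark = SparkSession.builder().appName("save-rdd-as-text").getOrCreate()

    // Hypothetical stand-in for the real RDD.
    val myRDD = spark.sparkContext.parallelize(1 to 1000).map(_.toString)

    // Writes one part-NNNNN text file per partition under outputDir.
    myRDD.saveAsTextFile(outputDir)

    spark.stop()

    // Afterwards, from a shell, merge the part files into one local file:
    //   hdfs dfs -getmerge <outputDir> merged.txt
  }
}
```

This keeps the full dataset off the driver: each executor writes its own partitions, and the merge happens outside Spark.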