Revisiting #Spark Scripts From the Command Line. #bigdata #spark #scala

It’s been a while since I looked at any Spark code; I’ve been working on other things. There have been a few comments on the blog about running Spark jobs from the command line shell.

Test Data

First, let’s have some text data to work with. We’ll do a basic word count on it. I had nothing to hand apart from the output of my TensorFlow algorithmic book generation.

I Wordlessly Kate and I gaze at the elevator at the end. I have never understood what you’re going to do with my safety. I groan as my body is rigid, tension radi- ating out me in front of me. He looks so remorseful, and in the same color as the crowd arrives and in my apartment. The thought is crippling. But and I don’t want to go to me that I want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to you. I don’t want to be beholden to him — and I can tell him about 17 miles a deal. “Did you have to compro- mise. I giggle. “Wench. Food, now, please.” “Since you want to talk about you in my own way, and I am going to be very surprised, not to see you. Ax (Your fiancee) I ask softly. He looks so vulnerable — and I don’t know if it’s my heightened way of the ‘old,’ son. I have a hairdresser arriving at your mom?” “Yes.” He grins at me and winks, making me flush. He smirks at me. “What is it?” I ask. He gazes at me, his eyes dark and earnest. “Find out the elevators, of the first time in a half-bear — and I have to go to church . . . Date: June 10, 2011 16:05 To: Christian Grey Twiddling Christian and I don’t know if it’s not at the rules are a hostile Anthem, “Every Breath You Take.” I do you have to do with you?” he asks. 
“I don’t want to go to work for a living, and I’ll be very persuasive,” he murmurs, and his eyes are alight with humor. “He’s like a drink,” Jack mut- ters, locking the eggs. I crack through my body. But what I do to make you uncomfortable.” I shake my head to fetch him at the same COURTESY to a child. “I thought you were in the apartment or you^?

It’s not a classic, I know.

The Scala Spark Script

Next, a Scala script that does the word count in Spark. Save it as wc.scala.

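// Load the sample text as an RDD of lines.
val text = sc.textFile("/Users/jasonbell/sample.txt")
// Split each line on spaces, pair every word with 1, then sum the 1s per word.
val counts = text.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
// Bring the (word, count) pairs back to the driver so the shell prints them.
counts.collect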

Basic, but it works.
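Splitting on a single space leaves punctuation and capitalisation attached to each token, so "flush." and "flush" would be counted as different words. Here is a rough sketch of one way to tidy that up, assuming you want case-insensitive counts with leading and trailing punctuation stripped. The names normalize and wordCount are made up for illustration and are not part of wc.scala; the sketch uses plain Scala collections so it can be tried without a Spark context.

```scala
// Hypothetical helper (not in the original wc.scala).
// Lower-cases a token and strips any leading or trailing characters
// that are not letters or digits. Characters inside the word, such as
// the apostrophe in "don’t", are left alone.
def normalize(token: String): String =
  token.toLowerCase.replaceAll("^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$", "")

// Plain-Scala word count that mirrors the flatMap / map / reduceByKey
// steps of the Spark script, but over an in-memory sequence of lines.
def wordCount(lines: Seq[String]): Map[String, Int] =
  lines
    .flatMap(_.split("\\s+"))   // split on any run of whitespace
    .map(normalize)
    .filter(_.nonEmpty)         // drop tokens that were only punctuation
    .groupBy(identity)
    .map { case (word, occurrences) => (word, occurrences.size) }
```

In the Spark shell the same helper could slot into the existing pipeline as text.flatMap(_.split("\\s+")).map(normalize).filter(_.nonEmpty).map(word => (word, 1)).reduceByKey(_ + _), leaving the rest of the script unchanged.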

And A Run Through

And then run it from the command line.

$ /usr/local/spark-2.1.0-bin-hadoop2.3/bin/spark-shell -i wc.scala
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/04/08 09:07:14 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/04/08 09:07:20 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Spark context Web UI available at http://192.168.1.65:4040
Spark context available as 'sc' (master = local[*], app id = local-1491638836119).
Spark session available as 'spark'.
Loading wc.scala...
text: org.apache.spark.rdd.RDD[String] = /Users/jasonbell/sample.txt MapPartitionsRDD[1] at textFile at <console>:24
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:26
res0: Array[(String, Int)] = Array((COURTESY,1), (“Since,1), (flush.,1), (is,3), (now,,1), (2011,1), (arrives,1), (same,2), (June,1), (am,1), (have,5), (never,1), (tension,1), (winks,,1), (dark,1), (miles,1), (with,3), (fiancee),1), (crippling.,1), (first,1), (—,3), (fetch,1), (talk,1), (uncomfortable.”,1), (eyes,2), (crack,1), (my,7), (Take.”,1), (child.,1), (go,3), (make,1), (Breath,1), (what,2), (out,2), (Twiddling,1), (me,,1), (gazes,1), (looks,2), (Date:,1), (deal.,1), (remorseful,,1), (me,4), (him,3), (his,2), (are,2), (body,1), (shake,1), (persuasive,”,1), (“Yes.”,1), (can,1), (half-bear,1), (mise.,1), (Wordlessly,1), (“What,1), (elevator,1), (Food,,1), (.,3), (earnest.,1), (as,2), (going,2), (‘old,’,1), (very,2), (don’t,27), (you,1), (son.,1), (safety.,1), (eggs.,1), (apartment...
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60)
Type in expressions to have them evaluated.
Type :help for more information.

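If you don’t see the word counts printed as res0 like the output above, something is wrong. The first thing to check is that the path passed to sc.textFile points at a file that actually exists.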
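One thing worth ruling out when the counts don’t appear is an input path that doesn’t exist or can’t be read. Here is a minimal sketch of a check that could sit at the top of the script, assuming the input is a file on the local filesystem (it would not apply to an HDFS or other remote URI). The name inputLooksReadable is hypothetical.

```scala
import java.nio.file.{Files, Paths}

// Hypothetical helper (not in the original script).
// Returns true when the path names a regular file that the current
// user can read. Assumes a local filesystem path, not an hdfs:// URI.
def inputLooksReadable(path: String): Boolean = {
  val p = Paths.get(path)
  Files.isRegularFile(p) && Files.isReadable(p)
}
```

Calling inputLooksReadable("/Users/jasonbell/sample.txt") before sc.textFile gives a plain true or false rather than a Spark stack trace to dig through.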

