Relational databases (RDBMS) are an integral part of most analytical layers built on top of a data warehouse. Similarly, Spark is widely used across organizations to build robust ETL pipelines that perform large-scale in-memory computation and load the results into various platforms, including RDBMS.


We often run into issues when DB operations require switching between plain Scala and Spark. When we need just a single value, we may not want Spark to create a DataFrame for it. Similarly, SQL statements such as UPDATE and DELETE are not supported directly in Spark, so we need to switch between the two frequently. …



Whenever I get a chance to work in Scala, I love that it is as simple as Python, as handy as Perl, and runs like Java.

Scala’s implicits are a way to add syntactic sugar and write compact code in your own style, which makes that code easier to call from external classes or code blocks. Here are a few custom definitions I use frequently:

I. CamelCase:

A camel case implicit function can be handy when storing strings such as first and last names. It can also be called easily from Spark using a UDF as shown below:

scala> "niroj pattnaik".toCamelCase
val res21: String = Niroj…
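The definition behind `toCamelCase` is not shown in this excerpt, so here is a minimal sketch of how such an implicit could look. It assumes the goal is to capitalize the first letter of each whitespace-separated word and lowercase the rest. The names `StringSyntax` and `CamelCaseOps` are hypothetical and are not taken from the original code.

```scala
object StringSyntax {
  // Hypothetical enrichment: adds toCamelCase to every String in scope.
  // Assumption: "camel case" here means capitalizing each word,
  // which suits values such as first and last names.
  implicit class CamelCaseOps(private val s: String) extends AnyVal {
    def toCamelCase: String =
      s.trim
        .split("\\s+")                 // split on runs of whitespace
        .filter(_.nonEmpty)            // drop the empty token produced by ""
        .map(w => w.take(1).toUpperCase + w.drop(1).toLowerCase)
        .mkString(" ")
  }
}
```

After `import StringSyntax._`, any string can call `.toCamelCase` directly. To use it from Spark, one option is to wrap it with `org.apache.spark.sql.functions.udf`, for example `udf((s: String) => s.toCamelCase)`. Note that this sketch does not handle `null` input, so nullable columns would need a guard first.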

About

Niroj Pattnaik

Hadoop Developer | BigData/ETL Engineer | Technical Architect | And a Student. https://www.linkedin.com/in/niroj-kumar-pattnaik-89b64221/
