
ShambhuRai-4099 asked · ShambhuRai-4099 commented

CSV to Hive database load

Hi Expert,

How do I load data from a CSV file into a Hive database via a notebook?

azure-data-factory · azure-databricks · azure-data-lake-storage

from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext("local", "Simple App")
sqlContext = HiveContext(sc)  # HiveContext gives access to Hive tables (supersedes SQLContext here)

# Read the source table over JDBC from SQL Server
df = (sqlContext.read.format("jdbc")
      .option("url", "jdbc:sqlserver://<server>:<port>")
      .option("databaseName", "xxx")
      .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
      .option("dbtable", "xxxx")
      .option("user", "xxxxx")
      .option("password", "xxxxx")
      .load())

df.registerTempTable("test")

# Filter, then write the result out as CSV
df1 = sqlContext.sql("select * from test where xxx = 6")
df1.write.format("com.databricks.spark.csv").save("/xxxx/xxx/ami_saidulu")

# And/or register the result as a Hive table backed by CSV files
df1.write.option("path", "/xxxx/xxx/ami_saidulu").saveAsTable(
    "HIVE_DB.HIVE_TBL", format="csv", mode="append")


I cannot understand where in this code the CSV is connected, or how to map its columns to the target table's columns.


Any suggestions, please?


Hi @ShambhuRai-4099,

When you run the above code snippet, are you seeing any error message?

Meanwhile, you may check out this article, PySpark: Dataframe Write Modes, which explains the different options available when writing a dataframe.



Hi Expert,

Can someone explain or show some example code that elaborates the source and target steps? For example, the source is c:\test\ and the target is a Hive database, using the code from my original question above in a notebook.


1 Answer

ShambhuRai-4099 answered · Samy-7940 commented

Hi Expert,
My question about the above code is: where do I have to mention the CSV path and the Hive path? Can I write this code as one SQL command in the notebook, or do I have to split it?


df1 is the dataframe that you are writing out as CSV. In your code you are reading from a JDBC source and writing to a CSV file; you are not reading from a CSV anywhere.
