
Create View
A temporary view is a named query that exists only for the lifetime of the Spark session; it does not copy or persist the underlying data. If you are more comfortable with SQL than with the DataFrame API, creating a temporary view allows you to run SQL queries directly on the DataFrame.
from pyspark.sql import SparkSession
# Initialize Spark Session
spark = SparkSession.builder.appName("app").getOrCreate()
# Create a sample DataFrame
data = [("James", "Sales", 3000), ("Michael", "Sales", 4600)]
columns = ["Employee Name", "Department", "Salary"]
df = spark.createDataFrame(data, columns)
df.show()
Output:
+-------------+----------+------+
|Employee Name|Department|Salary|
+-------------+----------+------+
|        James|     Sales|  3000|
|      Michael|     Sales|  4600|
+-------------+----------+------+
Create a Temporary View From a PySpark DataFrame
df.createOrReplaceTempView("temp_view_name")
# Now, you can run SQL queries over this view
spark.sql("SELECT * FROM temp_view_name").show()
Output:
+-------------+----------+------+
|Employee Name|Department|Salary|
+-------------+----------+------+
|        James|     Sales|  3000|
|      Michael|     Sales|  4600|
+-------------+----------+------+
spark.stop()