
lit()
The lit() function creates a column that holds a literal (constant) value. It is commonly used in DataFrame transformations to add a new column with the same fixed value in every row.
Create Spark Session and sample DataFrame
from pyspark.sql import SparkSession
from pyspark.sql.functions import lit

# Initialize Spark Session
spark = SparkSession.builder.appName("litExample").getOrCreate()

# Sample DataFrame
data = [
    ("James", 34, 'james123@gmail.com'),
    ("Anna", 28, 'anna@gmail.com'),
    ("Robert", 45, 'robert@gmail.com'),
]
columns = ["Name", "Age", "email"]
df = spark.createDataFrame(data, columns)
df.show()
Output:
+------+---+------------------+
|  Name|Age|             email|
+------+---+------------------+
| James| 34|james123@gmail.com|
|  Anna| 28|    anna@gmail.com|
|Robert| 45|  robert@gmail.com|
+------+---+------------------+
Example: Use lit() to Add a New Column
lit("USA"): creates a column with the constant string value "USA".
df_with_literal = df.withColumn("Country", lit("USA"))
df_with_literal.show()
Output:
+------+---+------------------+-------+
|  Name|Age|             email|Country|
+------+---+------------------+-------+
| James| 34|james123@gmail.com|    USA|
|  Anna| 28|    anna@gmail.com|    USA|
|Robert| 45|  robert@gmail.com|    USA|
+------+---+------------------+-------+
# Stop the Spark Session
spark.stop()