
date_add()
The date_add() function adds a specified number of days to a date column in a DataFrame.
Usage
- date_add() takes two arguments: a date column and an integer representing the number of days to add.
- It returns a new column with the calculated dates.
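To make the per-row arithmetic concrete, here is a minimal plain-Python sketch of what adding days to a single date value amounts to. It does not use Spark at all. The helper name add_days and the third sample date are assumptions introduced only for illustration.

```python
from datetime import date, timedelta


def add_days(date_str: str, days: int) -> str:
    """Hypothetical helper: plain-Python model of adding days to one date value."""
    start = date.fromisoformat(date_str)  # parse a "YYYY-MM-DD" string
    return (start + timedelta(days=days)).isoformat()


# The third sample is an extra illustrative value that crosses a year boundary.
samples = ["2021-01-01", "2021-06-24", "2021-12-25"]
shifted = [add_days(d, 10) for d in samples]
print(shifted)  # ['2021-01-11', '2021-07-04', '2022-01-04']
```

Month and year boundaries are handled by the calendar arithmetic itself, which is why 2021-06-24 plus 10 days lands on 2021-07-04.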
Create Spark Session and sample DataFrame
from pyspark.sql import SparkSession
from pyspark.sql.functions import date_add, to_date

# Initialize Spark Session
spark = SparkSession.builder.appName("addDateExample").getOrCreate()

# Sample DataFrame with Date String
data = [("2021-01-01",), ("2021-06-24",)]
columns = ["Date"]
df = spark.createDataFrame(data, columns)
df.show()
Output:
+----------+
|      Date|
+----------+
|2021-01-01|
|2021-06-24|
+----------+
Example: Use date_add to Add Days to a Date
# Convert String to Date Type (withColumn expects the column name as a string)
df = df.withColumn("Date", to_date("Date"))
# Adding Days to the Date
df_with_days_added = df.withColumn("Date Plus 10 Days", date_add(df.Date, 10))
df_with_days_added.show()
Output:
+----------+-----------------+
|      Date|Date Plus 10 Days|
+----------+-----------------+
|2021-01-01|       2021-01-11|
|2021-06-24|       2021-07-04|
+----------+-----------------+
- withColumn() and to_date("Date"): used together, these convert the string column Date of DataFrame df to the date type.
- date_add(df.Date, 10): adds 10 days to each date in the Date column.
- The resulting dates are stored in a new column, Date Plus 10 Days.
# Stop the Spark Session
spark.stop()