
add_months()
The add_months() function is used for date manipulation: it adds a specified number of months to a date column.
Usage
- add_months() takes two arguments: a column containing dates and an integer representing the number of months to add.
- It returns a new column with updated dates.
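To make the month arithmetic concrete, here is a minimal plain-Python sketch of what adding months to a date involves. The helper name add_months_model is hypothetical and is not part of PySpark. The day clamping for shorter months (for example, January 31 plus one month giving February 28) is an assumption about how add_months() behaves in recent Spark versions, so verify it against your Spark release.

```python
import calendar
from datetime import date


def add_months_model(d: date, months: int) -> date:
    """Plain-Python model of month arithmetic (hypothetical helper, not a Spark API)."""
    # Work in "months since year 0" so negative offsets and year rollovers are handled.
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Assumption: clamp the day when the target month is shorter than the source day.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


samples = [
    (date(2021, 1, 1), 3),
    (date(2021, 6, 24), 3),
    (date(2021, 1, 31), 1),
    (date(2021, 1, 15), -2),
]
for start, n in samples:
    print(start, n, add_months_model(start, n))
```

The first two samples are the same dates used in the DataFrame example in this article. The last two illustrate a month-end date and a negative month count.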
Create Spark Session and sample DataFrame
from pyspark.sql import SparkSession
from pyspark.sql.functions import add_months, to_date

# Initialize Spark Session
spark = SparkSession.builder.appName("addMonthsExample").getOrCreate()

# Sample DataFrame with Date String
data = [("2021-01-01",), ("2021-06-24",)]
columns = ["Date"]
df = spark.createDataFrame(data, columns)
df.show()
Output:
+----------+
|      Date|
+----------+
|2021-01-01|
|2021-06-24|
+----------+
Example: Use add_months() to Add Months to a Date
# Convert String to Date Type
df = df.withColumn("Date", to_date("Date"))

# Adding Months to the Date
df_with_months_added = df.withColumn("Date Plus 3 Months", add_months("Date", 3))
df_with_months_added.show()
Output:
+----------+------------------+
|      Date|Date Plus 3 Months|
+----------+------------------+
|2021-01-01|        2021-04-01|
|2021-06-24|        2021-09-24|
+----------+------------------+
- .withColumn() and to_date(): these two methods are used together to convert the string-type column Date of DataFrame df to date type.
- add_months("Date", 3): adds 3 months to each date in the Date column.
- The resulting dates are stored in a new column, Date Plus 3 Months.
# Stop the Spark Session
spark.stop()