How to generate k random dates between two other dates using Python?


Generating random data is a common task in data science. Many datasets, from neural network training data to stock market records, include dates as one of their fields, and we may need to generate random dates between two given dates for statistical analysis or testing. This article shows how to generate k random dates between two given dates.

Use random and datetime Modules

The datetime module is part of Python's standard library and deals with dates and times. The random module, on the other hand, produces pseudo-random numbers. Hence, we can combine the random and datetime modules to generate a random date between two dates.

Syntax

random.randint(start, end)

Here random refers to Python's random module. The randint method takes two arguments, start and end, and returns a single random integer N such that start <= N <= end (both endpoints are included). It has no parameter for the number of values, so to get k numbers we call it k times, for example inside a loop.
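As a quick illustration of the call itself, the sketch below draws one offset and then k offsets with random.randint. The variable names and the fixed seed are assumptions made for this example only; the seed is there just so repeated runs give the same values.

```python
import random

random.seed(42)  # fixed seed only so repeated runs give the same values

start, end, k = 0, 6, 5

# randint(start, end) returns one integer N with start <= N <= end.
single_offset = random.randint(start, end)

# Calling it k times yields k offsets; duplicates are possible.
offsets = [random.randint(start, end) for _ in range(k)]

print(single_offset)
print(offsets)
```

Each offset can then be added to a start date as a number of days.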

Example

In the following example, we created a function named generate_random_dates which takes the start date, the end date, and the number of random dates to generate as arguments. For each of the k dates, we used the random module to pick a random number of days between 0 and the length of the date range. We added this number of days to the start date, so the result never falls after the end date.

import random
from datetime import timedelta, datetime
def generate_random_dates(start_date, end_date, k):
    random_dates = []
    date_range = end_date - start_date
    for _ in range(k):
        random_days = random.randint(0, date_range.days)
        random_date = start_date + timedelta(days=random_days)
        random_dates.append(random_date)
    return random_dates
start_date = datetime(2023, 5, 25)
end_date = datetime(2023, 5, 31)
random_dates = generate_random_dates(start_date, end_date, 5)
print("The random dates generated are:")
for index, date in enumerate(random_dates):
    print(f"{index+1}. {date.strftime('%Y-%m-%d')}")

Output

The random dates generated are:
1. 2023-05-27
2. 2023-05-26
3. 2023-05-27
4. 2023-05-25
5. 2023-05-29
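The output above contains the same date more than once, because each date is drawn independently. If distinct dates are needed, one possible variation is to sample day offsets without replacement using random.sample. The helper name generate_unique_random_dates is hypothetical and not part of the original example; this is a minimal sketch that assumes k does not exceed the number of days in the range.

```python
import random
from datetime import date, timedelta

def generate_unique_random_dates(start_date, end_date, k):
    """Return k distinct dates in [start_date, end_date] (hypothetical helper)."""
    total_days = (end_date - start_date).days + 1  # inclusive day count
    if k > total_days:
        raise ValueError("k exceeds the number of days in the range")
    # random.sample picks k different offsets from range(total_days).
    offsets = random.sample(range(total_days), k)
    return [start_date + timedelta(days=offset) for offset in offsets]

unique_dates = generate_unique_random_dates(date(2023, 5, 25), date(2023, 5, 31), 5)
for d in sorted(unique_dates):
    print(d.isoformat())
```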

Using datetime and Hashing Method

Python's built-in hash function maps an object to a fixed-size integer known as its hash value. The result looks random but is fully determined by the input, so this approach produces random-looking dates rather than truly random ones. Taking the hash value modulo date_range constrains it to a valid day offset within the desired date range. Note that Python salts string hashes per interpreter process by default, so the output below changes from run to run unless the PYTHONHASHSEED environment variable is set.

Syntax

hash(str(<some value>)) % <range of dates>

The hash function takes the string and returns an integer hash value whose exact value depends on the Python build and the running process. % is the modulo operator, which calculates the remainder after division. In Python the remainder of a division by a positive number is never negative, so the result always lies between 0 and the range minus one.

Example

In the following code, we iterated k times. In each iteration we used the hash function to hash the string form of the loop counter. Next, we applied the modulo operation with the date range to ensure that the date lies between the start and end dates. We appended each generated date to our list named random_dates.

from datetime import timedelta, datetime

def generate_random_dates(start_date, end_date, k):
   random_dates = []
   date_range = (end_date - start_date).days + 1

   for i in range(k):
      random_days = hash(str(i)) % date_range
      random_date = start_date + timedelta(days=random_days)
      random_dates.append(random_date)

   return random_dates

# Example usage
start_date = datetime(2023, 5, 25)
end_date = datetime(2023, 5, 31)
random_dates = generate_random_dates(start_date, end_date, 5)

print("The random dates generated are:")
for index, date in enumerate(random_dates):
   print(f"{index+1}. {date.strftime('%Y-%m-%d')}")

Output

The random dates generated are:
1. 2023-05-28
2. 2023-05-28
3. 2023-05-25
4. 2023-05-27
5. 2023-05-28
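Because the built-in hash of a string is salted per process, the dates above are not reproducible across runs. If stable results are wanted, one alternative is to swap in a cryptographic digest from hashlib, which gives the same value for the same input every time. This is a different technique from the built-in hash used above; the helper name generate_hashed_dates and the salt parameter are assumptions introduced only for this sketch.

```python
import hashlib
from datetime import datetime, timedelta

def generate_hashed_dates(start_date, end_date, k, salt="demo"):
    """Derive k random-looking dates from SHA-256 digests (hypothetical helper)."""
    date_range = (end_date - start_date).days + 1  # inclusive day count
    dates = []
    for i in range(k):
        # The digest depends only on the salt and the counter, not on the process.
        digest = hashlib.sha256(f"{salt}-{i}".encode("utf-8")).digest()
        offset = int.from_bytes(digest, "big") % date_range
        dates.append(start_date + timedelta(days=offset))
    return dates

hashed_dates = generate_hashed_dates(datetime(2023, 5, 25), datetime(2023, 5, 31), 5)
for index, d in enumerate(hashed_dates):
    print(f"{index+1}. {d.strftime('%Y-%m-%d')}")
```

Changing the salt gives a different but equally repeatable sequence. As with the built-in hash, the dates are deterministic, so this is not a substitute for the random module when real unpredictability matters.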

Use NumPy and Pandas Libraries

NumPy and Pandas are popular Python libraries for mathematical computation and data analysis. NumPy has a random module that we can use to generate arrays of random numbers. On the other hand, we can use the Pandas library to convert those numbers into dates.

Syntax

numpy.random.randint(low, high=None, size=None, dtype=int)

Here random is a module of the NumPy library. The randint function returns random integers from low (inclusive) to high (exclusive). If high is omitted, the integers are drawn from 0 up to, but not including, low. size defines the shape of the output array, and dtype represents the data type of the elements.

Example

In the following code, we have created a function named generate_random_dates which takes the start date, the end date, and the number of dates as parameters and returns the random dates as a Pandas DatetimeIndex. We used the NumPy library to generate the random day offsets and the Pandas library to turn them into dates.

from datetime import datetime  # needed for the datetime(...) calls below
import numpy as np
import pandas as pd
def generate_random_dates(start_date, end_date, k):
   date_range = (end_date - start_date).days + 1
   random_days = np.random.randint(date_range, size=k)
   random_dates = pd.to_datetime(start_date) + pd.to_timedelta(random_days, unit='d')
   return random_dates
start_date = datetime(2021, 5, 25)
end_date = datetime(2021, 5, 31)
print("The random dates generated are:")
random_dates = generate_random_dates(start_date, end_date, 5)
for index,date in enumerate(random_dates):
   print(f"{index+1}. {date.strftime('%Y-%m-%d')}")

Output

The random dates generated are:
1. 2021-05-26
2. 2021-05-27
3. 2021-05-27
4. 2021-05-25
5. 2021-05-27
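For reproducible results, NumPy's Generator API (numpy.random.default_rng) accepts a seed, and NumPy's own datetime64 type can hold the dates without Pandas. The following is a minimal sketch under those assumptions; the function name generate_random_dates_np and its seed parameter are hypothetical and not part of the example above.

```python
import numpy as np

def generate_random_dates_np(start, end, k, seed=None):
    """Return k random days in [start, end] as a datetime64[D] array (hypothetical helper)."""
    rng = np.random.default_rng(seed)  # a seeded generator gives repeatable draws
    start_day = np.datetime64(start, "D")
    end_day = np.datetime64(end, "D")
    n_days = int((end_day - start_day) / np.timedelta64(1, "D")) + 1
    # integers() excludes the upper bound, so offsets fall in 0 .. n_days - 1.
    offsets = rng.integers(0, n_days, size=k)
    return start_day + offsets.astype("timedelta64[D]")

dates = generate_random_dates_np("2021-05-25", "2021-05-31", 5, seed=0)
for index, d in enumerate(dates):
    print(f"{index+1}. {d}")
```

Passing the same seed returns the same dates, which helps when results need to be repeated in tests.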

Use random and Arrow Library

Arrow is a third-party Python library (installable with pip install arrow) that offers a friendlier interface for working with dates and times. We can use its get method to parse a date string into an Arrow object and the random module to pick k random points between the start and the end date.

Syntax

arrow.get(date_string, <format of the date string>, tzinfo=<time zone information>)

Here arrow refers to the Arrow module. date_string is the date and time string that we need to parse. The optional second argument describes the format of date_string; it can be omitted when the string is already in a format that Arrow recognizes, such as ISO 8601. tzinfo provides the time zone information.

Example

We used the Arrow library in the following code to generate the random dates. We defined a custom function named generate_random_dates and iterated k times within it. In each iteration, we used the uniform method of the random module to pick a random, possibly fractional, number of days between 0 and the length of the date range. We shifted the start date by that number of days so that the result falls within the range; the time-of-day part is dropped when the date is formatted. We appended the dates to the random_dates list and returned it.

import random
import arrow

def generate_random_dates(start_date, end_date, k):
   random_dates = []
   date_range = (end_date - start_date).days

   for _ in range(k):
      random_days = random.uniform(0, date_range)
      random_date = start_date.shift(days=random_days)
      random_dates.append(random_date)

   return random_dates
start_date = arrow.get('2023-01-01')
end_date = arrow.get('2023-12-31')
random_dates = generate_random_dates(start_date, end_date, 7)
print("The random dates generated are:")
for index,date in enumerate(random_dates):
    print(f"{index+1}. {date.strftime('%Y-%m-%d')}")

Output

The random dates generated are:
1. 2023-02-05
2. 2023-10-17
3. 2023-10-08
4. 2023-04-18
5. 2023-04-02
6. 2023-08-22
7. 2023-01-01

Conclusion

In this article, we discussed how to generate random dates between two given dates using different Python libraries. Generating random dates by hand, without any library, is a tedious task, so using these libraries and methods is recommended. We can use the random and datetime modules, NumPy and Pandas, or Arrow to generate the random dates.

Updated on: 28-Jul-2023
