PublicHolidays Class
Represents the Public Holidays public dataset.
This dataset contains worldwide public holiday data sourced from the PyPI holidays package and Wikipedia, covering 38 countries or regions from 1970 to 2099. Each row gives the holiday information for a specific date and country or region, and indicates whether most people have paid time off. For more information about this dataset, including column descriptions, different ways to access the dataset, and examples, see Public Holidays in the Microsoft Azure Open Datasets catalog.
Initialize filtering fields.
Inheritance
PublicHolidays
Constructor
PublicHolidays(country_or_region: str = '', start_date: datetime = datetime.datetime(2008, 1, 1, 0, 0), end_date: datetime = datetime.datetime(2024, 10, 18, 0, 0), cols: List[str] | None = None, enable_telemetry: bool = True)
Parameters
Name | Description |
---|---|
country_or_region | Required. The country or region to return data for. |
start_date | The date at which to start loading data, inclusive. If None, the default value is used. Default value: 2008-01-01 00:00:00 |
end_date | The date at which to end loading data, inclusive. If None, the default value is used. Default value: 2024-10-18 00:00:00 |
cols | A list of column names to load from the dataset. If None, all columns are loaded. For information on the available columns in this dataset, see Public Holidays. Default value: None |
enable_telemetry | Whether to enable telemetry on this dataset. Default value: True |
Remarks
The example below shows how to access the dataset.
from datetime import datetime

from azureml.opendatasets import PublicHolidays
from dateutil.relativedelta import relativedelta

# Query the most recent month, ending today.
end_date = datetime.today()
start_date = end_date - relativedelta(months=1)

# Load the matching rows into a pandas DataFrame.
hol = PublicHolidays(start_date=start_date, end_date=end_date)
hol_df = hol.to_pandas_dataframe()
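As a further sketch, the constructor parameters can be combined to narrow a query to one country or region and a fixed look-back window. The helper names `build_query` and `load_holidays` are hypothetical, and "United States" is an assumed example of a `country_or_region` value. The loader is defined but not called here, because it needs the azureml-opendatasets package and network access.

```python
from datetime import datetime, timedelta


def build_query(country_or_region, days_back, today=None, cols=None):
    """Build keyword arguments for PublicHolidays (hypothetical helper)."""
    end_date = today if today is not None else datetime.today()
    start_date = end_date - timedelta(days=days_back)
    return {
        "country_or_region": country_or_region,
        "start_date": start_date,  # documented as inclusive
        "end_date": end_date,      # documented as inclusive
        "cols": cols,              # None loads all columns
    }


def load_holidays(query):
    """Load the selected rows into a pandas DataFrame (hypothetical helper)."""
    # Imported lazily so the rest of the sketch runs without the package.
    from azureml.opendatasets import PublicHolidays

    return PublicHolidays(**query).to_pandas_dataframe()


# "United States" is an assumed example value for country_or_region.
query = build_query("United States", days_back=30, today=datetime(2024, 1, 31))
```

Calling `load_holidays(query)` would then perform the actual load. Keeping the argument construction separate makes the date window easy to check without touching the dataset.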
Methods
filter | Filter the data by date range. |
filter
Filter the data to the range from min_date to max_date.
filter(env: SparkEnv | PandasEnv, min_date: datetime, max_date: datetime)
Parameters
Name | Description |
---|---|
env | Required. The runtime environment (SparkEnv or PandasEnv). |
min_date | Required. The minimum date of the range. |
max_date | Required. The maximum date of the range. |
Returns
Type | Description |
---|---|
The filtered data frame. |
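The filtering step can be pictured with a small standalone sketch. It is not the library's implementation: `filter_by_date` is a hypothetical helper, the rows are illustrative sample data held in plain Python rather than a Spark or pandas data frame, the `date` key is an assumed column name, and treating both bounds as inclusive is an assumption carried over from the constructor's `start_date` and `end_date` descriptions.

```python
from datetime import datetime

# Illustrative rows only; the real dataset is loaded through PublicHolidays.
rows = [
    {"countryOrRegion": "United States", "date": datetime(2024, 1, 1)},
    {"countryOrRegion": "United States", "date": datetime(2024, 7, 4)},
    {"countryOrRegion": "United States", "date": datetime(2024, 12, 25)},
]


def filter_by_date(rows, min_date, max_date):
    """Keep rows whose date lies in [min_date, max_date], both ends inclusive."""
    return [row for row in rows if min_date <= row["date"] <= max_date]


# A row dated exactly on min_date is kept because the bound is inclusive.
summer = filter_by_date(rows, datetime(2024, 7, 4), datetime(2024, 8, 31))
```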
Attributes
country_or_region_column_name
country_or_region_column_name = 'countryOrRegion'
countrycode_column_name
countrycode_column_name = 'countryRegionCode'
default_end_date
default_end_date = datetime.datetime(2024, 10, 18, 0, 0)
default_max_end_date
default_max_end_date = datetime.datetime(2099, 1, 1, 0, 0)
default_start_date
default_start_date = datetime.datetime(2008, 1, 1, 0, 0)
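One way to read the default attributes above is as fallbacks for missing constructor arguments. The sketch below is an assumption about that behavior, not the library's code: `resolve_date_range` is a hypothetical helper, the constants copy the attribute values listed on this page, and capping the end date at `default_max_end_date` is a guess at how that attribute is meant to be used.

```python
from datetime import datetime

# Values copied from the Attributes section of this page.
DEFAULT_START_DATE = datetime(2008, 1, 1)
DEFAULT_END_DATE = datetime(2024, 10, 18)
DEFAULT_MAX_END_DATE = datetime(2099, 1, 1)


def resolve_date_range(start_date=None, end_date=None):
    """Fill in missing bounds with the defaults (hypothetical helper)."""
    start = DEFAULT_START_DATE if start_date is None else start_date
    end = DEFAULT_END_DATE if end_date is None else end_date
    # Assumption: an end date past the maximum is capped rather than rejected.
    end = min(end, DEFAULT_MAX_END_DATE)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    return start, end


start, end = resolve_date_range()
```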