Seattle Fire Department 911 dispatches.
Note

Microsoft provides Azure Open Datasets on an "as is" basis. Microsoft makes no warranties, express or implied, guarantees, or conditions with respect to your use of the datasets. To the extent permitted under your local law, Microsoft disclaims all liability for any damages or losses, including direct, consequential, special, indirect, incidental, or punitive, resulting from your use of the datasets.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.
Volume and retention

This dataset is stored in Parquet format. It is updated daily and contained about 800,000 rows (20 MB) in total as of 2019.

This dataset contains historical records accumulated from 2010 to the present. You can use parameter settings in our SDK to fetch data within a specific time range; see the Data access examples below.
Storage location

This dataset is stored in the East US Azure region. Allocating compute resources in East US is recommended for affinity.

This dataset is sourced from the City of Seattle government. For more information, see the City of Seattle website. See Licensing and Attribution for the terms of using this dataset. If you have any questions about the data source, email open.data@seattle.gov.
Columns

| Name | Data type | Unique | Values (sample) | Description |
| --- | --- | --- | --- | --- |
| address | string | 196,965 | 517 3rd Av, 318 2nd Av Et S | Location of the incident. |
| category | string | 232 | Aid Response, Medic Response | Response type. |
| dataSubtype | string | 1 | 911_Fire | "911_Fire" |
| dataType | string | 1 | Safety | "Safety" |
| dateTime | timestamp | 1,533,401 | 2020-11-04 06:49:00, 2019-06-19 13:49:00 | The date and time of the call. |
| latitude | double | 94,332 | 47.602172, 47.600194 | This is the latitude value. Lines of latitude are parallel to the equator. |
| longitude | double | 79,492 | -122.330863, -122.330541 | This is the longitude value. Lines of longitude run perpendicular to lines of latitude, and all pass through both poles. |
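As a quick sanity check against this schema, here is a minimal pandas sketch, assuming the dataset has already been loaded into a DataFrame named `safety` as shown in the Data access section below:

# Assumes `safety` is a pandas DataFrame produced by SeattleSafety(...).to_pandas_dataframe().
print(safety.dtypes)                                  # column names and data types
print(safety['category'].value_counts().head())       # most frequent response types
print(safety[['latitude', 'longitude']].describe())   # coordinate ranges (should fall within Seattle)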
Preview

| dataType | dataSubtype | dateTime | category | subcategory | status | address | latitude | longitude | source | extendedProperties |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Safety | 911_Fire | 4/28/2021 5:22:00 AM | Rubbish Fire | null | null | 200 University St | 47.607299 | -122.337087 | null | |
| Safety | 911_Fire | 4/28/2021 5:15:00 AM | Triaged Incident | null | null | 6th Ave / Olive Way | 47.61313 | -122.336282 | null | |
| Safety | 911_Fire | 4/28/2021 5:12:00 AM | Aid Response | null | null | 4th Ave S / Seattle Blvd S | 47.596486 | -122.329046 | null | |
| Safety | 911_Fire | 4/28/2021 5:09:00 AM | Rubbish Fire | null | null | 3rd Ave / University St | 47.607763 | -122.335976 | null | |
| Safety | 911_Fire | 4/28/2021 4:57:00 AM | Low Acuity Response | null | null | 533 3rd Ave W | 47.623717 | -122.360635 | null | |
| Safety | 911_Fire | 4/28/2021 4:57:00 AM | Transfer to AMR | null | null | 4638 S Austin St | 47.534702 | -122.274812 | null | |
| Safety | 911_Fire | 4/28/2021 4:55:00 AM | Triaged Incident | null | null | 8th Ave N / Harrison St | 47.622051 | -122.341066 | null | |
Data access
Azure Notebooks
# This is a package in preview.
from azureml.opendatasets import SeattleSafety
from datetime import datetime
from dateutil import parser
end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = SeattleSafety(start_date=start_date, end_date=end_date)
safety = safety.to_pandas_dataframe()
safety.info()
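As a quick follow-up, here is a minimal sketch of filtering the loaded DataFrame by one of the sample categories from the column table above ('Aid Response'):

# Keep only aid-response calls and look at the most recent ones.
aid_calls = safety[safety['category'] == 'Aid Response']
print(aid_calls.sort_values('dateTime', ascending=False)[['dateTime', 'address']].head())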
# Pip install packages
import os, sys
!{sys.executable} -m pip install azure-storage-blob
!{sys.executable} -m pip install pyarrow
!{sys.executable} -m pip install pandas
# Azure storage access info
azure_storage_account_name = "azureopendatastorage"
azure_storage_sas_token = r""
container_name = "citydatacontainer"
folder_name = "Safety/Release/city=Seattle"
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
if azure_storage_account_name is None or azure_storage_sas_token is None:
    raise Exception(
        "Provide your specific name and key for your Azure Storage account--see the Prerequisites section earlier.")
print('Looking for the first parquet under the folder ' +
folder_name + ' in container "' + container_name + '"...')
container_url = f"https://{azure_storage_account_name}.blob.core.windows.net/"
blob_service_client = BlobServiceClient(
container_url, azure_storage_sas_token if azure_storage_sas_token else None)
container_client = blob_service_client.get_container_client(container_name)
blobs = container_client.list_blobs(folder_name)
sorted_blobs = sorted(list(blobs), key=lambda e: e.name, reverse=True)
targetBlobName = ''
for blob in sorted_blobs:
    if blob.name.startswith(folder_name) and blob.name.endswith('.parquet'):
        targetBlobName = blob.name
        break
print('Target blob to download: ' + targetBlobName)
_, filename = os.path.split(targetBlobName)
blob_client = container_client.get_blob_client(targetBlobName)
with open(filename, 'wb') as local_file:
    blob_client.download_blob().download_to_stream(local_file)
# Read the parquet file into Pandas data frame
import pandas as pd
print('Reading the parquet file into Pandas data frame')
df = pd.read_parquet(filename)
# You can add your own filter below (see the example after this block).
print('Loaded as a Pandas data frame: ')
df
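For example, here is a minimal sketch of the kind of filter the comment above refers to, restricting the frame to a date range and a single response category (column names are from the schema above):

# Example filter: calls from 2021 onward in the 'Aid Response' category.
df['dateTime'] = pd.to_datetime(df['dateTime'])
subset = df[(df['dateTime'] >= '2021-01-01') & (df['category'] == 'Aid Response')]
print('Filtered rows: ' + str(len(subset)))
print(subset[['dateTime', 'address', 'latitude', 'longitude']].head())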
Azure Databricks
# This is a package in preview.
# You need to pip install azureml-opendatasets in Databricks cluster. https://learn.microsoft.com/azure/data-explorer/connect-from-databricks#install-the-python-library-on-your-azure-databricks-cluster
from azureml.opendatasets import SeattleSafety
from datetime import datetime
from dateutil import parser
end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = SeattleSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
display(safety.limit(5))
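As a follow-up, here is a small Spark sketch that aggregates the loaded DataFrame, counting dispatches per response category:

# Count dispatches per category on the Spark DataFrame loaded above.
from pyspark.sql import functions as F
display(safety.groupBy('category').agg(F.count('*').alias('calls')).orderBy(F.desc('calls')))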
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=Seattle"
blob_sas_token = r""
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
blob_sas_token)
print('Remote blob path: ' + wasbs_path)
# Read the parquet files with Spark; note that this is lazy and doesn't load any data yet.
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))
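Once the temporary view is registered, any Spark SQL statement can run against it; for example, here is a sketch that counts rows per category:

# Example aggregate over the 'source' temporary view registered above.
display(spark.sql('SELECT category, COUNT(*) AS calls FROM source GROUP BY category ORDER BY calls DESC LIMIT 10'))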
Azure Synapse
# This is a package in preview.
from azureml.opendatasets import SeattleSafety
from datetime import datetime
from dateutil import parser
end_date = parser.parse('2016-01-01')
start_date = parser.parse('2015-05-01')
safety = SeattleSafety(start_date=start_date, end_date=end_date)
safety = safety.to_spark_dataframe()
# Display top 5 rows
display(safety.limit(5))
# Azure storage access info
blob_account_name = "azureopendatastorage"
blob_container_name = "citydatacontainer"
blob_relative_path = "Safety/Release/city=Seattle"
blob_sas_token = r""
# Allow SPARK to read from Blob remotely
wasbs_path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set(
'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
blob_sas_token)
print('Remote blob path: ' + wasbs_path)
# Read the parquet files with Spark; note that this is lazy and doesn't load any data yet.
df = spark.read.parquet(wasbs_path)
print('Register the DataFrame as a SQL temporary view: source')
df.createOrReplaceTempView('source')
# Display top 10 rows
print('Displaying top 10 rows: ')
display(spark.sql('SELECT * FROM source LIMIT 10'))
SELECT
TOP 100 *
FROM
OPENROWSET(
BULK 'https://azureopendatastorage.blob.core.windows.net/citydatacontainer/Safety/Release/city=Seattle/*.parquet',
FORMAT = 'parquet'
) AS [r];
Examples

Next steps

View the rest of the datasets in the Open Datasets catalog.