SQL Server Reporting Services
From: Developing big data solutions on Microsoft Azure HDInsight
SQL Server Reporting Services (SSRS) provides a platform for delivering reports that can be viewed interactively in a Web browser, printed, or exported and delivered automatically in a wide range of electronic document formats. These formats include Excel, Word, PDF, image, and XML.
Reports consist of data regions, such as tables of data values or charts, which are based on datasets that encapsulate queries. The datasets used by a report are, in turn, based on data sources that can include relational databases, corporate data models in SQL Server Analysis Services, or other OLE DB or ODBC sources.
Professional report authors, who are usually specialist BI developers, can create reports by using the Report Designer tool in SQL Server Data Tools. This tool provides a Visual Studio-based interface for creating data sources, datasets, and reports; and for publishing a project of related items to a report server. Figure 1 shows Report Designer being used to create a report based on data in HDInsight, which is accessed through an ODBC data source and a dataset that queries a Hive table.
Figure 1 - Report Designer
While many businesses rely on reports created by BI specialists, an increasing number of organizations are empowering business users to create their own self-service reports. To support this scenario, business users can use Report Builder (shown in Figure 2). This is a simplified report authoring tool that is installed on demand from a report server. To further simplify self-service reporting you can have BI professionals create and publish shared data sources and datasets that can be easily referenced in Report Builder, reducing the need for business users to configure connections or write queries.
Figure 2 - Report Builder
Reports are published to a report server, where they can be accessed interactively through a web interface. When Reporting Services is installed in native mode, the interface is provided by a web application named Report Manager (shown in Figure 3). You can also install Reporting Services in SharePoint-integrated mode, in which case the reports are accessed in a SharePoint document library.
Figure 3 - Viewing a report in Report Manager
Reporting Services provides a powerful platform for delivering reports in an enterprise, and can be used effectively in a big data solution. The following table describes specific considerations for using Reporting Services in the HDInsight use cases and models described in this guide.
Use case |
Considerations |
---|---|
Iterative data exploration |
For one-time analysis and data exploration, Excel is generally a more appropriate tool than Reporting Services because it requires less in the way of infrastructure configuration and provides a more dynamic user interface for interactive data exploration. |
Data warehouse on demand |
When HDInsight is used to implement a basic data warehouse, it usually includes a schema of Hive tables that are queried over ODBC connections. In this scenario you can use Reporting Services to create ODBC-based data sources and datasets that query Hive tables. |
ETL automation |
Most ETL scenarios are designed to transform big data into a suitable structure and volume for storage in a relational data source for further analysis and querying. In this scenario, Reporting Services may be used to consume data from the relational data store loaded by the ETL process. |
BI integration |
You can use Reporting Services to integrate data from HDInsight with enterprise BI data at the report level by creating reports that display data from multiple data sources. For example, you could use an ODBC data source to connect to HDInsight and query Hive tables, and an OLE DB data source to connect to a SQL Server data warehouse. However, in an enterprise BI scenario that combines corporate data and big data in formal reports, better integration can generally be achieved by integrating at the data warehouse or corporate data model level, and by using a single data source in Reporting Services to connect to the integrated data. |
Guidelines for using Reporting Services with HDInsight
When using SQL Server Reporting Services with HDInsight, consider the following guidelines:
- When creating a data source for Hive tables, use an explicit ODBC connection string in preference to a DSN. This ensures that the data source does not depend on a DSN on the report server.
- Consider increasing the default timeout value for datasets that query Hive tables. Hive queries over ODBC can take a considerable amount of time.
- Consider using report snapshots, or cached datasets and reports, to improve performance by reducing the number of times that queries are submitted to HDInsight.
The guidance provided here assumes that you want to use SQL Server Reporting Services to consume and visualize data directly from HDInsight. However, in many scenarios the results of HDInsight processing are transferred to a database (for example, a data warehouse implemented in SQL Server) or an analytical data model (for example, a SQL Server Analysis Services cube). You can use Reporting Services to consume and visualize data from practically any data source.