By Patricia Bobe

Tableau in Standard Reporting: A Deployment Process Using the REST API via Python

Introduction

Tableau is used primarily for ad hoc reporting, where simple drag and drop quickly produces initial analysis results. It lets users work independently with their data, which eliminates the often lengthy process of requesting specific data or even finished reports from the IT department. However, Tableau is also increasingly used in standard reporting, where the report developer must follow stricter guidelines. Especially in companies with highly standardized processes, reporting also needs a process that complies with those standards and is adhered to. To bridge the gap between self-service BI and data governance, a deployment process for Tableau reports was developed and implemented using the REST API via Python.

To ensure the quality of the analyses in Tableau, a deployment pipeline should be built. This pipeline allows workbooks to be subjected to several tests, e.g. by business departments or Data Stewards.

This ensures that recipients of the workbooks only receive quality-checked and standardized reports that are also CI/CD-compliant and only access valid data from the corresponding environment.

Process description

The process described below refers to an infrastructure with a total of three environments: Development, Test/Integration and Production. Security measures are already in place for these environments, so that each environment only holds the data approved for it (production data is only available in the production environment). This architecture allows for customized user authorizations. The authorization concept is particularly important to meet quality standards and security requirements.

Let's start with Tableau Desktop. The report developer connects to one of the development databases and creates an analysis in the desktop client. Developers know their data and can answer specific questions with it. After completing the workbook, the developer publishes it to the Tableau Server development environment.

Only users in the development environment who are authorized to create Tableau workbooks can upload them to the server and edit them. To ensure the integrity of the data, a so-called Data Steward can check and certify the data source. This way, the user knows that the data is reliable and well prepared.

Only when these workbooks and the data connections are technically functioning and, if necessary, equipped with an authorization structure such as row-level security, do they receive a release tag and are deployed to the integration environment with the next release. Optionally, during each deployment process, the Tableau file is checked in to a versioning tool (such as Git, Harvest, etc.), migrated, and checked out again. Accordingly, versioning and historization are also available outside Tableau Server.

After a successful check, the workbook is transferred to the production environment with the next release and is available to all recipients of the report. On the production environment, the users only have viewer authorizations, so that the workbooks can no longer be edited there. This ensures an appropriate quality standard.

Figure: Transformation to the production environment

Technical realization

This deployment process is primarily implemented via Tableau's REST API using Python. Tableau provides a Python library (the Tableau Server Client) in which many functions are already encapsulated. As a result, the script stays concise and maintainable.

Export

Two Python scripts, embedded in a corresponding shell script, drive the deployment process. The script that is executed first is responsible for downloading the Tableau workbooks including data sources (from the test or integration environment). It logs in to Tableau Server using a token and a technical user whose password is stored in an encrypted file. The corresponding release tag is passed in via the script call. The script then searches for the tagged Tableau artifacts and downloads them to a file system. The folders in the file system are created automatically and mirror the project structure on Tableau Server.
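
The export step could look roughly like the following minimal sketch. It is not the original script: it assumes login with a personal access token (the article's technical user with an encrypted password file is not reproduced), mirrors only the top-level project name as a folder, and the helper names `safe_name`, `project_dir` and `export_tagged_workbooks` are hypothetical.

```python
import re
from pathlib import Path


def safe_name(name: str) -> str:
    """Replace characters that are awkward in file or folder names."""
    return re.sub(r'[\\/:*?"<>|]+', "_", name).strip()


def project_dir(export_root: Path, project_name: str) -> Path:
    """Local folder that mirrors a Tableau Server project (assumed flat)."""
    return export_root / safe_name(project_name)


def export_tagged_workbooks(server_url, token_name, token_value, site_id,
                            release_tag, export_root):
    """Download every workbook carrying release_tag into per-project folders."""
    # Imported here so the helpers above can be used without the library.
    import tableauserverclient as TSC

    auth = TSC.PersonalAccessTokenAuth(token_name, token_value, site_id=site_id)
    server = TSC.Server(server_url, use_server_version=True)

    # Server-side filter: only workbooks tagged with the release tag.
    options = TSC.RequestOptions()
    options.filter.add(TSC.Filter(TSC.RequestOptions.Field.Tags,
                                  TSC.RequestOptions.Operator.Equals,
                                  release_tag))

    downloaded = []
    with server.auth.sign_in(auth):
        for workbook in TSC.Pager(server.workbooks, options):
            folder = project_dir(Path(export_root), workbook.project_name)
            folder.mkdir(parents=True, exist_ok=True)
            # Passing a directory lets the library choose the file name.
            path = server.workbooks.download(workbook.id, filepath=str(folder))
            downloaded.append(path)
    return downloaded
```

A shell wrapper would then call `export_tagged_workbooks(...)` with the release tag taken from its own arguments. Published data sources can be handled analogously via `server.datasources`.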

Furthermore, specific information such as the

  • ID,
  • workbook name,
  • connection type,
  • server address and
  • connection to the data source of the Tableau workbook

is written to an automatically created configuration file. The name of the configuration file is derived automatically from the name of the workbook and the environment name, e.g. Analysis_Measures_DEV.txt. This file is also stored in the file system and is included in the deployment process.
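
As an illustration, such a configuration file could be produced as sketched below. The article does not specify the file format, so the `key=value` layout, the field names in `FIELDS` and all function names are assumptions made for this example only.

```python
from pathlib import Path

# Assumed field names, following the list above (not the original format).
FIELDS = ("id", "workbook_name", "connection_type",
          "server_address", "datasource_name")


def config_file_name(workbook_name: str, environment: str) -> str:
    """Derive the file name from workbook name and environment name."""
    return f"{workbook_name.strip().replace(' ', '_')}_{environment}.txt"


def format_connection_config(connections) -> str:
    """Render one key=value block per connection, separated by blank lines."""
    blocks = []
    for connection in connections:
        blocks.append("\n".join(f"{key}={connection[key]}" for key in FIELDS))
    return "\n\n".join(blocks) + "\n"


def write_connection_config(folder, workbook_name, environment, connections):
    """Write the configuration file next to the exported workbook."""
    path = Path(folder) / config_file_name(workbook_name, environment)
    path.write_text(format_connection_config(connections), encoding="utf-8")
    return path
```

With the Tableau Server Client, the connection records would typically be collected after `server.workbooks.populate_connections(workbook)` from the items in `workbook.connections`.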

Configuration of the data connections

To reconnect the Tableau workbooks to the correct data sources during the upcoming import to the next stage, the connection information must be captured. A distinction is made between extracts and live connections as well as between embedded and separately published data sources, and the configuration file of the Tableau workbook differs accordingly. Separately published data sources need a configuration of their own: in this case, connection configuration files are created not only for the workbooks, but also for the data sources themselves.

In the configuration file of the workbook, which contains embedded data sources, the Tableau Server is referenced as the data source. For extracts, Tableau Server returns the entry "localhost" instead of the server name, so the entry must be traced back to the corresponding data source path (the path to the .hyper file).
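
The distinction described above can be sketched as a small classification step. This is an assumption-laden illustration: it treats the connection type `sqlproxy` as the marker for a published data source and the address "localhost" as the marker for an extract, and `hyper_paths` is a hypothetical lookup from connection ID to the extract path.

```python
def classify_connection(connection_type: str, server_address: str) -> str:
    """Return 'published', 'extract' or 'live' for one connection record."""
    if connection_type == "sqlproxy":
        # Assumed marker for a separately published data source.
        return "published"
    if server_address == "localhost":
        # Tableau Server reports extracts with 'localhost' as the address.
        return "extract"
    return "live"


def resolve_address(connection: dict, hyper_paths: dict) -> str:
    """Replace 'localhost' with the extract path; keep other addresses."""
    kind = classify_connection(connection["connection_type"],
                               connection["server_address"])
    if kind == "extract":
        # hyper_paths is a hypothetical mapping: connection ID -> .hyper path.
        return hyper_paths[connection["id"]]
    return connection["server_address"]
```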

Figure: Configuration of the data connections

Adjustment of the connection information

As indicated earlier, there are three environments not only for Tableau Server, but also for the source systems. This means that depending on the environment, the report must reference a different database. The deployment process therefore not only copies the Tableau artifacts, but also adjusts the connection information of the reports to the database of the corresponding environment.

For this purpose, a second configuration file for the integration/production environment is already created when the files are exported from Tableau Server. This file must either be adapted manually, or the new database name is entered automatically if a corresponding mapping file exists (if test environment = a, then integration environment = b).
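
The automatic variant could work along the lines of this minimal sketch. The mapping file layout (`source = target`, one pair per line, `#` for comments), the configuration keys `server_address` and `database`, and both function names are assumptions for illustration.

```python
def parse_mapping(text: str) -> dict:
    """Parse lines of the form 'source = target' into a dictionary."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        source, _, target = line.partition("=")
        mapping[source.strip()] = target.strip()
    return mapping


def remap_config(config: dict, mapping: dict,
                 keys=("server_address", "database")) -> dict:
    """Return a copy of config with mapped values replaced.

    Values without an entry in the mapping are left unchanged, so they
    can still be adapted manually afterwards.
    """
    remapped = dict(config)
    for key in keys:
        if key in remapped:
            remapped[key] = mapping.get(remapped[key], remapped[key])
    return remapped
```

For example, a mapping file containing `db_test = db_int` turns a configuration entry `database=db_test` into `database=db_int` for the integration environment.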

Import

The script reads the folder structure of the file system and deploys the Tableau workbooks and data sources in an identical project structure on the next environment. If these projects do not yet exist, the script creates them automatically (optionally including rights assignment). During the deployment the configuration files are read, so that the workbooks are connected correctly to the data sources and/or the data sources to the source systems. At the end of the script, the deployed connections are replaced by the connection information from the environment-specific configuration file. After deployment to the production environment, quality-assured Tableau analyses with connections to the production data are available to viewers.
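
A rough sketch of the import step with the Tableau Server Client follows. It assumes an already signed-in `server` object, a flat project structure (one folder level below the export root, unique project names), and a hypothetical `address_map` from old to new server addresses; rights assignment and published data sources are left out.

```python
from pathlib import Path


def project_for(file_path, export_root) -> str:
    """Project name = first folder below the export root (assumed flat)."""
    return Path(file_path).relative_to(export_root).parts[0]


def deploy_workbooks(server, export_root, address_map):
    """Publish exported workbooks and repoint their connections."""
    # Imported here so project_for() can be used without the library.
    import tableauserverclient as TSC

    projects = {project.name: project for project in TSC.Pager(server.projects)}

    # "*.twb*" matches both .twb and .twbx files.
    for file_path in sorted(Path(export_root).rglob("*.twb*")):
        name = project_for(file_path, export_root)
        if name not in projects:
            # Create the project on the target environment if it is missing.
            projects[name] = server.projects.create(TSC.ProjectItem(name=name))

        item = TSC.WorkbookItem(project_id=projects[name].id,
                                name=file_path.stem)
        published = server.workbooks.publish(
            item, str(file_path), TSC.Server.PublishMode.Overwrite)

        # Replace the deployed connection details with the target values.
        server.workbooks.populate_connections(published)
        for connection in published.connections:
            new_address = address_map.get(connection.server_address)
            if new_address:
                connection.server_address = new_address
                server.workbooks.update_connection(published, connection)
```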

Benefit

In corporate groups with highly standardized processes, it is often difficult to take advantage of self-service BI while still complying with strict governance guidelines. On the one hand, the self-service idea should enable employees to access data quickly and create analyses independently. This not only makes the analysis path more flexible, but also reduces the burden on IT by cutting down on processes. On the other hand, the integrity and validity of the data must be ensured, access must be regulated appropriately, and the quality standards for production dashboards must be secured. This tension between self-service BI and data governance can be resolved by an automated deployment process.

Deployment via the Tableau REST API makes it possible both to integrate Tableau into a group's standard reporting and adapt it to the group's processes, and to keep options for exploratory data analysis open in the development environment. On the way to the production environment, technical and business tests as well as authorization concepts ensure access to the right data sources and the quality of the analyses. Additional benefits are the traceability and maintainability of the build process.

This answers the question of how Tableau can be used for standard reporting, and also whether an automated deployment process is worthwhile. The benefit is clear: developers are satisfied, the IT department is relieved, and the recipients of the standard reports are fully informed.

About the author

Patricia Bobe focuses on BI requirements, dashboard creation and visual analytics as a Tableau artist. Furthermore, as a Solution Expert, she brings several years of project experience in various industries and sectors, including banking, manufacturing and retail, as well as expertise in data discovery & reporting to Woodmark.