How to streamline communication in data pipelines using Mage

First published on July 24, 2023

 


Guest post by Xiaoxu Gao. I'm a Developer with a focus on Python and Data Engineering. I write stuff to talk to myself and the world.

TLDR

Let the bot handle difficult communications for us.

Have you ever encountered a situation where your downstream data pipelines are blocked by a small manual mistake in one of the Google Sheets? Sometimes, the sheet is not even owned by your team, so you can’t do anything but chase the sheet owner to fix it. Meanwhile, many other critical pipelines are also failing as a consequence, and you need to take care of them as well.

You feel exhausted and drained. The worst part is that there is nothing you can really do as an engineer; it's all endless communication and stakeholder management. The Google Sheets issue is just one example of the source issues that can occur at various scales. Take a moment to pause and consider an issue that resonates with you as we delve into the article.

A key to improving this situation is automating the communication lifecycle within your data pipelines. If your pipeline has an alerting mechanism in place, that's already a good start. However, alerts primarily target the data engineering team rather than external teams.

Based on my experience, it's equally vital to establish proactive communication with the source team or end users, so that they are well informed about ongoing situations and can take action accordingly. Throughout this article, I will use Mage for the implementation: a modern Airflow alternative with features that are effective at solving exactly these problems.

Automated communication

One of the missions of engineers is to automate things: it saves us time in the future, and it's fun. Nobody enjoys continuously chasing the source team to fix data issues, or individually explaining to end users what happened when things are not working. We would rather let a bot do it for us. There are two levels of automation we can implement:

Immediate feedback to the data source team — Rather than manually informing the source team of a data issue, an automated and consistent channel of communication can be established through a bot. Whenever a data test fails, a callback-like function is triggered to notify the source team via email or Slack, providing them with detailed reasons for the failure. For scheduled runs, alerts are dispatched on every execution until the issue is resolved. The repetitive nature of these alerts should raise the team's awareness and prompt them to address the issue without the need for any manual intervention.

Immediate feedback to the data source team (Created by Author)
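
As a minimal sketch of this first level, the bot could email the sheet owner whenever a check fails. Everything here is illustrative: the SMTP host, the addresses, and the helper name are placeholders, not part of any specific stack.

```python
# Minimal sketch: e-mail the sheet owner when a data check fails.
# SMTP host, addresses, and function names are placeholders.
import smtplib
from email.message import EmailMessage


def notify_sheet_owner(sheet_name: str, reason: str) -> None:
    msg = EmailMessage()
    msg['Subject'] = f'[Data pipeline] Validation failed for "{sheet_name}"'
    msg['From'] = 'data-bot@example.com'    # placeholder sender
    msg['To'] = 'sheet-owner@example.com'   # placeholder recipient
    msg.set_content(
        f'The automated checks on "{sheet_name}" failed:\n\n'
        f'{reason}\n\n'
        'This alert repeats on every scheduled run until the issue is fixed.'
    )
    with smtplib.SMTP('smtp.example.com') as smtp:  # placeholder SMTP host
        smtp.send_message(msg)
```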

Data service status page for data users — On the other hand, we need to address data users' requests during downtime. It's common for the data team to receive numerous repetitive questions during data downtime. The purpose of a data service status page is to provide a centralized platform for announcing data issues. Users are encouraged to check the page first, as this self-service page should provide comprehensive yet concise information about ongoing situations and expected outcomes. The goal is to minimize the submission of redundant tickets, enabling data engineers to prioritize and concentrate on the most critical tasks at hand.

Data service status page for data users (Created by Author)
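
A status page can start out very simple. The sketch below appends incidents to a JSON file that a static page could render; the file path and record schema are assumptions for illustration.

```python
# Minimal sketch of a status-page backend: append incidents to a JSON
# file that a static status page renders. Path and schema are assumptions.
import json
from datetime import datetime, timezone
from pathlib import Path

STATUS_FILE = Path('/var/www/status/incidents.json')  # placeholder location


def announce_incident(dataset: str, summary: str, eta: str) -> None:
    incidents = json.loads(STATUS_FILE.read_text()) if STATUS_FILE.exists() else []
    incidents.append({
        'dataset': dataset,
        'summary': summary,           # what happened, in user-facing terms
        'expected_resolution': eta,   # set expectations to reduce tickets
        'reported_at': datetime.now(timezone.utc).isoformat(),
    })
    STATUS_FILE.write_text(json.dumps(incidents, indent=2))
```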

I have a dedicated article about this topic. You are welcome to read it to learn more.

Implementation in Mage using callback blocks

The following example uses Mage v0.8.100.

Next, let's see how to integrate the communication lifecycle into data pipelines using Mage. If you are not familiar with it yet, Mage is a modern alternative to Airflow, aiming to bring the best developer experience and engineering practices to data engineering teams. Mage addresses several pain points of Airflow, such as local testing and passing data between tasks. It also has a really intuitive UI that helps engineers build data pipelines in a matter of minutes.

Mage UI (Created by Author)

A pipeline in Mage is composed of several types of blocks: @data_loader, @transformer, @data_exporter, @sensor, @callback, etc. In this article, we will use the @callback block. It's a special one because it doesn't run as an individual step; it runs after its parent block. You create a callback block from the "Add-on" menu, and it looks like the screenshot below.

Callback block in Mage (Created by Author)
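
For orientation, here is roughly what a regular Mage block looks like, paraphrased from the scaffolding Mage generates (details may differ slightly between versions):

```python
# A typical Mage transformer block, paraphrased from Mage's scaffolding.
if 'transformer' not in globals():
    from mage_ai.data_preparation.decorators import transformer
if 'test' not in globals():
    from mage_ai.data_preparation.decorators import test


@transformer
def transform(data, *args, **kwargs):
    # `data` is the output of the parent block; the return value is
    # passed on to any child blocks.
    return data


@test
def test_output(output, *args) -> None:
    assert output is not None, 'The output is undefined'
```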

Each callback block has two functions: @callback('success') and @callback('failure'). In our case, @callback('failure') can be used to send communication messages.
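
As a rough sketch of such a failure callback (the Slack webhook URL is a placeholder, and the exact shape of parent_block_data may differ between Mage versions):

```python
# Sketch of a callback block that notifies the source team on failure.
# The webhook URL is a placeholder; inspect parent_block_data in your
# own setup, as its structure may vary by Mage version.
import requests

if 'callback' not in globals():
    from mage_ai.data_preparation.decorators import callback

SLACK_WEBHOOK_URL = 'https://hooks.slack.com/services/XXX/YYY/ZZZ'  # placeholder


@callback('failure')
def notify_source_team(parent_block_data, **kwargs):
    # parent_block_data carries metadata such as the pipeline and block
    # name, which we use to build a descriptive message body.
    message = f'Upstream block failed: {parent_block_data}. Please check the source data.'
    requests.post(SLACK_WEBHOOK_URL, json={'text': message}, timeout=10)
```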

parent_block_data includes metadata such as the pipeline and block name, which can be used to customize the message body. Additionally, Mage strives to optimize the developer experience to its fullest: the UI makes it easy to reuse the callback block in different blocks, and the selected blocks are shown at the bottom.

Callback block in Mage (Created by Author)

Correspondingly, the callback block will be shown at the bottom of each parent block as well. Each parent block can have one or more callback blocks.

Callback block in Mage (Created by Author)

After the parent block finishes, the @callback('success') function is triggered if the block ran successfully; otherwise, @callback('failure') is executed.

It's worth noting that, in v0.8.100, @callback('failure') is only invoked if the execution of the @data_loader, @transformer, or @data_exporter itself fails, excluding the @test section. If a @test fails, @callback('failure') won't be triggered. Please verify this behavior in the version you are currently using.
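
If your version behaves this way and you still want the failure callback to fire on bad data, one possible workaround (my own suggestion, not an official Mage recommendation) is to raise inside the block body instead of relying solely on @test:

```python
# Workaround sketch: raise inside the block body so the block itself
# fails and any attached @callback('failure') is triggered.
if 'transformer' not in globals():
    from mage_ai.data_preparation.decorators import transformer


@transformer
def validate_and_transform(data, *args, **kwargs):
    # Failing here (rather than in @test) marks the block run as failed,
    # which is what invokes the failure callback in this version.
    if data is None or len(data) == 0:
        raise ValueError('Source data is empty; the source team will be notified.')
    return data
```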

The @callback block simplifies the creation of a communication lifecycle within data pipelines and bridges the communication gap between the data team and external teams. When it comes to announcing major data changes or data issues, having a consistent and timely channel of communication is certainly advantageous. And when it comes to celebrating success, it's time to unleash your creativity and invite users to the celebration.
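
For example, a success callback could drop a cheerful note in the users' channel (the webhook URL is again a placeholder):

```python
# Playful sketch: use @callback('success') to celebrate a fresh data drop.
import requests

if 'callback' not in globals():
    from mage_ai.data_preparation.decorators import callback

SLACK_WEBHOOK_URL = 'https://hooks.slack.com/services/XXX/YYY/ZZZ'  # placeholder


@callback('success')
def celebrate(parent_block_data, **kwargs):
    requests.post(
        SLACK_WEBHOOK_URL,
        json={'text': ':tada: Fresh data has landed! Dashboards are up to date.'},
        timeout=10,
    )
```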

Conclusion

Ensuring consistent and proper communication between the data team and stakeholders is challenging as everyone has their own preferred communication style. The key takeaway from this article is that automating communication can significantly enhance the work efficiency of the data team.

To start, we can create templates for the various types of data-related communication, such as a table deprecation or a table schema change. Once the templates are in place, the next step is to explore ways to automate the process, ensuring that messages are delivered at the right moment with minimal manual intervention, which is what we've seen in this article.
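
As a concrete starting point, a template can be as simple as a parameterized string; the field names and wording below are illustrative.

```python
# Sketch of a reusable announcement template for a table deprecation.
from string import Template

DEPRECATION_TEMPLATE = Template(
    'Heads-up: table `$table` will be deprecated on $date.\n'
    'Please migrate your queries to `$replacement` before then.\n'
    'Questions? Reach out in $channel.'
)

print(DEPRECATION_TEMPLATE.substitute(
    table='analytics.orders_v1',
    date='2023-09-01',
    replacement='analytics.orders_v2',
    channel='#data-support',
))
```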

I hope this idea resonates with you and sparks excitement about implementing similar solutions within your data team. I'm curious to hear your thoughts on the matter: leave a comment below and let us know. Cheers!
