Airflow | Prakash

This section provides guides and references for using the Airflow connector.

Step 1 –: Create New Service

To create a new Service, click + ADD.
The first step is to ingest the metadata from your sources. To do that, you need to create a Service connection first.
This Service will be the bridge between Prakash and your source system.

The Add New service form should look something like this.

Step 2 –: Select Airflow Pipeline Service Type

Step 3 –: Name and Describe Your Service

Provide a name and description for your Service.

Service Name:

Prakash uniquely identifies Services by their Service Name. Provide a name that distinguishes your deployment from other Services, including the other Airflow Services that you might be ingesting metadata from.

Note that once the name is set, it cannot be changed.

Step 4 –: Configure The Service Connection

In this step, we will configure the connection settings required for Airflow.
Please follow the instructions below to properly configure the Service to read from your sources. You will also find helper documentation on the right-hand side panel in the UI.

Connection Details:

Host and Port: Host and port of the Airflow service. This should be specified as a string in the format hostname:port. E.g., localhost:8080
Number Of Status: Number of past task statuses to read every time the ingestion runs. By default, the last 10 runs are picked up and updated.
Metadata Database Connection: Select your underlying database connection.

Note that the Backend Connection is only used to extract metadata from a DAG running directly in your instance.
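
As an illustration of the hostname:port format described above, the following minimal Python sketch validates such a string before it is entered in the form. The helper parse_host_port, the dictionary keys, and the example value localhost:8080 are assumptions made for this example; they are not part of the Prakash product.

```python
from typing import NamedTuple


class HostPort(NamedTuple):
    host: str
    port: int


def parse_host_port(value: str) -> HostPort:
    """Split a 'hostname:port' string and check that the port is valid."""
    # rpartition splits on the last colon, so the port is always the tail.
    host, sep, port_text = value.strip().rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected 'hostname:port', got {value!r}")
    if not port_text.isdigit():
        raise ValueError(f"port must be a number, got {port_text!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return HostPort(host, port)


# Hypothetical values mirroring the form fields; the key names are
# illustrative only.
connection = {
    "hostPort": "localhost:8080",
    "numberOfStatus": 10,  # default number of past runs to read
}

parsed = parse_host_port(connection["hostPort"])
print(parsed.host, parsed.port)
```

A check like this catches a missing port or a stray scheme-less value early, before TEST CONNECTION is run.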

Step 5 –: Check Test Connection

Once the credentials have been added, click TEST CONNECTION to check whether they are valid.

If the connection test succeeds, click SAVE and then configure Metadata Ingestion.

Step 6 –: Configure Metadata Ingestion

In this step, we will configure the metadata ingestion pipeline. Please follow the instructions below.

Pipeline Filter Pattern: Controls which pipelines are ingested. Both the include and the exclude patterns support regex.
Database Service Name: Enter the list of Database Services that host the inlet and outlet tables.
Include Tags: Set the ‘Include Tags’ toggle to control whether to include tags in metadata ingestion.
Mark Deleted Pipelines: Set the ‘Mark Deleted Pipelines’ toggle to flag pipelines as soft-deleted if they are no longer present in the source system.
Include Lineage: Set the ‘Include Lineage’ toggle to control whether to include lineage as part of metadata ingestion.
Enable Debug Log: Set the ‘Enable Debug Log’ toggle to set the default log level to debug.
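
To make the include and exclude behaviour of the Pipeline Filter Pattern more concrete, here is a minimal Python sketch of how such regex filters could be applied to a list of pipeline names. The function filter_pipelines, the sample names, and the choice of re.search semantics are assumptions made for this example; the exact matching rules used by Prakash may differ.

```python
import re
from typing import Iterable, List, Sequence


def filter_pipelines(
    names: Iterable[str],
    includes: Sequence[str] = (),
    excludes: Sequence[str] = (),
) -> List[str]:
    """Keep names matching any include pattern and no exclude pattern.

    An empty include list means "include everything".
    """
    kept = []
    for name in names:
        if includes and not any(re.search(p, name) for p in includes):
            continue  # no include pattern matched
        if any(re.search(p, name) for p in excludes):
            continue  # an exclude pattern matched
        kept.append(name)
    return kept


# Hypothetical pipeline (DAG) names, for illustration only.
pipelines = ["etl_sales_daily", "etl_sales_test", "ml_training"]

selected = filter_pipelines(
    pipelines,
    includes=[r"^etl_"],    # only pipelines whose name starts with "etl_"
    excludes=[r"_test$"],   # drop pipelines whose name ends with "_test"
)
print(selected)
```

In this sketch, exclude patterns take precedence over include patterns, which is a common convention but is stated here as an assumption.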

Step 7 –: Schedule the Ingestion and Deploy

Scheduling can be set up at an hourly, daily, weekly, or manual cadence. The timezone is UTC. Select a Start Date for the ingestion schedule. Adding an End Date is optional.
Review your configuration settings. If they match what you intended, click DEPLOY to create the service, create the ingestion pipeline, and schedule metadata ingestion.
If something doesn’t look right, click the BACK button to return to the appropriate step and change the settings as needed.
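
The scheduling options above can be sketched in Python as follows. This is an illustration only: the mapping of cadences to standard cron expressions, the function build_schedule, and the returned dictionary layout are assumptions for this example and do not describe how Prakash stores schedules internally.

```python
from datetime import datetime, timezone
from typing import Optional

# Standard five-field cron expressions for each cadence (assumed mapping).
CRON_BY_CADENCE = {
    "hourly": "0 * * * *",   # minute 0 of every hour
    "daily": "0 0 * * *",    # midnight every day
    "weekly": "0 0 * * 0",   # midnight every Sunday
    "manual": None,          # no automatic schedule
}


def build_schedule(
    cadence: str, start: datetime, end: Optional[datetime] = None
) -> dict:
    """Return a schedule description with all timestamps in UTC."""
    if cadence not in CRON_BY_CADENCE:
        raise ValueError(f"unknown cadence: {cadence!r}")
    if start.tzinfo is None or (end is not None and end.tzinfo is None):
        raise ValueError("start and end must be timezone-aware")
    start_utc = start.astimezone(timezone.utc)
    end_utc = end.astimezone(timezone.utc) if end is not None else None
    if end_utc is not None and end_utc <= start_utc:
        raise ValueError("end date must be after start date")
    return {
        "cron": CRON_BY_CADENCE[cadence],
        "start": start_utc.isoformat(),
        "end": end_utc.isoformat() if end_utc is not None else None,
    }


# The End Date is optional, so it is omitted here.
schedule = build_schedule("daily", datetime(2024, 1, 1, tzinfo=timezone.utc))
print(schedule)
```

Normalizing both dates to UTC before comparing them avoids surprises when the person configuring the schedule works in a different local timezone.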

Step 8 –: View the Ingestion Pipeline

Once the workflow has been successfully deployed, you can view the running Ingestion Pipeline from the Service page.