Git Integration
Introduction
The Git Integration feature allows you to connect Data Mesh Manager with your Git repositories to import and synchronize data contracts. This feature streamlines the management of data contracts by leveraging your existing Git workflows and provides a seamless way to keep your data contracts in sync between your code repositories and Data Mesh Manager.
Currently, Data Mesh Manager supports two Git Connection Types:
- GitHub
- GitLab
This guide covers how to configure Git Connections and how to import and manage data contracts from Git repositories.
Configuring Git Connection
Before importing data contracts from Git, you need to configure a Git Connection.
- Navigate to Organization Settings by clicking on the Profile icon in the top-right corner
- Select (Organization) Settings
- Go to the Git Connections section
Adding a New Git Connection
In the Git Connections section, click on Add new Git Connection and fill in the following details:
- Git Connection Type: Choose either GitHub or GitLab
- Git Connection Name: Enter a descriptive name that will help you identify this connection in import dialogs
- Authentication Token: Enter a Personal Access Token (PAT) for your chosen connection type. Enter the actual token value, not a placeholder or the token's name. The token must have the permissions needed to access the repositories you want to import data contracts from. To use the push functionality, the token needs write permission to the repository; otherwise, read access is enough.
Personal Access Token Links:
Here are links to the documentation for creating Personal Access Tokens for each Git Connection type:
- GitLab: Creating a Personal Access Token. The token needs to have the api scope for access; read_repository and write_repository are optional.
- GitHub: Creating a Fine-grained Personal Access Token. A fine-grained token is recommended. Select the repositories you want to access and ensure the token has read and (optional) write access to the Contents scope.
⚠️ Important Security Note: All repositories accessible with the provided token will be available to Data Mesh Manager users within your Organization. This includes repository names and YAML files contained within these repositories. Make sure to use tokens with appropriate access scopes.
After filling in the necessary information, click Save to store the configuration. Note that once saved, the token will no longer be visible in the interface for security reasons.
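Because every repository reachable with the token becomes visible to users in your Organization, it can help to check what a token exposes before saving it. The following is a minimal sketch, not part of Data Mesh Manager: it only builds a request against the public GitHub or GitLab REST API that lists repositories for the token's owner. The function name, the constants, and the use of the hosted api.github.com and gitlab.com endpoints are assumptions made for illustration; self-managed instances use a different base URL.

```python
from urllib.request import Request

# Assumed endpoints for the hosted services; self-managed instances differ.
GITHUB_REPOS_URL = "https://api.github.com/user/repos?per_page=100"
GITLAB_PROJECTS_URL = "https://gitlab.com/api/v4/projects?membership=true&per_page=100"


def build_repo_listing_request(connection_type: str, token: str) -> Request:
    """Build (but do not send) a request listing repositories for the token.

    Hypothetical helper for auditing a token; it is not a Data Mesh Manager API.
    """
    kind = connection_type.strip().lower()
    if kind == "github":
        # GitHub's REST API accepts the token as a Bearer credential.
        return Request(
            GITHUB_REPOS_URL,
            headers={
                "Authorization": f"Bearer {token}",
                "Accept": "application/vnd.github+json",
            },
        )
    if kind == "gitlab":
        # GitLab's REST API accepts the token in the PRIVATE-TOKEN header.
        return Request(GITLAB_PROJECTS_URL, headers={"PRIVATE-TOKEN": token})
    raise ValueError(f"Unsupported Git connection type: {connection_type!r}")
```

Passing the returned request to urllib.request.urlopen and decoding the JSON response should give an approximate picture of the repositories the token can reach. Results are paginated, so a large account needs more than one call, and the exact visibility rules are best confirmed in the provider's own documentation.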
Importing a Data Contract with Git
Once you've configured a Git Connection, you can import data contracts from your Git repositories.
- Navigate to the Data Contracts list by selecting Studio → Data Contracts
- Click on Add Data Contract and select Import from Git
In the import form, configure the following settings:
- Git Connection: Select a previously configured Git Connection from the dropdown
- Repository: Choose from the suggested repositories, or enter a known Git URL (must use the HTTPS scheme)
- Branch: Currently, only the main branch is supported and is preselected
- Path: Enter the path to the YAML file within the repository that you want to import
After configuring these settings, click Import Data Contract to begin the import process.
If there are any errors during import (such as file not found, incorrect credentials, etc.), these will be displayed above the form with guidance on how to resolve them.
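Since a manually entered repository URL must use the HTTPS scheme, a quick local check can catch SSH-style or plain HTTP URLs before they reach the import form. This is a small sketch under that single assumption; the function name and the example URLs are hypothetical, and Data Mesh Manager may apply additional validation of its own.

```python
from urllib.parse import urlparse


def is_https_git_url(url: str) -> bool:
    """Return True if the URL uses the HTTPS scheme and names a host."""
    parsed = urlparse(url.strip())
    return parsed.scheme == "https" and bool(parsed.netloc)


# Hypothetical repository URLs used only to illustrate the check.
candidates = [
    "https://github.com/example-org/data-contracts.git",
    "git@github.com:example-org/data-contracts.git",
    "http://gitlab.com/example-org/data-contracts.git",
]
accepted = [url for url in candidates if is_https_git_url(url)]
```

Only the first candidate passes: the SSH shorthand has no URL scheme at all, and the third uses plain HTTP.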
Managing Git-Connected Data Contracts
After successfully importing a data contract, you will be redirected to the newly created data contract page. For Git-connected contracts, the sidebar displays an additional card with information about the Git connection.
Synchronizing Changes
Data Mesh Manager automatically detects changes between your Git repository and the Data Mesh Manager platform. There are three possible synchronization scenarios:
- Git is Newer: The data contract in Git has newer changes than the version in Data Mesh Manager
  - The Git card will display a Pull button to update the Data Mesh Manager version
- Data Mesh Manager is Newer: The data contract in Data Mesh Manager has newer changes than the version in Git
  - The Git card will display a Push button to update the Git version
- Conflicting Changes: Both Git and Data Mesh Manager versions have different changes
  - Conflicts must be resolved in your Git repository first
  - After resolution, use the Pull option to synchronize with Data Mesh Manager
  - Note that pulling changes may override modifications made in the Data Mesh Manager UI
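The three scenarios above can be pictured as a three-way comparison against the last synchronized version. The sketch below is an illustration only: this guide does not describe how Data Mesh Manager actually detects changes, so the content-hash approach, the SyncState names, and the extra "in sync" case are all assumptions.

```python
import hashlib
from enum import Enum


class SyncState(Enum):
    IN_SYNC = "nothing to do"
    GIT_NEWER = "pull"
    DMM_NEWER = "push"
    CONFLICT = "resolve in Git, then pull"


def fingerprint(yaml_text: str) -> str:
    """Hash the contract text so versions can be compared cheaply."""
    return hashlib.sha256(yaml_text.encode("utf-8")).hexdigest()


def classify(base: str, git: str, dmm: str) -> SyncState:
    """Compare fingerprints of the last synced, Git, and Data Mesh Manager versions."""
    if git == dmm:
        return SyncState.IN_SYNC    # both sides hold the same content
    if dmm == base:
        return SyncState.GIT_NEWER  # only Git changed since the last sync
    if git == base:
        return SyncState.DMM_NEWER  # only Data Mesh Manager changed
    return SyncState.CONFLICT       # both changed, in different ways


# Hypothetical contract contents used only for illustration.
base = fingerprint("version: 1.0.0\n")
git_side = fingerprint("version: 1.1.0\n")
state = classify(base, git_side, base)  # Git changed, Data Mesh Manager did not
```

In this example, state is SyncState.GIT_NEWER, which corresponds to the Pull button on the Git card.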
By leveraging the Git integration, you can maintain a single source of truth for your data contracts while still benefiting from Data Mesh Manager's visualization and management capabilities.