CUSTOMER DATA INFRASTRUCTURE
Spark SQL complex types don't map to Braze JSON. Delta Lake schema evolution breaks downstream tools between pipeline runs. Meiro Pipes translates Delta Lake schemas, resolves identity, and syncs enriched profiles in both directions — without Hightouch, Census, or a custom Spark job you'll be debugging at 2am.
Free trial · No credit card · Live in minutes
Identity is the first problem. Databricks stores records keyed on whatever upstream systems assigned — internal user IDs, Salesforce account IDs, emails. Braze expects an external_id. When these don't align, syncs silently drop records or create duplicate profiles. No standard Databricks connector reconciles cross-system identity.
Braze's data model adds two layers. Its event model is strict: every custom event requires a name, ISO 8601 timestamp, and a typed JSON properties object under 100 KB — one event per row, no reserved key names. CDI requires a PAYLOAD column with a handcrafted JSON string. That means writing change-detection logic against Delta Lake's change data feed, handling insert/update/delete cases separately, and rebuilding the payload every time a source schema changes. Delta Lake's schema evolution is useful for analytics; it doesn't help you maintain a Braze payload template.
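The constraints above can be pictured with a short sketch. This is our illustration, not Braze's SDK or the CDI spec: the field names (`external_id`, `name`, `time`, `properties`) follow Braze's documented /users/track event shape, and the size check mirrors the 100 KB limit described above. `buildEventPayload` is a hypothetical helper.

```javascript
// Illustrative only: build one Braze-style custom event row and the kind of
// handcrafted JSON string a CDI PAYLOAD column expects. The helper name and
// validation are ours; the field shape follows Braze's /users/track events.
function buildEventPayload(externalId, name, time, properties) {
  const event = { external_id: externalId, name, time, properties };
  const payload = JSON.stringify({ events: [event] }); // one event per row
  if (Buffer.byteLength(payload, "utf8") > 100 * 1024) {
    throw new Error(`event payload for "${name}" exceeds 100 KB`);
  }
  return payload;
}

const payload = buildEventPayload(
  "usr_8472",
  "subscription_renewed",
  new Date("2026-03-15").toISOString(), // ISO 8601 timestamp, as Braze requires
  { plan: "enterprise", seats: 40 }     // typed JSON properties object
);
```

Every field in that string has to be rebuilt by hand whenever the source schema changes — which is exactly the maintenance burden described above.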
Every attribute sync costs a Braze data point; events count against your contract. Teams overspend because attribute-versus-event tradeoffs happen in SQL rather than at the data model layer. Braze CDI is also one-directional — closing the enrichment loop from Braze behavioral data back through Databricks requires a separate reverse ETL vendor or additional custom plumbing.
Problem
Databricks ArrayType, StructType, and MapType columns are first-class in your Delta tables. Braze CDI can't handle them. Every complex Spark type has to be explicitly mapped to flat JSON before it can sync — and that mapping breaks every time a data scientist updates the feature table schema.
Meiro solves it
Pipes transform functions handle Spark complex type translation in JavaScript — unpack StructType fields, map ArrayType elements, flatten MapType entries into Braze-compatible attribute shapes. When the Delta table schema evolves, you update the transform once, not every downstream query.
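A minimal sketch of what such a transform does, assuming complex Spark types arrive as plain JSON once read from Delta (StructType as a nested object, ArrayType as an array, MapType as an object with arbitrary keys). `flattenRow` and its underscore naming scheme are illustrative, not the Pipes API:

```javascript
// Recursively flatten nested Delta row values into flat, Braze-compatible
// attribute names. Naming convention (underscore-joined paths) is our choice.
function flattenRow(value, prefix = "") {
  const out = {};
  if (Array.isArray(value)) {
    // ArrayType -> indexed keys: tags_0, tags_1, ...
    value.forEach((v, i) => Object.assign(out, flattenRow(v, `${prefix}_${i}`)));
  } else if (value !== null && typeof value === "object") {
    // StructType / MapType -> path-joined keys: scores_churn, ...
    for (const [k, v] of Object.entries(value)) {
      Object.assign(out, flattenRow(v, prefix ? `${prefix}_${k}` : k));
    }
  } else {
    out[prefix] = value; // scalar leaf
  }
  return out;
}

const flat = flattenRow({
  user_id: "usr_8472",
  scores: { churn: 0.82, upsell: 0.31 }, // StructType
  tags: ["enterprise", "priority"],      // ArrayType
});
// -> { user_id: "usr_8472", scores_churn: 0.82, scores_upsell: 0.31,
//      tags_0: "enterprise", tags_1: "priority" }
```

When the feature table gains a nested field, a recursive flattener like this picks it up automatically; an explicit mapping would have to be rewritten.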
Problem
Delta tables support schema evolution as a feature. For Braze CDI, it's a liability. A column added or renamed between pipeline runs silently breaks the CDI sync — change detection stops working, payloads stop matching, and the pipeline goes quiet without alerting anyone.
Meiro solves it
Pipes detects schema changes at the connector level and surfaces them before they cause silent failures. Your transforms are version-controlled and explicit about what they consume — schema drift in the Delta table triggers a review, not a midnight outage.
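The idea can be sketched in a few lines. This is our illustration of a drift check, not the Pipes connector API: compare the columns a version-controlled transform declares against the columns the Delta table currently exposes, and report the difference before syncing.

```javascript
// Illustrative drift check: expected columns come from the version-controlled
// transform; actual columns come from the live Delta table metadata.
function diffSchema(expected, actual) {
  const exp = new Set(expected);
  const act = new Set(actual);
  return {
    added: actual.filter(c => !exp.has(c)),     // new columns -> flag for review
    removed: expected.filter(c => !act.has(c)), // missing columns -> fail loudly
  };
}

const drift = diffSchema(
  ["user_id", "churn_score", "updated_at"],
  ["user_id", "churn_score", "account_tier", "updated_at"]
);
// drift.added -> ["account_tier"], drift.removed -> []
```

A non-empty result is the trigger for review; an empty one means the sync can proceed against a schema the transform actually understands.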
Problem
Databricks has internal user IDs, email addresses, Salesforce IDs from upstream CRM data. Braze has external_id. No standard CDI or pipeline tool reconciles them. Duplicate profiles, dropped records, broken segments.
Meiro solves it
Pipes resolves identity across every identifier type — email, user_id, device_id, phone, CRM ID — using deterministic matching with configurable merge limits. One unified profile, regardless of which system the identifier came from.
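To make the mechanism concrete, here is a toy deterministic stitcher — our simplified illustration, not Meiro's engine. Records sharing any identifier value merge into one profile, up to a configurable identifier cap that guards against runaway merges:

```javascript
// Toy deterministic identity stitching: any shared "type:value" identifier
// links records into the same profile. maxIdentifiers caps profile growth.
function stitch(records, maxIdentifiers = 10) {
  const profiles = [];
  const index = new Map(); // "type:value" -> profile
  for (const rec of records) {
    const keys = Object.entries(rec).map(([t, v]) => `${t}:${v}`);
    let profile = keys.map(k => index.get(k)).find(Boolean);
    if (!profile) {
      profile = { ids: new Set() };
      profiles.push(profile);
    }
    for (const k of keys) {
      if (profile.ids.size >= maxIdentifiers && !profile.ids.has(k)) continue;
      profile.ids.add(k);
      index.set(k, profile);
    }
  }
  return profiles;
}

const merged = stitch([
  { user_id: "u1", email: "a@x.com" },           // Databricks record
  { email: "a@x.com", external_id: "usr_8472" }, // Braze record
]);
// -> one profile carrying all three identifiers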
Problem
Databricks Unity Catalog requires precise permissions at metastore, catalog, schema, and table level for every integration. Granting CDI or reverse ETL access to the right Delta tables means navigating Unity Catalog's full permission hierarchy — and repeating that work for every new dataset or destination.
Meiro solves it
Pipes maintains one managed connection to Databricks with scoped Unity Catalog permissions. Add datasets, adjust access, rotate credentials — all in one place. No per-sync permission configuration scattered across CDI, Hightouch, and custom pipelines.
Problem
Braze CDI pulls Delta table data in. It doesn't push Braze behavioral events back to Databricks for Spark ML / MLflow model retraining, and it can't close the loop — Braze events → Delta table → MLflow model → scored profiles → Braze — without a separate reverse ETL tool.
Meiro solves it
Pipes collects from both directions. Braze behavioral events flow into Databricks. MLflow model outputs enrich profiles. Enriched profiles flow back to Braze via scheduled or real-time sync. One platform, bidirectional, identity-resolved.
Braze engagement data — opens, clicks, conversions, custom events — flows into Pipes via Currents or webhook. Events land without replacing your Braze SDK.
Events land in Databricks Delta tables automatically. Pipes connects directly — browse Unity Catalog, map columns, join with Spark ML feature tables or any Delta source. Databricks stays your source of truth.
Pipes stitches profiles across Braze external_ids, Databricks user_ids, CRM emails, device IDs — any identifier. Deterministic matching with configurable limits. No duplicate profiles. No dropped records.
Enriched profiles push back to Braze in the exact schema Braze expects — Spark complex types translated, Delta schema evolution handled, attributes as JSON payloads, events properly formatted. Scheduled or real-time. No Hightouch. No Census.
Your data science team builds a churn propensity model in Databricks using Spark ML and MLflow. It combines product usage data (from Braze events landed in Delta tables) with commercial data — contract value, support ticket volume, NPS scores — stored as feature tables in Unity Catalog.
The MLflow model writes predictions back to a Delta table: a churn_risk_score for every customer, alongside Spark StructType metadata from the prediction run.
Without Meiro: Getting that score back into Braze means writing a Databricks job that flattens the StructType prediction metadata, formats the score as a JSON payload in Braze CDI's exact shape, navigates Unity Catalog permissions to give CDI access, sets up the sync, and then rebuilds everything when the MLflow model output schema changes between experiment runs. Or paying Hightouch $10K+/yr to handle it.
With Meiro Pipes: The churn_risk_score is modeled as an attribute in Meiro. The transform function handles the StructType metadata, extracts the score, and maps it to Braze attribute names. Pipes resolves identity between the Databricks user_id and the Braze external_id. The enriched profile — including the score — pushes to Braze as a custom attribute in the correct format. Your lifecycle team builds a Canvas that triggers a retention campaign for anyone with churn_risk > 0.7. No StructType flattening. No CDI payload debugging. No Unity Catalog permission archaeology.
Time from MLflow model output to live Braze campaign: hours, not sprints.
Your Databricks Delta table
SELECT
  user_id,
  email,
  churn_score::DOUBLE,
  last_purchase_date,
  account_tier,
  updated_at
FROM analytics.customer_scores
WHERE updated_at > DATEADD(DAY, -1, CURRENT_DATE())
Pipes transform
// Pipes send function (Event Destination)
async function send(payload, headers) {
  return payload.events.map(row => ({
    external_id: row.user_id,
    attributes: {
      churn_risk_score: row.churn_score,
      account_tier: row.account_tier,
      last_purchase_date: new Date(row.last_purchase_date).toISOString()
    }
  }));
}
What Braze receives
{
  "external_id": "usr_8472",
  "attributes": {
    "churn_risk_score": 0.82,
    "account_tier": "enterprise",
    "last_purchase_date": "2026-03-15T00:00:00.000Z"
  }
}
No manual StructType flattening. No `PAYLOAD` column construction. No Unity Catalog permission debugging. Pipes handles Delta Lake schema translation, Spark type mapping, and delivery — and surfaces schema drift before it causes silent failures.
The standard stack
Meiro Pipes
Braze CDI is a data pipe. Hightouch is a sync tool. Neither handles Spark type mapping, Delta schema evolution, or identity resolution. Meiro Pipes does all three — and the pipeline that remains is one you can actually maintain without a Databricks specialist on call.
You want to build a Braze Canvas that targets high-value customers at risk of churning — using Spark ML model outputs and feature table data from Databricks you can't currently access.
You're tired of maintaining the Databricks → Braze pipeline. The StructType flattening SQL. The Unity Catalog permissions archaeology. The CDI config that silently breaks when a data scientist adds a new column to the MLflow output table.
Native connector. Pushes attributes, events, and purchases to Braze in the exact /users/track API format. Handles JSON serialization, ISO 8601 date formatting, and property type validation.
Direct Delta Lake connection with Unity Catalog support. Browse catalogs, schemas, and Delta tables including complex Spark types. Map identifier columns to Meiro identity types. Handles Spark SQL type coercion and schema drift detection between pipeline runs.
Deterministic stitching across email, external_id, user_id, device_id, phone — any identifier. Configurable maxIdentifiers and priority to prevent false merges. Cross-system, not per-tool.
Sandboxed JavaScript functions for schema translation. Handle Databricks StructType, ArrayType, and MapType columns. Flatten Delta table complex types into Braze-compatible payloads. No raw Spark SQL. 47 allowlisted packages available.
Scheduled or real-time Live Profile Sync. Push enriched profiles and segments to Braze or any destination. On-demand exports for backfills. Full delivery history and retry.
Model data before it reaches Braze. Decide at the infrastructure layer what becomes an attribute (costs data points), event (costs events), or event property (free). Stop overspending on Braze's pricing model.
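A back-of-envelope version of that decision, following the cost framing above (attribute syncs consume data points, events count against the events allotment, event properties ride along free). The function and numbers are illustrative; check your own Braze contract:

```javascript
// Illustrative cost model: project monthly consumption for each field
// depending on whether it is synced as an attribute, event, or event property.
function estimateMonthlyCost(fields, profiles) {
  let dataPoints = 0;
  let events = 0;
  for (const f of fields) {
    if (f.as === "attribute") dataPoints += profiles * f.changesPerMonth;
    else if (f.as === "event") events += profiles * f.changesPerMonth;
    // "property" -> attached to an existing event, no separate charge
  }
  return { dataPoints, events };
}

const cost = estimateMonthlyCost(
  [
    { name: "churn_risk_score", as: "attribute", changesPerMonth: 4 },
    { name: "purchase", as: "event", changesPerMonth: 2 },
    { name: "cart_value", as: "property", changesPerMonth: 2 },
  ],
  100_000 // profiles
);
// cost.dataPoints -> 400000, cost.events -> 200000
```

Demoting a frequently-changing field from attribute to event property is the kind of modeling decision that is cheap here and expensive once it is buried in sync SQL.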
Spark's complex types are the first wall. ArrayType, StructType, and MapType are natural in Delta tables built by data science teams. MLflow model output tables frequently include StructType prediction metadata. Feature tables built for Spark ML use nested types throughout. Braze CDI cannot ingest any of this directly. Every complex column requires explicit type mapping before it reaches Braze — and that mapping becomes a maintenance liability the moment any upstream schema changes, which in Databricks happens constantly.
Delta Lake's schema evolution is a feature that becomes a liability at the integration boundary. Delta supports adding columns, changing types, and renaming fields between pipeline runs — that's the design. For Braze CDI, a schema change between runs silently breaks the sync. Payloads stop matching expected fields. Change detection queries return unexpected results. The pipeline goes quiet and nobody notices until a campaign stops updating and someone asks why.
Unity Catalog adds permission complexity at every integration boundary. Granting any external tool access to Delta tables requires navigating the full Unity Catalog hierarchy — metastore, catalog, schema, table — with appropriate grants at each level. As teams add datasets and destinations, this permission overhead compounds.
Identity remains the foundational problem. Databricks stores records with whatever identifiers data engineering assigned — internal user IDs, email addresses, Salesforce account IDs from upstream CRM data. Braze identifies users by external_id. The gap between these systems is where records get dropped or duplicated.
Connect Databricks and Braze through Meiro Pipes. Delta Lake schema-aware. Spark types translated. Identity-resolved. Bidirectional. Start free.