CUSTOMER DATA INFRASTRUCTURE
Iterable expects userId or email, catalog events, and `dataFields` in a specific shape. Databricks has Delta Lake tables with evolving schemas, Spark ML model outputs with ArrayType and StructType fields, and Unity Catalog permission boundaries that add friction at every integration point. Meiro Pipes resolves the identity gap, translates your Delta Lake schema into Iterable's API format, and keeps ML-enriched profiles flowing to Iterable — without a custom pipeline that breaks every time a data scientist adds a column.
Free trial · No credit card · Live in minutes
Identity is the first obstacle. Iterable uses email as the canonical identifier in most deployment configurations. Databricks stores records by internal user ID, Salesforce account ID, or other upstream-assigned identifiers depending on the data source. When these don't resolve to an Iterable email, syncs silently fail, create duplicate profiles, or associate data with the wrong user. No standard connector resolves this at the Databricks layer.
Iterable has two distinct data models. User profiles update via a flat dataFields dictionary. Events use the Track API: eventName, createdAt, userId or email, and a typed dataFields object. Delta Lake schema evolution can shift column types between runs — which causes Iterable to silently reject events with inconsistent property types. Catalog event types like order.purchased have strict schemas used for revenue attribution; wrong shape means the record is ignored for those purposes.
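For illustration, the two payload shapes side by side. Values here are hypothetical; the endpoints are Iterable's users/update and events/track APIs:

POST /api/users/update
{
  "email": "jane@example.com",
  "dataFields": { "account_tier": "enterprise", "churn_risk_score": 0.82 }
}

POST /api/events/track
{
  "email": "jane@example.com",
  "eventName": "feature_used",
  "createdAt": 1742000000,
  "dataFields": { "feature": "dashboards" }
}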
List management is the final obstacle. Iterable is list-first: syncing audiences means computing membership deltas and calling subscribe/unsubscribe APIs separately from profile updates. Getting Iterable behavioral data back into Databricks requires an S3 export pipeline, not a native integration. The complete collect-enrich-activate loop needs multiple tools.
Problem
Data scientists add columns between notebook runs. Delta Lake handles it — your downstream sync doesn't. A churn score field gets renamed, a StructType prediction output gets split, and the Iterable sync that was working last week now sends wrong data or fails silently.
Meiro solves it
Pipes is schema-aware at the transform layer, not at the connector level. When Delta Lake schemas evolve, you update the transform function — not a brittle column mapping. Version-controlled transforms mean schema changes are auditable and deliberate.
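A minimal sketch of what that looks like in practice, using the send-function shape from the walkthrough below (field names illustrative): the transform maps the fields it forwards explicitly, so a column a data scientist adds next week simply doesn't flow until you choose to map it.

// Illustrative Pipes transform: explicit mapping, tolerant of added columns.
// New Delta table columns are ignored until deliberately mapped here,
// and the change ships as a reviewed, version-controlled edit to this function.
async function send(payload, headers) {
  return payload.events.map(row => ({
    email: row.email,
    dataFields: {
      churn_risk_score: Number(row.churn_risk_score), // coerce in case the column type shifted
      account_tier: String(row.account_tier)
    }
  }));
}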
Problem
Spark ML model outputs contain ArrayType, StructType, and MapType fields that have no direct JSON equivalent. Iterable's API requires flat JSON with typed values. Converting Spark ML output types into Iterable-compatible payloads requires transformation logic that lives outside the notebook.
Meiro solves it
Pipes transform functions handle Spark type conversion in the JavaScript sandbox. ArrayType fields become flat arrays. StructType metadata gets traversed and mapped to Iterable dataFields. MapType categorical encodings get resolved. The transform layer bridges the type gap.
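A sketch of that conversion, assuming a hypothetical model-output row where top_features arrives as an array, prediction as a nested struct, and encodings as a map (all field names illustrative):

// Hypothetical Spark ML output row: ArrayType, StructType, and MapType
// fields arrive as JSON arrays and objects and must be flattened for Iterable.
async function send(payload, headers) {
  return payload.events.map(row => ({
    email: row.email,
    dataFields: {
      // ArrayType -> flat array of strings
      top_features: (row.top_features || []).map(String),
      // StructType -> traverse and pull out the scalar fields Iterable needs
      churn_risk_score: row.prediction ? Number(row.prediction.probability) : null,
      model_version: row.prediction ? String(row.prediction.model_version) : null,
      // MapType -> resolve one categorical encoding to a flat value
      plan_segment: row.encodings ? row.encodings["plan_segment"] : null
    }
  }));
}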
Problem
Databricks model training uses internal customer_id or model training IDs. Iterable uses email as the primary identifier in most configurations. When churn scores land in Databricks keyed on customer_id, getting them to the right Iterable profile requires cross-system identity resolution.
Meiro solves it
Pipes resolves identity across email, customer_id, user_id, and any other identifier — using deterministic matching with configurable merge limits. ML model scores reach the correct Iterable profile regardless of which key the model training pipeline used.
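Conceptually, deterministic matching is a precedence walk over known identifiers with a merge cap. This simplified sketch is not Pipes internals; identityGraph.lookup, the priority order, and the limit are all illustrative:

// Simplified illustration of deterministic matching -- not Pipes internals.
// Walk identifiers in priority order until one resolves to a known profile.
const MATCH_PRIORITY = ["email", "user_id", "customer_id"];
const MAX_IDENTIFIERS = 10; // configurable merge limit per stitched profile

function resolveProfile(row, identityGraph) {
  for (const key of MATCH_PRIORITY) {
    const value = row[key];
    if (!value) continue;
    const profile = identityGraph.lookup(key, value); // null if never seen
    if (profile && profile.identifiers.length < MAX_IDENTIFIERS) {
      return profile; // deterministic: first match in priority order wins
    }
  }
  return null; // no match -> create a new profile rather than guess
}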
Problem
Databricks Unity Catalog adds a permission boundary at every integration point. New service principals need to be provisioned. Table grants need to be configured. Each new sync job or connector requires another round of access management before data flows.
Meiro solves it
Pipes uses a single, auditable service principal connection to Databricks. One permission grant, one place to manage access. Unity Catalog ACLs are respected — Pipes only sees what you grant it. No proliferation of service accounts across sync jobs.
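The grant itself is ordinary Unity Catalog SQL. A sketch, with `pipes-sp` standing in for your service principal (in practice you grant to the principal's application ID) and the catalog, schema, and table names matching the example below:

-- One-time, auditable grant to a single Pipes service principal.
-- `pipes-sp` is an illustrative name; grant to your principal's application ID.
GRANT USE CATALOG ON CATALOG catalog TO `pipes-sp`;
GRANT USE SCHEMA ON SCHEMA catalog.analytics TO `pipes-sp`;
GRANT SELECT ON TABLE catalog.analytics.user_churn_scores TO `pipes-sp`;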
Problem
Your data science team builds churn propensity or feature adoption models in Databricks. Outputs land in Delta tables. Getting those scores into Iterable to trigger lifecycle sequences requires a pipeline that doesn't exist out of the box — and breaks when the model output schema changes.
Meiro solves it
Pipes connects directly to the Delta table where Spark ML outputs land. Model scores become Iterable dataFields on user profiles. Qualifying users are subscribed to the correct Iterable list. The lifecycle sequence fires automatically — and the transform adapts to model output schema changes.
Iterable engagement data — email opens, clicks, conversions, custom events — flows into Pipes via webhook or export. Events land without replacing your existing Iterable setup.
Events land in Databricks Delta tables automatically. Pipes connects directly — browse Unity Catalog schemas, map columns, join with Spark ML model outputs or feature tables. Databricks remains your source of truth for ML-enriched user intelligence.
Pipes stitches profiles across Iterable userIds, email addresses, Databricks customer_ids, and model training identifiers. Deterministic matching with configurable limits. No duplicate profiles. No dropped records.
Enriched profiles push back to Iterable with correctly formatted dataFields, properly shaped catalog events, and list membership changes. Spark ML type conversions handled in the transform layer. Scheduled or real time. No custom ETL.
Your data science team builds a churn propensity model using Spark ML in Databricks. The model runs weekly and outputs a Delta table with user_id, churn_risk_score (DoubleType), account_tier (StringType), and a feature importance StructType. Users with a churn_risk_score above 0.7 should trigger a targeted retention sequence in Iterable.
The problem: the Delta table output schema changes between model iterations — data scientists add feature columns. Iterable identifies users by email, not user_id. The feature importance StructType needs to be simplified before it can become a dataFields value.
Without Meiro: You'd write a Databricks notebook or job that queries the Delta table using Spark SQL (with ::DOUBLE casts and DATEADD(DAY, -1, CURRENT_DATE()) for change detection), resolves email from user_id via a join, converts StructType fields manually, calls Iterable's user update API in batches, and subscribes qualifying users to the retention list. Every model iteration that changes the output schema breaks the job.
With Meiro Pipes: The Delta table is connected directly via Unity Catalog. A Spark SQL query with DATEADD(DAY, -1, CURRENT_DATE()) fetches recent model outputs efficiently. The Pipes transform handles StructType traversal and type coercion in the JavaScript sandbox. Pipes resolves user_id to Iterable email using the identity graph. Enriched profiles — including churn_risk_score and account_tier — push to Iterable as dataFields. Qualifying users are subscribed to the retention list automatically. When the model output schema evolves, you update the transform function — not the pipeline infrastructure.
Time from Spark ML model output to live Iterable retention campaign: hours, not sprints.
Your Databricks Delta table
SELECT
user_id,
email,
churn_risk_score::DOUBLE AS churn_risk_score,
account_tier,
last_active_date,
updated_at
FROM catalog.analytics.user_churn_scores
WHERE updated_at > DATEADD(DAY, -1, CURRENT_DATE())

Pipes transform
// Pipes send function (Event Destination)
async function send(payload, headers) {
  // Each row returned by the Delta table query becomes one Iterable user update
  return payload.events.map(row => ({
    email: row.email,        // Iterable's primary identifier
    userId: row.user_id,     // kept as a secondary key
    dataFields: {
      churn_risk_score: row.churn_risk_score,
      account_tier: row.account_tier,
      // Iterable expects ISO 8601 strings for date fields
      last_active_date: new Date(row.last_active_date)
        .toISOString()
    }
  }));
}

What Iterable receives
{
"email": "[email protected]",
"userId": "usr_8472",
"dataFields": {
"churn_risk_score": 0.82,
"account_tier": "enterprise",
"last_active_date": "2026-03-15T00:00:00.000Z"
}
}

No raw API construction. Spark ML type conversion handled in the transform layer, not in Databricks notebooks. Pipes handles identity resolution, schema compliance, and delivery — and adapts when your Delta table schema evolves.
The standard stack vs. Meiro Pipes
A reverse ETL tool syncs rows. It doesn't handle Delta Lake schema evolution gracefully, convert Spark ML output types, or resolve identity across Databricks and Iterable. Meiro Pipes does all of that.
You want to build Iterable campaigns that trigger based on Spark ML churn scores and feature adoption signals — data your data science team produces in Databricks but that never makes it to Iterable today.
You're tired of maintaining the Databricks → Iterable pipeline. The `user_id`-to-email resolution. The Spark ML type conversion code. The sync job that breaks silently every time a data scientist adds a column to the model output Delta table.
Native connector. Pushes user profile updates, custom events, and catalog events (order.purchased, cart.abandon, etc.) to Iterable in the correct API format. Handles dataFields serialization, ISO 8601 date formatting, and list subscribe/unsubscribe calls.
Direct connection via Unity Catalog. Supports Spark SQL syntax including ::DOUBLE casts, DATEADD(DAY, -1, CURRENT_DATE()), and Delta table references. Browse catalogs, schemas, and tables. Model warehouse data as profile attributes, events, or audience definitions.
Deterministic stitching across email, userId, customer_id, phone, and model training identifiers. Configurable maxIdentifiers and merge priority. Resolves the Databricks customer_id → Iterable email gap automatically — even as model training pipelines change.
Sandboxed JavaScript functions for schema translation. Convert Spark ML output types — ArrayType, StructType, MapType — to Iterable-compatible flat JSON. Map fields, coerce types, construct dataFields dictionaries. Adapts to Delta Lake schema evolution without pipeline rewrites. 47 allowlisted packages available.
Scheduled or real-time Live Profile Sync. Push ML-enriched profiles, events, and list membership changes to Iterable. Delta table watermark-based change detection. Full delivery history and retry logic.
Model Databricks-derived ML audiences as Iterable list memberships. Pipes computes membership deltas between runs and issues the correct subscribe/unsubscribe API calls. No manual delta logic required.
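The delta logic itself is simple set arithmetic; a minimal sketch of what Pipes computes on your behalf each run (function name and inputs illustrative):

// Illustrative delta computation between the last synced membership
// snapshot and the current audience query result.
function membershipDelta(previous, current) {
  const prev = new Set(previous);  // emails in the list after the last run
  const curr = new Set(current);   // emails qualifying this run
  return {
    subscribe: [...curr].filter(e => !prev.has(e)),   // newly qualifying
    unsubscribe: [...prev].filter(e => !curr.has(e))  // no longer qualifying
  };
}
// Pipes then issues the matching Iterable list subscribe/unsubscribe calls.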
Delta Lake schema evolution is the first obstacle. Data scientists add columns, rename fields, and change model output schemas between notebook runs. Delta Lake handles this gracefully. Downstream sync jobs don't. Every schema change silently breaks the pipeline — either sending wrong values to Iterable dataFields or failing on type mismatches. A durable integration needs to be schema-aware at the transform layer, not brittle at the column mapping level.
Spark ML type mapping is the second obstacle. Databricks MLflow and Spark ML model outputs carry Spark-native types: ArrayType for lists of model features, StructType for nested prediction metadata, MapType for categorical encodings. Iterable's API requires flat JSON with typed values. Converting these types requires explicit transformation logic that lives outside the Databricks notebook and outside the warehouse.
Identity is the third obstacle. Databricks stores customer records keyed on internal IDs or model training identifiers. Iterable's identity model is built around email (or userId as a secondary key). Resolving the gap between a Databricks customer_id and an Iterable email requires cross-system identity resolution that no standard connector provides.
Unity Catalog permissions add a fourth layer. Every new integration point requires provisioning a service principal and configuring table grants. Fine-grained access control is a feature — but it creates operational overhead that multiplies when sync jobs proliferate.
List management compounds all of this. Iterable is a list-first platform. Getting Databricks-derived ML audiences into Iterable as list memberships means computing current state, calculating deltas, and issuing subscribe and unsubscribe API calls separately from the profile update API. This is not a feature most reverse ETL tools provide.
Connect Databricks and Iterable through Meiro Pipes. Identity-resolved. Schema-aware. Bidirectional. Start free.