CUSTOMER DATA INFRASTRUCTURE
Customer.io's identify and track calls look simple. But BigQuery has customer IDs that don't match Customer.io's id or email, nested STRUCT fields that need flattening before they can become attributes, and per-byte billing that punishes full-table scans. Meiro Pipes resolves the identity gap, transforms your warehouse data into Customer.io's schema in the transform layer (not in expensive SQL), and keeps profiles enriched in both directions — without the custom pipeline you'd otherwise have to build.
Free trial · No credit card · Live in minutes
Identity is the first structural problem. Customer.io identifies users by a customer id you define, with email as optional. BigQuery stores records keyed on Firebase installation IDs, internal user IDs, or Stripe customer IDs depending on the data source. When these don't map to Customer.io's customer id, identify calls create duplicates or miss the intended user — anonymous-to-identified lifecycle merges fail at whichever stage the identifier breaks.
BigQuery introduces two additional failure points. Its STRUCT and ARRAY types must be flattened before they can map to Customer.io's flat attribute model — and that flattening is a maintenance liability on every schema change. BigQuery also bills per byte scanned, so naive change-detection queries against large tables are a GCP cost problem as well as an engineering one. The identify versus track classification — which determines whether data becomes a persistent attribute or a behavioral trigger — still must be made explicitly regardless of the warehouse. B2B teams add a further layer: Customer.io Objects require a separate API endpoint, a different schema, and manual object-to-person relationship maintenance.
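The transform-layer flattening described above can be sketched in plain JavaScript. This is an illustrative sketch, not Pipes' actual API: the field names (`plan`, `features`) and the recursive helper are assumptions, and real transforms would also handle type coercion and schema drift.

```javascript
// Hypothetical sketch: flatten a nested BigQuery row into a flat
// attribute object outside the warehouse, so the SQL query itself
// never has to UNNEST (and never scans extra bytes to do so).
function flattenRow(row, prefix = '') {
  const flat = {};
  for (const [key, value] of Object.entries(row)) {
    const name = prefix ? `${prefix}_${key}` : key;
    if (Array.isArray(value)) {
      // Repeated (ARRAY) fields: serialize rather than explode into rows
      flat[name] = JSON.stringify(value);
    } else if (value !== null && typeof value === 'object') {
      // STRUCT fields: recurse with a prefixed key
      Object.assign(flat, flattenRow(value, name));
    } else {
      flat[name] = value;
    }
  }
  return flat;
}

// Example: a row with a nested `plan` STRUCT and a repeated `features` field
const flat = flattenRow({
  user_id: 'usr_8472',
  plan: { tier: 'enterprise', seats: 50 },
  features: ['sso', 'audit_log']
});
// flat.plan_tier === 'enterprise', flat.plan_seats === 50
```

Because the recursion happens after rows leave BigQuery, a schema change in the STRUCT surfaces as a new flattened key rather than a broken SQL unnest.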
Customer.io's warehouse export covers Redshift and BigQuery natively, but the reverse — BigQuery to Customer.io — still requires direct API integration. For teams that need both directions, the full loop is infrastructure work, not configuration.
Problem
BigQuery has Stripe IDs, internal user IDs, email addresses — keyed differently depending on the upstream system. Customer.io expects a customer id and optionally email. When these diverge, identify calls create duplicate profiles or miss the right user. Anonymous-to-known merges fail silently.
Meiro solves it
Pipes resolves identity across every identifier type — email, user_id, anonymous ID, Stripe customer ID, CRM contact ID — using deterministic matching. One unified profile, regardless of which identifier Customer.io sees at any given touchpoint.
Problem
BigQuery stores product event properties in nested STRUCTs and repeated ARRAYs. Customer.io requires flat attribute objects. Flattening in BigQuery SQL means unnesting at query time — more bytes scanned, higher costs on every sync run.
Meiro solves it
Pipes transform functions receive BigQuery rows and flatten nested fields in the JavaScript sandbox. Your BigQuery query stays simple: DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY), CAST for type coercion, backtick-quoted table names. The transform layer handles the rest.
Problem
Customer.io uses identify for persistent attributes and track for behavioral events. Getting this wrong affects segmentation, triggers, and pricing. BigQuery data arrives as rows in tables — the identify/track split is a modeling decision that has to be made explicitly.
Meiro solves it
Pipes lets you model your BigQuery data before it reaches Customer.io. Decide what becomes a persistent attribute (identify call) versus a behavioral event (track call) at the infrastructure layer — visible, version-controlled, and changeable without touching Customer.io.
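Making that modeling decision explicit might look like the following sketch. The rule here (rows with an event name become track calls, everything else becomes identify attributes) and the column names are illustrative assumptions, not Pipes' or Customer.io's required contract.

```javascript
// Illustrative sketch: the identify/track split as an explicit,
// reviewable rule rather than logic buried in an undocumented script.
// Column names (event_name, plan_tier, ...) are assumed for the example.
const ATTRIBUTE_COLUMNS = ['email', 'plan_tier', 'is_paid_customer'];

function classify(row) {
  if (row.event_name) {
    // Rows carrying an event name model behavioral occurrences: track calls
    return {
      type: 'track',
      userId: row.user_id,
      event: row.event_name,
      timestamp: row.occurred_at,
      properties: { feature_key: row.feature_key }
    };
  }
  // Everything else models persistent state: identify attributes
  const traits = {};
  for (const col of ATTRIBUTE_COLUMNS) {
    if (row[col] !== undefined) traits[col] = row[col];
  }
  return { type: 'identify', userId: row.user_id, traits };
}
```

Version-controlling a rule like this is what makes the classification changeable without touching Customer.io itself.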
Problem
Customer.io's Data Warehouse destination exports engagement data out, but there is no native source connector for BigQuery. Getting BigQuery data into Customer.io requires direct API calls — and the loop is incomplete without both directions.
Meiro solves it
Pipes handles both directions natively. Customer.io engagement events flow into BigQuery. BigQuery data enriches profiles. Enriched profiles push back to Customer.io via identify and track calls. One platform, bidirectional, no workarounds.
Problem
SaaS users move from anonymous visitor to trial to paid customer, accumulating different identifiers at each stage. BigQuery may carry all of them in separate tables. Reconciling the full identity graph and keeping Customer.io synchronized across the entire lifecycle requires infrastructure above any single API call.
Meiro solves it
Pipes builds a cross-system identity graph that spans anonymous IDs, trial user IDs, paid customer IDs, and email — and keeps Customer.io profiles unified as users transition through lifecycle stages. No duplicate profiles. No dropped attributes.
Customer.io engagement data — email opens, clicks, conversions, campaign events — flows into Pipes via webhook or export. Events land without replacing your existing Customer.io setup.
Events land in BigQuery automatically. Pipes connects directly — browse datasets, map columns, join with product usage data, billing records, or any warehouse source. BigQuery stays your source of truth.
Pipes stitches profiles across Customer.io customer ids, email addresses, BigQuery user_ids, anonymous IDs, and Stripe or CRM identifiers. Deterministic matching with configurable limits. Full lifecycle coverage from anonymous to paid.
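The deterministic core of that stitching can be sketched as follows. This is a simplified illustration under assumed field names: real resolution also handles transitive merges across existing profiles and enforces the configurable merge limits mentioned above.

```javascript
// Simplified sketch of deterministic identity stitching: records that
// share any identifier value (email, user_id, stripe_id, ...) merge into
// one profile. Illustrative only; production stitching also applies
// merge limits to prevent false merges.
function stitch(records, keys = ['email', 'user_id', 'stripe_id']) {
  const profiles = [];
  const index = new Map(); // "key:value" -> profile object

  for (const rec of records) {
    // Find an existing profile sharing any identifier with this record
    let profile = null;
    for (const k of keys) {
      if (rec[k] && index.has(`${k}:${rec[k]}`)) {
        profile = index.get(`${k}:${rec[k]}`);
        break;
      }
    }
    if (!profile) {
      profile = {};
      profiles.push(profile);
    }
    Object.assign(profile, rec);
    // Re-index every identifier the merged profile now carries
    for (const k of keys) {
      if (profile[k]) index.set(`${k}:${profile[k]}`, profile);
    }
  }
  return profiles;
}

// An app record, a billing record, and a Stripe record collapse to one profile
const unified = stitch([
  { user_id: 'usr_8472', email: '[email protected]' },
  { email: '[email protected]', stripe_id: 'cus_9X' },
  { stripe_id: 'cus_9X', plan: 'enterprise' }
]);
// unified.length === 1
```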
Enriched profiles push back to Customer.io via correctly structured identify calls and track events. Nested BigQuery fields flattened in the transform layer. Scheduled or real time. No custom API client. No batch job to maintain.
Your SaaS product tracks every product activation milestone in BigQuery — when a user completes onboarding steps, connects integrations, or invites teammates. Those events land in BigQuery event tables with nested property STRUCTs. You want Customer.io to trigger a specific onboarding sequence when a user completes each milestone.
The problem: milestone events are in BigQuery, not in Customer.io. The user who hit the milestone may be identified by an internal user_id that doesn't match the customer id Customer.io uses. The event properties are nested in a STRUCT field.
Without Meiro: You'd write a BigQuery job using CAST and DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) to fetch recent events, unnest STRUCT fields (scanning more bytes), resolve the Customer.io customer id, and call the track API for each event. You'd maintain that job across schema changes, handle retries, and debug silent failures when identifiers don't match.
With Meiro Pipes: Milestone events from BigQuery are modeled as Customer.io track calls. The Pipes transform flattens nested STRUCT fields in the JavaScript sandbox — no expensive unnesting in BigQuery SQL. Pipes resolves internal user_id to Customer.io customer id using the identity graph. Milestone events push to Customer.io automatically with the correct event name, timestamp, and properties. Your lifecycle team triggers onboarding branches from those events without waiting for engineering.
Time from product milestone to triggered onboarding email: minutes, not days.
Your BigQuery query
SELECT
user_id,
email,
event_name,
occurred_at,
CAST(plan_tier AS STRING) AS plan_tier,
feature_key,
is_paid_customer,
company_id
FROM `project.analytics.product_events`
WHERE occurred_at > DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
Pipes transform
// Pipes send function (Event Destination)
async function send(payload, headers) {
  return payload.events.map(row => ({
    type: 'track',
    userId: row.user_id,
    event: row.event_name,
    timestamp: row.occurred_at,
    properties: {
      plan_tier: row.plan_tier,
      feature_key: row.feature_key,
      is_paid_customer: row.is_paid_customer,
      company_id: row.company_id
    }
  }));
}
What Customer.io receives
{
  "type": "track",
  "userId": "usr_8472",
  "event": "integration_connected",
  "timestamp": "2026-03-15T09:24:00Z",
  "properties": {
    "plan_tier": "enterprise",
    "feature_key": "slack",
    "is_paid_customer": true,
    "company_id": "cmp_311"
  }
}
No custom API client code. No STRUCT unnesting in BigQuery SQL. Pipes handles nested field flattening in the transform layer, identity resolution, schema compliance, and delivery — and adapts when your BigQuery schema changes.
The standard stack
Meiro Pipes
A reverse ETL tool syncs rows. It doesn't resolve lifecycle identity, classify attributes versus events, or flatten nested BigQuery fields without blowing up your bytes-scanned bill. Meiro Pipes does all of that — and the pipeline that remains is one your team can actually understand.
You want to trigger Customer.io campaigns based on real product behavior — feature adoption, activation milestones, upgrade signals — data that your data team has in BigQuery but you can't access from Customer.io today.
You're tired of maintaining the BigQuery → Customer.io pipeline. The customer id resolution logic. The nested STRUCT unnesting that costs bytes. The identify/track classification code that lives in a script nobody documents.
Native connector. Sends identify calls (user attributes) and track calls (behavioral events) to Customer.io in the correct API format. Handles timestamp formatting, property serialization, and B2B Object API calls with relationship mapping.
Direct warehouse connection supporting backtick-quoted table references, DATE_SUB, TIMESTAMP_DIFF, and CAST. Browse datasets, map identifier columns to Meiro identity types. Model warehouse data as identify attributes, track events, or B2B Object records.
Deterministic stitching across Customer.io customer id, email, user_id, anonymous ID, Stripe ID, and CRM IDs. Full lifecycle coverage from anonymous visitor through paid customer. Configurable merge limits to prevent false merges.
Sandboxed JavaScript functions for schema translation. Flatten nested BigQuery STRUCT and ARRAY fields without expensive SQL unnesting. Classify data as identify or track calls. Map fields, coerce types, format timestamps. 47 allowlisted packages available.
Scheduled or real-time Live Profile Sync. Partition-aware change detection to minimize BigQuery bytes scanned. Push enriched profiles and events to Customer.io via identify and track calls. Full delivery history and retry logic.
Model BigQuery company and account records as Customer.io Objects. Pipes handles the Object API endpoint, schema differences, and person-to-object relationship maintenance — so B2B teams can sync account context alongside person records.
Identity is the first structural problem. Customer.io identifies users by a customer id you define, with email as an optional secondary identifier. BigQuery has customer records keyed on internal IDs, Stripe customer IDs, CRM contact IDs, or email depending on the data source. When these don't reconcile with Customer.io's customer id, identify calls create duplicate profiles or miss the intended user. No standard reverse ETL connector resolves this cross-system identity problem.
BigQuery's nested schema model adds a second structural layer. Product event data frequently arrives with nested STRUCT and ARRAY fields — properties that BigQuery stores efficiently but Customer.io cannot consume directly. Flattening those nested fields in BigQuery SQL means unnesting at query time, which increases bytes scanned and drives up per-query costs. The right approach is to flatten in the transform layer, keeping the warehouse query simple and cost-efficient.
The identify versus track decision is the third structural problem. Persistent user attributes belong in identify calls. Behavioral occurrences belong in track calls. Getting this classification wrong affects segmentation, trigger logic, and billing. BigQuery data arrives as rows in tables. The identify/track classification is a modeling decision that has to be made explicitly and maintained when the underlying data model changes.
The enrichment loop is the fourth gap. Customer.io's warehouse export can land engagement data in BigQuery, but the reverse — BigQuery to Customer.io — requires direct API integration that no native feature provides. One direction without the other leaves the loop incomplete.
Connect BigQuery and Customer.io through Meiro Pipes. Identity-resolved. Schema-aware. Bidirectional. Start free.