Introducing: Federated event data governance for multi-subsidiary enterprises
Sölvi Logason, CTO & Co-Founder

October 3, 2024

Finally, a way to manage schemas for organizations whose independent subsidiaries have run their own analytics for years.

Central data teams in multi-subsidiary enterprises face the seemingly impossible challenge of upholding data quality across products that have little to no overlap in event schemas. 

It’s a nightmare to manage, and we’ve gone in depth on why this kind of central governance doesn’t work at scale. In large organizations with multiple subsidiaries and thousands of employees, central governance just isn’t feasible within a single schema registry, and neither is keeping centralized schemas in sync across multiple, siloed schema registries.

This is particularly true for organizations that grow through a series of acquisitions. Different teams with independent products—not to mention entirely different approaches to tracking and data design—must somehow be centrally synchronized. 

That’s why we’re taking steps to empower a central data team to work with multiple independent schema registries within Avo. This will enable a central data team to sync schemas across subsidiaries—while the subsidiaries can maintain autonomy over their own schemas. 

Organizations: Multiple subsidiaries with independent schema registries

An Organization is a collection of independent schema registries (typically associated with the org’s subsidiaries), with the possibility of centralized schemas for events that need to be synchronized across the subsidiaries (more on Centralized Schemas below). 

Organizations allow you to have centralized governance for a set of events that need to be kept in sync across subsidiaries, while still allowing each subsidiary to work fully independently with their own federated governance, their own event naming framework, and their own set of conventions. This is perfect for organizations with independent subsidiaries (e.g. through acquisitions) who have all been running their own analytics for years or decades.  

Let’s imagine a media organization called Entertainment Inc that contains three subsidiary companies: Newsreader, Movie Streams, and Mega Games. Each of the subsidiaries has its own schema registry, with data design standards and naming conventions tailored to its individual products.

Within an Organization, the umbrella company Entertainment Inc can oversee its subsidiaries in one place. 

The central team has visibility into each subsidiary and can coordinate with them. They can support the subsidiaries’ tracking efforts and provide recommendations based on how other teams are structuring their data. The data team at Entertainment Inc essentially has a bird’s eye view of everything being tracked, and can keep the ship moving in the right direction. 
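To make the structure concrete, here is a minimal sketch of how an Organization could be modeled. This is an illustration only, not Avo’s API: the class names, the naming conventions, and the sample event names are all assumptions made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SchemaRegistry:
    """One subsidiary's independent registry (hypothetical model, not Avo's API)."""
    subsidiary: str
    naming_convention: str  # each subsidiary keeps its own convention
    events: dict[str, dict] = field(default_factory=dict)


@dataclass
class Organization:
    """An umbrella company, its subsidiaries' registries, and shared schemas."""
    name: str
    registries: list[SchemaRegistry] = field(default_factory=list)
    centralized_events: dict[str, dict] = field(default_factory=dict)

    def overview(self) -> dict[str, list[str]]:
        """Bird's-eye view: the event names each subsidiary tracks."""
        return {r.subsidiary: sorted(r.events) for r in self.registries}


# Sample event names below are invented for illustration.
org = Organization(
    name="Entertainment Inc",
    registries=[
        SchemaRegistry("Newsreader", "Title Case", {"Article Opened": {}}),
        SchemaRegistry("Movie Streams", "snake_case", {"movie_played": {}}),
        SchemaRegistry("Mega Games", "camelCase", {"levelCompleted": {}}),
    ],
)
print(org.overview())
```

Each registry keeps its own conventions, while the Organization only aggregates them for visibility.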

But what if the central team needs new data from all of the subsidiaries? 

Centralized schemas: Sync schemas across independent subsidiaries

For organizations of this scale, individual subsidiaries can and should own their schema registry. However, there are times when a central data governance team requires certain events to be tracked in a uniform way across all subsidiaries and domains. To make this possible, we’ve introduced Centralized Schemas.

Let’s say the central data team at Entertainment Inc wants to compare the number of new subscriptions across each of the three subsidiaries: Newsreader, Movie Streams, and Mega Games. To get this information, they need a single event, “subscription_started”, to be tracked, ideally with a consistent schema across the board but, most importantly, with consistent meaning. For example, does “subscription_started” refer to the moment the user presses the subscription button, or to the moment they’ve finished filling in their info and payment has been confirmed?
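As a rough sketch, a centralized event could carry its meaning alongside its schema, so every subsidiary implements the same thing. The class, the property names, and the choice of “payment confirmed” as the definition are assumptions for illustration, not Avo’s data model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CentralizedEvent:
    """Hypothetical centralized event: a name, an agreed meaning, and a schema."""
    name: str
    description: str            # pins down WHEN the event should be sent
    properties: dict[str, str]  # property name -> expected type name


# The definition chosen here (payment confirmed) is just one possible answer.
SUBSCRIPTION_STARTED = CentralizedEvent(
    name="subscription_started",
    description="Sent once the user has filled in their info and payment is confirmed.",
    properties={"plan": "string", "subsidiary": "string"},  # assumed properties
)


def missing_properties(event: CentralizedEvent, payload: dict) -> list[str]:
    """Return the required property names that the payload does not include."""
    return sorted(set(event.properties) - set(payload))


print(missing_properties(SUBSCRIPTION_STARTED, {"plan": "monthly"}))
```

Writing the description down next to the schema is what keeps the numbers comparable across subsidiaries, even when the surrounding events differ.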

Centralized schemas in practice: Plan, review, and push

With Centralized Schemas, synchronizing uniform tracking across subsidiaries is easy. The good news is that the workflow for Centralized Schemas matches the Avo workflow you might already be familiar with: plan, review, and push.

Push centralized schemas out across subsidiaries and monitor the rollout. 

Just like in a regular Avo workspace, the central team will plan their analytics changes in an Avo branch in the central workspace. 

Once the data changes are drafted, you can invite the owners of each subsidiary workspace affected by your proposed changes to review them. The review highlights conflicts between universal and subsidiary data structures, so you can resolve them before rolling out to subsidiaries.
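One kind of conflict a review might surface is an event that already exists in a subsidiary under the same name but with a different shape. The sketch below assumes schemas are plain dictionaries of property names to type names; the function and the sample data are hypothetical and do not describe how Avo detects conflicts.

```python
from __future__ import annotations

Schema = dict[str, dict[str, str]]  # event name -> {property name -> type name}


def find_conflicts(central: Schema, subsidiary: Schema) -> dict[str, list[str]]:
    """For events both sides define, list properties that are missing or typed differently."""
    conflicts: dict[str, list[str]] = {}
    for event, central_props in central.items():
        if event not in subsidiary:
            continue  # a brand-new event cannot clash with anything
        local_props = subsidiary[event]
        differing = sorted(
            prop for prop, type_name in central_props.items()
            if local_props.get(prop) != type_name
        )
        if differing:
            conflicts[event] = differing
    return conflicts


# Invented sample data: the subsidiary already tracks the event with another type.
central = {"subscription_started": {"plan": "string", "price": "float"}}
movie_streams = {
    "subscription_started": {"plan": "string", "price": "int"},
    "movie_played": {"title": "string"},
}
print(find_conflicts(central, movie_streams))
```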

Once you’ve cleared the review process, you can merge the branch, which automatically opens a branch (pull request) with the requested changes in each selected subsidiary workspace. All with the push of a button.

Merge a branch in the central workspace to push out to selected subsidiaries. 

From there, the central data team will have an overview of the rollout status within each subsidiary. This enables them to validate when the rollout is complete and data is ready to use from all relevant subsidiaries.

Monitor the rollout status of centralized schema changes into subsidiaries. 
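The push-and-monitor step can be pictured as a small state tracker. This is a sketch under assumptions: the status names and both functions are invented for illustration and are not part of Avo.

```python
from __future__ import annotations

from enum import Enum


class RolloutStatus(Enum):
    """Hypothetical rollout stages for one subsidiary."""
    BRANCH_OPENED = "branch opened"
    MERGED = "merged"
    IMPLEMENTED = "implemented"


def push_to_subsidiaries(selected: list[str]) -> dict[str, RolloutStatus]:
    """Merging the central branch opens one branch per selected subsidiary."""
    return {name: RolloutStatus.BRANCH_OPENED for name in selected}


def rollout_complete(statuses: dict[str, RolloutStatus]) -> bool:
    """Data is ready to use only when every subsidiary has implemented the change."""
    return all(status is RolloutStatus.IMPLEMENTED for status in statuses.values())


statuses = push_to_subsidiaries(["Newsreader", "Movie Streams", "Mega Games"])
statuses["Newsreader"] = RolloutStatus.IMPLEMENTED
print(rollout_complete(statuses))  # two subsidiaries are still pending

for name in statuses:
    statuses[name] = RolloutStatus.IMPLEMENTED
print(rollout_complete(statuses))  # every subsidiary is done
```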

No back-and-forth, messy spreadsheets, or long Slack threads. Instead, take advantage of a governance workflow with a built-in approval step and data design audit checks. Organizations and Centralized Schemas provide a fast, easy, and secure system to implement uniform tracking across all your products or subsidiaries.

What’s next for Avo Organizations

With Organizations and Centralized Schemas, we’ve taken a huge leap to help large organizations manage their schema registries at scale. But we’re just getting started. Here are some of the ideas we currently have, and we’d love to hear your thoughts:

What if a subsidiary data team could take inspiration from their peers in other subsidiaries within the organization and reduce duplicative work? Let’s say you’re working on implementing a new event. Your colleague in another subsidiary successfully designed, reviewed, and implemented an event for the same use case months or years ago. Wouldn’t it be great if you could draw from existing data structures, without having to start from scratch? In practice, here’s what it could enable you to do:

  • Get visibility into other workspaces within your organization;
  • Clone your peers’ data structures and adjust to your own needs;
  • Evolve existing data structures without having to start from a blank slate.

Let us know what you think, and what other use cases you’d like to see covered with Avo Organizations!

Try it out and help us shape governance for multi-subsidiary organizations 

We’re excited to introduce new workflows that make federated event data governance for multi-subsidiary organizations not just feasible, but fast.

Want to establish central governance with Organizations and Centralized Schemas? Book a demo today -> 

Launch week 🚀

This post is part of a series of launches:

Day 1: Introducing Guardrails: Codify standards for data design.

Day 2: Introducing Stakeholders: Divide and conquer your schema management.

Day 3: Introducing domain-specific schemas: More granular design and validation.

Day 4: Introducing: Federated event data governance for multi-subsidiary enterprises.

Day 5: What we shipped this week for data quality at scale.