Reimagining Data Governance For The AI Era
By Partha Anbil & Partha Khot

The adage "garbage in, garbage out" has never been more consequential. When AI-driven decisions can directly impact public health, research integrity, and regulatory compliance, the underlying data governance framework ceases to be a mere IT or compliance function. It becomes a first-order strategic imperative.
Drawing upon an in-depth analysis of a global pharmaceutical organization's large-scale enterprise resource planning (ERP) and data transformation, this article dissects the persistent governance challenges that undermine AI initiatives and presents a robust blueprint for building a resilient, scalable, and ethically sound framework. For industry professionals, effective data governance is the bedrock upon which the future of trustworthy AI in pharma must be built.
The Governance Gap In An AI-Powered World
The pharmaceutical value chain is inherently data-intensive, generating vast and heterogeneous data sets from R&D, clinical trials, manufacturing, supply chains, and real-world evidence (RWE). While this data is a goldmine for AI applications, its management is fraught with complexity. Information originates from diverse sources, including internal systems, contract research organizations (CROs), healthcare professionals, and patients, and exists in different formats, taxonomies, and quality levels. Many organizations, despite significant digital investments, continue to operate with fragmented data architectures, inconsistent master data, and siloed governance practices [1]. This foundational weakness creates a significant governance gap that exposes AI systems to a host of interconnected risks.

These challenges manifest daily as operational inefficiencies, compromised data trust, delayed regulatory submissions, and a heightened risk of deploying AI systems that are not only inaccurate but also ethically and legally indefensible. Addressing them requires a fundamental rethinking of how governance is structured, incentivized, and operationalized across the enterprise.
A Blueprint For Modern Data Governance
To bridge the governance gap, organizations must adopt a modern framework that is adaptive, distributed, and value-oriented. Based on a synthesis of best practices and real-world application, three core principles emerge: polycentric control, layered accountability, and treating data as a product.
1. The Polycentric Governance Model
The antidote to fragmented accountability is a federated or polycentric model [7]. In this system, individual data domains (e.g., clinical research, supply chain, regulatory affairs) operate as semi-autonomous governance centers. They retain control over their local data processes and standards, allowing them to remain agile and responsive to domain-specific needs and local regulations (such as GDPR or HIPAA).
However, these distributed nodes are interconnected and aligned by a set of enterprise-wide principles, shared metadata structures, and common interoperability standards enforced by a central governing body and enabling platforms, such as SAP Master Data Governance (MDG). This structure empowers domain experts to manage their data effectively while ensuring that the entire data ecosystem remains coherent, auditable, and strategically aligned. A key operational component of this model is the role of bilinguals: individuals who possess both deep domain knowledge (e.g., in pharmacovigilance or clinical trial design) and data or AI literacy. These actors serve as translators between business and technical teams, helping to shape governance questions, assess model risk, and guide cross-domain implementation without sacrificing domain specificity [8].
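The split between enterprise-wide standards and domain-local control can be illustrated with a minimal Python sketch. It is not drawn from the case study or from SAP MDG; the rule names, field names, and domain keys are all hypothetical, and it assumes records are simple dictionaries of strings. The central body owns one rule set that applies everywhere, and each domain adds its own rules on top.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Rule = Callable[[Record], bool]

# Enterprise-wide rules owned by the central governing body (hypothetical examples).
ENTERPRISE_RULES: Dict[str, Rule] = {
    "has_record_id": lambda r: bool(r.get("record_id")),
    "has_data_owner": lambda r: bool(r.get("data_owner")),
}

# Domain-local rules owned by each semi-autonomous domain (hypothetical examples).
DOMAIN_RULES: Dict[str, Dict[str, Rule]] = {
    "clinical_research": {
        "has_study_id": lambda r: bool(r.get("study_id")),
    },
    "supply_chain": {
        "has_plant_code": lambda r: bool(r.get("plant_code")),
    },
}


def validate(domain: str, record: Record) -> List[str]:
    """Return the names of all failed rules, enterprise rules first."""
    rules = {**ENTERPRISE_RULES, **DOMAIN_RULES.get(domain, {})}
    return [name for name, rule in rules.items() if not rule(record)]


if __name__ == "__main__":
    record = {"record_id": "R-1", "data_owner": "", "plant_code": "P01"}
    print(validate("supply_chain", record))  # ['has_data_owner']
```

In this sketch a domain can add or change its local rules without touching the shared set, while every record is still checked against the enterprise rules regardless of where it originates.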
2. Layered Accountability
While authority is distributed in a polycentric model, accountability must be clearly and vertically structured. A layered accountability framework ensures that roles and responsibilities are unambiguously defined across strategic, operational, and technical tiers [9]. This creates a clear chain of command for decision-making and risk management.
At the strategic layer, an executive governance committee and a cross-functional data governance council provide oversight, define enterprise-wide policies, and arbitrate high-level conflicts. They are the ultimate authority on data strategy and investment. At the operational layer, business data owners are accountable for the quality and use of data within their domains. They are supported by data stewards — SMEs who translate business needs into data requirements, monitor quality, and manage metadata. The enablement layer consists of technical and compliance roles, including IT stewards who manage the underlying platforms and compliance officers who ensure adherence to regulatory requirements, such as GxP and 21 CFR Part 11. Data producers and consumers at the front lines are responsible for adhering to defined standards at the point of data creation and use.
This stratification ensures that every data-related action, from strategic policy-setting to daily data entry, is governed by a clear line of sight to an accountable owner, reinforcing traceability and auditability throughout the entire data life cycle.
3. Data as a Product
To overcome the free-rider problem, the perception of governance must shift from a cost center to a value driver. The most effective way to achieve this is by adopting a "data as a product" mindset, a concept championed by the data mesh paradigm [10]. Under this model, critical data sets (e.g., clinical trial master data, supplier records, pharmacovigilance reports) are treated as internal products. They have designated owners, defined service-level agreements (SLAs) for quality, and internal customers who consume them for analytics and AI applications.
When data is a product, governance becomes an intrinsic part of its value proposition. A data product is only valuable if it is discoverable, addressable, trustworthy, secure, and interoperable. This incentivizes domain teams to invest in high-quality data and comprehensive documentation because the success and adoption of their product depend on it. Furthermore, governance outcomes should be made visible through dashboards and scorecards that track data quality, policy compliance, and lineage coverage, and these metrics should be linked to enterprise KPIs, such as time-to-regulatory-submission and safety signal response time [11]. This transforms governance from a compliance-driven mandate into a market-driven discipline.
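As a rough illustration of how a data product's SLA could feed a scorecard, the following Python sketch compares measured quality metrics against owner-defined targets. The class, the metric names, and the threshold values are hypothetical and chosen only for illustration; a real scorecard would draw its measurements from the organization's own data quality tooling.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DataProduct:
    name: str
    owner: str
    # Metric name -> minimum acceptable value in [0, 1] (hypothetical targets).
    sla_targets: Dict[str, float]


def scorecard(product: DataProduct, measured: Dict[str, float]) -> Dict[str, bool]:
    """Compare measured metrics with SLA targets; a missing metric counts as a breach."""
    return {
        metric: measured.get(metric, 0.0) >= target
        for metric, target in product.sla_targets.items()
    }


if __name__ == "__main__":
    supplier_records = DataProduct(
        name="supplier_records",
        owner="procurement",
        sla_targets={"completeness": 0.98, "lineage_coverage": 0.90},
    )
    result = scorecard(supplier_records, {"completeness": 0.99, "lineage_coverage": 0.85})
    print(result)  # {'completeness': True, 'lineage_coverage': False}
```

Treating an unmeasured metric as a breach is a deliberate choice in this sketch: it makes gaps in monitoring visible on the scorecard rather than letting them pass silently.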
Lessons From A Global ERP Transformation
The power of this blueprint is best understood through practical application. Consider the journey of a global pharmaceutical company embarking on an enterprise‑wide digital transformation anchored in its migration from SAP ECC to SAP S/4HANA, complemented by the implementation of SAP Master Data Governance (MDG) as a foundational capability.
The organization operated across dozens of countries with highly fragmented master data landscapes — multiple vendor, material, and customer master hierarchies, inconsistent data ownership models, and region‑specific governance practices that had evolved organically over years. These inconsistencies not only created operational inefficiencies but also hampered regulatory compliance, supply chain visibility, and analytics reliability — issues that became more acute as the company prepared for S/4HANA.
While the S/4HANA migration was positioned as a technical modernization, leadership quickly recognized that data quality and governance — not infrastructure — were the true gating factors. Simply converting ECC systems without addressing master data would have perpetuated historical issues, undermining the business case for transformation.
The company adopted a governance‑first transformation blueprint, sequencing SAP MDG as an enterprise capability rather than a post‑migration enhancement. Key elements included:
- Enterprise master data domain rationalization, prioritizing material, vendor, customer, and finance masters critical to manufacturing, quality, and global supply chain operations
- Establishment of global data ownership and stewardship models, clearly separating global standards from local extensions — essential in a regulated pharma context
- Introduction of MDG‑driven workflows integrated with business processes in procurement, manufacturing, quality, and commercial operations, ensuring governance was enforced at the point of creation rather than through retrospective cleanup
- Alignment of master data design with S/4HANA’s simplified data model, avoiding common pitfalls where legacy ECC constructs are blindly carried forward
This approach ensured that by the time core processes were migrated to S/4HANA, the organization was operating on harmonized, policy‑driven master data, rather than attempting to stabilize data post‑cutover.
Outcomes And Business Impact
The results went well beyond technical modernization:
- Accelerated S/4HANA adoption: Migration waves progressed faster due to reduced data inconsistencies and fewer downstream rework cycles.
- Improved regulatory readiness: Standardized master data strengthened traceability across manufacturing, quality, and serialization processes — critical for audits and inspections.
- Supply chain resilience: Cleaner, governed material and vendor masters enabled better planning accuracy and cross‑site visibility.
- Analytics‑ready foundation: A trusted master data layer unlocked more reliable reporting and advanced analytics, setting the stage for AI‑driven use cases in demand forecasting, quality insights, and supplier risk.
Perhaps most importantly, the program reframed master data from a back‑office IT concern into a strategic enterprise asset, tightly coupled with digital transformation outcomes rather than treated as an afterthought.
This case underscores a critical lesson for large pharmaceutical organizations: Successful S/4HANA transformation is less about system conversion and more about instituting disciplined, scalable data governance. By embedding SAP MDG at the heart of the program, the company ensured that its digital core was not only modern, but also trustworthy, compliant, and future‑ready.
The organization's initial as-is assessment revealed a familiar landscape of governance challenges: fragmented data ownership, inconsistent processes, and significant data quality issues across its core master data domains of finance, customer, supplier, and material. For instance, the absence of duplicate checks for financial master data created audit risks, while unclear ownership of customer tax attributes led to fragmented stewardship and cluttered data sets without retention policies. Material master data was particularly complex, with up to 16 different departments managing various attributes, resulting in inconsistent definitions, missing audit trails, and poor alignment between system policies and actual business processes.
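To make the idea of a duplicate check concrete, here is a minimal Python sketch that groups master data records sharing a normalized name and tax identifier. It is an illustration only, not the matching logic of SAP MDG or of the company described; the field names (`id`, `name`, `tax_id`) and the normalization rule are assumptions, and production matching typically relies on more robust, fuzzy techniques.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def match_key(name: str, tax_id: str) -> Tuple[str, str]:
    """Build a normalized key: case-folded alphanumeric name plus alphanumeric tax ID."""
    norm_name = re.sub(r"[^a-z0-9]", "", name.casefold())
    norm_tax = re.sub(r"[^A-Z0-9]", "", tax_id.upper())
    return norm_name, norm_tax


def find_duplicates(records: Iterable[Dict[str, str]]) -> List[List[str]]:
    """Group record IDs that share a match key; return only groups with 2+ members."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for rec in records:
        groups[match_key(rec["name"], rec["tax_id"])].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    vendors = [
        {"id": "V1", "name": "Acme Pharma, Inc.", "tax_id": "12-3456"},
        {"id": "V2", "name": "ACME PHARMA INC", "tax_id": "123456"},
        {"id": "V3", "name": "Beta Labs", "tax_id": "99-0001"},
    ]
    print(find_duplicates(vendors))  # [['V1', 'V2']]
```

Running such a check at the point of record creation, rather than as a periodic cleanup, is what would close the audit gap described above.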
To address this, the company implemented a hub‑and‑spoke operating model, directly translating polycentric design and layered accountability principles into day‑to‑day execution.
The hub acts as the central coordinating body, responsible for defining enterprise-wide data standards, managing the SAP MDG platform, and providing oversight and support. It is staffed with a Master Data Management (MDM) lead, domain MDM leads for each core data type (customer, supplier, finance, material), and a technical MDM lead, creating a center of excellence for data governance. With over 100 workflows across its core domains and more than 40 distinct account groups, this centralized orchestration was essential for ensuring process consistency and reducing variability in data handling.
The spokes represent the functional business units, such as global supply chain (GSC), finance, procurement, and R&D, that operationally own and manage the data. Each spoke is accountable for executing data creation and maintenance processes according to the standards set by the hub. Crucially, the model incorporates a maturity-sensitive approach, designating functions with well-established governance practices as thick spokes (e.g., GSC) and those with less maturity as thin spokes (e.g., R&D), which receive more intensive support from the hub. This pragmatic distinction ensures that governance expectations are calibrated to organizational reality.
The transformation followed a phased implementation: an initial setup and maturity assessment, a co-design phase to formalize the target operating model (TOM) with RACI matrices and interaction models, and a road map phase for scaling the model across the enterprise. This phased approach was critical for managing the complexity of a hybrid state where SAP MDG and the legacy SAP ECC system would coexist, requiring structured governance playbooks and interim ownership models to prevent role duplication and policy conflicts.
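One way to catch role duplication during a hybrid state is to check a RACI matrix mechanically. The Python sketch below flags any activity that lacks exactly one Accountable role or has no Responsible role. The activities and role names are hypothetical, and the "exactly one A" rule is a common RACI convention rather than something stated in the case study.

```python
from typing import Dict, List

# Activity -> role -> RACI code ("R", "A", "C", or "I"); all entries are hypothetical.
RaciMatrix = Dict[str, Dict[str, str]]


def raci_issues(matrix: RaciMatrix) -> List[str]:
    """Flag activities lacking exactly one Accountable role or any Responsible role."""
    issues = []
    for activity, assignments in matrix.items():
        codes = list(assignments.values())
        if codes.count("A") != 1:
            issues.append(f"{activity}: expected exactly one 'A', found {codes.count('A')}")
        if "R" not in codes:
            issues.append(f"{activity}: no 'R' assigned")
    return issues


if __name__ == "__main__":
    matrix = {
        "create_material": {"hub_material_lead": "A", "gsc_steward": "R", "it_steward": "C"},
        "approve_vendor": {"hub_supplier_lead": "A", "procurement_owner": "A"},
    }
    for issue in raci_issues(matrix):
        print(issue)
    # approve_vendor: expected exactly one 'A', found 2
    # approve_vendor: no 'R' assigned
```

A check like this could run whenever the target operating model is updated, so that conflicting ownership between the hub and a spoke surfaces before it reaches day-to-day operations.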
The Path Forward
The journey of this pharmaceutical company highlights a critical lesson for the industry: Building a robust data governance framework is not merely about designing an organizational chart or implementing new technology; it is about fundamentally re-architecting how data is owned, managed, and valued across the enterprise. The company's adoption of a hub-and-spoke model successfully laid the structural foundation for polycentric control and layered accountability. However, the analysis also revealed important opportunities for even greater maturity, including the need for a formal cross-domain arbitration council, the operationalization of cross-domain performance KPIs, and the critical extension of governance practices directly into AI and machine learning operations (MLOps) workflows.
For pharmaceutical leaders, the adoption of AI is a competitive and scientific necessity. To navigate this transition successfully, organizations must treat data governance with the strategic seriousness it deserves. This requires embracing a model that is federated in structure, layered in accountability, and value-driven in its orientation. By building a framework where governance is a catalyst rather than a constraint, pharmaceutical companies can unlock the immense potential of AI to drive innovation, enhance patient safety, and lead the future of healthcare.
References:
1. Khatri, V., & Brown, C. V. (2010). Designing Data Governance. Communications of the ACM, 53(1), 148–152.
2. Barocas, S., Hardt, M., & Narayanan, A. (2023). Fairness and Machine Learning. Available at https://fairmlbook.org/.
3. Batool, A., et al. (2025). Mapping accountability across the AI lifecycle: A governance-oriented framework.
4. Brous, P., Janssen, M., & Herder, P. (2020). The Dual Effects of Data Governance: A Systems Theory Perspective on Smart City Data Governance. Information Systems Frontiers, 22, 1109–1127.
5. Ladley, J. (2019). Data Governance: How to Design, Deploy and Sustain an Effective Data Governance Program (2nd ed.). Morgan Kaufmann.
6. Daly, A., Hagendorff, T., & Renda, A. (2019). Governance of Artificial Intelligence: Global Strategies and Emerging Gaps.
7. Abraham, R., Schneider, J., & vom Brocke, J. (2019). Data Governance: A Conceptual Framework, Structured Review, and Research Agenda. International Journal of Information Management, 49, 424–438.
8. Plotkin, D. (2020). Data Stewardship: An Actionable Guide to Effective Data Management and Data Governance. Academic Press.
9. DAMA International. (2017). DAMA-DMBOK2: Data Management Body of Knowledge. Technics Publications.
10. Dehghani, Z. (2022). Data Mesh: Delivering Data-Driven Value at Scale. O'Reilly Media.
11. McKinsey & Company. (2022). Winning with Data: How Pharma Companies Can Gain a Competitive Edge.
Authors' note: The views expressed in the article are those of the authors and not of the organizations they represent.
Partha Anbil is at the intersection of the life sciences industry and management consulting. He is currently an industry advisor, life sciences, at MIT, his alma mater. He held senior leadership roles at WNS, IBM, Booz & Company, Symphony, IQVIA, KPMG Consulting, and PwC. Anbil has consulted with and counseled health and life sciences clients on structuring solutions to address strategic, operational, and organizational challenges. He was a member of the IBM Industry Academy, a highly selective group of professionals inducted by invitation only, the highest honor at IBM. He is a healthcare expert member of the World Economic Forum (WEF).
Partha Khot is the life sciences practice lead at Coforge, a $1.7B multinational digital solutions and technology consulting services company focused on driving innovation at the intersection of domain and technology. He held leadership roles at Triomics, Abbott, and CitiusTech, driving healthcare innovation and consulting across the U.S., Europe, and India. Khot is responsible for developing next-generation life sciences solutions at Coforge, built on industry platforms and differentiated through AI/automation accelerators.