Data architecture translates business needs into data and system requirements and seeks to manage data and its flow through the enterprise.

Data architecture definition

Data architecture describes the structure of an organization’s logical and physical data assets and data management resources, according to The Open Group Architecture Framework (TOGAF). It is an offshoot of enterprise architecture that comprises the models, policies, rules, and standards that govern the collection, storage, arrangement, integration, and use of data in organizations. An organization’s data architecture is the purview of data architects.

Data architecture goals

The goal of data architecture is to translate business needs into data and system requirements and to manage data and its flow through the enterprise. Many organizations today are looking to modernize their data architecture as a foundation to fully leverage AI and enable digital transformation. Consulting firm McKinsey Digital notes that many organizations fall short of their digital and AI transformation goals due to process complexity rather than technical complexity.

Data architecture principles

According to Joshua Klahr, vice president of product management, core products, at Splunk, and former vice president of product management at AtScale, six principles form the foundation of modern data architecture:

- Data is a shared asset. A modern data architecture needs to eliminate departmental data silos and give all stakeholders a complete view of the company.
- Users require adequate access to data. Beyond breaking down silos, modern data architectures need to provide interfaces that make it easy for users to consume data using tools fit for their jobs.
- Security is essential. Modern data architectures must be designed for security, and they must support data policies and access controls directly on the raw data.
- Common vocabularies ensure common understanding. Shared data assets, such as product catalogs, fiscal calendar dimensions, and KPI definitions, require a common vocabulary to help avoid disputes during analysis.
- Data should be curated. Invest in core functions that perform data curation (modeling important relationships, cleansing raw data, and curating key dimensions and measures).
- Data flows should be optimized for agility. Reduce the number of times data must be moved to reduce cost, increase data freshness, and optimize enterprise agility.

Data architecture components

A modern data architecture consists of the following components, according to IT consulting firm BMC:

- Data pipelines. A data pipeline is the process in which data is collected, moved, and refined. It includes data collection, refinement, storage, analysis, and delivery.
- Cloud storage. Not all data architectures leverage cloud storage, but many modern data architectures use public, private, or hybrid clouds to provide agility.
- Cloud computing. In addition to using cloud for storage, many modern data architectures make use of cloud computing to analyze and manage data.
- APIs. Modern data architectures use APIs to make it easy to expose and share data.
- AI and ML models. AI and ML are used to automate systems for tasks such as data collection and labeling. At the same time, modern data architectures can help organizations unlock the ability to leverage AI and ML at scale.
- Data streaming. Data streaming is the continuous flow of data from a source to a destination for processing and analysis in real time or near real time.
- Container orchestration. A container orchestration system such as open-source Kubernetes is often used to automate software deployment, scaling, and management.
- Real-time analytics. The goal of many modern data architectures is to deliver real-time analytics, the ability to perform analytics on new data as it arrives in the environment.

Data architecture vs. data modeling

According to the Data Management Body of Knowledge (DMBOK 2), data architecture defines the blueprint for managing data assets by aligning with organizational strategy to establish strategic data requirements and designs to meet those requirements. On the other hand, DMBOK 2 defines data modeling as “the process of discovering, analyzing, representing, and communicating data requirements in a precise form called the data model.”

While both data architecture and data modeling seek to bridge the gap between business goals and technology, data architecture takes the macro view: it seeks to understand and support the relationships between an organization’s functions, technology, and data types. Data modeling takes a more focused view of specific systems or business cases.

Data architecture frameworks

Several enterprise architecture frameworks commonly serve as the foundation for building an organization’s data architecture framework:

- DAMA-DMBOK 2. DAMA International’s Data Management Body of Knowledge is a framework specifically for data management. It provides standard definitions for data management functions, deliverables, roles, and other terminology, and presents guiding principles for data management.
- Zachman Framework for Enterprise Architecture. The Zachman Framework is an enterprise ontology created by John Zachman at IBM in the 1980s. The “data” column of the Zachman Framework comprises multiple layers, including architectural standards important to the business, a semantic model or conceptual/enterprise data model, an enterprise/logical data model, a physical data model, and actual databases.
- The Open Group Architecture Framework (TOGAF). TOGAF is an enterprise architecture methodology that offers a high-level framework for enterprise software development. Phase C of TOGAF covers developing a data architecture and building a data architecture roadmap.
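
The step from a logical data model to a physical data model and an actual database, as listed in the Zachman “data” column, can be illustrated with a small sketch. This is only an illustration under stated assumptions, not something prescribed by any of the frameworks above: the Customer and Invoice entities, the SQL_TYPES mapping, and the to_ddl helper are hypothetical names invented for this example, and SQLite stands in for whatever database an organization actually uses.

```python
import sqlite3
from dataclasses import dataclass
from typing import get_type_hints


# Logical model: entities and attributes, independent of any database product.
# Customer and Invoice are hypothetical entities chosen for illustration.
@dataclass
class Customer:
    customer_id: int
    name: str


@dataclass
class Invoice:
    invoice_id: int
    customer_id: int  # logical relationship: each invoice belongs to one customer
    total: float


# Assumed mapping from logical attribute types to SQLite column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


def to_ddl(entity, primary_key):
    """Derive a physical CREATE TABLE statement from a logical entity."""
    columns = []
    for name, py_type in get_type_hints(entity).items():
        column = f"{name} {SQL_TYPES[py_type]}"
        if name == primary_key:
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {entity.__name__.lower()} ({', '.join(columns)})"


# Physical model and actual database: an in-memory SQLite instance.
conn = sqlite3.connect(":memory:")
conn.execute(to_ddl(Customer, "customer_id"))
conn.execute(to_ddl(Invoice, "invoice_id"))

conn.execute("INSERT INTO customer VALUES (?, ?)", (1, "Acme"))
conn.execute("INSERT INTO invoice VALUES (?, ?, ?)", (10, 1, 250.0))

# Query across the relationship defined in the logical model.
row = conn.execute(
    "SELECT c.name, i.total FROM invoice i JOIN customer c USING (customer_id)"
).fetchone()
print(row)  # ('Acme', 250.0)
```

The logical entities stay free of storage details, while the derived DDL is where database-specific choices (column types, keys) appear. In this framing, data modeling concerns itself with entities like these for one system, and data architecture concerns itself with how many such models and stores fit together across the enterprise.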
Modern data architecture best practices

Modern data architectures must be designed to take advantage of emerging technologies such as artificial intelligence (AI), automation, internet of things (IoT), and blockchain. Dan Sutherland, senior director, technology consulting, Protiviti, says modern data architectures should adhere to the following best practices:

- Cloud-native. Modern data architectures should be designed to support elastic scaling, high availability, end-to-end security for data in motion and data at rest, and cost and performance scalability.
- Scalable data pipelines. To take advantage of emerging technologies, data architectures should support real-time data streaming and micro-batch data bursts.
- Seamless data integration. Data architectures should integrate with legacy applications using standard APIs. They should also be optimized for sharing data across systems, geographies, and organizations.
- Real-time data enablement. Modern data architectures should support the ability to deploy automated and active data validation, classification, management, and governance.
- Decoupled and extensible. Modern data architectures should be designed to be loosely coupled, enabling services to perform minimal tasks independent of other services.

Data architecture roles

Here are some of the most popular job titles related to data architecture and the salary range for each position, according to data from PayScale:

- Data architect: $79K-$160K
- Project manager: $58K-$129K
- Solutions architect: $76K-$163K
- Data engineer: $66K-$132K
- Data analyst: $45K-$87K
- Data scientist: $68K-$136K