Data Management: 10 Trends Shaping Tomorrow’s Landscape

Arvind Mehrotra
9 min read · Aug 21, 2024

--

In 2023, the world generated an incredible 120 zettabytes of data, and that volume is projected to keep climbing steeply this year. But how do you manage such massive volumes? Read on for my picks for the top data management trends that will shape our present and future:

1. The widespread use of Data Management as a Service (DMaaS)

DMaaS is a cloud-based service model that outsources data infrastructure management to third-party providers. It offers a scalable, on-demand solution for storing, processing, and analysing data without upfront investment in hardware or software.

With DMaaS, you can leverage the provider’s infrastructure and expertise to handle tasks such as data ingestion, storage optimisation, and data lifecycle management. This lets you focus on deriving insights from your data rather than managing the underlying infrastructure.
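
To make the "outsourced lifecycle management" idea concrete, here is a minimal sketch, assuming an AWS S3 bucket with an illustrative name, that hands archival and expiry rules to the cloud provider instead of custom housekeeping scripts. A dedicated DMaaS product would expose similar policies through its own API or console.

```python
import boto3

# Hypothetical bucket name; in a DMaaS setup the provider handles the storage-tier moves.
s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-analytics-raw",  # assumed bucket, for illustration only
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire-raw-data",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                # Move objects to cold storage after 30 days...
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
                # ...and delete them after a year.
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```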

Contrarian View: DMaaS offers convenience but might introduce data security and control concerns. Relying on third-party services could pose risks if the provider suffers a data breach or their service experiences downtime. Additionally, hidden costs or data lock-in issues could affect long-term flexibility.

Additional Steps:

  • Implement Strong SLAs: Ensure your DMaaS provider offers robust Service Level Agreements (SLAs) covering uptime, security, and data access.
  • Regular Audits: Conduct regular security audits and service provider compliance checks.
  • Multi-Provider Strategy: Consider a hybrid approach using multiple DMaaS providers to avoid lock-in and increase resilience.

2. AI/ML-driven data quality (DQ) management

AI and ML technologies are increasingly automating data quality management processes. These technologies analyse large volumes of data to identify patterns, anomalies, and errors, enabling proactive detection and remediation of data quality issues.

AI/ML-driven DQ management involves data profiling, anomaly detection, and predictive analytics to ensure data accuracy, completeness, and consistency. Organisations can improve data quality by automating these tasks while reducing manual effort and human error.
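
To illustrate the anomaly-detection piece, the sketch below uses scikit-learn's IsolationForest on a single, made-up column; commercial DQ platforms wrap this kind of profiling in far richer rule engines and remediation workflows.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Toy dataset standing in for a real table; the last order amount is suspicious.
df = pd.DataFrame({"order_amount": [120.0, 95.5, 130.2, 101.0, 99.9, 9_999.0]})

# An unsupervised model flags records that deviate from the learned distribution.
model = IsolationForest(contamination=0.1, random_state=42)
df["anomaly"] = model.fit_predict(df[["order_amount"]])  # -1 = anomaly, 1 = normal

print(df[df["anomaly"] == -1])  # rows to route for review or remediation
```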

Contrarian View: Relying heavily on AI/ML for data quality might lead to overconfidence in automated systems. These systems can miss contextual nuances or new data errors that human experts would catch. Furthermore, AI models can inherit biases present in the training data.

Additional Steps:

  • Human Oversight: Maintain a layer of human review to complement AI-driven data quality checks.
  • Bias Detection: Implement mechanisms to evaluate and correct any biases in the AI models regularly.
  • Continuous Training: Regularly update and retrain AI models to adapt to new data and emerging error types.

3. The rise of data observability platforms

Data observability platforms provide visibility and monitoring capabilities across distributed data systems and pipelines. They collect, analyse, and visualise data metrics and events to track data flow, latency, and errors in real time. These platforms enable organisations to identify performance bottlenecks, troubleshoot issues, and optimise data pipelines for efficiency and reliability.

Data observability platforms typically integrate with existing monitoring and logging tools to provide comprehensive insights into the health and performance of data infrastructure.
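
As a minimal sketch of the kinds of checks these platforms automate, the snippet below emits freshness and volume metrics as structured log lines that any monitoring stack could scrape; the thresholds, metric names, and sample values are illustrative.

```python
import logging
from datetime import datetime, timedelta, timezone

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline.observability")


def check_freshness(last_loaded_at: datetime, max_lag_minutes: int = 60) -> None:
    """Emit a freshness metric and warn when a table falls behind its SLA."""
    lag = (datetime.now(timezone.utc) - last_loaded_at).total_seconds() / 60
    log.info("metric=freshness_lag_minutes value=%.1f", lag)
    if lag > max_lag_minutes:
        log.warning("Freshness SLA breached: %.1f min lag (limit %d)", lag, max_lag_minutes)


def check_row_count(current: int, previous: int, max_drop_pct: float = 20.0) -> None:
    """Emit a volume metric and warn on unexpected drops between runs."""
    drop_pct = 0.0 if previous == 0 else (previous - current) / previous * 100
    log.info("metric=row_count value=%d", current)
    if drop_pct > max_drop_pct:
        log.warning("Row count dropped %.1f%% versus the previous run", drop_pct)


# Illustrative values standing in for metadata pulled from the warehouse.
check_freshness(datetime.now(timezone.utc) - timedelta(hours=3))
check_row_count(current=80_000, previous=120_000)
```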

Contrarian View: Data observability platforms can become complex and expensive, potentially leading to an overabundance of monitoring and alerts that overwhelm users. They might also introduce new points of failure or dependencies.

Additional Steps:

  • Customisable Dashboards: Build and use customisable dashboards to focus on critical metrics and avoid alert fatigue.
  • Integration Testing: Ensure that the observability platform integrates well with existing systems to avoid adding unnecessary complexity.
  • Cost-Benefit Analysis: Regularly assess the value of observability tools against their cost and complexity.

4. Transitioning to a data mesh architecture

Data mesh architecture is a decentralised approach to data management that shifts the focus from centralised data warehouses to distributed data domains. In a data mesh, data ownership and governance are the responsibility of the individual domain teams that curate and manage their own data.

Standardised APIs and data contracts facilitate data exchange and interoperability between domains. This approach promotes agility, autonomy, and scalability, allowing organisations to adapt to evolving data needs and requirements.
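
One common way domain teams publish such data contracts is as versioned schemas in code. Below is a minimal sketch using pydantic for a hypothetical "orders" domain; real contracts also cover SLAs, ownership, and semantics.

```python
from datetime import datetime
from pydantic import BaseModel, ValidationError


class OrderEvent(BaseModel):
    """Data contract published by a hypothetical 'orders' domain team.

    Consumers in other domains code against this schema, not the raw tables.
    """
    order_id: str
    customer_id: str
    amount: float
    currency: str = "EUR"
    created_at: datetime


# The producing domain validates records before exposing them as a data product.
try:
    event = OrderEvent(
        order_id="o-1",
        customer_id="c-9",
        amount="12.50",                    # coerced to float by the contract
        created_at="2024-08-01T10:00:00Z",  # coerced to datetime
    )
    print(event)
except ValidationError as exc:
    print(exc)  # breaking the contract fails fast, before consumers are affected
```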

Contrarian View: Data mesh architecture promotes decentralisation but can lead to inconsistencies and inefficiencies if not appropriately managed. It also requires a significant cultural shift and alignment across different teams, which can be challenging.

Additional Steps:

  • Governance Framework: Develop a robust governance framework to ensure consistency and quality across decentralised data domains.
  • Training and Support: Provide comprehensive training and support to teams to facilitate the transition and adoption.
  • Pilot Projects: Start with pilot projects to test and refine the data mesh approach before full-scale implementation.

5. Advancements in data encryption algorithms (SHA-3, FHE, Blowfish and others)

Advancements in cryptographic algorithms improve the security and efficiency of data protection mechanisms. For example, SHA-3 (Secure Hash Algorithm 3), a cryptographic hash standard rather than an encryption algorithm, offers more robust properties than its predecessors for verifying data integrity and authenticity.

Fully Homomorphic Encryption (FHE) enables computations on encrypted data without decrypting it, preserving confidentiality while allowing data analysis and processing. Blowfish is a symmetric-key block cypher algorithm known for its fast encryption and decryption speed, making it suitable for resource-constrained environments.
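
As a small, concrete example, Python's standard library already exposes SHA-3 for integrity checks; the payload below is made up, and FHE or block ciphers such as Blowfish would require dedicated, carefully reviewed libraries.

```python
import hashlib

payload = b"customer_id=42;balance=1030.55"  # illustrative record

# SHA-3 digest for integrity checking; store or transmit it alongside the data.
digest = hashlib.sha3_256(payload).hexdigest()
print(digest)

# Recomputing over altered data yields a completely different digest,
# which is how tampering is detected.
tampered_digest = hashlib.sha3_256(payload + b"0").hexdigest()
print(digest == tampered_digest)  # False
```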

Contrarian View: Advanced encryption algorithms can introduce performance overhead and are complex to implement correctly. Additionally, newer algorithms might become obsolete if vulnerabilities are discovered over time.

Additional Steps:

  • Performance Testing: Regularly test and tune encryption methods to balance security with performance.
  • Stay Informed: Keep updated with the latest research on encryption methods to anticipate potential future vulnerabilities.
  • Fallback Strategies: Develop and maintain fallback encryption strategies in case newer algorithms face issues.

6. Streaming data moves from niche to normal

Streaming data technologies enable data to be processed and analysed in real time, as it is generated. Traditionally used in niche applications such as financial trading and sensor data monitoring, streaming data is becoming increasingly mainstream across industries. Platforms like Apache Kafka and Apache Flink facilitate aggregating, processing, and analysing continuous data streams at scale.

By adopting streaming data architectures, you can achieve low-latency data processing, real-time analytics, and immediate insights into changing business conditions. This empowers you to make timely decisions and respond rapidly to events as they unfold.
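
Here is a minimal sketch of the produce-and-consume loop using the kafka-python client, assuming a broker at localhost:9092 and an illustrative topic name; production pipelines add schemas, partitioning strategies, and delivery guarantees on top.

```python
import json

from kafka import KafkaConsumer, KafkaProducer  # pip install kafka-python

# Assumes a Kafka broker running locally; the topic name is illustrative.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("payments", {"order_id": "o-1", "amount": 12.5})
producer.flush()

consumer = KafkaConsumer(
    "payments",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
for message in consumer:
    print(message.value)  # react to each event as it arrives
    break
```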

Contrarian View: Shifting to streaming data can increase system complexity and require significant infrastructure investments. Real-time processing also demands high reliability, which can be challenging to achieve.

Additional Steps:

  • Scalable Infrastructure: Invest in scalable infrastructure that can handle increasing volumes of streaming data.
  • Resilience Testing: Implement rigorous testing to ensure system reliability and handle potential failures gracefully.
  • Cost Management: Monitor and manage costs associated with streaming data to ensure they align with business value.

7. DataOps bleeds into FinOps

DataOps practices focus on streamlining and automating data operations processes, from data integration and quality management to deployment and monitoring. Similarly, FinOps (Financial Operations) optimises the cost and performance of cloud resources and services.

As organisations increasingly rely on data-intensive applications and cloud-based infrastructures, the convergence of DataOps and FinOps becomes inevitable.
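
One concrete point of convergence is attributing cloud spend to individual pipelines. The sketch below queries AWS Cost Explorer with boto3, assuming workloads carry a hypothetical data-pipeline cost-allocation tag; other clouds offer equivalent billing APIs.

```python
import boto3

ce = boto3.client("ce")  # AWS Cost Explorer

# Monthly spend broken down by an assumed 'data-pipeline' cost-allocation tag,
# so the DataOps team can see which pipelines drive the cloud bill.
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-07-01", "End": "2024-08-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "data-pipeline"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    tag_value = group["Keys"][0]
    cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
    print(f"{tag_value}: ${cost:.2f}")
```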

Contrarian View: Integrating DataOps with FinOps can blur the lines between data and financial management, potentially creating conflicts or inefficiencies. An overemphasis on financial efficiency can also detract from other critical aspects of data operations.

Additional Steps:

  • Clear Objectives: To avoid conflicts, define objectives and boundaries for integrating DataOps with FinOps.
  • Balanced Approach: Ensure a balanced approach considering operational efficiency and financial goals.
  • Cross-Functional Teams: Foster collaboration between data and finance teams to align goals and processes.

8. Serverless database adoption explodes

Serverless databases abstract the underlying infrastructure management tasks, allowing you to focus solely on data management and application development. With serverless database offerings such as Amazon Aurora Serverless, Google Cloud Firestore, and Microsoft Azure Cosmos DB, you can provision database resources on-demand and scale automatically based on workload demand.

This model eliminates the need for capacity planning, provisioning, and maintenance, reducing operational overhead and cost.
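
To illustrate the developer experience, here is a minimal sketch against Cloud Firestore, one of the services named above, assuming credentials and a GCP project are already configured; there is no instance to size, patch, or scale.

```python
from google.cloud import firestore  # pip install google-cloud-firestore

# Assumes application-default credentials and a GCP project are already set up.
db = firestore.Client()

# Write a document; there is no cluster to provision or scale beforehand.
db.collection("customers").document("c-42").set({"name": "Acme Ltd", "tier": "gold"})

# Read it back; capacity scales with request load, not pre-allocated servers.
snapshot = db.collection("customers").document("c-42").get()
print(snapshot.to_dict())
```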

Contrarian View: While serverless databases simplify management, they can introduce challenges such as unpredictable costs and vendor lock-in. They also lack certain features or performance optimisations available in traditional databases.

Additional Steps:

  • Cost Monitoring: Implement robust cost monitoring and forecasting to manage serverless database expenses effectively.
  • Feature Comparison: Evaluate and compare serverless databases against traditional options to ensure they meet all required features and performance needs.
  • Vendor Assessment: Regularly assess the serverless database provider to mitigate the risk of vendor lock-in.

9. The death of data gravity

Data gravity refers to the tendency of data to accumulate and attract applications and services around it due to the high cost and latency of moving data over a distance. However, data gravity is losing relevance with the emergence of edge computing and distributed architectures. Edge computing brings computation and data storage closer to the data source, reducing latency and enabling real-time data processing and analysis at the edge.

Distributed architectures, such as microservices and containerised applications, facilitate data locality and mobility, allowing data to be processed and analysed wherever it resides. As a result, organisations can overcome the constraints of data gravity and harness the full potential of distributed data environments to drive innovation, agility, and efficiency in their data management strategies.

Contrarian View: The move away from centralised data centres can create challenges in data consistency, integration, and security. Edge computing and distributed architectures may also introduce data management and compliance complexities.

Additional Steps:

  • Data Consistency Protocols: Develop protocols to ensure data consistency and integration across distributed systems.
  • Enhanced Security: Implement robust security measures to protect data in a decentralised environment.
  • Compliance Checks: Ensure distributed architectures comply with regulatory and data protection requirements.

10. Towards natural language SQL (N2SQL)

Natural Language SQL (N2SQL) aims to simplify data querying and analysis by enabling users to interact with databases using natural language commands and queries. By leveraging natural language processing (NLP) and machine learning (ML) techniques, N2SQL interprets and translates human-readable queries into SQL statements, abstracting away the complexities of database querying and syntax.
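
A minimal sketch of the pattern, assuming an LLM-backed translator (the OpenAI client and model name below are illustrative, not the only option): the schema and question go in, SQL comes out, and a human or validation layer reviews it before execution.

```python
import sqlite3

from openai import OpenAI  # assumed provider; any capable LLM API would do

SCHEMA = "CREATE TABLE sales (region TEXT, product TEXT, revenue REAL, sold_on DATE);"


def question_to_sql(question: str) -> str:
    """Translate a natural-language question into SQL for the schema above."""
    client = OpenAI()  # assumes OPENAI_API_KEY is set; the model name is illustrative
    prompt = f"Schema:\n{SCHEMA}\n\nReturn only a SQLite query answering: {question}"
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.execute("INSERT INTO sales VALUES ('EMEA', 'Widget', 1200.0, '2024-07-15')")

sql = question_to_sql("What was total revenue by region?")
print(sql)
# After review or automated validation, the query can be run:
#   conn.execute(sql).fetchall()
```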

As N2SQL technologies mature and become more widely adopted, organisations can foster data-driven decision-making and collaboration across diverse teams and departments, unlocking the full potential of their data assets.

Contrarian View: N2SQL may still struggle to handle complex queries effectively, which could lead to misunderstandings or misinterpretations of user intent. Its capabilities and flexibility could also be limited compared to traditional SQL.

Additional Steps:

  • User Training: Train users on the limitations and best practices for using N2SQL.
  • Hybrid Approach: Combine N2SQL with traditional SQL for complex queries and scenarios where natural language might fail.
  • Continuous Improvement: Invest in ongoing development to enhance the capabilities and accuracy of N2SQL systems.

As you can see, the data management landscape is being profoundly transformed, driven by technological advancements, evolving business needs, and shifting regulatory requirements.

These contrarian perspectives and additional steps provide a more balanced view and practical considerations for effectively implementing these emerging data trends.

By staying on top of these top data management trends, you can future-proof your data strategy, unleash your assets’ full potential, and gain a competitive advantage in tomorrow’s digital economy.

If you found this article interesting, please join the conversation in the comments below. You can also email me at Arvind@am-pmassociates.com.

--

Arvind Mehrotra

Board Advisor, Strategy, Culture Alignment and Technology Advisor