With the exponential growth of data, large-scale data analysis has become an important part of the corporate agenda. Romain Picard, Regional Vice President South EMEA at Cloudera, tells us that an enterprise data cloud can empower businesses to get actionable insights from complex data anywhere.
Data has grown exponentially over the last 20 years and the potential for it to transform businesses is greater than it has ever been. IDC estimates that by 2025 the amount of data will hit a mind-boggling 163 zettabytes, marking the beginning of a digitisation wave that is showing no signs of abating.
Perhaps unsurprisingly, the value of data analysis at scale – including storing, managing, analysing, and harnessing information – has become an increasingly important part of the corporate agenda, not only for IT departments but also for senior management.
While most companies have now realised the business benefits of data analytics, developing the right strategy to harness its value can often be challenging. Although companies still need to rely on large data repositories for analytics at scale, the widespread use of IoT devices – and with it the large amount of data coming from edge networks and the need for consistent data governance – has prompted a wave of modernisation, requiring an end-to-end technology stack underpinned by the power of the cloud.
A vast number of organisations have now experienced the public cloud and value its simplicity and elasticity. However, unexpected operating costs and vendor lock-in have prompted enterprises to opt for other cloud infrastructure models that allow both choice and the ability to run demanding workloads no matter where they reside and originate, from the Edge to AI.
Same problems, new challenges
The most valuable and transformative business use cases – whether it’s IoT-enabled predictive maintenance, molecular diagnosis or real-time compliance monitoring – require multiple analytics workloads, data science tools and Machine Learning algorithms to interrogate the same diverse data sets to generate value for the organisation. It’s how the most innovative enterprises are unlocking value from their data and competing in the data age.
However, many enterprises are struggling for a number of reasons. Data no longer originates solely in the data centre, and the speed at which Digital Transformation is happening means that data also comes from public clouds and IoT sensors at the Edge. The heterogeneity of data sets and the spike in volumes that is driving the need for real-time analytics mean that many organisations haven’t yet figured out a practical way to run analytics or apply Machine Learning algorithms to all their data.
Their analytic workloads have also been running independently – in silos – because even newer cloud data warehouses and data science tools weren’t designed to work together. Additionally, the need to govern data coming from disparate sources makes a coherent approach to data privacy nearly impossible or, at best, forces onerous controls that limit business productivity and increase costs.
Back to the drawing board
Simple analytics that improve data visibility are no longer enough to keep up with the competition. Being data-driven requires the ability to apply multiple analytics disciplines against data located anywhere. Take autonomous and connected vehicles, for example: you need to process and stream real-time data from multiple endpoints at the Edge, while predicting key outcomes and applying Machine Learning on that same data to obtain comprehensive insights that deliver value.
The same applies, of course, to the needs of data stewards and data scientists in evaluating the data at different times in the processing chain. Today’s highest-value Machine Learning and analytics use cases have brought a variety of brand-new requirements to the table, which have to be addressed seamlessly throughout the data lifecycle to deliver a coherent picture.
Enterprises require a new approach. Companies now need a comprehensive platform that integrates all data from data centres and public, private, hybrid and multi-cloud environments: one that is constantly informed about the location, status and type of data and can also offer other services, such as data protection and compliance guidelines, at different locations.
The rise of the enterprise data cloud
Enterprises undergoing Digital Transformation are demanding a modern analytic experience across public, private, hybrid and multi-cloud environments, and they expect to run analytic workloads wherever they choose – regardless of where their data may reside. An enterprise data cloud gives them that flexibility, empowering businesses to get clear and actionable insights from complex data anywhere, based on four foundational pillars:
Hybrid and multi-cloud: Businesses now demand open architectures and the flexibility to move their workloads to different cloud environments, whether public or private. Being able to operate with equivalent functionality on and off premises – integrating with all major public clouds as well as the private cloud, depending on the workload – is the first ingredient in overcoming most data challenges.
Multi-function: Modern use cases generally require the application of multiple analytic functions working together on the same data. For example, autonomous vehicles require the application of both real-time data streaming and Machine Learning algorithms. Data disciplines – including edge analytics, streaming analytics, data engineering, data warehousing, operational analytics, data science and Machine Learning – should all be part of a multi-functional cloud-enabled toolset that can solve an enterprise’s most pressing data and analytic challenges in a streamlined fashion.
Secured and governed: With data coming from various sources comes great responsibility. Businesses want to run multiple analytic functions on the same data set with a common security and governance framework – enabling a holistic approach to data privacy and regulatory compliance across all their environments. An enterprise data cloud must therefore maintain strict enterprise data privacy, governance, data migration and metadata management regardless of where the data is located.
Open: Lastly, an enterprise data cloud must be open. Of course, this means open source software, but it also means open compute architectures and open data stores like Amazon S3 and Azure Data Lake Storage. Ultimately, enterprises want to avoid vendor lock-in (becoming dependent on a single provider) and favour open platforms, open integrations and open partner ecosystems. In the event of technical challenges, support comes not only from one company, the original supplier, but from the entire open source community. This also ensures fast innovation cycles and a competitive advantage.
To achieve their goals of Digital Transformation and becoming data-driven, businesses need more than just a better data warehouse, data science or BI tool. As new data types emerge, and new use cases come to the fore, they will need to rely on a range of analytical capabilities – from data engineering to data warehousing to operational databases and data science – available across a comprehensive cloud infrastructure.
Throughout their journey, they need to be able to move fluidly between these different analytics capabilities, exchanging data and gaining insights as they go. Being able to rely on an enterprise data cloud will future-proof their investment in technology innovation and help ensure business objectives are met across every division.