Don’t Just Rush to Adopt New Technologies — Plan a Path to Monetization by Following These Rules
A wide range of transformative, interrelated and disruptive technologies has arrived.
Volume XXIII, Issue 35

The success or failure of companies over the next 10 years will increasingly be driven by how well they can take advantage of their data. Despite the growing importance of data for supporting better decision-making and achieving competitive differentiation, many organizations have failed to realize its full transformative potential. This failure often occurs in organizations that are too focused on the data-transformation end game (analytics and visualization, machine learning (ML) tools, business intelligence software) rather than on foundational inputs such as data integration, preparation and mastering. That's like investing in the cart before buying the horse.

If companies do not have quality data to analyze, they will not see the return on investment they expect from data analytics tools. However, quality data comes from many different sources and in many different formats. Companies must be able to access and aggregate data from multiple silos in order to properly analyze it.

What’s driving the need for new infrastructure?

Several trends are driving the need for modern data infrastructure. They include:

  • An increasing volume of data creation and collection

  • The transition of workloads to the cloud, including both hybrid cloud/on-premises environments and multicloud environments

  • The fragmentation of applications, data environments and data sources — including enterprise resource planning (ERP), customer relationship management (CRM) and customer data platform (CDP) software

  • An increasing need for real-time decision-making

  • An increasing need for “self-serve” insights across business units

  • An increasing need for stakeholders outside the organization to access data and insights

  • The increasing importance of data as a driver of business change

Data is being generated everywhere — e.g., during ecommerce transactions, by the online tracking of customers and from Internet of Things (IoT) devices — and the impact is being felt across all industries. According to a survey by Dresner, 52% of businesses are using big data today. Even companies in lagging industries such as manufacturing, healthcare and retail — more than 40% of them, in fact — are currently using big data.

Organizations are capturing data — but most of it is never analyzed

All industries gather data, but much of that data is never used. Unused data, also known as “dark data,” accounts for more than 50% of all data held by corporations. In other words, companies go to great lengths to gather data that they never actually analyze.

The principal reason for this phenomenon is the existence of legacy IT infrastructures that have data residing in siloed databases and in different formats. These complex data structures make the data unusable without significant integration efforts.

In addition, finding the right talent to make sense of this data can be challenging. According to a QuantHub survey, there is a shortage of data scientists, and talent is expensive.

Recognizing the value of unanalyzed data leads to large investments in cloud IT infrastructure

Before data can be made accessible to business and research analysts, companies must first integrate, clean, organize and store their data. A significant investment in a modern data infrastructure is required to accomplish this.

Cloud investment rose in 2020 even as budgets tightened: overall IT spending decreased by 8%, while cloud spending increased by 18%. As more companies move more data to the cloud, investment levels are projected to increase. A KPMG study showed that 88% of surveyed companies are currently using cloud IT infrastructure for at least some of their data, and that within the next two years, 50% of companies intend to move all their data to the cloud.

Moving data to the cloud is an investment that fuels future investment — especially in working with the data. Once the data is hosted, companies can seek out third-party vendors to do much of the heavy lifting. Companies can sign up for software solutions to integrate, clean, host, analyze and visualize the data. The cloud storage industry is growing at a CAGR of 22%, and the analytics and data visualization industries are also expected to grow rapidly (CAGRs of 13% and 10%, respectively).

Modern cloud architecture makes it easier to store, clean, extract and analyze data

Historically, companies gathered data and stored it on their own servers, in whatever format it arrived in and in whatever database happened to receive it. The different databases did not talk to each other, but because they operated largely independently of one another, this was not an issue.

As companies have increasingly realized the value of analyzing their data, they have begun to build out data analytics teams that can analyze data from across the organization and better inform decision-making.

However, simply gathering and combining the data that lives in many different locations is an extremely onerous task and can significantly reduce the efficacy of an organization’s data scientists or analysts. By switching to a modern cloud architecture, organizations can more efficiently and effectively integrate, store and clean data, thus making it easier to extract and analyze (see Figure 1).

When building a modern infrastructure to harness the power of data, organizations must consider a number of factors.

Data ingestion and storage. Factors related to data ingestion and storage include:

Data integration. Organizations have many choices for integrating/replicating data across databases and applications — for example, extract, load, transform (ELT)/extract, transform, load (ETL); change data capture (CDC); data virtualization; integration platform as a service (iPaaS); and application programming interface (API). Organizations will typically utilize multiple technologies in parallel depending on the data integration need. For example, CDC may be used to create real-time, harmonized repositories for high-volume, structured data across cloud/hybrid environments (e.g., to support real-time reporting), whereas APIs are more typically used for low-volume, high-frequency application-to-application linkages (e.g., to support daily operations).
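
To make the batch pattern concrete, the sketch below shows a minimal ETL-style job in Python using pandas, with a local SQLite database standing in for a cloud warehouse. The file, column and table names are illustrative assumptions, not drawn from any particular vendor's tooling; CDC and iPaaS products replace the extract step with continuous capture, but the transform-and-load shape stays recognizable.

```python
import sqlite3

import pandas as pd

# Extract: read raw order records from a hypothetical silo export.
raw = pd.read_csv("raw_orders.csv")  # assumed file name and layout

# Transform: normalize column names, parse dates, drop incomplete rows.
raw.columns = [c.strip().lower().replace(" ", "_") for c in raw.columns]
raw["order_date"] = pd.to_datetime(raw["order_date"], errors="coerce")
clean = raw.dropna(subset=["order_date", "customer_id"])

# Load: write the harmonized table into the warehouse
# (SQLite stands in for a cloud warehouse here).
with sqlite3.connect("warehouse.db") as conn:
    clean.to_sql("orders", conn, if_exists="replace", index=False)
```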

Data storage. Two main examples of cloud data storage are data lakes and data warehouses. Data lakes store all data, regardless of format. Data warehouses are more organized and have all data in a common format, making it easier to analyze. Companies such as Snowflake specialize in making data lakes accessible to analysts by automating the process of cleaning the data lake and moving it to data warehouses.
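
The difference between the two is easiest to see in miniature. In the sketch below, a local "lake" directory of mixed CSV and JSON files (a stand-in for cloud object storage) is conformed to one shared schema before being persisted as a warehouse-style table; the directory and column names are assumptions for illustration.

```python
import pathlib

import pandas as pd

# A lake keeps files in whatever format they arrived in; a warehouse
# keeps one conformed schema. "lake/" stands in for object storage.
frames = []
for path in pathlib.Path("lake").iterdir():
    if path.suffix == ".csv":
        frames.append(pd.read_csv(path))
    elif path.suffix == ".json":
        frames.append(pd.read_json(path))

# Conform everything to a shared schema (assumed common columns),
# then persist the result as a warehouse-style columnar table.
conformed = pd.concat(frames, ignore_index=True)[["customer_id", "amount"]]
conformed.to_parquet("transactions.parquet")
```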

Data exploration. Once the data is in a usable format, analysis can be performed. There are many open-source libraries for analyzing data (e.g., Pandas) and building ML models (e.g., TensorFlow) that have robust functionality and large communities supporting them, although they do require coding abilities. An increasing number of paid, enterprise, and low- and no-code solutions make the data more accessible to business analysts. This data democratization empowers citizen data scientists, defined as people who “create or generate models that use advanced diagnostic analytics or predictive and prescriptive capabilities, but whose primary job function is outside the field of statistics and analytics.”
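
As a small example of the open-source route, the pandas snippet below aggregates made-up revenue records by region; a low- or no-code tool would expose the same groupby-style operation through a visual interface for a citizen data scientist.

```python
import pandas as pd

# Made-up example data standing in for a cleaned warehouse extract.
df = pd.DataFrame({
    "region": ["east", "west", "east", "west", "east"],
    "revenue": [1200, 800, 950, 1100, 700],
})

# Aggregate revenue by region to surface the strongest segment.
summary = df.groupby("region")["revenue"].agg(["sum", "mean"])
print(summary.sort_values("sum", ascending=False))
```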

Data visualization and output. Once the data is formatted and analyzed, software services/business intelligence solutions can provide tools and dashboards to help communicate data findings to the broader company and make the insights actionable. Some examples are Looker, Tableau and Power BI. Additionally, the data can then be used in internal apps or through third-party software (such as customer personalization through a digital experience platform) to provide value to customers.
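
Looker, Tableau and Power BI are commercial products, so as a rough open-source stand-in, the snippet below renders one dashboard-style tile with matplotlib, reusing the regional totals from the exploration example above (made-up numbers).

```python
import matplotlib.pyplot as plt
import pandas as pd

# Regional totals carried over from the exploration step above.
summary = pd.Series({"east": 2850, "west": 1900}, name="revenue")

# One dashboard-style tile; a BI tool would render this interactively.
ax = summary.plot(kind="bar", title="Revenue by region")
ax.set_ylabel("Revenue")
plt.tight_layout()
plt.savefig("revenue_by_region.png")  # shareable artifact for stakeholders
```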

The emergence of enterprise vendors offers a way to build a modern data platform

Historically, deploying artificial intelligence (AI) and ML models and applications into production has entailed numerous time-intensive manual steps, with various stakeholders contributing along the way and elongating lead times for deployment. Enterprise AI platforms, however, simplify data ingestion and mapping and consolidate data exploration and visualization under a single AI tool, allowing stakeholders to better focus on analysis, programming and development (see Figure 2). These platforms support tasks across the analytics pipeline, from data ingestion, integration and visualization to advanced modeling, testing and application deployment, including the use of AI and ML technologies.
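
These platforms are proprietary, but the consolidation they offer can be sketched with an open-source analogy: a scikit-learn Pipeline bundles preprocessing, modeling and a deployable artifact into one object, much as an enterprise platform puts the pipeline stages behind a single interface. The synthetic data and artifact name below are illustrative assumptions.

```python
import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features standing in for prepared enterprise data.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# One object chains preprocessing and modeling, echoing how a platform
# consolidates analytics-pipeline stages behind a single interface.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])
pipe.fit(X, y)

# "Deployment" here is simply persisting the fitted pipeline for a
# serving layer to load later (hypothetical artifact name).
joblib.dump(pipe, "model_pipeline.joblib")
```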

For organizations seeking to catch up with their peers, or for those that would rather not invest resources in technical staff or cumbersome tools, enterprise AI vendors are looking to fill the need with end-to-end solutions.

In sum

Momentum continues to build for organizations across industries to better harness their data in support of growth objectives and internal optimization initiatives. Companies are increasingly investing in solutions that can integrate with other systems, unlock predictive insights and empower a broader set of users beyond data scientists to participate in the process.

There is tremendous opportunity in this market. We are still in the early innings.
