Where Does Data Integration Add Value for Analytics?

Data integration matters most in the context of big data and data blending, but it is also relevant whenever you need to draw on more than a few spreadsheets or relational database tables. As noted by TDWI in “Ten Ways Data Integration Provides Business Value,” sales and service teams depend on a complete view of a customer that draws on all relevant information, and data integration techniques and tools are key to building that view.

One area where data integration adds value for analytics is in BI and data warehousing. If you want insight into how data changes over time, it’s essential to calculate, aggregate, and roll up data as time series in a data warehouse. Another area where data integration adds value is in building new and valuable data sets, that is, data blending. The whole is greater than the sum of the parts: bringing together data from transactional systems, customer service records, and other operations can provide greater insight than any one piece alone.
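To make the roll-up and blending ideas concrete, here is a minimal Python sketch. It assumes two hypothetical sources, a list of transactions and a list of support tickets, with made-up field names and sample values. It is only an illustration of the idea, not how any particular data integration tool works.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records from two separate systems (illustrative data only).
transactions = [
    {"customer_id": "C1", "day": date(2016, 1, 5), "amount": 120.0},
    {"customer_id": "C2", "day": date(2016, 1, 20), "amount": 80.0},
    {"customer_id": "C1", "day": date(2016, 2, 3), "amount": 50.0},
]
support_tickets = [
    {"customer_id": "C1", "day": date(2016, 1, 7)},
    {"customer_id": "C1", "day": date(2016, 2, 10)},
    {"customer_id": "C2", "day": date(2016, 2, 11)},
]


def month_key(d):
    """Return a 'YYYY-MM' label used as the roll-up period."""
    return f"{d.year:04d}-{d.month:02d}"


def blend_monthly(transactions, support_tickets):
    """Roll both sources up to (month, customer) and blend them into one row each."""
    blended = defaultdict(lambda: {"revenue": 0.0, "tickets": 0})
    for t in transactions:
        blended[(month_key(t["day"]), t["customer_id"])]["revenue"] += t["amount"]
    for s in support_tickets:
        blended[(month_key(s["day"]), s["customer_id"])]["tickets"] += 1
    return dict(blended)


monthly = blend_monthly(transactions, support_tickets)
for key in sorted(monthly):
    print(key, monthly[key])
```

Each blended row shows revenue and ticket volume side by side for one customer and month, which neither source could show on its own.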

A third area where data integration adds value is in providing a 360-degree view of business entities, such as for customer support. It takes data integration to blend different data sources to make sure that, for example, a customer support rep talking to a customer on the phone has all relevant information at the point of business impact.

“Torture the data, and it will confess to anything.” – Ronald Coase

Evaluating Your Data Integration Capabilities

In the broader analytics marketplace, we often see organizations unsure of how much data integration functionality they require. Aesthetically appealing visualizations are often the first thing that comes to mind when planning an analytics project, but data integration is key to avoiding a “garbage in, garbage out” scenario.

Key questions to consider when deciding whether you need data integration capabilities:

  1. Do I need to blend several different data sources? Do you have more than one or two spreadsheets or RDBMS tables?
  2. Is my data cleansed and modeled in an analytics-ready format? Are countries and states spelled consistently in my data? How are null or zero values handled? Can you translate database concepts to business concepts, for example by properly identifying measures (metrics), dimensions (categories), and associated hierarchies from database fields?
  3. Do I want to enrich my data with new data sources? This is useful for getting the complete data set to deliver the insights you’re looking for, even if these sources are files of the same type.
  4. Have I already captured all the data I need? It’s likely that in the future there will be more that you need to be able to leverage. Data integration capabilities allow you to on-board a new data source and make use of it effectively.
  5. Will my data sources change in 6, 12, or 18 months? Establish flexible processes to accommodate these changes and fuel consistent analytics that reflect new information. Data integration can help you respond dynamically to such changes as well.
  6. Do I need ad-hoc and drill-down analytic capabilities? If yes, then data blending and analytic modeling (such as multi-dimensional modeling) must be leveraged in concert.
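
As an illustration of the cleansing checks in question 2, the following Python sketch normalizes country spellings and keeps missing values distinct from true zeros. The alias table, field names, and sample rows are all hypothetical, and a real project would need a much fuller reference list.

```python
# Hypothetical alias table; a real project would maintain a fuller reference list.
COUNTRY_ALIASES = {
    "us": "United States",
    "usa": "United States",
    "u.s.": "United States",
    "united states": "United States",
    "uk": "United Kingdom",
    "united kingdom": "United Kingdom",
}


def clean_country(raw):
    """Map spelling variants to one canonical name; return None for missing values."""
    if raw is None or not raw.strip():
        return None
    key = raw.strip().lower()
    # Unknown names pass through trimmed rather than being guessed at.
    return COUNTRY_ALIASES.get(key, raw.strip())


def clean_amount(raw):
    """Keep a true zero as 0.0 but represent a missing value as None."""
    if raw is None or str(raw).strip() == "":
        return None
    return float(raw)


# Illustrative input rows, as they might arrive from a spreadsheet export.
rows = [
    {"country": " USA ", "amount": "10.5"},
    {"country": "u.s.", "amount": "0"},
    {"country": "", "amount": None},
]
cleaned = [
    {"country": clean_country(r["country"]), "amount": clean_amount(r["amount"])}
    for r in rows
]
for row in cleaned:
    print(row)
```

Treating a missing amount as None rather than 0 is a design choice in this sketch: it keeps averages and totals from being silently skewed by records that simply had no value.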

If you answered “Yes” to two or more of the questions above, data integration is an important requirement to consider when evaluating vendors, to make sure you’re not missing anything.

Reference: Data Mashups for Analytics ~ Pentaho

About Pentaho
Delivering the Future of Big Data Integration and Analytics
Pentaho, a Hitachi Group company, is a leading data integration and business analytics company with an enterprise-class, open source-based platform for diverse big data deployments. Our mission is to help organizations across industries harness the value from all their data, including big data and Internet of Things (IoT), enabling them to find new revenue streams, operate more efficiently, deliver outstanding service and minimize risk.

Team Bisilo
We are a team of data enthusiasts scouring the web for the latest on #DataScience, #MachineLearning, #BigData, #Security, and #PredictiveAnalytics from the leading authorities on the matter. Read the trending news on these topics all in one place.
