Cognitive Information

Time Value of Data

[Graphic: DollarCloud]

The concept of ‘Time Value’ of data and the associated costs of data make up a segment of the ‘BI Basics’ session that always brings up a lot of discussion at SQL Saturday. Here, I will expand on the discussion and hope you will join in below. This session was given at SQL Saturday in Oklahoma City and Kansas City and can be downloaded here.

My background is in economics with a good bit of finance thrown in. In economics, we often look at markets in terms of supply and demand curves.
In finance, the ‘time value of money’ is a concept that plays out in countless equations. The basic idea is that a dollar today is worth more than a dollar in the future. Would you rather I give you a dollar today or a dollar tomorrow?
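
To make the finance idea concrete, here is a minimal sketch of the standard present-value calculation. The 5% annual rate and the function name `present_value` are assumptions chosen only for illustration.

```python
def present_value(future_amount: float, annual_rate: float, years: float) -> float:
    """Discount a future amount back to today's dollars."""
    return future_amount / (1 + annual_rate) ** years


# Assumed 5% annual rate (hypothetical): a dollar received a year from now
# is worth a little less than a dollar in hand today.
dollar_next_year = present_value(1.00, 0.05, 1)
print(round(dollar_next_year, 4))  # 0.9524
```

At any positive rate, the dollar today wins, which is the intuition the data curves borrow.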

My theory about the ‘Time Value of Data’ incorporates elements of time value and then charts ‘value & cost curves’ similar to ‘supply & demand curves’. I needed a way to visualize the concept and share it. So far, the values shown are merely relative and are not based on a scientific survey of business users. They are based on my experience in many industries and many companies over the past 16 years of helping companies use their data.

Time Value of Data:

My ‘Time Value’ curve combines those two basic theories by looking at the latency of getting data ready for business users to make business decisions. For a typical company there is a lot of value in having data about last year and last month, and even quite a bit of value in yesterday’s data. Few businesses will consider it added value to have up-to-the-minute data, so for our purposes current data has relatively less value. There are always exceptions, like monitoring a nuclear reactor, but we’re talking about a typical company selling typical stuff.

[Figure: BI-Basics-time-value]

Time Costs of Data:

The other component to look at is the ‘Cost of Data’ based upon latency. These costs can include data storage, data cleansing, data extract, data transfer, programming, hardware, software, network infrastructure and even purchasing outside data.

Costs of Data include but are not limited to:

  • Storage.  The people who say ‘storage is cheap’ are usually not the ones writing a check for a new SAN.
  • Acquisition. This may include paying for data sets or subscribing to data feeds.
  • ETL (Extract, Transform & Load)
  • Cleansing.  This may include services such as cleansing addresses.
  • Data Quality Initiatives
  • Data Governance Initiatives
  • Archiving

Aggregating and storing last year’s data is pretty affordable and probably already done in most companies. Many companies have also built or bought some type of monthly reporting and analysis system. This shift in latency from last year to last month usually increases the costs only slightly. Having yesterday’s data today, in a form that is easily usable by business users, often requires significant investment in hardware, software and programming to accumulate the data, verify it and prepare it for users. These initiatives are often data marts or data warehouses designed for reporting and analysis.

Decreasing data latency from daily to near real time can be very expensive.

I always stop here to mention that data values and costs can vary widely within a company, especially in large enterprises. The executives may highly value a daily summary of company activity, but the production manager on a plant floor may value ‘up to the minute’ details. The former requires company-wide data acquisition, while the latter may only require reporting from one system. This example shows that within one company there are varying values and varying costs. Balancing those needs requires thorough knowledge of the requirements when designing solutions.

Sweet Spot

The point at which data value outweighs the costs to collect and aggregate data is the sweet spot where IT needs to deliver solutions.  As we discuss in the BI Basics class, data warehousing is an architecture while big data is a technology, both of which can help deliver data quickly in usable form to the users.   Which path you take will depend on many other factors learned during your requirements gathering across your company.

I have seen data warehouse projects that started with a target of daily refresh and then found incredible value in their data, which justified the costs of getting the data near real time. This shift can easily cost six figures for a medium-sized enterprise. In other words, once IT delivered data to meet the sweet spot shown above with daily latency, the business found higher value in near real time data. In economics, we would call this a demand shift.
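
The sweet spot and the demand shift described above can be sketched in a few lines. This is only one way to read the curves: it picks the latency tier with the largest value minus cost, among tiers where value exceeds cost. Every number here is a made-up relative score, in the same spirit as the relative values in the chart, and the names `LATENCY_TIERS` and `sweet_spot` are hypothetical.

```python
# All scores are made-up relative numbers (hypothetical), not survey data.
LATENCY_TIERS = {
    "yearly": {"value": 4, "cost": 1},
    "monthly": {"value": 6, "cost": 2},
    "daily": {"value": 9, "cost": 4},
    "near real time": {"value": 10, "cost": 12},
}


def sweet_spot(tiers):
    """Return the tier with the largest value-minus-cost, considering only
    tiers where value exceeds cost. Returns None if no tier qualifies."""
    net = {
        name: tier["value"] - tier["cost"]
        for name, tier in tiers.items()
        if tier["value"] > tier["cost"]
    }
    if not net:
        return None
    return max(net, key=net.get)


print(sweet_spot(LATENCY_TIERS))  # daily

# Demand shift: the business discovers much higher value in fresher data.
shifted = {name: dict(tier) for name, tier in LATENCY_TIERS.items()}
shifted["near real time"]["value"] = 20
print(sweet_spot(shifted))  # near real time
```

With the assumed scores, daily data wins until the value of near real time data rises enough to cover its much higher cost.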

I hope this article will provide some discussion points for you with your data managers. You will find your company ends up somewhere on the spectrum between “gotta have it now” and “last week’s data is good enough”. Either way, I wish you the best of success with your data projects.

About the Author:

Allen Smith is a business intelligence and data warehouse consultant, speaker and trainer. Allen has an MBA in Computer Sciences and has worked in database services and data warehousing since 1998. Allen has an MCTS certification in Business Intelligence. Allen is a principal consultant at Cognitive Information Inc., a consulting firm in Edmond, OK.

About the graphic: I couldn’t resist a graphic based on a recent Dilbert.

