FEATURES Spring 2018

A strategy for end-to-end workforce data analytics using Oracle BI

By Raghav Venkat, City of Las Vegas

“Employers need an analytic system to automatically sift through a variety of data to report, analyze and find hidden patterns and discover valuable new insights to help make better HR decisions.”

People First

One of the factors behind organizations giving a lot of attention to their people is the nature of the firms in the current business environment. Even in the public and traditional sectors, the need to remain competitive has meant that firms in these sectors deploy strategies that make effective use of their resources.

This changed business landscape has come about as a result of a paradigm shift in the way businesses and firms view their employees as more than just resources and instead adopt a “people first” approach. Being able to track and retrieve workforce information and provide the right forward-looking analysis to support strategic initiatives is key to HR success.

Niche Applications for Workforce Management

Workforce management technology encompasses all the activities needed to maintain a productive workforce. It addresses a critical business requirement by aligning HR activities to organizational business goals.

Employers these days use several niche applications to aid in performing the business function with regard to work force.

For example, in the public sector, recruitment processes are typically conducted through a cloud-based suite: NeoGov. Several employers use other sets of niche services and software applications to track background verifications. After recruitment, there are modules of large-scale ERP suites that are utilized for collecting and retaining employee information, assignments and jobs, managers and organizations, benefit administration, absence management, entitlements, time and labor, payroll and taxes, talent management, succession planning, training, etc.

A specific set of applications is used to manage employment conditions, such as labor contract management suites and employee retirement management. Applications and document management systems that aid processes such as claims management and disciplining are widely used. Maintaining regulatory compliances and reporting needs are also major HR functions that certain specialized applications handle.

“Without proper discovery, it’s just data…not useful actionable information.”

What’s the Need?

In the current landscape of siloed data aiding several HR business processes, employers need something more than what current reporting or analytic solutions deliver. The need of the time is an all-round analytic system that can ingest data from a variety of sources; can cleanse, transform and enrich the data to suit the business need; has the capability to render traditional reports and dashboards; can power the end user with self-service components and visualizations; and provides opportunities for data discovery to identify patterns, relationships and trends among the data collected.

A curated analytic system should also be able to help guide management in simple analysis, such as predicting attrition/separation rates in future years. It should enable users to find answers to questions like: “What might be the entitlement balances of the workforce in the next four years?”

The system should have the capability to provide guidelines or answers to complex yet quantifiable questions such as: “How does the benefit rate relate to changes in labor contracts? Or, “How many new positions will I need next year?”

It is, essentially, a system that helps HR managers go from “I think” to “I know” when making business decisions.

“Dialogs contain critical, untapped insights.”

Disparate Data

The use of several disparate applications that are tailor-made to meet specific needs also creates data in several silos. Data is stored locally in the company’s own servers, in a shared cloud server or in a hybrid architecture. Performing end-to-end analytics becomes difficult and sometimes impossible when the HR landscape uses a variety of applications that may or may not be interfaced together. There is a need for an analytic system that can bring together a unified view of data from these applications.

Unstructured Data

There is even more workforce data in semi-structured and unstructured formats.

Some valuable information such as random facts; notes obtained from casual conversations, meetings or discussions; data from third parties; and information from non-workforce management systems, digital documents and more; should also be included in building an analytics suite that provides end-to-end workforce analytics and data discovery.

The Strategy

To build an analytic system that can serve above and beyond traditional reporting and analytic needs, a strong technical foundation is a requirement. This article breaks down the technical back end into three high-level layers needed for an analytic system that is capable of end-to-end HR reporting, enables discovery from the wealth of data collected and possibly predicts outcomes to a good degree of accuracy. They are:

  1. Data Ingestion Layer.
  2. Foundational Data Layer.
  3. Metadata Based Analytic Model.
Data Ingestion Layer

The objective is to bring all related data together in a place where we can build a strong data foundational layer.

In this case, the biggest challenge is the data itself. The data resides in several databases and files across different platforms, on premises and in the cloud. When designing, one must be aware of the security implication when dealing with workforce data. An enterprise data ingestion tool is a good fit for this effort. A tool that can ingest data, not just from on-premises relational database systems but also cloud-based applications, one that can scrape data from websites, ingest data from social media as well as from documents and other types of data sources, is required. It’s highly recommended to use a combination of tools.

Some tools specialize in working on relational data and some on semi-structured and unstructured data ingestion. Build a strategy to include the best set of tools that your organization might already own that can accomplish the task of ingesting data of interest (work force) to a single data pool. In some cases, the source of data can also be an existing HR data mart. Once designed, program the tool sets to refresh the data on a scheduled basis benefiting the end users. At the end of these steps, data is still found in disparate data sets, but closer together.

Foundational Data Layer

The objective is to build a data layer that supports a variety of ways to analyze work force data from a variety of sources.

A data cleansing activity using enterprise data quality rules is to be performed on the data to bring structure and quality to the disparate data.

De-Normalization and Consolidation
A highly de-normalized data structure helps in quick analysis. From the data gathered through the prior steps, highly de-normalized data objects are built combining as much data as possible. For example, there could be one data set that combines all data from one specific application such as one data set from recruitment lifecycle of requisition to on-boarding. Depending on the complexity, not more than 15 de-normalized and consolidated data sets are created for entire HR lifecycle/business processes.

To integrate data, every data set created should have an integration point, such as ID’s for people, department or jobs, or could be a business rule or a logic that can put different disparate data sets together.

Time tracking in workforce is an absolute necessity. When designing the de-normalized objects, include several dates that provide context to the data. There could be start and end dates of several data values in a combination that provides a true and valid record at a given point in time. See Figure 1.

These de-normalized objects have to be physically able to store the data in them as opposed to being views or other referential objects. This enables better performance, brings options to incrementally load data, provides the ability to devise a change capture mechanism and sometimes aids in better integration. Once these data objects are designed, formulate a data refresh strategy. A modern ETL tool will aid in performing most functions related to building this data layer.

Figure 1. Several dates are tied to every record to track time.

Metadata Based Analytic Data Model

The objective of the third step is to build a logical data model that has the capability to perform end-to-end workforce analytics, enable data discovery and aid users in performing basic predictive analytics. Oracle BI’s metadata based capability can be exploited to build a data model that enables combining a variety of data.

Common Enterprise Information Model
Oracle BI packs in a Common Enterprise Information Model, which can be customized to meet needs such as applying business logic, data federation, aggregation and query distribution between disparate data sources to achieve a unified result set. This layer acts as the core of the proposed design. See Figure 2.

Figure 2. Oracle BI Common Enterprise Information Model.

Define all the business metrics that are specified in the user requirements. Irrespective of the source, logically model a dimensional structure that the requirements command. Create basic facts, dimensions and hierarchies that you may need from the de-normalized physical structure. They might come from various, multiple sources and might have different structures physically. The logical model enables you to choose multiple sources and combine them to logically create dimensional design objects such as facts, dimensions, hierarchies, etc.

Physical Layer
Import all the de-normalized objects into the physical layer of Oracle BI metadata. As these objects have been designed with hooks to integrate them with each other, use simple joins to relate the objects. There are cases where non-equivalent joins and condition-based joins would be needed. Oracle BI has a mechanism called complex joins that can be used to satisfy this need. See Figure 3. Time tracked data elements, specifically, would need to use this type of join to relate and have valid data among these objects.

Figure 3. Simple and complex joins.

Business Model and Mapping Layer
The logical modeling on the Business Model and Mapping (BMM) layer gives the ability to create logical facts, dimensions, hierarchies, aggregates and objects that are necessary for reporting, data analysis, discovery applications and basic prediction. For example, a central employee dimension can be created by combining data from several physical layer objects as a logical table source.

Though this data object does not exist physically, the logical model enables one to use data from disparate sources to make objects that fit several analytical applications. See Figure 4. This dimension might act as a key object that contains information about workforce profiles, which can be combined with other facts and related dimensional sources from various applications. The grain level of data should be consistent among the logical objects that are created.

Figure 4. A high-level depiction of how logical joins can bring together physically disparate data.

The resulting design should make it easy to combine structured and semi-structured data to understand key metrics in their relevant context and evaluate new business situations found in the workforce data. This also builds the capability to ask unanticipated questions of any data through intuitive, flexible and highly interactive online discovery and predictive applications.

Tips for Success

Powering end users with the ability to perform simple and advanced analysis creates a widespread usage of the analytic system. Make self-service a key component of the design and implementation.

For this project, create a small team that consists of both business and technical members who are creative, open to new ideas, flexible and adaptable.

“The probability of a disciplined employee separating from the job within six months of the event varies by gender.”

The Workforce Analytics System

An analytic system powered with this backend layer can perform the basic reporting any organization needs. Data can be summarized at a high level by time and can be drilled down to the most detailed level when needed. A variety of data visualizations would be supported by this design.

A single dashboard would retrieve every bit of data about an employee, from the time an initial requisition was made to hire, to all the way till retirement, date tracked and preserved with contextual information.

Several key business processes such as hire and separations, recruitment, position and budgeting, position action requests, employee history, job-assignment and salary history, labor contracts, benefits, attrition and retirement can be queried/analyzed by end users in a self-service fashion or be displayed in detailed dashboards and reports. This feature empowers HR analysts and managers in making fact-based decisions and gives them a multi-dimensional view of the entire HR lifecycle data in a single page. Visualizing and analyzing data in this manner provides better insights.

“Powering end users with the ability to perform simple and advanced analysis creates a widespread usage of the analytic system. Make self-service a key component of the design and implementation.”

As the logical design integrates data from all sources, a Google-search-like data discovery feature is enabled. An end user can type a name into the system and extract all relevant information about the person and their history at the organization with respect to HR business processes.

Based on historical data and certain statistical functions, the well de-normalized data supports predictive analysis with a good degree of accuracy. The data structure readily enables one to perform advanced statistical functions when querying through the metadata. More predictive modeling can be enabled by extending the design to include supervised learning algorithms on the data with the resulting flat structure.

Key Benefits

  • Combines several sources, types and varieties of data.
  • Analytics on the entire HR life cycle from requisition to retirement in a single unified data model.
  • Enables data discovery feature as data is well transformed and joined.
  • Aids in self-service and provides user-friendly dataset like structure for end user data visualization applications.
  • Clean, transformed and unified single source of truth to analyze workforce data.
  • Easy to maintain, support and extend the design.
  • Lower cost of making changes as core of the design is logically modeled.
  • The de-normalized structure provides good query performance.
  • Good time to value.


Workforce analytics plays an important role in every organization to gain insights and make better decisions on their most valued resource – its people. Organizations are embracing the use of multiple disconnected applications to manage HR business processes, creating several varieties of data silos on premises and in the cloud. End-to-end HR reporting can become a challenge in this environment. Organizations might also lack the capability to perform data discovery or advanced analytics in such environments. By strategically designing data layers and using Oracle BI’s logical modeling capabilities, an analytic system is built to analyze disparate pieces of workforce data together in a single platform, perform data discovery among disparate data sets and conduct predictive analytics with historical data.

Raghav Venkat works extensively architecting, developing and supporting the enterprise business intelligence systems at the City of Las Vegas. He is a regular presenter on data and BI topics at COLLABORATE and Oracle OpenWorld conferences.