Data Analytics Done Right in Oil and Gas

May 31, 2017 | Hector Klie

Data Analytics is a hot topic nowadays, but many companies struggle to implement a stable and productive platform to quickly respond to the needs of the Oil and Gas industry. In my opinion, we can address this problem by effectively executing on the following three components:

  1. Incubating a Data Analytics team
  2. Scaling the application of Data Analytics
  3. Building a data-driven culture

Traditionally, Domain Experts such as Engineers and Geoscientists have received support from two specialized groups:

  • The R&D Staff, mostly composed of mathematicians who focus on numerical models and statisticians who focus on probabilistic models. Despite these differences, both roles are typically expected to have a solid grasp of the Oil and Gas industry in order to derive practical solutions, and
  • The Tech Support Staff, composed of Computer Scientists or IT Specialists who provide software solutions in the form of scripting, application development, IT product support, software performance optimization, and other similar tasks. The level of domain experience varies widely within this group.

At the same time, there may be a few Domain Experts who are strong at mathematics or programming and may be able to work on R&D or Tech Support.

However, due to several technological trends and increasing competition, Oil and Gas companies are relying more on specialists that can help make sense of the vast amounts of data to make decisions faster and more accurately. From this need, the Data Science role has been adopted by this industry. Data Science emerged from a combination of developments in Mathematics, Statistics, and Computer Science. In the Oil and Gas industry, Data Scientists are often expected to have a good grasp of the business, otherwise they might fall into a third group of Machine Learning Experts. On the Venn diagram shown below (Fig. 1), I attempt to illustrate how these roles relate to each other.

This is not to say that Data Scientists are unicorns that can fill the shoes of Computer Scientists, Mathematicians or Statisticians, and Domain Experts. Instead, typically a Data Scientist in the Oil and Gas industry will master one or two of these domains, with a basic understanding of another domain. Nevertheless, talented Data Scientists are in high demand and low supply, and to some they may be perceived as unicorns.

Incubating a Data Analytics Team

Given how difficult it is to find talented Data Scientists, it is often more feasible to gather a group of specialists that can work together towards the goal of solving a wide range of data-driven problems. Below I break down the steps to incubate an exemplary Data Analytics team:

  • Gather individuals with a good balance of Mathematics, Statistics, and Computer Science skills who also have Oil and Gas experience. These individuals may be easier to find in existing R&D or Tech Support groups.
  • Identify individuals that can adapt quickly and have a keen interest in Data Science.
  • Strive for a team of diverse professionals to ensure a well rounded Data Analytics team. For instance, a Data Scientist with a strong background in Computational Science might complement a Data Scientist with a strong background in Physics to develop data-driven physical models.

Scaling the Application of Data Analytics

Data Scientists are heavily dependent on the availability of large volumes of quality data. However, to facilitate access to these data, a robust data platform is needed. This platform is generally referred to as Big Data, and building such a platform is not a trivial task. This is especially true in the Oil and Gas industry, where data tend to be very fragmented, diverse, unstandardized, and difficult to access compared to other industries.

Additionally, Data Scientists are primarily focused on developing fast prototypes to prove out a model. Once the model has been validated, a significant amount of work is needed to convert the prototype into a production-ready solution. However, we often observe companies misusing the skills of Data Scientists or the Tech Support Staff for this conversion, which often results in the release of an unfinished product. From my experience, this is a critical gap in the Oil and Gas industry that may be addressed by better understanding how the technology industry builds sophisticated Data Analytics solutions.

Technology companies such as Google, Microsoft, Amazon, and others typically have large teams of Software Engineers with a few Data Scientists supporting the research and development of their products. Coincidentally, this is the industry where most Machine Learning experts are found. These individuals work closely together through an iterative process called Agile. Below I describe further how these roles collaborate together (Fig. 2).

The roles depicted in the figure above can be broken down as follows:

  • Data Scientists: Focus on developing prototypes that solve data-driven problems by leveraging or creating advanced machine learning, optimization and uncertainty quantification algorithms. They also do extensive analytics and cleanup of the data in order to effectively train their models.
  • Data Analysts: Support Data Scientists by performing data exploration and cleansing, and by building solutions from proven analytical models.
  • Software Engineers: Focus on developing scalable distributed architectures for various applications, such as Big Data Analytics solutions in the Cloud, optimizing software for HPC (High Performance Computing), integrating IoT (Internet of Things) sensors and robots, and building personalized BI (Business Intelligence) solutions, among others. They can also be very helpful at converting a Data Scientist’s prototype into production-ready code.
  • Data Engineers: Focus on supporting and extending the Big Data platform, as well as translating prototypes to fit the architecture laid down by Software Engineers.
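
To make the Data Scientist role above more concrete, here is a minimal, hypothetical sketch (not from the article) of the kind of prototype loop described: cleaning raw well readings, then fitting a simple decline trend with a closed-form least-squares line. All names and numbers are illustrative assumptions.

```python
# Illustrative sketch of a Data Scientist's prototype loop: clean raw
# readings, then fit a linear decline trend. Data values are hypothetical.

def clean(readings):
    """Drop missing values and obvious outliers (> 3x the median)."""
    vals = [r for r in readings if r is not None]
    med = sorted(vals)[len(vals) // 2]
    return [r for r in vals if r <= 3 * med]

def fit_trend(ys):
    """Closed-form least-squares slope/intercept of rate vs. time index."""
    xs = range(len(ys))
    n = len(ys)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

raw = [100.0, 98.0, None, 95.0, 940.0, 93.0, 91.0]  # bbl/day, with a gap and a spike
cleaned = clean(raw)
slope, intercept = fit_trend(cleaned)
print(f"{len(cleaned)} usable points, decline rate {slope:.2f} bbl/day per step")
```

A real prototype would of course use richer models and libraries, but the shape of the work (cleansing first, modeling second) is the point.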

My recommendation for Oil and Gas companies would be to build Software Engineering teams inspired by successful tech companies. This may require a significant change in culture and skillsets from existing Tech Support or IT Specialist groups. However, if implemented successfully, it would significantly accelerate a Data Scientist’s ability to develop innovative data-driven solutions.

Building a Data-Driven Culture

In the tech industry, the Agile methodology is widely used to increase collaboration and productivity between individuals of diverse backgrounds. In the context of Oil and Gas, my theory is that as cross-functional teams work together following Agile practices, we will naturally see:

  • A stronger data-driven culture
  • A greater alignment towards a common vision, and
  • An increase in knowledge-sharing and multi-disciplinary experience.

As Data Analytics starts playing a more important role in the Oil and Gas industry, I predict a greater need for collaboration between Data Scientists, Software Engineers, and Domain Experts. As a result, we may see our earlier Venn diagram evolve into the figure below (Fig. 3).

I believe the combination of these three roles is essential to forming a successful Data Analytics team in this industry. Assuming the right talent is in place and there is close collaboration between these individuals, we may see a new breed of AI specialists emerge from this intersection.

The Data Analytics Funnel

Putting everything into perspective, we can think about how a business would be impacted when the three components introduced in the beginning are effectively executed. I will attempt to illustrate this in what I call the Data Funnel, which describes the degradation of information across a data workflow from the source to the point of action.

The Oil and Gas industry typically has a narrow but long funnel. This means that there is a significant delay in response time (latency) and a limited amount of data (bandwidth) that can be processed at a given time. The illustration below attempts to summarize this idea (Fig. 4).

An example of a typical manual data workflow might be broken down into three phases:

  1. Integration: A DBA (DataBase Administrator) will manually extract, transform, and load (ETL) data for the consumption of Domain Experts. The lack of governance and integration across available data sources, and the inability to scale this process, may severely hinder a DBA’s ability to prepare data for downstream consumption.
  2. Analysis: Domain Experts spend a significant amount of their valuable time analyzing and computing large amounts of inconsistent data using several disjointed commercial tools. Many factors such as human bias, noise, gaps in the data, and tool limitations can result in further degradation of information upon analysis.
  3. Decision: Strategists may be Domain Experts, Managers, or Executives that will carefully look through the analysis to evaluate a set of possible scenarios that may lead to the best business decision. However, a lack of data provided by upstream workflows may potentially lead to uninformed or inaccurate decisions.
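
As a hedged illustration of the Integration phase above, the sketch below performs a minimal extract-transform-load pass over two fragmented, differently formatted well records. The source schemas, delimiters, and unit conversion are hypothetical assumptions, not from the article.

```python
# Hypothetical sketch of a manual ETL pass over two fragmented sources:
# one comma-separated in barrels, one semicolon-separated in cubic metres.
import csv
import io

SOURCE_A = "well,oil_bbl\nA-1,120\nA-2,95\n"
SOURCE_B = "well;oil_m3\nB-1;18\n"

M3_TO_BBL = 6.2898  # convert cubic metres to barrels so units agree

def extract(text, delimiter):
    """Parse one raw source into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

def transform(rows):
    """Standardize every row to a {well, oil_bbl} schema."""
    out = []
    for row in rows:
        if "oil_bbl" in row:
            out.append({"well": row["well"], "oil_bbl": float(row["oil_bbl"])})
        else:  # normalize cubic metres to barrels
            out.append({"well": row["well"],
                        "oil_bbl": float(row["oil_m3"]) * M3_TO_BBL})
    return out

def load(rows, store):
    """Load standardized rows into a shared store keyed by well name."""
    for row in rows:
        store[row["well"]] = row["oil_bbl"]

store = {}
load(transform(extract(SOURCE_A, ",")), store)
load(transform(extract(SOURCE_B, ";")), store)
print(store)
```

In practice this is exactly the step a Big Data platform automates and governs; the sketch shows why unstandardized sources make even a small ETL pass laborious by hand.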

In summary, even though this process has been effective in the past, as competition increases and resources become scarcer, it is difficult to scale such a model to take advantage of all potential opportunities. In contrast, a well-established Data Analytics team would be able to build a far more scalable Data Analytics platform that enables the business to automatically process significantly larger volumes of data at a much faster rate (see Fig. 5). The platform also facilitates a more robust feedback loop, where the outcome of actions can be relayed back into the data source.

In this ideal scenario, a Data Analytics Platform would act as a solid bridge between sources and actions, thereby shortening the latency and increasing the bandwidth of the Data Funnel. In the diagram above, you will also notice a new set of opportunities (highlighted in red) that are unlocked by such a platform.

In other words, the Data Analytics Platform would provide stronger integration with all available data sources and could execute advanced analytics techniques to generate actionable business recommendations through dashboards or BI visualization tools. This would allow Strategists, Domain Experts, and Field Operators to respond swiftly and proactively to multiple business needs.

Ultimately, the purpose of this model is not to build solutions that assist humans in performing monotonous tasks, but instead to train machines to do these tasks for humans, so they can focus on more challenging and creative problems.
