In this episode of Sound Off, Joe Perino discusses how big data and advanced analytics are used in refining.
In the current low oil price environment, one of the largest investments that oil and gas companies continue to make is in operational efficiency, and big data and analytics are a key part of this drive.
Joe brings decades of oil and gas industry experience to one of the hottest topics in the industry today.
Some of the key topics of this episode include:
- What exactly is big data and analytics?
- How is big data and analytics currently used in refining?
- The 5 V’s of big data.
Be sure to drop any comments or questions in the comment section below, or members can also reach out in the discussion forums.
Listen to Sound Off with Joe Perino below:
Links:
American Fuel and Petrochemical Manufacturers
Oil 101 – A FREE Introduction to Oil and Gas
EKT Interactive Oil and Gas Podcast Network
Upstream and Downstream: Learn the difference
Transcript:
Hello, I’m Joe Perino and welcome to Sound Off. This podcast is part of the EKT Interactive Learning Network and is brought to you by Oil 101, a free 10-part introduction to the oil and gas industry.
Welcome back to Sound Off. Today I’d like to deal with a topic of big data and analytics in the refining industry.
We’ve been hearing a lot about big data and analytics in the upstream exploration and production space but not too much about big data and analytics in the downstream space, so I decided to take a look into that and find out what’s going on. First, we need to understand what big data and analytics are, then we’ll take a look at big data and analytics in the refining or manufacturing area itself, and then in the related areas of supply, planning, and trading.
What is big data?
Big data is characterized by the five V's, V as in Victor. Those V's are: volume, meaning a very large volume of data; velocity, meaning the data is coming at you very quickly; and variety, meaning there are lots of different types of data, not just the typical deterministic data like pressures, temperatures, and flows, but textual data and all kinds of other unstructured data. The fourth V is veracity, which refers to the quality of the data. Finally, the fifth V is value: some data is more valuable than other data, and how do you go about determining that when you have all this volume, velocity, variety, and veracity coming at you?
The second element of this, of course, is analytics.
What do we mean by analytics?
To analyze is to investigate and understand the inner workings of something. In this context, analysis really refers to using advanced analytic techniques such as text analytics, machine learning, predictive analytics, data mining, statistics, and natural language processing to understand what is going on in the data.
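As a toy illustration of the "statistics" end of that analytics spectrum (not from the podcast itself), here is a minimal sketch of flagging abnormal readings in a stream of process data; the tag values and threshold are made up for the example:

```python
import statistics

def flag_anomalies(readings, z_threshold=3.0):
    """Flag readings whose z-score exceeds a threshold.

    A toy 'statistical analytics' pass over process data: compute the
    mean and standard deviation of a tag's history and flag the points
    that deviate strongly, the kind of simple check an analytics tool
    might run against temperature or pressure tags.
    """
    mean = statistics.mean(readings)
    stdev = statistics.stdev(readings)
    return [
        (i, value)
        for i, value in enumerate(readings)
        if stdev and abs(value - mean) / stdev > z_threshold
    ]

# Simulated temperature tag: steady around 350 F with one bad spike.
temps = [350.1, 349.8, 350.3, 350.0, 349.9, 420.0, 350.2, 350.1]
print(flag_anomalies(temps, z_threshold=2.0))  # -> [(5, 420.0)]
```

Real predictive analytics and machine learning go far beyond a z-score, but the shape is the same: a model of "normal" plus a rule for surfacing exceptions.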
Let’s delve into refining.
The first place I’d like to talk about, of course, is the refinery itself, the manufacturing process. I’ve had some background in this area having sold and worked in this space for a long time. It would seem that the information system architecture in the modern refinery has matured quite a bit and has been pretty stable for the last 15 years.
If we look across the typical systems (we're at the disadvantage of not having a diagram in front of us, but I'll try to describe it), at the lowest level you have, of course, your process control systems. You may also have related emergency shutdown systems that are integrated, there may be specific rotating equipment interfaces connected to the control system, and so on.
The second big source of data is the laboratory information system, which includes online analyzers, the data collected in the lab, and also sample data that is collected manually in the field.
The third big area of information is the asset management system, which includes the computerized maintenance management system for managing all of your work plans, work orders, and repairs, but which also probably contains your manually collected inspection data.
Related to that and depending on the refinery, you may have a reliability management system which allows people to bring in data either manually or automatically to calculate reliability of rotating equipment and other pieces of equipment.
What does this look like in practice?
These three big information islands are basically all separate. They each have their own databases and are typically only connected at some level by the plant historian.
The maintenance and reliability systems typically work off of a relational database, as does the lab system, while the process control system has its own proprietary database, from which most of the data is pulled up into the historian, a time-series type of database.
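To make the distinction concrete (this is an editorial sketch, not any vendor's product), a historian is essentially an append-only store of (timestamp, value) pairs per tag, queried by time window, unlike the row-oriented relational stores behind the lab and maintenance systems. The tag name below is hypothetical:

```python
from collections import defaultdict

class ToyHistorian:
    """Minimal sketch of a time-series historian: values are stored
    per tag as (timestamp, value) pairs, appended in time order."""

    def __init__(self):
        self._series = defaultdict(list)

    def record(self, tag, timestamp, value):
        """Append one sample for a tag."""
        self._series[tag].append((timestamp, value))

    def window(self, tag, start, end):
        """Return the samples for a tag within [start, end]."""
        return [(t, v) for t, v in self._series[tag] if start <= t <= end]

hist = ToyHistorian()
for t in range(10):
    hist.record("FIC-101.PV", t, 100.0 + t)  # hypothetical flow tag

print(hist.window("FIC-101.PV", 3, 5))  # -> [(3, 103.0), (4, 104.0), (5, 105.0)]
```

Production historians add compression, interpolation, and sub-second resolution, but the time-window query pattern is the essential difference from a relational table.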
Along with those three major databases in a refinery, we also have a number of applications which require data from those systems. One of them, of course, is your planning system, your linear program.
Then there's also going to be scheduling: scheduling of the crude coming in and of the products going out. You will also have a data reconciliation and material balance program, a blend planning and optimization program, and perhaps a few other programs as well.
Most of these applications all have their own database, be they a flat file or a relational database, and so when we step back and look at this we have a variety of systems, applications, and so forth that in and of themselves are not integrated.
How have these been typically integrated?
There have been two methods of doing this.
The first method was to integrate elements of the data from all these separate applications and databases into one larger database.
That approach was superseded about 15 years ago by a different type of database: a metadata structure which connects the relevant elements of these smaller data sets and other applications together into a reference architecture. That then allows a dashboard of sorts, usually web-based but sometimes fat-client based, to pull the data in and put the relevant information from these different applications and systems in front of the user so that they can see what's going on, how the refinery is performing, and so on.
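The metadata idea can be sketched very simply (an editorial toy, not how Uniformance, BabelFish, or XHQ are actually built): rather than copying everything into one big database, a mapping layer knows where each source system keeps its piece of the picture and assembles a unified view on demand. The equipment tag and field names below are hypothetical:

```python
# Hypothetical snapshots of what each "island" knows about pump P-101.
historian = {"P-101": {"vibration": 2.1}}          # process/condition data
lab       = {"P-101": {"oil_viscosity": 32.0}}     # lab sample result
maint     = {"P-101": {"open_work_orders": 1}}     # maintenance system

def unified_view(tag, *sources):
    """Merge whatever each source system knows about one piece of
    equipment into a single record, without moving the underlying data."""
    view = {"tag": tag}
    for source in sources:
        view.update(source.get(tag, {}))
    return view

print(unified_view("P-101", historian, lab, maint))
# -> {'tag': 'P-101', 'vibration': 2.1, 'oil_viscosity': 32.0, 'open_work_orders': 1}
```

The dashboard then renders that merged record; the source systems keep owning their own data.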
That type of data system is quite popular now and is typically what is found out in the refinery.
Suppliers of that type of database include Honeywell with their Uniformance product, P2’s BabelFish, and Siemens XHQ. These are all metadata systems that allow the refinery to be defined and all the inputs from the various applications and databases to be integrated into a coherent picture.
Where does big data and analytics fit into this picture?
Refiners will probably tell you that they've been doing big data and analytics for a long time. Today's modern refinery may have in excess of a thousand tags in its historian, another thousand points in the laboratory database for a large integrated refinery, a conversion refinery if you will, and then thousands of other pieces of data in the asset management system and in all the other data sets, so they have big data.
Much of this big data, the majority of it, is deterministic or structured as opposed to unstructured, but there are examples of unstructured data.
There is text data. For example, a property of a diesel product, maybe its cloud point: is it cloudy or is it clear? Text data is quite common, particularly with inspection data, where it's not a deterministic value you're looking at; it's a matter of inspection and an opinion about the condition of the equipment.
They do have a variety of data, they do have a volume of data, and they do have a velocity of data, because much of the data coming in is at one-second, five-second, or 10-second intervals, and some data may be sub-one-second data, for example from monitoring of rotating equipment.
What about the analytics side of this?
Refiners will tell you, “We’ve been using analytic equations and neural networks for property prediction and other things since the late 1980s,” and they’d be right. “We’ve also been using analytics on rotating equipment to detect or predict failure and to detect poor performance.” They’d be right again.
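The property-prediction models refiners have used for decades range from neural networks down to simple regressions. As an illustrative sketch (the numbers and the temperature-to-property relationship are invented for the example), the simplest such "soft sensor" is an ordinary least-squares fit that estimates a lab property from a process measurement:

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: the simplest form of
    an inferential 'soft sensor' that estimates a lab property (e.g. a
    distillation point) from a process measurement (e.g. a column
    temperature)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

# Hypothetical training data: column temperature vs. measured property.
temps = [300.0, 310.0, 320.0, 330.0]
prop  = [150.0, 155.0, 160.0, 165.0]
slope, intercept = fit_line(temps, prop)
print(slope, intercept)            # -> 0.5 0.0
print(slope * 315.0 + intercept)   # predicted property at 315: 157.5
```

A neural network replaces the straight line with a nonlinear fit, but the role is the same: predict a slow, expensive lab result from fast, cheap process measurements.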
The real question, when we think of big data and analytics in today's context, is whether all of these separate big data and analytics applications have ever been put together in one large platform with one tool kit that can be applied to all the kinds and types of cases that need to be solved.
That's really what's been popular with the technology that's come out today. The answer to that, from what I can tell, is no.
There have been no real adopters of a plant- or refinery-wide platform for analytics and big data.
Instead, they have a layered architectural system with one or more solutions that may be considered point solutions: some for rotating equipment, others for analytics. Some of these are embedded in the control system, others are stand-alone, and others may be done periodically in an Excel spreadsheet or in a stand-alone application interfaced to the historian.
That’s the current state of where we are today, and there doesn’t seem to be a big trend towards this ubiquitous big data and analytics platform.
To back that up, I did a little research into the places, events, and conferences where this discussion might take place. Of course, I went to the American Fuel and Petrochemical Manufacturers, the old National Petroleum Refiners Association, and 2015 was the first time that big data and analytics made it onto their agenda, as part of their plant automation and decision support section.
It was again a topic in 2016. There is some interest in this area, but it would seem to be early in the going in terms of really understanding how to apply the big data and analytics platforms that are being offered today by various technology suppliers.
If we move on to the other areas, we can ask, "If refiners aren't doing this, what part of the refining industry is adopting these more modern platforms and tool sets?"
That area is around supply, trading, pricing, competitive analysis, and demand optimization. Here you have another example of big data, because pricing activity is always going on.
Pricing is very dynamic in the refining industry, for both products and crudes. Supply and trading is very dynamic and goes on all the time, and price optimization is a real challenge for the refining industry, particularly as you get into wholesale and retail.
And then finally, obviously, there's analyzing what your competitors are doing in terms of being able to acquire crudes or move products downstream into the refinery distribution system.
These are areas where we’ve seen a greater application of the modern platforms and tool sets that are available today. Since I mentioned those, who are these people that are offering modern platforms and tool sets?
Some of them include the familiar names: IBM, Oracle, SAS, which has long been in the analytics business, Teradata, EMC, and a number of other companies in the technology space that have point solutions out there.
One of the most recent ones is GE with their SmartSignal software, which is now part of a larger offering from GE Oil and Gas. Up until this point, most of their focus has been on the upstream, because they also own a large share of the compressor market.
They’re doing a lot of rotating equipment monitoring in pipelines and also in offshore platforms, but they haven’t gone as far downstream into the refinery as perhaps they would like to. They’re a very large player now.
What we're seeing, if we step back, is that we're early on in applying the advanced big data platforms and tool sets, but refiners would tell you they've been doing this for a long time. The challenge will be how these new platforms and tool sets fit in with the existing architectures, and how those existing architectures will evolve.
If you have a comment or a question about this podcast, we’d sure like to hear from you. Thank you for listening.