Subject ▸ petroleum engineering

Science is about curiosity; Data Science is asking the right questions

Today, I had an enriching dialogue with a Petroleum Engineer who is working on his Master’s. I thought it would be altogether fitting and proper to share some of the questions a Subject Matter Expert (SME), in this case a Petroleum Engineering professional, should ask to move ahead in the transition to decarbonization, which investors and major oil and gas companies now call Net Zero 2050. Identifying a Data Science problem. The inner zone of influence.

Read More…

Reading wells from SPE data repository

Okay. There are two ways of downloading the data for all the wells in the SPE repository: the manual way (one file at a time with “Save As”), and the non-interactive, automated way. The manual way is the easiest and requires that you provide your SPE username and password on the usual login page. Then, you click on the link to the repository https://www.spe.org/datasets/, and start right-clicking on each of the files under the data folders.

Read More…

rProsper adds batch automation, dataframe generation

For those working in #oilandgas #productionoptimization, this will make a good addition to your #datascience and #machinelearning toolkit. I’m working on the last details of a new #rstats package, rProsper. rProsper adds batch automation, dataframe generation, customized ggplot2 plotting, and powerful statistical analysis to the daily well modeling workflow. It makes production optimization faster and more reliable, because well models are not treated as isolated units (one model, one file) but as part of a statistical worldview (one well, one row).

Read More…

Is there a clash between Data-driven modeling vs Physics-based modeling?

Article in progress. Leave your comments for debate; I will try to integrate them into the main body later. The more I learn about machine learning algorithms, putting into practice advanced applications of neural networks, deep learning convolutional networks, generative adversarial networks, recurrent neural networks and the like, the more similarities I find between these data-driven models and physics modeling. My view is that (i) physics-based models have stood the test of time (centuries) and will continue to do so; (ii) the successes of data-driven models have been hyped because of the novelty of new algorithms and faster, ubiquitous computing power; (iii) some of the data-driven “everything” wave has been put forward with commercial interests in mind; (iv) the successes of data-driven models have been caused by a profound gap in physics-based modeling applications and software, to the point that the “data wave” almost drowned the physics-based modeling world; (v) physics-based modeling software neglected the effect of the continuous stream of data and chose to stay in the comfortable paradigm of charging for licenses; (vi) commercial physics-based modeling software underestimated the value brought by statistics, data science and machine learning; (vii) traditional physics-based modeling software companies have started to react but still don’t get it: they have preferred to rename their products to *“any-word-here” + “intelligent”* to signal to their clients that they have caught up with the times of “artificial intelligence”.

Read More…

Any Petroleum Engineer can do reproducible Machine Learning

As I prepare to release a couple of examples using Generative Adversarial Networks (GANs) for creating synthetic datasets with rTorch, I found that I have several Rmarkdown notebooks lying around from the time I was learning PyTorch. So, I decided to put these notebooks into a sort of online ebook on GitHub. These notebooks range from unit tests for functions I implemented in *rTorch*, a wrapper of PyTorch written in R, to small neural networks for logistic regression and linear regression.

Read More…

Transforming Petroleum Engineers in Data Science Wizards. Update 2019

Note. This is an update of the original article I published in 2017. Many things have changed, or have progressed so fast, that the article needs some rewriting. Very often I receive questions from colleagues asking for tips on Data Science and Machine Learning applied to Petroleum Engineering. These answers address some of the questions I have collected over time. You may call this some advice on becoming a Petroleum Engineer and Data Science wizard:

Read More…

Being proven wrong on Linux every day

Being proven wrong every day: “Open Source has no warranty, no maintenance, is insecure, and has no leaders.” From the Blog of Alfonso R. Reyes at blog.oilgainsanalytics.com. That’s what they said about Linux. And look where it is now. Ninety percent or more of the world’s servers use Linux; 98% of mobile smart devices use Linux or Unix derivatives; active sensors, microprocessors and controllers in the field have underlying operating systems based on Unix. Finally, the cherry on top: 99.

Read More…

Transforming Petroleum Engineers in Data Science Wizards

Once in a while I get messages from colleagues asking for tips on Data Science applied to Petroleum Engineering. These are the responses I have collected over time, advice to follow to become a Petroleum Engineer and Data Science wizard: Complete any of the Python or R online courses on Data Science. My favorites are the ones from Johns Hopkins on Coursera, complemented with DataCamp short workshops. Those are just two that come quickly to mind.

Read More…

Data Science for Petroleum Engineering - Part 5.2: Finding and filling missing data

NOTE. You can find the PDF version of the R markdown notebook in GitHub at this link. The reproducible R markdown notebook (.Rmd) itself is here. Both are full versions of this LinkedIn article. For the time being, LinkedIn publishing does not support Markdown, which would make sharing scientific and engineering documents much easier. Mistyped data. One of the challenges in cleaning up well data is having uniform and standard well names.
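To make the point about uniform well names concrete, here is a minimal sketch in Python. The canonical form `FIELD-NNN`, the sample names, and the helper `normalize_well_name` are all assumptions made up for illustration; they are not taken from the article or its notebook, which uses R.

```python
import re


def normalize_well_name(raw: str) -> str:
    """Return a well name in a hypothetical canonical form, e.g. 'PAD-007'."""
    # Remove surrounding whitespace and unify letter case.
    name = raw.strip().upper()
    # Accept a run of letters, optional separators (space, dash, underscore),
    # and a run of digits. This pattern is an assumed convention.
    match = re.fullmatch(r"([A-Z]+)[\s\-_]*(\d+)", name)
    if match is None:
        # Leave unrecognized names untouched so they can be reviewed by hand.
        return name
    field, number = match.groups()
    # Zero-pad the well number to three digits.
    return f"{field}-{int(number):03d}"


# Several mistyped variants of the same (made-up) well collapse to one name.
variants = [" pad_007 ", "Pad 7", "PAD-07", "pad7"]
cleaned = {normalize_well_name(v) for v in variants}
```

Names that do not fit the assumed pattern are returned unchanged rather than guessed at, so they remain visible for manual correction.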

Read More…

Data Science for Petroleum Engineering - Part 5: "Transforming Excel well raw data into datasets."

One of the big challenges of this new era of data science, machine learning and artificial intelligence is getting unhooked from the habit of working with spreadsheets. They have been around for 30+ years and were awesome. But spreadsheets, or worksheets, do not scale well with massive amounts of data, with continuous streams of data, or with other characteristics that are key for making sound decisions, such as reproducibility.

Read More…