Promotes development of ML algorithms for early detection and classification of undesirable events in offshore oil wells.

petrobras/3W

Tip

How about collaborating on a data article about the 3W Dataset 2.0.0?

Everyone is invited!!!

Key benefits:

1) Learn more about the 3W, one of Petrobras' main projects on GitHub: https://github.com/petrobras/3W;
2) Interact with other members of the 3W Community throughout this work;
3) Become a co-author of a relevant data article with good potential for being highly referenced;
4) Enrich your professional portfolio.

This work has just begun and will be developed openly and collaboratively in this Git repository: https://github.com/ricardoevvargas/data-articles-3w-dataset.

Are you interested? Excellent!!! Visit this repository for more information, including the rules and ways to collaborate.

Introduction

This is the first repository published by Petrobras on GitHub. It supports the 3W Project, which aims to promote experimentation and development of Machine Learning-based approaches and algorithms for specific problems related to detection and classification of undesirable events that occur in offshore oil wells.

The 3W Project is based on the 3W Dataset, a database described in this paper, and on the 3W Toolkit, a software package that promotes experimentation with the 3W Dataset for specific problems. The name 3W was chosen because the dataset is composed of instances from 3 different sources, all of which contain undesirable events that occur in oil Wells.

Motivation

Timely detection of undesirable events in oil wells can help prevent production losses, environmental accidents, and human casualties, and can reduce maintenance costs. Losses related to this type of event can reach 5% of production in certain scenarios, especially in areas such as Flow Assurance and Artificial Lifting Methods. In terms of maintenance, the cost of a maritime probe, required to perform various types of operations, can exceed US$500,000 per day.

Creating a dataset and making it publicly available for open experimentation can greatly foster the development of tools that can:

  • Improve the process of identifying undesirable events in the drilling, completion and production phases of offshore wells;
  • Increase the efficiency of monitoring the integrity of wells and subsea systems, where problems can cause incalculable losses for people, the environment, and the company's image.

Strategy

The 3W is the first pilot of a Petrobras program called Conexões para Inovação - Módulo Open Lab. This pilot is an open project composed of two major resources:

  • The 3W Dataset, which will evolve and be supplemented with more instances from time to time;
  • The 3W Toolkit, which will also evolve (in many ways) to cover an increasing number of undesirable events during its development.

Therefore, our strategy is to make these resources publicly available so that we can develop the 3W Project with a global community collaboratively.

Ambition

With this project, Petrobras intends to develop (fix, improve, supplement, etc.):

  • The 3W Dataset itself;
  • The 3W Toolkit itself;
  • Approaches and algorithms that can be incorporated into systems dedicated to monitoring undesirable events in offshore oil wells during their respective drilling, completion and production phases;
  • Tools that can be useful for our ambition.

Governance

The 3W Project was conceived and publicly launched on May 30, 2022 as a strategic action by Petrobras, led by its department responsible for Flow Assurance and its research center (CENPES). Since then, 3W has become increasingly consolidated at Petrobras in several aspects: more professionals specialized in labeling instances, more projects and teams using the resources made available by 3W, more investment in developing the digital tools needed to label and export instances, more interest in including different types of undesirable events that occur in wells during the drilling, completion and production phases, etc.

Due to this evolution, since May 1, 2024, the governance of 3W has also included the Petrobras department responsible for Well Integrity.

Contributions

We expect to receive various types of contributions from individuals, research institutions, startups, companies and partner oil operators.

Before you can contribute to this project, you need to read and agree to the following documents:

It is also very important to know about, participate in, and follow the discussions. See the discussions section.

Licenses

All the code of this project is licensed under the Apache 2.0 License and all 3W Dataset's data files (Parquet files saved in subdirectories of the dataset directory) are licensed under the Creative Commons Attribution 4.0 International License.

Versioning

In the 3W Project, three types of versions will be managed as follows.

  • Version of the 3W Toolkit: specified in the __init__.py file;
  • Version of the 3W Dataset: specified in the dataset.ini file;
  • Version of the 3W Project: specified with tags in the git repository;
  • We will exclusively use the semantic versioning defined in https://semver.org;
  • Versions will always be updated manually;
  • The versions of the 3W Toolkit and the 3W Dataset are completely independent of each other;
  • The version of the 3W Project will be updated whenever, and only when, there is a new commit in the main branch of the repository, regardless of the updated resource: 3W Toolkit, 3W Dataset, 3W Project's documentation, examples of use, etc.;
  • We will only use annotated tags and for each tag there will be a release in the remote repository (GitHub);
  • Content for each release will be automatically generated with functionality provided by GitHub.
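
As an illustration of the semantic versioning rules listed above, the following minimal Python sketch parses and bumps MAJOR.MINOR.PATCH versions. It is not part of the 3W Toolkit: the names Version, parse_version and bump are hypothetical, and pre-release and build-metadata suffixes allowed by https://semver.org are deliberately left out.

```python
import re
from typing import NamedTuple

# Core MAJOR.MINOR.PATCH pattern: numeric parts without leading zeros.
# Pre-release and build-metadata suffixes are intentionally not handled here.
_SEMVER = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")


class Version(NamedTuple):
    """Hypothetical helper type; tuples compare part by part, as semver requires."""

    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def parse_version(text: str) -> Version:
    """Parse a 'MAJOR.MINOR.PATCH' string; raise ValueError if it is malformed."""
    match = _SEMVER.match(text.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def bump(version: Version, part: str) -> Version:
    """Return the next version after a 'major', 'minor', or 'patch' change."""
    if part == "major":
        return Version(version.major + 1, 0, 0)
    if part == "minor":
        return Version(version.major, version.minor + 1, 0)
    if part == "patch":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown part: {part!r}")
```

Because the parts are compared as integers, parse_version("1.10.0") sorts after parse_version("1.9.3"), which a plain string comparison would get wrong.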

Questions

See the discussions section. If you don't find the clarification you need there, please open a new discussion to ask your questions so we can answer them.

3W Dataset

To the best of its authors' knowledge, this is the first realistic and public dataset with rare undesirable real events in oil wells that can be readily used as a benchmark for developing machine learning techniques that must deal with the difficulties inherent in real data. For more information about the theory behind this dataset, refer to the paper A realistic and public dataset with rare undesirable real events in oil wells, published in the Journal of Petroleum Science and Engineering (link here).

Structure

The 3W Dataset consists of multiple Parquet files saved in subdirectories of the dataset directory and structured as detailed here.
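
As a rough sketch of how this layout could be explored without the 3W Toolkit, the Python snippet below groups Parquet files by the subdirectory that holds them. The function names are hypothetical, and the snippet assumes that the files sit exactly one level below the dataset directory; adjust the glob pattern if the actual layout differs. Loading a file relies on pandas.read_parquet, which needs pandas and a Parquet engine such as pyarrow to be installed.

```python
from collections import defaultdict
from pathlib import Path


def index_parquet_files(dataset_dir):
    """Map each subdirectory name to the sorted Parquet files it contains.

    Assumption: files live exactly one level below `dataset_dir`.
    """
    index = defaultdict(list)
    for path in sorted(Path(dataset_dir).glob("*/*.parquet")):
        index[path.parent.name].append(path)
    return dict(index)


def load_instance(path):
    """Load one Parquet file as a pandas DataFrame."""
    import pandas as pd  # imported lazily so the indexing helper works without pandas

    return pd.read_parquet(path)
```

For example, index_parquet_files("dataset") returns a dictionary whose keys are the subdirectory names, and any of the listed paths can then be passed to load_instance.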

Overview

A general presentation of the 3W Dataset, with some quantities and statistics, is available in this Jupyter Notebook.

3W Toolkit

The 3W Toolkit is a software package written in Python 3 that contains resources that make the following easier:

  • 3W Dataset overview generation;
  • Experimentation and comparative analysis of Machine Learning-based approaches and algorithms for specific problems related to undesirable events that occur in offshore oil wells during their respective drilling, completion and production phases;
  • Standardization of key points of the Machine Learning-based algorithm development pipeline.

It is important to note that there are arbitrary choices in this toolkit, but they have been carefully made to allow adequate comparative analysis without compromising the ability to experiment with different approaches and algorithms.

Structure

The 3W Toolkit is implemented in sub-modules as described here.

Incorporated Problems

Specific problems will be incorporated into this project gradually. At this point, we can work on:

All specifications are detailed in the CONTRIBUTING GUIDE.

Examples of Use

The list below, with examples of how to use the 3W Toolkit, will be expanded throughout its development.

For a contribution of yours to be listed here, follow the instructions detailed in the CONTRIBUTING GUIDE.

Reproducibility

For all results generated by the 3W Toolkit to be consistent, we recommend that you create and use a virtual environment with the package versions specified in environment.yml, which was generated with conda. Our current recommendation is to use the conda distributed by Miniforge.

1) Download and install Miniforge according to the official instructions;
2) Open a prompt on your operating system (Windows, Linux or macOS);
3) Make sure the current directory is the directory where you have the 3W;
4) Run the following commands as needed:

  • To create the virtual environment from environment.yml:
$ conda env create -f environment.yml
  • To activate the created virtual environment:
$ conda activate 3W
  • To use the 3W Toolkit resources interactively:
$ python
  • To initialize a local Jupyter Notebook server:
$ jupyter notebook
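
Once the environment is active, one possible sanity check is to compare installed package versions against the versions you expect. The sketch below is only an illustration under stated assumptions: find_mismatches is a hypothetical helper, the pins are supplied by hand rather than parsed from environment.yml, and version strings reported by importlib.metadata may not always match conda's pins character for character.

```python
from importlib import metadata


def find_mismatches(pinned, installed_version=metadata.version):
    """Return {package: (expected, found)} for packages that differ from their pins.

    `pinned` maps package names to expected version strings.
    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling find_mismatches with a dictionary of package names and the versions copied from environment.yml returns an empty dictionary when everything matches.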