Industry Article

A machine learning pipeline for document extraction

Each year the geoscience industry creates huge volumes of documents containing a wealth of knowledge that cannot be easily queried or extracted. Key to the successful extraction and transformation of this data is an understanding of the nature of the data within a corpus of files. For large datasets, manually opening and reviewing each document in turn is time-consuming. In this article, we therefore discuss how machine learning is used at CGG to classify documents in our automated pipeline and significantly reduce project times.

Publications

First Break.

Authors

Chin Hang Lun, Thomas Hewitt, Song Hou

Month

February

Copyright

©2022 EAGE