This month I was lucky enough to attend a week-long ‘geocomputing’ training course sponsored by the UK Oil and Gas Authority. The course targeted geoscientists in the oil industry - specifically those working on the UKCS - and aimed to arm us with the tools to help maximize our use of North Sea data.
The five-day programme was taught by the very brilliant (and very patient) Agile team: Matt and Rob. They started with the basics of computer programming, which was just as well, as I have NEVER had a single computer science class in my life! We covered:
The basics of writing in computer code
Functions and how to build them
Manipulating datasets
Data visualization (my favourite part!)
Finally, we were introduced to the exciting future of machine learning in geoscience.
The level of energy and enthusiasm in the room is something I have very rarely encountered working in industry or academia. It got me thinking: 1) why is this such an engaging subject for geoscientists (who are usually repulsed at the thought of mathematics); and 2) why are we not being educated in the use of these tools at a much earlier stage in our Earth Science training?
"You have made infinity % progress in your coding knowledge this week" - Matt Hall
Why is computer coding such a pivotal tool for geoscientists going forward?
A couple of factors combine to make data manipulation and visualisation tools so powerful for solving geoscience problems. Firstly, geoscientists are trained to think visually. We learn in the field by looking at an outcrop and extrapolating the observations into the subsurface. However, in the workplace, qualitative observation-driven interpretation is not going to drive business decisions, and therefore quantitative data analysis is required. As a result, geologists and geophysicists spend large amounts of time dealing with large arrays of numerical data.
Above is an example from my PhD research. Grainsize analysis of 770 sediment samples became over 10,000 individual data points within an Excel spreadsheet. I challenge anyone to get excited about a table with over 10,000 numbers in it! Analysing this data to assess the relationships between the data points took many weeks of careful graph construction, and plenty of cutting and pasting of data between numerous spreadsheets and workbooks. You can imagine my delight, then, when within a morning I had loaded all my data into a format that Python can work with and was able to quickly interrogate it through functions. This included one function which allowed me to cross-plot all variables against each other for a very quick overview of all the possible relationships in the data set (see below).
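For anyone curious what "all variables against each other" looks like in code, here is a minimal sketch using only the Python standard library. It is not the function from the course: the column names and values are made up for illustration, and it prints a correlation coefficient for every pair of variables rather than drawing the plots. Plotting libraries offer the visual equivalent, for example pandas' scatter_matrix or seaborn's pairplot.

```python
import csv
import io
from itertools import combinations
from math import sqrt

# Hypothetical grain-size table: names and numbers are invented for illustration.
SAMPLE_CSV = """sample,mean_phi,sorting,skewness
A1,1.0,0.5,0.10
A2,2.0,0.7,0.05
A3,3.0,0.9,0.00
A4,4.0,1.1,-0.05
"""


def load_columns(text):
    """Read CSV text into {column name: list of floats}, keeping numeric columns only."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = {}
    for name in rows[0]:
        try:
            columns[name] = [float(row[name]) for row in rows]
        except ValueError:
            continue  # skip non-numeric columns such as sample IDs
    return columns


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def pairwise_correlations(columns):
    """Compare every variable with every other variable exactly once."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


columns = load_columns(SAMPLE_CSV)
correlations = pairwise_correlations(columns)
for (a, b), r in correlations.items():
    print(f"{a} vs {b}: r = {r:+.2f}")
```

The useful part is combinations(columns, 2): it lists every pair of variables for you, which is exactly the job that took me weeks of cutting and pasting between spreadsheets.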
My PhD dataset was small compared with the volumes that companies deal with. The amount of data at our fingertips is growing at an astonishing rate. It becomes overwhelming because of the various formats, nomenclatures, and levels of quality involved. It would simply take decades to analyse thoroughly by hand.
Through building functions, computers can do much of this work for us.
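As a small illustration of the kind of chore a function can take over, here is a sketch that tidies up inconsistent naming across records. The alias table, the field names, and the values are all hypothetical; a real project would need its own list.

```python
# Hypothetical alias table: different spellings mapped to one agreed name.
ALIASES = {
    "gr": "gamma_ray",
    "gamma": "gamma_ray",
    "gamma ray": "gamma_ray",
    "dt": "sonic",
    "sonic": "sonic",
}


def canonical_name(raw):
    """Normalise case, spacing and underscores, then look up the agreed name."""
    key = raw.strip().lower().replace("_", " ")
    # Names not in the table are kept, just tidied into lower_snake_case.
    return ALIASES.get(key, key.replace(" ", "_"))


def harmonise(record):
    """Return a copy of one record with every field renamed consistently."""
    return {canonical_name(name): value for name, value in record.items()}


# Two made-up records that describe the same measurements in different ways.
records = [
    {"GR": 75.0, "DT": 90.1},
    {"Gamma_Ray": 80.2, "Sonic": 88.4},
]
cleaned = [harmonise(record) for record in records]
print(cleaned)
```

Once the names agree, the same analysis function can run over every record without any manual renaming.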
By our very nature, we as scientists are problem solvers. Python provides a powerful platform for big problems to be solved. And for that reason, I would urge you to get involved with the geocomputing revolution. Who knows, perhaps we could solve more than just the geoscience problems!
Check out some of the fantastic projects driven by geo-computer scientists on the Agile Website: https://events.agilescientific.com/event/subsurface-hackathon-2018/projects
-- Happy Exploring --