
C429: What can machine learning tell us about the physical and biogeochemical structure of the Southern Ocean? (Lead Supervisor: Dan Jones, British Antarctic Survey)

Supervisors: Dan Jones (British Antarctic Survey), Andrew Meijers (British Antarctic Survey), Peter Haynes (DAMTP) and Emily Shuckburgh (British Antarctic Survey)

Importance of the area of research:

Since the 1970s, the ocean has absorbed more than 90% of the extra thermal energy added to the climate system by anthropogenic emissions. The Southern Ocean has been especially important in this context, accounting for more than 75% of excess heat uptake and roughly 50% of oceanic carbon absorption. The physical and biogeochemical structure of the ocean is complex and difficult to characterise because it varies significantly across a wide range of scales. Partly because of this complexity, there is no objective, unified classification scheme for the layered and variable structures of heat, salinity, carbon, and other biogeochemical species in the Southern Ocean. A robust, algorithmic classification scheme would greatly enhance our ability to understand observed changes in the Southern Ocean, trends in climate models, and projections of future ocean states. To address this need, the student will develop and evaluate classification/clustering methods using a suite of Southern Ocean data. The project is timely, as both the statistical approaches and the relevant data are readily available and well supported by the research community.

Project summary:

In this project, the student will apply a range of unsupervised classification techniques, a subset of machine learning methods, to available physical and biogeochemical Southern Ocean data in order to identify robust structures and understand their variability. Unsupervised statistical clustering techniques have never been applied to a large suite of Southern Ocean data, partly because the oceanographic community has only recently compiled a critical mass of observations to which these techniques can usefully be applied. The results will be used to identify possible new classification schemes for observational and ocean/climate model data. The resulting classification scheme will then be applied to CMIP model data to identify trends in the physical and biogeochemical structure of the Southern Ocean.

What the student will do:

The selected student will explore possible approaches to clustering or classifying Southern Ocean observational data. Possible estimators include Gaussian Mixture Models, which have recently been applied successfully to temperature data in the North Atlantic (Maze et al., 2017); k-means clustering, which has recently been applied to oceanographic time series (Santosh Kumar et al., 2017); and several others. The suite of oceanographic data will at minimum include 4D temperature and salinity data from Argo floats, and it may also include profile data from tagged seals, newly available biogeochemical Argo profiles, and ship-based measurements. In collaboration with the supervisors, the student will identify and apply appropriate machine learning methods for various dataset combinations, exploring how adding new variables or selecting subsets (e.g. individual seasons) affects the classification/clustering analysis. Although there is significant room for the student to customise the analysis, at every stage the classification process will be constrained by our need to relate the statistically derived structures to physical/biogeochemical processes and their variability.
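As a rough illustration of the kind of analysis described above, the following Python sketch fits a Gaussian Mixture Model and a k-means estimator to a set of vertical temperature profiles and returns the resulting class labels. It is only a sketch under stated assumptions, not the project's method: the profiles are synthetic (two made-up regimes rather than real Argo data), the function names, depth grid, and number of classes are hypothetical, and the standardisation and PCA steps are one common preprocessing choice assumed here. It uses NumPy and scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def make_synthetic_profiles(n_per_regime=200, n_levels=20, seed=0):
    """Return (profiles, regime) for two made-up temperature regimes.

    These are NOT real observations; they stand in for gridded profiles.
    """
    rng = np.random.default_rng(seed)
    depth = np.linspace(0.0, 1000.0, n_levels)  # metres, hypothetical grid
    # Hypothetical regimes: warm and strongly stratified vs. cold and weakly stratified.
    warm = 2.0 + 12.0 * np.exp(-depth / 300.0)
    cold = 1.0 + 2.0 * np.exp(-depth / 300.0)
    profiles = np.vstack([
        warm + rng.normal(0.0, 0.5, size=(n_per_regime, n_levels)),
        cold + rng.normal(0.0, 0.5, size=(n_per_regime, n_levels)),
    ])
    regime = np.repeat([0, 1], n_per_regime)
    return profiles, regime


def cluster_profiles(profiles, n_classes=2, n_components=3, seed=0):
    """Cluster profiles (rows) with a GMM and with k-means.

    Returns GMM hard labels, GMM posterior probabilities, and k-means labels.
    """
    # Standardise each depth level so that no single level dominates.
    scaled = StandardScaler().fit_transform(profiles)
    # Reduce dimensionality before clustering (an assumption of this sketch).
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(scaled)

    gmm = GaussianMixture(n_components=n_classes, random_state=seed).fit(reduced)
    gmm_labels = gmm.predict(reduced)
    gmm_posteriors = gmm.predict_proba(reduced)  # soft class membership

    kmeans = KMeans(n_clusters=n_classes, n_init=10, random_state=seed)
    kmeans_labels = kmeans.fit_predict(reduced)
    return gmm_labels, gmm_posteriors, kmeans_labels


if __name__ == "__main__":
    profiles, regime = make_synthetic_profiles()
    gmm_labels, gmm_posteriors, kmeans_labels = cluster_profiles(profiles)
    # The adjusted Rand index compares two labelings regardless of label order.
    print("GMM vs. k-means agreement:", adjusted_rand_score(gmm_labels, kmeans_labels))
```

One difference between the two estimators is visible in the return values: the Gaussian Mixture Model gives each profile a probability of belonging to each class, whereas k-means gives only a hard assignment. The same comparison of labelings could in principle be used to see how a classification changes when variables are added or when a subset such as a single season is selected.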

Please contact the lead supervisor directly for further information relating to what the successful applicant will be expected to do, training to be provided, and any specific educational background requirements.


Maze, G. et al., 2017. Coherent heat patterns revealed by unsupervised classification of Argo temperature profiles in the North Atlantic Ocean. Progress in Oceanography, 151, pp. 275–292. doi:10.1016/j.pocean.2016.12.008

Santosh Kumar, D.J. et al., 2017. Time Series Analysis of Oceanographic Data Using Clustering Algorithms. In S. C. Satapathy et al., eds. Computer Communication, Networking and Internet Security: Proceedings of IC3T 2016. Singapore: Springer Singapore, pp. 245–252.

