The demonstrator “Plankton Genomics” is led by the European Bioinformatics Institute (EMBL-EBI) and created by the Faculty of Sciences at Sorbonne University, with contributions from Flanders Marine Institute (VLIZ).

The aim of the plankton genomics demonstrator is to showcase a deep assessment of plankton distributions by mining data across the biomolecular, imaging and environmental domains. It will draw on the outputs of initiatives such as Tara Oceans and will focus on two key objectives:

  • Notebook 1 - Species and functions discovery:
    • Discovery of as-yet-undescribed biodiversity from genetic and morphological signals, through the characterisation of their geographical distributions, co-occurrences/exclusions and correlation with environmental contexts.
  • Notebook 2 - Biodiversity and ecology:
    • Exploration of genetic and morphological markers of plankton diversity and abundance, in particular those newly discovered in Notebook 1, to predict their spatiotemporal distribution and serve as high-resolution Essential Ocean Variables (EOVs) for biological processes.

The initial users of the plankton genomics demonstrator are primarily scientific researchers, including taxonomists, computational ecologists and bioinformaticians with extensive knowledge of the data collected during the Tara Oceans Expedition. In the short term, we expect significant uptake of the demonstrator by European initiatives such as the H2020 Blue Growth project AtlantECO, the Ocean Sampling Day initiative, and the Marine Genomic Observatories in close collaboration with EMBRC and ASSEMBLE Plus.

The end-users include a broad base of scientists who seek to identify unknown sequences in the oceanic environment and who are also interested in, for example, plankton biogeography, marine biogeochemistry, ecosystem health and climate science.


The plankton genomics demonstrator will consist of two Jupyter Notebooks with R packages that allow users to:

  • Obtain lists of unknown taxonomies and functions,
  • Correlate these unknowns with environmental parameters,
  • Model the biogeography of unknowns using environmental climatologies from Copernicus,
  • Visualize these biogeographies on maps.
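
As a rough illustration of the correlation step listed above, the following minimal sketch computes a Spearman rank correlation between the abundance of one unknown and two environmental parameters. It is not the demonstrator's code: the notebooks rely on R packages, whereas this sketch uses plain Python, and the choice of Spearman correlation, the station values, and the names `abundance`, `environment`, `ranks`, `pearson` and `spearman` are all assumptions made for illustration.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: relative abundance of one unknown gene cluster
# at six sampling stations, with two made-up environmental parameters.
abundance = [0.02, 0.10, 0.15, 0.40, 0.55, 0.90]
environment = {
    "temperature_C": [4.0, 9.5, 12.0, 18.0, 21.5, 27.0],
    "nitrate_umol_L": [22.0, 15.0, 16.0, 6.0, 3.0, 0.5],
}

correlations = {
    name: spearman(abundance, values) for name, values in environment.items()
}
for name, rho in correlations.items():
    print(f"{name}: rho = {rho:+.2f}")
```

A rank-based measure is used here because abundance data are often skewed; a real analysis would also need significance testing and correction for multiple comparisons across many unknowns and parameters.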