Partner 1 - Drs. Colomban de Vargas, Sarah Romac, Fabrice Not, and the ‘Plankton Group’; CNRS UMR7144, team EPEP, Station Biologique de Roscoff, France.
In addition to coordinating the whole project, designing the original sampling protocols, co-organizing the 10 sampling campaigns, handling all genetic samples from DNA/RNA extraction to NGS sequencing, and developing the main bioinformatics pipeline to quality-check and taxonomically assign the BioMarKs metabarcodes, P1 produced the following specific scientific outcomes:
- Development of SWARM (Mahe et al. submitted), a novel method to cluster NGS amplicons in a less arbitrary and biologically more meaningful way. Current amplicon clustering methods suffer from two problems: arbitrary global clustering thresholds, and a dependency on input order induced by centroid selection. SWARM was developed to solve both by iteratively clustering nearly identical amplicons using a local threshold. This fast, input-order-independent approach produces robust operational taxonomic units, increasing the amount of meaningful biological information that can be extracted from amplicon-based studies.
Figure 2: Phylogenetic tree of 16S rDNA reference sequences in PhytoRef, representing all known lineages of phototrophic eukaryotes.
- Development of two major reference databases of taxonomically curated rDNA barcodes: (i) PR2, the Protist Ribosomal Reference Database (Guillou et al. 2012), an assemblage of >130,000 expert-annotated 18S rDNA sequences which represents the best tool currently available for accurate taxonomic assignment of eukaryotic metabarcodes, from the smallest cells to animals; (ii) PhytoRef (Decelle et al. to be submitted in 2014), a database of >1,000 curated plastidial 16S rDNA sequences representing all known lineages of photosynthetic eukaryotes, allowing fast and precise recognition of all primary producers in any ecosystem.
Figure 3: The quantitative high throughput fluorescent optical sectioning offered by e-HCFM will revolutionize the morphological and functional characterization of environmental protistan cells, promoting a wide range of applications in both fundamental and applied aquatic research.
- Development of e-HCFM (environmental High Content Fluorescence Microscopy; Colin et al. to be submitted in 2014), a confocal microscopy technology allowing automated, cell-by-cell, high-resolution 3D fluorescence imaging of plankton samples, together with machine-learning-based taxonomic recognition.
- Exploration of the environmental diversity of marine calcifying protists (coccolithophores and foraminifers) using nuclear, mitochondrial, and chloroplastic metabarcodes at various taxonomic levels, from phyla (all haptophytes, Bittner et al. 2013), to families and genera (Bendif et al. 2013, 2014), to the description of novel key species (Siano et al. 2010). In particular, a phenomenal diversity of novel haptophyte microalgae was uncovered in the nanoplankton using a group-specific priming approach (Bittner et al. 2013), demonstrating the depth of our ignorance concerning protistan biodiversity, even in one of the most classical groups of phytoplankton.
- Creation of the international Pro-WG (Protist Working Group) and ProBP (Protist Barcoding Project) under the umbrella of the Consortium for the Barcoding of Life (CBOL). This international group of >30 experts in the taxonomy of most protistan lineages reviewed the current knowledge of protist biodiversity based on classical taxonomy and DNA barcoding, and proposed a novel two-step barcoding strategy: pre-barcoding with the V4 rDNA fragment, as implemented in BioMarKs, followed by group-specific barcoding using faster-evolving gene fragments.
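
The clustering idea behind the SWARM outcome listed above (iteratively linking nearly identical amplicons with a local threshold, independently of input order) can be illustrated with a minimal Python sketch. This is a simplified illustration and not the SWARM implementation: the function names, the toy data, the local threshold of one difference, and the choice of the most abundant amplicon as seed are all assumptions made for the example.

```python
from collections import deque


def within_one_edit(a, b):
    """True if a and b differ by at most one substitution, insertion, or deletion."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) sequence
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution at position i
    return a[i:] == b[i + 1:]          # one extra base in b at position i


def cluster_amplicons(abundances):
    """Group amplicons by chaining neighbours that are one difference apart.

    abundances: dict mapping sequence -> read count (hypothetical input format).
    Seeds are taken by decreasing abundance, ties broken alphabetically, so the
    result does not depend on the order in which amplicons are supplied.
    """
    order = sorted(abundances, key=lambda s: (-abundances[s], s))
    unassigned = set(order)
    clusters = []
    for seed in order:
        if seed not in unassigned:
            continue
        unassigned.remove(seed)
        cluster = [seed]
        queue = deque([seed])
        while queue:
            current = queue.popleft()
            # Local threshold: only direct neighbours of the current member join.
            neighbours = sorted(s for s in unassigned if within_one_edit(current, s))
            for s in neighbours:
                unassigned.remove(s)
                cluster.append(s)
                queue.append(s)
        clusters.append(cluster)
    return clusters


# Toy data (invented for illustration only).
amplicons = {"ACGT": 10, "ACGA": 3, "ACCA": 1, "TTTT": 5, "TTTTA": 2}
clusters = cluster_amplicons(amplicons)
```

In this toy example, "ACCA" joins the cluster seeded by "ACGT" through the intermediate "ACGA", even though it differs from the seed by two positions. The cluster grows through local links rather than through a fixed global distance to a centroid. The naive all-against-all neighbour search used here is quadratic and is only suitable for small examples.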