A book published in 2007 (Combes, Darwin, dessine-moi les hommes. Ed. Le Pommier) stated that "The constituents of life are being constantly renewed, it is the organisation, and it alone, which perpetuates itself". Part of this organisation resides in the networks of interactions that exist between the genes, proteins, and small molecules that compose a living organism. It is now commonly accepted that the functioning and development of any organism is under the partial control of such networks, or, as the book also indicated, "What constitutes life, are the interactions between molecules of which there exist millions of different types". Studying these interactions and their underlying complexity is therefore crucial to understanding living systems. The interactions may be direct or indirect. The term "network of interactions" is habitually and generically used to designate what are in fact at least three different biochemical processes: metabolism, gene regulation and signal transduction. Metabolism is the complete set of chemical reactions that occur in living cells, while gene regulation (or regulation of gene expression) refers to the cellular control of the amount and timing of the production of a protein or RNA. Signal transduction, in turn, refers to any process by which a cell converts one kind of signal or stimulus into another, most often through ordered sequences of biochemical reactions inside the cell. Each of these processes involves different types of molecules among genes (DNA), gene transcripts (RNAs), various sorts of proteins (enzymes, transcription factors), and small molecules called metabolites or compounds. Besides these, two further features that are essential for building some of these networks need to be studied separately, given their importance and complexity: DNA and RNA sequence or structure regulatory motifs, and gene expression profiling, the latter more usually called transcriptomics.
The amount and breadth of data now made available by the high-throughput techniques developed in recent years have made it possible to obtain increasingly large and realistic networks, although these are still filled with errors and noise. This has also made it possible to introduce an evolutionary perspective into the study of such networks. Indeed, that organisation per se may perpetuate itself does not mean that it undergoes no changes. As with genomic sequences, determining whether and how the different networks evolve, and indeed how they may co-evolve with the molecules whose interactions they model, is an essential issue, both in itself and as a means to a better understanding of living systems.
As with the study of chromosomal evolution and dynamics, the various topics addressed by BAMBOO for analysing processes and networks call on algorithmics on texts (for identifying such basic features as regulatory elements), on numerical matrices (for analysing expression data) and on node- and arc-valued directed or mixed graphs or hypergraphs (which represent the networks). The mathematical techniques used likewise include combinatorics, probability and statistics at all levels: modelling, algorithmic development, algorithmic complexity analysis, and evaluation of the results obtained on real data. This last aspect once again requires good random models, which in this case too remains a largely open issue. The need to be efficient when treating massive or complex data may further require the elaboration of powerful filtering techniques and of special data structures such as smart indexes for strings, trees and graphs. In the case of networks, the analysis of their capacity also raises issues, many of them open, concerning the efficient computation of flow vectors and optimal fluxes.
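To make the hypergraph view and the notion of a flux vector concrete, here is a minimal sketch in Python. It is not BAMBOO's code: the `Reaction` class, the helper functions and the four-reaction toy network are all hypothetical and chosen only for illustration. Each reaction is treated as one hyperarc going from a set of substrates to a set of products, and a flux vector is checked for steady state, i.e. every metabolite is produced as fast as it is consumed.

```python
from dataclasses import dataclass, field


@dataclass
class Reaction:
    """One hyperarc: a set of substrates converted into a set of products."""
    name: str
    # Each dict maps a metabolite name to its stoichiometric coefficient.
    substrates: dict = field(default_factory=dict)
    products: dict = field(default_factory=dict)


def net_production(reactions, flux):
    """Return, for each metabolite, production minus consumption under `flux`."""
    balance = {}
    for reaction in reactions:
        rate = flux.get(reaction.name, 0.0)
        for metabolite, coeff in reaction.substrates.items():
            balance[metabolite] = balance.get(metabolite, 0.0) - coeff * rate
        for metabolite, coeff in reaction.products.items():
            balance[metabolite] = balance.get(metabolite, 0.0) + coeff * rate
    return balance


def is_steady_state(reactions, flux, tol=1e-9):
    """True if no metabolite accumulates or is depleted (within `tol`)."""
    return all(abs(v) <= tol for v in net_production(reactions, flux).values())


# Hypothetical toy network: A is imported, split into B and C,
# which are then combined into D, which is exported.
TOY_NETWORK = [
    Reaction("uptake", products={"A": 1}),
    Reaction("r1", substrates={"A": 1}, products={"B": 1, "C": 1}),
    Reaction("r2", substrates={"B": 1, "C": 1}, products={"D": 1}),
    Reaction("export", substrates={"D": 1}),
]

balanced_flux = {"uptake": 1.0, "r1": 1.0, "r2": 1.0, "export": 1.0}
unbalanced_flux = {"uptake": 1.0, "r1": 1.0, "r2": 0.5, "export": 1.0}

if __name__ == "__main__":
    print(is_steady_state(TOY_NETWORK, balanced_flux))
    print(net_production(TOY_NETWORK, unbalanced_flux))
```

Reaction `r2` shows why a hypergraph is the natural model here: it needs both B and C at once, which an ordinary arc between two nodes cannot express. Finding an optimal flux, as opposed to checking a given one, would typically require a linear programming solver and is not attempted in this sketch.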
The main more specific topics studied by BAMBOO are:
- reconstruction or inference of the genetic, metabolic and protein-protein interaction networks, which implies important upstream work on genome annotation and basic data analysis such as identifying DNA and RNA sequence and/or structure motifs, exploiting transcriptomic and proteomic data, etc.;
- analysis of the structure, capacity and evolution of such networks;
- analysis of the link between such interaction and biochemical networks and the life traits of organisms, which requires developing network comparison algorithms that, depending on the specific question asked, take into account some or all of the different characteristics of such networks (structure, capacity, etc.);
- exploration of the relation between network evolution and evolution at the genomic level.
Metabolic and genetic networks are more specifically (although not exclusively) studied in the particular context of symbiotic relations. Symbiosis is a special type of (species) interaction, which is also addressed by BAMBOO. It is a pervasive phenomenon, often of a long-term nature. It has been estimated that 50% of all known species are parasites, i.e. maintain a symbiotic relation with another species from which they benefit while the partner in the relation is harmed, and that close to 100% of all plants and animals are parasitised as individuals. Indeed, there are thought to be 10 times more bacterial cells in a human body than human cells. There is growing recognition that symbiosis has a profound impact on the origin and maintenance of the biome and of its ecosystems, on the health of living organisms, and even on sex! The more specific questions addressed concern:
- analysis of host-parasite co-evolution, notably by co-phylogenetic (tree/network comparison) approaches;
- study of the dynamics of the invasion of a host by the symbionts (parasites, commensals, endosymbionts) with which the host lives in an intimate relation;
- exploration of the possible evolution of this intimate relation (for instance, for endosymbionts, into an organelle);
- analysis of the genetic and metabolic dialogue between a host and the endosymbionts living inside its cells.
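As a rough illustration of the tree comparison idea behind co-phylogenetic approaches, the sketch below measures how congruent a parasite tree is with a host tree. It is a deliberately simplified stand-in, not the method used by BAMBOO: it assumes each parasite is associated with exactly one host, relabels parasite leaves with their hosts, and counts the clades found in one tree but not the other (a rooted, clade-based distance in the spirit of Robinson-Foulds). Full co-phylogenetic reconciliation, which explains differences through events such as cospeciation, duplication, host switch and loss, is considerably richer. The trees, names and association map are hypothetical.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a tree given as nested tuples of leaf names."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the clade rooted at this internal node
    return leaves, found


def relabel(tree, mapping):
    """Replace every leaf name by mapping[leaf], keeping the tree shape."""
    if isinstance(tree, str):
        return mapping[tree]
    return tuple(relabel(child, mapping) for child in tree)


def clade_distance(host_tree, parasite_tree, association):
    """Number of clades present in exactly one of the two trees.

    Assumes `association` maps each parasite leaf to one distinct host leaf.
    """
    _, host_clades = clades(host_tree)
    _, parasite_clades = clades(relabel(parasite_tree, association))
    return len(host_clades ^ parasite_clades)


# Hypothetical example data.
host_tree = (("h1", "h2"), ("h3", "h4"))
association = {"p1": "h1", "p2": "h2", "p3": "h3", "p4": "h4"}
congruent_parasites = (("p1", "p2"), ("p3", "p4"))
incongruent_parasites = (("p1", "p3"), ("p2", "p4"))

if __name__ == "__main__":
    print(clade_distance(host_tree, congruent_parasites, association))
    print(clade_distance(host_tree, incongruent_parasites, association))
```

A distance of zero means that the parasite tree mirrors the host tree exactly, which is what strict cospeciation would produce; larger values indicate that other events would be needed to explain the associations.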