TOXsIgN is a new multi-species public repository for toxicogenomic signatures... but wait a second, what is a "toxicogenomic signature"?
[Toxicogenomic signature]: the description of -omics (e.g. transcriptomics, proteomics, epigenomics) effects on individuals or their descendants, after exposure to single or combined environmental factors, including chemical (e.g. pesticides, plasticizers, drugs, endocrine disruptors), physical (e.g. radiation, temperature) or biological (e.g. pathogens, parasites) factors. When using transcriptomic technologies, the corresponding signature is the set of genes whose expression is known to be positively or negatively altered after exposure to these factors.
Considerable concern has recently been raised about the lack of reproducibility of biomedical research, and more particularly in the field of toxicology. Funding agencies, such as the National Institutes of Health (NIH), share this concern and are discussing ways to enhance reproducibility in environmental sciences, notably through greater transparency of data, including negative or contradictory findings, a practice that peer-reviewed journals should also adopt more widely. Journals and public funding agencies have thus begun to make public availability of raw and analyzed data a condition for publication or funding, respectively.
In addition, we believe that it is also a matter of give and take. Making toxicogenomic signature data available to the community will: i) on the one hand, facilitate data comparison and the development of innovative tools; and, ii) on the other hand, make your toxicological studies and publications more visible (as investigators will be able to compare their own signature data with yours), which should in turn boost citations of your scientific articles.
This repository is intended neither to archive raw data, as GEO and ArrayExpress do, nor to replace existing toxicogenomics databases such as CTD, LINCS, diXa and CEBS, but rather to complement these resources by acting as a distribution hub of toxicogenomic signatures directly submitted by scientists of the community. In addition to serving as a public archive, TOXsIgN also intends to become a warehouse for toxicogenomics and predictive toxicology tools. Its modular design will facilitate the implementation of tools that build on the deposited signatures and help investigators analyze, predict and prioritize the toxicological effects of environmental factors.
The TOXsIgN database is organized in a four-layer architecture (Project > Assay > Factor > Signature), with each layer associated with a unique identifier that can be reported in submitted manuscripts, like the raw data identifiers from GEO or ArrayExpress. The project layer (associated with an identifier with the “TSP” prefix) covers one or several studies (or subprojects, “TSE” prefix) addressing specific questions. Each study in turn is associated with at least one assay (“TST” prefix) that assesses the exposure of a given study model (e.g., cell culture, living animal, human population) to at least one chemical (e.g., pesticide, plasticizer, drug, or endocrine disruptor), physical (e.g., type of radiation or temperature) or biological (e.g., pathogen or parasite) agent at a given dose and for a given time of exposure. Specific outcomes are extracted for each assay in the form of toxicogenomic signatures (“TSS” prefix), i.e., the set of over- and underexpressed genes. Importantly, this organization is compatible with mixtures and transgenerational studies.
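As an illustration of how these prefixes can be used when handling identifiers cited in a manuscript, here is a minimal Python sketch that maps an identifier to its layer. The prefix-to-layer mapping comes from the description above; the exact identifier format (a prefix followed by digits) and the function name `layer_of` are assumptions made for this example, not a documented TOXsIgN specification.

```python
import re

# Prefix-to-layer mapping, as described in the text above.
LAYER_BY_PREFIX = {
    "TSP": "project",
    "TSE": "study",
    "TST": "assay",
    "TSS": "signature",
}

# Assumption: an identifier is one of the prefixes followed by digits only.
_ID_PATTERN = re.compile(r"^(TSP|TSE|TST|TSS)(\d+)$")


def layer_of(identifier: str) -> str:
    """Return the layer name for an identifier such as 'TSS42' (hypothetical)."""
    match = _ID_PATTERN.match(identifier.strip().upper())
    if match is None:
        raise ValueError(f"Not a recognized identifier: {identifier!r}")
    return LAYER_BY_PREFIX[match.group(1)]
```

For example, under these assumptions `layer_of("TSP1")` returns `"project"`, while a string with an unknown prefix raises a `ValueError`.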
In its current state, TOXsIgN is a gene-centric repository that only accepts NCBI Entrez Gene identifiers (IDs) for molecular and "omics" signatures. This restriction has one major advantage: it allows users to perform cross-technology and cross-species comparison of toxicogenomic signatures. Because reliable and consistent identifier conversion is a complex problem, toxicogenomic signatures should be converted to Entrez Gene IDs using up-to-date resources (AILUN, DAVID or UniProt) before being submitted to TOXsIgN. For experiments with Affymetrix GeneChip technologies, it is highly recommended that users normalize their raw data (CEL files) with the BRAINARRAY custom Chip Description Files (CDF) so that intensity values are not summarized for each probe set but directly for each Entrez Gene ID (Dai et al., 2005).
In its current state, TOXsIgN does not yet support bulk submission of large datasets (e.g. toxicogenomic signatures for hundreds of environmental factors). Nevertheless, the TOXsIgN team will be pleased to help you submit your toxicogenomic signatures. Do not hesitate to contact us!
In a quick and easy submission procedure, investigators record all required information in a dedicated Excel template. This file contains one tab for each layer (Project, Study, Assay, and Signature) and integrates a dozen landmark controlled vocabularies (ontologies) allowing scientists to describe their toxicogenomic studies and their outcomes with precision. Once the template is uploaded, the TOXsIgN webserver performs an initial evaluation to identify: i) “critical errors” about essential information that may not have been properly completed (such as the project title) and could therefore prevent the project upload; ii) “warnings” for important but not essential missing information (such as a PubMed identifier); and, iii) “information” for any other data not appropriately completed (such as additional information). If the system detects no “critical error”, it next invites the user to upload the associated toxicogenomic signatures. Each signature comprises three one-column text files specifying: i) all interrogated genes; ii) significantly overexpressed genes; and iii) significantly underexpressed genes.
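Before uploading, it can be useful to sanity-check the three signature files locally. The following Python sketch is only an illustration: the specific checks (numeric Entrez Gene IDs, over- and underexpressed genes being part of the interrogated genes, no gene listed in both directions) are assumptions about what a consistent signature looks like, not the validation actually performed by the TOXsIgN webserver, and the function names are hypothetical.

```python
from pathlib import Path


def read_gene_ids(path):
    """Read a one-column text file into a set of ID strings, skipping blank lines."""
    ids = set()
    for line in Path(path).read_text().splitlines():
        value = line.strip()
        if value:
            ids.add(value)
    return ids


def check_signature(interrogated, over, under):
    """Return a list of human-readable problems (empty if none were found).

    All three arguments are sets of Entrez Gene IDs given as strings.
    The rules below are assumptions made for this sketch.
    """
    problems = []
    for label, genes in (("overexpressed", over), ("underexpressed", under)):
        # Entrez Gene IDs are integers, so anything non-numeric is suspicious.
        non_numeric = sorted(g for g in genes if not g.isdigit())
        if non_numeric:
            problems.append(f"{label}: non-numeric IDs {non_numeric}")
        # A regulated gene should also appear among the interrogated genes.
        missing = sorted(genes - interrogated)
        if missing:
            problems.append(f"{label}: not in interrogated genes {missing}")
    # A gene should not be reported in both directions.
    both = sorted(over & under)
    if both:
        problems.append(f"listed as both over- and underexpressed: {both}")
    return problems
```

A typical use would be to call `read_gene_ids` on each of the three files (the file names are up to the submitter) and pass the resulting sets to `check_signature`; an empty list means none of these assumed inconsistencies were found.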
By default, each submitted project and its related signatures are tagged with a “private” status, meaning that only authorized users (the owner and any coauthors) can access the uploaded data. At this stage, information can still be modified simply by uploading an updated version of the Excel template. A button is available on the web interface for each project to ask the TOXsIgN administrators to change the project status from private to public. If “warnings” are still detected, this request is rejected. The administrators will then help the investigators make the modifications needed to change the status. Full instructions and examples of the submission procedure are provided in the tutorial section.
A powerful search engine is implemented to access all public information within TOXsIgN. Users can thus interrogate the database according to many distinct fields, such as environmental factors, organisms, tissues, and technologies. They can also easily make more advanced queries by using ontologies to describe toxicogenomic signatures.
Several toxicogenomic tools are already available in the warehouse and others are currently being developed in our lab. If you want to contribute to TOXsIgN by sharing tools you have developed, please feel free to contact us and we will do our best to help you integrate the new module into the web interface.