The EPD is set up as a relational database consisting of a large number of tables, which allows the data to be queried in complex ways and minimises the storage space required. The aim of the database structure support group is to find and maintain a table structure and database management system that is adequate and up to date for the widest possible use of the EPD. The group maintains and fosters links to other palaeo-environmental databases such as the ALPine Pollen DAtaBAse (ALPADABA) housed in Bern (Switzerland), Neotoma, the African Pollen Database (APD), PANGAEA, and BugsCEP.
The Publishing Network for Geoscientific & Environmental Data PANGAEA offers to support the EPD
We are keeping track of developments of pollen databases and had a closer look at the Neotoma database
Walter Finsinger, Simon Brewer, Thomas Giesecke and Basil Davis met in Aix between May 29 and 31. We reviewed the EPD table structure, discussed working protocols with Michelle Leydet and helped John Keltner to correct mistakes in the database. We are grateful to John, Michelle and Valérie Andrieu for advice in discussions and support. (Simon Brewer, Basil Davis, Walter Finsinger, Thomas Giesecke)
Also present at the meeting: John Keltner, Michelle Leydet
This report attempts to summarize the major discussions from the work meeting and suggests a few guidelines or protocols. None of the items are set in stone, but they should serve as a basis for discussion.
We first reviewed the Paradox table structure (Fig. 1) in which the EPD is currently held and identified fields that have not been used, items that should be combined, and items that should be added.
Fig. 1: Table structure and relationships of the most important EPD Paradox tables (here imported into Access).
Table ‘Coredriv’: Was seldom used and could be deleted – the data that is present could be stored in a free text ‘notes’ field in an appropriate table.
Table ‘Section’: The information here could be better combined with another table.
Tables ‘Sitedesc’ and ‘Siteloc’ could be combined.
Table ‘Entity’: IsCore, IsSect, IsSSamp could be combined into a field or combined with Descriptor.
Table ‘Entity’: It would be good to add an identifier for a single publication that should be cited when using the dataset. The database should hold references to the many publications that describe the dataset, but when citing many records it is often only possible to cite one publication per record. Ideally, the person who submits data to the DB should indicate the publication that should be cited whenever the dataset is used. However, a single publication is in some cases not sufficient: (1) Sometimes a diagram is published in two parts in different publications (e.g., Late Glacial in one and Holocene in another). (2) Sometimes the C14 dates, or part of them, are published in a later publication, separate from the original publication with the pollen diagram.
Table ‘Entity’: The variables ‘IceThickCM’ and ‘C14DepthAdj’ may be deleted and the data that is present could be stored in a free text ‘notes’ field in an appropriate table.
Table ‘Descr’: This table should be reviewed in detail. At the moment it does not contain a variable describing whether or not a lake has an in- or outflow. We initially proposed to add:
However, we realize that these would introduce duplication, since, for example, a lake of fluvial or glacial origin may or may not have an inflow. It therefore seems necessary to review the list of choices in the ‘Descr’ table.
Table ‘Litholgy’: The sediment description is currently free text and therefore cannot be queried. It would be helpful to add a general variable where a choice has to be made between, e.g., gyttja and peat. In this way sites that went from lake to mire could be identified.
Tables containing dating information like ‘Pb210’, ‘tl’ … may be combined into a single table containing a variable that identifies the type of age determination.
Table ‘Sitedesc’: We felt it important to have a variable that describes the general vegetation around the site in the way that the variable IGCP-type does. However, the suitability of the latter variable should be reviewed. This variable could possibly also be generated through a GIS query.
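The suggestion above to combine the dating tables (‘Pb210’, ‘tl’ …) into a single table with a variable identifying the type of age determination can be sketched as follows. This is only an illustration under stated assumptions: the table name `age_determination`, all column names, the list of allowed methods and the sample rows are hypothetical and are not taken from the EPD. It uses Python's built-in sqlite3 module rather than Paradox or Access.

```python
import sqlite3

# In-memory database, used only for this illustration.
conn = sqlite3.connect(":memory:")

# One table for all age determinations; 'method' identifies the type.
# Table and column names are hypothetical, not the EPD field names.
conn.executescript("""
CREATE TABLE age_determination (
    entity_id INTEGER NOT NULL,  -- hypothetical key to the 'Entity' table
    method    TEXT NOT NULL CHECK (method IN ('C14', 'Pb210', 'TL')),
    depth_cm  REAL NOT NULL,
    age       REAL NOT NULL,
    error     REAL,
    notes     TEXT               -- free text for rarely used details
);
""")

# Invented sample rows, for demonstration only.
rows = [
    (1, "C14", 250.0, 4300.0, 60.0, None),
    (1, "Pb210", 12.0, 85.0, 10.0, None),
    (2, "TL", 400.0, 11000.0, 900.0, "illustrative only"),
]
conn.executemany(
    "INSERT INTO age_determination VALUES (?, ?, ?, ?, ?, ?)", rows
)


def entities_with_method(connection, method):
    """Return the sorted ids of entities that have a date of this type."""
    cursor = connection.execute(
        "SELECT DISTINCT entity_id FROM age_determination "
        "WHERE method = ? ORDER BY entity_id",
        (method,),
    )
    return [row[0] for row in cursor]


print(entities_with_method(conn, "Pb210"))
```

With a single table, a query across all dating methods no longer needs one join per method-specific table, and a new method only requires a new allowed value rather than a new table.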
Secondly, we reviewed the scorpion table structure designed by John Keltner. We acknowledged that most of the changes we identified as necessary in the Paradox table structure have already been realized in the scorpion table structure. Additionally, in scorpion all lookup tables were combined into a glossary table. Furthermore, the scorpion table structure is better able to hold proxies other than pollen and LOI that may have been analysed from the same core, or from a different core at the same site.
In Europe we are in a situation where two macrofossil databases are being built up that will eventually be made publicly available. We therefore recommend that the EPD adopts a table structure that can comfortably accommodate several proxies and that the European Pollen Database be combined with the macrofossil databases in the same tables.
Although the scorpion table structure is a desirable step forward, no import tools are currently available for adding new datasets to the scorpion database. Currently, Eric Grimm and co-workers are preparing a new database for multi-proxy datasets. The new version of Tilia, which was presented as a β-version during the EPD meeting, will serve as an import portal for this new database, and import tools will be developed. For these reasons, we recommend that the EPD continues working with the old Paradox tables until a new complete solution is available.
a) Minimum b) Desired c) Maximum possible
When receiving a new dataset:
Each dataset receives a tracking number which is posted under appropriate categories (e.g. received, work in progress, ready to upload) on the webpage or wiki. This number will also be used for internal tracking e.g. when the dataset goes out to support groups.
Check whether the minimum (a) and desired (b) metadata are available. If a and b = TRUE, go to enter data. If a = FALSE, ask the submitter for more data. If b = FALSE, look in the publication and/or ask for more data. If there is no reply, or the reply is limited but the minimum metadata are present, go to enter data. For any other problem, contact the regional work group.
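The metadata check above can be written down as a small decision function. The following is a minimal sketch, not an agreed protocol: the names `Action`, `triage` and `followed_up` are hypothetical, and the reading that missing minimum metadata after an unsuccessful follow-up counts as "any other problem" is an assumption.

```python
from enum import Enum


class Action(Enum):
    ENTER_DATA = "go to enter data"
    ASK_SUBMITTER = "ask the submitter for more data"
    CHECK_PUBLICATION = "look in publication and/or ask for more data"
    CONTACT_REGIONAL_GROUP = "contact regional work group"


def triage(has_minimum: bool, has_desired: bool,
           followed_up: bool = False) -> Action:
    """Return the next step for a newly received dataset.

    has_minimum -- level a) metadata are present
    has_desired -- level b) metadata are present
    followed_up -- the submitter was already asked and did not reply,
                   or replied with limited information (hypothetical flag)
    """
    if has_minimum and has_desired:
        return Action.ENTER_DATA
    if not has_minimum:
        # Assumption: if asking did not produce the minimum metadata,
        # this falls under "any other problem".
        if followed_up:
            return Action.CONTACT_REGIONAL_GROUP
        return Action.ASK_SUBMITTER
    # Minimum present, desired missing.
    if followed_up:
        return Action.ENTER_DATA
    return Action.CHECK_PUBLICATION


print(triage(True, False).value)
```

Writing the rule out this way makes the one case the text leaves open explicit: what happens when the minimum metadata are still missing after the submitter has been asked.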
Before entering the data:
A) Check whether age-depth models (uncal. and cal.) are present. If A = False, create a simple age-depth model and/or contact the age-depth group.
Add the author of the age-depth model to the database.
Harmonise the taxa. If a problem occurs, ask the Taxonomy group.
Accuracy-control check: After age-depth models are obtained and the taxonomy is harmonized, produce a percentage diagram and send it to the author for approval. (This will give the author or the submitter the possibility to compare the submitted dataset with the dataset that is going to be entered into the database.)
If the submitter/author approves, the data can be imported into the original database.
If new sites have been added to the original database within a month, update the database at Medias and add a note to the ‘new sites’ webpage or the wiki. Also make a new downloadable version available (if possible, also one in Access).
Through his work with the EPD and GPD, John Keltner has located and corrected many mistakes in the metadata, for which we are very grateful. However, many mistakes or omissions in important metadata remain, and efforts should be made to correct them or add the missing metadata.
Every error correction needs to be documented.
Ask the Taxonomy group to fix existing errors.
1) If available, check the publication and correct.
If 1) = False, contact the Mapping and accuracy group where appropriate, or the Regional contact. If the Regional group is contacted, cc the Mapping group.
These errors are more serious and if possible the data contributor or author should be contacted. In cases where this is not possible contact the Mapping and accuracy group and/or Regional contact.
In the first instance only the contact person is contacted – after one week the whole group is contacted. If no reply after two weeks send question to the wiki.
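The escalation rule above can be expressed as a small function of elapsed time. This is a sketch under an assumption the text does not settle: both the one-week and the two-week limits are counted from the date of the first contact. The function name `escalation_target` and the returned labels are hypothetical.

```python
from datetime import date, timedelta


def escalation_target(first_contact: date, today: date) -> str:
    """Return who should be handling an unanswered question by now.

    Assumption: both time limits run from the first contact date.
    """
    elapsed = today - first_contact
    if elapsed < timedelta(weeks=1):
        return "contact person"
    if elapsed < timedelta(weeks=2):
        return "whole group"
    return "wiki"


# Example: a question first sent on 1 March, still unanswered on 10 March.
print(escalation_target(date(2024, 3, 1), date(2024, 3, 10)))
```

If the two weeks were instead meant to start when the whole group is contacted, the second threshold would become three weeks from the first contact.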
(1) Note: If missing, these fields will be added by the age-depth-model working group. (2) Note: The lists of choices are not yet in place.