Differences

This shows you the differences between two versions of the page.

database_structure [2008/07/10 14:44]
giesie
database_structure [2015/06/23 17:22] (current)
Line 1: Line 1:
-====== Keeping track of developments of pollen databases ====== 
  
-In a new development the North American and Global pollen databases are being reshaped to hold a wider range of data types spanning the last 5.3 million years. The new database is called Neotoma and its initial development is funded by a grant from the U.S. National Science Foundation Geoinformatics program.
-As these developments are of interest to the EPD, Simon Brewer was able to attend a recent meeting of the principal investigators of this project and the following short report is drawn from his notes of the meeting.
+======= Database structure =======
+
  
-===The Neotoma system===
-The Neotoma database represents a number of quite important differences in the use and functioning of a pollen database.
-The change in the system of tables is mainly to allow different data types to coexist. This also has the advantage of streamlining a certain number of tables from the original database.
-The second major change in Neotoma is a move away from local copies of databases, managed by individual data managers, to a single centralised database. This will be hosted on a server and can be accessed and interrogated via the internet. However, it is intended that individual components of the database are still managed by a local data steward, but remotely.
+====== Aim ======
+**The EPD is set up as a [[http://en.wikipedia.org/wiki/Relational_database |relational database]] consisting of a large number of tables to allow querying the data in complex ways and to minimise the storage space required. The aim of the database structure support group is to find and maintain a table structure and database management system that are adequate and up to date for the widest possible use of the EPD. The group is maintaining and fostering links to other palaeo-environmental databases like the ALPine Pollen DAtaBAse (ALPADABA) housed in Bern (Switzerland), [[Neotoma]], the [[http://medias.obs-mip.fr/apd/ |African Pollen Database (APD)]], [[PANGAEA]] and [[http://www.bugscep.com/|BugsCEP]].
+**
  
-==Data stewards==
-While the database will be maintained on the central server, the different data types and regions will be managed by a data steward. This person will have the responsibility for uploading new data and maintaining and correcting the existing content. The role of the data steward is therefore not very different from a current data manager, with the exception that data is sent to a remote server, rather than into a local copy of the database.
+----
+
  
-==Website== 
-The Neotoma website will be the main access portal for the majority of users of Neotoma. The website is still under development,​ and should be available by the IPC meeting in Bonn. At present, users may select sites by choosing the type of data, the geographical region and/or the time window of interest. The choice is made using a shopping basket approach, which allows the selection of one or many sites. The goal is to select a standard set of the most common queries for users of the data.  
-While the Neotoma website will be the main portal for access to this data, it has also been agreed that external applications may have access to the database. This means that existing websites and applications that use a pollen database can be adapted to use Neotoma. 
  
-==Standalone version==
-A standalone version of the database would be made available for download for power users, i.e. those who will need to query the database in more complicated ways. This version will most probably be available as an SQL Server database that may be queried using MS Access.
+**The Publishing Network for Geoscientific & Environmental Data [[PANGAEA]] offers to support the EPD**
  
-==Calibrated age-depth models==
-The new database has a number of modifications to tables containing information about dating and chronologies, allowing all types of dates to be stored in the same table. The database also accepts age-depth models in radiocarbon, calibrated and varve years. A sequence may also have more than one default chronology, but only one default per type of age control.
-A table is present which stores relative ages (Relative Chronology), and which may be used in age-depth models. This will contain ages attributed to a variety of controls, including archaeological time scales and geological time scales.
-The queries of Neotoma via the website will be, by default, in calibrated ages. In order to use information from sites that only have radiocarbon chronologies a conversion table will be used. This is not intended to replace the establishment of an age-depth model based on calibrated dates, but to allow quick exploration of the existing data.
+----
+
+
+
  
  
 +**We are keeping track of developments of pollen databases and had a closer look at the [[Neotoma]] database**
  
  
  
-====== Report from the Aix meeting of 4 ======
+
+===== Review of the existing EPD table structure =====
    
 Walter Finsinger, Simon Brewer, Thomas Giesecke and Basil Davis met in Aix between May 29 and 31. We reviewed the EPD table structure, discussed working protocols with Michelle Leydet and helped John Keltner to correct mistakes in the database. We are grateful to John, Michelle and Valérie Andrieu for advice in discussions and support.
Line 36: Line 28:
  
 //This report attempts to summarize the major discussions from the work meeting and suggests a few guidelines or protocols. None of the items are set in stone but they should serve as a basis for discussion.//
- 
-===== Revision of the database structure ===== 
  
  
 We first reviewed the **Paradox table structure** (Fig. 1) that the EPD is currently held in and identified fields that have not been used, items that should be combined or items that should be added.
  
-{{epd_tables.jpg|}}
+{{epd_tables.jpg?900x600}}
  
  Fig. 1: //Table structure and relationships of the most important EPD Paradox tables (here imported in Access).//
Line 55: Line 45:
 Table ‘Entity’: IsCore, IsSect, IsSSamp could be combined into a single field or combined with Descriptor.
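
The idea of combining IsCore, IsSect and IsSSamp can be sketched as follows. This is only an illustration under stated assumptions: that the three flags hold 'Y' or 'N', that they are mutually exclusive (which would need checking against the data), and that the codes 'CORE', 'SECT' and 'SSAMP' are acceptable; none of these details come from the EPD itself.

```python
# Hypothetical mapping: the flag values ('Y'/'N') and the codes
# 'CORE', 'SECT', 'SSAMP' are assumptions made for this sketch.
FLAG_TO_CODE = (("IsCore", "CORE"), ("IsSect", "SECT"), ("IsSSamp", "SSAMP"))


def combine_flags(row):
    """Return one code for an entity row, or None if no flag is set.

    Raises ValueError when more than one flag is set, so that
    ambiguous rows are reviewed by hand rather than silently merged.
    """
    set_codes = [code for field, code in FLAG_TO_CODE if row.get(field) == "Y"]
    if len(set_codes) > 1:
        raise ValueError(f"ambiguous flags: {set_codes}")
    return set_codes[0] if set_codes else None
```

Rows that raise an error would show whether the flags really are exclusive before any of the three fields is dropped.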
  
-Table ‘Entity’: It would be good to add an identifier for a single publication that should be cited when using the dataset. The database should hold references to many publications that are describing the dataset, but when citing many records it is often only possible to cite one publication. Ideally, the person who submits data to the DB should indicate the reference to the publication that should be cited whenever the dataset is used.
+Table ‘Entity’: It would be good to add an identifier for a single publication that should be cited when using the dataset. The database should hold references to many publications that are describing the dataset, but when citing many records it is often only possible to cite one publication. Ideally, the person who submits data to the DB should indicate the reference to the publication that should be cited whenever the dataset is used. However, one single publication is in some cases not sufficient: (1) Sometimes a diagram is published in two parts in different publications (e.g., Late Glacial in one and Holocene in another). (2) Sometimes the C14 dates, or part of them, are published in a later publication, separate from the original publication with the pollen diagram.
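
One way to allow for cases where a single publication is not sufficient is to mark the publications to cite in a link table rather than in a single field on ‘Entity’. The sketch below uses SQLite from Python purely for illustration; all table and column names (Entity, Publication, EntityPublication, CiteWhenUsed) and the placeholder citations are hypothetical and are not the EPD's actual structure.

```python
import sqlite3

# All table and column names are hypothetical, chosen for illustration.
SCHEMA = """
CREATE TABLE Entity (EntityId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Publication (PublId INTEGER PRIMARY KEY, Citation TEXT NOT NULL);
CREATE TABLE EntityPublication (
    EntityId INTEGER NOT NULL REFERENCES Entity(EntityId),
    PublId INTEGER NOT NULL REFERENCES Publication(PublId),
    CiteWhenUsed INTEGER NOT NULL DEFAULT 0,  -- 1 = cite whenever the dataset is used
    PRIMARY KEY (EntityId, PublId)
);
"""


def build_demo():
    """Create an in-memory database with one entity and placeholder data."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.execute("INSERT INTO Entity VALUES (1, 'Example site')")
    con.executemany(
        "INSERT INTO Publication VALUES (?, ?)",
        [
            (1, "Placeholder A: Late Glacial part of the diagram"),
            (2, "Placeholder B: Holocene part of the diagram"),
            (3, "Placeholder C: other paper describing the site"),
        ],
    )
    # Publications 1 and 2 are flagged for citation; 3 is only a reference.
    con.executemany(
        "INSERT INTO EntityPublication VALUES (?, ?, ?)",
        [(1, 1, 1), (1, 2, 1), (1, 3, 0)],
    )
    return con


def citations_for(con, entity_id):
    """Return the citations flagged for use with the given entity."""
    rows = con.execute(
        "SELECT p.Citation FROM EntityPublication ep "
        "JOIN Publication p ON p.PublId = ep.PublId "
        "WHERE ep.EntityId = ? AND ep.CiteWhenUsed = 1 "
        "ORDER BY p.PublId",
        (entity_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

With this layout the database still holds every publication describing a dataset, while the flag keeps the list of publications to cite short, usually one and occasionally two.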
  
 Table ‘Entity’: The variables ‘IceThickCM’ and ‘C14DepthAdj’ may be deleted and the data that is present could be stored in a free text ‘notes’ field in an appropriate table.
Line 88: Line 78:
  
  
-==== Data entry: ====
+** Data entry: **
  
  
Line 97: Line 87:
 c) Maximum possible
  
-==== When receiving a new dataset: ====
+** When receiving a new dataset: **
  
  
Line 108: Line 98:
 For any other problem, contact the regional work group.
  
-==== Before entering the data: ====
+** Before entering the data: **
  
  
Line 136: Line 126:
 Ask Taxonomy group to fix existing errors.
  
-=== Metadata mistakes: ===
+
+** Metadata mistakes: **
  
  
Line 213: Line 204:
  
  
-~~DISCUSSION~~
+
+**You have to be logged in for write access**
 + 
 + 
database_structure.1215693861.txt.gz · Last modified: 2015/06/25 16:07 (external edit)