North Central Research Station
Data Publishing Standards & Process
Data documentation
In early 2005 the Station began documenting and publishing the research data it generates. The two primary purposes of this new activity are to:
Data paper: reviewed metadata (data documentation), the data set(s) generated by the research, links to existing data sources used by the research, and any other documentation the author believes would help readers understand the study. Other documentation can include field site photos, lab notes, supplementary data analysis material, and video clips.

Data set documentation
From a user perspective, any publishing solution has two major components: standards for documenting the data and standards for formatting the data. The documentation standard describes the metadata needed to understand why a data set was collected, what is in the data set, and how the data were generated. The Federal Government uses the Federal Geographic Data Committee (FGDC; http://www.fgdc.gov) metadata standard when dealing with spatial data. For data that are not purely spatial, the Station has chosen as its metadata standard a superset of the FGDC standard defined by the National Biological Information Infrastructure (NBII; http://www.nbii.gov). Although the NBII standard was designed for biological research, its structure is flexible enough for physical and social sciences research as well.

Data set formatting
The National Academy of Sciences recommends that archived data remain accessible for at least 20 years. Popular data formats like Microsoft Excel spreadsheets seem unlikely to enjoy that sort of longevity. To maximize the opportunities for data re-use in the science community, the Station prefers archival formats that are usable across many computing platforms. While the particular data format(s) employed will vary by data set, spreadsheet-like data sets will generally be stored in XML and comma-delimited text. When text is used to store numbers, interview transcripts, and similar content, it will be encoded as UTF-8. UTF-8 broadens the range of characters that can be represented and is more cross-platform-friendly than ASCII. GIS layers will generally be stored in an ESRI transport format until support for Geography Markup Language (GML) becomes more widespread.

Software tools
Metadata software
Metavist is a software program for creating NBII-compliant metadata. It is currently available for the 32-bit Windows platform (Windows 2000 and Windows XP). The software can be ordered on CD at no charge. The CD contains the software; Microsoft’s .NET Framework version 1.1; the user manual; the source code (Visual Basic .NET); and a collection of supplemental material, including examples.

Version note: The most current version is 1.3, dated May 2006. It incorporates a number of minor bug fixes and one new feature, HTML export. In version 1.0, a browser could work with a style sheet (like NBII_classic.xsl) to display the XML-formatted metadata document. The drawback was that any paragraph structure present in the XML was lost in the display conversion. A style sheet is still required to determine how the XML is converted to HTML, but the HTML generated by the program retains the paragraph structure, which makes the metadata document easier to read.

If you have previously installed version 1.0 of Metavist, you can order the new CD or simply download this zip file (3 MB), which contains the three files needed to install the program. The installation places a ReadMe file in the program directory; it contains more information about the bug fixes and how to use the HTML export feature.

Whether you download the file or order the CD, you can check that the Metavist.msi file is authentic by calculating its MD5 checksum. The correct checksum is 0d9e0d8a72e06f87b69e37fbb03662a7.

XML schema for NBII metadata
An XML schema can be used to validate an XML document’s structure or simply to understand that structure. This NBII schema was created by the Station’s archivist to assist this work. NBII currently uses an older Document Type Definition (DTD) to describe the XML structure of metadata documents. XML that validates under the schema is expected to be valid under the DTD as well, because the schema describes metadata documents in more detail and so is slightly more restrictive than the DTD.

Style sheet for NBII metadata
This style sheet translates the XML version of FGDC or NBII metadata into a form suitable for viewing in a Web browser, so readers do not need to understand the XML tags that XML-aware browsers display when no style sheet is employed. The sheet is based on the “FGDC classic” sheet and conforms to XSLT 1.0.
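
The MD5 check described above for Metavist.msi can be sketched in a few lines of Python. This is only an illustration, not a tool supplied by the Station: the function names and the file path are hypothetical, and the only value taken from this page is the published checksum.

```python
import hashlib

# Checksum published on this page for Metavist.msi.
EXPECTED_MD5 = "0d9e0d8a72e06f87b69e37fbb03662a7"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so the whole installer need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 digest with the expected value, ignoring case."""
    return md5_of_file(path).lower() == expected.strip().lower()


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the file was saved.
    installer = "Metavist.msi"
    try:
        ok = matches_published_checksum(installer)
        print("checksum matches" if ok else "checksum does NOT match")
    except FileNotFoundError:
        print(f"{installer} not found")
```

A mismatch means the file differs from the one the checksum was calculated for, for example because the download was incomplete or the file was altered, and the file should not be installed.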
USDA Forest Service - North Central Research Station