USDA Forest Service
 

North Central Research Station
1992 Folwell Avenue
St. Paul, MN 55108

(651) 649-5000


Data Publishing Standards & Process

Data documentation
Data formatting
Software tools
Contact us


In early 2005 the Station began documenting and publishing the research data it generates. The two primary purposes of this new activity are to:

  • promote the transparency of our research results by providing the data underlying the published conclusions
  • enhance the reuse of data collected at significant taxpayer expense and researcher effort
This page will help you understand what this process entails and how you can use our published data in your research. If you are interested in the reference materials we used, take a look at our reading list.

Data paper: reviewed metadata (data documentation), data set(s) generated by the research, links to existing data sources used by the research, and other documentation the author believes would facilitate understanding of the study.
Other documentation can be field site photos, lab notes, supplementary data analysis material, video clips, etc.


Data set documentation

From a user perspective, any publishing solution has two major components: standards for documenting the data and standards for formatting the data. The documentation standard describes the metadata needed to understand why a data set was collected, what is in the data set, and how the data were generated. The Federal Government uses the Federal Geographic Data Committee (FGDC; http://www.fgdc.gov) metadata standard when dealing with spatial data. For data that are not purely spatial, the Station has chosen a superset of the FGDC standard, defined by the National Biological Information Infrastructure (NBII; http://www.nbii.gov), as its metadata standard. Although the NBII standard was designed for biological research, its structure is flexible enough for use with physical and social science research as well.



Data set formatting

The National Academy of Sciences recommends that archived data be accessible for at least 20 years. Popular data formats like Microsoft Excel spreadsheets seem unlikely to enjoy that sort of longevity. To maximize the opportunities for data reuse in the science community, the Station prefers to use an archival format usable across many computing platforms. While the particular data format(s) employed will vary by data set, spreadsheet-like data sets will generally be stored in XML and comma-delimited text. When text is used for numbers, interview transcripts, etc., the text encoding will be UTF-8. Use of UTF-8 broadens the range of characters that can be represented and is more cross-platform-friendly than use of ASCII. GIS layers will generally be stored in an ESRI transport format until support for Geography Markup Language (GML) becomes more widespread.
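As a rough illustration of the comma-delimited, UTF-8 text format described above, the following Python sketch writes and re-reads a small table. The rows, column names, and function names are hypothetical and are not taken from any Station data set; the only point is that the encoding is stated explicitly rather than left to the platform default.

```python
import csv
import os
import tempfile

# Hypothetical sample rows; the column names are illustrative only.
ROWS = [
    {"site": "Forêt A", "species": "Acer saccharum", "dbh_cm": "31.4"},
    {"site": "Site B", "species": "Picea glauca", "dbh_cm": "18.9"},
]


def write_utf8_csv(path, rows):
    """Write a list of dicts as comma-delimited text encoded in UTF-8."""
    fieldnames = list(rows[0].keys())
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_utf8_csv(path):
    """Read a UTF-8 comma-delimited file back into a list of dicts."""
    with open(path, "r", encoding="utf-8", newline="") as handle:
        return [dict(row) for row in csv.DictReader(handle)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "example.csv")
        write_utf8_csv(target, ROWS)
        print(read_utf8_csv(target) == ROWS)
```

Naming the encoding on both the write and the read side is the design choice that matters here: the non-ASCII character in the first row survives the round trip regardless of the operating system's default encoding.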



Software tools

Metadata software

Metavist is a software program for creating NBII-compliant metadata. It is currently available for the 32-bit Windows platform (Windows 2000 and Windows XP). The software can be ordered on CD at no charge. The CD contains the software; Microsoft’s .NET Framework version 1.1; the user manual; the source code (Visual Basic .NET); and a collection of supplemental material, including examples.

Version note: The most current version is 1.3, dated May 2006.

The new version incorporates a number of minor bug fixes and a new feature: HTML export. In version 1.0, a browser could work with a style sheet (like NBII_classic.xsl) to display the XML-formatted metadata document. The drawback was that any paragraph structure present in the XML was lost in the display conversion. While a style sheet is still required to determine how to convert the XML to HTML, the HTML generated by the program retains the paragraph structure, making the metadata document easier to read.

If you have previously installed version 1.0 of Metavist, you can order the new CD or simply download this zip file (3 MB), which contains the three files needed to install the program. The installation places a ReadMe file in the program directory. This file contains more information about the bug fixes and how to use the HTML export feature.

Whether you download or order the CD, you can check that the Metavist.msi file is authentic by calculating its MD5 checksum. The correct checksum is 0d9e0d8a72e06f87b69e37fbb03662a7.
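One way to perform this check is sketched below in Python, using the standard `hashlib` module. The function names and the assumed file location (`Metavist.msi` in the current directory) are illustrative; the expected value is the checksum quoted above.

```python
import hashlib
from pathlib import Path

# Checksum published on this page for Metavist.msi.
EXPECTED_MD5 = "0d9e0d8a72e06f87b69e37fbb03662a7"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so the whole installer need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Compare the computed digest with the expected one, ignoring case."""
    return md5_of_file(path).lower() == expected.lower()


if __name__ == "__main__":
    installer = Path("Metavist.msi")  # assumed location of the downloaded file
    if installer.exists():
        ok = matches_published_checksum(installer)
        print("checksum OK" if ok else "checksum MISMATCH")
    else:
        print(f"{installer} not found")
```

If the script reports a mismatch, the file differs from the one described on this page and should be downloaded or ordered again before installing.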

XML schema for NBII metadata

An XML schema can be used to validate an XML document’s structure or simply to understand that structure. This NBII schema was created by the Station’s archivist to assist this work. NBII is currently using an older Document Type Definition (DTD) to describe the XML structure of metadata documents. XML validated under the schema is expected to be valid under the DTD as well, because the schema describes metadata documents in more detail and so is slightly more restrictive than the DTD.
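For readers who only want to explore the structure of a metadata document, a minimal Python sketch follows. It does not validate against the schema: Python's standard library has no XML Schema validator, so a third-party library would be needed for that step. The sketch only confirms that a document is well-formed and lists its element paths. The sample document is a hypothetical fragment that borrows a few FGDC-style element names for illustration; it is not a complete or valid metadata record.

```python
import xml.etree.ElementTree as ET

# Hypothetical, incomplete fragment using FGDC-style element names.
SAMPLE = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <title>Example data set</title>
      </citeinfo>
    </citation>
    <descript>
      <abstract>Illustrative abstract text.</abstract>
    </descript>
  </idinfo>
</metadata>"""


def element_paths(xml_text):
    """Return the path of every element, in document order.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    root = ET.fromstring(xml_text)
    paths = []

    def walk(element, prefix):
        path = f"{prefix}/{element.tag}"
        paths.append(path)
        for child in element:
            walk(child, path)

    walk(root, "")
    return paths


if __name__ == "__main__":
    for path in element_paths(SAMPLE):
        print(path)
```

Listing the paths this way gives a quick outline of which sections a metadata document contains before reading the schema itself.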

Style sheet for NBII metadata

This style sheet translates the XML version of FGDC or NBII metadata into a form suitable for viewing in a Web browser, so that readers do not need to understand the XML tags that XML-aware browsers display when no style sheet is employed. This sheet is based on the “FGDC classic” sheet and conforms to XSLT 1.0.



Contact us

Dave Rugg, NC Data Archivist
651-649-5173

 

USDA Forest Service - North Central Research Station
Last Modified: Monday, 21 March 2005

