Shanan Peters edited this page Apr 29, 2017 · 14 revisions

Introduction

This wiki contains instructions, templates, demos, and educational materials for working with the GeoDeepDive library of machine-readable documents.

Table of Contents

GeoDeepDive Basics

Basic information on GeoDeepDive goals, developers, and its underlying infrastructure can be found at GeoDeepDive.org.

Application Template

The application template provides a common set of conventions to get you started building applications with the GeoDeepDive infrastructure and is primarily intended to streamline interaction with collaborators. It does not contain any logic for entity or data extraction, or any other functionality of DeepDive. It is only an interface to a document library and processing infrastructure that can enable these sorts of activities.

Resources | Description
Getting Started | Instructions on how to clone, set up, develop, and test applications on your personal infrastructure.
Supported languages | A description of the languages currently supported by the GeoDeepDive infrastructure.
Configuring the application | Instructions on how to configure applications for high-throughput computing infrastructure.

Available Data Products

Descriptions of available derived products provided by GeoDeepDive.

Resources | Description
Available data products | A list of the data products currently available.
Stanford NLP Poses Key | Definitions of the Stanford NLP part-of-speech ("poses") abbreviations, which tag the part of speech of each word in processed GeoDeepDive document sentences.
Stanford NLP Paths Key | Definitions of the Stanford NLP typed-dependency ("dep paths") abbreviations, which define the grammatical relationships between words in processed GeoDeepDive document sentences.
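
As a rough illustration of how the two keys above are used together, the Python sketch below decodes one tagged sentence. The sentence, its tags, and the field names `words`, `poses`, `dep_paths`, and `dep_parents` are hand-written assumptions for this example, not actual GeoDeepDive output, and the two small lookup tables cover only the handful of abbreviations used here. Parent indices are assumed to be 1-based, with 0 marking the root.

```python
# Tiny excerpts of the two keys (Penn Treebank POS tags and Stanford
# typed dependencies). Only the abbreviations used below are included.
POS_KEY = {
    "NNS": "noun, plural",
    "NN": "noun, singular or mass",
    "VBP": "verb, non-3rd person singular present",
    "IN": "preposition or subordinating conjunction",
    "DT": "determiner",
}
DEP_KEY = {
    "nsubj": "nominal subject",
    "root": "root of the sentence",
    "prep": "prepositional modifier",
    "det": "determiner",
    "pobj": "object of a preposition",
}


def describe(words, poses, dep_paths, dep_parents):
    """Pair each word with its decoded POS tag, dependency label, and head word.

    dep_parents is assumed to hold 1-based indices into words, with 0
    meaning the word is the root (an assumption made for this sketch).
    """
    rows = []
    for i, word in enumerate(words):
        parent = dep_parents[i]
        head = words[parent - 1] if parent > 0 else "ROOT"
        pos = POS_KEY.get(poses[i], "unknown tag")
        dep = DEP_KEY.get(dep_paths[i], "unknown dependency")
        rows.append((word, pos, dep, head))
    return rows


# Hypothetical, hand-tagged sentence used only for illustration.
words = ["Stromatolites", "occur", "in", "the", "formation"]
poses = ["NNS", "VBP", "IN", "DT", "NN"]
dep_paths = ["nsubj", "root", "prep", "det", "pobj"]
dep_parents = [2, 0, 2, 5, 3]

for row in describe(words, poses, dep_paths, dep_parents):
    print(row)
```

Reading the first row, "Stromatolites" is a plural noun acting as the nominal subject of "occur". The full abbreviation lists are in the two key pages referenced above.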

Tutorials and Demos

The following are working demos, educational exercises, or tutorials about using GeoDeepDive or supporting software.

Resources | Description
Stromatolites Demo | An abbreviated demonstration of the stromatolites application (a full version is available here). The demo searches for stromatolite occurrences in the geologic literature, using a Python script and GeoDeepDive data from five USGS Open Report documents (included). The search terms can be readily changed (e.g., "mountain", "Mexico") to show how this script can serve as a building block in any text-mining operation that uses data from the GeoDeepDive library.
ePANDDA Lab Exercise | A lab exercise for the University of Wisconsin's Quantitative Paleobiology course that covers basic text mining and machine learning in GeoDeepDive.
R Tutorial | Resources for developing basic R skills.
Git Tutorial | Resources for developing git and GitHub skills.
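
The kind of term search the Stromatolites Demo describes can be sketched in a few lines of Python. This is not the demo's actual script: the function name `find_mentions`, the `(docid, sentid, words)` tuple layout, and the sample sentences are all assumptions made for illustration. Matching is a simple case-insensitive substring test on each token, so "stromatolite" also matches "Stromatolites".

```python
def find_mentions(sentences, terms):
    """Return (docid, sentid, matched_terms) for each sentence containing a term.

    sentences: iterable of (docid, sentid, words) tuples, where words is a
               list of tokens (a hypothetical layout for this sketch).
    terms:     search terms; matched case-insensitively as substrings of tokens.
    """
    wanted = [t.lower() for t in terms]
    hits = []
    for docid, sentid, words in sentences:
        tokens = [w.lower() for w in words]
        matched = sorted({t for t in wanted if any(t in tok for tok in tokens)})
        if matched:
            hits.append((docid, sentid, matched))
    return hits


# Hypothetical sentences standing in for processed document data.
sentences = [
    ("doc1", 1, ["Stromatolites", "occur", "in", "the", "dolomite", "."]),
    ("doc1", 2, ["The", "mountain", "is", "steep", "."]),
    ("doc2", 1, ["No", "fossils", "were", "found", "."]),
]

print(find_mentions(sentences, ["stromatolite"]))
print(find_mentions(sentences, ["mountain", "Mexico"]))
```

Swapping the list of terms is the only change needed to search for something else, which is the sense in which such a script can act as a building block for other text-mining tasks.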