Infomap Project

There are currently no further fixes or releases planned for the Infomap NLP software (though if you would like to contribute some, please feel free to get involved). SemanticVectors is a more recent open-source package for creating semantic vector models. It is easier to install and is under more active development. We encourage users who run into trouble installing or using Infomap NLP to try the SemanticVectors package, though we will continue to try to answer questions posted to the Infomap NLP mailing lists.

Infomap NLP Software

An Open-Source Package for Natural Language Processing

Project Summary / Download Page

The Infomap NLP Software package uses a variant of Latent Semantic Analysis (LSA) on free-text corpora to learn vectors representing the meanings of words in a vector space known as WordSpace. It indexes the documents in the corpora it processes, and can use the resulting model for information retrieval and for computing word-word semantic similarity.
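
As a rough illustration of the word-vector idea described above, the following Python sketch builds co-occurrence vectors from a toy corpus and compares words by cosine similarity. It is not the Infomap implementation (which is written in C) and does not use its API: the function names, the window size, and the three-sentence corpus are all assumptions made for this example, and the dimensionality-reduction step that LSA applies to the co-occurrence matrix is omitted for brevity.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions.

    Hypothetical helper for illustration only; a real LSA-style model would
    also reduce these sparse counts to a lower-dimensional space.
    """
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented for this example.
corpus = [
    "the doctor treated the patient",
    "the nurse treated the patient",
    "the bank approved the loan",
]
vectors = cooccurrence_vectors(corpus)

# Words that occur in similar contexts receive similar vectors.
print(cosine(vectors["doctor"], vectors["nurse"]))
print(cosine(vectors["doctor"], vectors["loan"]))
```

In this toy corpus, "doctor" and "nurse" appear in identical contexts, so their similarity is higher than that of "doctor" and "loan". On a real corpus the raw counts would be far sparser, which is part of why LSA-style models project them into a smaller space before comparing words.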

The Infomap software is implemented in C and can efficiently process large corpora. It has already been used on the British National Corpus; New York Times, AP, and Wall Street Journal newswire corpora; a collection of medical abstracts; the OHSUMED corpus; and other corpora. The software has been successfully compiled and run under Solaris 7 (SunOS 5.7), Red Hat Linux 9, Debian Linux 3.0 ("woody"), and Cygwin. It should work under other Unix variants with minimal adaptation.

Future releases may include a version of the software that can process parallel bilingual corpora and a web interface for convenient information retrieval and query tuning.

To download the Infomap software, visit the project summary page. You can also browse the documentation online and see research papers relating to the software. (The papers describe both the algorithms the software uses and experiments that have been performed with it.)

The Infomap NLP Software is produced by the Infomap Project, a project of the Computational Semantics Lab at Stanford University's Center for the Study of Language and Information. The Infomap Project and the Computational Semantics Lab are under the direction of Prof. Stanley Peters.