Description Logic Classifier

Refresh 0.02 on 6/13

We're making available an alpha release of the open-source Ontylog classifier, with a SAX-based parser for input in Ontylog XML syntax. It produces a file listing the direct super concept relationships.

To date this has been tested on the NCI Thesaurus and a SNOMED file in KRSS format.

To run the classifier, perform the following steps:

  1. Download the iec.jar file.
  2. Download the Ontylog DTD from the NCI FTP site and place it in the same directory as the file you wish to classify.
  3. Create an Ontylog XML file or download the NCI Thesaurus.
  4. Run using the command (Java 1.5 required):
    java -Xmx512M -jar iec-0.02.jar ontylog_input_file.xml direct_sups_output_file.txt

To use the KRSS parser, which creates an Ontylog XML file, run the following (the ";" classpath separator shown is for Windows; use ":" on Unix-like systems):

    java -Xmx512M -cp iec-0.02.jar;jatha-2.8.jar com.mays.importer.KrssParser input_file.krss output_file.xml

Notes:

The parser is non-validating and currently only handles resolving references by name.
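As a rough illustration of what "non-validating" means with the standard Java SAX API, here is a minimal sketch. It is not the classifier's own parser: the ElementCounter class and the element names in the sample are hypothetical, chosen only for the example. Note that many non-validating parsers still read the external DTD named in a DOCTYPE declaration, which is presumably why step 2 asks for the DTD to sit next to the input file.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: counts element names in an XML stream using a
// non-validating SAX parser. Not part of the classifier.
class ElementCounter extends DefaultHandler {
    // Element name -> number of start tags seen so far (sorted by name).
    final Map<String, Integer> counts = new TreeMap<String, Integer>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Integer n = counts.get(qName);
        counts.put(qName, n == null ? 1 : n + 1);
    }

    static Map<String, Integer> count(InputSource in) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Non-validating is the default; it is set here only to make the point explicit.
        factory.setValidating(false);
        SAXParser parser = factory.newSAXParser();
        ElementCounter handler = new ElementCounter();
        parser.parse(in, handler);
        return handler.counts;
    }

    public static void main(String[] args) throws Exception {
        // Made-up element names, not the real Ontylog vocabulary.
        String xml = "<terminology><concept><name>A</name></concept>"
                   + "<concept><name>B</name></concept></terminology>";
        // prints {concept=2, name=2, terminology=1}
        System.out.println(count(new InputSource(new StringReader(xml))));
    }
}
```

Because the parser does not validate, a document that violates the DTD is still accepted as long as it is well-formed XML.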

The output file is tab-delimited, with column one containing the concept name and column two the direct super concept name. If a concept has multiple direct super concepts, there will be one line for each of them. There is a designated super concept name of TOP. If a concept has TOP as a direct super concept, it is a root of the terminology.
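To make the output format concrete, here is a minimal sketch of reading such a file and listing the roots. It assumes only what is described above (two tab-separated columns, one line per direct super concept, TOP marking roots); the DirectSups class and its method names are hypothetical and not part of the classifier.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical reader for the direct-supers output file.
class DirectSups {
    // Returns concept name -> list of its direct super concept names.
    static Map<String, List<String>> read(Reader in) throws IOException {
        Map<String, List<String>> sups = new TreeMap<String, List<String>>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            String[] cols = line.split("\t");
            if (cols.length < 2) {
                continue; // skip blank or malformed lines
            }
            List<String> list = sups.get(cols[0]);
            if (list == null) {
                list = new ArrayList<String>();
                sups.put(cols[0], list);
            }
            list.add(cols[1]);
        }
        return sups;
    }

    // A concept is a root if TOP is one of its direct super concepts.
    static List<String> roots(Map<String, List<String>> sups) {
        List<String> roots = new ArrayList<String>();
        for (Map.Entry<String, List<String>> e : sups.entrySet()) {
            if (e.getValue().contains("TOP")) {
                roots.add(e.getKey());
            }
        }
        return roots;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DirectSups direct_sups_output_file.txt");
            return;
        }
        Reader in = new FileReader(args[0]);
        try {
            System.out.println(roots(read(in)));
        } finally {
            in.close();
        }
    }
}
```

For an input with the lines "A TOP", "B A", "B C", and "C TOP" (tab-separated), concept B maps to the two super concepts A and C, and the roots are A and C.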

The classifier simply runs in memory and exits. We're working on a persistence layer.

The KRSS parser uses Jatha, a Java library that implements a fairly large subset of Common Lisp.

Resources:

NCI Thesaurus: See the README. Be sure to get the XML format and the Ontylog DTD.

Download the classifier

Release notes:

Version 0.02

  • Bug: Close the output file on termination (yikes).
  • Bug: Create ids when none are present.
  • Added the KRSS parser.

Plans: (comments welcome)

Finish persistence layer.

Clean up and release source.

Resolve references by code and id.