More on PyLucene

October 26th, 2004 at 1:48 pm by Andi Vajda under chandlerdb

The PyLucene project was announced on June 24th. It started at version 0.5 to signal that it was roughly halfway to supporting the entire set of Lucene APIs and capabilities.

A lot has happened since then. As of this writing, PyLucene is at version 0.8.2 and supports roughly 80% of the Lucene 1.4.2 APIs. Most of the remaining ones are expert APIs meant for extending Lucene’s stock classes. The goal for version 1.0 is to support the entire set of APIs, including the expert extension ones!

The memory management issues were figured out. Python reference-counts the objects it manages, whereas Java memory is garbage collected. In addition, the libgcj garbage collector does not keep track of Java objects outside of its realm. This means that Java object instances wrapped by SWIG and returned to Python are not tracked by the Java garbage collector. SWIG is not aware of the garbage collection issue since it views the Java objects as C++ objects, as that is how they are presented by GCJ’s Compiled Native Interface (CNI). To work around these issues, I had to tweak SWIG’s Python object release code and wrap every Java instance returned to Python with a Java PythonRef instance that manages a reference count to the Java object instance. A global Java hashtable of these PythonRef instances prevents the wrapped objects from being garbage collected by Java before they are due.
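
To make the scheme concrete, here is a minimal pure-Python analogy of it. This is not PyLucene's code: the real PythonRef and hashtable live on the Java side, and the names JavaObject, PythonRefTable, acquire and release are made up for illustration. The sketch only shows the idea of pinning an object in a global table while a reference count is above zero.

```python
import gc
import weakref


class JavaObject:
    """Stand-in for a Java instance owned by a garbage-collected runtime."""

    def __init__(self, name):
        self.name = name


class PythonRefTable:
    """Hypothetical analogue of the global hashtable of PythonRef instances.

    Each entry holds a strong reference to the object plus a count of how
    many wrappers on the Python side still point at it.
    """

    def __init__(self):
        self._refs = {}  # id(obj) -> [obj, count]

    def acquire(self, obj):
        # Called when a wrapper for obj is handed to Python.
        entry = self._refs.setdefault(id(obj), [obj, 0])
        entry[1] += 1
        return id(obj)

    def release(self, key):
        # Called from the wrapper's release code on the Python side.
        entry = self._refs[key]
        entry[1] -= 1
        if entry[1] == 0:
            # Last wrapper is gone: unpin so the collector may reclaim it.
            del self._refs[key]

    def count(self, key):
        entry = self._refs.get(key)
        return entry[1] if entry else 0


table = PythonRefTable()
obj = JavaObject("searcher")
probe = weakref.ref(obj)      # lets us observe whether obj is still alive

key = table.acquire(obj)      # first wrapper handed out
table.acquire(obj)            # second wrapper for the same instance
count_while_pinned = table.count(key)

del obj                       # no direct references remain
gc.collect()
alive_while_pinned = probe() is not None

table.release(key)
table.release(key)            # count drops to zero, entry is removed
gc.collect()
collected_after_release = probe() is None
```

The table plays the role of the collector's root set: as long as an entry exists, the object is reachable, and dropping the last count is what makes it eligible for collection again.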

Support for providing Python implementations of abstract Java Lucene methods was developed. When the SWIG-based PyLucene glue code recognizes one of these Python implementations, it wraps it with a proper Java extension of the class the methods are intended for. This extension bridge class contains native implementations of the abstract Java methods, which in turn invoke the Python-implemented methods provided.
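
The bridging idea can be sketched in plain Python as well. Again, this is only an analogy under stated assumptions: in PyLucene the bridge is a Java class with native methods, while here an abstract base class stands in for the abstract Java class, and the names TokenSource, TokenSourceBridge, wrap_if_python_impl and Shouter are all hypothetical.

```python
from abc import ABC, abstractmethod


class TokenSource(ABC):
    """Stand-in for an abstract Java Lucene class (hypothetical name)."""

    @abstractmethod
    def next_token(self):
        """Return the next token, or None when exhausted."""

    def collect(self):
        # Concrete "Java-side" logic that relies on the abstract method.
        tokens = []
        while True:
            token = self.next_token()
            if token is None:
                return tokens
            tokens.append(token)


class TokenSourceBridge(TokenSource):
    """Bridge: a proper subclass whose methods forward to a Python object."""

    def __init__(self, py_impl):
        self._py_impl = py_impl

    def next_token(self):
        return self._py_impl.next_token()


def wrap_if_python_impl(obj):
    """Mimic the glue code: wrap objects that merely provide the method."""
    if isinstance(obj, TokenSource):
        return obj  # already a real subclass, nothing to do
    if callable(getattr(obj, "next_token", None)):
        return TokenSourceBridge(obj)
    raise TypeError("object does not implement next_token()")


class Shouter:
    """Plain Python class; it does not inherit from TokenSource."""

    def __init__(self, words):
        self._words = list(words)

    def next_token(self):
        return self._words.pop(0).upper() if self._words else None


source = wrap_if_python_impl(Shouter(["py", "lucene"]))
tokens = source.collect()
```

The caller of collect() only ever sees a TokenSource, which is the point of the bridge: the framework code stays unaware that the implementation is written in another language.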
