Thursday, August 6, 2009

Modan development

As I mentioned in the previous post, I'm developing Modan in Python, with wxPython as the GUI toolkit. My development environment is Eclipse with PyDev. All the morphometric data are stored in a SQLite database file, accessed through Python's sqlite3 module. For image manipulation I use PIL (Python Imaging Library), and for 3D visualization, PyOpenGL. NumPy makes the matrix handling easier. The whole project is hosted on Google Code at http://code.google.com/p/modan , including issue tracking, a wiki, and source control through SVN. To build the installable file for MS Windows, I use py2exe. What else... that's probably it for the most part. If you have any other questions about technical details, just let me know.
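To give a rough idea of what keeping landmark data in SQLite through the sqlite3 module can look like, here is a minimal sketch. The table names, column names, and function names are all made up for illustration; this is not Modan's actual schema.

```python
import sqlite3


def create_schema(conn):
    """Create two hypothetical tables: one row per specimen, one row per landmark."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS specimen ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS landmark ("
        " specimen_id INTEGER NOT NULL REFERENCES specimen(id),"
        " seq INTEGER NOT NULL,"
        " x REAL NOT NULL,"
        " y REAL NOT NULL,"
        " PRIMARY KEY (specimen_id, seq))"
    )


def save_specimen(conn, name, coords):
    """Insert a specimen and its 2D landmarks; return the new specimen id."""
    cur = conn.execute("INSERT INTO specimen (name) VALUES (?)", (name,))
    specimen_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO landmark (specimen_id, seq, x, y) VALUES (?, ?, ?, ?)",
        [(specimen_id, i, x, y) for i, (x, y) in enumerate(coords)],
    )
    conn.commit()
    return specimen_id


def load_landmarks(conn, specimen_id):
    """Return the landmarks of one specimen as (x, y) tuples, in digitizing order."""
    rows = conn.execute(
        "SELECT x, y FROM landmark WHERE specimen_id = ? ORDER BY seq",
        (specimen_id,),
    ).fetchall()
    return [(x, y) for x, y in rows]
```

Keeping the landmark order in an explicit `seq` column matters here, because landmark-based methods compare the first landmark of one specimen with the first landmark of another, and so on.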

Tuesday, August 4, 2009

Current status of Modan

As of today, Aug 4th 2009, Modan is still in its early stages. It has been almost a year and a half since I first started this project, but I couldn't spend much time on it. Recently I found some spare time, and thought it would be good to make Modan available to the public before the summer break ends. Still, I'm far from presenting this as a ready-to-use software product. That said, Modan can do quite a few things now.

  1. Data acquisition - You can digitize 2D landmark data from an image file. To make digitizing easier, you can adjust brightness/contrast, and zoom and pan the image. Wireframe and baseline definition can be done at the same time. Calibration is also possible: put a scale in the picture when you photograph a specimen, and the coordinates can be calibrated accordingly. You can also enter the coordinates manually. If you already have digitized data in MS Excel files or .tps files, you can import those files.
  2. Data transformation - Once the data are acquired, you may want to export them in a different format for use with IMP, SPSS, R, or whatever. In the export dialog, you can choose among the available export formats. Data are transformed on the fly during the actual export. You can select which superimposition method to use when exporting: Procrustes superimposition (generalized least squares), Bookstein registration, sliding baseline registration, or the experimental RFTRA. Bookstein and sliding baseline registration require a baseline to be defined. If you want traditional lengths, you have to define a wireframe for the dataset beforehand.
  3. Analysis - At the moment, only principal component analysis is implemented. Transformation (or superimposition) is 'transparent' to the user: you just select the superimposition method with radio buttons, and the choice is instantly reflected in the viewer control as well as in the PCA result plot.
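As an illustration of why Bookstein registration needs a baseline, here is a minimal sketch of the idea in plain Python. It assumes the common convention of sending the two baseline landmarks to (0, 0) and (1, 0); other conventions exist, and the function name and parameters are hypothetical rather than taken from Modan's source.

```python
def bookstein_registration(landmarks, base_start=0, base_end=1):
    """Translate, rotate and scale 2D landmarks so that the baseline
    landmarks end up at (0, 0) and (1, 0).

    landmarks  -- list of (x, y) tuples
    base_start -- index of the landmark sent to (0, 0)
    base_end   -- index of the landmark sent to (1, 0)
    """
    # Treat each point as a complex number: dividing by the baseline
    # vector rotates and scales in a single step.
    points = [complex(x, y) for x, y in landmarks]
    origin = points[base_start]
    baseline = points[base_end] - origin
    if baseline == 0:
        raise ValueError("the two baseline landmarks coincide")
    registered = [(p - origin) / baseline for p in points]
    return [(p.real, p.imag) for p in registered]
```

For example, a triangle whose baseline runs from (2, 2) to (4, 2) with a third landmark at (3, 4) comes out with the third landmark halfway along the baseline and one baseline length above it.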

There's no documentation yet. I hope I can write some soon, such as a 'getting started' guide. But right now, I'm focusing more on adding features, fixing bugs, cleaning up the user interface, and so on.

You can download Modan version 0.1.4 from the Google Code website at http://code.google.com/p/modan , though I can't guarantee anything about this version. Right now, I only provide an installable file for MS Windows. Other platforms will be supported in the future. I'm developing Modan in Python with the wxPython toolkit, so it is basically multi-platform, but for now there is some dependency on an MS Windows-specific module. If you are interested, you can check out the source code anytime.

What's my plan now? You can get some idea from the ToDoList wiki page - http://code.google.com/p/modan/wiki/ToDoList . In fact, every time I run Modan to test it, at least two or three things come up that need to be done before I can let anyone other than me actually use it. (Probably I'd better not test it?) At the least, I want to do the following before I "officially" release Modan to the public: implement canonical variate analysis, add 3D visualization of analysis results, and do some heavy UI clean-up (fingers crossed).

Well, I know nobody is reading this blog at the moment. So I don't worry much about anything. :D

Sunday, August 2, 2009

Workflow in morphometrics

In this blog, I'm going to talk about geometric morphometrics and my software, "Modan." Landmark-based morphometrics is the starting point. In this article, I will briefly describe the process, or workflow, in morphometrics.

Typically, 2D landmark-based morphometrics begins with taking pictures of the specimens. After that, the coordinates of all the landmarks are digitized using software such as ImageJ or NIH Image. The digitized coordinates are stored in a text file in tps format. This is the "data acquisition" part. 3D landmark-based morphometrics is not very different. It just involves devices that can measure 3D coordinates, with appropriate software, instead of a digital camera and digitizing software.
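For readers who have not seen a tps file: each specimen starts with an `LM=n` line, followed by n lines of coordinates, and then optional `KEY=value` lines such as `IMAGE=` or `ID=`. The following is a minimal reader for that basic 2D layout. The function name and the returned dictionary layout are my own choices for illustration, and lines it does not recognize (curve or outline data, for instance) are simply skipped.

```python
def parse_tps(text):
    """Parse the basic 2D landmark layout of a tps file.

    Returns a list of dicts, one per specimen, each with
    'landmarks' (list of (x, y) floats) and 'attributes' (dict of strings).
    """
    specimens = []
    current = None
    remaining = 0  # coordinate lines still expected for the current specimen

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if remaining > 0:
            # Inside a landmark block: take the first two numbers as x and y.
            x, y = line.split()[:2]
            current["landmarks"].append((float(x), float(y)))
            remaining -= 1
            continue
        if "=" not in line:
            continue  # not a KEY=value line; ignored in this sketch
        key, _, value = line.partition("=")
        key = key.strip().upper()
        value = value.strip()
        if key == "LM":
            # A new specimen begins with its landmark count.
            current = {"landmarks": [], "attributes": {}}
            specimens.append(current)
            remaining = int(value)
        elif current is not None:
            current["attributes"][key] = value
    return specimens
```

Reading a file is then a matter of passing its contents in, for example `parse_tps(open("data.tps").read())`, where `data.tps` stands for whatever file you have.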

The second part is the "data transformation" part. To perform statistical analysis, the raw data acquired from specimens must be standardized. In most cases, if not all, this process involves translation, rotation and scaling, according to the superimposition method the researcher has chosen. This transformation can be done with software such as CoordGen of IMP, and the result is stored as an x1y1 format file. Sometimes additional editing is needed before going to the next step. For example, to use the data in statistical packages, additional columns describing attributes of a specimen, or of a group of specimens, have to be inserted. This is a very tedious job, and can even be painful if you're doing it over and over again.
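To make "translation, rotation and scaling" concrete, here is a minimal sketch of fitting one 2D specimen onto a reference shape: center both on their centroids, scale both to unit centroid size, then rotate the specimen to minimize the summed squared distances between corresponding landmarks. This is only a pairwise fit; generalized Procrustes superimposition repeats this kind of fit against an iteratively updated mean shape. The function names are hypothetical, and this is not how CoordGen or Modan necessarily implement it.

```python
import math


def center_and_scale(landmarks):
    """Move the centroid to the origin and scale to unit centroid size.

    Points are returned as complex numbers (x + yj) so that a rotation
    is a single multiplication.
    """
    points = [complex(x, y) for x, y in landmarks]
    centroid = sum(points) / len(points)
    centered = [p - centroid for p in points]
    # Centroid size: square root of the summed squared distances to the centroid.
    size = math.sqrt(sum(abs(p) ** 2 for p in centered))
    if size == 0:
        raise ValueError("all landmarks coincide")
    return [p / size for p in centered]


def align_to_reference(landmarks, reference):
    """Least-squares fit of one 2D configuration onto a reference.

    Returns the fitted landmarks as (x, y) tuples.
    """
    if len(landmarks) != len(reference):
        raise ValueError("configurations must have the same number of landmarks")
    shape = center_and_scale(landmarks)
    ref = center_and_scale(reference)
    # The optimal rotation is the unit complex number pointing along
    # sum(ref_i * conj(shape_i)).
    s = sum(r * w.conjugate() for w, r in zip(shape, ref))
    rotation = s / abs(s) if abs(s) > 0 else 1
    return [((w * rotation).real, (w * rotation).imag) for w in shape]
```

A specimen that is just a moved, turned and enlarged copy of the reference should land exactly on the centered and scaled reference after the fit; whatever difference remains between real specimens is the shape difference that goes on to the statistical analysis.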

The next part is statistical analysis. Depending on what kind of analysis you actually want to do, there is a wide variety of choices here: SPSS, SAS, R, some programs in IMP, and so on. Basically, you feed the transformed data into the program and execute specific commands to get the results, as tabulated numbers, bivariate graphs, or something prettier.

Okay, that's a rough sketch of how we do morphometric analysis in general. However, if you think about the whole process, you may notice that more than a few programs are involved in "each" step of the analysis. When I started to learn how to do morphometric analysis a few years ago, I had to consult the email my advisor had sent me and think very hard to figure out what I should do and which program I should use at that moment. Even though I have considered myself a power user for many years, it was a frustrating experience. Some of the programs are nice, some others are terrible to use, which means the overall process was terrible.

As a result, the user experience is inconsistent and very much fragmented. The ability to understand the results and contemplate their meaning is often hindered by the difficulties throughout the process, leaving the researcher glad just to have gotten some results at all. My former advisor once said, "this is the real world. You have to deal with it." Yes, that's right. But still, I think there should be a better way to deal with it.

And that's why I began writing my own program, Modan.