Hello,<br><br>I'm fairly new to GATE, and I'm trying to use it for information extraction through the Batch ML PR. I've got as far as running it (with a corpus annotated with both labels and features, plus a config file). Well, when I say "running" I'm being slightly optimistic. I should rather say that it's now troubleshooting time, and I'd really appreciate some help with a couple of questions (btw, I'm running SVMLibSvmJava).<br>
<br>
- First off, I couldn't find much information on the web about troubleshooting, but maybe I haven't been looking in the right places? Are there any documents (online or offline) that I should consult before bothering this list's users with newbie questions?<br>
<br>
- Second, when running the learner (in EVALUATION mode) I get an ArrayIndexOutOfBoundsException. Not so good, generally... I've been trying to identify the source of the problem, but no luck so far. The configuration file is most likely to blame, but I can't find where the problem is, and I'm running short of ideas.<br>
Below are snippets of the output message from the API and of the log file. It looks like no classes are found, so I'm looking in that direction (the dataset definition in my config file does have a <class/> element in one of its attributes).<br>
<br>Cheers,<br>JP<br><br>************** Output message *************************<br>Pre-processing the 50 documents...<br>Learning starts.<br>For the information about this learning see the log file /home/jeanp/workspace/experiments/GATE-MLToy1/savedFiles/logFileForNLPLearning.save<br>
** Evaluation mode:<br>Kfold k=2, numDoc=50, len=25.<br>java.lang.ArrayIndexOutOfBoundsException<br>**************** end output message **********************<br><br>************** log file ************************<br>04-Nov-2008 16:39:09: <br>
*** A new run starts.<br>04-Nov-2008 16:39:09: <br>The execution time (pre-processing the first document): Tue Nov 04 16:39:09 GMT 2008<br>04-Nov-2008 16:39:09: The learning start at Tue Nov 04 16:39:09 GMT 2008<br>04-Nov-2008 16:39:09: The number of documents in dataset: 50<br>
04-Nov-2008 16:39:09: ** Evaluation mode:<br>04-Nov-2008 16:39:09: K-fold evaluation: k=2<br>04-Nov-2008 16:39:09: Kfold k=2, numDoc=50, len=25.<br>04-Nov-2008 16:39:09: <br>*** Fold 1<br>Number of docs for training: 25<br>
1 Subscription_-_Change_Of_Address-412809.txt.xml_00061<br>2 Subscription_-_Change_Of_Address-412843.txt.xml_00062<br>(...)<br>Number of docs for application: 25<br>1 Subscription_-_Change_Of_Address-412085.txt.xml_00048<br>
2 Subscription_-_Change_Of_Address-411758.txt.xml_00049<br>(...)<br>04-Nov-2008 16:39:10: <br>Filtering starts.<br>04-Nov-2008 16:39:10: Multi to binary conversion.<br>04-Nov-2008 16:39:10: The number of classes in dataset: 0<br>
04-Nov-2008 16:39:10: The learners: SVMLibSvmJava<br>04-Nov-2008 16:39:10: total Number of classes for learning is 0<br>04-Nov-2008 16:39:10: One against others for multi to binary class conversion.<br>04-Nov-2008 16:39:10: One against others for multi to binary class conversion.<br>
Number of classes in model: 0<br>04-Nov-2008 16:39:10: Application time for class: 0ms<br>************** end log file **************************<br>