Interpreting Weka Output
Below is the output from Weka when running the weka.classifiers.trees.J48
classifier with the file $WEKAHOME/data/iris.arff as the training file and no
test file, i.e. using the command:
java weka.classifiers.trees.J48 -t $WEKAHOME/data/iris.arff
Comments on how to interpret the output are given in square brackets ([ ]).
J48 pruned tree
------------------
petalwidth <= 0.6: Iris-setosa (50.0)
petalwidth > 0.6
|   petalwidth <= 1.7
|   |   petallength <= 4.9: Iris-versicolor (48.0/1.0)
|   |   petallength > 4.9
|   |   |   petalwidth <= 1.5: Iris-virginica (3.0)
|   |   |   petalwidth > 1.5: Iris-versicolor (3.0/1.0)
|   petalwidth > 1.7: Iris-virginica (46.0/1.0)
Number of Leaves : 5
Size of the tree : 9
[ Above is the decision tree constructed by the J48
classifier. It shows how the classifier uses the attributes to make a decision.
Each leaf node gives the class that an instance will be assigned to should that
node be reached. The numbers in parentheses after
a leaf node give the number of training instances that reach
that node, followed by how many of those instances are incorrectly classified
as a result. With other classifiers some other output will be given that indicates
how the decisions are made, e.g. a rule set. Note that the tree has been
pruned. An unpruned tree can be produced by using the "-U" option. ]
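To make the reading of the tree concrete, the sketch below rewrites it by hand as nested tests in Python and adds up the leaf counts. It is only an illustration: the function name classify_iris and the LEAF_COUNTS list are made up for this example and are not part of Weka, and the thresholds are copied from the printed tree.

```python
def classify_iris(petallength, petalwidth):
    """Follow the printed J48 tree from the root down to a leaf."""
    if petalwidth <= 0.6:
        return "Iris-setosa"
    if petalwidth <= 1.7:
        if petallength <= 4.9:
            return "Iris-versicolor"
        # petallength > 4.9
        if petalwidth <= 1.5:
            return "Iris-virginica"
        return "Iris-versicolor"
    # petalwidth > 1.7
    return "Iris-virginica"


# (instances reaching the leaf, instances misclassified there), copied from
# the numbers in parentheses after each leaf, top to bottom.
LEAF_COUNTS = [(50.0, 0.0), (48.0, 1.0), (3.0, 0.0), (3.0, 1.0), (46.0, 1.0)]

total_instances = sum(reached for reached, _ in LEAF_COUNTS)
total_errors = sum(wrong for _, wrong in LEAF_COUNTS)

print(classify_iris(petallength=1.4, petalwidth=0.2))
print(total_instances, total_errors)
```

The leaf counts add up to the 150 training instances, and the per-leaf errors add up to the 3 incorrectly classified instances reported in the training-data summary.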
Time taken to build model: 0.05 seconds
Time taken to test model on training data: 0.01 seconds
=== Error on training data ===
Correctly Classified Instances 147 98 %
Incorrectly Classified Instances 3 2 %
Kappa statistic 0.97
Mean absolute error 0.0233
Root mean squared error 0.108
Relative absolute error 5.2482 %
Root relative squared error 22.9089 %
Total Number of Instances 150
[ This gives the error levels when applying the classifier
to the training data it was constructed from. For our purposes the most important
figures here are the numbers of correctly and incorrectly classified instances.
With the exception of the Kappa statistic, the remaining statistics compute
various error measures based on the class probabilities assigned by the tree. ]
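As a rough illustration of what "based on the class probabilities" means, the sketch below recomputes the mean absolute error and root mean squared error by hand. It rests on two assumptions that are not stated in the output: that each leaf predicts the class frequencies of the training instances that reach it, and that the errors are averaged over every instance and every class. The per-leaf class counts are inferred from the leaf counts in the tree and the training confusion matrix; the names LEAVES and probability_errors are hypothetical.

```python
import math

CLASSES = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Training instances per leaf, keyed by actual class (inferred, see lead-in).
LEAVES = [
    {"Iris-setosa": 50},
    {"Iris-versicolor": 47, "Iris-virginica": 1},
    {"Iris-virginica": 3},
    {"Iris-versicolor": 2, "Iris-virginica": 1},
    {"Iris-versicolor": 1, "Iris-virginica": 45},
]


def probability_errors(leaves, classes):
    """Return (mean absolute error, root mean squared error)."""
    abs_sum = 0.0
    sq_sum = 0.0
    terms = 0
    for leaf in leaves:
        size = sum(leaf.values())
        # Assumed prediction: the class frequencies observed in the leaf.
        probs = {c: leaf.get(c, 0) / size for c in classes}
        for actual, count in leaf.items():
            for c in classes:
                target = 1.0 if c == actual else 0.0
                diff = probs[c] - target
                abs_sum += count * abs(diff)
                sq_sum += count * diff * diff
                terms += count
    return abs_sum / terms, math.sqrt(sq_sum / terms)


mae, rmse = probability_errors(LEAVES, CLASSES)
print(f"Mean absolute error     {mae:.4f}")
print(f"Root mean squared error {rmse:.4f}")
```

Under these assumptions the results round to the 0.0233 and 0.108 shown in the training-data summary, which suggests the reading is reasonable for this tree, though it is not taken from Weka's source.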
=== Confusion Matrix ===
  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 49  1 |  b = Iris-versicolor
  0  2 48 |  c = Iris-virginica
[ This shows, for each class, how the instances from that class
were classified. Each row is an actual class and each column is the class
assigned by the classifier. E.g. for class "b", 49
instances were correctly classified but 1 was put into class "c". ]
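The summary figures can be recovered from the confusion matrix itself. The sketch below counts the correct classifications on the diagonal and computes the Kappa statistic using the usual Cohen's kappa formula (observed agreement against the agreement expected by chance from the row and column totals). The function name accuracy_and_kappa is made up for this example.

```python
TRAINING_MATRIX = [
    [50, 0, 0],   # a = Iris-setosa
    [0, 49, 1],   # b = Iris-versicolor
    [0, 2, 48],   # c = Iris-virginica
]


def accuracy_and_kappa(matrix):
    """Return (correct count, accuracy, Cohen's kappa) for a square matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    observed = correct / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected if predictions were independent of the actual class.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return correct, observed, kappa


correct, accuracy, kappa = accuracy_and_kappa(TRAINING_MATRIX)
print(correct, f"{accuracy:.2%}", f"{kappa:.2f}")
```

For the training-data matrix this gives 147 correct instances, 98 % accuracy and a Kappa statistic of 0.97, in line with the summary.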
=== Stratified cross-validation ===
Correctly Classified Instances 144 96 %
Incorrectly Classified Instances 6 4 %
Kappa statistic 0.94
Mean absolute error 0.035
Root mean squared error 0.1586
Relative absolute error 7.8705 %
Root relative squared error 33.6353 %
Total Number of Instances 150
[ This gives the error levels for a 10-fold stratified
cross-validation. The "-x" option can be used to specify a different
number of folds. Here the numbers of correctly and incorrectly classified instances refer to
instances used as test data, and again they are the most important statistics for our purposes. ]
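For readers unfamiliar with the term, "stratified" means that each fold keeps roughly the same class proportions as the full data set. The sketch below shows one simple way to build such folds; it is not Weka's own splitting procedure, and the function name stratified_folds and the seed value are assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=1):
    """Split instance indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the instances of each class out to the folds in turn.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# The iris data has 50 instances of each of its three classes.
labels = ["Iris-setosa"] * 50 + ["Iris-versicolor"] * 50 + ["Iris-virginica"] * 50
folds = stratified_folds(labels)

for test_fold in folds:
    test_set = set(test_fold)
    training = [i for i in range(len(labels)) if i not in test_set]
    # A model would be built on `training` and evaluated on `test_fold` here.
    print(len(training), len(test_fold))
```

Each instance is used as test data exactly once, which is why the cross-validation summary still totals 150 instances.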
=== Confusion Matrix ===
  a  b  c   <-- classified as
 49  1  0 |  a = Iris-setosa
  0 47  3 |  b = Iris-versicolor
  0  2 48 |  c = Iris-virginica
[ This is the confusion matrix for the 10-fold
cross-validation, showing what classification the instances from each class
received when they were used as testing data. E.g. for class "a", 49
instances were correctly classified and 1 instance was assigned to class
"b". ]
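A per-class view of the cross-validation matrix can also be worked out by hand. The sketch below computes, for each class, the fraction of its instances that were classified correctly (a row-wise rate, often called recall) and the fraction of instances assigned to that class that really belong to it (a column-wise rate, often called precision). The function name per_class_rates is made up for this example, and the code assumes every class was predicted at least once.

```python
CLASSES = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

CV_MATRIX = [
    [49, 1, 0],   # a = Iris-setosa
    [0, 47, 3],   # b = Iris-versicolor
    [0, 2, 48],   # c = Iris-virginica
]


def per_class_rates(matrix):
    """Return a (recall, precision) pair for each class, in row order."""
    column_totals = [sum(col) for col in zip(*matrix)]
    rates = []
    for i, row in enumerate(matrix):
        recall = row[i] / sum(row)               # correct / actual members
        precision = row[i] / column_totals[i]    # correct / assigned to class
        rates.append((recall, precision))
    return rates


rates = per_class_rates(CV_MATRIX)
for name, (recall, precision) in zip(CLASSES, rates):
    print(f"{name:16s} recall={recall:.3f} precision={precision:.3f}")
```

For example, class "a" has 49 of its 50 instances classified correctly, while every instance assigned to class "a" really is Iris-setosa.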