Logical rules extracted from data


Computational Intelligence Laboratory | Department of Informatics | Nicolaus Copernicus University
Look at datasets to find more results obtained using different classifiers.

Medical: Appendicitis | Breast cancer (Wisconsin) | Cleveland heart disease | Diabetes | Hepatitis | Hypothyroid | Ljubljana cancer | Statlog Heart
Other: Ionosphere | Iris flowers | Mushrooms | Monk 1 | Monk 2 | Monk 3 | Satellite image dataset (Statlog version) | NASA Shuttle | Sonar | Vowel
Confusion matrices: column labels refer to the true class, row labels to the assigned class; for medical data, healthy cases are listed first.

Appendicitis.

106 vectors, 8 attributes, two classes (88 acute + 18 other),
obtained from Shalom Weiss.
Attribute names: WBC1, MNEP, MNEA, MBAP, MBAA, HNEP, HNEA
Rules found using PVM
Accuracy 89.6% in leave-one-out, 91.5% overall
C1: MNEA > 6600 OR MBPA > 11
C2: ELSE
Rules found using C-MLP2LN, no optimization
Accuracy 89.6% in leave-one-out, 91.5% overall
C1: MNEA > 6650 OR MBPA > 12
C2: ELSE
A second neuron classifies 3 more cases correctly using 2 additional rules, but we treat this as noise rather than an interesting rare case.
Using L-units another set of rules is generated, with 89.6% overall accuracy (11 errors):
C1: WBC1 > 8400 OR MBPA >= 42
C2: ELSE
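These threshold rules translate directly into code. A minimal sketch of the C-MLP2LN rule set (the attribute spelled MBPA in the rules is assumed to be the MBAP attribute from the list above):

```python
def classify_appendicitis(mnea: float, mbpa: float) -> str:
    """C-MLP2LN rules: C1 (acute appendicitis) if MNEA > 6650 or MBPA > 12."""
    if mnea > 6650 or mbpa > 12:
        return "C1"
    return "C2"

print(classify_appendicitis(mnea=7000, mbpa=5))   # C1: MNEA threshold fires
print(classify_appendicitis(mnea=5000, mbpa=10))  # C2: ELSE branch
```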

Confusion matrix:

               Append.  Other
Appendicitis     84      10
Other             1      11
C4.5 generates 3 rules with 91.5% overall accuracy. It may also generate 7 rules with 97.2% accuracy, but this is strong overfitting, with each extra rule covering only 1-2 cases.
Summary of accuracy (%) and references:

Method                   Accuracy  Reference
PVM                      89.6      Weiss, Kapouleas
C-MLP2LN                 89.6±?    our
RIAC rule induction      86.9      Hamilton et al.
CART, C4.5 (dec. trees)  84.9      Weiss, Kapouleas
FSM rules                ???       our (RA)
S.M. Weiss, I. Kapouleas, "An empirical comparison of pattern recognition, neural nets and machine learning classification methods", in: J.W. Shavlik and T.G. Dietterich (eds.), Readings in Machine Learning, Morgan Kaufmann, CA 1990
H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306


Wisconsin breast cancer.

From UCI repository, 699 cases, 9 attributes (1-10 integer values),
two classes, 458 benign (65.5%) & 241 malignant (34.5%).
For 16 instances one attribute is missing.
Attributes: attribute F0 (the id number) from the original database is removed (warning: in some papers the original feature numbers are used).
F1: Clump Thickness 1 - 10
F2: Uniformity of Cell Size 1 - 10
F3: Uniformity of Cell Shape 1 - 10
F4: Marginal Adhesion 1 - 10
F5: Single Epithelial Cell Size 1 - 10
F6: Bare Nuclei 1 - 10
F7: Bland Chromatin 1 - 10
F8: Normal Nucleoli 1 - 10
F9: Mitoses 1 - 10
C-MLP2LN results:
Rules S1: single rule: IF f2 = [1,2] THEN benign ELSE malignant

Confusion matrix (rows = calculated class, columns = original class):

            Benign  Malignant
Benign         417         12
Malignant       41        229

Accuracy: 646 correct (92.42%), 53 errors; Sensitivity=0.9720, Specificity=0.8481
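The single rule and the quoted accuracy can be checked with a short sketch (the confusion-matrix entries are taken from the table above):

```python
def rule_s1(f2: int) -> str:
    """Single rule S1: f2 (uniformity of cell size) in [1,2] -> benign."""
    return "benign" if 1 <= f2 <= 2 else "malignant"

# Accuracy recomputed from the published confusion matrix.
correct = 417 + 229          # diagonal entries
total = 417 + 12 + 41 + 229  # all 699 cases
print(f"accuracy = {correct / total:.4f}")  # accuracy = 0.9242
```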
Rules S2: 5 rules for malignant, overall accuracy of 96%.

R1: f1<6 & f3<4 & f6<2 & f7<5               100%
R2: f1<6 & f4<4 & f6<2 & f7<5               100%
R3: f1<6 & f3<4 & f4<4 & f6<2               100%
R4: f1=[6,8] & f3<4 & f4<4 & f6<2 & f7<5    100%
R5: f1<6 & f3<4 & f4<4 & f6=[2,7] & f7<5    92.3% (36 correct, 3 errors)
ELSE: benign

3 benign cases wrongly classified as malignant and 25 malignant cases wrongly classified as benign.
Rules S3: 4 malignant rules, overall accuracy of 97.7%.

Confusion matrix:

            Benign  Malignant
Benign         447          5
Malignant       11        236

R1: f3<3 & f4<4 & f6<6 & f9=1    99.5% (2 err)
R2: f1<7 & f4<4 & f6<6 & f9=1    99.8% (5 err)
R3: f1<7 & f3<3 & f6<6 & f9=1    99.5% (2 err)
R4: f1<7 & f3<3 & f4<4 & f6<6    99.5% (2 err)
ELSE: benign
Rules S4: optimized rules; 1 benign vector is classified as malignant (by rule 1 and rule 5, the same vector), and the ELSE condition makes 6 errors, giving 99.0% overall accuracy:

R1: f1<9 & f4<4 & f6<2 & f7<5                100%
R2: f1<10 & f3<4 & f4<4 & f6<3               100%
R3: f1<7 & f3<9 & f4<3 & f6=[4,9] & f7<4     100%
R4: f1=[3,4] & f3<9 & f4<10 & f6<6 & f7<8    99.8%
R5: f1<6 & f3<3 & f7<8                       99.8%
ELSE: benign (6 errors)
Other solutions: 100% reliable rules, rejecting 51 cases (7.3% of all vectors).
For malignant class these rules are:

R1: f1<9 & f3<4 & f6<3 & f7<6          100%
R2: f1<5 & f4<8 & f6<5 & f7<10         100%
R3: f1<4 & f3<2 & f4<3 & f6<7          100%
R4: f1<10 & f4<10 & f6=[1,5] & f7<2    100%
For the benign class the rules are: NOT (R5 OR R6 OR R7 OR R8), where:

R5: f1<8 & f3<5 & f7<4                    100%
R6: f1<9 & f4<6 & f6<9 & f7<5             100%
R7: f1<9 & f3<6 & f4<8 & f6<9             100%
R8: f1=6 & f3<10 & f4<10 & f6<2 & f7<9    100%
Summary of results (rules discovered for the whole data set):

Method                  Accuracy %  Reference
C-MLP2LN                99.0
FSM                     98.3        our (RA)
C4.5 (decision tree)    96.0        Hamilton et al.
RIAC (prob. inductive)  95.0        Hamilton et al.
Duch W, Adamczak R, Grąbczewski K, Żal G, Hybrid neural-global minimization method of logical rule extraction. Journal of Advanced Computational Intelligence 3 (5): 348-356.
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306
H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.
Papers on a smaller (569 cases) Wisconsin breast cancer dataset are on the O.L. Mangasarian page.


Cancer (Ljubljana data)

From UCI repository (restricted): 286 instances, 201 no-recurrence-events (70.3%), 85 recurrence-events (29.7%);
9 attributes, between 2-13 values each, 9 missing values
Rules found using PVM: 70% for training, 30% for test
Accuracy 77.4% train, 77.1% test
C1: Involved Nodes > 0 & Degree_malig = 3
C2: ELSE
More accurate C-MLP2LN rules, 78% overall accuracy:
R1: deg_malig=3 & breast=left & node_caps=yes
R2: (deg_malig=3 OR breast=left) & NOT inv_nodes=[0,2] & NOT age=[50,59]
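A sketch of the two rules as code; inv_nodes and age are treated as plain numbers here, which is an assumption, since in the UCI data they are coded as ranges:

```python
def recurrence(deg_malig, breast, node_caps, inv_nodes, age):
    """C-MLP2LN rules R1/R2 for the recurrence-events class."""
    r1 = deg_malig == 3 and breast == "left" and node_caps == "yes"
    r2 = ((deg_malig == 3 or breast == "left")
          and not (0 <= inv_nodes <= 2)      # NOT inv_nodes=[0,2]
          and not (50 <= age <= 59))         # NOT age=[50,59]
    return r1 or r2
```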

Method     Accuracy % (test)  Reference
C-MLP2LN   77.4               our
CART       77.1               Weiss, Kapouleas
PVM        77.1               Weiss, Kapouleas
AQ15       66-72              Michalski et al.
Inductive  65-72              Clark, Niblett
Michalski,R.S., Mozetic,I., Hong,J., & Lavrac,N. (1986). The Multi-Purpose Incremental Learning System AQ15 and its Testing Application to Three Medical Domains. In Proceedings of the Fifth National Conference on Artificial Intelligence, 1041-1045, Philadelphia, PA: Morgan Kaufmann.
Clark,P. & Niblett,T. (1987). Induction in Noisy Domains. In: Progress in Machine Learning (from the Proceedings of the 2nd European Working Session on Learning), 11-30, Bled, Yugoslavia: Sigma Press.
CART & PVM 77.4% train, 77.1% test; S.M. Weiss, I. Kapouleas, "An empirical comparison of pattern recognition, neural nets and machine learning classification methods", in: J.W. Shavlik and T.G. Dietterich (eds.), Readings in Machine Learning, Morgan Kaufmann, CA 1990
Duch W, Adamczak R, Grąbczewski K (1997) Extraction of crisp logical rules using constrained backpropagation networks, International Conference on Artificial Neural Networks (ICNN'97), Houston, 9-12.6.1997, pp. 2384-2389
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306


Hepatitis.

From UCI repository, 155 vectors, 19 attributes (13 binary, the rest integer); the class is the first attribute.
Two classes, 32 die (20.6%), 123 live (79.4%)
Missing values (here F1=class): F4(1), F6(1), F7(1), F8(1), F9(10), F10(11), F11(5), F12(5), F13(5), F14(5), F15(6), F16(29), F17(4), F18(16), F19(67)
C-MLP2LN rules, overall accuracy 88.4%, using F2=age, F13=ascites, F15=bilirubin, F20=histology:
R1: age > 52 & bilirubin > 3.5
R2: histology=yes & ascites=no & age = [30,51]
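As a sketch in code (note an assumption: the text does not state explicitly which class R1/R2 cover; the "die" class is assumed here):

```python
def hepatitis_rules(age, bilirubin, histology, ascites):
    """C-MLP2LN rules R1/R2; assumed to describe the 'die' class."""
    r1 = age > 52 and bilirubin > 3.5
    r2 = histology == "yes" and ascites == "no" and 30 <= age <= 51
    return r1 or r2
```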
C-MLP2LN with linguistic variables from L-units, overall accuracy 96.1%; looks good, but it uses F19=protime, which has missing values in almost half of the cases.
age >= 30 & sex=male & antivirals=no & protime <= 50

Confusion matrix:

        Live  Die
Live     120    3
Die        3   29

Method                Accuracy %  Reference
C-MLP2LN              ???         our
FSM                   90          our
PVM                   ??
CART (decision tree)  82.7
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306


Cleveland heart disease.

From UCI repository, 303 cases, 13 attributes (4 cont, 9 nominal), many missing values.
2 (no, yes) or 5 classes (no, degree 1, 2, 3, 4).
Class distribution: 164 (54.1%) no, 55+36+35+13 yes (45.9%) with disease degree 1-4.

C-MLP2LN simplified rules, 85.5% overall accuracy. Rules for the healthy class:
R1: (thal=0 OR thal=1) & ca=0.0 (88.5%)
R2: (thal=0 OR ca=0.0) & cp NOT 2 (85.2%)
ELSE sick (89.2%)
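The simplified rule set above can be sketched as:

```python
def cleveland(thal, ca, cp):
    """Simplified C-MLP2LN rules for the healthy class; ELSE -> sick."""
    r1 = thal in (0, 1) and ca == 0.0
    r2 = (thal == 0 or ca == 0.0) and cp != 2
    return "healthy" if r1 or r2 else "sick"
```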

Method    Accuracy %  Reference
C-MLP2LN  82.5        RA, estimated?
FSM       82.2        Rafał Adamczak


Statlog Heart disease.

13 attributes (extracted from 75), no missing values.
270 observations (150 + 120), selected from the 303 Cleveland heart disease cases.

Cost matrix:

           Absence  Presence
Absence       0        1
Presence      5        0
Results without the risk matrix:

Method    Accuracy %  Reference
K*        76.7        WEKA, RA
C-MLP2LN  ???         our
1R        71.4        WEKA, RA
T2        68.1        WEKA, RA
FOIL      64.0        WEKA, RA
RBF       60.0        ToolDiag, RA
InductH   58.5        WEKA, RA
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306


Diabetes.

From UCI repository, dataset "Pima Indian diabetes":
2 classes, 8 attributes, 768 instances, 500 (65.1%) healthy, 268 (34.9%) diabetes.
F2 is "Plasma glucose concentration (2 hours oral glucose tolerance) test"
F6 is "Body mass index (weight in kg/(height in m)^2)"
1 rule from SSV, overall accuracy 74.9%, Sensitivity=45.5, Spec.=90.6
IF F#2 > 144.5 then diabetes, else healthy
Rule from C-MLP2LN with L-units, overall accuracy 75%
IF ( F2<=151 AND F6<=47 ) THEN healthy, else diabetes
2 rules from SSV, overall accuracy 76.2%, Sensitivity=60.8, Spec.=84.4
IF F#2 > 144.5 OR (F#2 > 123.5 AND F#6 > 32.55) then diabetes, else healthy
Estimated accuracy (4 leaves in SSV): average of 10 runs, each 10xCV; accuracy 75.2±0.6
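The two-rule SSV classifier is easy to state in code; a sketch (F2 = plasma glucose, F6 = body mass index):

```python
def diabetes(glucose, bmi):
    """SSV rules: F2 > 144.5 OR (F2 > 123.5 AND F6 > 32.55) -> diabetes."""
    if glucose > 144.5 or (glucose > 123.5 and bmi > 32.55):
        return "diabetes"
    return "healthy"

print(diabetes(150, 20))  # diabetes: first condition fires
print(diabetes(130, 30))  # healthy: neither condition fires
```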

Confusion matrix:

           Healthy  Diabetes
Healthy      467       159
Diabetes      33       109
Results from cross-validation:

Method                Accuracy %  Reference
SSV 5 nodes/BF        75.3±4.8    WD, Ghostminer
SSV opt nodes/3CV/BF  74.7±3.5    WD, Ghostminer
SSV opt prune/3CV/BS  74.6±3.3    WD, Ghostminer
SSV opt prune/3CV/BF  74.0±4.1    WD, Ghostminer
SSV opt nodes/3CV/BS  72.9±4.3    WD, Ghostminer
SSV 5 nodes/BF        74.9±4.8    WD, Ghostminer
SSV 3 nodes/BF        74.6±5.2    WD, Ghostminer
CART                  74.5±?      Statlog
DB-CART               74.4±?      Shang & Breiman
ASR                   74.3±?      Ster & Dobnikar
CART                  72.8±?      Ster & Dobnikar
C4.5                  73.0±?      Statlog
Default               65.1±?
C-MLP2LN, overall     75.0±?      our, 4/99
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306


Hypothyroid.

Thyroid data from the UCI repository, dataset "ann-train.data":
3772 learning and 3428 testing examples;
Training: 93+191+3488 or 2.47%, 5.06%, 92.47%
Test: 73+177+3178 or 2.13%, 5.16%, 92.71%
21 attributes (15 binary, 6 continuous); 3 classes
C-MLP2LN rules (all values of continuous features are multiplied here by 1000)
Initial rules:
primary hypothyroid: TSH>6.1 & FTI <65
compensated : TSH > 6 & TT4<149 & On_Tyroxin=FALSE & FTI>64 & surgery=False
ELSE normal
Optimized more accurate rules: 4 errors on the training set (99.89%), 22 errors on the test set (99.36%)
primary hypothyroid: TSH>30.48 & FTI <64.27 (97.06%)
primary hypothyroid: TSH=[6.02,29.53] & FTI <64.27 & T3< 23.22 (100%)
compensated : TSH > 6.02 & FTI=[64.27,186.71] & TT4=[50, 150.5) & On_Tyroxin=no & surgery=no (98.96%)
no hypothyroid : ELSE (100%)
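A sketch of the optimized rules (continuous values scaled by 1000, as in the text; reading the compensated rule's FTI condition as the interval [64.27, 186.71] is an assumption):

```python
def thyroid(tsh, fti, t3, tt4, on_thyroxine, surgery):
    """Optimized C-MLP2LN rules; ELSE -> normal."""
    if tsh > 30.48 and fti < 64.27:
        return "primary hypothyroid"
    if 6.02 <= tsh <= 29.53 and fti < 64.27 and t3 < 23.22:
        return "primary hypothyroid"
    if (tsh > 6.02 and 64.27 <= fti <= 186.71 and 50 <= tt4 < 150.5
            and not on_thyroxine and not surgery):
        return "compensated"
    return "normal"
```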

Method                % train  % test  Reference
C-MLP2LN rules + ASA  99.9     99.36   Rafał/Krzysztof/Grzegorz
CART                  99.8     99.36   Weiss
PVM                   99.8     99.33   Weiss
C-MLP2LN rules        99.7     99.0    Rafał/Krzysztof
3 crisp logical rules using TSH, FTI, T3, on_thyroxine, thyroid_surgery, TT4 give 99.3% of accuracy on the test set.
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306


Other, non-medical data


Iris flowers

150 vectors, 50 in each class: setosa, virginica, versicolor
PL=x3=Petal Length; PW=x4=Petal Width
PVM rules: accuracy 98% in leave-one-out and overall:

Setosa      Petal Length < 3
Virginica   Petal Length > 4.9 OR Petal Width > 1.6
Versicolor  ELSE
C-MLP2LN rules, 7 errors, overall 95.3% accuracy:

Setosa      PL < 2.5   100%
Virginica   PL > 4.8   92%
Versicolor  ELSE       94%
Higher accuracy, overall 98%:

Setosa      PL < 2.9                         100%
Virginica   PL > 4.95 OR PW > 1.65           94%
Versicolor  PL=[2.9,4.95] & PW=[0.9,1.65]    100%
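The 98% rule set, as a sketch in code (the small region with PW < 0.9 is not covered by any rule and is returned as undecided):

```python
def iris(pl, pw):
    """98% C-MLP2LN rule set; PL = petal length, PW = petal width (cm)."""
    if pl < 2.9:
        return "setosa"
    if pl > 4.95 or pw > 1.65:
        return "virginica"
    if 0.9 <= pw <= 1.65:   # PL is already in [2.9, 4.95] at this point
        return "versicolor"
    return None             # region not covered by any rule
```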
100% reliable rules reject 11 vectors (8 virginica and 3 versicolor):

Setosa      PL < 2.9                  100%
Virginica   PL > 5.25 OR PW > 1.85    100%
Versicolor  PL=[2.9,4.9] & PW < 1.7   100%
Summary:

Method            Accuracy  Reference
PVM 1 rule        97.3      Weiss
CART (dec. tree)  96.0      Weiss
FuNN              95.7      Kasabov
NEFCLASS          96.7      Nauck et al.
FuNe-I            96.7      Halgamuge
PVM 2 rules       98.0      Weiss; optimal result, corresponds to about 96% in CV tests
C-MLP2LN          98.0      Duch et al.
SSV               98.0      Duch et al.
Grobian (rough)   100       Browne; overfitting
References:
S.M. Weiss, I. Kapouleas, "An empirical comparison of pattern recognition, neural nets and machine learning classification methods", in: J.W. Shavlik and T.G. Dietterich (eds.), Readings in Machine Learning, Morgan Kaufmann, CA 1990
N. Kasabov, Connectionist methods for fuzzy rules extraction, reasoning and adaptation. In: Proc. of the Int. Conf. on Fuzzy Systems, Neural Networks and Soft Computing, Iizuka, Japan, World Scientific 1996, pp. 74-77
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306
C. Browne, I. Duntsch, G. Gediga, IRIS revisited: A comparison of discriminant and enhanced rough set data analysis. In: L. Polkowski and A. Skowron, eds. Rough sets in knowledge discovery, vol. 2. Physica Verlag, Heidelberg, 1998, pp. 345-368
D. Nauck, U. Nauck and R. Kruse, Generating Classification Rules with the Neuro-Fuzzy System NEFCLASS. Proc. Biennial Conf. of the North American Fuzzy Information Processing Society (NAFIPS'96), Berkeley, 1996
S.K. Halgamuge and M. Glesner, Neural networks in designing fuzzy systems for real world applications. Fuzzy Sets and Systems 65:1-12, 1994

Mushrooms

8124 instances, 4208 (51.8%) edible and 3916 (48.2%) poisonous;
22 attributes (all symbolic): cap shape (6, e.g. bell, conical, flat...), cap surface (4), cap color (10), bruises (2), odor (9), gill attachment (4), gill spacing (3), gill size (2), gill color (12), stalk shape (2), stalk root (7, many missing values), surface above the ring (4), surface below the ring (4), color above the ring (9), color below the ring (9), veil type (2), veil color (4), ring number (3), spore print color (9), population (6), habitat (7).
Together 118 logical input values.
2480 missing values for attribute 11
C-MLP2LN rules:
Disjunctive rules for poisonous mushrooms, from most general to most specific:

No.  Rule                                                                       Accuracy
1    odor=NOT(almond.OR.anise.OR.none)                                          98.52%, 120 poisonous cases missed
2    spore-print-color=green                                                    99.41%, 48 cases missed
3    odor=none.AND.stalk-surface-below-ring=scaly.AND.
     (stalk-color-above-ring=NOT.brown)                                         99.90%, 8 cases missed
4    habitat=leaves.AND.cap-color=white                                         100%
Alternative R4' rule: population=clustered.AND.cap_color=white
These rules involve 6 attributes (out of 22). Rule 1 may be replaced by:
odor = creosote.OR.fishy.OR.foul.OR.musty.OR.pungent.OR.spicy
Rules for edible mushrooms are obtained as negation of the rules given above, for example rule:
Re1: odor=(almond.OR.anise.OR.none).AND.spore-print-color=NOT.green
makes 48 errors, giving 99.41% accuracy on the whole dataset.
Several slightly more complex variations on these rules exist, involving other attributes, such as gill_size, gill_spacing, stalk_surface_above_ring, but the rules given above are the simplest found so far.
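The four disjunctive rules, sketched in code (attribute value names follow the UCI coding informally; parameter names are illustrative):

```python
def poisonous(odor, spore_print, stalk_surface_below, stalk_color_above,
              habitat, cap_color):
    """Rules 1-4 above, from most general to most specific."""
    if odor not in ("almond", "anise", "none"):
        return True                                   # rule 1
    if spore_print == "green":
        return True                                   # rule 2
    if (odor == "none" and stalk_surface_below == "scaly"
            and stalk_color_above != "brown"):
        return True                                   # rule 3
    if habitat == "leaves" and cap_color == "white":
        return True                                   # rule 4
    return False                                      # edible
```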
Other methods:
[1] BRAINNE: 300 rules, > 8000 antecedents, 91%
[2] STAGGER: asymptoted to 95% classification accuracy after reviewing 1000 instances.
[3] HILLARY algorithm, about 95%
References:
Duch W, Adamczak R, Grabczewski K (1996) Extraction of logical rules from training data using backpropagation networks, in: Proc. of the The 1st Online Workshop on Soft Computing, 19-30.Aug.1996, pp. 25-30, available on-line at: http://www.bioele.nuee.nagoya-u.ac.jp/wsc1/
Duch W, Adamczak R, Grabczewski K, Ishikawa M, Ueda H, Extraction of crisp logical rules using constrained backpropagation networks - comparison of two new approaches, in: Proc. of the European Symposium on Artificial Neural Networks (ESANN'97), Bruge, Belgium 16-18.4.1997, pp. 109-114
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306
Schlimmer,J.S. (1987). Concept Acquisition Through Representational Adjustment (Technical Report 87-19), Doctoral disseration, Department of Information and Computer Science, University of California, Irvine.
Iba,W., Wogulis,J., & Langley,P. (1988). Trading off Simplicity and Coverage in Incremental Concept Learning. In Proceedings of the 5th International Conference on Machine Learning, 73-79, Ann Arbor, Michigan: Morgan Kaufmann.

Monk 1

Original rule is: head shape = body shape OR jacket color = red
C-MLP2LN:
100% accuracy with 4 rules + 2 exceptions, 14 atomic formulae.
Other systems: see the original paper:
S. Thrun, J. Bala, E. Bloedorn, I. Bratko, B. Cestnik, J. Cheng, K. De Jong, S. Dzeroski, R. Hamann, K. Kaufman, S. Keller, I. Kononenko, J. Kreuziger, R.S. Michalski, T. Mitchell, P. Pachowicz, B. Roger, H. Vafaie, W. Van de Velde, W. Wenzel, J. Wnek, and J. Zhang.
The MONK's problems: A performance comparison of different learning algorithms. Technical Report CMU-CS-91-197, Carnegie Mellon University, Computer Science Department, Pittsburgh, PA, 1991.

Monk 2

Original rule: exactly two of the six features have their first values
C-MLP2LN:
100% accuracy with 16 rules and 8 exceptions, 132 atomic formulae.
Other systems: see the Thrun et al. original paper: The MONK's problems
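The Monk 2 target concept is simple to express in code; a sketch with a hypothetical feature encoding (any representation of the six attributes and their first values works):

```python
def monk2(features, first_values):
    """Target concept: exactly two of the six features take their first value."""
    return sum(f == v for f, v in zip(features, first_values)) == 2

# Hypothetical encoding: six attributes, first value of each listed explicitly.
print(monk2([1, 1, 2, 3, 2, 2], [1, 1, 1, 1, 1, 1]))  # True (two matches)
print(monk2([1, 2, 2, 3, 2, 2], [1, 1, 1, 1, 1, 1]))  # False (one match)
```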

Monk 3

Original rule:
NOT (body shape = octagon OR jacket color = blue) OR (holding = sword AND jacket color = green)
was corrupted by 5% noise.
C-MLP2LN:
100% accuracy with 33 atomic formulae.
Other systems: see the Thrun et al. original paper: The MONK's problems
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306
Comparison of results:

Method               Monk-1  Monk-2  Monk-3  Remarks
AQ17-DCI             100     100     94.2    Michalski
AQ17-HCI             100     93.1    100     Michalski
AQ17-GA              100     86.8    100     Michalski
Assistant Pro.       100     81.5    100     Monk paper
mFOIL                100     69.2    100     Monk paper
ID5R                 79.7    69.2    95.2    Monk paper
IDL                  97.2    66.2    --      Monk paper
ID5R-hat             90.3    65.7    --      Monk paper
TDIDT                75.7    66.7    --      Monk paper
ID3                  98.6    67.9    94.4    Monk paper
AQR                  95.9    79.7    87.0    Monk paper
CLASSWEB 0.10        71.8    64.8    80.8    Monk paper
CLASSWEB 0.15        65.7    61.6    85.4    Monk paper
CLASSWEB 0.20        63.0    57.2    75.2    Monk paper
PRISM                86.3    72.7    90.3    Monk paper
ECOWEB               82.7    71.3    68.0    Monk paper

Neural methods:
MLP                  100     100     93.1    Monk paper
MLP+reg.             100     100     97.2    Monk paper
Cascade correlation  100     100     97.2    Monk paper
FSM, Gaussians       94.5    79.3    95.5    Duch et al.
SSV                  100     80.6    97.2    Duch et al.
C-MLP2LN             100     100     100     Duch et al.

Other methods:
kNN, with VDM metric --      --      98.0    K. Grudziński

NASA Shuttle

Training set 43500, test set 14500, 9 attributes, 7 classes
Approximately 80% of the data belongs to class 1.
Rules obtained from FSM, without optimization (15 rules, train 99.89%, test 99.81% accuracy):

Class  Rule                                                                   Correct/False
C1     F9 [-14,0]                                                             15043/0
C1     F1 [27,39] and F2 [-16,13]                                             11612/0
C1     F2 [-22,110] and F9 [-14,2]                                            26014/0
C1     F2 [-25,7] and F3 [76,83] and F7 [36,58]                               11648/0
C2     F2 [18,110] and F4 = 0 and F5 [-188,12]                                25/0
C2     F1 [42,59] and F2 [10,50] and F6 [0,59] and F7 [19,37] and F9 [2,24]   10/0
C3     F2 [-118,-22] and F7 [5,71] and F8 [73,103] and F9 [16,86]             58/0
C3     F2 [-318,-31] and F5 [-188,34]                                         82/0
C3     F2 [-177,-19] and F5 [36,72] and F9 [6,54]                             27/0
C3     F2 [-42,-17] and F3 [71,78] and F6 [-14,24] and F9 [2,26]              9/5
C4     F1 [51,67] and F2 [-18,17] and F9 [4,70]                               6063/0
C4     F1 [53,66] and F2 [-60,24] and F4 [-29,30] and F9 [8,266]              5564/0
C4     F2 [-12,18] and F3 [64,79] and F7 [4,26] and F9 [8,82]                 2634/0
C5     F7 [-48,5]                                                             2458/2
C6     F2 [-4821,-386] and F5 [-46,34]                                        9/0
A second set of rules obtained from FSM, without optimization (19 rules, train 99.94%, test 99.87% accuracy):

Class  Rule                                                                   Correct/False
C1     F9 [-14,0]                                                             15043/0
C1     F1 [27,44] and F2 [-20,18]                                             19316/0
C1     F2 [-15,51] and F9 [-14,2]                                             26003/0
C1     F6 [-13839,-41] and F9 [-356,10]                                       36/0
C1     F1 [27,50] and F2 [-27,8] and F9 [-14,24]                              25563/1
C2     F2 [21,110] and F4 [0,0] and F5 [-188,26]                              25/0
C2     F1 [40,57] and F2 [14,59] and F9 [8,22]                                12/0
C3     F2 [-102,-37] and F9 [2,28]                                            46/0
C3     F1 [27,81] and F2 [-138,-24] and F9 [22,88]                            60/0
C3     F2 [-64,-21] and F4 [-2,1] and F6 [-37,27] and F9 [2,48]               67/8
C4     F1 [53,61] and F2 [-46,45] and F7 [1,40] and F9 [18,126]               3805
C4     F1 [53,59] and F2 [-4821,275] and F5 [-188,46] and F7 [-48,28]         3512/2
C4     F1 [53,63] and F2 [-19,26] and F4 [-21,50] and F9 [4,126]              6735/0
C5     F4 [-2044,769] and F7 [-48,2]                                          690/0
C5     F7 [-19,5] and F9 [44,196]                                             1772/0
C5     F6 [-4,4] and F8 [36,38] and F9 [30,38]                                203/0
C6     F2 [-4821,-4475]                                                       3/0
C6     F2 [-4821,-908] and F5 [8,34]                                          9/0
C6     F2 [275,1958] and F7 [1,54]                                            6/2
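All the FSM rules above are conjunctions of interval conditions, so a single generic evaluator suffices; a sketch (the rule encoding is illustrative, mapping 1-based feature numbers to intervals):

```python
def fires(rule, x):
    """True when every interval condition in the rule holds for vector x."""
    return all(lo <= x[i - 1] <= hi for i, (lo, hi) in rule.items())

c5 = {7: (-48, 5)}                   # C5: F7 [-48,5]
c1 = {1: (27, 39), 2: (-16, 13)}     # C1: F1 [27,39] and F2 [-16,13]
sample = [30, 0, 80, 0, 0, 0, -10, 0, 0]
print(fires(c5, sample), fires(c1, sample))  # True True
```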
17 optimized FSM rules make only 3 errors on the training set (99.99% accuracy), leaving 8 vectors unclassified, and make no errors on the test set, leaving 9 vectors unclassified (99.94%). After a very small (0.05%) Gaussian fuzzification of the inputs, only 3 errors and 5 unclassified vectors remain on the training set; on the test set 3 vectors are unclassified and 1 error is made (with the probability of the correct class for this case close to 50%).
32 rules from SSV gave even better results: 100% correct on the training set and only 1 error on the test set.

Satellite image dataset (STATLOG version)

Training 4435, test 2000 cases; 36 semi-continuous [0 to 255] attributes (= 4 spectral bands x 9 pixels in a neighbourhood) and 6 decision classes: 1, 2, 3, 4, 5 and 7 (class 6 has been removed because of doubts about the validity of this class).

Method         % train  % test  Time train  Time test
Dipol92        94.9     88.9    746         111
Radial         88.9     87.9    564         74
CART           92.1     86.2    330         14
Bayesian Tree  98.0     85.3    248         10
C4.5           96.0     85.0    434         1
New ID         93.3     85.0    226         53
Duch W, Adamczak R, Grąbczewski K, A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12 (2001) 277-306

Ionosphere

200 training, 150 test cases, 34 continuous attributes, 2 classes

Method                 Accuracy %  Reference
3-NN + simplex         98.7        our ???
3-NN                   96.7        our
IB3                    96.7        Aha
MLP+BP                 96.0        Sigillito
C4.5                   94.9        Hamilton
RIAC                   94.6        Hamilton
C4 (no windowing)      94.0        Aha
Non-linear perceptron  92.0        Sigillito
FSM + rotation         92.8        our
1-NN                   92.1        Aha
DB-CART                91.3        Shang, Breiman
Linear perceptron      90.7        Sigillito
CART                   88.9        Shang, Breiman
N. Shang, L. Breiman, ICONIP'96, p.133
David Aha: k-NN+C4+IB3 (Aha & Kibler, IJCAI-1989); IB3 parameter settings: 70% and 80% for acceptance and dropping, respectively.
RIAC, C4.5 from: H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.


Sonar

208 cases, 60 continuous attributes, 2 classes
From the CMU benchmark repository

Method              Train %   Test %    Reference
MLP+BP, 12 hidden   99.8±0.1  84.7±5.7  Gorman, Sejnowski
MLP+BP, 24 hidden   99.8±0.1  84.5±5.7  Gorman, Sejnowski
1-NN, Manhattan     --        84.2±1.0  our (KG)
MLP+BP, 6 hidden    99.7±0.2  83.5±5.6  Gorman, Sejnowski
FSM (methodology?)  --        83.6      our (RA)
1-NN, Euclidean     --        82.2±0.6  our (KG)
DB-CART, 10xCV      --        81.8      Shang, Breiman
CART, 10xCV         --        67.9      Shang, Breiman
Our kNN results are from 13xCV; results from 10xCV are quite similar, for example 1-NN Manhattan 84.5±0.9.

Vowel

528 training, 462 test cases, 10 continuous attributes, 11 classes
From the CMU benchmark repository

Method                              Train  Test  Reference
CART-DB, 10xCV on total set         90.0   --    Shang, Breiman
CART, 10xCV on total set            78.2   --    Shang, Breiman
FSM initialization (methodology?)   --     84.4  our (RA)
9-NN                                --     56.5  our ?
Square node network, 88 units       --     54.8  UCI
Gaussian node network, 528 units    --     54.6  UCI
1-NN                                --     54.1  UCI
Radial Basis Function, 528 units    --     53.5  UCI
Gaussian node network, 88 units     --     53.5  UCI
Square node network, 22 units       --     51.1  UCI
Multi-layer perceptron, 88 hidden   --     50.6  UCI
Modified Kanerva Model, 528 units   --     50.0  UCI
Radial Basis Function, 88 units     --     47.6  UCI
Single-layer perceptron, 88 hidden  --     33.3  UCI
N. Shang, L. Breiman, ICONIP'96, p. 133; they used 10xCV instead of the test set.


Other Data

Glass: Shang, Breiman CART 28.6% error, DB-CART 29.4%

DNA-Primate splice-junction gene sequence


Włodzisław Duch, last modification 28.11.2000