Logical rules from data

=[|Logical rules extracted from data]= || [|Computational Intelligence Laboratory] | [|Department of Informatics] | [|Nicolaus Copernicus University]

Look at [|datasets] to find more results obtained using different classifiers.

**Medical**: [|Appendicitis] | [|Breast cancer (Wisconsin)] | [|Cleveland heart disease] | [|Diabetes] | [|Hepatitis] | [|Hypothyroid] | [|Ljubljana cancer] | [|Statlog Heart] |  **Other**: [|Ionosphere] | [|Iris flowers] | [|Mushrooms] | [|Monk 1] | [|Monk 2] | [|Monk 3] | [|Satellite image dataset] (Statlog version) | [|NASA Shuttle] | [|Sonar] | [|Vowel] |

Confusion matrices: column labels refer to the true class, row labels to the assigned class; for medical data healthy cases are listed first.

Appendicitis.
106 vectors, 8 attributes, two classes (88 acute + 18 other), obtained from Shalom Weiss. Attribute names: WBC1, MNEP, MNEA, MBAP, MBAA, HNEP, HNEA.

Rules found using PVM, accuracy 89.6% in leave-one-out, 91.5% overall:
C1: MNEA > 6600 OR MBAP > 11
C2: ELSE

Rules found using C-MLP2LN, no optimization, accuracy 89.6% in leave-one-out, 91.5% overall:
C1: MNEA > 6650 OR MBAP > 12
C2: ELSE
The second neuron classifies 3 more cases correctly using 2 rules, but we treat them as noise rather than as an interesting rare case.

Using L-units another set of rules is generated, with 89.6% overall accuracy (11 errors):
C1: WBC1 > 8400 OR MBAP >= 42
C2: ELSE
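Read as a classifier, such a rule pair is a one-line decision function. A minimal sketch in Python of the PVM rules (MBAP, spelled MBPA in some of the rule listings, is the attribute from the list above; the example values are invented, not dataset rows):

```python
def pvm_rule(mnea, mbap):
    """PVM rules for the appendicitis data: class C1 (acute) if
    MNEA > 6600 OR MBAP > 11, otherwise class C2 (the ELSE branch)."""
    return "C1" if mnea > 6600 or mbap > 11 else "C2"

# Illustrative inputs (made up, not taken from the dataset):
print(pvm_rule(mnea=7200, mbap=5))   # MNEA above threshold -> C1
print(pvm_rule(mnea=4000, mbap=9))   # neither condition holds -> C2
```

The quoted leave-one-out figure is obtained by re-deriving the rule 106 times, each time with one vector held out for testing.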

C4.5 generates 3 rules with overall 91.5% accuracy. It may also generate 7 rules for 97.2% accuracy, but this is strong overfitting, with each rule classifying only 1-2 cases.

Summary of accuracy (%) and references:

S.M. Weiss, I. Kapouleas, "An empirical comparison of pattern recognition, neural nets and machine learning classification methods", in: J.W. Shavlik and T.G. Dietterich, Readings in Machine Learning, Morgan Kaufmann Publ, CA 1990

H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306
 * Confusion matrix: ||  || Append. || Other ||
 * || Appendicitis || 84 || 10 ||
 * || Other || 1 || 11 ||
 * Method || Accuracy || Reference ||
 * PVM || 89.6 || Weiss, Kapouleas ||
 * C-MLP2LN || 89.6±? || our ||
 * RIAC rule induction || 86.9 || Hamilton //et al.// ||
 * CART, C4.5 (dec. trees) || 84.9 || Weiss, Kapouleas ||
 * FSM rules || ??? || our (RA) ||

Wisconsin breast cancer.
From [|UCI repository], 699 cases, 9 attributes (integer values 1-10), two classes: 458 benign (65.5%) and 241 malignant (34.5%). For 16 instances one attribute value is missing. Attributes: from the original database remove F0, the id number (warning: in some papers the original feature numbers are given).
F1: Clump Thickness 1 - 10
F2: Uniformity of Cell Size 1 - 10
F3: Uniformity of Cell Shape 1 - 10
F4: Marginal Adhesion 1 - 10
F5: Single Epithelial Cell Size 1 - 10
F6: Bare Nuclei 1 - 10
F7: Bland Chromatin 1 - 10
F8: Normal Nucleoli 1 - 10
F9: Mitoses 1 - 10

C-MLP2LN results:

Rules S1, single rule: IF f2 = [1,2] THEN benign ELSE malignant.
Confusion matrix (rows: calculated class, columns: original class):
 * ||  || Benign || Malignant ||
 * || Benign || 417 || 12 ||
 * || Malignant || 41 || 229 ||
Accuracy: 646 correct (92.42%), 53 errors; Sensitivity=0.9720, Specificity=0.8481.

Rules S2: 5 rules for malignant, overall accuracy of 96%.
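The figures quoted for rule S1 follow directly from its confusion matrix. A minimal computation (matrix entries copied from above; the sensitivity and specificity values match a row-wise reading of the matrix, which is an inference from the numbers, not stated in the source):

```python
# Confusion matrix for rule S1: rows = assigned class, columns = true class.
#              true benign  true malignant
assigned = {
    "benign":    (417, 12),
    "malignant": (41, 229),
}

correct = assigned["benign"][0] + assigned["malignant"][1]
total = sum(sum(row) for row in assigned.values())
accuracy = correct / total           # 646/699, about 0.9242

# Row-wise predictive values reproduce the quoted sensitivity/specificity:
sensitivity = 417 / (417 + 12)       # about 0.9720
specificity = 229 / (41 + 229)       # about 0.8481
print(f"{accuracy:.4f} {sensitivity:.4f} {specificity:.4f}")
```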

3 benign cases are wrongly classified as malignant and 25 malignant cases wrongly classified as benign.

Rules S3: 4 malignant rules, overall accuracy of 97.7%; rules and confusion matrix:
 * R1 || f1<6 & || f3<4 & || f6<2 & || f7<5 ||  || 100% ||
 * R2 || f1<6 & || f4<4 & || f6<2 & || f7<5 ||  || 100% ||
 * R3 || f1<6 & || f3<4 & || f4<4 & || f6<2 ||  || 100% ||
 * R4 || f1=[6,8] & || f3<4 & || f4<4 & || f6<2 & || f7<5 || 100% ||
 * R5 || f1<6 & || f3<4 & || f4<4 & || f6=[2,7] & || f7<5 || 92.3% (36 correct, 3 errors) ||
 * || ELSE || benign ||  ||   ||   ||   ||
 * Confusion matrix: ||  || Benign || Malignant ||
 * || Benign || 447 || 5 ||
 * || Malignant || 11 || 236 ||

Rules S4, optimized rules: 1 benign vector classified as malignant (rule 1 and rule 5, the same vector). The ELSE condition makes 6 errors, giving 99.00% overall accuracy:
 * R1 || f3<3 & || f4<4 & || f6<6 & || f9=1 ||  || 99.5% (2 err) ||
 * R2 || f1<7 & || f4<4 & || f6<6 & || f9=1 ||  || 99.8% (5 err) ||
 * R3 || f1<7 & || f3<3 & || f6<6 & || f9=1 ||  || 99.5% (2 err) ||
 * R4 || f1<7 & || f3<3 & || f4<4 & || f6<6 ||  || 99.5% (2 err) ||
 * || ELSE || benign ||  ||   ||   ||   ||

Other solutions: 100% reliable rules, rejecting 51 cases (7.3% of all vectors). For the malignant class these rules are:
 * R1 || f1<9 & || f4<4 & || f6<2 & || f7<5 ||  || 100% ||
 * R2 || f1<10 & || f3<4 & || f4<4 & || f6<3 ||  || 100% ||
 * R3 || f1<7 & || f3<9 & || f4<3 & || f6=[4,9] & || f7<4 || 100% ||
 * R4 || f1=[3,4] & || f3<9 & || f4<10 & || f6<6 & || f7<8 || 99.8% ||
 * R5 || f1<6 & || f3<3 & || f7<8 ||  ||   || 99.8% ||
 * || ELSE || benign ||  ||   ||   || (6 errors) ||

For the benign cases the rules are: NOT (R5 OR R6 OR R7 OR R8), where:
 * R1 || f1<9 & || f3<4 & || f6<3 & || f7<6 ||  || 100% ||
 * R2 || f1<5 & || f4<8 & || f6<5 & || f7<10 ||  || 100% ||
 * R3 || f1<4 & || f3<2 & || f4<3 & || f6<7 ||  || 100% ||
 * R4 || f1<10 & || f4<10 & || f6=[1,5] & || f7<2 ||  || 100% ||

 * R5 || f1<8 & || f3<5 & || f7<4 ||  ||   || 100% ||
 * R6 || f1<9 & || f4<6 & || f6<9 & || f7<5 ||  || 100% ||
 * R7 || f1<9 & || f3<6 & || f4<8 & || f6<9 ||  || 100% ||
 * R8 || f1=6 & || f3<10 & || f4<10 & || f6<2 & || f7<9 || 100% ||

Summary of results (rules discovered for the whole data set).

Duch W, Adamczak R, Grąbczewski K, Żal G, [|Hybrid neural-global minimization method of logical rule extraction]. Journal of Advanced Computational Intelligence 3 (5): 348-356.

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306

H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.

Papers on a smaller (569 cases) Wisconsin breast cancer dataset are on the [|O.L. Mangasarian page].
 * Method || Accuracy % || Reference || Rules ||
 * C-MLP2LN || 99.0 ||  ||   ||
 * FSM || 98.3 || our (RA) ||  ||
 * C4.5 (decision tree) || 96.0 || Hamilton //et al.// ||
 * RIAC (prob. inductive) || 95.0 || Hamilton //et al.// ||

Cancer (Ljubljana data)
From [|UCI repository] (restricted): 286 instances, 201 no-recurrence-events (70.3%), 85 recurrence-events (29.7%); 9 attributes with 2-13 values each, 9 missing values.

Rules found using PVM (70% of the data for training, 30% for test), accuracy 77.4% train, 77.1% test:
C1: Involved Nodes > 0 & Degree_malig = 3
C2: ELSE

C-MLP2LN, more accurate rules, 78% overall accuracy:
R1: deg_malig=3 & breast=left & node_caps=yes
R2: (deg_malig=3 OR breast=left) & NOT inv_nodes=[0,2] & NOT age=[50,59]

Michalski R.S., Mozetic I., Hong J., Lavrac N. (1986). //The Multi-Purpose Incremental Learning System AQ15 and its Testing Application to Three Medical Domains//. In: Proceedings of the Fifth National Conference on Artificial Intelligence, 1041-1045, Philadelphia, PA: Morgan Kaufmann.

Clark P., Niblett T. (1987). //Induction in Noisy Domains//. In: Progress in Machine Learning (from the Proceedings of the 2nd European Working Session on Learning), 11-30, Bled, Yugoslavia: Sigma Press.

CART & PVM 77.4% train, 77.1% test; S.M. Weiss, I. Kapouleas, An empirical comparison of pattern recognition, neural nets and machine learning classification methods, in: J.W. Shavlik and T.G. Dietterich, Readings in Machine Learning, Morgan Kaufmann Publ, CA 1990

Duch W, Adamczak R, Grąbczewski K (1997) [|Extraction of crisp logical rules using constrained backpropagation networks], International Conference on Artificial Neural Networks (ICNN'97), Houston, 9-12.6.1997, pp. 2384-2389

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306
 * Method || Accuracy, % test || Reference ||
 * C-MLP2LN || 77.4 || our ||
 * CART || 77.1 || Weiss, Kapouleas ||
 * PVM || 77.1 || Weiss, Kapouleas ||
 * AQ15 || 66-72 || Michalski //et al.// ||
 * Inductive || 65-72 || Clark, Niblett ||

Hepatitis.
From [|UCI repository], 155 vectors, 19 attributes (13 binary, the others integer), class is the first column. Two classes: 32 die (20.6%), 123 live (79.4%).

Missing values (here F1 = class): F4(1), F6(1), F7(1), F8(1), F9(10), F10(11), F11(5), F12(5), F13(5), F14(5), F15(6), F16(29), F17(4), F18(16), F19(67).

C-MLP2LN rules, overall accuracy 88.4%, using F2=age, F13=ascites, F15=bilirubin, F20=histology:
R1: age > 52 & bilirubin > 3.5
R2: histology=yes & ascites=no & age = [30,51]

C-MLP2LN with linguistic variables from L-units, overall accuracy 96.1%; this looks good but uses F19=protime, which has missing values in almost half of the cases:
age >= 30 & sex=male & antivirals=no & protime <= 50


 * Confusion matrix: ||  || Live || Die ||
 * || Live || 120 || 3 ||
 * || Die || 3 || 29 ||

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306
 * Method || Accuracy % || Reference ||
 * C-MLP2LN || ??? || Our ||
 * FSM || 90 || Our ||
 * PVM || ?? ||  ||
 * CART (decision tree) || 82.7 ||  ||

Cleveland heart disease.
From [|UCI repository], 303 cases, 13 attributes (4 continuous, 9 nominal), many missing values. Either 2 classes (no, yes) or 5 classes (no, disease degree 1-4). Class distribution: 164 no (54.1%), 55+36+35+13 yes (45.9%) with disease degree 1-4.

C-MLP2LN simplified rules, 85.5% overall accuracy. Rules for the healthy class:
R1: (thal=0 OR thal=1) & ca=0.0 (88.5%)
R2: (thal=0 OR ca=0.0) & cp NOT 2 (85.2%)
ELSE sick (89.2%)


 * Method || Accuracy % || Reference ||
 * C-MLP2LN || 82.5 || RA, estimated? ||
 * FSM || 82.2 || Rafał Adamczak ||

Statlog Heart disease.
13 attributes (extracted from 75), no missing values. 270 = 150+120 observations selected from the 303 cases of the Cleveland heart data.

Results without the risk matrix.

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306
 * Cost matrix || Absence || Presence ||
 * Absence || 0 || 1 ||
 * Presence || 5 || 0 ||
 * Method || Accuracy % || Reference ||
 * K* || 76.7 || WEKA, RA ||
 * C-MLP2LN || ??? || Our ||
 * 1R || 71.4 || WEKA, RA ||
 * T2 || 68.1 || WEKA, RA ||
 * FOIL || 64.0 || WEKA, RA ||
 * RBF || 60.0 || ToolDiag, RA ||
 * InductH || 58.5 || WEKA, RA ||

Diabetes.
From [|UCI repository], dataset "Pima Indian diabetes": 2 classes, 8 attributes, 768 instances; 500 (65.1%) healthy, 268 (34.9%) diabetes.
F2 is "Plasma glucose concentration (2-hour oral glucose tolerance test)".
F6 is "Body mass index (weight in kg/(height in m)^2)".

1 rule from SSV, overall accuracy 74.9%, Sensitivity=45.5, Specificity=90.6:
IF F2 > 144.5 THEN diabetes, ELSE healthy

Rule from C-MLP2LN with L-units, overall accuracy 75%:
IF (F2 <= 151 AND F6 <= 47) THEN healthy, ELSE diabetes

2 rules from SSV, overall accuracy 76.2%, Sensitivity=60.8, Specificity=84.4:
IF F2 > 144.5 OR (F2 > 123.5 AND F6 > 32.55) THEN diabetes, ELSE healthy

Estimation of accuracy (4 leaves in SSV): average of 10 runs, each 10x CV, accuracy 75.2±0.6.
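The two-rule SSV classifier above depends on only two features, so it can be applied directly. A minimal sketch (the example values are illustrative, not dataset rows):

```python
def ssv_two_rules(f2, f6):
    """Two SSV rules for the Pima data: F2 = plasma glucose,
    F6 = body mass index. Returns 'diabetes' or 'healthy'."""
    if f2 > 144.5 or (f2 > 123.5 and f6 > 32.55):
        return "diabetes"
    return "healthy"

# Illustrative patients (made-up values):
print(ssv_two_rules(f2=150, f6=25))   # glucose alone triggers the rule
print(ssv_two_rules(f2=130, f6=35))   # moderate glucose plus high BMI
print(ssv_two_rules(f2=130, f6=28))   # neither condition holds
```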

Results from cross-validation.

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306
 * Confusion matrix: ||  || Healthy || Diabetes ||
 * || Healthy || 467 || 159 ||
 * || Diabetes || 33 || 109 ||
 * Method || Accuracy % || Reference ||
 * SSV 5 nodes/BF || 75.3±4.8 || WD, Ghostminer ||
 * SSV opt nodes/3CV/BF || 74.7±3.5 || WD, Ghostminer ||
 * SSV opt prune/3CV/BS || 74.6±3.3 || WD, Ghostminer ||
 * SSV opt prune/3CV/BF || 74.0±4.1 || WD, Ghostminer ||
 * SSV opt nodes/3CV/BS || 72.9±4.3 || WD, Ghostminer ||
 * SSV 5 nodes/BF || 74.9±4.8 || WD, Ghostminer ||
 * SSV 3 nodes/BF || 74.6±5.2 || WD, Ghostminer ||
 * CART || 74.5±? || Statlog ||
 * DB-CART || 74.4±? || Shang & Breiman ||
 * ASR || 74.3±? || Ster & Dobnikar ||
 * CART || 72.8±? || Ster & Dobnikar ||
 * C4.5 || 73.0±? || Statlog ||
 * Default || 65.1±? ||  ||
 * C-MLP2LN, overall || 75.0±? || Our, 4/99 ||

Hypothyroid.
Thyroid, from [|UCI repository], dataset "ann-train.data": 3772 learning and 3428 testing examples; 21 attributes (15 binary, 6 continuous); 3 classes.
Training: 93+191+3488, i.e. 2.47%, 5.06%, 92.47%.
Test: 73+177+3178, i.e. 2.13%, 5.16%, 92.71%.

C-MLP2LN rules (all values of continuous features are multiplied here by 1000).

Initial rules:
primary hypothyroid: TSH > 6.1 & FTI < 65
compensated: TSH > 6 & TT4 < 149 & On_Tyroxin=FALSE & FTI > 64 & surgery=FALSE
ELSE normal

Optimized, more accurate rules: 4 errors on the training set (99.89%), 22 errors on the test set (99.36%):
primary hypothyroid: TSH > 30.48 & FTI < 64.27 (97.06%)
primary hypothyroid: TSH=[6.02,29.53] & FTI < 64.27 & T3 < 23.22 (100%)
compensated: TSH > 6.02 & FTI=[64.27,186.71] & TT4=[50,150.5) & On_Tyroxin=no & surgery=no (98.96%)
no hypothyroid: ELSE (100%)
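The optimized rules form an ordered decision list. A sketch in Python (feature values on the same x1000 scale as above; interval bounds are treated as inclusive except the stated TT4 upper bound; the example input is made up):

```python
def thyroid_class(tsh, fti, t3, tt4, on_thyroxine, surgery):
    """Optimized C-MLP2LN rules for the hypothyroid data,
    tried in order; the ELSE branch gives 'normal'."""
    if tsh > 30.48 and fti < 64.27:
        return "primary hypothyroid"
    if 6.02 <= tsh <= 29.53 and fti < 64.27 and t3 < 23.22:
        return "primary hypothyroid"
    if (tsh > 6.02 and 64.27 <= fti <= 186.71 and 50 <= tt4 < 150.5
            and not on_thyroxine and not surgery):
        return "compensated"
    return "normal"

# Illustrative input (not a dataset row):
print(thyroid_class(tsh=40, fti=50, t3=10, tt4=100,
                    on_thyroxine=False, surgery=False))
```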

[|3 crisp logical rules] using TSH, FTI, T3, on_thyroxine, thyroid_surgery and TT4 give 99.3% accuracy on the test set.

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306
 * Method || % training || % test || Reference ||
 * C-MLP2LN rules + ASA || 99.9 || 99.36 || Rafał/Krzysztof/Grzegorz ||
 * CART || 99.8 || 99.36 || Weiss ||
 * PVM || 99.8 || 99.33 || Weiss ||
 * C-MLP2LN rules || 99.7 || 99.0 || Rafał/Krzysztof ||

Iris flowers
150 vectors, 50 in each class: setosa, virginica, versicolor. PL=x3=petal length; PW=x4=petal width.

PVM rules, accuracy 98% in leave-one-out and overall:

 * Setosa || Petal Length <3 ||
 * Virginica || Petal Length >4.9 OR Petal Width >1.6 ||
 * Versicolor || ELSE ||

C-MLP2LN rules, 7 errors, overall 95.3% accuracy:
 * Setosa || PL <2.5 || 100% ||
 * Virginica || PL >4.8 || 92% ||
 * Versicolor || ELSE || 94% ||

Higher accuracy rules, overall 98%:
 * Setosa || PL <2.9 || 100% ||
 * Virginica || PL>4.95 OR PW>1.65 || 94% ||
 * Versicolor || PL=[2.9,4.95] & PW=[0.9,1.65] || 100% ||

100% reliable rules, rejecting 11 vectors (8 virginica and 3 versicolor):
 * Setosa || PL <2.9 || 100% ||
 * Virginica || PL>5.25 OR PW>1.85 || 100% ||
 * Versicolor || PL=[2.9,4.9] & PW<1.7 || 100% ||

Summary:
 * Method || Accuracy || Reference ||
 * PVM 1 rule || 97.3 || Weiss ||
 * CART (dec. tree) || 96.0 || Weiss ||
 * FuNN || 95.7 || Kasabov ||
 * NEFCLASS || 96.7 || Nauck //et al.// ||
 * FuNe-I || 96.7 || Halgamuge ||
 * PVM 2 rules || 98.0 || Weiss; optimal result, corresponds to about 96% in CV tests ||
 * C-MLP2LN || 98.0 || Duch //et al.// ||
 * SSV || 98.0 || Duch //et al.// ||
 * Grobian (rough) || 100 || Browne; overfitting ||

References:

S.M. Weiss, I. Kapouleas, "An empirical comparison of pattern recognition, neural nets and machine learning classification methods", in: J.W. Shavlik and T.G. Dietterich, Readings in Machine Learning, Morgan Kaufmann Publ, CA 1990

N. Kasabov, Connectionist methods for fuzzy rules extraction, reasoning and adaptation. In: Proc. of the Int. Conf. on Fuzzy Systems, Neural Networks and Soft Computing, Iizuka, Japan, World Scientific 1996, pp. 74-77

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306

C. Browne, I. Duntsch, G. Gediga, IRIS revisited: A comparison of discriminant and enhanced rough set data analysis. In: L. Polkowski and A. Skowron, eds. Rough sets in knowledge discovery, vol. 2. Physica Verlag, Heidelberg, 1998, pp. 345-368

D. Nauck, U. Nauck and R. Kruse, Generating Classification Rules with the Neuro-Fuzzy System NEFCLASS. Proc. Biennial Conf. of the North American Fuzzy Information Processing Society (NAFIPS'96), Berkeley, 1996

S.K. Halgamuge and M. Glesner, Neural networks in designing fuzzy systems for real world applications. Fuzzy Sets and Systems 65:1-12, 1994
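The 98% C-MLP2LN rule set for iris can be written as a small ordered classifier. A sketch (the rules are checked in the order listed; vectors matching no rule are labelled undecided; the example values are illustrative):

```python
def iris_rules(pl, pw):
    """C-MLP2LN iris rules (overall 98%): PL = petal length,
    PW = petal width, both in cm."""
    if pl < 2.9:
        return "setosa"
    if pl > 4.95 or pw > 1.65:
        return "virginica"
    if 2.9 <= pl <= 4.95 and 0.9 <= pw <= 1.65:
        return "versicolor"
    return "undecided"

print(iris_rules(1.4, 0.2))   # short petals -> setosa
print(iris_rules(5.5, 2.0))   # long, wide petals -> virginica
print(iris_rules(4.0, 1.3))   # intermediate -> versicolor
```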

Mushrooms
8124 instances, 4208 (51.8%) edible and 3916 (48.2%) poisonous. 22 attributes, all symbolic: cap shape (6 values, e.g. bell, conical, flat...), cap surface (4), cap color (10), bruises (2), odor (9), gill attachment (4), gill spacing (3), gill size (2), gill color (12), stalk shape (2), stalk root (7, many missing values), surface above the ring (4), surface below the ring (4), color above the ring (9), color below the ring (9), veil type (2), veil color (4), ring number (3), spore print color (9), population (6), habitat (7). Together 118 logical input values; 2480 missing values for attribute 11.

C-MLP2LN rules. Disjunctive rules for poisonous mushrooms, from most general to most specific:

 * No. || Rule || Accuracy ||
 * 1 || odor=NOT(almond.OR.anise.OR.none) || 98.52%, 120 poisonous cases missed ||
 * 2 || spore-print-color=green || 99.41%, 48 cases missed ||
 * 3 || odor=none.AND.stalk-surface-below-ring=scaly.AND.(stalk-color-above-ring=NOT.brown) || 99.90%, 8 cases missed ||
 * 4 || habitat=leaves.AND.cap-color=white || 100% accuracy ||

Alternative R4' rule: population=clustered.AND.cap_color=white

These rules involve 6 attributes (out of 22). Rule 1 may be replaced by:
odor = creosote.OR.fishy.OR.foul.OR.musty.OR.pungent.OR.spicy

Rules for edible mushrooms are obtained as the negation of the rules given above; for example the rule:
Re1: odor=(almond.OR.anise.OR.none).AND.spore-print-color=NOT.green
makes 48 errors, giving 99.41% accuracy on the whole dataset.

Several slightly more complex variations of these rules exist, involving other attributes such as gill_size, gill_spacing and stalk_surface_above_ring, but the rules given above are the simplest found so far.

**Other methods:**
[1] BRAINNE: 300 rules, > 8000 antecedents, 91%
[2] STAGGER: asymptoted to 95% classification accuracy after reviewing 1000 instances.
[3] HILLARY algorithm: about 95%

**References:**

Duch W, Adamczak R, Grabczewski K (1996) Extraction of logical rules from training data using backpropagation networks, in: Proc. of the 1st Online Workshop on Soft Computing, 19-30 Aug. 1996, pp. 25-30, available on-line at: http://www.bioele.nuee.nagoya-u.ac.jp/wsc1/

Duch W, Adamczak R, Grabczewski K, Ishikawa M, Ueda H, Extraction of crisp logical rules using constrained backpropagation networks - comparison of two new approaches, in: Proc. of the European Symposium on Artificial Neural Networks (ESANN'97), Bruges, Belgium 16-18.4.1997, pp. 109-114

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306

Schlimmer J.S. (1987). Concept Acquisition Through Representational Adjustment (Technical Report 87-19), Doctoral dissertation, Department of Information and Computer Science, University of California, Irvine.

Iba W., Wogulis J., Langley P. (1988). Trading off Simplicity and Coverage in Incremental Concept Learning. In: Proceedings of the 5th International Conference on Machine Learning, 73-79, Ann Arbor, Michigan: Morgan Kaufmann.
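Rule 1, the most general one, amounts to a single set-membership test on odor. A sketch (odor values spelled as in the text; the assumption here is that a mushroom is called edible whenever the poisonous rule does not fire, which is what the quoted 98.52% figure refers to):

```python
def mushroom_rule(odor):
    """Single most general C-MLP2LN rule for the mushroom data:
    poisonous unless odor is almond, anise or none
    (98.52% accuracy, 120 poisonous cases missed)."""
    return "edible" if odor in {"almond", "anise", "none"} else "poisonous"

print(mushroom_rule("foul"))    # -> poisonous
print(mushroom_rule("almond"))  # -> edible
```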

Monk 1
The original rule is: head shape = body shape OR jacket color = red.

C-MLP2LN: 100% accuracy with 4 rules + 2 exceptions, 14 atomic formulae.

Other systems: see the original paper:

S. Thrun, J. Bala, E. Bloedorn, I. Bratko, B. Cestnik, J. Cheng, K. De Jong, S. Dzeroski, R. Hamann, K. Kaufman, S. Keller, I. Kononenko, J. Kreuziger, R.S. Michalski, T. Mitchell, P. Pachowicz, B. Roger, H. Vafaie, W. Van de Velde, W. Wenzel, J. Wnek, and J. Zhang, [|The MONK's problems: A performance comparison of different learning algorithms]. Technical Report CMU-CS-91-197, Carnegie Mellon University, Computer Science Department, Pittsburgh, PA, 1991.

Monk 2
Original rule: exactly two of the six attributes take their first value.

C-MLP2LN: 100% accuracy with 16 rules and 8 exceptions, 132 atomic formulae.

Other systems: see the Thrun et al. original paper: [|The MONK's problems]
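The Monk-2 target concept is trivial to state in code but hard to express as DNF rules, which is why so many atomic formulae are needed. A sketch, assuming attribute values are coded as integers with 1 as each attribute's first value:

```python
def monk2_target(features, first_values):
    """Monk-2 concept: true exactly when two of the six attributes
    take their first value."""
    hits = sum(f == v for f, v in zip(features, first_values))
    return hits == 2

# With integer coding starting at 1, the first value of every attribute is 1:
print(monk2_target([1, 1, 2, 3, 2, 2], [1] * 6))  # two attributes at 1
print(monk2_target([1, 2, 2, 3, 2, 2], [1] * 6))  # only one
```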

Monk 3
The original rule:
NOT (body shape = octagon OR jacket color = blue) OR (holding = sword AND jacket color = green)
was corrupted by 5% noise.

C-MLP2LN: 100% accuracy with 33 atomic formulae.

Other systems: see the Thrun et al. original paper: [|The MONK's problems]

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306

Comparison of results:


 * Method || Monk-1 || Monk-2 || Monk-3 || Remarks ||
 * AQ17-DCI || 100 || 100 || 94.2 || Michalski ||
 * AQ17-HCI || 100 || 93.1 || 100 || Michalski ||
 * AQ17-GA || 100 || 86.8 || 100 || Michalski ||
 * Assistant Pro. || 100 || 81.5 || 100 || Monk paper ||
 * mFOIL || 100 || 69.2 || 100 || Monk paper ||
 * ID5R || 79.7 || 69.2 || 95.2 || Monk paper ||
 * IDL || 97.2 || 66.2 || -- || Monk paper ||
 * ID5R-hat || 90.3 || 65.7 || -- || Monk paper ||
 * TDIDT || 75.7 || 66.7 || -- || Monk paper ||
 * ID3 || 98.6 || 67.9 || 94.4 || Monk paper ||
 * AQR || 95.9 || 79.7 || 87.0 || Monk paper ||
 * CLASSWEB 0.10 || 71.8 || 64.8 || 80.8 || Monk paper ||
 * CLASSWEB 0.15 || 65.7 || 61.6 || 85.4 || Monk paper ||
 * CLASSWEB 0.20 || 63.0 || 57.2 || 75.2 || Monk paper ||
 * PRISM || 86.3 || 72.7 || 90.3 || Monk paper ||
 * ECOWEB || 82.7 || 71.3 || 68.0 || Monk paper ||
 * Neural methods ||
 * MLP || 100 || 100 || 93.1 || Monk paper ||
 * MLP+reg. || 100 || 100 || 97.2 || Monk paper ||
 * Cascade correlation || 100 || 100 || 97.2 || Monk paper ||
 * FSM, Gaussians || 94.5 || 79.3 || 95.5 || Duch //et al.// ||
 * SSV || 100 || 80.6 || 97.2 || Duch //et al.// ||
 * C-MLP2LN || 100 || 100 || 100 || Duch //et al.// ||
 * Other methods ||
 * kNN, with VDM metric || -- || -- || 98.0 || K. Grudziński ||

NASA Shuttle
Training set 43500 vectors, test set 14500, 9 attributes, 7 classes. Approximately 80% of the data belongs to class 1.

Rules obtained from FSM, without optimization:

F1 [27,39] and F2 [-16,13] F2 [-22,110] and F9 [-14,2] F2 [-25,7] and F3 [76,83] and F7 [36,58] || 15043/0 11612/0 26014/0 11648/0 || F1 [42, 59] and F2 [10,50] and F6 [0,59] and F7 [19,37] and F9 [2,24] || 25/0 10/0 || F2 [-318,-31] and F5 [-188,34] F2 [-177,-19] and F5 [36,72] and F9 [6,54] F2 [-42,-17] and F3 [71,78] and F6 [-14,24] and F9 [2,26] || 58/0 82/0 27/0 9/5 || F1 [53, 66] and F2 [-60,24] and F4 [-29,30] and F9 [8,266] F2 [-12,18] and F3 [64, 79] and F7 [ 4, 26] and F9 [8, 82] || 6063/0 5564/0 2634/0 || <span style="font-family: Arial,Helvetica;">Rules obtained from FSM, without optimization:
 * Class || 15 rules, train 99.89%, test 99.81% accuracy || Correct/False ||
 * C1 || F9 [-14,0]
 * C2 || F2 [18,110] and F4 = 0 and F5 [-188,12]
 * C3 || F2 [-118,-22] and F7 [5,71] and F8 [73,103] and F9 [16,86]
 * C4 || F1 [51, 67] and F2 [-18,17] and F9 [4,70]
 * C5 || F7 [-48, 5] || 2458/2 ||
 * C6 || F2 [-4821,-386] and F5 [-46,34] || 9/0 ||

17 optimized FSM rules make only 3 errors on the training set (99.99% accuracy), leaving 8 vectors unclassified, and make no errors on the test set, leaving 9 vectors unclassified (99.94%). After a very small Gaussian fuzzification of the inputs (0.05%), only 3 errors and 5 unclassified vectors remain on the training set; on the test set 3 vectors are unclassified and 1 error is made (with the probability of the correct class for this case close to 50%). 32 rules from SSV gave even better results: 100% correct on the training set and only 1 error on the test set.
 * Class || 19 rules, train 99.94%, test 99.87% accuracy || Correct/False ||
 * C1 || F9 [-14,0] || 15043/0 ||
 * C1 || F1 [27,44] and F2 [-20,18] || 19316/0 ||
 * C1 || F2 [-15,51] and F9 [-14,2] || 26003/0 ||
 * C1 || F6 [-13839,-41] and F9 [-356,10] || 36/0 ||
 * C1 || F1 [27,50] and F2 [-27,8] and F9 [-14,24] || 25563/1 ||
 * C2 || F2 [21,110] and F4 [0,0] and F5 [-188,26] || 25/0 ||
 * C2 || F1 [40,57] and F2 [14,59] and F9 [8,22] || 12/0 ||
 * C3 || F2 [-102,-37] and F9 [2,28] || 46/0 ||
 * C3 || F1 [27,81] and F2 [-138,-24] and F9 [22,88] || 60/0 ||
 * C3 || F2 [-64,-21] and F4 [-2,1] and F6 [-37,27] and F9 [2,48] || 67/8 ||
 * C4 || F1 [53,61] and F2 [-46,45] and F7 [1,40] and F9 [18,126] || 3805 ||
 * C4 || F1 [53,59] and F2 [-4821,275] and F5 [-188,46] and F7 [-48,28] || 3512/2 ||
 * C4 || F1 [53,63] and F2 [-19,26] and F4 [-21,50] and F9 [4,126] || 6735/0 ||
 * C5 || F4 [-2044,769] and F7 [-48,2] || 690/0 ||
 * C5 || F7 [-19,5] and F9 [44,196] || 1772/0 ||
 * C5 || F6 [-4,4] and F8 [36,38] and F9 [30,38] || 203/0 ||
 * C6 || F2 [-4821,-4475] || 3/0 ||
 * C6 || F2 [-4821,-908] and F5 [8,34] || 9/0 ||
 * C6 || F2 [275,1958] and F7 [1,54] || 6/2 ||
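The Gaussian fuzzification mentioned above replaces each crisp test F in [a, b] by the probability that the test holds when the input is blurred by Gaussian noise of width s, so a conjunctive rule yields a confidence rather than a yes/no answer. A sketch under these assumptions (the function names are ours, and the 0.05% uncertainty would be taken relative to each feature's range):

```python
import math

def soft_membership(x, lo, hi, s):
    """P(x' in [lo, hi]) for x' ~ N(x, s^2): the crisp interval test
    becomes a soft one under Gaussian input uncertainty s."""
    if s == 0:  # crisp limit
        return 1.0 if lo <= x <= hi else 0.0
    cdf = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return cdf((hi - x) / s) - cdf((lo - x) / s)

def soft_rule(x, conditions, s):
    """Confidence of a conjunctive rule: product of condition memberships."""
    p = 1.0
    for f, lo, hi in conditions:
        p *= soft_membership(x[f], lo, hi, s)
    return p

# Deep inside an interval the membership stays ~1; exactly on a border it is 0.5.
print(round(soft_membership(0.0, -14.0, 0.0, 1.0), 3))  # -> 0.5
```

This is why the single misclassified test vector mentioned above gets a correct-class probability close to 50%: it sits near a rule boundary.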

Satellite image dataset (STATLOG version)
Training: 4435 cases, test: 2000 cases; 36 semi-continuous [0, 255] attributes (4 spectral bands x 9 pixels in a neighbourhood) and 6 decision classes: 1, 2, 3, 4, 5 and 7 (class 6 was removed because of doubts about the validity of this class).

Duch W, Adamczak R, Grąbczewski K, [|A new methodology of extraction, optimization and application of crisp and fuzzy logical rules]. IEEE Transactions on Neural Networks 12 (2001) 277-306.
 * Method || % train || % test || Train time [s] || Test time [s] ||
 * Dipol92 || 94.9 || 88.9 || 746 || 111 ||
 * Radial || 88.9 || 87.9 || 564 || 74 ||
 * CART || 92.1 || 86.2 || 330 || 14 ||
 * Bayesian Tree || 98.0 || 85.3 || 248 || 10 ||
 * C4.5 || 96.0 || 85.0 || 434 || 1 ||
 * New ID || 93.3 || 85.0 || 226 || 53 ||

Ionosphere
200 training, 151 test cases, 34 continuous attributes, 2 classes.

N. Shang, L. Breiman, ICONIP'96, p. 133.
David Aha: k-NN+C4+IB3 (Aha & Kibler, IJCAI-1989); IB3 parameter settings: 70% and 80% for acceptance and dropping, respectively.
RIAC, C4.5 from: H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.
 * Method || Accuracy % || Reference ||
 * 3-NN + simplex || 98.7 || Our ??? ||
 * 3-NN || 96.7 || our ||
 * IB3 || 96.7 || Aha ||
 * MLP+BP || 96.0 || Sigillito ||
 * C4.5 || 94.9 || Hamilton ||
 * RIAC || 94.6 || Hamilton ||
 * C4 (no windowing) || 94.0 || Aha ||
 * FSM + rotation || 92.8 || our ||
 * Non-linear perceptron || 92.0 || Sigillito ||
 * 1-NN || 92.1 || Aha ||
 * DB-CART || 91.3 || Shang, Breiman ||
 * Linear perceptron || 90.7 || Sigillito ||
 * CART || 88.9 || Shang, Breiman ||

Sonar
208 cases, 60 continuous attributes, 2 classes. From the [|CMU benchmark repository].

Our kNN results are also from 13xCV; results from 10xCV are quite similar, for example 1-NN Manhattan 84.5±0.9.
 * Method || Train % || Test % || Reference ||
 * MLP+BP, 12 hidden || 99.8±0.1 || 84.7±5.7 || Gorman, Sejnowski ||
 * MLP+BP, 24 hidden || 99.8±0.1 || 84.5±5.7 || Gorman, Sejnowski ||
 * 1-NN, Manhattan ||  || 84.2±1.0 || our (KG) ||
 * MLP+BP, 6 hidden || 99.7±0.2 || 83.5±5.6 || Gorman, Sejnowski ||
 * FSM - methodology ? ||  || 83.6 || our (RA) ||
 * 1-NN Euclidean ||  || 82.2±0.6 || our (KG) ||
 * DB-CART, 10xCV ||  || 81.8 || Shang, Breiman ||
 * CART, 10xCV ||  || 67.9 || Shang, Breiman ||
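The cross-validation estimates quoted in tables like the one above can be reproduced along these lines; the sketch below runs 10-fold CV with a 1-NN Manhattan classifier on synthetic, well-separated data (the data, seed, and fold assignment are illustrative assumptions):

```python
import random

def manhattan(a, b):
    """Manhattan (city-block) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def one_nn(train, x):
    """Label of the training vector closest to x in the Manhattan metric."""
    return min(train, key=lambda vl: manhattan(vl[0], x))[1]

def cv_accuracy(data, k=10, seed=0):
    """Mean k-fold cross-validation accuracy of the 1-NN classifier."""
    data = data[:]
    random.Random(seed).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    accs = []
    for i, test in enumerate(folds):
        train = [v for j, f in enumerate(folds) if j != i for v in f]
        hits = sum(one_nn(train, x) == y for x, y in test)
        accs.append(hits / len(test))
    return sum(accs) / k

# Two well-separated synthetic classes: 1-NN should be nearly perfect here.
data = [((i, i + 1), 0) for i in range(20)] + [((i + 100, i), 1) for i in range(20)]
print(round(cv_accuracy(data), 2))  # -> 1.0
```

On real benchmarks like Sonar the folds are far noisier, hence the ±1 spread reported for the kNN rows.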

Vowel
528 training, 462 test cases, 10 continuous attributes, 11 classes. From the [|CMU benchmark repository].

N. Shang, L. Breiman, ICONIP'96, p. 133, used 10xCV instead of the test set.
 * Method || Train || Test || Reference ||
 * DB-CART, 10xCV on total set || 90.0 ||  || Shang, Breiman ||
 * CART, 10xCV on total set || 78.2 ||  || Shang, Breiman ||
 * FSM initialization, methodology ? ||  || 84.4 || our (RA) ||
 * 9-NN ||  || 56.5 || our ? ||
 * Square node network, 88 units ||  || 54.8 || UCI ||
 * Gaussian node network, 528 units ||  || 54.6 || UCI ||
 * 1-NN ||  || 54.1 || UCI ||
 * Radial Basis Function, 528 units ||  || 53.5 || UCI ||
 * Gaussian node network, 88 units ||  || 53.5 || UCI ||
 * Square node network, 22 units ||  || 51.1 || UCI ||
 * Multi-layer perceptron, 88 hidden ||  || 50.6 || UCI ||
 * Modified Kanerva Model, 528 units ||  || 50.0 || UCI ||
 * Radial Basis Function, 88 units ||  || 47.6 || UCI ||
 * Single-layer perceptron ||  || 33.3 || UCI ||

Other Data
Glass: Shang, Breiman report 28.6% error for CART and 29.4% for DB-CART.

DNA-Primate splice-junction gene sequence

//[|Włodzisław Duch], last modification 28.11.2000//