Classification results

=[|Datasets used for classification]: comparison of results= ||  [|Computational Intelligence Laboratory] | [|Department of Informatics] | [|Nicolaus Copernicus University]


 Before using any new dataset it should be described here!  [|Results from the Statlog project are here].  Logical rules derived for data are here.

**Medical**: Appendicitis | Breast cancer (Wisconsin) | Breast Cancer (Ljubljana) | Diabetes (Pima Indian) | [|Heart disease (Cleveland)] | Heart disease (Statlog version) | Hepatitis | Hypothyroid | Hepatobiliary disorders |  **Other datasets**: Ionosphere | Satellite image dataset (Statlog version) | Sonar | Telugu Vowel | Vowel | Wine | Other data: Glass, DNA |  **More results** for [|Statlog datasets.]

**A note of caution**: comparison of different classifiers is not an easy task. Before ranking methods by the numbers presented in the tables below, please note the following facts.
 * Many results we have collected give only a single number (even results from the [|StatLog project]!), without standard deviation. Since most classifiers may give results that differ by several percent on slightly different data partitions, single numbers do not mean much.
 * Leave-one-out tests have been criticized as a basis for accuracy evaluation; the conclusion is that crossvalidation is safer, cf: Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proc. of the 14th Int. Joint Conference on Artificial Intelligence, Morgan Kaufmann, pp. 1137-1143.
 * Crossvalidation tests (CV) are also not ideal. Theoretically about 2/3 of results should be within a single standard deviation from the average, and 95% of results should be within two standard deviations, so in a 10-fold crossvalidation you should very rarely see results that differ from the average by more than two standard deviations. Running CV several times may also give you different answers. The search for the best estimator continues. Cf: Dietterich, T. (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10 (7), 1895-1924; Nadeau C, Bengio Y. (1999) Inference for the Generalization Error. Tech. rep. 99s-25, CIRANO, J. Machine Learning (Kluwer, in print).
 * Even the best accuracy and variance estimation is not sufficient, since performance cannot be characterized by a single number. It would be much better to provide full Receiver Operating Characteristic (ROC) curves. Combining ROC with variance estimation would be ideal.

Unfortunately this still remains to be done. All we can do now is to collect some numbers in tables.

Our results are usually obtained with the [|GhostMiner] package, developed in our group. Some [|publications with results] are on my page.
[|TuneIT], Testing Machine Learning & Data Mining Algorithms - Automated Tests, Repeatable Experiments, Meaningful Results.
Results of hand-written signs and numbers [|classification are here].
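The crossvalidation caveats in the note of caution can be made concrete with a small sketch. It shows one way to build stratified folds and to report a mean with a standard deviation instead of a single number. The function names `stratified_folds` and `summarize` are hypothetical and not taken from GhostMiner or any other package mentioned on this page; the class counts in the example are those of the Appendicitis data.

```python
import random
import statistics
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0  # running counter so fold sizes stay balanced across classes
    for indices in by_class.values():
        rng.shuffle(indices)
        for index in indices:
            folds[position % k].append(index)
            position += 1
    return folds


def summarize(fold_accuracies):
    """Mean and sample standard deviation of per-fold accuracies."""
    return statistics.mean(fold_accuracies), statistics.stdev(fold_accuracies)


# Example with the Appendicitis class counts (85 + 21 vectors).
labels = [0] * 85 + [1] * 21
folds = stratified_folds(labels, k=10)
print([len(fold) for fold in folds])
```

Repeating the split with different `seed` values and averaging the summaries is presumably what entries marked 10x10CV in the tables refer to; that reading is an assumption, since the page does not define the abbreviation.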

Appendicitis.
106 vectors, 8 attributes, two classes (85 acute appendicitis + 21 other, or 80.2% + 19.8%), data from Shalom Weiss.
Attribute names: WBC1, MNEP, MNEA, MBAP, MBAA, HNEP, HNEA

For 90% accuracy and p=0.95 confidence level, 2-tailed bounds are: [82.8%, 94.4%].
S.M. Weiss, I. Kapouleas, "An empirical comparison of pattern recognition, neural nets and machine learning classification methods", in: J.W. Shavlik and T.G. Dietterich, Readings in Machine Learning, Morgan Kaufmann Publ, CA 1990
H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.
C-MLP2LN (logical rules): l-o-o only estimated, since the rules are like PVM.
3 crisp logical rules, overall 91.5% accuracy.
Results obtained with the leave-one-out test, % of accuracy given (first table); results for 10-fold stratified crossvalidation (second table).
 * Method || Accuracy % || Reference ||
 * PVM (logical rules) || 89.6 || Weiss, Kapouleas ||
 * C-MLP2LN (logical rules) || 89.6±? || our ||
 * k-NN, stand. Manhattan, k=8,9,22-25; k=4,5, stand. Euclid, f2+f4 removed || 88.7 || our (WD/KG) ||
 * 9-NN, stand. Euclides || 87.7 || our (KG) ||
 * RIAC (prob. inductive) || 86.9 || Hamilton //et.al// ||
 * 1-NN, stand. Euclides, f2+f4 rem || 86.8 || our (WD/KG) ||
 * MLP+backpropagation || 85.8 || Weiss, Kapouleas ||
 * CART, C4.5 (dec. trees) || 84.9 || Weiss, Kapouleas ||
 * FSM || 84.9 || our (RA) ||
 * Bayes rule (statistical) || 83.0 || Weiss, Kapouleas ||
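
The two-tailed bounds quoted for the Appendicitis data (90% accuracy on 106 vectors) agree with a Wilson score interval for a binomial proportion. The page does not say how the bounds were computed, so the choice of formula is an assumption, and `wilson_interval` is a hypothetical helper name. A minimal sketch:

```python
import math


def wilson_interval(accuracy, n, z=1.96):
    """Two-sided Wilson score interval for a proportion observed on n cases.

    accuracy is a fraction in [0, 1]; z=1.96 corresponds to a 95% level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction between 0 and 1")
    z2 = z * z
    denom = 1.0 + z2 / n
    center = (accuracy + z2 / (2.0 * n)) / denom
    half_width = z * math.sqrt(
        accuracy * (1.0 - accuracy) / n + z2 / (4.0 * n * n)
    ) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(0.90, 106)
print(f"[{100 * low:.1f}%, {100 * high:.1f}%]")  # [82.8%, 94.4%]
```

The interval is asymmetric around the observed accuracy, which is why the quoted bounds are not simply 90% plus or minus a fixed margin.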

Maszczyk T, Duch W, [|Support Feature Machine], WCCI 2010 (submitted).
 * Method || Accuracy % || Reference ||
 * NBC+WX+G(WX) || ??.5±7.7 || TM-GM ||
 * NBC+G(WX) || ??.2±6.7 || TM-GM ||
 * kNN auto+G(WX) Eukl || ??.2±6.7 || TM-GM ||
 * C-MLP2LN || 89.6 || our logical rules ||
 * 20-NN, stand. Eukl f 4,1,7 || 89.3±8.6 || our (KG); feature sel. from CV on the whole data set ||
 * SSV beam leaves || 88.7±8.5 || WD ||
 * SVM linear C=1 || 88.1±8.6 || WD ||
 * 6-NN, stand. Eukl. || 88.0±7.9 || WD ||
 * SSV default || 87.8±8.7 || WD ||
 * SSV beam pruning || 86.9±9.8 || WD ||
 * kNN, k=auto, Eucl || 86.7±6.6 || WD ||
 * FSM, a=0.9, Gauss, cluster || 86.1±8.8 || WD-GM ||
 * NBC || 85.9±10.2 || TM-GM ||
 * VSS 1 neuron, 4 it || 84.9±7.4 || WD/MK ||
 * SVM Gauss C=32, s=0.1 || 84.4±8.2 || WD ||
 * MLP+BP (Tooldiag) || 83.9 || Rafał Adamczak ||
 * RBF (Tooldiag) || 80.2 || Rafał Adamczak ||

[|Wisconsin breast cancer].
From [|UCI repository], 699 cases, 9 attributes, two classes, 458 (65.5%) & 241 (34.5%).
Results obtained with the leave-one-out test, % of accuracy given.

F6 has 16 missing values, removing these vectors leaves 683 examples.

H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.
Results obtained with the 10-fold crossvalidation, 16 vectors with F6 values missing removed, 683 samples left, % of accuracy given.
 * Method || Accuracy % || Reference ||
 * FSM || 98.3 || our (RA) ||
 * 3-NN stand Manhattan || 97.1 || our (KG) ||
 * 21-NN stand. Euclidean || 96.9 || our (KG) ||
 * C4.5 (decision tree) || 96.0 || Hamilton //et.al// ||
 * RIAC (prob. inductive) || 95.0 || Hamilton //et.al// ||

Results obtained with the 10-fold crossvalidation, % of accuracy given, all data, missing values handled in different ways.
 * method || Accuracy % || Reference ||
 * Naive MFT || 97.1 || Opper, Winther, L-1-O est. 97.3 ||
 * SVM Gauss, C=1,s=0.1 || 97.0±2.3 || WD-GM ||
 * SVM (10xCV) || 96.9 || Opper, Winther ||
 * SVM lin, opt C || 96.9±2.2 || WD-GM, same with Minkowski kernel ||
 * Cluster means, 2 prototypes || 96.5±2.2 || MB ||
 * **Default, majority** || 65.5 || -- ||

For 97% accuracy and p=0.95 confidence level, 2-tailed bounds are: [95.5%, 98.0%].
K.P. Bennett, J. Blue, [|A Support Vector Machine Approach to Decision Trees], R.P.I Math Report No. 97-100, Rensselaer Polytechnic Institute, Troy, NY, 1997
N. Shang, L. Breiman, ICONIP'96, p.133
B. Ster and A. Dobnikar, //Neural networks in medical diagnosis: Comparison with other methods//. In A. Bulsari //et al//., editor, Proceedings of the International Conference EANN '96, pages 427-430, 1996.
F. Zarndt, A Comprehensive Case Study: An Examination of Machine Learning and Connectionist Algorithms, MSc Thesis, Dept. of Computer Science, Brigham Young University, 1995
 * method || Accuracy % || Reference ||
 * NB + kernel est || 97.5±1.8 || WD, WEKA, 10X10CV ||
 * SVM (5xCV) || 97.2 || Bennett and Blue ||
 * kNN with DVDM distance || 97.1 || our (KG) ||
 * GM k-NN, k=3, raw, Manh || 97.0±2.1 || WD, 10X10CV ||
 * GM k-NN, k=opt, raw, Manh || 97.0±1.7 || WD, 10CV only ||
 * VSS, 8 it/2 neurons || 96.9±1.8 || WD/MK; 98.1% train ||
 * FSM-Feature Space Mapping || 96.9±1.4 || RA/WD, a=.99 Gaussian ||
 * Fisher linear discr. anal || 96.8 || Ster, Dobnikar ||
 * MLP+BP || 96.7 || Ster, Dobnikar ||
 * MLP+BP (Tooldiag) || 96.6 || Rafał Adamczak ||
 * LVQ || 96.6 || Ster, Dobnikar ||
 * kNN, Euclidean/Manhattan f. || 96.6 || Ster, Dobnikar ||
 * SNB, semi-naive Bayes (pairwise dependent) || 96.6 || Ster, Dobnikar ||
 * SVM lin, opt C || 96.4±1.2 || WD-GM, 16 missing with -10 ||
 * VSS, 8 it/1 neuron! || 96.4±2.0 || WD/MK, train 98.0% ||
 * GM IncNet || 96.4±2.1 || NJ/WD; FKF, max. 3 neurons ||
 * NB - naive Bayes (completely independent) || 96.4 || Ster, Dobnikar ||
 * SSV opt nodes, 3CV int || 96.3±2.2 || WD/GM; training 96.6±0.5 ||
 * IB1 || 96.3±1.9 || Zarndt ||
 * DB-CART (decision tree) || 96.2 || Shang, Breiman ||
 * GM SSV Tree, opt nodes BFS || 96.0±2.9 || WD/KG (beam search 94.0) ||
 * LDA - linear discriminant analysis || 96.0 || Ster, Dobnikar ||
 * OC1 DT (5xCV) || 95.9 || Bennett and Blue ||
 * RBF (Tooldiag) || 95.9 || Rafał Adamczak ||
 * GTO DT (5xCV) || 95.7 || Bennett and Blue ||
 * ASI - Assistant I tree || 95.6 || Ster, Dobnikar ||
 * MLP+BP (Weka) || 95.4±0.2 || TW/WD ||
 * OCN2 || 95.2±2.1 || Zarndt ||
 * IB3 || 95.0±4.0 || Zarndt ||
 * MML tree || 94.8±1.8 || Zarndt ||
 * ASR - Assistant R (RELIEF criterion) tree || 94.7 || Ster, Dobnikar ||
 * C4.5 tree || 94.7±2.0 || Zarndt ||
 * LFC, Lookahead Feature Constr binary tree || 94.4 || Ster, Dobnikar ||
 * CART tree || 94.4±2.4 || Zarndt ||
 * ID3 || 94.3±2.6 || Zarndt ||
 * C4.5 (5xCV) || 93.4 || Bennett and Blue ||
 * C 4.5 rules || 86.7±5.9 || Zarndt ||
 * **Default, majority** || 65.5 || -- ||
 * QDA - quadratic discr anal || 34.5 || Ster, Dobnikar ||
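
The "Default, majority" rows in the tables are the accuracy of always predicting the most frequent class, so they follow directly from the class counts given in the dataset descriptions. A small sketch of that arithmetic, with `majority_baseline` as a hypothetical helper name and the counts copied from this page:

```python
def majority_baseline(class_counts):
    """Accuracy (in %) of always predicting the most frequent class."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("empty dataset")
    return 100.0 * max(class_counts) / total


# Class counts as stated in the dataset descriptions on this page.
datasets = {
    "Appendicitis": (85, 21),
    "Wisconsin breast cancer": (458, 241),
    "Breast cancer (Ljubljana)": (201, 85),
    "Cleveland heart disease": (164, 55 + 36 + 35 + 13),
    "Pima Indian diabetes": (500, 268),
}
for name, counts in datasets.items():
    print(f"{name}: {majority_baseline(counts):.1f}%")
```

Any method listed below its dataset's base rate, such as QDA in the table above, does worse than this trivial classifier.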

[|Breast Cancer (Ljubljana data)]
From [|UCI repository] (restricted): 286 instances, 201 no-recurrence-events (70.3%), 85 recurrence-events (29.7%);
9 attributes, between 2-13 values each, 9 missing values.
Results - 10xCV? Sometimes methodology was unclear;
difficult, noisy data, some methods are below the base rate (70.3%).
 * Method || Accuracy, % test || Reference ||
 * C-MLP2LN/SSV single rule || 76.2±0.0 || WD/K. Grabczewski, stable rule ||
 * SSV Tree rule || 75.7±1.1 || WD, av. from 10x10CV ||
 * MML Tree || 75.3±7.8 || Zarndt ||
 * SVM Gauss, C=1, s =0.1 || 73.8±4.3 || WD, GM ||
 * MLP+backprop || 73.5±9.4 || Zarndt ||
 * SVM Gauss, C, s opt || 72.4±5.1 || WD, GM ||
 * IB1 || 71.8±7.5 || Zarndt ||
 * CART || 71.4±5.0 || Zarndt ||
 * ODT trees || 71.3±4.2 || Blanchard ||
 * SVM lin, C=opt || 71.0±4.7 || WD, GM ||
 * UCN 2 || 70.7±7.8 || Zarndt ||
 * SFC, Stack filters || 70.6±4.2 || Porter ||
 * **Default, majority** || **70.3±0.0** || ============ ||
 * SVM lin, C=1 || 70.0±5.6 || WD, GM ||
 * C 4.5 rules || 69.7±7.2 || Zarndt ||
 * Bayes rule || 69.3±10.0 || Zarndt ||
 * C 4.5 || 69.2±4.9 || Blanchard ||
 * Weighted networks || 68-73.5 || Tan, Eshelman ||
 * IB3 || 67.9±7.7 || Zarndt ||
 * ID3 rules || 66.2±8.5 || Zarndt ||
 * AQ15 || 66-72 || Michalski //e.a.// ||
 * Inductive || 65-72 || Clark, Niblett ||

For 78% accuracy and p=0.95 confidence level, 2-tailed bounds are: [72.9%, 82.4%]
 * Assistant-86 achieved 78%, but this seems to be the best result that happens in some crossvalidations, not the average.
 * Cestnik, G., Kononenko, I., & Bratko, I. (1987). //Assistant-86: A Knowledge-Elicitation Tool for Sophisticated Users//. In I. Bratko & N. Lavrac (Eds.) Progress in Machine Learning, 31-45, Sigma Press.
 * Blanchard, G., Schafer, C., Rozenholc, Y., & Muller, K.-R. (2007) Optimal dyadic decision trees. Machine Learning 66: 709-717.
 * Clark, P. & Niblett, T. (1987). //Induction in Noisy Domains//. In: Progress in Machine Learning (from the Proceedings of the 2nd European Working Session on Learning), 11-30, Bled, Yugoslavia: Sigma Press.
 * Porter R.B., G. Beate Zimmer, Don R. Hush: Stack Filter Classifiers. ISMM 2009: 282-294
 * Michalski, R.S., Mozetic, I., Hong, J., & Lavrac, N. (1986). //The Multi-Purpose Incremental Learning System AQ15 and its Testing Application to Three Medical Domains//. In Proceedings of the Fifth National Conference on Artificial Intelligence, 1041-1045, Philadelphia, PA: Morgan Kaufmann.
 * Tan, M., & Eshelman, L. (1988). //Using weighted networks to represent classification knowledge in noisy domains.// Proceedings of the Fifth International Conference on Machine Learning, 121-134, Ann Arbor, MI.
 * F. Zarndt, A Comprehensive Case Study: An Examination of Machine Learning and Connectionist Algorithms, MSc Thesis, Dept. of Computer Science, Brigham Young University, 1995
 * S.M. Weiss, I. Kapouleas. An empirical comparison of pattern recognition, neural nets and machine learning classification methods, in: J.W. Shavlik and T.G. Dietterich, Readings in Machine Learning, Morgan Kaufmann Publ, CA 1990

Weiss and Kapouleas used leave-one-out tests and obtained: MLP+backprop 75.7% train, 71.5% test; Bayes 75.9% train, 71.8% test; CART & PVM 77.4% train, 77.1% test; k-NN 65.3% test.

[|Hepatitis].
From [|UCI repository], 155 vectors, 19 attributes.
Two classes: die with 32 (20.6%), live with 123 (79.4%).
Many missing values! F18 has 67 missing values, F15 has 29, F17 has 16 and other features between 0 and 11.
Results obtained with the leave-one-out test, % of accuracy given.
MLP, CART, LDA results from (check it ?) S.M. Weiss, I. Kapouleas, "//An empirical comparison of pattern recognition, neural nets and machine learning classification methods//", in: J.W. Shavlik and T.G. Dietterich, Readings in Machine Learning, Morgan Kaufmann Publ, CA 1990.
Other results - our own.
Results obtained with the 10-fold crossvalidation, % of accuracy given; our results with stratified crossvalidation, other results - who knows? Differences for this dataset are rather small, 0.1-0.2%.
 * Method || Accuracy % || Reference ||
 * 21-NN, stand Manhattan || 90.3 || our (KG) ||
 * FSM || 90.0 || our (RA) ||
 * 14-NN, stand. Euclid || 89.0 || our (KG) ||
 * LDA || 86.4 || Weiss & K ||
 * CART (decision tree) || 82.7 || Weiss & K ||
 * MLP+backprop || 82.1 || Weiss & K ||

Results on BP, LVQ, ..., SNB are from: B. Ster and A. Dobnikar, //Neural networks in medical diagnosis: Comparison with other methods//. In A. Bulsari //et al//., editor, Proceedings of the International Conference EANN '96, pages 427-430, 1996.
Our good results reflect superior handling of missing values ?
Duch W, Grudziński K (1998) A framework for similarity-based methods. Second Polish Conference on Theory and Applications of Artificial Intelligence, Lodz, 28-30 Sept. 1998, pp. 33-60
Weighted kNN: Duch W, Grudziński K and Diercksen G.H.F (1998) Minimal distance neural methods. World Congress of Computational Intelligence, May 1998, Anchorage, Alaska, IJCNN'98 Proceedings, pp. 1299-1304
 * Method || Accuracy % || Reference ||
 * Weighted 9-NN || 92.9±? || Karol Grudziński ||
 * 18-NN, stand. Manhattan || 90.2±0.7 || Karol Grudziński ||
 * FSM with rotations || 89.7±? || Rafał Adamczak ||
 * 15-NN, stand. Euclidean || 89.0±0.5 || Karol Grudziński ||
 * VSS 4 neurons, 5 it || 86.5±8.8 || WD/MK, train 97.1 ||
 * FSM without rotations || 88.5 || Rafał Adamczak ||
 * LDA, linear discriminant analysis || 86.4 || Ster & Dobnikar ||
 * Naive Bayes and Semi-NB || 86.3 || Ster & Dobnikar ||
 * IncNet || 86.0 || Norbert Jankowski ||
 * QDA, quadratic discriminant analysis || 85.8 || Ster & Dobnikar ||
 * 1-NN || 85.3±5.4 || Ster & Dobnikar, std added by WD ||
 * VSS 2 neurons, 5 it || 85.1±7.4 || WD/MK, train 95.0 ||
 * ASR || 85.0 || Ster & Dobnikar ||
 * Fisher discriminant analysis || 84.5 || Ster & Dobnikar ||
 * LVQ || 83.2 || Ster & Dobnikar ||
 * CART (decision tree) || 82.7 || Ster & Dobnikar ||
 * MLP with BP || 82.1 || Ster & Dobnikar ||
 * ASI || 82.0 || Ster & Dobnikar ||
 * LFC || 81.9 || Ster & Dobnikar ||
 * RBF (Tooldiag) || 79.0 || Rafał Adamczak ||
 * MLP+BP (Tooldiag) || 77.4 || Rafał Adamczak ||

[|Statlog version of Cleveland Heart disease].
13 attributes (extracted from 75), no missing values.
270=150+120 observations selected from the 303 cases (Cleveland Heart).
Attribute Information:
 * 1. age || 2. sex || 3. chest pain type (4 values) || 4. resting blood pressure || 5. serum cholesterol in mg/dl ||
 * 6. fasting blood sugar > 120 mg/dl || 7. resting electrocardiographic results (values 0,1,2) || 8. maximum heart rate achieved || 9. exercise induced angina || 10. oldpeak = ST depression induced by exercise relative to rest ||
 * 11. the slope of the peak exercise ST segment || 12. number of major vessels (0-3) colored by fluoroscopy || 13. thal: 3 = normal; 6 = fixed defect; 7 = reversible defect ||

Attribute types: Real: 1,4,5,8,10,12; Ordered: 11; Binary: 2,6,9; Nominal: 7,3,13
Classes: Absence (1) or presence (2) of heart disease.
In [|Statlog experiments on heart data] a cost or risk matrix was used with 9-fold crossvalidation; only cost values are given.
Results below are obtained with the 10-fold crossvalidation, % of accuracy given, no risk matrix.
Results [|for Heart] and other Statlog datasets are [|collected here].
 * Method || Accuracy % || Reference ||
 * Lin SVM 2D QCP || 85.9±5.5 || MG, 10xCV ||
 * kNN auto+WX || ??.8±5.6 || TM GM 10xCV ||
 * SVM Gauss+WX+G(WX), C=1 s=2-5 || ??.8±6.4 || TM GM 10xCV ||
 * SVM lin, C=0.01 || 84.9±7.9 || WD, GM 10x(9xCV) ||
 * SFM, G(WX), default C=1 || ??±5.1 || TM, GM 10xCV ||
 * Naive-Bayes || 84.5±6.3 || TM, GM 10xCV ||
 * Naive-Bayes || 83.6 || RA, WEKA ||
 * SVML default C=1 || 82.5±6.4 || TM, GM 10xCV ||
 * K* || 76.7 || WEKA, RA ||
 * IB1c || 74.0 || WEKA, RA ||
 * 1R || 71.4 || WEKA, RA ||
 * T2 || 68.1 || WEKA, RA ||
 * MLP+BP || 65.6 || ToolDiag, RA ||
 * FOIL || 64.0 || WEKA, RA ||
 * RBF || 60.0 || ToolDiag, RA ||
 * InductH || 58.5 || WEKA, RA ||
 * Base rate (majority classifier) || 55.7 ||  ||
 * IB1-4 || 50.0 || ToolDiag, RA ||

[|Cleveland heart disease].
From [|UCI repository], 303 cases, 13 attributes (4 cont, 9 nominal), 7 vectors with missing values ?
2 (no, yes) or 5 classes (no, degree 1, 2, 3, 4).
Class distribution: 164 (54.1%) no, 55+36+35+13 yes (45.9%) with disease degree 1-4.
Results obtained with the leave-one-out test, % of accuracy given, 2 classes used.

MLP, CART, LDA where are these results from ???
Other results - our own.
Results obtained with the 10-fold crossvalidation, % of accuracy given.
Ster & Dobnikar reject 6 vectors (leaving 297) with missing values.
We use all 303 vectors, replacing missing values by the means for their class; in kNN we have used the Statlog convention, 297 vectors.
 * Method || Accuracy % || Reference ||
 * LDA || 84.5 || Weiss ? ||
 * 25-NN, stand, Euclid || 83.6±0.5 || WD/KG repeat?? ||
 * C-MLP2LN || 82.5 || RA, estimated? ||
 * FSM || 82.2 || Rafał Adamczak ||
 * MLP+backprop || 81.3 || Weiss ? ||
 * CART || 80.8 || Weiss ? ||
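
The missing-value handling described for this dataset (replacing a missing value by the mean of that feature within the vector's class) can be sketched as follows. This is an illustration only, not the code used to obtain the results; `impute_class_means` is a hypothetical name and missing values are assumed to be marked with `None`.

```python
from collections import defaultdict


def impute_class_means(rows, labels):
    """Replace None entries by the mean of that feature over same-class rows."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            if value is not None:
                sums[(label, j)] += value
                counts[(label, j)] += 1

    filled = []
    for row, label in zip(rows, labels):
        new_row = []
        for j, value in enumerate(row):
            if value is None:
                if counts[(label, j)] == 0:
                    raise ValueError(f"no observed value for class {label}, feature {j}")
                value = sums[(label, j)] / counts[(label, j)]
            new_row.append(value)
        filled.append(new_row)
    return filled


rows = [[1.0, None], [3.0, 4.0], [10.0, 20.0], [None, 40.0]]
labels = [0, 0, 1, 1]
print(impute_class_means(rows, labels))
```

Because the class label is used to fill the gaps, in a crossvalidation setting the means would normally be computed on the training part only; whether this was done for the results on this page is not stated.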

For 85% accuracy and p=0.95 confidence level, 2-tailed bounds are: [80.5%, 88.6%].
Results obtained with BP, LVQ, ..., SNB are from: B. Ster and A. Dobnikar, //Neural networks in medical diagnosis: Comparison with other methods//. In: A. Bulsari //et al//., editor, Proceedings of the International Conference EANN '96, pages 427-430, 1996.
 * Method || Accuracy % || Reference ||
 * [|IncNet]+transformations || **90.0** || Norbert Jankowski; check again! ||
 * 28-NN, stand, Euclid, 7 features || 85.1±0.5 || WD/KG ||
 * LDA || 84.5 || Ster & Dobnikar ||
 * Fisher discriminant analysis || 84.2 || Ster & Dobnikar ||
 * k=7, Euclid, std || 84.2±6.6 || WD, GhostMiner ||
 * 16-NN, stand, Euclid || 84±0.6 || WD/KG ||
 * FSM, 82.4-84% on test only || 84.0 || Rafał Adamczak ||
 * k=1:10, Manhattan, std || 83.8±5.3 || WD, GhostMiner ||
 * Naive Bayes || 82.5-83.4 || Rafał; Ster, Dobnikar ||
 * SNB || 83.1 || Ster & Dobnikar ||
 * LVQ || 82.9 || Ster & Dobnikar ||
 * GTO DT (5xCV) || 82.5 || Bennett and Blue ||
 * kNN, k=19, Euclidean || 82.1±0.8 || Karol Grudziński ||
 * k=7, Manhattan, std || 81.8±10.0 || WD, GhostMiner ||
 * SVM (5xCV) || 81.5 || Bennett and Blue ||
 * kNN (k=1? raw data?) || 81.5 || Ster & Dobnikar ||
 * MLP+BP (standardized) || 81.3 || Ster, Dobnikar, Rafał Adamczak ||
 * Cluster means, 2 prototypes || 80.8±6.4 || MB ||
 * CART || 80.8 || Ster & Dobnikar ||
 * RBF (Tooldiag, standardized) || 79.1 || Rafał Adamczak ||
 * Gaussian EM, 60 units || 78.6 || Stensmo & Sejnowski ||
 * ASR || 78.4 || Ster & Dobnikar ||
 * C4.5 (5xCV) || 77.8 || Bennett and Blue ||
 * IB1c (WEKA) || 77.6 || Rafał Adamczak ||
 * QDA || 75.4 || Ster & Dobnikar ||
 * LFC || 75.1 || Ster & Dobnikar ||
 * ASI || 74.4 || Ster & Dobnikar ||
 * K* (WEKA) || 74.2 || Rafał Adamczak ||
 * OC1 DT (5xCV) || 71.7 || Bennett and Blue ||
 * 1 R (WEKA) || 71.0 || Rafał Adamczak ||
 * T2 (WEKA) || 69.0 || Rafał Adamczak ||
 * FOIL (WEKA) || 66.4 || Rafał Adamczak ||
 * InductH (WEKA) || 61.3 || Rafał Adamczak ||
 * **Default, majority** || 54.1 || ==baserate== ||
 * C4.5 rules || 53.8±5.9 || Zarndt ||
 * IB1-4 (WEKA) || 46.2 || Rafał Adamczak ||

Magnus Stensmo and Terrence J. Sejnowski, A Mixture Model System for Medical and Machine Diagnosis, Advances in Neural Information Processing Systems 7 (1995) 1077-1084

Kristin P. Bennett, J. Blue, [|A Support Vector Machine Approach to Decision Trees], R.P.I Math Report No. 97-100, Rensselaer Polytechnic Institute, Troy, NY, 1997

Other results for this dataset (methodology sometimes uncertain):
D. Wettschereck, averaging 25 runs with 70% train and 30% test, variants of k-NN with different metric functions and scaling.
David Aha & Dennis Kibler - from [|UCI repository] past usage.
Gennari, J.H., Langley, P, Fisher, D. (1989). Models of incremental concept formation. Artificial Intelligence, 40, 11-61.
Friedman N, Geiger D, Goldszmidt M (1997). Bayesian network classifiers. Machine Learning 29: 131-163
 * Method || Accuracy % || Reference ||
 * k-NN, Value Distance Metric (VDM) || 82.6 || D. Wettschereck ||
 * k-NN, Euclidean || 82.4±0.8 || D. Wettschereck ||
 * k-NN, Variable Similarity Metric || 82.4 || D. Wettschereck ||
 * k-NN, Modified VDM || 83.1 || D. Wettschereck ||
 * Other k-NN variants || < 82.4 || D. Wettschereck ||
 * k-NN, Mutual Information || 81.8 || D. Wettschereck ||
 * CLASSIT (hierarchical clustering) || 78.9 || Gennari, Langley, Fisher ||
 * NTgrowth (instance-based) || 77.0 || Aha & Kibler ||
 * C4 || 74.8 || Aha & Kibler ||
 * Naive Bayes || 82.8±1.3 || Friedman et.al, 5xCV, 296 vectors ||

[|Diabetes].
From the [|UCI repository], dataset "[|Pima Indian diabetes]":
2 classes, 8 attributes, 768 instances, 500 (65.1%) negative (class 1), and 268 (34.9%) positive tests for diabetes (class 2).
All patients were females at least 21 years old of Pima Indian heritage.
Attributes used:
1. Number of times pregnant
2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test
3. Diastolic blood pressure (mm Hg)
4. Triceps skin fold thickness (mm)
5. 2-Hour serum insulin (mu U/ml)
6. Body mass index (weight in kg/(height in m)^2)
7. Diabetes pedigree function
8. Age (years)
Results obtained with the 10-fold crossvalidation, % of accuracy given; Statlog results are with 12-fold crossvalidation.

For 77.7% accuracy and p=0.95 confidence level, 2-tailed bounds are: [74.6%, 80.5%].
Results on BP, LVQ, ..., SNB are from: B. Ster and A. Dobnikar, //Neural networks in medical diagnosis: Comparison with other methods//. In A. Bulsari //et al//., editor, Proceedings of the International Conference EANN '96, pages 427-430, 1996.
 * Method || Accuracy % || Reference ||
 * Logdisc || 77.7 || Statlog ||
 * [|IncNet] || 77.6 || Norbert Jankowski ||
 * DIPOL92 || 77.6 || Statlog ||
 * Linear Discr. Anal. || 77.5-77.2 || Statlog; Ster & Dobnikar ||
 * SVM, linear, C=0.01 || 77.5±4.2 || WD-GM, 10XCV averaged 10x ||
 * SVM, Gauss, C, sigma opt || 77.4±4.3 || WD-GM, 10XCV averaged 10x ||
 * SMART || 76.8 || Statlog ||
 * GTO DT (5xCV) || 76.8 || Bennett and Blue ||
 * kNN, k=23, Manh, raw, W || 76.7±4.0 || WD-GM, feature weighting 3CV ||
 * kNN, k=1:25, Manh, raw || 76.6±3.4 || WD-GM, most cases k=23 ||
 * ASI || 76.6 || Ster & Dobnikar ||
 * Fisher discr. analysis || 76.5 || Ster & Dobnikar ||
 * MLP+BP || 76.4 || Ster & Dobnikar ||
 * MLP+BP || 75.8±6.2 || Zarndt ||
 * LVQ || 75.8 || Ster & Dobnikar ||
 * LFC || 75.8 || Ster & Dobnikar ||
 * RBF || 75.7 || Statlog ||
 * NB || 75.5-73.8 || Ster & Dobnikar; Statlog ||
 * kNN, k=22, Manh || 75.5 || Karol Grudziński ||
 * MML || 75.5±6.3 || Zarndt ||
 * SNB || 75.4 || Ster & Dobnikar ||
 * BP || 75.2 || Statlog ||
 * SSV DT || 75.0±3.6 || WD-GM, SSV BS, node 5CV MC ||
 * kNN, k=18, Euclid, raw || 74.8±4.8 || WD-GM ||
 * CART DT || 74.7±5.4 || Zarndt ||
 * CART DT || 74.5 || Statlog ||
 * DB-CART || 74.4 || Shang & Breiman ||
 * ASR || 74.3 || Ster & Dobnikar ||
 * ODT, dyadic trees || 74.0±2.3 || Blanchard ||
 * Cluster means, 2 prototypes || 73.7±3.7 || MB ||
 * SSV DT || 73.7±4.7 || WD-GM, SSV BS, node 10CV strat ||
 * SFC, stacking filters || 73.3±1.9 || Porter ||
 * C4.5 DT || 73.0 || Statlog ||
 * C4.5 DT || 72.7±6.6 || Zarndt ||
 * Bayes || 72.2±6.9 || Zarndt ||
 * C4.5 (5xCV) || 72.0 || Bennett and Blue ||
 * CART || 72.8 || Ster & Dobnikar ||
 * Kohonen || 72.7 || Statlog ||
 * C4.5 DT || 72.1±2.6 || Blanchard (averaged over 100 runs) ||
 * kNN || 71.9 || Ster & Dobnikar ||
 * ID3 || 71.7±6.6 || Zarndt ||
 * IB3 || 71.7±5.0 || Zarndt ||
 * IB1 || 70.4±6.2 || Zarndt ||
 * kNN, k=1, Euclides, raw || 69.4±4.4 || WD-GM ||
 * kNN || 67.6 || Statlog ||
 * C4.5 rules || 67.0±2.9 || Zarndt ||
 * OCN2 || 65.1±1.1 || Zarndt ||
 * **Default, majority** || 65.1 ||  ||
 * QDA || 59.5 || Ster, Dobnikar ||

Other results (with different tests):
 * Bennett K.P, J. Blue, [|A Support Vector Machine Approach to Decision Trees], R.P.I Math Report No. 97-100, Rensselaer Polytechnic Institute, Troy, NY, 1997
 * Blanchard, G., Schafer, C., Rozenholc, Y., & Muller, K.-R. (2007) Optimal dyadic decision trees. Machine Learning 66: 709-717.
 * Michie D, D.J. Spiegelhalter, C.C. Taylor (eds), [|Machine Learning, Neural and Statistical Classification], Statlog project book.
 * Porter R.B., G. Beate Zimmer, Don R. Hush: Stack Filter Classifiers. ISMM 2009: 282-294
 * Shang N, L. Breiman, ICONIP'96, p.133

Friedman N, Geiger D, Goldszmidt M (1997). Bayesian network classifiers. Machine Learning 29: 131-163
Opper/Winther use 200 training and 332 test examples (following Ripley), with TAP MFT results on test 81%, SVS at 80.1% and best NN as 77.4%.
 * Method || Accuracy % || Reference ||
 * SVM (5xCV) || 77.6 || Bennett and Blue ||
 * C4.5 || 76.0±0.9 || Friedman, 5xCV ||
 * Semi-Naive Bayes || 76.0±0.8 || Friedman, 5xCV ||
 * Naive Bayes || 74.5±0.9 || Friedman, 5xCV ||
 * **Default, majority** || 65.1 ||  ||

[|Hypothyroid].
Thyroid, from [|UCI repository], dataset "ann-train.data": a thyroid database suited for training ANNs.
3772 learning and 3428 testing examples; primary hypothyroid, compensated hypothyroid, normal.
Training: 93+191+3488 or 2.47%, 5.06%, 92.47%
Test: 73+177+3178 or 2.13%, 5.16%, 92.71%
21 attributes (15 binary, 6 continuous); 3 classes.
The problem is to determine whether a patient referred to the clinic is hypothyroid. Therefore three classes are built: normal (not hypothyroid), hyperfunction and subnormal functioning. Because 92 percent of the patients are not hyperthyroid, a good classifier must be significantly better than 92%.
Note: this is the data Quinlan used in the case study of his article "Simplifying Decision Trees" (International Journal of Man-Machine Studies (1987) 221-234).
Names: I (W.D.) have investigated this issue and, after some mail exchange with Chris Merz, who maintains the [|UCI repository], here is the conclusion:


 * 1 age: continuous || 2 sex: {M, F} || 3 on thyroxine: logical ||
 * 4 maybe on thyroxine: logical || 5 on antithyroid medication: logical || 6 sick - patient reports malaise: logical ||
 * 7 pregnant: logical || 8 thyroid surgery: logical || 9 I131 treatment: logical ||
 * 10 test hypothyroid: logical || 11 test hyperthyroid: logical || 12 on lithium: logical ||
 * 13 has goitre: logical || 14 has tumor: logical || 15 hypopituitary: logical ||
 * 16 psychological symptoms: logical || 17 TSH: continuous || 18 T3: continuous ||
 * 19 TT4: continuous || 20 T4U: continuous || 21 FTI: continuous ||

Results:
For 99.90% accuracy on training and p=0.95 confidence level the 2-tailed bounds are: [99.74%, 99.96%]
Most NN results from W. Schiffmann, M. Joost, R. Werner, 1993; MLP2LN and Init+a,b ours.
k-NN, PVM and CART from S.M. Weiss, I. Kapouleas, "//An empirical comparison of pattern recognition, neural nets and machine learning classification methods//", in: J.W. Shavlik and T.G. Dietterich, Readings in Machine Learning, Morgan Kaufmann Publ, CA 1990
SVM with linear and Gaussian kernels gives quite poor results on this data.
[|3 crisp logical rules] using TSH, FTI, T3, on_thyroxine, thyroid_surgery, TT4 give 99.3% accuracy on the test set.
 * Method || % training || % test || Reference ||
 * C-MLP2LN rules+ASA || 99.90 || 99.36 || Rafał/Krzysztof/Grzegorz ||
 * CART || 99.80 || 99.36 || Weiss ||
 * PVM || 99.80 || 99.33 || Weiss ||
 * SSV beam search || 99.80 || 99.33 || WD ||
 * IncNet || 99.68 || 99.24 || Norbert ||
 * SSV opt leaves or pruning || 99.7 || 99.1 || WD ||
 * MLP init+ a,b opt. || 99.5 || 99.1 || Rafał ||
 * C-MLP2LN rules || 99.7 || 99.0 || Rafał/Krzysztof ||
 * Cascade correlation || 100.0 || 98.5 || Schiffmann ||
 * Local adapt. rates || 99.6 || 98.5 || Schiffmann ||
 * BP+genetic opt. || 99.4 || 98.4 || Schiffmann ||
 * Quickprop || 99.6 || 98.3 || Schiffmann ||
 * RPROP || 99.6 || 98.0 || Schiffmann ||
 * 3-NN, Euclidean, with 3 features || 98.7 || 97.9 || W.D./Karol ||
 * 1-NN, Euclidean, with 3 features || 98.4 || 97.7 || W.D./Karol ||
 * Best backpropagation || 99.1 || 97.6 || Schiffmann ||
 * 1-NN, Euclidean, 8 features used || -- || 97.3 || Karol/W.D. ||
 * SVM Gauss, C=8 s=0.1 || 98.3 || 96.1 || WD ||
 * Bayesian classif. || 97.0 || 96.1 || Weiss? ||
 * SVM Gauss, C=1 s=0.1 || 95.4 || 94.7 || WD ||
 * BP+conj. gradient || 94.6 || 93.8 || Schiffmann ||
 * 1-NN Manhattan, std data ||  || 93.8 || Karol G./WD ||
 * SVM lin, C=1 || 94.1 || 93.3 || WD ||
 * SVM Gauss, C=8 s=5 || 100 || 92.8 || WD ||
 * **Default, majority** 250 test errors ||  || 92.7 ||   ||
 * 1-NN Manhattan, raw data ||  || 92.2 || Karol G./WD ||
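
The 95% bounds quoted above for the 99.90% training accuracy can be checked numerically. The text does not say which interval was used; the sketch below assumes a Wilson score interval for a binomial proportion on the 3772 training cases (the function name `wilson_interval` is ours, not from any of the cited packages). Under that assumption it appears to give the same bounds to two decimal places.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(accuracy, n, confidence=0.95):
    """Two-sided Wilson score interval for a proportion observed on n cases.

    Assumption: the quoted bounds were computed this way; the source does not say.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    denom = 1.0 + z * z / n
    centre = (accuracy + z * z / (2.0 * n)) / denom
    half = z * sqrt(accuracy * (1.0 - accuracy) / n + z * z / (4.0 * n * n)) / denom
    return centre - half, centre + half


# 99.90% accuracy on the 3772 training cases of the thyroid data
low, high = wilson_interval(0.9990, 3772)
print(f"[{100 * low:.2f}%, {100 * high:.2f}%]")
```

The same function can be applied to test-set results (n=3428) to judge whether two entries in the table differ by more than their sampling uncertainty.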

Hepatobiliary disorders
Contains medical records of 536 patients admitted to a university-affiliated Tokyo-based hospital, with four types of hepatobiliary disorders: alcoholic liver damage, primary hepatoma, liver cirrhosis and cholelithiasis. The records include results of 9 biochemical tests and the sex of the patient. The same 163 cases as in [Hayashi et al.] were used as the test data.
FSM gives about 60 Gaussian or triangular membership functions, achieving accuracy of 75.5-75.8%. Rotation of these functions (i.e. introducing linear combinations of inputs to the rules) does not improve this accuracy. 10-fold crossvalidation tests on the mixed training plus test data give similar results. The best results were obtained with the K* method based on algorithmic complexity optimization, giving 78.5% on the test set, and with kNN using the Manhattan distance function, k=1 and selection of features (using the leave-one-out method on the training data, features 2, 5, 6 and 9 were removed), giving 80.4% accuracy. Simulated annealing optimization of the scaling factors for the remaining 5 features gives 81.0%, and optimizing scaling factors using all input features gives 82.8%. The scaling factors are: 0.92, 0.60, 0.91, 0.92, 0.07, 0.41, 0.55, 0.86, 0.30. Similar accuracy is obtained using the multisimplex method for optimization of the scaling factors.
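
To illustrate how feature scaling factors enter a nearest-neighbour classifier, here is a minimal sketch of 1-NN with a weighted Manhattan distance. It uses the nine scaling factors quoted above, but the pairing of factors with particular features and the toy vectors are assumptions made for illustration only; this is not the code used to obtain the results.

```python
# Scaling factors quoted in the text, in the order given there.
# Assumption: one factor per input feature, in feature order.
SCALING = [0.92, 0.60, 0.91, 0.92, 0.07, 0.41, 0.55, 0.86, 0.30]


def weighted_manhattan(x, y, weights):
    """Manhattan distance with each coordinate difference multiplied by a weight."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, x, y))


def predict_1nn(train_x, train_y, query, weights):
    """Return the label of the training vector closest to the query."""
    best = min(range(len(train_x)),
               key=lambda i: weighted_manhattan(train_x[i], query, weights))
    return train_y[best]


# Hypothetical toy data with 9 features (not taken from the hepatobiliary set).
train_x = [[0.0] * 9, [1.0] * 9]
train_y = ["A", "B"]
query = [0.9, 0.9, 0.9, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0]

unweighted = predict_1nn(train_x, train_y, query, [1.0] * 9)
weighted = predict_1nn(train_x, train_y, query, SCALING)
print(unweighted, weighted)
```

On this toy query the unweighted distance prefers the first prototype, while the scaling factors shift the decision to the second, because the features with small factors contribute little to the distance.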

Y. Hayashi, A. Imura, K. Yoshida, "Fuzzy neural expert system and its application to medical diagnosis", in: 8th International Congress on Cybernetics and Systems, New York City 1990, pp. 54-61
S. Mitra, R. De, S. Pal, "Knowledge based fuzzy MLP for classification and rule generation", IEEE Transactions on Neural Networks 8, 1338-1350, 1997; a knowledge-based fuzzy MLP system gives results on the test set in the range from 33% to 66.3%, depending on the actual fuzzy model used.
W. Duch and K. Grudzinski, "Prototype Based Rules - New Way to Understand the Data", Int. Joint Conference on Neural Networks, Washington D.C., pp. 1858-1863, 2001. Contains the best results with 1-NN, Canberra distance and feature selection: 83.4% on the test set.
 * Method || Training set || Test set || Reference ||
 * IB2-IB4 || 81.2-85.5 || 43.6-44.6 || WEKA, our calculation ||
 * Naive Bayes || -- || 46.6 || WEKA, our calculation ||
 * 1R (rules) || 58.4 || 50.3 || WEKA, our calculation ||
 * T2 (rules from decision tree) || 67.5 || 53.3 || WEKA, our calculation ||
 * FOIL (inductive logic) || 99 || 60.1 || WEKA, our calculation ||
 * FSM, initial 49 crisp logical rules || 83.5 || 63.2 || FSM, our calculation ||
 * LDA (statistical) || 68.4 || 65.0 || our calculation ||
 * DLVQ (38 nodes) || 100 || 66.0 || our calculation ||
 * C4.5 decision rules || 64.5 || 66.3 || our calculation ||
 * Best fuzzy MLP model || 75.5 || 66.3 || Mitra et al. ||
 * MLP with RPROP ||  || 68.0 || our calculation ||
 * Cascade Correlation ||  || 71.0 || our calculation ||
 * Fuzzy neural network || 100 || 75.5 || Hayashi ||
 * C4.5 decision tree || 94.4 || 75.5 || our calculation ||
 * FSM, Gaussian functions || 93 || 75.6 || our calculation ||
 * FSM, 60 triangular functions || 93 || 75.8 || our calculation ||
 * IB1c (instance-based) || -- || 76.7 || WEKA, our calculation ||
 * kNN, k=1, Canberra, raw || 76.1 || 80.4 || WD/SBL ||
 * K* method || -- || 78.5 || WEKA, our calculation ||
 * 1-NN, 4 features removed, Manhattan || 76.9 || 80.4 || our calculation, KG ||
 * 1-NN, Canberra, raw, removed f2, 6, 8, 9 || 77.2 || 83.4 || our calculation, KG ||

[|Landsat Satellite image dataset] (STATLOG version)
4435 training and 2000 test cases, 36 semi-continuous [0 to 255] attributes (= 4 spectral bands x 9 pixels in neighbourhood) and 6 decision classes: 1, 2, 3, 4, 5 and 7 (class 6 has been removed because of doubts about the validity of this class).
The StatLog database consists of the multi-spectral values of pixels in 3x3 neighbourhoods in a satellite image, and the classification associated with the central pixel in each neighbourhood. The aim is to predict this classification, given the multi-spectral values. In the sample database, the class of a pixel is coded as a number.

The original database was generated from Landsat Multi-Spectral Scanner image data. The sample database was generated taking a small section (82 rows and 100 columns) from the original data. One frame of Landsat MSS imagery consists of four digital images of the same scene in different spectral bands. Two of these are in the visible region (corresponding approximately to green and red regions of the visible spectrum) and two are in the (near) infra-red. Each pixel is an 8-bit binary word, with 0 corresponding to black and 255 to white. The spatial resolution of a pixel is about 80m x 80m. Each image contains 2340 x 3380 such pixels.
The database is a (tiny) sub-area of a scene, consisting of 82 x 100 pixels. Each line of data corresponds to a 3x3 square neighbourhood of pixels completely contained within the 82x100 sub-area. Each line contains the pixel values in the four spectral bands (converted to ASCII) of each of the 9 pixels in the 3x3 neighbourhood and a number indicating the classification label of the central pixel. In each line of data the four spectral values for the top-left pixel are given first, followed by the four spectral values for the top-middle pixel and then those for the top-right pixel, and so on, with the pixels read out in sequence left-to-right and top-to-bottom. Thus, the four spectral values for the central pixel are given by attributes 17, 18, 19 and 20. If you like you can use only these four attributes, while ignoring the others. This avoids the problem which arises when a 3x3 neighbourhood straddles a boundary.
All results from Statlog book, except GM - [|GhostMiner] calculations, W. Duch.
 * Method || % training || % test || Time train (or settings) || Time test (or notes) ||
 * MLP+SCG || 96.0 || 91.0 || reg alpha=0.5, 36 hidden nodes, 1400 it || fast; WD ||
 * k-NN || -- || 90.9 || auto-k=3, Manhattan, std data || GM 2.0 ||
 * k-NN || 91.1 || 90.6 || 2105, Statlog || 944; parameters? ||
 * k-NN || -- || 90.4 || auto-k=5, Euclidean, std data || GM 2.0 ||
 * k-NN || -- || 90.0 || k=1, Manhattan, std data, no training || fast, GM 2.0 ||
 * FSM || 95.1 || 89.7 || std data, a=0.95 || fast, GM 2.0; best NN result ||
 * LVQ || 95.2 || 89.5 || 1273 || 44 ||
 * k-NN || -- || 89.4 || k=1, Euclidean, std data, no training || fast, GM 2.0 ||
 * Dipol92 || 94.9 || 88.9 || 746 || 111 ||
 * MLP+SCG || 94.4 || 88.5 || 5000 it; active learning+reg a=0.5, 8-12 hidden || fast; WD ||
 * SVM || 91.6 || 88.4 || std data, Gaussian kernel || fast, GM 2.0; unclassified 4.3% ||
 * Radial || 88.9 || 87.9 || 564 || 74 ||
 * Alloc80 || 96.4 || 86.8 || 63840 || 28757 ||
 * IndCart || 97.7 || 86.2 || 2109 || 9 ||
 * CART || 92.1 || 86.2 || 330 || 14 ||
 * MLP+BP || 88.8 || 86.1 || 72495 || 53 ||
 * Bayesian Tree || 98.0 || 85.3 || 248 || 10 ||
 * C4.5 || 96.0 || 85.0 || 434 || 1 ||
 * New ID || 93.3 || 85.0 || 226 || 53 ||
 * QuaDisc || 89.4 || 84.5 || 157 || 53 ||
 * SSV || 90.9 || 84.3 || default par. || very fast, GM 2.0 ||
 * Cascade || 88.8 || 83.7 || 7180 || 1 ||
 * Log DA, Disc || 88.1 || 83.7 || 4414 || 41 ||
 * LDA, Discrim || 85.1 || 82.9 || 68 || 12 ||
 * Kohonen || 89.9 || 82.1 || 12627 || 129 ||
 * Bayes || 69.2 || 71.3 || 75 || 17 ||
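
The attribute ordering described above (9 pixels read left-to-right and top-to-bottom, 4 spectral values per pixel) can be turned into a small index calculation. The sketch below is only an illustration: the helper names are ours, the sample line is synthetic, and it assumes whitespace-separated values with the class label last.

```python
BANDS = 4  # spectral bands per pixel
SIDE = 3   # the neighbourhood is SIDE x SIDE pixels


def pixel_attributes(row, col):
    """1-based attribute numbers for the pixel at (row, col), both 0-based."""
    first = (row * SIDE + col) * BANDS + 1
    return list(range(first, first + BANDS))


def central_pixel(line):
    """Split one data line into the central pixel's 4 values and the class label.

    Assumption: 36 whitespace-separated integers followed by the label.
    """
    values = [int(v) for v in line.split()]
    features, label = values[:-1], values[-1]
    return [features[a - 1] for a in pixel_attributes(1, 1)], label


# Synthetic line: attribute k simply holds the value k, class label 3.
sample = " ".join(str(v) for v in range(1, 37)) + " 3"
print(pixel_attributes(1, 1))
print(central_pixel(sample))
```

For the centre of the neighbourhood, `pixel_attributes(1, 1)` gives attributes 17 to 20, in agreement with the description.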

[|Machine Learning, Neural and Statistical Classification], D. Michie, D.J. Spiegelhalter, C.C. Taylor (eds), StatLog project book.
Class distribution in the Landsat training and test sets:
 * N || Description || Train || Test ||
 * 1 || red soil || 1072 (24.17%) || 461 (23.05%) ||
 * 2 || cotton crop || 479 (10.80%) || 224 (11.20%) ||
 * 3 || grey soil || 961 (21.67%) || 397 (19.85%) ||
 * 4 || damp grey soil || 415 (9.36%) || 211 (10.55%) ||
 * 5 || veg. stubble || 470 (10.60%) || 237 (11.85%) ||
 * 6 || Mixture class || 0 || 0 ||
 * 7 || very damp grey soil || 1038 (23.40%) || 470 (23.50%) ||

[|Ionosphere]
351 data records, with class division 224 (63.8%) + 126 (35.9%). Usually the first 200 vectors are taken for training and the last 151 for the test, but this is very unbalanced: in the training set 101 (50.5%) and 99 (49.5%) are from class 1/2, in the test set 123 (82%) and 27 (18%) are from class 1/2.
34 attributes, but f2=0 always and should be removed; f1 is binary, the remaining 32 attributes are continuous.
2 classes - different types of radar signals reflected from the ionosphere.
Some vectors: 8, 18, 20, 22, 24, 30, 38, 52, 76, 78, 80, 82, 103, 163, 169, 171, 183, 187, 189, 191, 201, 215, 219, 221, 223, 225, 227, 229, 231, 233, 249, are either binary 0, 1 or have only 3 values -1, 0, +1.
For example, vector 169 has only one component = 1, all others are 0.
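
Removing an attribute that never changes, such as f2 here, is easy to automate. The following sketch finds and drops constant columns; the helper names and the three toy rows are ours and only mimic the situation described (second column always 0).

```python
def constant_columns(rows):
    """Return 0-based indices of columns that take a single value in all rows."""
    n_cols = len(rows[0])
    return [j for j in range(n_cols) if len({row[j] for row in rows}) == 1]


def drop_columns(rows, cols):
    """Return copies of the rows without the listed columns."""
    skip = set(cols)
    return [[v for j, v in enumerate(row) if j not in skip] for row in rows]


# Hypothetical toy rows: first column binary, second column always 0.
rows = [
    [1, 0, 0.5, -0.2],
    [0, 0, 0.1, 0.3],
    [1, 0, -0.7, 0.3],
]
const = constant_columns(rows)
reduced = drop_columns(rows, const)
print(const, reduced[0])
```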

Perceptron+MLP results:
Sigillito, V. G., Wing, S. P., Hutton, L. V., & Baker, K. B. (1989) Classification of radar returns from the ionosphere using neural networks. Johns Hopkins APL Technical Digest, 10, 262-266.
N. Shang, L. Breiman, ICONIP'96, p.133
David Aha: k-NN+C4+IB3, from Aha, D. W., & Kibler, D. (1989). Noise-tolerant instance-based learning algorithms. In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (pp. 794-799). Detroit, MI: Morgan Kaufmann.
IB3 parameter settings: 70% and 80% for acceptance and dropping respectively.
RIAC, C4.5 from: H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Tech. Rep. CS 96-06, Regina University 1996.
K.P. Bennett, J. Blue, [|A Support Vector Machine Approach to Decision Trees], R.P.I Math Report No. 97-100, Rensselaer Polytechnic Institute, Troy, NY, 1997
The training/test division is not too good in this case; the distributions are a bit different. The first table below gives results on this training/test split; the second table gives 10xCV results.
 * Method || Accuracy % || Reference ||
 * 3-NN + simplex || 98.7 || Our own weighted kNN ||
 * VSS 2 epochs || 96.7 || MLP with numerical gradient ||
 * 3-NN || 96.7 || KG, GM with or without weights ||
 * IB3 || 96.7 || Aha, 5 errors on test ||
 * 1-NN, Manhattan || 96.0 || GM kNN (our) ||
 * MLP+BP || 96.0 || Sigillito ||
 * SVM Gaussian || 94.9±2.6 || GM (our), defaults, similar for C=1-100 ||
 * C4.5 || 94.9 || Hamilton ||
 * 3-NN Canberra || 94.7 || GM kNN (our) ||
 * RIAC || 94.6 || Hamilton ||
 * C4 (no windowing) || 94.0 || Aha ||
 * C4.5 || 93.7 || Bennett and Blue ||
 * SVM || 93.2 || Bennett and Blue ||
 * Non-lin perceptron || 92.0 || Sigillito ||
 * FSM + rotation || 92.8 || our ||
 * 1-NN, Euclidean || 92.1 || Aha, GM kNN (our) ||
 * DB-CART || 91.3 || Shang, Breiman ||
 * Linear perceptron || 90.7 || Sigillito ||
 * OC1 DT || 89.5 || Bennett and Blue ||
 * CART || 88.9 || Shang, Breiman ||
 * SVM linear || 87.1±3.9 || GM (our), defaults ||
 * GTO DT || 86.0 || Bennett and Blue ||

VSS is an MLP with search, implemented by Mirek Kordos, used with 3 epochs; neurons may be sigmoidal or step-wise (64 values).
Maszczyk T, Duch W, [|Support Feature Machine], WCCI 2010 (submitted).
 * Method || Accuracy % || Reference ||
 * SFM+G+G(WX) || ??±2.6 || GM (our), C=1, s=2-5 ||
 * kNN auto+WX+G(WX) || ??.4±3.6 || GM (our) ||
 * SVM Gaussian || 94.6±4.3 || GM (our), C=1, s=2-5 ||
 * VSS-MKNN || 91.5±4.3 || MK, 12 neurons (similar 8-17) ||
 * SVM lin || 89.5±3.8 || GM (our), C=1, s=2-5 ||
 * SSV tree || 87.8±4.5 || GM (our), default ||
 * 1-NN || 85.8±4.9 || GM std, Euclid ||
 * 3-NN || 84.0±5.4 || GM std, Euclid ||

[|Sonar: Mines vs Rocks]
208 cases, 60 continuous attributes, 2 classes, 111 metal, 97 rock.
From the [|CMU benchmark repository]
This dataset has been used in two kinds of experiments:
1. The "aspect-angle independent" experiments use all 208 cases with 13-fold crossvalidation, averaged over 10 runs to get std.
2. The "aspect-angle dependent" experiments use training / test sets with 104 vectors each. Class distribution in training is 49 + 55, in test 62 + 42.
Estimation of L1O on the whole dataset (Opper and Winther) gives only 78.2%; is the test so easy? Some of these results were obtained without standardization of the data, which is very important here!
The first table below gives results of the "aspect-angle dependent" experiments with training / test sets; the second table gives the "aspect-angle independent" experiments with 13-fold CV on all data.
 * Method || Train % || Test % || Reference ||
 * 1-NN, 5D from MDS, Euclid, std ||  || 97.1 || our, GM (WD) ||
 * 1-NN, Manhattan std ||  || 97.1 || our, GM (WD) ||
 * 1-NN, Euclid std ||  || 96.2 || our, GM (WD) ||
 * TAP MFT Bayesian || -- || 92.3 || Opper, Winther ||
 * Naive MFT Bayesian || -- || 90.4 || Opper, Winther ||
 * SVM || -- || 90.4 || Opper, Winther ||
 * MLP+BP, 12 hidden, best MLP || -- || 90.4 || Gorman, Sejnowski ||
 * 1-NN, Manhattan raw ||  || 92.3 || our, GM (WD) ||
 * 1-NN, Euclid raw ||  || 91.3 || our, GM (WD) ||
 * FSM - methodology ? ||  || 83.6 || our (RA) ||

M. Opper and O. Winther, Gaussian Processes and SVM: Mean Field Results and Leave-One-Out. In: Advances in Large Margin Classifiers, Eds. A. J. Smola, P. Bartlett, B. Schölkopf, D. Schuurmans, MIT Press, 311-326, 2000; same methodology as Gorman and Sejnowski.
N. Shang, L. Breiman, ICONIP'96, p.133, 10xCV
Gorman, R. P., and Sejnowski, T. J. (1988). "Analysis of Hidden Units in a Layered Network Trained to Classify Sonar Targets", Neural Networks 1, pp. 75-89, 13xCV
Our results: kNN results from 10xCV and from 13xCV are quite similar, so the Shang and Breiman results should not differ much from 13xCV.
WD leave-one-out (L1O) estimations on std data:
L1O with k=1, Euclidean distance, for all data gives 87.50%; other k and distance functions do not give significant improvement.
SVM linear, C=1, L1O 75.0%; for Gaussian kernel, C=1, L1O is 78.8%
Other L1O results (last table in this section) are taken from C. Domeniconi, J. Peng, D. Gunopulos, "An adaptive metric for pattern classification".
 * 1-NN Euclid on 5D MDS input ||  || 87.5±0.8 || our GM (WD) ||
 * 1-NN Euclidean, std data ||  || 86.8±1.2 || our GM (WD) ||
 * 1-NN Manhattan, std data ||  || 86.3±0.3 || our GM (WD) ||
 * MLP+BP, 12 hidden || 99.8±0.1 || 84.7±5.7 || Gorman, Sejnowski ||
 * 1-NN Manhattan, raw data ||  || 84.5±0.4 || our GM (WD) ||
 * MLP+BP, 24 hidden || 99.8±0.1 || 84.5±5.7 || Gorman, Sejnowski ||
 * MLP+BP, 6 hidden || 99.7±0.2 || 83.5±5.6 || Gorman, Sejnowski ||
 * SVM linear, C=0.1 ||  || 82.7±8.5 || our GM (WD), std data ||
 * 1-NN Euclidean, raw data ||  || 82.1±0.9 || our GM (WD) ||
 * SVM Gauss, C=1, s=0.1 ||  || 77.4±10.1 || our GM (WD), std data ||
 * SVM linear, C=1 ||  || 76.9±11.9 || our GM (WD), raw data ||
 * SVM linear, C=1 ||  || 76.0±9.8 || our GM (WD), std data ||
 * DB-CART, 10xCV ||  || 81.8 || Shang, Breiman ||
 * CART, 10xCV ||  || 67.9 || Shang, Breiman ||


 * Discriminant Adaptive NN, DANN ||  || 92.3 ||
 * Adaptive metric NN ||  || 90.9 ||
 * kNN ||  || 87.5 ||
 * SVM Gauss C=1 ||  || 78.8 ||
 * C4.5 ||  || 76.9 ||
 * SVM linear C=1 ||  || 75.0 ||
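
Since standardization matters so much for distance-based methods on this data, here is a minimal sketch of a leave-one-out (L1O) estimate for 1-NN with Euclidean distance, run on raw and on standardized inputs. The six toy vectors are hypothetical and were chosen so that one large-scale feature dominates the raw distance; the function names are ours. Note that standardizing once on all data, as done here for brevity, lets each held-out vector influence the means and variances.

```python
from math import sqrt


def standardize(rows):
    """Rescale every column to zero mean and unit standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has zero deviation; divide by 1.0 instead.
    stds = [sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def euclidean(x, y):
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def loo_1nn_accuracy(rows, labels):
    """Classify each vector by its nearest neighbour among all the others."""
    correct = 0
    for i, x in enumerate(rows):
        others = (j for j in range(len(rows)) if j != i)
        nearest = min(others, key=lambda j: euclidean(x, rows[j]))
        correct += labels[nearest] == labels[i]
    return correct / len(rows)


# Hypothetical data: the class depends on feature 1, feature 2 is large-scale noise.
rows = [[1.0, 1000.0], [1.1, 3000.0], [0.9, 5000.0],
        [5.0, 1100.0], [5.1, 2900.0], [4.9, 5100.0]]
labels = ["M", "M", "M", "R", "R", "R"]

raw_acc = loo_1nn_accuracy(rows, labels)
std_acc = loo_1nn_accuracy(standardize(rows), labels)
print(raw_acc, std_acc)
```

On these toy vectors the raw-data estimate is dominated by the second feature, while the standardized version recovers the class structure; real data will of course show smaller differences.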

[|Vowel]
528 training, 462 test cases, 10 continuous attributes, 11 classes
From the [|UCI benchmark repository].
Speaker independent recognition of the eleven steady state vowels of British English using a specified training set of LPC derived log area ratios.
Results on the total set
 * Method || Train || Test || Reference ||
 * CART-DB, 10xCV on total set !!! ||  || 90.0 || Shang, Breiman ||
 * CART, 10xCV on total set ||  || 78.2 || Shang, Breiman ||

N. Shang, L. Breiman, ICONIP'96, p.133, made 10xCV instead of using the test set.
 * Method || Train || Test || Reference ||
 * Square node network, 88 units ||  || 54.8 || UCI ||
 * Gaussian node network, 528 units ||  || 54.6 || UCI ||
 * 1-NN, Euclidean, raw || 99.24 || 56.3 || WD/KG ||
 * Radial Basis Function, 528 units ||  || 53.5 || UCI ||
 * Gaussian node network, 88 units ||  || 53.5 || UCI ||
 * FSM Gauss, 10CV on the training set || 92.60 || 51.94 || our (RA) ||
 * Square node network, 22 ||  || 51.1 || UCI ||
 * Multi-layer perceptron, 88 hidden ||  || 50.6 || UCI ||
 * Modified Kanerva Model, 528 units ||  || 50.0 || UCI ||
 * Radial Basis Function, 88 units ||  || 47.6 || UCI ||
 * Single-layer perceptron, 88 hidden ||  || 33.3 || UCI ||

Telugu Vowel
871 patterns, 6 overlapping vowel classes (Indian Telugu vowel sounds), 3 features (formant frequencies).

Parameters in SVM were optimized, that is, in each CV different parameters were used, so only an approximate value can be quoted. If they are fixed to C=1000, s=1, results are a bit worse.
Papers using this data are listed after the table.
 * Method || Test || Reference ||
 * **10xCV tests below** ||  ||   ||
 * 3-NN, Manhattan || 87.8±4.0 || Kosice ||
 * 3-NN, Canberra || 87.8±4.2 || WD/GM ||
 * FSM, 65 Gaussian nodes || 87.4±4.5 || Kosice ||
 * 3-NN, Euclid || 87.3±3.9 || WD/GM ||
 * SSV dec. tree, 22 rules || 86.0±?? || Kosice ||
 * SVM Gauss opt C~1000, s~1 || 85.0±4.0 || WD, Ghostminer ||
 * SVM Gauss C=1000, s=1 || 83.5±4.1 || WD, Ghostminer ||
 * SVM, Gauss, C=1, s=0.1 || 76.6±2.5 || WD, Ghostminer ||
 * **2xCV tests below** ||  ||   ||
 * 3-NN, Euclidean || 86.1±0.6 || Kosice ||
 * FSM, 40 Gaussian nodes || 85.2±1.2 || Kosice ||
 * MLP || 84.6 || Pal ||
 * Fuzzy MLP || 84.2 || Pal ||
 * SSV dec. tree, beam search || 83.3±0.9 || Kosice ||
 * SSV dec. tree, best first || 83.0±1.0 || Kosice ||
 * Bayes Classifier || 79.2 || Pal ||
 * Fuzzy SOM || 73.5 || Pal ||


 * S. K. Pal and D. Dutta Majumder, "Fuzzy sets and decision making approaches in vowel and speaker recognition", IEEE Transactions on Systems, Man, and Cybernetics, Vol. 7, pp. 625-629, 1977.
 * S. Mitra, M. Banerjee and S. K. Pal, Rough knowledge-based network, fuzziness and classification, Neural Computing & Applications 7, 17-25, 1998.
 * Duch W and Hayashi Y, Computational intelligence methods and data understanding. In: Quo Vadis Computational Intelligence? New trends and approaches in computational intelligence. Eds. P. Sincak, J. Vascak, Springer Studies in Fuzziness and Soft Computing, Vol. 54 (2000), pp. 256-270.
 * Chaoshun Li, Jianzhong Zhou, Qingqing Li and Xiuqiao Xiang, A Fuzzy Cluster Algorithm Based on Mutative Scale Chaos Optimization, LNCS 5264, 259-267, 2008.

[|Wine data]
Source: UCI, described in Forina, M. et al, PARVUS - An Extendible Package for Data Exploration, Classification and Correlation. Institute of Pharmaceutical and Food Analysis and Technologies, Via Brigata Salerno, 16147 Genoa, Italy.
These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.
Class distribution: 178 cases = [59, 71, 48] in Class 1-3;
13 continuous attributes: alcohol, malic-acid, ash, alkalinity, magnesium, phenols, flavanoids, nonanthocyanins, proanthocyanins, color, hue, OD280/OD315, proline.

UCI past usage:
[1] S. Aeberhard, D. Coomans and O. de Vel, Comparison of Classifiers in High Dimensional Settings, Tech. Rep. no. 92-02, (1992), Dept. of Computer Science and Dept. of Mathematics and Statistics, James Cook University of North Queensland (submitted to Technometrics).
[2] S. Aeberhard, D. Coomans and O. de Vel, "The classification performance of RDA", Tech. Rep. no. 92-01, (1992), Dept. of Computer Science and Dept. of Mathematics and Statistics, James Cook University of North Queensland (submitted to Journal of Chemometrics).
 * Method || Test || Reference ||
 * **Leave-one-out test results** ||  ||   ||
 * RDA || 100 || [1] ||
 * QDA || 99.4 || [1] ||
 * LDA || 98.9 || [1] ||
 * kNN, Manhattan, k=1 || 98.7 || GM-WD, std data ||
 * 1NN || 96.1 || [1] z-transformed data ||
 * kNN, Euclidean, k=1 || 95.5 || GM-WD, std data ||
 * kNN, Chebyshev, k=1 || 93.3 || GM-WD, std data ||
 * **10xCV tests below** ||  ||   ||
 * kNN, Manhattan, auto k=1-10 || 98.9±2.3 || GM-WD, 2D data, after MDS/PCA ||
 * IncNet, 10CV, def, Gauss || 98.9±2.4 || GM-WD, std data, up to 3 neurons ||
 * 10 CV SSV, opt prune || 98.3±2.7 || GM-WD, 2D data, after MDS/PCA ||
 * 10 CV SSV, node count 7 || 98.3±2.7 || GM-WD, 2D data, after MDS/PCA ||
 * kNN, Euclidean, k=1 || 97.8±2.8 || GM-WD, 2D data, after MDS/PCA ||
 * kNN, Manhattan, k=1 || 97.8±2.9 || GM-WD, 2D data, after MDS/PCA ||
 * kNN, Manhattan, auto k=1-10 || 97.8±3.9 || GM-WD ||
 * kNN, Euclidean, k=3, weighted features || 97.8±4.7 || GM-WD ||
 * IncNet, 10CV, def, bicentral || 97.2±2.9 || GM-WD, std data, up to 3 neurons ||
 * kNN, Euclidean, auto k=1-10 || 97.2±4.0 || GM-WD ||
 * 10 CV SSV, opt node || 97.2±5.4 || GM-WD, 2D data, after MDS/PCA ||
 * FSM a=.99, def || 96.1±3.7 || GM-WD, 2D data, after MDS/PCA ||
 * FSM 10CV, Gauss, a=.999 || 96.1±4.7 || GM-WD, std data, 8-11 neurons ||
 * FSM 10CV, triang, a=.99 || 96.1±5.9 || GM-WD, raw data ||
 * kNN, Euclidean, k=1 || 95.5±4.4 || GM-WD ||
 * 10 CV SSV, opt node, BFS || 92.8±3.7 || GM-WD ||
 * 10 CV SSV, opt node, BS || 91.6±6.5 || GM-WD ||
 * 10 CV SSV, opt prune, BFS || 90.4±6.1 || GM-WD ||

[|Glass identification]
Shang, Breiman: CART 71.4% accuracy, DB-CART 70.6%.
Leave-one-out results taken from C. Domeniconi, J. Peng, D. Gunopulos, "An adaptive metric for pattern classification".


 * Adaptive metric NN ||  || 75.2 ||
 * Discriminant Adaptive NN, DANN ||  || 72.9 ||
 * kNN ||  || 72.0 ||
 * C4.5 ||  || 68.2 ||

==[|DNA-Primate splice-junction gene sequences], with associated imperfect domain theory==
StatLog data: splice junctions are points on a DNA sequence at which "superfluous" DNA is removed during the process of protein creation in higher organisms. The problem posed in this dataset is to recognize, given a sequence of DNA, the boundaries between exons (the parts of the DNA sequence retained after splicing) and introns (the parts of the DNA sequence that are spliced out).
This problem consists of two subtasks: recognizing exon/intron boundaries (referred to as EI sites), and recognizing intron/exon boundaries (IE sites). (In the biological community, IE borders are referred to as "acceptors" while EI borders are referred to as "donors".)
Number of attributes: originally 60 attributes {a,c,t,g}, usually converted to 180 binary indicator variables {(0,0,0), (0,0,1), (0,1,0), (1,0,0)}, or 240 binary variables.
Much better performance is generally observed if attributes closest to the junction are used (middle). In the StatLog version (180 variables), this means using attributes A61 to A120 only.
Number of instances: 3190. Class distribution:
 * Class || Train || Test ||
 * 1 || 464 (23.20%) || 303 (25.55%) ||
 * 2 || 485 (24.25%) || 280 (23.61%) ||
 * 3 || 1051 (52.55%) || 603 (50.84%) ||
 * All || 2000 (100%) || 1186 (100%) ||
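
The conversion from 60 nucleotides to 180 binary indicators, and the selection of the middle attributes A61 to A120, can be sketched as follows. The four 3-bit codes are those listed above, but which nucleotide receives which code is an assumption made here for illustration; the helper names and the synthetic sequence are ours as well, and ambiguous symbols in the raw data are not handled.

```python
# The four 3-bit codes listed in the text; the nucleotide-to-code assignment
# below is an assumption, the source does not specify it.
CODES = {"a": (1, 0, 0), "c": (0, 1, 0), "g": (0, 0, 1), "t": (0, 0, 0)}


def encode(sequence):
    """Turn a nucleotide string into a flat list of binary indicators."""
    bits = []
    for base in sequence.lower():
        bits.extend(CODES[base])
    return bits


def middle_attributes(bits):
    """Keep attributes A61..A120 (1-based), i.e. nucleotides 21..40."""
    return bits[60:120]


# Synthetic 60-nucleotide sequence, not taken from the dataset.
seq = "acgt" * 15
bits = encode(seq)
middle = middle_attributes(bits)
print(len(bits), len(middle))
```

With 3 bits per nucleotide, attributes A61 to A120 correspond to the 20 nucleotides closest to the junction (positions 21 to 40 of the 60).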

kNN GM - [|GhostMiner] version of kNN (our group)
SSV Decision Tree - our results
 * Method || % in training || % on test || Time train || Time test ||
 * RBF, 720 nodes || 98.5 || 95.9 ||  ||   ||
 * k-NN GM, p(X|C), k=6, Euclid, raw || 96.8 || 95.5 || 0 || short ||
 * Dipol92 || 99.3 || 95.2 || 213 || 10 ||
 * Alloc80 || 93.7 || 94.3 || 14394 || -- ||
 * QuaDisc || 100.0 || 94.1 || 1581 || 809 ||
 * LDA, Discrim || 96.6 || 94.1 || 929 || 31 ||
 * FSM, 8 Gaussians, 180 binary || 95.4 || 94.0 ||  ||   ||
 * Log DA, Disc || 99.2 || 93.9 || 5057 || 76 ||
 * SSV Tree, p(X|C), opt node, 4CV || 94.8 || 93.4 || short || short ||
 * Naive Bayes || 94.8 || 93.2 || 52 || 15 ||
 * Castle, middle 90 binary var || 93.9 || 92.8 || 397 || 225 ||
 * IndCart, 180 binary || 96.0 || 92.7 || 523 || 516 ||
 * C4.5, on 60 features || 96.0 || 92.4 || 9 || 2 ||
 * CART, middle 90 binary var || 92.5 || 91.5 || 615 || 9 ||
 * MLP+BP || 98.6 || 91.2 || 4094 || 9 ||
 * Bayesian Tree || 99.9 || 90.5 || 82 || 11 ||
 * CN2 || 99.8 || 90.5 || 869 || 74 ||
 * New ID || 100.0 || 90.0 || 698 || 1 ||
 * Ac2 || 100.0 || 90.0 || 12378 || 87 ||
 * Smart || 96.6 || 88.5 || 79676 || 16 ||
 * Cal5 || 89.6 || 86.9 || 1616 || 8 ||
 * Itrule || 86.9 || 86.5 || 2212 || 6 ||
 * k-NN || 91.1 || 85.4 || 2428 || 882 ||
 * Kohonen || 89.6 || 66.1 || - || - ||
 * **Default, majority** || 52.5 || 50.8 ||  ||   ||
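
The "Default, majority" rows in these tables are the accuracy of always predicting the most frequent training class. A minimal sketch, using the DNA class counts from the class distribution table of this section (the function name is ours):

```python
def majority_baseline(train_counts, test_counts):
    """Accuracy (%) on train and test of always predicting the majority training class."""
    label = max(train_counts, key=train_counts.get)
    train_acc = 100.0 * train_counts[label] / sum(train_counts.values())
    test_acc = 100.0 * test_counts[label] / sum(test_counts.values())
    return label, train_acc, test_acc


# Class counts for the DNA splice-junction data (train 2000, test 1186).
dna_train = {1: 464, 2: 485, 3: 1051}
dna_test = {1: 303, 2: 280, 3: 603}

label, train_acc, test_acc = majority_baseline(dna_train, dna_test)
print(label, round(train_acc, 2), round(test_acc, 2))
```

Any classifier worth reporting should be clearly above this baseline; on strongly unbalanced data such as the thyroid set the baseline is already above 92%.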

<span style="display: block; font-family: Arial,Helvetica; text-align: right;">//[|Włodzisław Duch], last modification 26.08.2012//