Examples

Full downloadable versions of the following examples can be found in the Repository.

Classification - basic

Evaluation procedure

The evaluation procedure should compare the user's answers with the expected ones and compute the score on that basis.

For the classification problem, a very simple method is to compute the accuracy, as in the following code:

package classification;
 
(...)
 
public class ClassificationEvalSimple extends EvaluationProcedure {
 
        public Double[] run(ResourceName userLabelsName, ResourceName trueLabelsName, ResourceLoader loader)
                throws TunedTesterException
        {               
                /* Open the resources */
                InputStream userLabelsStream = loader.open(userLabelsName);
                InputStream trueLabelsStream = loader.open(trueLabelsName);
                BufferedReader userLabels = new BufferedReader(new InputStreamReader(userLabelsStream));
                BufferedReader trueLabels = new BufferedReader(new InputStreamReader(trueLabelsStream));
 
                try {                   
                        /* Evaluate the solution */
                        int correct = 0, count = 0;
                        String trueLabel;
                        while ((trueLabel = trueLabels.readLine()) != null) {
                                int u = Integer.parseInt(userLabels.readLine());
                                int t = Integer.parseInt(trueLabel);
                                if (u == t) correct++;
                                count++;
                        }
 
                        return new Double[] {correct / (double) count}; // Return the result in a one-element array of Doubles
 
                } catch (IOException e) {
                        throw new TunedTesterException(e);
                }
        }
}

You can upload the evaluation procedure in the 'Evaluation Settings' section on the challenge page.

Datasets

There are three main datasets you typically have to provide:

  • Training dataset for participants

A file containing the full data (i.e. the objects' features as well as the corresponding decisions). There are no restrictions on its format; however, the ARFF format is the standard and by far the most common choice.

This dataset will be used by the participants to train their algorithms. Therefore, it should be public. You can add it on the Repository page of your challenge (accessible via the link in the fact sheet on the challenge page). If you upload it to the 'public' subfolder, it will automatically be accessible to the participants.

An example dataset:

% Haberman's Survival Data

@relation haberman

@attribute Age_of_patient_at_time_of_operation INTEGER
@attribute Patients_year_of_operation {58,59,60,61,62,63,64,65,66,67,68,69}
@attribute Number_of_positive_axillary_nodes_detected INTEGER
@attribute Survival_status {1,2}

@data
30,64,1,1
30,65,0,1
31,65,4,1
33,60,0,1
34,66,9,2
34,60,1,1
34,67,7,1
35,64,13,1
36,60,1,1
37,60,0,1
37,58,0,1
37,60,15,1
38,69,21,2
38,60,0,1
(...)
  • Test dataset for participants

This file will be used by the participants as the input to their algorithms when generating the final solution. Of course, the participants should not know the decisions, so this dataset should contain only the objects' features. Apart from that, its format should be the same as above. It can be added via the Repository page and should be public as well.

Example:

% Haberman's Survival Data

@relation haberman

@attribute Age_of_patient_at_time_of_operation INTEGER
@attribute Patients_year_of_operation {58,59,60,61,62,63,64,65,66,67,68,69}
@attribute Number_of_positive_axillary_nodes_detected INTEGER

@data
30,62,3
31,59,2
33,58,10
34,59,0
34,58,30
(...)
  • Private test dataset

This dataset will be used in the automatic tests to compare the participants' solutions to the expected results. Thus, it should contain the list of ground-truth decisions corresponding to the features in the public test dataset. In the simplest case, nothing else is needed.

Obviously, the participants must not know these decisions, so this file should remain private. You can upload it in the 'Evaluation Settings' section on the challenge page.

This file's structure is very simple:

1
1
1
2
1
(...)

In fact, only the last one - i.e. the set of ground-truth decisions - is obligatory. The other two are optional, though usually necessary.

Solution

In a basic machine learning case, the solution should be a plain list of decisions. It can be in any format; the only restriction is that your evaluation procedure must be able to read it.

In our example, it is a simple .txt file with one decision per line - exactly the same format as the ground-truth list used for testing.
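For illustration, a participant could generate such a file with a few lines of Java. This is only a hypothetical sketch - the class name, the output file name and the predictions array are made up:

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
 
public class WriteSolution {
 
        public static void main(String[] args) throws IOException {
                int[] predictions = {1, 1, 1, 2, 1}; // placeholder for the algorithm's output
                PrintWriter out = new PrintWriter(new FileWriter("solution.txt"));
                for (int p : predictions)
                        out.println(p); // one decision per line, in the order of the test dataset
                out.close();
        }
}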

Regression

For the regression problem, we can use another standard scoring function - RMSE (Root Mean Squared Error), defined as sqrt( sum_i (u_i - t_i)^2 / N ), where u_i are the user's predictions, t_i the ground-truth values and N the number of samples:

package regression;
 
(...)
 
public class RegressionEval extends EvaluationProcedure {
 
        public final static boolean PRELIMINARY = true;
        public final static boolean FINAL = false;
 
        private final static double BASELINE = 181.2; // Baseline result
 
        private boolean prelim;
 
        public RegressionEval(boolean isPreliminary){
                prelim = isPreliminary;
        }
 
        public Double[] run(ResourceName userLabelsName, ResourceName trueLabelsName, ResourceLoader loader)
                throws TunedTesterException, EvaluationSetupException, AlgorithmErrorException
        {
                /* Open the resources */
                InputStream userLabelsStream = loader.open(userLabelsName);
                InputStream trueLabelsStream = loader.open(trueLabelsName);
                BufferedReader userLabels = new BufferedReader(new InputStreamReader(userLabelsStream));
                BufferedReader trueLabels = new BufferedReader(new InputStreamReader(trueLabelsStream));
 
                DatasetSplitter splitter = new DatasetSplitter();
 
                try {
                        double res = countRMSE(userLabels, trueLabels, splitter); // Evaluate the solution
                        return new Double[] {res};  // Return the result in a one-element array of Doubles
                } catch (IOException e) {
                        throw new TunedTesterException(e);
                }
        }
 
        private double countRMSE(BufferedReader userLabels, BufferedReader trueLabels, DatasetSplitter splitter) 
                throws IOException, EvaluationSetupException, AlgorithmErrorException 
        {
                int count = 0;
                double squaredErrorSum = 0;
                String trueLabel, userLabel;
 
                try{
                        while ((trueLabel = trueLabels.readLine()) != null) {
                                userLabel = userLabels.readLine();
                                // Check whether this Sample belongs to the current dataset (preliminary or final)
                                if (splitter.split(prelim)){ 
                                        double u = Double.parseDouble(userLabel);
                                        double t = Double.parseDouble(trueLabel);
                                        squaredErrorSum += Math.pow((u - t), 2);
                                        count++;
                                }
                        }
                } catch (NullPointerException e) {
                        throw new AlgorithmErrorException("Solution file too short.");
                } catch (NumberFormatException e) {
                        throw new AlgorithmErrorException("Solution file has an incorrect format.");
                }
 
                if (count == 0){
                        throw (new EvaluationSetupException("File with the ground truth labels is empty."));
                }
                return (BASELINE - Math.sqrt(squaredErrorSum / (double)count));
        }
 
        private class DatasetSplitter {         
                private final long seed = 93582;
                private final double preliminaryPercentage = 35; // Percentage of the whole dataset used for preliminary evaluation
                private Random random;
 
                public DatasetSplitter(){
                        random = new Random(seed);
                }
 
                public boolean split(boolean preliminary){
                        if (preliminary) {
                                return isPreliminary();
                        }else{
                                return !isPreliminary();
                        }
                }
 
                private boolean isPreliminary(){
                        return (random.nextDouble() * 100 <= preliminaryPercentage); // nextDouble() returns values in [0,1), so scale to percent
                }
        }
}

This one is a bit more complicated. First of all, the scoring function is different: it computes the RMSE (the lower, the better) and subtracts it from a baseline result calculated earlier, i.e. the returned score is BASELINE - RMSE. In this manner, the improvement over the baseline is measured.

There is also another modification: the tests are divided into a preliminary and a final part. Such a distinction is very useful. Participants can see their preliminary results and may tune their solutions to that particular dataset; however, since the final tests are run on a different subset, such overfitting will not affect the final results of the challenge.

In order to achieve this, we should split the whole dataset into two parts, preferably at random. A convenient way to do this is to use a pseudo-random number generator and decide on the fly which subset a given sample belongs to. Of course, we must provide a fixed seed for the generator; otherwise the subsets would differ on every run. The seed also has to be the same for both the preliminary and the final evaluation - if it is not, the two subsets will not be disjoint. This functionality is implemented by the DatasetSplitter class above.
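As a quick standalone illustration (a sketch, not part of the evaluation procedure; the class name is made up), two generators created with the same seed produce identical sequences, so the preliminary and the final run make exactly the same per-sample decisions:

import java.util.Random;
 
public class SeedDemo {
 
        public static void main(String[] args) {
                Random prelimRun = new Random(93582); // the same seed as in DatasetSplitter
                Random finalRun = new Random(93582);
                for (int i = 0; i < 10; i++) {
                        boolean p = prelimRun.nextDouble() * 100 <= 35; // draw made during the preliminary evaluation
                        boolean f = finalRun.nextDouble() * 100 <= 35;  // the final evaluation repeats the same draw
                        System.out.println(p == f); // always true; split(FINAL) negates the flag, so each sample lands in exactly one subset
                }
        }
}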

The evaluation procedure must know whether it runs a preliminary or a final test. Thus, we define twin procedures - one for each kind of test - that pass the appropriate flag to the common base class:

package regression;
 
(...)
public class RegressionPreliminaryEval extends RegressionEval {
 
        public RegressionPreliminaryEval() {
                super(PRELIMINARY); // all tests run by this procedure are preliminary
        }
}
package regression;
 
(...)
public class RegressionFinalEval extends RegressionEval {
 
        public RegressionFinalEval() {
                super(FINAL); // all tests run by this procedure are final
        }
}

You can upload them in the 'Evaluation Settings' section.

The datasets are analogous to those from the previous example. You can find them in the Repository.

Classification - advanced

As stated above, the datasets and solutions may be in any format. The following example shows how to implement an evaluation procedure that works on ZIP files.

package multipleFiles;
 
(...)
public class EvalDecisions extends EvaluationProcedure {
 
        public Double[] run(ResourceName predictionsName, ResourceName targetsName, ResourceLoader loader)
                        throws TunedTesterException, EvaluationSetupException, AlgorithmErrorException
        {               
                // Open resources
                if(!predictionsName.isFile())
                        throw new AlgorithmErrorException("Incorrect resource name: " + 
                                predictionsName + ". Expected file resource");
                if(!targetsName.isFile())
                        throw new EvaluationSetupException("Incorrect resource name: " + 
                                targetsName + ". Expected file resource");
 
                ZipInputStream targetsZip = new ZipInputStream(loader.open(targetsName));
                ZipInputStream predictionsZip = new ZipInputStream(loader.open(predictionsName));
 
                try {
                        Map<String, ArrayList<String>> targets, predictions;
                        try { 
                                targets = loadContents(targetsZip); 
                        }
                        catch(ZipException e) { 
                                throw new EvaluationSetupException("File with target decisions is not a correct ZIP file"); }
                        try {
                                predictions = loadContents(predictionsZip);
                        }
                        catch(ZipException e) { 
                                throw new AlgorithmErrorException("File with predicted decisions is not a correct ZIP file"); }
 
                        // Evaluate the solution
                        double res = compare(targets, predictions);
 
                        predictionsZip.close();
                        targetsZip.close();
 
                        return new Double[] { res };
                }
                catch(IOException e) {
                        throw new TunedTesterException(e);
                }
        }
 
        private double compare(Map<String, ArrayList<String>> targets, Map<String, ArrayList<String>> predictions)
                throws AlgorithmErrorException 
        {               
                double sumResult = 0.0;
                for(String file : new TreeSet<String>(targets.keySet())) {      // TreeSet sorts keys alphabetically
                        ArrayList<String> t = targets.get(file);
                        ArrayList<String> p = predictions.get(file);
                        if(p == null) throw new AlgorithmErrorException(
                                        "ZIP file with predicted decisions doesn't contain a file: " + file);
                        if(p.size() < t.size()) throw new AlgorithmErrorException(
                                        "File " + file + " in the ZIP with predicted decisions contains too few lines: " 
                                        + p.size() + " instead of " + t.size());
 
                        Score score = new Accuracy(); // standard scoring function from Debellor library
 
                        try {
                                for(int i = 0; i < t.size(); i++)
                                        score.add(new SymbolicFeature(t.get(i)), new SymbolicFeature(p.get(i)));
                        }
                        catch (DataException e) { e.printStackTrace(); }
 
                        sumResult += score.result();
                }
                return sumResult / targets.size();
        }
 
        private static Map<String, ArrayList<String>> loadContents(ZipInputStream zip) throws IOException 
        {
                Map<String, ArrayList<String>> map = new HashMap<String, ArrayList<String>>();
                ZipEntry entry;
                while((entry = zip.getNextEntry()) != null) {
                        System.out.println("Will load contents of: " + entry.getName());
                        map.put(entry.getName(), loadLines(zip));
                }
                return map;
        }
 
        private static ArrayList<String> loadLines(InputStream fileStream) throws IOException 
        {
                BufferedReader reader = new BufferedReader(new InputStreamReader(fileStream));
                ArrayList<String> lines = new ArrayList<String>();
                String line;
                while((line = reader.readLine()) != null)
                        lines.add(line.trim()); 
                return lines;
        }
}
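
For completeness, here is a minimal sketch of how a matching solution archive could be produced with the standard java.util.zip API. The class name, entry names and decisions are made up for illustration:

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;
 
public class PackSolution {
 
        public static void main(String[] args) throws IOException {
                ZipOutputStream zip = new ZipOutputStream(new FileOutputStream("solution.zip"));
                String[][] files = {
                        {"part1.txt", "1\n1\n2\n"}, // hypothetical per-file decisions
                        {"part2.txt", "2\n1\n1\n"}
                };
                for (String[] f : files) {
                        zip.putNextEntry(new ZipEntry(f[0])); // one entry per prediction file
                        zip.write(f[1].getBytes());
                        zip.closeEntry();
                }
                zip.close();
        }
}

The resulting solution.zip could then be submitted and evaluated by the EvalDecisions procedure above.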
