Features of TunedIT Challenges Platform

This is a general list of features of the TunedIT Challenges platform. Some of them may be unavailable in a given challenge type. See the comparison of challenge types for more details.

General

Fundamentals

Participant registration, submission of solutions, publication of results – all managed by the TunedIT web site.

Leaderboard & Interactivity

[Screenshot: the Leaderboard page]

Preliminary scores for all participants are published on the Leaderboard page immediately after a solution is submitted. In this way, participants receive online feedback about their performance, can compare it with others and improve their solutions, actively searching for the best algorithm throughout the whole duration of the contest, from its very beginning till the very end. The Leaderboard, and the interactivity it provides, is the single most important feature that makes TunedIT Challenges so valuable, be it for didactic, scientific or industrial purposes.

View the Leaderboard of the ICDM Contest as an example. This contest has already finished, so Final scores are visible alongside the Preliminary ones.

Automated evaluation of solutions

Tutors may relax: they no longer have to mark piles of assignments manually.

Multiple submissions by a participant

Participants have a chance to improve their solutions many times (an upper limit on the number of submissions can be specified).

Two-stage evaluation: preliminary and final

For the sake of full objectivity, the final results – published at the end of the contest – may be calculated on a different subset of the test data than the preliminary ones shown on the Leaderboard during the competition.
See also: Preliminary vs. Final Evaluation.
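
A minimal sketch of how such a split might be made, assuming the held-out test set is divided once with a fixed random seed so that the split stays identical for every submission; the fraction, seed and names are illustrative assumptions, not part of the TunedIT platform.

```java
import java.util.*;

/** Illustrative two-stage split: a preliminary part scored after every
 *  submission, and a final part scored only when the contest ends.
 *  All names and values here are hypothetical. */
public class TwoStageSplit {
    public static void main(String[] args) {
        int testSize = 10_000;
        double preliminaryFraction = 0.3;   // e.g. 30% preliminary, 70% final

        // Shuffle indices once with a fixed seed so the split never changes
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < testSize; i++) indices.add(i);
        Collections.shuffle(indices, new Random(42));

        int cut = (int) (testSize * preliminaryFraction);
        Set<Integer> preliminary = new HashSet<>(indices.subList(0, cut));
        Set<Integer> finalPart   = new HashSet<>(indices.subList(cut, testSize));

        System.out.println("Preliminary examples: " + preliminary.size());
        System.out.println("Final examples: " + finalPart.size());
    }
}
```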

Pluggable evaluation procedures

All types of datasets and tasks are supported. If the contest problem is atypical, a custom evaluation procedure can be used.
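
To give a feel for what a custom procedure amounts to, here is a minimal Java sketch of a hypothetical plug-in interface together with a plain classification-accuracy implementation. The interface and class names are assumptions made for illustration and do not reflect the actual TunedIT evaluation API.

```java
import java.util.List;

/** Hypothetical plug-in point for a custom evaluation procedure;
 *  the real TunedIT API may differ. */
interface EvaluationProcedure {
    /** Returns a single score comparing submitted predictions with the ground truth. */
    double evaluate(List<String> predictions, List<String> groundTruth);
}

/** Example plug-in: plain classification accuracy. */
class AccuracyProcedure implements EvaluationProcedure {
    @Override
    public double evaluate(List<String> predictions, List<String> groundTruth) {
        if (predictions.size() != groundTruth.size())
            throw new IllegalArgumentException("Prediction count does not match the test set size");
        int correct = 0;
        for (int i = 0; i < predictions.size(); i++)
            if (predictions.get(i).equals(groundTruth.get(i))) correct++;
        return (double) correct / predictions.size();
    }
}
```

An atypical problem would simply swap in a different implementation, for example one that parses structured predictions or applies a task-specific cost matrix.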

Multiple tracks

A competition may consist of several independent tasks, each with a separate leaderboard. For instance, you may have two variants of the same problem: Basic vs. Advanced, Classification vs. Regression, evaluation of plain predictions vs. evaluation of code, etc.

Discussion Forum

For Scientific and Industrial challenges, we will create a separate subforum in TunedIT Forums, dedicated specifically to your contest. This is the best way to communicate effectively with participants, answer their questions and let them share their thoughts about the challenge.

Group e-mails to all participants

You can send e-mails to all registered participants directly from the challenge web page. This makes it easy to post announcements and reminders.

History of submissions

Participants may view a detailed list of all the solutions they have submitted so far, with submission dates and scores.

Central file repository

All contest files – training and test datasets, images, additional documents – are kept in one place on the TunedIT server.

Restricted access to resources

You can decide that selected resources, such as training and test data files, are available for download only to registered participants of the challenge. Other users must register before they can download these resources.

Private Task and Leaderboard pages

Student challenges have a more private character than Scientific and Industrial ones, so you may prefer to keep secret not only the resources, but also the other contents of the challenge. For this reason, the contents of the Task and Leaderboard pages of Student challenges are visible solely to registered participants.

Advanced

Evaluation of code

Typically, participants in data mining contests submit plain text files with test set predictions, which are compared with the expected ground truth to calculate the score. In many cases this is inadequate, for example when you would like to measure the time and memory complexity of algorithms, employ a cross-validation procedure, etc. In TunedIT, solutions can take the form of executable Java code, and you can prepare a dedicated evaluation procedure that handles algorithm evaluation in exactly the way you wish.
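
The following sketch shows, in simplified form, what evaluating executable code can involve: the submitted solution is a Java class implementing an agreed interface, and the evaluation procedure measures both its accuracy and its running time. The interface and method names are assumptions made for illustration, not the actual TunedIT contract.

```java
/** Hypothetical interface that a submitted solution would implement. */
interface SubmittedClassifier {
    void train(double[][] features, int[] labels);
    int[] predict(double[][] features);
}

/** Illustrative evaluation of submitted code: measures accuracy and wall-clock time. */
class CodeEvaluator {
    static double[] evaluate(SubmittedClassifier solution,
                             double[][] trainX, int[] trainY,
                             double[][] testX, int[] testY) {
        long start = System.nanoTime();
        solution.train(trainX, trainY);
        int[] predicted = solution.predict(testX);
        double seconds = (System.nanoTime() - start) / 1e9;

        int correct = 0;
        for (int i = 0; i < testY.length; i++)
            if (predicted[i] == testY[i]) correct++;
        double accuracy = (double) correct / testY.length;

        return new double[] { accuracy, seconds };   // quality score and measured time
    }
}
```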

Secure Data Sandbox

In software engineering, a sandbox is a special place where new code can be safely tested without breaking the outside environment. TunedIT extends this idea to the domain of data security. Our unique Secure Data Sandbox (SDS) technology enables remote investigation of the data and black-box evaluation of the algorithms submitted by participants, without revealing the dataset itself. The sandbox is located either on our machines or on yours. In the latter case, even the TunedIT team itself doesn't have access to your data, which are kept secret on your computers. Contact us at services(AT)tunedit.org to learn more.

Error logs

If evaluation fails, for instance due to an incorrect solution format, the evaluation procedure may generate an error message that will be accessible to the author on their History of Submissions page. If executable code is evaluated, the message may contain the full stack trace of the error.

Time limits

If you evaluate executable code, you can specify a time limit for a single evaluation. If the evaluation lasts longer, it is terminated with a "timeout" error that is then passed to the author.
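
A minimal sketch of how such a limit can be enforced, assuming a single evaluation run can be wrapped in a Java Callable; this illustrates the mechanism only and is not the platform's actual implementation.

```java
import java.util.concurrent.*;

/** Illustrative time-limit enforcement around one evaluation run. */
class TimedEvaluation {
    static double runWithTimeout(Callable<Double> evaluation, long limitSeconds)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Double> result = executor.submit(evaluation);
        try {
            return result.get(limitSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            result.cancel(true);   // interrupt the evaluation thread
            throw new RuntimeException("timeout: evaluation exceeded " + limitSeconds + " s", e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```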

Parallel & redundant evaluation

If you evaluate executable code and a single run takes a long time, you can speed up the entire process and shorten the queue of pending solutions by launching multiple instances of TunedTester on different machines. The instances will automatically distribute the workload between each other, with no additional burden on you. Parallelism is also the best way to provide redundancy and robustness in case any of the machines fail.

Configurable metric type

You can specify whether your evaluation metric measures gain (higher values are better, like classification accuracy) or loss (lower values are better, like Mean Squared Error) - results on the Leaderboard will be ordered accordingly.
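
The sketch below illustrates the effect of this setting: the same list of results is sorted descending for a gain metric and ascending for a loss metric. The class and field names are illustrative only.

```java
import java.util.*;

/** Illustrative Leaderboard ordering for gain vs. loss metrics. */
class LeaderboardOrdering {
    enum MetricType { GAIN, LOSS }

    record Entry(String participant, double score) {}

    static List<Entry> order(List<Entry> entries, MetricType type) {
        Comparator<Entry> byScore = Comparator.comparingDouble(Entry::score);
        // GAIN: higher is better, sort descending; LOSS: lower is better, sort ascending
        Comparator<Entry> cmp = (type == MetricType.GAIN) ? byScore.reversed() : byScore;
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(cmp);
        return sorted;
    }
}
```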

Configurable precision on Leaderboard

You can specify how many digits after the decimal point are shown in the results on the Leaderboard. In this way you can account for evaluation metrics of a different scale, or limit the possibility of overtraining when the test set is small. Precision can be defined differently for preliminary and final evaluation.
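
As an illustration, rounding a raw score to the configured number of digits might look like the snippet below; the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Illustrative score rounding for Leaderboard display. */
class DisplayedScore {
    static BigDecimal round(double score, int decimalDigits) {
        return BigDecimal.valueOf(score).setScale(decimalDigits, RoundingMode.HALF_UP);
    }
    // e.g. round(0.871342, 3) and round(0.871289, 3) both display as 0.871,
    // so tiny differences on a small test set no longer separate participants
}
```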

Active solution - best or last?

Which solution undergoes final evaluation if a participant has submitted more than one? Depending on how much test data you have and how reliable the preliminary results are, you can choose whether the active solution is the best one (in terms of its preliminary result) or the last one submitted.
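
As an illustration of the two policies, the sketch below selects either the submission with the best preliminary score or the most recent one; the types and names are hypothetical, not part of the platform.

```java
import java.util.*;

/** Illustrative selection of the "active" submission for final evaluation. */
class ActiveSolution {
    enum Policy { BEST, LAST }

    record Submission(String id, double preliminaryScore, long submittedAt) {}

    static Submission select(List<Submission> submissions, Policy policy, boolean higherIsBetter) {
        Comparator<Submission> byScore = Comparator.comparingDouble(Submission::preliminaryScore);
        if (!higherIsBetter) byScore = byScore.reversed();   // for loss metrics, lower wins
        Comparator<Submission> byTime = Comparator.comparingLong(Submission::submittedAt);
        return submissions.stream()
                .max(policy == Policy.BEST ? byScore : byTime)
                .orElseThrow();
    }
}
```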

Administration

Secure registration of participants

The organizer can choose from three options for how participants register for the challenge:

  • With a secret code. Only people from a closed group who know the code will be able to register for the challenge.
  • With acceptance of Rules. Everyone can register, but acceptance of the challenge Rules is required. The organizer may freely adapt the Rules to their needs.
  • Automatically. This is the simplest form, where participants get registered automatically after submitting their first solution. Everyone can register in this way.

Dry run

After creation, a new challenge is in a draft state and invisible to other users (except those who know its URL). You can freely test it - have a dry run - before it gets published on the Challenges list and becomes visible to everyone. Even before publication the challenge has full functionality, apart from a limit on the number of participants (max 3): you can open it, register as a participant, submit solutions, evaluate them and see results on the Leaderboard, to check whether everything works as expected. Until publication, you are free to reset the challenge and reconfigure it as many times as you wish.

On-the-fly reconfiguration

Don't worry if you notice only after the start of the challenge that some settings should be changed or a dataset file should be fixed. TunedIT can handle reconfigurations without breaking the challenge or introducing inconsistencies. If necessary, participants' scores are automatically recalculated.