Incomplete and invalid submissions will not be evaluated.
The Leaderboards only show the latest submission, which is the one that will be considered for the final challenge results, so don't forget to submit your best results before the deadline.
For task 1, the Dice Similarity Coefficient (DSC) and Intersection over Union (IoU) are used to evaluate the performance of the segmentation methods.
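For reference, here is a minimal NumPy sketch of both segmentation metrics on binary masks. The function names and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not the organizers' official implementation.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice Similarity Coefficient (DSC) between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    # Assumed convention: a perfect score when both masks are empty.
    return 2.0 * intersection / denom if denom > 0 else 1.0

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over Union (IoU, Jaccard index) between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return intersection / union if union > 0 else 1.0
```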
For task 2 and task 3, the quadratic weighted kappa and the Area Under the Curve (AUC) are used to evaluate the performance of the classification methods. In particular, we use the macro-averaged one-vs-one AUC (macro-AUC-OvO) to calculate the AUC value.
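For illustration, both classification metrics can be computed with scikit-learn as sketched below. The toy labels and probability scores are invented placeholders, and this is not the official evaluation code.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, roc_auc_score

# Toy 3-class example; the shapes and label encoding are assumptions.
y_true = np.array([0, 1, 2, 1, 0, 2])   # ground-truth class labels
y_pred = np.array([0, 1, 1, 1, 0, 2])   # predicted class labels
y_score = np.array([                    # predicted class probabilities
    [0.8, 0.1, 0.1],
    [0.1, 0.7, 0.2],
    [0.2, 0.5, 0.3],
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.2, 0.7],
])

# Quadratic weighted kappa on the discrete predictions.
qwk = cohen_kappa_score(y_true, y_pred, weights="quadratic")
# Macro-averaged one-vs-one AUC on the probability scores.
macro_auc_ovo = roc_auc_score(y_true, y_score, multi_class="ovo", average="macro")
print(f"quadratic weighted kappa: {qwk:.4f}, macro-AUC-OvO: {macro_auc_ovo:.4f}")
```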
Only valid submissions on the test set will be used for ranking.
For each metric in task 1, the metric is first calculated for each label within each test case; the per-case scores are then averaged over the test cases for each label, and finally averaged over all labels. The averaged DSC (mean DSC on the Leaderboard) is used for the ranking of participating algorithms; in the event of a tie, the averaged IoU (mean IoU on the Leaderboard) is used as an auxiliary ranking metric, as sketched below.
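A minimal sketch of this two-step aggregation, assuming the per-case scores have already been collected in a dict keyed by label (the label names and values are made up for illustration):

```python
import numpy as np

# per_case_dsc[label] -> list of DSC scores, one per test case (toy values).
per_case_dsc = {
    "label_1": [0.91, 0.88, 0.95],
    "label_2": [0.80, 0.85, 0.78],
}

# Step 1: average over test cases for each label.
per_label_mean = {lab: float(np.mean(s)) for lab, s in per_case_dsc.items()}
# Step 2: average over labels gives the Leaderboard's "mean DSC".
mean_dsc = float(np.mean(list(per_label_mean.values())))
print(f"mean DSC: {mean_dsc:.4f}")
```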
For task 2 and task 3, the quadratic weighted kappa is used for the ranking of participating algorithms; in the event of a tie, the macro-AUC-OvO is used as an auxiliary ranking metric.
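This tie-breaking rule amounts to a lexicographic sort over the two metrics; the sketch below uses hypothetical team names and scores to show the idea.

```python
# Hypothetical results: (team, quadratic weighted kappa, macro-AUC-OvO).
results = [
    ("team_a", 0.82, 0.91),
    ("team_b", 0.82, 0.94),  # same kappa as team_a; higher AUC wins the tie
    ("team_c", 0.79, 0.97),
]
# Sort by kappa first; macro-AUC-OvO breaks ties. Higher is better for both.
ranking = sorted(results, key=lambda r: (r[1], r[2]), reverse=True)
for rank, (team, kappa, auc) in enumerate(ranking, start=1):
    print(rank, team, kappa, auc)
```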