Sören Sonnenburg recently brought a few lesser-known benchmark data sets to my attention. Of course, benchmark data sets are always a double-edged sword: on the one hand, they are a great way to test and compare your learning algorithms, but on the other hand, you're usually no longer solving a real problem.
You probably already know the UCI repository and the DELVE repository. Here are links to a few benchmark data sets that are probably less well known:
Generic Benchmarking
- IDA dataset repository (a.k.a. “the Gunnar Benchmark Data Set”)
- libsvm datasets
- Datasets from cervisia.org
- KDD datasets
- Pascal Large Scale Challenge
Multiple Kernel Learning
Bioinformatics
Image Processing