Impressive. Very impressive.
As you can see, I am again impressed by the annual SIGGRAPH conference, which took place last August and about which my colleagues reported. There were more than 28,000 participants, and the acceptance rate for the presented papers was below 20%. While the main focus of the conference is computer graphics, it also includes a wide range of presentations on 3D, image and video enhancement, and image processing in general. Alongside these technical sessions, there are also movie screenings and a computer animation festival.
But, apart from the high quality and interesting mix of topics, I also really like the way papers are presented, especially for people like me who did not attend the conference. Each paper starts off (after the title and author list) with a “telling illustration” that summarizes the paper graphically. It is a really nice way to get a quick idea of what the paper is about. Moreover, for most of those papers, the authors also have a nice video presenting their paper on their website. I have no idea whether that is mandatory, or whether one can find all those presentation videos on the ACM website. My colleagues also told me that all the presentations from this year’s SIGGRAPH conference would be recorded and made available online. I am curious! It’s still not the same as actually going there, but it is as close as I can get. For now.
One of the reproducibility problems with many current papers is that everyone applies their new algorithm to their own data set. I did so too, in my super-resolution work. The problem is that it is very difficult to assess whether a data set was used (a) because it was the one the author had at hand, (b) because it was the most representative one, or (c) because the algorithm performed best on it.
To allow fairer comparisons, competitions are being set up in various fields. Often, in the period before a conference, the organizers provide a common data set on which everyone can try their algorithm.
To my knowledge, the reproducible research efforts in computational sciences were started by Jon Claerbout (who retired earlier this year) in the early 90s. In the Stanford Exploration Project at Stanford University, Claerbout and his colleagues (working in seismic imaging) developed a system using Makefiles that allows one to remove all figures and rebuild them using a single Unix command. This allows any person (with a Unix/Linux system) to reproduce all the results in their work. I think it is about as close to “one-click reproducibility” as one can get! Claerbout and his lab performed a lot of the pioneering work in promoting reproducible research, which later spread to various disciplines. A history by Claerbout himself is available here.
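To make the remove-and-rebuild idea concrete, here is a minimal sketch in Python rather than in the Makefiles the lab actually used. Everything in it is an assumption made for illustration: the `FIGURES` registry, the function names `burn`, `build` and `rebuild_all`, and the toy `sine.tsv` “figure” are mine, not taken from Claerbout's system.

```python
import math
from pathlib import Path


def make_sine_table():
    """Hypothetical recipe: return the content of a small tab-separated table."""
    rows = [f"{i}\t{math.sin(i * math.pi / 4):.3f}" for i in range(5)]
    return "\n".join(rows) + "\n"


# Hypothetical registry mapping each figure file to the recipe that creates it.
FIGURES = {"sine.tsv": make_sine_table}


def burn(fig_dir):
    """Delete every registered figure that exists; return the removed names."""
    removed = []
    for name in FIGURES:
        path = Path(fig_dir) / name
        if path.exists():
            path.unlink()
            removed.append(name)
    return removed


def build(fig_dir):
    """Regenerate every registered figure that is missing; return the built names."""
    fig_dir = Path(fig_dir)
    fig_dir.mkdir(parents=True, exist_ok=True)
    built = []
    for name, recipe in FIGURES.items():
        path = fig_dir / name
        if not path.exists():
            path.write_text(recipe())
            built.append(name)
    return built


def rebuild_all(fig_dir):
    """Remove all figures, then reproduce them from their recipes."""
    burn(fig_dir)
    return build(fig_dir)
```

Calling `rebuild_all("figures")` plays the role of the single command: if a figure cannot be regenerated from its recipe, it simply disappears, which makes any gap in reproducibility immediately visible.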
In their work, Claerbout and his colleagues make a distinction between three types of figures/results. First of all, and most common, there are easily reproducible results, which can be reproduced by a reader using the code and data contained in the electronic document. Secondly, conditionally reproducible results are results for which the commands and data are given, but which can only be reproduced if certain resources are available (such as Matlab or Mathematica), or which take more than 20 minutes to reproduce. And finally, non-reproducible results: a label used for results that cannot be reproduced, such as hand-drawn figures, scans, or images taken from other documents for comparison.
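The three labels can be written down as a small decision rule. This is only my reading of the description above, sketched in Python: the names `Reproducibility` and `classify` and the three parameters are hypothetical, and only the 20-minute threshold comes from the text.

```python
from enum import Enum


class Reproducibility(Enum):
    EASY = "easily reproducible"
    CONDITIONAL = "conditionally reproducible"
    NON = "non-reproducible"


# Threshold taken from the description above: more than 20 minutes is "conditional".
MAX_EASY_MINUTES = 20


def classify(has_code_and_data, needs_external_resources, runtime_minutes):
    """Assign one of the three labels to a figure or result.

    has_code_and_data: the commands and data are shipped with the document.
    needs_external_resources: e.g. a Matlab or Mathematica installation is required.
    runtime_minutes: estimated time needed to regenerate the result.
    """
    if not has_code_and_data:
        # Hand-drawn figures, scans, images copied from other documents.
        return Reproducibility.NON
    if needs_external_resources or runtime_minutes > MAX_EASY_MINUTES:
        return Reproducibility.CONDITIONAL
    return Reproducibility.EASY
```

For example, a figure generated in two minutes by a bundled script would be labelled easily reproducible, while the same figure produced by a Matlab script would be conditionally reproducible.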
Their Makefile setup was recently developed further by Fomel et al. in the Madagascar project, which uses SCons, a build tool similar to Make that should make reproducibility even simpler, and cross-platform! See their project page for more details.