Session 1
Ingmar Steiner
2017-04-26
“Best Practices for Reproducible Research”
(a.k.a. “Agile Research”, “DevOps for Research”, etc.)
Evolved out of personal experience, “what I wish my grad students had known already”.
This course could give you superpowers, and turn you into a rockstar researcher!
This project seminar requires active participation.
There will be regular assignments, as well as a mandatory final project and written report.
The course content is technical, but also flexible!
The systematic application of scientific and technological knowledge, methods, and experience to the design, implementation, testing, and documentation of software.
The systematic application of scientific and technological knowledge, methods, and experience to the design, implementation, testing, and documentation of research.
Any code run on a computer can be seen as software.
Research is no different.
Conducting and documenting research means developing its source code.
A small command-line interface (CLI) program that outputs a fortune
JFortune.java
public class JFortune {
    public String getFortune() {
        return "42";
    }

    public static void main(String[] args) {
        JFortune jfortune = new JFortune();
        String fortune = jfortune.getFortune();
        System.out.println(fortune);
    }
}
Add Main-Class header to manifest and package to JAR file
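To make the Main-Class header concrete, here is a minimal sketch that builds a manifest with the JDK's own java.util.jar classes, writes it into an in-memory JAR, and reads the header back. The class ManifestSketch and its method names are hypothetical and not part of the original slides; an actual build would also add JFortune.class to the archive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper for illustration only.
class ManifestSketch {

    // Builds a manifest whose Main-Class header names the entry point.
    static Manifest manifestFor(String mainClass) {
        Manifest manifest = new Manifest();
        Attributes attributes = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the headers are not written out.
        attributes.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attributes.put(Attributes.Name.MAIN_CLASS, mainClass);
        return manifest;
    }

    // Writes a JAR that contains only META-INF/MANIFEST.MF into memory.
    static byte[] jarWithManifest(Manifest manifest) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // A real build would add JFortune.class here via putNextEntry().
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the Main-Class header back out of the JAR bytes.
    static String mainClassOf(byte[] jarBytes) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            return jar.getManifest().getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] jar = jarWithManifest(manifestFor("JFortune"));
        System.out.println("Main-Class: " + mainClassOf(jar)); // prints "Main-Class: JFortune"
    }
}
```

This header is what lets java -jar find the entry point without the class name being passed on the command line.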
JFortuneTest.java
import org.junit.*;

public class JFortuneTest {
    @Test
    public void testGetFortune() {
        JFortune jfortune = new JFortune();
        String expected = "41"; // let's fail this test!
        String actual = jfortune.getFortune();
        Assert.assertEquals(expected, actual);
    }
}
We also need the current JAR from JUnit.
This is a test-scoped dependency!
patch.diff
--- src/JFortuneTest.java 2017-04-25 17:25:26.000000000 +0200
+++ src/JFortuneTest.java 2017-04-25 17:26:38.000000000 +0200
@@ -5,7 +5,7 @@
     @Test
     public void testGetFortune() {
         JFortune jfortune = new JFortune();
-        String expected = "41"; // let's fail this test!
+        String expected = "42";
         String actual = jfortune.getFortune();
         Assert.assertEquals(expected, actual);
     }
We just need to run more and more complex javac commands, solving these issues once and for all!
High-level tools to manage the build lifecycle as efficiently as possible
build.gradle
plugins {
    id 'java'
}

repositories {
    jcenter()
}

dependencies {
    testCompile 'junit:junit:4.10'
}

jar {
    manifest {
        attributes 'Main-Class': 'JFortune'
    }
}
build.gradle
plugins {
    id 'application'
}

mainClassName = 'JFortune'

repositories {
    jcenter()
}

dependencies {
    testCompile 'junit:junit:4.10'
}
Notice the UP-TO-DATE tasks (cached).
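The idea behind an UP-TO-DATE task can be illustrated with a toy sketch: a task is skipped when its input has not changed since the last run. This is a deliberate simplification and not how Gradle is implemented (Gradle tracks the declared inputs and outputs of each task); the class UpToDateSketch and the EXECUTED label are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of incremental execution, for illustration only.
class UpToDateSketch {

    // Input seen the last time each task actually ran.
    private final Map<String, String> lastInputs = new HashMap<>();

    // Runs the action only if the input differs from the previous run
    // and returns a label similar to what a build tool might print.
    String execute(String taskName, String input, Runnable action) {
        if (input.equals(lastInputs.get(taskName))) {
            return taskName + " UP-TO-DATE";
        }
        action.run();
        lastInputs.put(taskName, input);
        return taskName + " EXECUTED";
    }

    public static void main(String[] args) {
        UpToDateSketch build = new UpToDateSketch();
        Runnable compile = () -> { /* pretend to compile */ };
        System.out.println(build.execute(":compileJava", "return \"42\";", compile));
        System.out.println(build.execute(":compileJava", "return \"42\";", compile));
        System.out.println(build.execute(":compileJava", "return \"43\";", compile));
    }
}
```

In this sketch the second call is skipped because nothing changed, while the third runs again because the input differs.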
Download and process an additional resource
build.gradle
plugins {
    id 'application'
    id 'de.undercouch.download' version '3.2.0'
}

mainClassName = 'JFortune'

repositories {
    jcenter()
}

dependencies {
    testCompile 'junit:junit:4.10'
}

import groovy.json.JsonBuilder

task processFortunes {
    def destFile = file("$buildDir/fortunes.json")
    outputs.file destFile
    doLast {
        download {
            src 'https://raw.githubusercontent.com/ruanyf/fortunes/master/data/fortunes'
            dest temporaryDir
            overwrite false
        }
        def fortunes = file("$temporaryDir/fortunes").text.split(/\n?%\n?/)
        destFile.text = new JsonBuilder(fortunes).toPrettyString()
    }
}

processResources {
    from processFortunes
}
All of these steps can be streamlined with build automation:
Data is just another dependency
Processing is just a sequence of tasks
Scripts can be tested for bugs, validation can be run as tests
Reports can include generated content, compiled automatically
Build outputs can be uploaded to shared or public repositories
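As a small illustration of "processing is a sequence of tasks" and "validation can be run as tests", the sketch below mirrors the split on % separators from the processFortunes task in plain Java and adds a simple validation check. The class FortuneValidation, its method names, and the validation rule (at least one fortune, none blank) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class FortuneValidation {

    // Splits raw fortune-file text on lines consisting of a single '%',
    // using the same pattern as the processFortunes task.
    static List<String> splitFortunes(String raw) {
        List<String> fortunes = new ArrayList<>();
        for (String fortune : raw.split("\\n?%\\n?")) {
            fortunes.add(fortune);
        }
        return fortunes;
    }

    // Assumed validation rule: there is at least one fortune and none is blank.
    static boolean isValid(List<String> fortunes) {
        if (fortunes.isEmpty()) {
            return false;
        }
        for (String fortune : fortunes) {
            if (fortune.trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String raw = "Fortune one\n%\nFortune two,\nsecond line\n%\n";
        List<String> fortunes = splitFortunes(raw);
        // prints "2 fortunes, valid: true"
        System.out.println(fortunes.size() + " fortunes, valid: " + isValid(fortunes));
    }
}
```

A check like isValid could be called from a JUnit test, so that a broken or empty data download fails the build in the same way as a code bug.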