Best Practices for Reproducible Research (SS 2017)
Description
This project seminar will introduce students to techniques and workflows for streamlining research projects in ways that increase productivity, enhance collaboration, and promote portability and reproducibility.
Fundamentally, the seminar transfers best practices from software engineering to research problems in speech and language processing, and adapts them to challenges specific to natural language research, including corpus management, analysis pipelines, and report generation.
Over the course of the seminar, participants will learn about topics including:
- Build automation
- Source code management (SCM)
- Remote repositories, dependency resolution, and artifact publishing
- Data wrangling
- Automated testing and continuous integration
- Literate programming and document generation
To receive credit, participants will need to complete regular assignments and submit a final written report.
Prerequisites
This seminar has no formal prerequisites, but experience with object-oriented programming (Java, Python, etc.), SCM (Git, etc.), LaTeX, and Linux (particularly shell interaction) will be invaluable.
Sessions
The project seminar takes place in building C7.1, room U15, on Wednesdays, 8:30 to 10:00.
2017-04-26
Slides
Assignment
Play around with JFortune on GitHub
2017-05-03
Slides
Assignment
- Recreate the “real-world example” using any distributed SCM (except Git).
- Then submit a short written report (PDF format) and present your experience in the next session.
- Bonus points if you manage the process of writing the report using SCM!
2017-05-10
Canceled!
2017-05-17
Slides
2017-05-24
Slides
2017-05-31
Slides
2017-06-07
Canceled!
2017-06-14
Slides
2017-06-21
Slides
2017-06-28
Slides
Assignment
Based on https://bitbucket.org/psibre/bestpract-flaml:
- Develop a plugin that adds build logic to any project
- Resolve a specified FLAC+YAML file pair as data dependencies
- Extract utterances from the YAML file to text and label files
- Extract utterance audio from the FLAC file to WAV files
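The two extraction steps can be illustrated with a minimal Python sketch. It is not the plugin itself and makes several assumptions that the assignment does not state: the FLAC audio is already decoded into a list of 16-bit mono samples, the YAML is already parsed into a list of dictionaries with the hypothetical keys `text`, `start`, and `end` (in seconds), and the label file layout (start, end, text on one line) is invented for illustration. The function name `slice_utterances` and the `utt0001`-style file names are likewise hypothetical.

```python
import struct
import wave
from pathlib import Path


def slice_utterances(samples, sample_rate, utterances, out_dir):
    """Write one .txt, .lab, and .wav file per utterance; return the file stems.

    samples:     list of 16-bit mono PCM integers (assumed already decoded from FLAC)
    utterances:  list of dicts with hypothetical keys 'text', 'start', 'end' (seconds)
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stems = []
    for index, utt in enumerate(utterances, start=1):
        stem = f"utt{index:04d}"
        # Convert the time span in seconds to sample offsets.
        first = round(utt["start"] * sample_rate)
        last = round(utt["end"] * sample_rate)
        chunk = samples[first:last]

        # Text file: the plain utterance text.
        (out_dir / f"{stem}.txt").write_text(utt["text"] + "\n", encoding="utf-8")

        # Label file (assumed layout): start and end relative to the slice, then the text.
        duration = len(chunk) / sample_rate
        (out_dir / f"{stem}.lab").write_text(
            f"0.000 {duration:.3f} {utt['text']}\n", encoding="utf-8"
        )

        # WAV file: the slice as little-endian 16-bit mono PCM.
        with wave.open(str(out_dir / f"{stem}.wav"), "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)
            wav.setframerate(sample_rate)
            wav.writeframes(struct.pack(f"<{len(chunk)}h", *chunk))
        stems.append(stem)
    return stems
```

In an actual solution, decoding the FLAC file and parsing the YAML file would happen before this step, and the whole process would be driven by the build logic that the plugin adds.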