Documenting R Packages: What is a good example? (Hons) [Open]

People

Supervisor

Description

Most programming languages have documentation systems derived from in-line comments. The best known of these is Java's Javadoc. Roxygen is a system created for R in the likeness of Javadoc, and it has become ubiquitous in the R community after being championed by primer books and organisations such as rOpenSci and Bioconductor.

In this project, you'll mine open-source GitHub packages, extract their Roxygen documentation, detect the sections containing examples, and perform a number of automated analyses (e.g., length, location, bugs, dependency on external datasets, whether the example runs) and manual analyses (e.g., is it commented? is it explained?). You will accompany this with an anonymous online survey of R developers (you'll need to apply for ethics approval, but you will be assisted in doing so).
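As a rough illustration of the extraction step, the sketch below pulls `@examples` sections out of R source text and computes a few simple automated measures. It is not the project's prescribed method: the function names (`extract_examples`, `describe`), the sample package code, and the chosen metrics are all assumptions made for this example. It relies only on the Roxygen conventions that documentation lines start with `#'` and that a section runs until the next `@tag` or the end of the comment block.

```python
from __future__ import annotations

import re

# Roxygen documentation lines start with "#'" (optionally indented).
ROXYGEN_LINE = re.compile(r"^\s*#'\s?(.*)$")
# A tag line looks like "@param x ..." or "@examples".
TAG_LINE = re.compile(r"^@(\w+)\b\s*(.*)$")


def extract_examples(r_source: str) -> list[list[str]]:
    """Return one list of example lines per @examples section (hypothetical helper)."""
    sections = []
    current = None  # lines of the @examples section being collected, if any
    for raw in r_source.splitlines():
        match = ROXYGEN_LINE.match(raw)
        if match is None:
            # A non-Roxygen line ends the documentation block.
            if current is not None:
                sections.append(current)
                current = None
            continue
        text = match.group(1)
        tag = TAG_LINE.match(text.strip())
        if tag:
            # Any new tag closes the section in progress.
            if current is not None:
                sections.append(current)
                current = None
            if tag.group(1) == "examples":
                current = [tag.group(2)] if tag.group(2) else []
        elif current is not None:
            current.append(text)
    if current is not None:
        sections.append(current)
    return sections


def describe(example: list[str]) -> dict:
    """Compute simple, assumed metrics for one example section."""
    code = [line for line in example if line.strip()]
    return {
        "n_lines": len(code),
        "n_comment_lines": sum(line.lstrip().startswith("#") for line in code),
        "uses_dontrun": any("\\dontrun" in line for line in code),
    }


# Made-up R source used only to exercise the functions above.
sample = """\
#' Add two numbers
#'
#' @param x A number.
#' @param y A number.
#' @examples
#' # Basic usage
#' add(1, 2)
#' \\dontrun{
#' add("a", 2)
#' }
#' @export
add <- function(x, y) x + y
"""

for section in extract_examples(sample):
    print(describe(section))
```

Checks such as "does it run?" would need an actual R installation and are outside the scope of this sketch; variants such as `@examplesIf` are deliberately ignored here.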

Note: This project is open and recruiting students.

 

Requirements

  • Programming knowledge, preferably either Python or R. Other languages are welcome but not needed.
  • Knowledge of (or willingness to quickly learn) how to use APIs to download data.
  • Demonstrated academic writing skills.
  • Excellent attention to detail.
 
Please contact me via email with a detailed resume and a statement (1 page only) on why you are interested in this project.
 
Anybody is welcome to apply. However, female or female-identifying candidates are especially encouraged to apply.
 

Background Literature

Z. Codabux, M. Vidoni and F. Fard, "Technical Debt in the Peer-Review Documentation of R Packages: a rOpenSci Case Study," in 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), 2021, pp. 195-206. https://doi.ieeecomputersociety.org/10.1109/MSR525...
 
M. Vidoni, "Self-Admitted Technical Debt in R Packages: An Exploratory Study," in 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), 2021, pp. 179-189. https://doi.ieeecomputersociety.org/10.1109/MSR525...
 
 
Please take a look at Dr Vidoni's papers here: https://melvidoni.rbind.io/project/2020-rse/
 

Keywords

  • Empirical Software Engineering. Mixed-Methods. Developers Survey.
  • Natural Language Processing
  • Data Science Software, Scientific Software
  • Developers' Challenges

 

Updated:  10 August 2021/Responsible Officer:  Dean, CECS/Page Contact:  CECS Marketing