Reproducible Research Using Jupyter Notebook
Biostat/Biomath M257
An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.
– Buckheit and Donoho (1995)
For background and history of reproducible research in statistics/data science, see lecture notes in 203B.
This course assumes familiarity with Git/GitHub and Jupyter Notebook. Your homework should be authored using Jupyter Notebook and submitted via Git/GitHub.
For an introduction to Git/GitHub, see lecture notes in 203B.
1 Jupyter Notebook
IPython notebook (precursor of Jupyter notebook) is a powerful tool for authoring dynamic documents in Python, combining code, formatted text, math, and multimedia in a single document.
Jupyter is its successor and encompasses multiple languages including Julia, Python, and R.
Julia uses Jupyter notebook through the IJulia.jl package.
In this course, you are required to write your homework reports using Jupyter Notebook.
For each homework, you need to submit your Jupyter Notebook (e.g., hw1.ipynb), html (e.g., hw1.html), along with all code and data that are necessary to reproduce the results.
You can start with the Jupyter Notebook for the lectures.
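One way to produce the html file for submission without going through the notebook menus is Jupyter's nbconvert command-line tool. The following is a minimal sketch, not a required workflow: it assumes a jupyter executable is on the PATH (which may not hold if IJulia is using its own private distribution), and nbconvert_cmd is a hypothetical helper name introduced here for illustration.

```julia
# Build (but do not yet run) the nbconvert command for a notebook.
# `nbconvert_cmd` is a hypothetical helper, not part of IJulia or Jupyter.
function nbconvert_cmd(notebook::AbstractString; to::AbstractString = "html")
    endswith(notebook, ".ipynb") ||
        throw(ArgumentError("expected an .ipynb file, got $notebook"))
    # Interpolating into a backtick command keeps each value a single argument.
    return `jupyter nbconvert --to $to $notebook`
end

cmd = nbconvert_cmd("hw1.ipynb")
println(cmd)

# To actually convert (assumes `jupyter` is on the PATH and hw1.ipynb exists):
# run(cmd)
```

By default nbconvert writes the output next to the notebook (hw1.html for hw1.ipynb) and converts the notebook as saved, so the cells should be run and the notebook saved before converting.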
1.1 Installation
Installing the IJulia.jl package will install a minimal Python/Jupyter distribution that is private to Julia.
using Pkg
Pkg.add("IJulia")
We can also tell IJulia to use a Jupyter program already installed in our system:
ENV["JUPYTER"] = "path_to_jupyter_executable"
Pkg.build("IJulia")
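Before pointing IJulia at a system Jupyter, it can help to check whether one is visible from Julia at all. The sketch below uses Sys.which from Julia's standard library to look up the executable; find_jupyter is a hypothetical helper name, and the lines that set ENV["JUPYTER"] and rebuild IJulia are left commented out so that nothing is changed by accident.

```julia
# Look for an executable on the PATH.
# `Sys.which` returns the full path as a String, or `nothing` if not found.
# `find_jupyter` is a hypothetical helper name used only for this sketch.
function find_jupyter(name::AbstractString = "jupyter")
    return Sys.which(name)
end

path = find_jupyter()
if path === nothing
    println("No system Jupyter found on the PATH.")
else
    println("Found Jupyter at ", path)
    # To make IJulia use it (assumes the IJulia package is already added):
    # ENV["JUPYTER"] = path
    # using Pkg; Pkg.build("IJulia")
end
```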
1.2 Usage
We can invoke Jupyter Notebook within Julia by
using IJulia
notebook() # using home as working directory
or, using current directory as the working directory, by
notebook(dir = pwd()) # using current directory as working directory
The notebook server can be stopped by pressing Ctrl+C in the Julia REPL.
It is useful to know some keyboard shortcuts. I frequently use
- shift + return: execute current cell.
- b: create a cell below current cell.
- a: create a cell above current cell.
- y: change cell to code.
- m: change cell to Markdown.
Check more shortcuts in menu Help -> Keyboard Shortcuts.
Notebook extensions offer many utilities for productivity. They can be installed by
# using Pkg; Pkg.add("Conda")  # uncomment if Conda.jl is not yet installed
using Conda
Conda.add_channel("conda-forge")
Conda.add("jupyter_contrib_nbextensions")
Notebook can be converted to other formats such as html, LaTeX, Markdown, Julia code, and many others, via menu File -> Download as. For your homework, please submit both notebook (ipynb) and html.
Mathematical formulas can be typeset as LaTeX in Markdown cells. For example, inline math: \(e^{i \pi} + 1 = 0\) and displayed math
\[ e^x = \sum_{i=0}^\infty \frac{1}{i!} x^i. \]
For multiline displayed math:
\[\begin{aligned} e^x &= \sum_{i=0}^\infty \frac{1}{i!} x^i \\ &\approx 1 + x + \frac{x^2}{2}. \end{aligned}\]
If you have a lot of commonly used LaTeX macros, put them in a .tex file and load them using the notebook extension Load TeX macros.
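As an illustration of what such a macro file might contain, here is a small fragment. The file name and the macro names are hypothetical, chosen only for this example; the exact file name and location the extension expects should be checked against its own documentation.

```latex
% macros.tex (hypothetical file name and macro names)
\newcommand{\reals}{\mathbb{R}}   % the real numbers
\newcommand{\E}{\mathbb{E}}       % expectation operator
\newcommand{\bx}{\mathbf{x}}      % bold vector x
```

With these loaded, a Markdown cell could use \(\E[\bx] \in \reals^p\) in place of the longer \(\mathbb{E}[\mathbf{x}] \in \mathbb{R}^p\).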
2 JupyterLab
JupyterLab (more IDE-like) was designed as the successor to the classic Jupyter Notebook interface; it reached v1.0 in 2019.
To invoke JupyterLab:
using IJulia
jupyterlab() # use home as working directory
or
jupyterlab(dir = pwd()) # use current directory as working directory
To install extensions for JupyterLab, go to Settings -> Enable Extension Manager (experimental), then click the extension icon on the left to search, install, and uninstall extensions.