Introduction to¶

System information (for reproducibility)

versioninfo()

Julia Version 1.7.1
Commit ac5cc99908 (2021-12-22 19:35 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.5.0)
  CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 4

What's Julia?¶

Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments

History:
- Project started in 2009. First public release in 2012
- Creators: Jeff Bezanson, Alan Edelman, Stefan Karpinski, Viral Shah
- First major release v1.0 was released on Aug 8, 2018
- Current stable release v1.7.2
Aim to solve the notorious two language problem: Prototype code goes into high-level languages like R/Python, production code goes into low-level language like C/C++.

Julia aims to:

Walks like Python. Runs like C.

See https://julialang.org/benchmarks/ for the details of benchmark.

Write high-level, abstract code that closely resembles mathematical formulas
- yet produces fast, low-level machine code that has traditionally only been generated by static languages.
Julia is more than just "Fast R" or "Fast Matlab"
- Performance comes from features that work well together.
- You can't just take the magic dust that makes Julia fast and sprinkle it on [language of choice]

Learning resources¶

The (free) online course Introduction to Julia, by Jane Herriman.
Cheat sheet: The Fast Track to Julia.
Browse the Julia documentation.
For Matlab users, read Noteworthy Differences From Matlab.

For R users, read Noteworthy Differences From R.

For Python users, read Noteworthy Differences From Python.
The Learning page on Julia's website has pointers to many other learning resources.

Julia REPL (Read-Evaluation-Print-Loop)¶

The Julia REPL, or Julia shell, has at least five modes.

Default mode is the Julian prompt julia>. Type backspace in other modes to enter default mode.
Help mode help?>. Type ? to enter help mode. ?search_term does a fuzzy search for search_term.
Shell mode shell>. Type ; to enter shell mode.
Package mode (@v1.7) pkg>. Type ] to enter package mode for managing Julia packages (install, uninstall, update, ...).
Search mode (reverse-i-search). Press ctrl+R to enter search model.
With RCall.jl package installed, we can enter the R mode by typing $ (shift+4) at Julia REPL.

Some survival commands in Julia REPL:

exit() or Ctrl+D: exit Julia.
Ctrl+C: interrupt execution.
Ctrl+L: clear screen.
Append ; (semi-colon) to suppress displaying output from a command. Same as Matlab.
include("filename.jl") to source a Julia code file.

Seek help¶

Online help from REPL: ?function_name.
Google.
Julia documentation: https://docs.julialang.org/en/.
Look up source code: @edit sin(π).
Discourse: https://discourse.julialang.org.
Friends.

Which IDE?¶

Julia homepage lists many choices: Juno, VS Code, Vim, ...
Unfortunately at the moment there are no mature RStudio- or Matlab-like IDE for Julia yet.
For dynamic document, e.g., homework, I recommend Jupyter Notebook or JupyterLab.
For extensive Julia coding, myself has been happily using the editor VS Code with extensions Julia and VS Code Jupyter Notebook Previewer installed.

Julia package system¶

Each Julia package is a Git repository. Each Julia package name ends with .jl. E.g., Distributions.jl package lives at https://github.com/JuliaStats/Distributions.jl.
Google search with PackageName.jl usually leads to the package on github.com.
The package ecosystem is rapidly maturing; a complete list of registered packages (which are required to have a certain level of testing and documentation) is at http://pkg.julialang.org/.
For example, the package called Distributions.jl is added with
```
# in Pkg mode
(@v1.7) pkg> add Distributions
```
and "removed" (although not completely deleted) with
```
# in Pkg mode
(@v1.7) pkg> rm Distributions
```
The package manager provides a dependency solver that determines which packages are actually required to be installed.

Non-registered packages are added by cloning the relevant Git repository. E.g.,

# in Pkg mode
(@v1.7) pkg> add https://github.com/OpenMendel/SnpArrays.jl

A package needs only be added once, at which point it is downloaded into your local .julia/packages directory in your home directory.

readdir(Sys.islinux() ? ENV["JULIA_PATH"] * "/pkg/packages" : ENV["HOME"] * "/.julia/packages")

764-element Vector{String}:
 ".DS_Store"
 "ADMIXTURE"
 "AMD"
 "ASL_jll"
 "AbstractFFTs"
 "AbstractMCMC"
 "AbstractTrees"
 "AccurateArithmetic"
 "Adapt"
 "AdvancedHMC"
 "AdvancedMH"
 "AdvancedVI"
 "AlgebraResultTypes"
 ⋮
 "ghr_jll"
 "isoband_jll"
 "libass_jll"
 "libfdk_aac_jll"
 "libpng_jll"
 "libsixel_jll"
 "libsodium_jll"
 "libvorbis_jll"
 "nghttp2_jll"
 "x264_jll"
 "x265_jll"
 "xkbcommon_jll"

Directory of a specific package can be queried by pathof():

using Distributions

pathof(Distributions)

"/Users/huazhou/.julia/packages/Distributions/DiPwc/src/Distributions.jl"

If you start having problems with packages that seem to be unsolvable, you may try just deleting your .julia directory and reinstalling all your packages.
Periodically, one should run update in Pkg mode, which checks for, downloads and installs updated versions of all the packages you currently have installed.
status lists the status of all installed packages.
Using functions in package.
```
using Distributions
```
This pulls all of the exported functions in the module into your local namespace, as you can check using the whos() command. An alternative is
```
import Distributions
```
Now, the functions from the Distributions package are available only using
```
Distributions.<FUNNAME>
```
All functions, not only exported functions, are always available like this.

Calling R from Julia¶

The RCall.jl package allows us to embed R code inside of Julia.
There are also PyCall.jl, MATLAB.jl, JavaCall.jl, CxxWrap.jl packages for interfacing with other languages.

using RCall

x = randn(1000)
# $ is the interpolation operator
R"""
hist($x, main="I'm plotting a Julia vector")
"""

RObject{VecSxp}
$breaks
 [1] -3.0 -2.5 -2.0 -1.5 -1.0 -0.5  0.0  0.5  1.0  1.5  2.0  2.5  3.0  3.5

$counts
 [1]   3  14  54  86 143 210 200 162  63  41  19   3   2

$density
 [1] 0.006 0.028 0.108 0.172 0.286 0.420 0.400 0.324 0.126 0.082 0.038 0.006
[13] 0.004

$mids
 [1] -2.75 -2.25 -1.75 -1.25 -0.75 -0.25  0.25  0.75  1.25  1.75  2.25  2.75
[13]  3.25

$xname
[1] "`#JL`$x"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"

R"""
library(ggplot2)
qplot($x)
"""

RObject{VecSxp}

┌ Warning: RCall.jl: `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
└ @ RCall /Users/huazhou/.julia/packages/RCall/6kphM/src/io.jl:172

x = R"""
rnorm(10)
"""

RObject{RealSxp}
 [1] -0.53043218  0.56761042 -0.25865286  1.57331723 -0.03542783 -0.51245029
 [7]  0.06337553  1.06360507 -1.58427874 -0.72726674

# collect R variable into Julia workspace
y = collect(x)

10-element Vector{Float64}:
 -0.5304321777501675
  0.5676104158047153
 -0.25865286093103507
  1.57331723156036
 -0.03542782948695541
 -0.5124502913313251
  0.06337552593332657
  1.0636050702700575
 -1.5842787434427186
 -0.727266737560106

Access Julia variables in R REPL mode:

julia> x = rand(5) # Julia variable
R> y <- $x

Pass Julia expression in R REPL mode:
```
R> y <- $(rand(5))
```
Put Julia variable into R environment:
```
julia> @rput x
R> x
```
Get R variable into Julia environment:
```
R> r <- 2
Julia> @rget r
```
If you want to call Julia within R, check out the JuliaCall package.

Some basic Julia code¶

# an integer, same as int in R
y = 1

1

# query type of a Julia object
typeof(y)

Int64

# a Float64 number, same as double in R
y = 1.0

1.0

typeof(y)

Float64

# Greek letters:  `\pi<tab>`
π

π = 3.1415926535897...

typeof(π)

Irrational{:π}

# Greek letters:  `\theta<tab>`
θ = y + π

4.141592653589793

# emoji! `\:kissing_cat:<tab>`
😽 = 5.0
😽 + 1

6.0

# `\alpha<tab>\hat<tab>`
α̂ = π

π = 3.1415926535897...

For a list of unicode symbols that can be tab-completed, see https://docs.julialang.org/en/v1/manual/unicode-input/. Or in the help mode, type ? followed by the unicode symbol you want to input.

# vector of Float64 0s
x = zeros(5)

5-element Vector{Float64}:
 0.0
 0.0
 0.0
 0.0
 0.0

# vector Int64 0s
x = zeros(Int, 5)

5-element Vector{Int64}:
 0
 0
 0
 0
 0

# matrix of Float64 0s
x = zeros(5, 3)

5×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

# matrix of Float64 1s
x = ones(5, 3)

5×3 Matrix{Float64}:
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0

# define array without initialization
x = Matrix{Float64}(undef, 5, 3)

5×3 Matrix{Float64}:
 2.27031e-314  2.27031e-314  2.27031e-314
 2.27031e-314  2.27031e-314  2.27031e-314
 2.27031e-314  2.27031e-314  2.27031e-314
 2.27031e-314  2.27031e-314  2.27031e-314
 2.27031e-314  2.27031e-314  2.27031e-314

# fill a matrix by 0s
fill!(x, 0)

5×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

# initialize an array to be constant 2.5
fill(2.5, (5, 3))

5×3 Matrix{Float64}:
 2.5  2.5  2.5
 2.5  2.5  2.5
 2.5  2.5  2.5
 2.5  2.5  2.5
 2.5  2.5  2.5

# rational number
a = 3//5

3//5

typeof(a)

Rational{Int64}

b = 3//7

3//7

a + b

36//35

# uniform [0, 1) random numbers
x = rand(5, 3)

5×3 Matrix{Float64}:
 0.047606   0.0502199  0.679921
 0.0458958  0.0893325  0.524778
 0.634395   0.814916   0.091303
 0.876273   0.640089   0.916614
 0.502563   0.701718   0.11058

# uniform random numbers (in single precision)
x = rand(Float16, 5, 3)

5×3 Matrix{Float16}:
 0.767    0.01758  0.991
 0.8003   0.836    0.185
 0.07715  0.3223   0.66
 0.2134   0.1299   0.2422
 0.689    0.4521   0.9644

# random numbers from {1,...,5}
x = rand(1:5, 5, 3)

5×3 Matrix{Int64}:
 3  5  1
 3  3  2
 4  2  3
 4  2  1
 2  1  4

# standard normal random numbers
x = randn(5, 3)

5×3 Matrix{Float64}:
 -1.65929  -1.17293   -0.373762
 -2.50692   0.334339  -0.0401816
 -1.27288   0.44366   -0.881518
  1.07706  -1.25179   -0.881024
 -1.43587   0.720891   0.661528

# range
1:10

1:10

typeof(1:10)

UnitRange{Int64}

1:2:10

1:2:9

typeof(1:2:10)

StepRange{Int64, Int64}

# integers 1-10
x = collect(1:10)

10-element Vector{Int64}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10

# or equivalently
[1:10...]

10-element Vector{Int64}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10

# Float64 numbers 1-10
x = collect(1.0:10)

10-element Vector{Float64}:
  1.0
  2.0
  3.0
  4.0
  5.0
  6.0
  7.0
  8.0
  9.0
 10.0

# convert to a specific type
convert(Vector{Float64}, 1:10)

10-element Vector{Float64}:
  1.0
  2.0
  3.0
  4.0
  5.0
  6.0
  7.0
  8.0
  9.0
 10.0

Matrices and vectors¶

Dimensions¶

x = randn(5, 3)

5×3 Matrix{Float64}:
 -0.814703   0.288814  -0.834951
  0.723848  -0.563924  -0.443618
  0.621758   1.06491   -0.163703
 -0.953084   0.233394   0.788431
  0.123303   0.831046   0.156951

size(x)

(5, 3)

size(x, 1) # nrow() in R

5

size(x, 2) # ncol() in R

3

# total number of elements
length(x)

15

Indexing¶

# 5 × 5 matrix of random Normal(0, 1)
x = randn(5, 5)

5×5 Matrix{Float64}:
  1.26751   -0.750595  -0.875274   -0.622825    1.44481
  0.671263  -0.907677   0.0778281   0.636856   -0.290479
 -2.10118    0.241139  -0.613094   -1.16334    -0.563644
 -1.41947   -0.200651   0.277035    0.0971816  -1.10866
 -0.907153   0.391468  -0.435796    0.718942    1.96369

# first column
x[:, 1]

5-element Vector{Float64}:
  1.2675098330277035
  0.6712630769310894
 -2.1011847085576454
 -1.41946867019658
 -0.907153486002921

# first row
x[1, :]

5-element Vector{Float64}:
  1.2675098330277035
 -0.7505947863687497
 -0.8752736245643277
 -0.6228252985715261
  1.4448100906311978

# sub-array
x[1:2, 2:3]

2×2 Matrix{Float64}:
 -0.750595  -0.875274
 -0.907677   0.0778281

# getting a subset of a matrix creates a copy, but you can also create "views"
z = view(x, 1:2, 2:3)

2×2 view(::Matrix{Float64}, 1:2, 2:3) with eltype Float64:
 -0.750595  -0.875274
 -0.907677   0.0778281

# same as
@views z = x[1:2, 2:3]

2×2 view(::Matrix{Float64}, 1:2, 2:3) with eltype Float64:
 -0.750595  -0.875274
 -0.907677   0.0778281

# change in z (view) changes x as well
z[2, 2] = 0.0
x

5×5 Matrix{Float64}:
  1.26751   -0.750595  -0.875274  -0.622825    1.44481
  0.671263  -0.907677   0.0        0.636856   -0.290479
 -2.10118    0.241139  -0.613094  -1.16334    -0.563644
 -1.41947   -0.200651   0.277035   0.0971816  -1.10866
 -0.907153   0.391468  -0.435796   0.718942    1.96369

# y points to same data as x
y = x

5×5 Matrix{Float64}:
  1.26751   -0.750595  -0.875274  -0.622825    1.44481
  0.671263  -0.907677   0.0        0.636856   -0.290479
 -2.10118    0.241139  -0.613094  -1.16334    -0.563644
 -1.41947   -0.200651   0.277035   0.0971816  -1.10866
 -0.907153   0.391468  -0.435796   0.718942    1.96369

# x and y point to same data
pointer(x), pointer(y)

(Ptr{Float64} @0x000000010fbdcd40, Ptr{Float64} @0x000000010fbdcd40)

# changing y also changes x
y[:, 1] .= 0
x

5×5 Matrix{Float64}:
 0.0  -0.750595  -0.875274  -0.622825    1.44481
 0.0  -0.907677   0.0        0.636856   -0.290479
 0.0   0.241139  -0.613094  -1.16334    -0.563644
 0.0  -0.200651   0.277035   0.0971816  -1.10866
 0.0   0.391468  -0.435796   0.718942    1.96369

# create a new copy of data
z = copy(x)

5×5 Matrix{Float64}:
 0.0  -0.750595  -0.875274  -0.622825    1.44481
 0.0  -0.907677   0.0        0.636856   -0.290479
 0.0   0.241139  -0.613094  -1.16334    -0.563644
 0.0  -0.200651   0.277035   0.0971816  -1.10866
 0.0   0.391468  -0.435796   0.718942    1.96369

pointer(x), pointer(z)

(Ptr{Float64} @0x000000010fbdcd40, Ptr{Float64} @0x0000000111867940)

Concatenate matrices¶

# 3-by-1 vector
[1, 2, 3]

3-element Vector{Int64}:
 1
 2
 3

# 1-by-3 array
[1 2 3]

1×3 Matrix{Int64}:
 1  2  3

# multiple assignment by tuple
x, y, z = randn(5, 3), randn(5, 2), randn(3, 5)

([-0.9477628398395715 -0.5680216396442697 0.9776456533674671; 1.320765469378728 0.5539515510691637 0.012879518542824894; … ; -0.3065844868867842 -1.521459579989845 1.7809451492281219; 0.1914985734622127 -1.0096621484763293 -0.8651540453664636], [-0.05928105799550312 -0.177908311648083; -1.38366113824288 0.5239389644482733; … ; -0.4451873100431668 -0.7033886228711982; -0.10116013164005477 1.6062267924557416], [0.6023147943290145 0.15411304627550018 … -0.2656322203833123 -0.8878537807512806; -0.3551317104362576 2.275291330136091 … 0.9548682156109642 0.9508038724421524; 0.49695188635517823 0.24542929333786065 … 1.4477564106391216 -0.09298994940963297])

[x y] # 5-by-5 matrix

5×5 Matrix{Float64}:
 -0.947763  -0.568022   0.977646   -0.0592811  -0.177908
  1.32077    0.553952   0.0128795  -1.38366     0.523939
 -0.950924  -1.31552    0.518165   -1.38032     0.670996
 -0.306584  -1.52146    1.78095    -0.445187   -0.703389
  0.191499  -1.00966   -0.865154   -0.10116     1.60623

[x y; z] # 8-by-5 matrix

8×5 Matrix{Float64}:
 -0.947763  -0.568022   0.977646   -0.0592811  -0.177908
  1.32077    0.553952   0.0128795  -1.38366     0.523939
 -0.950924  -1.31552    0.518165   -1.38032     0.670996
 -0.306584  -1.52146    1.78095    -0.445187   -0.703389
  0.191499  -1.00966   -0.865154   -0.10116     1.60623
  0.602315   0.154113  -1.06799    -0.265632   -0.887854
 -0.355132   2.27529    0.847743    0.954868    0.950804
  0.496952   0.245429   1.03183     1.44776    -0.0929899

Dot operation¶

Dot operation in Julia is elementwise operation, similar to Matlab.

x = randn(5, 3)

5×3 Matrix{Float64}:
 -0.37447    1.30118   -0.480235
 -0.341169   0.395087  -0.980403
 -1.26326    0.545045   0.324121
 -0.862089   1.37121   -1.97035
 -0.260371  -2.01868    1.68225

y = ones(5, 3)

5×3 Matrix{Float64}:
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0

x .* y # same x * y in R

5×3 Matrix{Float64}:
 -0.37447    1.30118   -0.480235
 -0.341169   0.395087  -0.980403
 -1.26326    0.545045   0.324121
 -0.862089   1.37121   -1.97035
 -0.260371  -2.01868    1.68225

x .^ (-2)

5×3 Matrix{Float64}:
  7.13126   0.590647  4.33604
  8.59134   6.40642   1.04038
  0.626635  3.36617   9.51889
  1.34554   0.531853  0.257581
 14.7507    0.245394  0.353362

sin.(x)

5×3 Matrix{Float64}:
 -0.365779   0.963872  -0.461987
 -0.334589   0.384888  -0.830722
 -0.953082   0.518456   0.318475
 -0.759204   0.980149  -0.921235
 -0.257439  -0.901364   0.993796

Basic linear algebra¶

x = randn(5)

5-element Vector{Float64}:
  0.6015470533331378
  0.17567572092512737
 -0.642738902849795
 -1.2406965774929115
  1.3045540397878905

using LinearAlgebra
# vector L2 norm
norm(x)

2.0117214900831657

# same as
sqrt(sum(abs2, x))

2.0117214900831657

y = randn(5) # another vector
# dot product
dot(x, y) # x' * y

-1.059520656533084

# same as
x'y

-1.059520656533084

x, y = randn(5, 3), randn(3, 2)
# matrix multiplication, same as %*% in R
x * y

5×2 Matrix{Float64}:
 -1.37288    2.71778
  0.225985   1.09464
  0.723694  -2.59413
  0.350855  -2.32957
 -0.661232   0.0378456

x = randn(3, 3)

3×3 Matrix{Float64}:
 -0.388787  -0.266499  -1.10798
 -0.321274  -0.159995  -0.704043
  0.582178   0.1411    -1.06964

# conjugate transpose
x'

3×3 adjoint(::Matrix{Float64}) with eltype Float64:
 -0.388787  -0.321274   0.582178
 -0.266499  -0.159995   0.1411
 -1.10798   -0.704043  -1.06964

b = rand(3)
x'b # same as x' * b

3-element Vector{Float64}:
  0.30489433149712164
 -0.03490568817581036
 -1.7385157754967215

# trace
tr(x)

-1.618421031162626

det(x)

0.04267851896683521

rank(x)

3

Sparse matrices¶

using SparseArrays

# 10-by-10 sparse matrix with sparsity 0.1
X = sprandn(10, 10, .1)

10×10 SparseMatrixCSC{Float64, Int64} with 12 stored entries:
  ⋅     ⋅         ⋅    ⋅         …    ⋅       -1.30133    ⋅    ⋅ 
  ⋅     ⋅         ⋅    ⋅              ⋅         ⋅         ⋅    ⋅ 
  ⋅    1.16592    ⋅    ⋅              ⋅         ⋅         ⋅    ⋅ 
  ⋅     ⋅         ⋅    ⋅              ⋅         ⋅         ⋅    ⋅ 
  ⋅    0.815178   ⋅    ⋅              ⋅        0.293583   ⋅    ⋅ 
  ⋅     ⋅         ⋅    ⋅         …    ⋅         ⋅         ⋅    ⋅ 
  ⋅     ⋅         ⋅    ⋅              ⋅         ⋅         ⋅   0.832443
  ⋅     ⋅         ⋅   0.0439842     -1.82386    ⋅         ⋅    ⋅ 
  ⋅   -0.389051   ⋅    ⋅              ⋅         ⋅         ⋅   1.48571
  ⋅     ⋅         ⋅    ⋅              ⋅         ⋅         ⋅    ⋅

# dump() in Julia is like str() in R
dump(X)

SparseMatrixCSC{Float64, Int64}
  m: Int64 10
  n: Int64 10
  colptr: Array{Int64}((11,)) [1, 1, 4, 4, 5, 6, 8, 9, 11, 11, 13]
  rowval: Array{Int64}((12,)) [3, 5, 9, 8, 10, 5, 10, 8, 1, 5, 7, 9]
  nzval: Array{Float64}((12,)) [1.1659155119266633, 0.8151779431546513, -0.3890514386593815, 0.0439841569305595, -1.196732092151957, 0.4155217195886783, 0.023865363521475943, -1.8238633794069272, -1.3013262308448805, 0.2935827548071014, 0.8324432685018412, 1.4857089567385118]

# convert to dense matrix; be cautious when dealing with big data
Xfull = convert(Matrix{Float64}, X)

10×10 Matrix{Float64}:
 0.0   0.0       0.0  0.0        …   0.0      -1.30133   0.0  0.0
 0.0   0.0       0.0  0.0            0.0       0.0       0.0  0.0
 0.0   1.16592   0.0  0.0            0.0       0.0       0.0  0.0
 0.0   0.0       0.0  0.0            0.0       0.0       0.0  0.0
 0.0   0.815178  0.0  0.0            0.0       0.293583  0.0  0.0
 0.0   0.0       0.0  0.0        …   0.0       0.0       0.0  0.0
 0.0   0.0       0.0  0.0            0.0       0.0       0.0  0.832443
 0.0   0.0       0.0  0.0439842     -1.82386   0.0       0.0  0.0
 0.0  -0.389051  0.0  0.0            0.0       0.0       0.0  1.48571
 0.0   0.0       0.0  0.0            0.0       0.0       0.0  0.0

# convert a dense matrix to sparse matrix
sparse(Xfull)

10×10 SparseMatrixCSC{Float64, Int64} with 12 stored entries:
  ⋅     ⋅         ⋅    ⋅         …    ⋅       -1.30133    ⋅    ⋅ 
  ⋅     ⋅         ⋅    ⋅              ⋅         ⋅         ⋅    ⋅ 
  ⋅    1.16592    ⋅    ⋅              ⋅         ⋅         ⋅    ⋅ 
  ⋅     ⋅         ⋅    ⋅              ⋅         ⋅         ⋅    ⋅ 
  ⋅    0.815178   ⋅    ⋅              ⋅        0.293583   ⋅    ⋅ 
  ⋅     ⋅         ⋅    ⋅         …    ⋅         ⋅         ⋅    ⋅ 
  ⋅     ⋅         ⋅    ⋅              ⋅         ⋅         ⋅   0.832443
  ⋅     ⋅         ⋅   0.0439842     -1.82386    ⋅         ⋅    ⋅ 
  ⋅   -0.389051   ⋅    ⋅              ⋅         ⋅         ⋅   1.48571
  ⋅     ⋅         ⋅    ⋅              ⋅         ⋅         ⋅    ⋅

# syntax for sparse linear algebra is same as dense linear algebra
β = ones(10)
X * β

10-element Vector{Float64}:
 -1.3013262308448805
  0.0
  1.1659155119266633
  0.0
  1.524282417550431
  0.0
  0.8324432685018412
 -1.7798792224763678
  1.0966575180791303
 -1.1728667286304812

# many functions apply to sparse matrices as well
sum(X)

0.3652265341063363

Control flow and loops¶

if-elseif-else-end

if condition1
    # do something
elseif condition2
    # do something
else
    # do something
end

for loop

for i in 1:10
    println(i)
end

Nested for loop:

for i in 1:10
    for j in 1:5
        println(i * j)
    end
end

Same as

for i in 1:10, j in 1:5
    println(i * j)
end

Exit loop:

for i in 1:10
    # do something
    if condition1
        break # skip remaining loop
    end
end

Exit iteration:

for i in 1:10
    # do something
    if condition1
        continue # skip to next iteration
    end
    # do something
end

Functions¶

In Julia, all arguments to functions are passed by reference, in contrast to R and Matlab.
Function names ending with ! indicates that function mutates at least one argument, typically the first.
```
sort!(x) # vs sort(x)
```
Function definition

function func(req1, req2; key1=dflt1, key2=dflt2)
    # do stuff
    return out1, out2, out3
end

Required arguments are separated with a comma and use the positional notation.
Optional arguments need a default value in the signature.
Semicolon is not required in function call.
return statement is optional.
Multiple outputs can be returned as a tuple, e.g., return out1, out2, out3.

Anonymous functions, e.g., x -> x^2, is commonly used in collection function or list comprehensions.
```
map(x -> x^2, y) # square each element in x
```
Functions can be nested:

function outerfunction()
    # do some outer stuff
    function innerfunction()
        # do inner stuff
        # can access prior outer definitions
    end
    # do more outer stuff
end

Functions can be vectorized using the Dot syntax:

# defined for scalar
function myfunc(x)
    return sin(x^2)
end

x = randn(5, 3)
myfunc.(x)

5×3 Matrix{Float64}:
 -0.0319351  0.768417   -0.703
  0.242969   0.542491    0.00218697
  0.0505571  0.0918307  -0.880399
  0.0557211  0.999963    0.00819753
  0.135464   0.0015931  -0.641239

Collection function (think this as the series of apply functions in R).

Apply a function to each element of a collection:

map(f, coll) # or
map(coll) do elem
    # do stuff with elem
    # must contain return
end

map(x -> sin(x^2), x)

5×3 Matrix{Float64}:
 -0.0319351  0.768417   -0.703
  0.242969   0.542491    0.00218697
  0.0505571  0.0918307  -0.880399
  0.0557211  0.999963    0.00819753
  0.135464   0.0015931  -0.641239

map(x) do elem
    elem = elem^2
    return sin(elem)
end

5×3 Matrix{Float64}:
 -0.0319351  0.768417   -0.703
  0.242969   0.542491    0.00218697
  0.0505571  0.0918307  -0.880399
  0.0557211  0.999963    0.00819753
  0.135464   0.0015931  -0.641239

# Mapreduce
mapreduce(x -> sin(x^2), +, x)

0.6428175068162457

# same as
sum(x -> sin(x^2), x)

0.6428175068162457

List comprehension

[sin(2i + j) for i in 1:5, j in 1:3] # similar to Python

5×3 Matrix{Float64}:
  0.14112   -0.756802  -0.958924
 -0.958924  -0.279415   0.656987
  0.656987   0.989358   0.412118
  0.412118  -0.544021  -0.99999
 -0.99999   -0.536573   0.420167

Type system¶

Every variable in Julia has a type.
When thinking about types, think about sets.
Everything is a subtype of the abstract type Any.
An abstract type defines a set of types
- Consider types in Julia that are a Number:

We can explore type hierarchy with typeof(), supertype(), and subtypes().

typeof(1.0), typeof(1)

(Float64, Int64)

supertype(Float64)

AbstractFloat

subtypes(AbstractFloat)

4-element Vector{Any}:
 BigFloat
 Float16
 Float32
 Float64

# Is Float64 a subtype of AbstractFloat?
Float64 <: AbstractFloat

true

# On 64bit machine, Int == Int64
Int == Int64

true

# convert to Float64
convert(Float64, 1)

1.0

# same as
Float64(1)

1.0

# Float32 vector
x = randn(Float32, 5)

5-element Vector{Float32}:
  1.570766
 -0.78435385
  2.325658
 -0.13330916
  0.27333868

# convert to Float64
convert(Vector{Float64}, x)

5-element Vector{Float64}:
  1.5707659721374512
 -0.7843538522720337
  2.325658082962036
 -0.13330915570259094
  0.2733386754989624

# same as
Float64.(x)

5-element Vector{Float64}:
  1.5707659721374512
 -0.7843538522720337
  2.325658082962036
 -0.13330915570259094
  0.2733386754989624

# convert Float64 to Int64
convert(Int, 1.0)

1

convert(Int, 1.5) # should use round(1.5)

InexactError: Int64(1.5)

Stacktrace:
 [1] Int64
   @ ./float.jl:812 [inlined]
 [2] convert(#unused#::Type{Int64}, x::Float64)
   @ Base ./number.jl:7
 [3] top-level scope
   @ In[102]:1
 [4] eval
   @ ./boot.jl:373 [inlined]
 [5] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
   @ Base ./loading.jl:1196

round(Int, 1.5)

2

Multiple dispatch¶

Multiple dispatch lies in the core of Julia design. It allows built-in and user-defined functions to be overloaded for different combinations of argument types.
Let's consider a simple "doubling" function:

g(x) = x + x

g (generic function with 1 method)

g(1.5)

3.0

This definition is too broad, since some things, e.g., strings, can't be added

g("hello world")

MethodError: no method matching +(::String, ::String)
Closest candidates are:
  +(::Any, ::Any, ::Any, ::Any...) at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/base/operators.jl:655
  +(::ChainRulesCore.AbstractThunk, ::Any) at ~/.julia/packages/ChainRulesCore/RbX5a/src/tangent_arithmetic.jl:122
  +(::ChainRulesCore.Tangent{P}, ::P) where P at ~/.julia/packages/ChainRulesCore/RbX5a/src/tangent_arithmetic.jl:146
  ...

Stacktrace:
 [1] g(x::String)
   @ Main ./In[104]:1
 [2] top-level scope
   @ In[106]:1
 [3] eval
   @ ./boot.jl:373 [inlined]
 [4] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
   @ Base ./loading.jl:1196

This definition is correct but too restrictive, since any Number can be added.

g(x::Float64) = x + x

g (generic function with 2 methods)

This definition will automatically work on the entire type tree above!

g(x::Number) = x + x

g (generic function with 3 methods)

This is a lot nicer than

function g(x)
    if isa(x, Number)
        return x + x
    else
        throw(ArgumentError("x should be a number"))
    end
end

methods(func) function display all methods defined for func.

methods(g)

When calling a function with multiple definitions, Julia will search from the narrowest signature to the broadest signature.
@which func(x) marco tells which method is being used for argument signature x.

# an Int64 input
@which g(1)

# a Vector{Float64} input
@which g(randn(5))

Just-in-time compilation (JIT)¶

Following figures are taken from Arch D. Robinson's slides Introduction to Writing High Performance Julia.

Julia's efficiency results from its capability to infer the types of all variables within a function and then call LLVM to generate optimized machine code at run-time.

Consider the g (doubling) function defined earlier. This function will work on any type which has a method for +.

g(2), g(2.0)

(4, 4.0)

Step 1: Parse Julia code into abstract syntax tree (AST).

@code_lowered g(2)

CodeInfo(
1 ─ %1 = x + x
└──      return %1
)

Step 2: Type inference according to input type.

@code_warntype g(2)

MethodInstance for g(::Int64)
  from g(x::Number) in Main at In[108]:1
Arguments
  #self#::Core.Const(g)
  x::Int64
Body::Int64
1 ─ %1 = (x + x)::Int64
└──      return %1

@code_warntype g(2.0)

MethodInstance for g(::Float64)
  from g(x::Float64) in Main at In[107]:1
Arguments
  #self#::Core.Const(g)
  x::Float64
Body::Float64
1 ─ %1 = (x + x)::Float64
└──      return %1

Step 3: Compile into LLVM bytecode (equivalent of R bytecode generated by the compiler package).

@code_llvm g(2)

;  @ In[108]:1 within `g`
define i64 @julia_g_5791(i64 signext %0) #0 {
top:
; ┌ @ int.jl:87 within `+`
   %1 = shl i64 %0, 1
; └
  ret i64 %1
}

@code_llvm g(2.0)

;  @ In[107]:1 within `g`
define double @julia_g_5812(double %0) #0 {
top:
; ┌ @ float.jl:399 within `+`
   %1 = fadd double %0, %0
; └
  ret double %1
}

We didn't provide a type annotation. But different LLVM code was generated depending on the argument type!

In R or Python, g(2) and g(2.0) would use the same code for both.

In Julia, g(2) and g(2.0) dispatches to optimized code for Int64 and Float64, respectively.

For integer input x, LLVM compiler is smart enough to know x + x is simple shifting x by 1 bit, which is faster than addition.

Step 4: Lowest level is the assembly code, which is machine dependent.

@code_native g(2)

	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ In[108]:1 within `g`
; │┌ @ int.jl:87 within `+`
	leaq	(%rdi,%rdi), %rax
; │└
	retq
	nopw	%cs:(%rax,%rax)
; └

@code_native g(2.0)

	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ In[107]:1 within `g`
; │┌ @ float.jl:399 within `+`
	vaddsd	%xmm0, %xmm0, %xmm0
; │└
	retq
	nopw	%cs:(%rax,%rax)
; └

Profiling Julia code¶

Julia has several built-in tools for profiling. The @time marco outputs run time and heap allocation.

# a function defined earlier
function tally(x::Array)
    s = zero(eltype(x))
    for v in x
        s += v
    end
    s
end

using Random
Random.seed!(123)
a = rand(20_000_000)
@time tally(a) # first run: include compile time

  0.034342 seconds (4.27 k allocations: 218.401 KiB, 18.19% compilation time)

9.9992648402688e6

@time tally(a)

  0.027129 seconds (1 allocation: 16 bytes)

9.9992648402688e6

For more robust benchmarking, the BenchmarkTools.jl package is highly recommended.

using BenchmarkTools

@benchmark tally($a)

BenchmarkTools.Trial: 197 samples with 1 evaluation.
 Range (min … max):  24.818 ms …  31.434 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     25.125 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   25.368 ms ± 799.563 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

    █▅█▄ ▁                                                      
  ▄▆████▅█▆▆▄▃▃▃▃▃▂▃▂▃▂▂▁▁▁▃▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▁▁▁▂▁▁▁▁▃▁▁▃ ▂
  24.8 ms         Histogram: frequency by time         28.5 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

The Profile module gives line by line profile results.

using Profile

Profile.clear()
@profile tally(a)
Profile.print(format=:flat)

 Count  Overhead File                    Line Function
 =====  ======== ====                    ==== ========
    24         0 In[120]                    5 tally(x::Vector{Float64})
    24         0 @Base/array.jl           835 iterate
    26         2 @Base/boot.jl            373 eval
    26         0 @Base/condition.jl        78 lock
    26         0 @Base/condition.jl       123 wait(c::Base.GenericCondition{R...
    26         0 @Base/essentials.jl      716 #invokelatest#2
    26         0 @Base/essentials.jl      714 invokelatest
    24        24 @Base/int.jl             476 <
    24         0 @Base/int.jl             483 <
    26         0 @Base/loading.jl        1196 include_string(mapexpr::typeof(...
    26         0 @Base/lock.jl            190 lock(f::Distributed.var"#134#13...
    26         0 @Base/task.jl            423 (::IJulia.var"#15#18")()
    78        78 @Base/task.jl            827 poptask(W::Base.InvasiveLinkedL...
    52         0 @Base/task.jl            544 task_done_hook(t::Task)
    78         0 @Base/task.jl            836 wait()
    26         0 ...eadingconstructs.jl   178 (::Distributed.var"#133#135")()
    26         0 ...ia/src/eventloop.jl     8 eventloop(socket::ZMQ.Socket)
    26         0 .../execute_request.jl    67 execute_request(socket::ZMQ.Soc...
    26         0 .../SoftGlobalScope.jl    65 softscope_include_string(m::Mod...
    26         0 ...d/src/remotecall.jl   281 #134
    26         0 ...d/src/remotecall.jl   279 macro expansion
Total snapshots: 104

One can use ProfileView package for better visualization of profile data:

using ProfileView

ProfileView.view()

# check type stability
@code_warntype tally(a)

MethodInstance for tally(::Vector{Float64})
  from tally(x::Array) in Main at In[120]:2
Arguments
  #self#::Core.Const(tally)
  x::Vector{Float64}
Locals
  @_3::Union{Nothing, Tuple{Float64, Int64}}
  s::Float64
  v::Float64
Body::Float64
1 ─ %1  = Main.eltype(x)::Core.Const(Float64)
│         (s = Main.zero(%1))
│   %3  = x::Vector{Float64}
│         (@_3 = Base.iterate(%3))
│   %5  = (@_3 === nothing)::Bool
│   %6  = Base.not_int(%5)::Bool
└──       goto #4 if not %6
2 ┄ %8  = @_3::Tuple{Float64, Int64}
│         (v = Core.getfield(%8, 1))
│   %10 = Core.getfield(%8, 2)::Int64
│         (s = s + v)
│         (@_3 = Base.iterate(%3, %10))
│   %13 = (@_3 === nothing)::Bool
│   %14 = Base.not_int(%13)::Bool
└──       goto #4 if not %14
3 ─       goto #2
4 ┄       return s

# check LLVM bitcode
@code_llvm tally(a)

;  @ In[120]:2 within `tally`
define double @julia_tally_6661({}* nonnull align 16 dereferenceable(40) %0) #0 {
top:
;  @ In[120]:4 within `tally`
; ┌ @ array.jl:835 within `iterate` @ array.jl:835
; │┌ @ array.jl:215 within `length`
    %1 = bitcast {}* %0 to { i8*, i64, i16, i16, i32 }*
    %2 = getelementptr inbounds { i8*, i64, i16, i16, i32 }, { i8*, i64, i16, i16, i32 }* %1, i64 0, i32 1
    %3 = load i64, i64* %2, align 8
; │└
; │┌ @ int.jl:483 within `<` @ int.jl:476
    %.not = icmp eq i64 %3, 0
; │└
   br i1 %.not, label %L43, label %L18

L18:                                              ; preds = %top
; │┌ @ array.jl:861 within `getindex`
    %4 = bitcast {}* %0 to double**
    %5 = load double*, double** %4, align 8
    %6 = load double, double* %5, align 8
; └└
;  @ In[120]:5 within `tally`
; ┌ @ float.jl:399 within `+`
   %7 = fadd double %6, 0.000000e+00
; └
; ┌ @ array.jl:835 within `iterate`
; │┌ @ int.jl:483 within `<` @ int.jl:476
    %.not1317.not = icmp eq i64 %3, 1
; │└
   br i1 %.not1317.not, label %L43, label %L37.lr.ph

L37.lr.ph:                                        ; preds = %L18
   %8 = icmp ugt i64 %3, 2
   %umax = select i1 %8, i64 %3, i64 2
   br label %L37

L37:                                              ; preds = %L37, %L37.lr.ph
   %9 = phi i64 [ 1, %L37.lr.ph ], [ %value_phi418, %L37 ]
   %10 = phi double [ %7, %L37.lr.ph ], [ %14, %L37 ]
   %value_phi418 = phi i64 [ 2, %L37.lr.ph ], [ %13, %L37 ]
; │┌ @ array.jl:861 within `getindex`
    %11 = getelementptr inbounds double, double* %5, i64 %9
    %12 = load double, double* %11, align 8
; │└
; │┌ @ int.jl:87 within `+`
    %13 = add nuw nsw i64 %value_phi418, 1
; └└
; ┌ @ float.jl:399 within `+`
   %14 = fadd double %12, %10
; └
; ┌ @ array.jl:835 within `iterate`
; │┌ @ int.jl:483 within `<` @ int.jl:476
    %exitcond.not = icmp eq i64 %value_phi418, %umax
; │└
   br i1 %exitcond.not, label %L43, label %L37

L43:                                              ; preds = %L37, %L18, %top
   %value_phi9 = phi double [ 0.000000e+00, %top ], [ %7, %L18 ], [ %14, %L37 ]
; └
;  @ In[120]:7 within `tally`
  ret double %value_phi9
}

@code_native tally(a)

	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ In[120]:4 within `tally`
; │┌ @ array.jl:835 within `iterate` @ array.jl:835
; ││┌ @ array.jl:215 within `length`
	movq	8(%rdi), %rdx
; ││└
; ││┌ @ int.jl:483 within `<` @ int.jl:476
	testq	%rdx, %rdx
; ││└
	je	L62
; ││┌ @ array.jl:861 within `getindex`
	movq	(%rdi), %rax
	vxorpd	%xmm0, %xmm0, %xmm0
; │└└
; │ @ In[120]:5 within `tally`
; │┌ @ float.jl:399 within `+`
	vaddsd	(%rax), %xmm0, %xmm0
; │└
; │┌ @ array.jl:835 within `iterate`
; ││┌ @ int.jl:483 within `<` @ int.jl:476
	cmpq	$1, %rdx
; ││└
	je	L61
	cmpq	$2, %rdx
	movl	$2, %ecx
	cmovaq	%rdx, %rcx
	movl	$1, %edx
	nopl	(%rax)
; │└
; │┌ @ float.jl:399 within `+`
L48:
	vaddsd	(%rax,%rdx,8), %xmm0, %xmm0
; │└
; │┌ @ array.jl:835 within `iterate`
; ││┌ @ int.jl:483 within `<` @ int.jl:476
	incq	%rdx
	cmpq	%rdx, %rcx
; ││└
	jne	L48
; │└
; │ @ In[120]:7 within `tally`
L61:
	retq
; │ @ In[120] within `tally`
L62:
	vxorps	%xmm0, %xmm0, %xmm0
; │ @ In[120]:7 within `tally`
	retq
	nopw	%cs:(%rax,%rax)
; └

Exercise: Annotate the loop in tally function by @simd and look for the difference in LLVM bitcode and machine code.

Memory profiling¶

Detailed memory profiling requires a detour. First let's write a script bar.jl, which contains the workload function tally and a wrapper for profiling.

;cat bar.jl

using Profile

function tally(x::Array)
    s = zero(eltype(x))
    for v in x
        s += v
    end
    s
end

# call workload from wrapper to avoid misattribution bug
function wrapper()
    y = rand(10000)
    # force compilation
    println(tally(y))
    # clear allocation counters
    Profile.clear_malloc_data()
    # run compiled workload
    println(tally(y))
end

wrapper()

Next, in terminal, we run the script with --track-allocation=user option.

#;julia --track-allocation=user bar.jl

The profiler outputs a file bar.jl.51116.mem.

;cat bar.jl.51116.mem

        - using Profile
        - 
        - function tally(x::Array)
        -     s = zero(eltype(x))
        -     for v in x
        -         s += v
        -     end
        -     s
        - end
        - 
        - # call workload from wrapper to avoid misattribution bug
        - function wrapper()
        0     y = rand(10000)
        -     # force compilation
        0     println(tally(y))
        -     # clear allocation counters
        0     Profile.clear_malloc_data()
        -     # run compiled workload
      528     println(tally(y))
        - end
        - 
        - wrapper()
        -