Blurb: Forward Sample Exploration in Soss
May 10, 2021
Over the past few months, we've made some big improvements to our tools for working with samples in Soss. This short post gives an update.
Say you have a linear model:
using Soss
using MeasureTheory
using TupleVectors
m = @model n begin
    α ~ Normal()
    β ~ Normal()
    x ~ Normal() |> iid(n)
    σ ~ Exponential(λ=1)
    y ~ For(x) do xj
        Normal(α + β * xj, σ)
    end
    return y
end
You can "run" the model like this:
rand(m(3))
3-element Vector{Float64}:
-1.3481528254815156
-0.8467839966758355
-3.2793581323670113
If you want more (iid) samples from the model, you can do
rand(m(3), 5)
5-element ArraysOfArrays.ArrayOfSimilarArrays{Float64, 1, 1, 2, ElasticArrays.ElasticMatrix{Float64, 1, Vector{Float64}}}:
[-0.5801870698412048, -0.7831790492917818, -3.694601328454099]
[2.0979335431889803, 3.5630270546059557, 2.769652054248688]
[-3.1107934379484066, -2.6920071046722596, -2.1705448322142686]
[1.208651101972964, 1.7374231087644396, 2.2250207982499965]
[1.4589264372159674, 1.2845073278482224, 1.3285317115334254]
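That ArrayOfSimilarArrays element type comes from ArraysOfArrays.jl, which presents a flat array as a vector of slices without copying. Here's a minimal sketch of the idea, independent of Soss (nestedview and flatview are the ArraysOfArrays API; the 3×5 matrix is made-up data):

```julia
using ArraysOfArrays

# A single 3×5 Matrix backs five length-3 "samples"
storage = randn(3, 5)
samples = nestedview(storage)   # a 5-element vector of 3-element column views

# No copying happened: flatview recovers the very same Matrix object
flatview(samples) === storage
```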
Note the ArraysOfArrays: the underlying storage is a single Matrix. Still, this quickly becomes a wall of numbers that's hard to track. We'll be adding more tools to make rand output easy to work with (Note to self...). But for now, let's compare simulate:
simulate(m(3), 5)
5-element TupleVector with schema (trace = (x = Vector{Float64}, y = Vector{Float64}, α = Float64, β = Float64, σ = Float64), value = Vector{Float64})
(trace = (x = [-0.51±2.0, -0.58±1.3, -0.13±0.72], y = [-0.43±1.7, -0.54±1.0, 0.05±1.1], α = -0.1±0.72, β = -0.37±0.91, σ = 0.64±0.54), value = [-0.43±1.7, -0.54±1.0, 0.05±1.1])
Much better! Now we can easily do things like
mysim = simulate(m(3), 1000)
1000-element TupleVector with schema (trace = (x = Vector{Float64}, y = Vector{Float64}, α = Float64, β = Float64, σ = Float64), value = Vector{Float64})
(trace = (x = [0.02±0.95, 0.07±0.99, 0.0±0.97], y = [-0.06±1.9, 0.04±1.9, -0.0±2.0], α = -0.001±0.97, β = 0.02±1.0, σ = 1.0±1.0), value = [-0.06±1.9, 0.04±1.9, -0.0±2.0])
Just to be clear, what we're looking at here is a summary of 1000 samples. Here's one of them:
mysim[1]
(trace = (x = [0.30021947827881323, -1.7858459256694803, 1.077205113757541], y = [0.6374538286688411, -5.9348120120560885, 0.7044161798044803], α = -0.4563348174767442, β = 2.0661080442940025, σ = 0.8741722654925329), value = [0.6374538286688411, -5.9348120120560885, 0.7044161798044803])
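Under the hood, a TupleVector stores each named field as a column, which is why both indexing one row and pulling out a whole field are cheap. A small standalone sketch (made-up data, not the model's trace; this assumes TupleVector accepts a named tuple of equal-length columns):

```julia
using TupleVectors
using Statistics

# A TupleVector built directly from a named tuple of columns
tv = TupleVector((α = randn(1000), β = randn(1000)))

tv[1]    # one row, as a named tuple (α = ..., β = ...)
tv.α     # one whole column, as a plain Vector{Float64}

# Column access makes summaries one-liners
mean(tv.α), std(tv.α)
```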
It can be handy to make the names of a TupleVector available as variables, transform them, and then wrap the result into a new TupleVector. For that, we have @with. Here are the expected y values and studentized residuals:
mytrace = mysim.trace
@with mytrace begin
    Ey = α .+ β .* x
    r = (y - Ey) / σ
    (;Ey, r)
end
1000-element TupleVector with schema (Ey = Vector{Float64}, r = Vector{Float64})
(Ey = [-0.0±1.4, -0.04±1.4, 0.0±1.3], r = [0.03±1.0, 0.06±0.96, 0.0±1.1])
We could even be a little fancy and do
@with mysim begin
    @with trace begin
        Ey = α .+ β .* x
        r = (y - Ey) / σ
        (;Ey, r)
    end
end
1000-element TupleVector with schema (Ey = Vector{Float64}, r = Vector{Float64})
(Ey = [-0.0±1.4, -0.04±1.4, 0.0±1.3], r = [0.03±1.0, 0.06±0.96, 0.0±1.1])
You can do this with posterior samples too! Let's save that one for next time.