Itō Calculus and Stochastic Differential Equations
by Manuel de Prada Corral
2 min read
Basic notes on Itō calculus applied to stochastic differential equations (SDEs), with code to simulate and filter stochastic processes.
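As a taste of what the post covers, here is a minimal sketch (my own, not taken from the post) of simulating a geometric Brownian motion with the Euler-Maruyama scheme; the drift, volatility, and step count are illustrative choices.

```python
import numpy as np

def euler_maruyama(x0, mu, sigma, T=1.0, n_steps=1000, seed=0):
    """Simulate dX_t = mu * X_t dt + sigma * X_t dW_t with the Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))  # Brownian increment over one step
        x[i + 1] = x[i] + mu * x[i] * dt + sigma * x[i] * dW
    return x

path = euler_maruyama(x0=1.0, mu=0.05, sigma=0.2)
print(path[-1])
```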
This blog is more of a public notebook than a real blog.
Expect typos and unfinished posts :)
by Manuel de Prada Corral
2 min read
Continuing the saga of the Toy Transformer (which I will eventually merge here), I have built a few more toy models to debug generation algorithms.
by Manuel de Prada Corral
6 min read
This WIP post collects some of my notes on sampling theory.
by Manuel de Prada Corral
3 min read
A few weeks ago, I found myself implementing "Stochastic Beams and Where to Find Them" (Gumbel-top-k-based sampling without replacement from a Transformer LM).
Debugging and verifying the correctness of a sampling algorithm is not straightforward. I therefore built a fake carcass for a Transformer model with a small vocabulary and fixed, controlled probabilities that let me keep a close eye on the logits and the generated sequences. The state diagram below shows the resulting probability tree.
stateDiagram-v2
state "[0]" as 1
state "[0,1]" as 01
state "[0,2]" as 02
state "[0,1,1]" as 011
state "[0,1,1,3]" as 0113
state "[0,1,2]" as 012
state "[0,1,2,3]" as 0123
state "[0,2,1]" as 021
state "[0,2,1,3]" as 0213
state "[0,2,2]" as 022
state "[0,2,2,3]" as 0223
note right of 0113
prob=0.075
logp=-2.59
end note
note right of 0123
prob=0.675
logp=-0.39
end note
note right of 0223
prob=0.225
logp=-1.49
end note
note right of 0213
prob=0.025
logp=-3.68
end note
[*] --> 1 : 0 (BOS)
1 --> 01 : 75%
1 --> 02 : 25%
01 --> 011 : 10%
01 --> 012 : 90%
02 --> 021 : 10%
02 --> 022 : 90%
011 --> 0113 : EOS
012 --> 0123 : EOS
021 --> 0213 : EOS
022 --> 0223 : EOS
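For illustration, here is a minimal sketch (plain Python, not the actual carcass from the post) of a toy "model" hard-coded with the transition probabilities from the diagram above; enumerating the four complete sequences recovers the probabilities annotated in the notes.

```python
import math

# Fixed next-token distributions keyed by the generated prefix
# (0 = BOS, 3 = EOS), matching the state diagram above.
NEXT = {
    (0,):      {1: 0.75, 2: 0.25},
    (0, 1):    {1: 0.10, 2: 0.90},
    (0, 2):    {1: 0.10, 2: 0.90},
    (0, 1, 1): {3: 1.0},  # EOS is deterministic after three tokens
    (0, 1, 2): {3: 1.0},
    (0, 2, 1): {3: 1.0},
    (0, 2, 2): {3: 1.0},
}

def enumerate_sequences(prefix=(0,), prob=1.0):
    """Yield every complete sequence (ending in EOS=3) and its probability."""
    if prefix[-1] == 3:
        yield prefix, prob
        return
    for token, p in NEXT[prefix].items():
        yield from enumerate_sequences(prefix + (token,), prob * p)

for seq, p in enumerate_sequences():
    print(list(seq), f"prob={p:.3f}", f"logp={math.log(p):.2f}")
# [0, 1, 1, 3] prob=0.075 logp=-2.59
# [0, 1, 2, 3] prob=0.675 logp=-0.39
# [0, 2, 1, 3] prob=0.025 logp=-3.69
# [0, 2, 2, 3] prob=0.225 logp=-1.49
```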
by Manuel de Prada Corral
4 min read
Stochastic beam search is a principled way of drawing a sample without replacement from an autoregressive model, simply by perturbing the scores of the beam search algorithm. This makes it possible to construct low-variance estimators over the model's distribution, which is useful for estimating the model's properties and exploring stochastic generation strategies.
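The core trick is easy to sketch in the flat (non-sequential) case: perturb each log-probability with independent Gumbel noise and keep the top k, which yields k distinct items sampled without replacement. A minimal numpy sketch (my own, with an illustrative distribution):

```python
import numpy as np

def gumbel_top_k(log_probs, k, rng):
    """Sample k distinct indices without replacement via the Gumbel-top-k trick."""
    gumbels = rng.gumbel(size=log_probs.shape)  # G_i ~ Gumbel(0, 1), one per category
    perturbed = log_probs + gumbels             # perturbed log-probabilities
    return np.argsort(-perturbed)[:k]           # indices of the k largest values

rng = np.random.default_rng(0)
log_probs = np.log(np.array([0.5, 0.3, 0.15, 0.05]))  # illustrative distribution
print(gumbel_top_k(log_probs, k=2, rng=rng))
```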
by Manuel de Prada Corral
6 min read
While implementing a new generation strategy for Transformer models, I found myself delving deep into the HuggingFace library. The documentation is clear about usage, but much less so about implementation details.
Here is a collection of notes from my dive into the codebase, which may be useful for anyone looking to understand or extend HuggingFace's generation pipeline.
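One common entry point for extending generation is a custom LogitsProcessor passed to model.generate. The sketch below is a minimal illustration of that hook; the gpt2 checkpoint, prompt, and rescaling factor are arbitrary choices of mine, not anything from the post.

```python
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          LogitsProcessor, LogitsProcessorList)

class ScaleLogits(LogitsProcessor):
    """Hypothetical processor that rescales the next-token logits at every step."""
    def __init__(self, factor: float):
        self.factor = factor

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        return scores * self.factor

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tok("Hello", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=20,
    logits_processor=LogitsProcessorList([ScaleLogits(0.7)]),
)
print(tok.decode(out[0], skip_special_tokens=True))
```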
by Manuel de Prada Corral
2 min read
A brief summary of my experience replacing Vodafone's Sercomm H500-s router with a neutral router running OpenWrt, explaining the problems with obtaining the PPPoE credentials and how to solve them.
More posts can be found in the archive.