Understanding Itō Calculus and Its Application in Stochastic Differential Equations

by Manuel de Prada Corral

3 min read

In this blog post, we will explore the essential concepts of Itō calculus and how they apply to solving stochastic differential equations (SDEs). We will also see how these concepts are implemented in Python code to simulate and filter stochastic processes.
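As a taste of the kind of simulation the post covers, here is a minimal sketch of the Euler–Maruyama scheme applied to a geometric Brownian motion, dX = μX dt + σX dW. The function name and parameters are illustrative, not the post's actual code:

```python
import math
import random

def euler_maruyama(mu, sigma, x0, T, n, seed=0):
    """Simulate dX = mu*X dt + sigma*X dW (geometric Brownian motion)
    with the Euler-Maruyama discretization over n steps on [0, T]."""
    rng = random.Random(seed)
    dt = T / n
    x = x0
    path = [x]
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x = x + mu * x * dt + sigma * x * dW
        path.append(x)
    return path

path = euler_maruyama(mu=0.05, sigma=0.2, x0=1.0, T=1.0, n=100)
```

With σ = 0 the noise vanishes and the scheme reduces to the deterministic Euler method, which is a quick sanity check on any implementation.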

Continue reading →

A Toy Probabilistic Transformer for Debugging Generation Algorithms in HuggingFace🤗

by Manuel de Prada Corral

3 min read

A few weeks ago, I found myself implementing "Stochastic Beams and Where to Find Them" (sampling without replacement from a Transformer).

Debugging and verifying the correctness of a sampling algorithm in HuggingFace is not straightforward. So I built a mock shell of a Transformer model with a small vocabulary and fixed, controlled probabilities, which makes it easy to keep a close eye on the logits and the generated sequences.

stateDiagram-v2
    state "[0]" as 1
    state "[0,1]" as 01
    state "[0,2]" as 02
    state "[0,1,1]" as 011
    state "[0,1,1,3]" as 0113
    state "[0,1,2]" as 012
    state "[0,1,2,3]" as 0123
    state "[0,2,1]" as 021
    state "[0,2,1,3]" as 0213
    state "[0,2,2]" as 022
    state "[0,2,2,3]" as 0223

    note right of 0113
        prob=0.075
        logp=-2.59
    end note
    note right of 0123
        prob=0.675
        logp=-0.39
    end note
    note right of 0223
        prob=0.225
        logp=-1.49
    end note
    note right of 0213
        prob=0.025
        logp=-3.69
    end note

    [*] --> 1 : 0 (BOS)
    1 --> 01 : 75%
    1 --> 02 : 25%
    01 --> 011 : 10%
    01 --> 012 : 90%
    02 --> 021 : 10%
    02 --> 022 : 90%
    011 --> 0113 : EOS
    012 --> 0123 : EOS
    021 --> 0213 : EOS
    022 --> 0223 : EOS
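The sequence probabilities annotated in the diagram can be reproduced by multiplying the branch probabilities along each path. A minimal sketch (variable names are illustrative; token 0 is BOS and token 3 is EOS, as in the diagram):

```python
import math

# Branch probabilities from the toy model's state diagram.
step1 = {1: 0.75, 2: 0.25}   # first real token after BOS
step2 = {1: 0.10, 2: 0.90}   # second token, same split on either branch

# Probability of each complete sequence [BOS, a, b, EOS] is the
# product of the branch probabilities along its path.
probs = {}
for a, pa in step1.items():
    for b, pb in step2.items():
        probs[(0, a, b, 3)] = pa * pb

for seq, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(seq, f"prob={p:.3f} logp={math.log(p):.2f}")
```

The four leaves sum to 1, so the toy model defines a proper distribution over sequences, which is exactly what makes it useful for checking that a sampling algorithm produces the right frequencies.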


Continue reading →

Porting Stochastic Beam Search to HuggingFace🤗

by Manuel de Prada Corral

4 min read

Stochastic beam search is a principled way of drawing a sample without replacement from an autoregressive model, simply by perturbing the scores of the beam search algorithm. This makes it possible to construct low-variance estimators over the model's distribution, which is useful for estimating the model's properties and exploring stochastic generation strategies.
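The perturbation at the heart of stochastic beam search is the Gumbel-top-k trick. Here is a minimal sketch of that trick over a flat categorical distribution (the full algorithm applies perturbations hierarchically over beam search scores; the function name is illustrative):

```python
import math
import random

def gumbel_top_k(logprobs, k, seed=0):
    """Sample k distinct indices without replacement by perturbing each
    log-probability with i.i.d. Gumbel(0) noise and keeping the k largest
    perturbed scores (the Gumbel-top-k trick)."""
    rng = random.Random(seed)
    perturbed = []
    for i, lp in enumerate(logprobs):
        g = -math.log(-math.log(rng.random()))  # Gumbel(0) sample
        perturbed.append((lp + g, i))
    perturbed.sort(reverse=True)
    return [i for _, i in perturbed[:k]]

sample = gumbel_top_k([math.log(p) for p in [0.7, 0.2, 0.1]], k=2)
```

Taking the single argmax of the perturbed scores recovers an exact sample from the categorical distribution; keeping the top k generalizes this to sampling without replacement.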

Continue reading →

Unofficial documentation for the HuggingFace🤗 generation pipeline

by Manuel de Prada Corral

6 min read

While implementing a new generation strategy for Transformer models, I found myself delving deep into the HuggingFace library. The documentation is clear about usage, but much less so about the implementation details.

Here is a collection of notes I've compiled from my dive into the codebase. This may prove beneficial for anyone looking to understand or extend HuggingFace's generation pipeline.

Continue reading →

More posts can be found in the archive.