Saved tweets
Just published: Do large language models understand us? https://t.co/Ks6zztTzxv It's sometimes claimed that ML is "just stats" and AI can't "understand". I'm arguing that LLMs have a great deal to teach us about language, understanding, intelligence, sociality, even personhood.
— Blaise Aguera (@blaiseaguera) December 16, 2021
You use GPUs every day, but do you (actually) know how they work?
— Sasha Rush (@srush_nlp) July 12, 2022
GPU-Puzzles (v0.1) - 14 short puzzles in Python with a visual debugger. No background required. Do puzzles, learn CUDA.
Link: https://t.co/Yk1lWRqilN pic.twitter.com/eFs7u5lxES
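GPU-Puzzles teaches through a small Numba-based DSL; for a rough flavor of what the early puzzles ask you to write, here is a minimal standalone CUDA kernel in plain Numba (my sketch, not code from the repo):

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_ten(a, out):
    # one GPU thread handles one array element
    i = cuda.threadIdx.x
    if i < a.size:              # guard against extra threads
        out[i] = a[i] + 10

a = np.arange(8, dtype=np.float32)
out = np.zeros_like(a)
add_ten[1, 8](a, out)           # launch 1 block of 8 threads
print(out)                      # [10. 11. ... 17.]
```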
When I started lifting I thought I was a hard gainer.
— Warren English (@TheWarEnglish) August 23, 2022
Now people tell me I have good genetics.
If you struggle to gain muscle, read this: pic.twitter.com/gDSHgRdflm
There are more than 3,000 TED Talks.
— Unleash Your Mind (@MentalUnleash) August 23, 2022
Here are 10 TED Talks that will change the way you think forever:
3. Public APIs
— Pratham (@Prathkum) November 7, 2022
A collective list of free APIs for use in software and web development
https://t.co/vDRQKBf15V pic.twitter.com/cGI11nbfeV
Here is a great explanation of the Lagrange multiplier (the intuition of which is typically not given). 🧵
— Lionel Page (@page_eco) November 10, 2022
The problem: maximising a function f(x,y) under a constraint g(x,y)=k (the point has to be on the red curve). pic.twitter.com/xgLQ9UAKdp
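The punchline of that intuition, for reference: at a constrained optimum the level curve of f is tangent to the constraint curve, so the two gradients must be parallel.

```latex
% At a constrained extremum of f(x,y) subject to g(x,y) = k:
\nabla f(x, y) = \lambda \,\nabla g(x, y), \qquad g(x, y) = k
% \lambda is the Lagrange multiplier; equivalently, these are the
% stationarity conditions of the Lagrangian
\mathcal{L}(x, y, \lambda) = f(x, y) - \lambda \bigl(g(x, y) - k\bigr)
```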
These @alfcnz's lecture slides are *colourful*! https://t.co/E4qkoqzDj7 pic.twitter.com/HPdAjesSgz
— milton (@tensor_fusion) November 12, 2022
I wrote a blog post on why I decided to join OpenAI instead of academia.
— Rowan Zellers (@rown) February 12, 2023
(after I went on the academic & industry job markets, and got offers from both.)
This post (pt2 in a series) took a while - hoping my experience helps others make life decisions! https://t.co/B5z4DGP9yI
Partners, I need a way to watch F1 without paying DAZN a cent. I'm counting on you.
— - (@Nanoestafa) February 27, 2023
I finally found an explanation for why RL is needed for RLHF that satisfied me. It's actually like playing board games.
— Zhengyao Jiang (@zhengyaojiang) February 28, 2023
The reward model can only judge a full answer and a "critic" is needed to efficiently improve the intermediate moves (earlier tokens in the answer) 1/4 pic.twitter.com/GMjSVC6Tee
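A toy rendering of that board-game analogy (made-up numbers, plain REINFORCE rather than the PPO used in practice): the reward model scores only the finished answer, and the critic's per-position value estimates turn that single score into per-token advantages.

```python
# One sampled answer: log-probs of its tokens under the current policy
token_logprobs = [-0.5, -1.2, -0.3, -0.8]
final_reward = 1.0                 # reward model judges only the full answer
V = [0.2, 0.4, 0.7, 0.9]           # critic: expected reward at each position

loss = 0.0
for logp, v in zip(token_logprobs, V):
    advantage = final_reward - v   # credit assigned to this "move" specifically
    loss -= logp * advantage       # policy-gradient contribution
print(loss)                        # minimizing this reinforces good moves
```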
If you sit for more than 6 hours a day, read this: pic.twitter.com/CYUr79mYF9
— Dan Go (@FitFounder) February 28, 2023
I love Pandas! I've been using it ever since I started doing machine learning more than a decade ago.
— Sebastian Raschka (@rasbt) March 4, 2023
Now that the Pandas 2.0 release candidate is out, I was just taking the new PyArrow backend for a test drive. It's a significant boost over the original https://t.co/vk8mtViPtJ… pic.twitter.com/wThAm26cdB
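For anyone wanting to repeat the test drive, the PyArrow backend is opt-in per call in pandas 2.0 ("data.csv" below is a placeholder):

```python
import pandas as pd

# pandas 2.0: parse with the pyarrow CSV engine and store columns as
# Arrow-backed dtypes instead of NumPy ones
df = pd.read_csv("data.csv", engine="pyarrow", dtype_backend="pyarrow")
print(df.dtypes)   # e.g. int64[pyarrow], string[pyarrow]
```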
I packed up a full-text paper scraper, vector database, and LLM into a CLI to answer questions from only highly-cited peer-reviewed papers. Feels unreal to be able to instantly get answers by an LLM "reading" dozens of papers. 1/2 pic.twitter.com/a6PWxWyuc1
— Andrew White (@andrewwhite01) February 25, 2023
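Not the tool's actual code, but the retrieve-then-read loop underneath it fits in a few lines; `embed` below is a toy stand-in for a real embedding model:

```python
import numpy as np

def embed(text):
    # toy stand-in: a unit vector seeded by the text (stable within one run)
    v = np.random.default_rng(abs(hash(text)) % 2**32).standard_normal(64)
    return v / np.linalg.norm(v)

chunks = ["paper A: methods ...", "paper B: results ...", "paper C: limits ..."]
db = np.stack([embed(c) for c in chunks])    # the "vector database"

query = "What are the limitations?"
scores = db @ embed(query)                   # cosine similarity (unit vectors)
top = np.argsort(scores)[::-1][:2]           # retrieve the top-k chunks
context = "\n".join(chunks[i] for i in top)
prompt = f"Using only these excerpts:\n{context}\n\nAnswer: {query}"
# `prompt` then goes to the LLM for the "reading" step
```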
In my mid-20s to early 30s I dealt with chronic low back pain.
— Dan Go (@FitFounder) March 11, 2023
I went to a chiropractor & he told me I'd have to live with the pain & get adjustments for the rest of my life.
I said screw that, searched for a different solution & fixed my back.
Here's how I did it: pic.twitter.com/sy5DkpZZsp
Harsh truths I know at 32 that I wish I knew at 22:
— Sahil Bloom (@SahilBloom) March 11, 2023
ML runs everywhere now because of 2 recent trends:
— Mishig Davaadorj (@mishig25) March 17, 2023
* javascript-ification of ML (transformers.js, huggingface.js) for running in browsers
* cpp-ification of ML (llama.cpp, bloomz.cpp, whisper.cpp) for running on embedded devices
RIP iOS developers
— peter! (@pwang_szn) March 26, 2023
I just built a 95% functional iOS app in less than 2 hours with GPT-4. (With payments and OpenAI integration.)
I have zero Swift experience (I hired a freelancer to pair program though.)
Gonna write a thread about this entire experience in 1-2 days. pic.twitter.com/H1pffLfx1Y
The normalization scheme that DeepMind researchers came up with for their "linear recurrent unit" (LRU) is a nice example of how it is possible to predictably engineer circuits in artificial neural networks, when you know what you're doing. A thread: pic.twitter.com/AxzFE58OQk
— Charles Foster (@CFGeek) March 27, 2023
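The core of that thread is a one-line variance calculation; reproducing it from the LRU paper (Orvieto et al., 2023) as I understand it:

```latex
% Scalar linear recurrence driven by i.i.d. unit-variance inputs u_t:
x_t = \lambda x_{t-1} + \gamma u_t, \qquad |\lambda| < 1
% Unrolling gives x_t = \gamma \sum_{k \ge 0} \lambda^k u_{t-k}, hence
\operatorname{Var}(x_t) = \gamma^2 \sum_{k \ge 0} |\lambda|^{2k}
                        = \frac{\gamma^2}{1 - |\lambda|^2}
% Choosing \gamma = \sqrt{1 - |\lambda|^2} pins the state variance at 1,
% even as \lambda approaches the unit circle for long-range memory.
```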
If you are a PhD student, you should check out the book "How to Take Smart Notes".
— Mahesh Sathiamoorthy (@madiator) April 1, 2023
I wrote a bit about this book and what I have learned about note-taking in https://t.co/ROFiqmoYC6. pic.twitter.com/vTwlC32xdL
Evolution of my net worth as of 01/02/2023
— La Carrera Del Dinero (@carrera_dinero) April 1, 2023
This will probably be the last month below 40k, and I think that for my age that's pretty good :P
Today I wanted to ask you all something...
Does this monthly post bring you any value? Or not? pic.twitter.com/aXtw0lEX49
1/🧵 Making sense of Principal Component Analysis (PCA), Eigenvectors & Eigenvalues: A simple guide to understanding PCA and its implementation in R! Follow this thread to learn more! #RStats #DataScience #PCA pic.twitter.com/An6qxZGLDP
— Selçuk Korkmaz (@selcukorkmaz) April 23, 2023
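The thread works in R; the same pipeline in NumPy, as a minimal sketch:

```python
import numpy as np

X = np.random.randn(200, 5)              # toy data: 200 samples, 5 features
Xc = X - X.mean(axis=0)                  # 1. center each feature
C = np.cov(Xc, rowvar=False)             # 2. covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)     # 3. eigendecomposition (ascending)
order = np.argsort(eigvals)[::-1]        # 4. sort by explained variance
components = eigvecs[:, order]           #    principal axes (eigenvectors)
scores = Xc @ components[:, :2]          # 5. project onto the top-2 PCs
print(eigvals[order] / eigvals.sum())    # variance explained per component
```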
I was puzzled for a while as to why we need RL for LM training, rather than just using supervised instruct tuning. I now have a convincing argument, which is also reflected in a recent talk by @johnschulman2. I summarize it in this post: https://t.co/DQD1wgyjg3
— (((ل()(ل() 'yoav)))) (@yoavgo) April 22, 2023
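A compressed version of the distinction that post develops: supervised tuning only ever scores the reference answer, while RL samples from the model and scores its own outputs.

```latex
% Supervised instruction tuning: raise the likelihood of reference outputs y^*
\max_\theta \; \mathbb{E}_{(x,\, y^*) \sim D} \bigl[ \log p_\theta(y^* \mid x) \bigr]
% RLHF: sample from the model itself and score with a reward model R
\max_\theta \; \mathbb{E}_{x \sim D,\; y \sim p_\theta(\cdot \mid x)} \bigl[ R(x, y) \bigr]
% Only the second objective evaluates (and can penalize) the model's own mistakes.
```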
"Transformers from scratch" by Brandon Rohrer
— Sanyam Bhutani (@bhutanisanyam1) April 24, 2023
This is one of the best write-ups; it starts from 0 and explains every single detail of the model architecture.
Whether you need a refresher or not, I would still highly recommend reading it: https://t.co/D25bs6TP5X pic.twitter.com/IWOWP1P9QW
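For a taste of the destination, the operation such write-ups build toward is scaled dot-product attention; a minimal NumPy rendering:

```python
import numpy as np

def attention(Q, K, V):
    # softmax(Q K^T / sqrt(d)) V, with a numerically stable softmax
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

Q, K, V = (np.random.randn(4, 8) for _ in range(3))
print(attention(Q, K, V).shape)   # (4, 8): one mixed value vector per query
```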
[1/9] Let's talk about the difference between probability and likelihood in #statistics. These two terms are often confused, but understanding their distinction is key for making sense of data analysis! #Rstats #DataScience pic.twitter.com/Xuo19nyTvU
— Selçuk Korkmaz (@selcukorkmaz) April 23, 2023
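The distinction in one runnable example: the same binomial formula is a probability when the parameter is fixed and the data varies, and a likelihood when the data is fixed and the parameter varies.

```python
from math import comb

def binom(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability: fix p = 0.5, ask about outcomes
print(binom(7, 10, 0.5))          # P(7 heads in 10 | fair coin)

# Likelihood: fix the observed data (7 heads in 10), vary the parameter
for p in (0.3, 0.5, 0.7):
    print(p, binom(7, 10, p))     # L(p | data) peaks near p = 0.7
```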
This is *exactly* what I had in mind when disliking the term "emergent" recently.
— Lucas Beyer (bl16) (@giffmana) May 1, 2023
It seems it's due to the metrics (like binary correct/incorrect); in reality the model does smoothly approach the right answer.
But I was too lazy to verify this intuition myself, glad this paper did! https://t.co/xLQBGPkwxs pic.twitter.com/ZPNaEtiYGM
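A toy version of the argument: if per-token accuracy improves smoothly but the benchmark scores an answer as correct only when all of its tokens are right, the plotted metric jumps even though nothing discontinuous happened.

```python
import numpy as np

p = np.linspace(0, 1, 11)    # per-token accuracy, improving smoothly
k = 10                       # answer counts only if all k tokens are correct
for pi, em in zip(p, p**k):
    print(f"per-token {pi:.1f} -> exact-match {em:.4f}")
# exact-match stays near 0 until p is large, then shoots up: "emergence"
```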
Convolutions from first principles
— Marc Lelarge (@marc_lelarge) May 8, 2023
Linear + shift-invariant = convolution. A simple proof with circulant matrices: https://t.co/P5ifI6nPJh
Learn how stacking convolutions with a kernel of size 3 gets you a network with a receptive field of size 9. pic.twitter.com/swSCqdIusY
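The circulant identity is easy to check numerically; a small sketch of circular convolution as multiplication by a circulant matrix:

```python
import numpy as np

k = np.array([1.0, 2.0, 3.0])                 # kernel of size 3
n = 6
C = np.zeros((n, n))
for i in range(n):
    for j, kj in enumerate(k):
        C[i, (i + j) % n] = kj                # each row: a shifted copy of k

x = np.random.randn(n)
direct = np.array([sum(k[j] * x[(i + j) % n] for j in range(len(k)))
                   for i in range(n)])        # circular sliding-window sum
assert np.allclose(C @ x, direct)             # circulant matmul = same thing
```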
The team at @CohereAI just released an awesome API endpoint (called Rerank) that can easily improve search and recommendation offerings by using LLMs. Here's what you need to know...
— Cameron R. Wolfe, Ph.D. (@cwolferesearch) May 9, 2023
Some background: Most search engines follow a two-step process.
1. Filtering: a rough/efficient… pic.twitter.com/C15AVexFPq
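Not Cohere's API, but the two-step shape is easy to sketch; both scorers below are toy stand-ins (word overlap for the cheap filter, a pretend "LLM" for the rerank):

```python
def overlap(query, doc):
    # stage 1 stand-in: cheap lexical filter (word overlap, not real BM25)
    return len(set(query.lower().split()) & set(doc.lower().split()))

def rerank(query, doc):
    # stage 2 stand-in: an expensive model would score (query, doc) pairs here
    return overlap(query, doc) / (1 + abs(len(doc.split()) - len(query.split())))

def search(query, docs, k=3, n=2):
    shortlist = sorted(docs, key=lambda d: overlap(query, d), reverse=True)[:k]
    return sorted(shortlist, key=lambda d: rerank(query, d), reverse=True)[:n]

docs = ["how to rerank search results", "fast lexical filtering", "cats and dogs"]
print(search("rerank search results", docs))
```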
Interesting paper from my ex-colleagues at @GoogleAI led by @vqctran. Generative retrieval (i.e., DSI) is one of the most fun works I've worked on (and pioneered) during my Google career.
— Yi Tay (@YiTayML) May 22, 2023
Also, @vqctran is driving a lot of the agenda that we worked on together back then. He has… https://t.co/FYyDng4n2e
Is Adam the best optimizer to train neural networks?
— Frank Schneider (@frankstefansch1) June 13, 2023
We don't know. And we won't know until we test training algorithms properly.
That's why we spent ~2.5 years building AlgoPerf, a competitive, time-to-result training algorithms benchmark using realistic workloads! pic.twitter.com/Tzf4R4UXVO
shower thought : drop the position embeddings, rewrite the transformer using complex numbers, encode the position information in the complex phase
— Georgi Gerganov (@ggerganov) June 23, 2023
ref : see how MRI phase encoding works
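This is essentially what rotary position embeddings (RoPE) already do; a small NumPy check that phase-encoded positions make the query-key inner product depend only on the relative offset:

```python
import numpy as np

d = 8                                      # features as d/2 complex pairs
rng = np.random.default_rng(0)
x = rng.standard_normal(d // 2) + 1j * rng.standard_normal(d // 2)
freqs = 1.0 / 10000 ** (np.arange(d // 2) / (d // 2))

def at_pos(v, pos):
    return v * np.exp(1j * freqs * pos)    # position lives in the phase

# vdot conjugates its first argument, so the phases subtract:
print(np.vdot(at_pos(x, 2), at_pos(x, 5)))   # sum |x|^2 * e^{i f (5-2)}
print(np.vdot(at_pos(x, 5), at_pos(x, 8)))   # same value: only 8-5 matters
```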
this paper's nuts. for sentence classification on out-of-domain datasets, all neural (Transformer or not) approaches lose to good old kNN on representations generated by.... gzip https://t.co/6eZiXlJxOX pic.twitter.com/sF9kd1FzI4
— Luke Gessler (@LukeGessler) July 12, 2023
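The whole method fits in a dozen lines; a minimal sketch with a toy training set (1-NN instead of the paper's k-NN):

```python
import gzip

def clen(s):
    return len(gzip.compress(s.encode()))

def ncd(a, b):
    # normalized compression distance, as used in the paper
    return (clen(a + " " + b) - min(clen(a), clen(b))) / max(clen(a), clen(b))

train = [("the movie was wonderful and moving", "pos"),
         ("a terrible waste of two hours", "neg")]

def classify(text):
    return min(train, key=lambda ex: ncd(text, ex[0]))[1]   # nearest neighbor

print(classify("what a wonderful film"))   # likely "pos" on this toy set
```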
github repo w/ model: https://t.co/cRh08vsHFN
— MF FOOM (@MF_FOOM) August 3, 2023
just a modded @karpathy nanoGPT with ada embeddings projected in as input tokens
heads up, the checkpoint on HF is only ~117M parameters, and it's finetuned on a tiny subset of Wikipedia sentences so it's very easy to go OOD
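A guess at the wiring described above, not the repo's code: project the ada embedding down to the model width and prepend it as one extra input "token".

```python
import numpy as np

d_ada, d_model = 1536, 768                      # ada width; assumed GPT width
W = np.random.randn(d_ada, d_model) * 0.02      # learned projection (stand-in)
ada_vec = np.random.randn(d_ada)                # stand-in for an ada embedding
prefix = ada_vec @ W                            # becomes an extra token embedding
token_embs = np.random.randn(5, d_model)        # ordinary input token embeddings
inputs = np.vstack([prefix, token_embs])        # prepend, then feed to the GPT
print(inputs.shape)                             # (6, 768)
```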
every now and then there's an insane alpha bomb on reddit pic.twitter.com/2Gw9Yuu83Y
— varepsilon (@var_epsilon) August 18, 2023
what's your favorite codebase for training larger scale neural networks?
— Aleksa Gordić (@gordic_aleksa) August 27, 2023
fairseq?
GPT-NeoX?
MosaicML's composer/llm-foundry?
something else?
worth taking a look at https://t.co/mFpvQ8MXfA for some intuition on these new knobs pic.twitter.com/m9jEqj9Dej
— Susan Zhang (@suchenzang) August 31, 2023
Whatever inductive bias you can bake into your model... bake it in. https://t.co/Upxj521Bh6 pic.twitter.com/PRMraEJOQN
— Susan Zhang (@suchenzang) August 31, 2023
Do you want to start investing but don't know where to begin?
— Pobre Millenial (@pobremillenial) September 9, 2023
In 4 videos I give you the basics so you can start to understand this world. pic.twitter.com/Z6XDigvanK
I started my career in Data Science back in 2016
— Akshay (@akshay_pachaar) September 9, 2023
Being self-taught, I came across several courses and books, but these are some of my favourites!
1. ISL
Tested by time and read by millions, the bible for statistical & classical machine learning.
Mathematical concepts… pic.twitter.com/83DnSnqgkx
What is Causal Inference?
— Kareem Carr (@kareem_carr) September 11, 2023
Causal Inference is a new science of causation. This field is nothing less than a revolution in how scientists understand data. Read on to learn more.
This is the first post in a series based on the Book of Why by Judea Pearl. I will be reading the… pic.twitter.com/IOI7vHthBc