GreatReads - Blog Aggregator · Phoenix Framework

Bayesian A/B testing is not immune to peeking

Over the last few months at RevenueCat I’ve been building a statistical framework to flag when an A/B test has reached statistical significance. I went through the usual literature, including Evan Miller’s posts. His well-known “How Not to Run an A/B Test” claims that with Bayesian experiment design you can stop at any time and still make valid inferences, and that you don’t need a fixed sample size to get a valid result. I’ve read this claim in other posts too. The impression I got is that you can peek as often as you want, stop the moment the posterior clears a threshold (e.g. $P(A>B) > 0.95$), and you won’t inflate false positives. This is not correct. If you’re an expert in Bayesian statistics this is probably obvious, but it wasn’t for me. So I decided to run some simulations to see what really happens, and I’m sharing the results here in case they are useful to others.
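To make the peeking claim concrete, here is a minimal sketch of the kind of simulation the excerpt describes. It is not the author's code. The function names, the Beta(1, 1) priors, the conversion rate, the peeking schedule and the two-sided stopping rule are all assumptions chosen for illustration. It runs A/A tests (both arms share the same true rate), so every declared "winner" is a false positive.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws, rng):
    """Monte Carlo estimate of P(rate_B > rate_A) with Beta(1, 1) priors (assumed)."""
    wins = 0
    for _ in range(draws):
        sample_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        sample_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += sample_b > sample_a
    return wins / draws


def simulate_aa_tests(n_sims=200, n_per_arm=1000, peek_every=100,
                      rate=0.1, threshold=0.95, draws=300, seed=0):
    """Simulate A/A tests and return two false-positive rates.

    First value: a winner is declared at ANY peek (what a peeker would act on).
    Second value: a winner is declared only at the single, final look.
    """
    rng = random.Random(seed)
    crossed_at_any_peek = 0
    crossed_at_end = 0
    for _ in range(n_sims):
        conv_a = conv_b = 0
        crossed = False
        p = 0.5
        for n in range(1, n_per_arm + 1):
            # Both arms convert at the same true rate: there is no real effect.
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if n % peek_every == 0 or n == n_per_arm:
                p = prob_b_beats_a(conv_a, n, conv_b, n, draws, rng)
                # Two-sided rule (assumed): either arm may be called the winner.
                if max(p, 1 - p) > threshold:
                    crossed = True
        crossed_at_any_peek += crossed
        crossed_at_end += max(p, 1 - p) > threshold
    return crossed_at_any_peek / n_sims, crossed_at_end / n_sims


if __name__ == "__main__":
    peeking, single_look = simulate_aa_tests()
    print(f"false positives with peeking:    {peeking:.3f}")
    print(f"false positives, one final look: {single_look:.3f}")
```

Because the final look is also one of the peeks, the first rate can never be smaller than the second in this sketch. How much larger it is depends on the number of peeks and the threshold.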

Testing

Data Analysis

Statistics

0 views

Alex Molas 3 months ago

Adding search to my static blog.

How I added fast, client‑side search to this site with Lunr.js and a build-time index.

Web Development

Tutorial

Frontend

JavaScript

0 views

Alex Molas 4 months ago

Who needs git when you have 1M context windows?

Lately I’ve heard a lot of stories of AI accidentally deleting entire codebases or wiping production databases. But in my case it was the other way around. I removed some working code and the LLM helped me to recover it.

Programming

AI

0 views

Alex Molas 7 months ago

Semantic Unit Testing

Left Wallapop a couple of weeks ago, heading to RevenueCat soon. In that classic ‘between jobs’ hacking window, I built suite: a Python library for semantic unit testing. What’s semantic unit testing? Think unit tests that understand context and meaning, not just assert obj == expected. Sound interesting? I’ll break down what semantic unit testing is, how suite works under the hood, and how you can integrate it.

Programming

DevOps

Python

1 view

Alex Molas 8 months ago

Three symmetric math riddles

I like problems that are easy to pose and seem difficult to solve at first glance, but that become simple after a slight change of perspective. In this post, I will present my 3 favorite problems of this type.

Science

Programming

0 views

Alex Molas 10 months ago

Optimizing Jupyter Notebooks for LLMs

I’ve been using LLM-assisted coding for the last couple of months, and it has been a game-changer. After a couple of iterations, my setup consists in

AI

Python

0 views

Alex Molas 1 year ago

Win your fantasy league using operations research

I’ve never been good at playing football [1]. I started playing again last year and I have scored more own goals than goals for my team. Also, I support RCD Espanyol [2], which spent last season in the second division and was miraculously promoted to the first division last week. And to be honest, I only watch games to spend time with my friends and family. So it’s no secret that I’m not a football expert, and I have no shame in admitting it. But I’m very competitive. So when my friends invited me to a football fantasy league [3] some years ago I said “yes, and I’m going to wipe the floor with you, losers!”.

[1] I’m from Spain, so I’m speaking about the real and original football, the one that’s played with the foot and a ball, not with the hands and an egg.

[2] See point 54 in my 100 list.

[3] “A fantasy sport (also known less commonly as rotisserie or roto) is a game, often played using the Internet, where participants assemble imaginary or virtual teams composed of proxies of real players of a professional sport. These teams compete based on the statistical performance of those players in actual games.” Source.

Sports

Data Analysis

Gaming

0 views

Alex Molas 1 year ago

Something happens to everyone.

When my daughter was born something was off. She was premature, and during the last weeks of pregnancy she wasn’t growing at all, so the doctors had to induce labour. Then, as weeks and months passed, we realized she wasn’t developing as expected, and at some point a doctor also pointed out that her head was too small. We decided to do some genetic tests, and we discovered that she had a very rare genetic disorder - a microdeletion of the 13q33.34 region. From the available literature it seems that only 60 patients with this disorder have been studied, so it’s indeed a rare condition. I’m not going to bore you with all the details and everything my wife and I have had to go through, but let me say that we have now accepted it and are getting used to the situation. But when we got the news, we asked ourselves “why did this happen to her?”. Answering this question is difficult, and in our case the answer has its roots in our faith. This post, however, is not about my daughter’s specific condition or our trust in God. It argues, with some numbers, that things like this happen to everyone.

Science

0 views

Alex Molas 1 year ago

In defense of Leetcode Interviews

For the last weeks at Wallapop, we have been interviewing candidates for a Data Scientist position. Our current interview process is quite standard, but there are some things we would like to change about the process. We were talking about it during lunch, and I saw my opportunity to propose one of my hot takes: “We should start doing Leetcode interviews” [1]. And, as expected, no one agreed with me. Their main arguments against my proposal were

[1] For those who don’t know, Leetcode interviews are technical interviews where the candidate has to solve Leetcode problems under time constraints. For example, the candidate needs to solve an easy and medium Leetcode problem in less than 30 minutes.

Programming

Data

Career

0 views

Alex Molas 1 year ago

Good code is rarely read

The other day I was interviewing a developer for a position at Wallapop. The candidate was a little bit junior, but I enjoyed their technical assignment and the conversation was going great. One of the questions I had to ask was

Programming

Career

0 views

Alex Molas 1 year ago

You need more than p-values

Explore the critical considerations and potential pitfalls of relying solely on A/B testing in making business decisions. While A/B testing is a valuable tool, this post challenges the assumption of exchangeability between past and future data, emphasizing the dynamic nature of the business environment. I propose two solutions: (1) validating the exchangeability assumption through holdback groups and (2) adopting a holistic decision-making approach that goes beyond statistical tests. Executives are urged not to blindly trust p-values, and to give weight to intuition, market understanding, and forecasting in shaping successful business strategies. In conclusion, the post encourages a balanced approach, combining statistical rigor with real-world insights for effective decision-making.

Data

UI

Statistics

0 views

Alex Molas 1 year ago

hn-index

h-index for HackerNews

Web Development

0 views

Alex Molas 1 year ago

A search engine in 80 lines of Python

In this post I explain how I built a search engine from scratch using Python. The resulting search engine is used to search the posts of the blogs I follow.

Python

Web Development

Tutorial

0 views

Alex Molas 1 year ago

ChatGPT knows things that Google doesn’t

Here I explore a doubt I've had for the last 15 years. I vaguely remember a phrase attributed to Voltaire or Robespierre. The phrase was "I'm not a believer, but I prefer my barber to be a Christian, even more, when he's using his razor on my neck". For the last 15 years, Google has failed to help me find the author of this phrase. But today ChatGPT has helped me take a step toward an answer. Who said that phrase? Does ChatGPT know more than Google?

AI

0 views

Alex Molas 1 year ago

Guide to onboarding in a new job

Insights and tips for a smooth transition when starting a new job

Career

0 views

Alex Molas 1 year ago

2023 review

Programming

0 views

Alex Molas 1 year ago

I hate MFA

I hate MFA, and in this rant post I explain why. It's basically because (1) it distracts me a lot, (2) it forces me to have a smartphone, (3) it's a leaky abstraction, and (4) it can be replaced with better solutions.

Programming

Security

Web Development

0 views

Alex Molas 2 years ago

Conditioning is grouping by

This insightful exploration draws parallels between mathematical formulations and practical implementations, showcasing how understanding conditioning as grouping can elevate your statistical insights. Discover the parallels between conditional expectations like $\mathbb{E}(y | X=x)$ and popular grouping techniques found in data analysis tools like pandas. Uncover the hidden synergy that exists between statistical conditioning and groupby operations, demystifying complex mathematical concepts with real-world applications. Whether you're a seasoned data scientist or a curious learner, this journey into the interconnected world of conditioning and grouping promises newfound clarity. Elevate your statistical understanding with practical examples, bridging the gap between theory and application. Embark on this enlightening exploration today and revolutionize your approach to statistical modeling. Uncover the simplicity behind complex conditional expressions through the power of grouping, transforming your data analysis skills along the way.
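As a small illustration of the idea in this summary, the sketch below estimates $\mathbb{E}(y | X=x)$ by grouping observations by their value of $x$ and averaging $y$ within each group. The function name and the toy data are assumptions made for this example, not taken from the post. It uses only the standard library; in pandas the same computation is typically written as `df.groupby("x")["y"].mean()`.

```python
from collections import defaultdict
from statistics import mean


def conditional_expectation(xs, ys):
    """Estimate E(y | X = x) for each observed x by averaging y within its group."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)  # "conditioning on X = x" is collecting the rows with that x
    return {x: mean(values) for x, values in groups.items()}


# Toy data (hypothetical): two groups, "a" and "b".
xs = ["a", "a", "b", "b", "b"]
ys = [1.0, 3.0, 2.0, 4.0, 6.0]
estimates = conditional_expectation(xs, ys)
print(estimates)
```

Each key of the returned dictionary is a value of $x$, and each value is the sample mean of $y$ among the rows where $X$ takes that value.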

Statistics

Data Analysis

Python

0 views

Alex Molas 2 years ago