Latest Posts (10 found)
James O'Claire 1 week ago

Self hosting 10TB in S3 on a framework laptop + disks

About 5 months ago I decided to start self hosting my own S3. I was working on AppGoblin’s SDK tracking of the top 100k Android and iOS apps, so I wanted a lot of space, but for cheap. I got really lucky and found a second hand Framework laptop. The laptop was missing its screen and was one of the older models, so it was perfect for a home server. In addition I bought a “just a bunch of disks” (JBOD) rack. The Framework laptop is running ZFS + Garage S3.

I’ve been away, I’ve been working, I’ve been busy, and I’ve definitely been using my S3. But I hadn’t thought about the laptop in 4 months. When I finally logged in, I saw I had used 10TB of space and it was patiently waiting for a restart for some upgrades. I nervously restarted, and was so relieved to see everything come right back up. I also saw a pending upgrade from Garage v1 to v2. This went off without a hitch too. Feels like it’s been a good weekend.

Just so you know, I understand my use case for ZFS is possibly a bit non-standard, as I’m using USB to connect the laptop and the JBOD. This initially caused me issues with ZFS when Garage was heavily reading and writing (the initial setup had the SQLite metadata also stored on the JBOD/ZFS). I moved my metadata to the laptop, which has so far resolved the ZFS issues.

James O'Claire 1 month ago

Contabo Defaults Encourage Using SSH Passwords

I recently started helping a less technical friend and had my first chance to see and use a Contabo VPS. I’ve been really surprised at their default security practices so far. Contabo’s default VPS creation seems to be a root user and password? If you go to “Advanced”, the default is to create a user called “admin” (good!) and there is an option for a public SSH key.

Our new server, which barely any bot knows exists, already gets 350 failed password login attempts an hour. Worse, these bots can see that password login is enabled on our server, meaning they know they should keep trying. 350 password attempts an hour against a strong password isn’t much, but eventually more bots will realize our IP is using passwords and try more. Eventually, after copy pasting a password around enough, some compromised browser plugin / Discord plugin etc. will capture the password and put it in a list.

Contabo “knows” this, even if they don’t practice it: https://contabo.com/blog/how-to-use-ssh-keys-with-your-server

It’s bad that Contabo’s defaults encourage you to use an SSH password when setting up the VPS. They went out of their way to do that, likely to not put roadblocks up for new users, but it’s bad security.

When we hit an issue we contacted Contabo support. They asked us to copy paste our password so they could help troubleshoot. While I appreciate that level of support, assuming users have an SSH password and asking them to email it seems crazy to me. Now there is a record on both our email providers of that password and IP.

1) Use pub/private SSH keys. We can copy/paste public keys anywhere we want, super safe.
2) Make future servers with a user other than root and disable login with root. The root user has special login privileges (allowing SSH to get hammered).

And in the future, I’ll complain a bit less about DigitalOcean/AWS/GCP.
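To make the two takeaways about keys and root login concrete, here is a minimal sketch of how one might check an `sshd_config` for them. It is not Contabo tooling: the function names are made up for illustration, the parser ignores `Include` files and stops at the first `Match` block, and it assumes the stock OpenSSH defaults (`PasswordAuthentication yes`, `PermitRootLogin prohibit-password`) when a keyword is absent.

```python
def effective_sshd_options(config_text: str) -> dict[str, str]:
    """Collect global sshd_config keywords (simplified, hypothetical helper)."""
    options: dict[str, str] = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        keyword, value = parts[0].lower(), parts[1].strip()
        if keyword == "match":
            break  # simplification: only look at the global section
        # sshd uses the first value it obtains for each keyword.
        options.setdefault(keyword, value)
    return options


def login_findings(config_text: str) -> list[str]:
    """Return human-readable warnings for password and root login settings."""
    options = effective_sshd_options(config_text)
    findings = []
    # Assumed OpenSSH default when unset: "yes".
    if options.get("passwordauthentication", "yes").lower() != "no":
        findings.append("password login is enabled")
    # Assumed OpenSSH default when unset: "prohibit-password".
    if options.get("permitrootlogin", "prohibit-password").lower() == "yes":
        findings.append("root login is fully permitted")
    return findings
```

Running `login_findings(open("/etc/ssh/sshd_config").read())` on a server would be one way to use it; an empty list means neither warning applies to the global section.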

James O'Claire 1 month ago

Mobile Trackers your Ad-Blocker Doesn’t Know About

This is the full list of the main API endpoints that apps send data to. It covers ~70k Android apps, and the smallest endpoint has ~90 apps sending data to it, meaning it’s unlikely to be an app developer’s own domain. Then I checked whether these domains were in any of: https://github.com/StevenBlack/hosts , https://easylist.to/easylist/easylist.txt or https://easylist.to/easylist/easyprivacy.txt .

I was surprised how many were NOT in any of the lists: 127, over 1/3 of the total. Individually, each blocker list only had about 1/3 of these domains. I’m not currently going to open PRs for these domains, as at the scale I collected them the list could easily contain some perfectly acceptable use cases. I’ve tried to remove those that I recognize as acceptable so far. Here’s a Google Sheet of the same data (feel free to leave comments).
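The blocklist check described above can be sketched roughly as follows. This is not the code used for the post: the function names are hypothetical, only plain hosts entries and simple `||domain^` adblock rules are understood (rule options, exceptions and path rules are ignored), and the caller is assumed to have already downloaded the list files as text.

```python
def hosts_file_domains(text: str) -> set[str]:
    """Domains from hosts-style lines such as '0.0.0.0 tracker.example'."""
    domains: set[str] = set()
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()
        if len(parts) >= 2 and parts[0] in ("0.0.0.0", "127.0.0.1"):
            domains.update(p.lower() for p in parts[1:])
    return domains


def adblock_rule_domains(text: str) -> set[str]:
    """Domains from simple '||domain^' rules only (a deliberate simplification)."""
    domains: set[str] = set()
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("||") and "^" in line:
            candidate = line[2:].split("^", 1)[0].lower()
            # Skip rules that carry a path or wildcard instead of a bare domain.
            if candidate and not any(ch in candidate for ch in "/*"):
                domains.add(candidate)
    return domains


def is_blocked(domain: str, hosts_domains: set[str], adblock_domains: set[str]) -> bool:
    """Hosts entries match exactly; '||domain^' rules also cover subdomains."""
    domain = domain.lower()
    if domain in hosts_domains:
        return True
    labels = domain.split(".")
    # Check the domain itself and each parent, stopping before the bare TLD.
    return any(".".join(labels[i:]) in adblock_domains for i in range(len(labels) - 1))
```

Counting the endpoints for which `is_blocked` is false on every list would give a figure comparable to the 127 above, though the exact number depends on how rules with options are treated.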

James O'Claire 1 month ago

Contabo Private Networks are a Pain

I recently started using Contabo VPSs while working with some friends, and at some point we decided we wanted to use local networks. On all other major cloud providers, sending data between two instances over the private local network rather than across the public internet is free, faster and encouraged.

Contabo? It costs $1.50 USD per month per VPS to add them to a private network. OK, that’s a surprisingly high amount compared to free / encouraged. So let’s just do it, right?

Wait! After signing up two servers, both require “reinstallation”, AKA they need to be completely wiped and remade. Thankfully, these are not old servers, they’ve all been set up in the past couple of weeks, but wow, Contabo is pretty painful.

James O'Claire 1 month ago

The 300 Most Common Android Data Endpoints (and the Companies Behind Them)

Last week I wrote a blog post working through a few unknown endpoints. My goal was to bring some attention to these lesser known endpoints where many apps send their data. This post is split into two sections. The first, at the top here, covers endpoints that do not have landing pages and have not been tagged. The second is my full list of the top 300 endpoints called by Android apps and the companies that own them.

Where did this data come from? I’ve been collecting this data with AppGoblin and all code is open source. I’ve run ~60k apps in an Android emulator, with ~50k running successfully, and mapped each endpoint called by the apps while they are open for 1 minute. This means most ad networks and analytics are well represented, as they often load on start.

Here is a Google Sheet with the same data. The data may be updated for correctness today, but I will not keep it updated over time. For up to date mobile ad data check AppGoblin in the future.

Here is the full list. Note there are some untagged endpoints here, such as the IP geo tagging ones or large app publishers, that I have not tagged yet as I was unsure how to categorize them.
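Ranking endpoints by how many apps call them could look something like the sketch below. It is not AppGoblin’s actual pipeline: the `(app_id, url)` input shape and both function names are assumptions, and the base-domain logic naively keeps the last two labels, which is wrong for suffixes like `co.uk`.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def base_domain(url: str) -> str:
    """Naive base domain: the last two labels of the hostname."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def apps_per_domain(calls: list[tuple[str, str]]) -> list[tuple[str, int]]:
    """Count distinct apps per base domain, most common first."""
    apps_by_domain: dict[str, set[str]] = defaultdict(set)
    for app_id, url in calls:
        domain = base_domain(url)
        if domain:
            # A set means an app calling a domain many times counts once.
            apps_by_domain[domain].add(app_id)
    counts = {domain: len(apps) for domain, apps in apps_by_domain.items()}
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Taking the first 300 entries of the result gives a “top 300” style list; the app IDs and URLs fed in would come from the recorded emulator traffic.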

James O'Claire 1 month ago

Uncovering Lesser Known Mobile Adtech Domains

AppGoblin has now run over 40k apps in an emulator, tracking millions of API calls to thousands of advertising domains. Unfortunately, some of them are dark, meaning they have no landing page of any kind, and I’m unclear who controls these domains. Let’s see if we can figure them out!

This one is a mystery. Seems like it’s related to Germany since it’s always resolving to HETZNER and German IPs. Checking the shared IPs, it looks like they do overlap with unity3d.com domains sometimes. Again, this whole list is games. Looking at the requests I can match various keys to values from unity3d.com API calls! Specifically they share the same `app_key` values.

Well that name definitely comes off as esoteric at first. First let’s check the IP cluster and see what we find. Of the 233 apps sending/receiving from acobt.tech, we have 4 other sites with 1:1 matches that are all sites that do not have any landing pages.

acobt.tech 233
news-cdn.site 233
inmense.site 232
kickoffo.site 232

Searching the internet shows various hits saying some of these belong to Bigo Ads. Let’s check the apps’ SDKs and see. Again we got lots of games, and it looks like AppGoblin has indeed already found that each of these has a Bigo Ad SDK.

Wait, this one also matches the IPs for the other various Bigo Ads. Seems like Bigo really uses a lot of random domains?

OK, great name. This one appears in clusters of SDK advertising, making me think it’s related to a mediation SDK of some kind (rather than to one specific ad network). Possibly this is bidmachine.io’s as it is the most common, but really all the top ad networks appear nearly 1:1 alongside it across the 276 apps I’ve found it in:

bidmachine.io 275
unity3d.com 270
doubleclick.net 269
mtgglobals.com 267
rayjump.com 267
applovin.com 261
vungle.com 257

Definitely a game focused list here.
They almost all call variations of

Looking around there are lots of examples of shared IP addresses with everestop.io and bidmachine, so I think that might have solved that.

everestop.io 172.240.40.172
bidmachine.io 172.240.40.172
bidmachine.io 204.74.252.252
everestop.io 172.240.61.171
voisetech.com 34.216.198.39

Looks like a lot of the apps have the and SDKs, so I’m thinking that `lazybumblebee.com` does indeed belong to BidMachine and helps it with some app mediation service.

is just the kind of generic descriptive name I’d come up with. Example apps: there are a lot of very corporate apps in here along with lots of shopping. Each app sends off two API calls on start to a unique (per app) subdomain on marketingcloudapis.com, with the response from the first API call below. The information sent seems somewhat bland compared to the usual deep scraping that advertising SDKs do. So this is likely paired with other API calls already going out. Checking on domains that are called together, it looks like this is almost always called with so possibly this is related to Google, but this is a bit weak as a lot of Android apps have integrations with Google.

EDIT: I posted this on Hacker News and user politelemon correctly identified this as Salesforce. Very awesome spot by that user, and it matches the various AppGoblin SDKs for each app, as in this example for the Adidas app SDKs.

Much better than I expected. A bit of digging and all the URLs were figured out with the exception of marketingcloudapis.com, which I was a bit unsure of, but looks like google.com.

Woodoku – Wood Block Puzzle; Draw Climber; Spider Rope Hero: Action Game; Running Pet: Dec Rooms; Bubble Shooter 2; Pirate Treasures: Jewel & Gems; Tik Tap Challenge; Collect Em All! Clear the Dots; Gun Simulator & Lightsaber; Pizza Ready!; Sculpt People; Vita Mahjong; Modern Bus Simulator: Bus Game; Gym Heros: Fighting Game; Blockman Go; Going Balls; aquapark.io; Snake.io – Fun Snake .io Games; 1945 Air Force: Airplane Games; adidas: Shop Shoes & Clothing; Claro música; Domino’s Pizza USA; SiriusXM: Music, Sports & News; GasBuddy: Find & Pay for Gas
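The shared-IP reasoning used for everestop.io and bidmachine.io above can be expressed as a small grouping step. The sketch below uses the domain/IP pairs quoted in the post as its sample data; the function name and the `(domain, ip)` record shape are assumptions, and a shared IP is only a hint (CDNs and shared hosting can put unrelated domains on one address).

```python
from collections import defaultdict
from itertools import combinations

# Domain/IP pairs as quoted in the post.
RECORDS = [
    ("everestop.io", "172.240.40.172"),
    ("bidmachine.io", "172.240.40.172"),
    ("bidmachine.io", "204.74.252.252"),
    ("everestop.io", "172.240.61.171"),
    ("voisetech.com", "34.216.198.39"),
]


def domains_sharing_ips(records):
    """Map each pair of domains to the sorted list of IPs they have in common."""
    domains_by_ip = defaultdict(set)
    for domain, ip in records:
        domains_by_ip[ip].add(domain)

    shared = defaultdict(set)
    for ip, domains in domains_by_ip.items():
        # Every pair of domains seen on the same IP is a candidate link.
        for first, second in combinations(sorted(domains), 2):
            shared[(first, second)].add(ip)
    return {pair: sorted(ips) for pair, ips in shared.items()}


print(domains_sharing_ips(RECORDS))
```

With these five records only one pair comes out, bidmachine.io and everestop.io on 172.240.40.172, which matches the conclusion drawn in the text.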

James O'Claire 1 month ago

August 2025: Top Mobile App Advertisers

AppGoblin is excited to share the latest ad network rankings for August 2025, based on our ongoing scans of live mobile app ads. AppGoblin SDK scans already make it clear which apps are monetizing with which ad networks, or which companies provide the technology behind a given app. But the most important question was missing: which apps are actually buying ads? With the release of AppGoblin’s in-app advertiser tracking, you can now browse August 2025’s top mobile app advertisers and their creatives. This allows mobile marketers to see real advertising activity from competitors, measure market momentum, and understand where budgets are being spent. It’s also useful for B2B companies looking to identify active app advertisers they may want to approach.

Let’s start with Water Sort, a hyper-casual puzzle game that I myself recognize from seeing Water Sort ads constantly. On Water Sort’s AppGoblin ad creatives page you can view the full gallery of creatives currently being run. These include both video and static variations, and you can track how they evolve from month to month.

From there, head to Water Sort’s ad placements to see which apps were publishing Water Sort’s creatives and on which mobile ad networks. For some programmatic ad buys we can also see the buy side DSP that was used. For example, we can see that Appreciate.mobi (a DSP owned by Digital Turbine) is a major buy side channel for Water Sort. This means Water Sort is buying inventory through Appreciate, which in turn places ads into the monetizing games connected to the exchange. By clicking into a specific placement, you can trace which publishing app was showing the Water Sort ad, and which intermediary companies (exchanges, SSPs, or other ad tech vendors) were involved in the transaction. This type of visibility is key for understanding both competition and the ad tech supply chain.
Other than TikTok, there is Plarium’s Throne: Kingdom at War, a midcore strategy game that has been active in the market for years but still spends aggressively to acquire new players. On Throne’s AppGoblin ad placements page you can see the current campaigns in action and the creatives being used. In August, Throne ads were observed across dozens of publishing apps, mostly using Google Ads (on the publishing side that would be AdMob). The ads here are mostly served via doubleclick.net, which is common.

This new layer of insight, connecting advertisers to placements, lets us go beyond simply knowing which SDKs apps use. For traditional SDK-based integrations, the story is straightforward: publishers monetize directly through the SDK. But in the programmatic ecosystem, things are more complex. Daisy-chained connections between SSPs, exchanges, and DSPs mean that an ad for Water Sort or Throne might pass through multiple intermediaries before it reaches the end-user’s screen. By mapping this supply path, AppGoblin allows marketers to better understand where budget is flowing, and how networks interact with one another in practice.

Finally, the ad creative library provides a quick way to compare formats. You can browse still image ads, video ads, and even see how creative strategies differ between hyper-casual titles like Water Sort and midcore strategy games like Throne: Kingdom at War. Over time, this archive will highlight broader creative trends, such as the dominance of puzzle-to-win ads, cinematic battle simulations, or the rise of interactive playable ads. For now, thumbnails of captured creatives are available to browse, with expanded metadata coming soon.

James O'Claire 4 months ago

AI can’t solve novel problems yet

I saw this hilarious exchange in one of the issues covering Apple’s recent changes that make it impossible to download IPA files. Recreating an iPhone’s authentication to the Apple App Store is a bit of a cat and mouse game. Apple recently changed its backend, so now dozens of projects are trying to find ways to circumvent or crack the Apple App Store download authentication. Blacktop, likely out of boredom or curiosity, threw Claude at this issue, only for Claude to error out.

To be fair, it’s likely impossible for Claude to have figured this issue out. The ‘fix’ will be something quite novel, for example perhaps reverse engineering a way to authenticate with the modern Apple App Store using an old Windows version of iTunes that is still supported. Perhaps in the near future AI agents will be powerful enough to understand more complex environments, which will be interesting for these types of problems, but for now, AI is far from capable.

Edit: For anyone interested in the issue itself, I’d recommend checking / following this issue as well.

James O'Claire 4 months ago

The Trackers and SDKs in ChatGPT, Claude, Grok and Perplexity

Well for a quick weekend recap I’m going to look at which 3rd party SDKs and API calls I can find in the big 4 Android chat apps. We’ll be using free data from AppGoblin, which you can feel free to browse at any of the links below or in the tables. Data is collected via decompiled SDKs and MITM API traffic.

Let’s look first at the development tools. These were interesting to me because I had assumed I’d see more of the dynamic JavaScript libraries like React. Instead we see these are all classic Kotlin apps. If you click through the chat app names you’ll see the more detailed breakdowns of which specific parts of the libraries they’re using like e (in app animations library) or Kotlin Coil Compose or Square’s .

Wow, way more than I expected and with quite the variety! I guess it’s enough that we can further break these down.

As is common now, most apps have more than one analytics tracker in their app. First up let’s recognize Google, it’s across every app in multiple ways. The main one that is used in most apps is the . GMS which is required for both Firebase and Google Play Services. Here’s an example of the measurement SDKs related to this:

Next was statsig.com and wow! I was blown away that I found this one in 3 of the 4 apps. This looks like a super popular company, and I was surprised as I hadn’t heard of them before. Looking around, they look a bit more developer / product focused, but have tons of features and look quite popular.

Finally in the analytics section we’ll add the classic segment.com (marketing analytics) and sentry.io (deployment analytics), which get to call OpenAI and Anthropic their clients. It’s always interesting how every company from games to AI ends up needing multiple analytics platforms and probably still depends most on its home BI/backend.

Here’s where the money is at. Now SUPER cool is that RevenueCat is now in both OpenAI and Perplexity.
RevenueCat helps apps use React Native updatable web payment / subscription walls so that marketers can change those sections of the apps without needing to do an entire app update. I believe Perplexity is using Stripe, but that could also be a part of their bigger app ecosystem.

livekit.io (AppGoblin livekit.io) is an AI voice platform which is used by OpenAI and Grok. I’m surprised that OpenAI uses this, as they were quite early to the voice game, but perhaps they use this for some deeper custom voice tools.

Perplexity has the most interesting third party tools with MapBox and Shopify. I believe MapBox, which delivers mapping tiles, is used for some of Perplexity’s image generation tools like adding circles/lines etc. to maps. After seeing Shopify in Perplexity, I realized there wasn’t a Shopify SDK found for OpenAI (despite checking recently). They have been rolling out shopping features as a way to monetize their app, so I am curious if these are just implemented via API or if they were obfuscated well enough to not be found.

If you’re still interested, you can also check out the API calls recorded by each app while open. The data is scrubbed, and I’m not sharing the clear text JSONs associated, but you can see some of the endpoints related to the SDKs. If you have further questions about these, or have a specific piece of data (say GPS, or email) that you’d like to check if it is sent along to any of these, just let me know and we can do further research:

https://appgoblin.info/apps/com.openai.chatgpt/data-flows
https://appgoblin.info/apps/com.anthropic.claude/data-flows

If you have feedback please join the https://appgoblin.info Discord, you can find the link on the home page.

James O'Claire 4 months ago

How to self host your own S3 in 2025

Well, with multiple disks hovering close to 99%, it’s definitely time to increase storage for OpenAttribution and AppGoblin. I needed cheap S3 hosting, and I had a pretty strong urge to do it myself.

This should be up to you, but in the end I grabbed an 8-bay Just-a-Bunch-Of-Disks (JBOD), which is a hard drive rack, and 5 8TB HDs to go in it. Unfortunately, this is going to be a big pain in the future, as increasing the size of arrays is not possible the way I set mine up, so I’ll be stuck moving everything off if I want to add those final 3 missing HDs when I get the money.

So, I had long ago come to the conclusion that I would be using MinIO. It was suggested all around and looked very feature rich, with dashboards, single node / multi-node and lots of cool ways to manage the object storage. Strangely, the day my JBOD and HDs arrived, MinIO had made massive changes to their open source software. It seems that the first part had happened earlier in the year when they switched to AGPL license, and this week they completely removed ALL features from the open source community edition of MinIO GUI. Their idea, it seems, was to remove all features from the GUI dashboard but keep them available in the CLI.

As someone who hasn’t used MinIO, it was pretty unclear what I would be missing out on. I didn’t even plan on using the dashboard, but the sense that this was a signal for how the Community Edition will be treated going forward made me worried to use MinIO. Specifically, that it “will only receive security fixes if any” sounded quite ominous and not like how I’d like to start off a likely multiyear project. Finally, MinIO was straightforwardly mentioning that it’s common for business-related use cases to require a commercial license, so in the end this just wasn’t going to work for me.

The next project people mentioned was Garage. This was an interesting comparison because it is also AGPLv3 and it didn’t come with a GUI at all.
So at first I thought this was about the same situation as with MinIO. Digging around though, I started to get a sense that Garage was distinctly different and had a few things going for it: active development of new features, simple documentation and a strong self-hosted mentality. Reaching out a bit in the community, I got open and constructive feedback. I was in.

The Garage docs were refreshingly simple, and I was able to follow the quick start guide here. The overview of steps is:

1) Download the garage executable
2) Start the garage server
3) Create a layout
4) Start using Garage S3

Now, of course I still had some questions. But I think what made this so smooth was how helpful the Garage community was on their Matrix channel. I definitely recommend joining that if you have questions about the quick start.

Next was setting up the S3 bucket and keys and connecting with Python Boto3. This part was also so smooth! As is stated on their site, they follow the regular S3 provisions, so I was able to immediately start using Boto3!

That was it. I just wanted to make a little post about this. Hope the team does well.
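For anyone curious what the Boto3 connection can look like, here is a rough sketch rather than my exact setup. The endpoint `http://localhost:3900` and the region name `garage` are assumptions taken from typical Garage quick start configs and must match your own `garage.toml`; the key ID, secret, bucket name and all function names are placeholders. Boto3 is a third-party package (`pip install boto3`) and is only imported when a client is actually created.

```python
def garage_client_kwargs(endpoint_url, access_key_id, secret_access_key, region="garage"):
    """Build the keyword arguments for boto3.client (values are placeholders)."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,  # assumed: Garage S3 API address
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "region_name": region,  # assumed: must match the region in garage.toml
    }


def make_garage_client(endpoint_url, access_key_id, secret_access_key, region="garage"):
    """Create an S3 client pointed at a self-hosted endpoint instead of AWS."""
    import boto3  # imported lazily so the helpers above work without it

    return boto3.client(
        **garage_client_kwargs(endpoint_url, access_key_id, secret_access_key, region)
    )


def roundtrip(client, bucket):
    """Upload one small object and return the keys now listed in the bucket."""
    client.put_object(Bucket=bucket, Key="hello.txt", Body=b"hello from boto3")
    response = client.list_objects_v2(Bucket=bucket)
    return [item["Key"] for item in response.get("Contents", [])]
```

With a running server, something like `roundtrip(make_garage_client("http://localhost:3900", KEY_ID, SECRET), "my-bucket")` should list `hello.txt`, assuming the bucket exists and the key has been granted read and write access to it.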
