Latest Posts (14 found)
Evan Schwartz 2 weeks ago

Scour Year End Update 2025

I thought about sending out a personalized "Scour Wrapped"... until I got the 7th Wrapped from some random service. So instead, I'll just say Happy New Year and thanks for your support in 2025! 🥂

Some numbers from 2025:
- Scour scoured 9,940,460 posts from 15,608 feeds
- 1,013 new users signed up (welcome!!)
- 12,620 interests were added, with 6,688 of those from recommendations
- 26,702 posts were read, 3,023 were liked, and 383 were loved
- 55 suggestions on the feedback board were completed

These were the new features added since the last update in October:

- Scour now identifies articles that are paywalled and indicates them with a yellow dollar sign next to the domain. In your settings, you can opt to hide paywalled content. If you do, you can also exempt specific domains where you have a subscription so you will see their content even if it is behind the paywall. Thank you to Johnny and Allen for requesting this feature! For anyone interested in the technical details, I wrote a blog post about a neat SQL trick I came across while building this: Short-Circuiting Correlated Subqueries in SQLite.
- You can also now block content from specific websites. The option to block a domain can be found by clicking the "..." button below each post. You can see and manage your excluded domains in your settings. Thanks to Vahe for this suggestion!
- If you subscribe to specific feeds (as opposed to scouring all of them), Scour will now recommend other sources for you to follow right in your personalized feed. These recommendations are based on Scour looking for content that matches your interests that you aren't currently getting. You can find more recommendations on your Feeds page.
- Each feed also now displays its three most recent posts below its description to make it easier to know what you'll get if you subscribe. You can click on the feed's title to see all of the posts from that feed. Thanks to Tiago for this suggestion!
- By default, clicking on a link to a post will bring you to the original website where it was published. However, if you prefer to read it on Scour, you can read the Preview, which can be found in the "..." menu under each post. Thanks to Linh for this suggestion!
- The filter menu for your feed (accessible via the button next to where it says Your Top Finds) should be clearer and more mobile-friendly. You can filter by time range and toggle between seeing posts from feeds you’ve subscribed to or seeing posts from everyone’s feeds. Thanks Stefan for the feedback on this!
- A number of people have told me that they are confused about how the love/like/dislike reactions are used on Scour. I'll work on making this clearer in the future but in the meantime, there's now a section in the FAQs about this. The answer is: Loves and likes are saved to your Likes page, so you can use them to bookmark interesting content. Unlike most content aggregators, Scour does not use reactions to change what shows up in your feed. Instead, reactions are used to generate Interest Recommendations for you. Scour only shows content related to topics you've explicitly chosen. You can also subscribe to other users' Likes as feeds. Everyone's reactions contribute to the Popular Posts page.

Here were some of my favorite posts I found on Scour in November and December:
- Paper AI Tigers
- Build / Buy / Bot
- More databases should be single-threaded
- Disks Lie: Building a WAL that actually survives

Thanks to everyone who wrote about Scour on their blog or website in 2025! This included:
- Minsuk Kang: Scour and minifeed are 100X better than Instagram and X (January)
- Winther: Blog Discovery (June)
- Daniel Prindii: My Read it later and discoverability systems in 2025 (July)
- PPC Land: Developer revives RSS with AI while Google targets syndication infrastructure (August)
- Tomáš Burkert: RSS feeds discovery strategies (October)
- Alex White: Discovering the Indie Web (November)
- Matt Maldre: Search engine for blogs (November)
- Andrew Doran: Tools for discovering the IndieWeb (December)

If you write about Scour in the future, or if you already did and I didn't include you, please let me know!

Thank you to everyone who provided feedback on Scour this year! Specifically, thank you to Aaron, Alberto, Alex K, Alex W, Allen, Andrew D, Andrew M, Andy M, Andy P, Cairin, Cole, Daniel, Elyem, Hary, Imperfect, Jadi, Jeppe, Jesse, Johnny, Jon, Karit, Kilpatrj, Linh, Proudmuslim-dev, Ryan, Sarah, Stefan, Tiago, Tomáš, Tyler, and Vahe. And thank you to all of the anonymous feedback givers as well!

Because you made it to the end of the post, here's a little preview of an upcoming feature for you. Let's say you want to only see posts from small websites, like individuals' blogs. You can now try filtering your feed by how many posts each website or feed publishes per month. For example, you can use these links to see only posts from quieter domains or quieter feeds. Or, you can try this one to only see articles from larger websites. Let me know what you think! UI for controlling these filters is coming soon!

Happy New Year and happy Scouring!
- Evan

0 views
Evan Schwartz 4 weeks ago

Short-Circuiting Correlated Subqueries in SQLite

I recently added domain exclusion lists and paywalled content filtering to Scour. This blog post describes a small but useful SQL(ite) query optimization I came across between the first and final drafts of these features: using an uncorrelated scalar subquery to skip a correlated subquery (if you don't know what that means, I'll explain it below).

Scour searches noisy sources for content related to users' interests. At the time of writing, it ingests between 1 and 3 million pieces of content from over 15,000 sources each month. For better and for worse, Scour does ranking on the fly, so the performance of the ranking database query directly translates to page load time. The main SQL query Scour uses for ranking applies a number of filters and streams the item embeddings through the application code for scoring. Scour uses brute force search rather than a vector database, which works well enough for now because of three factors:

- Scour uses SQLite, so the data is colocated with the application code.
- It uses binary-quantized vector embeddings with Hamming Distance comparisons, which only take ~5 nanoseconds each.
- We care most about recent posts, so we can significantly narrow the search set by publish date.

A simplified version of the query filters items by publish date, language, and quality and then streams the matching embeddings out for scoring. Its query plan shows that it makes good use of indexes.

To add user-specified domain blocklists, I created an excluded domains table and added a filter clause to the main ranking query. The exclusion table's primary key covers this lookup, so each individual check is efficient. However, the lookup is done for every row returned from the first part of the query. This is a correlated subquery: it references the current row of the outer query, so it has to be re-evaluated for each row.

A problem with the way we just added this feature is that most users don't exclude any domains, but we've added a check that is run for every row anyway. To speed up the queries for users who aren't using the feature, we could first check the user's settings and then dynamically build the query. But we don't have to, because we can accomplish the same effect within one static query.

We can change our domain exclusion filter to first check whether the user has any excluded domains. Since OR short-circuits, if that first subquery returns TRUE (when the user has no excluded domains), SQLite never evaluates the correlated subquery at all. The first clause does not reference any column from the outer query, so SQLite can evaluate it once and reuse the boolean result for all of the rows. This "uncorrelated scalar subquery" is extremely cheap to evaluate and, when it returns TRUE, lets us short-circuit and skip the more expensive correlated subquery that checks each item's domain against the exclusion list. In the query plan for the updated query, the second subquery is a plain (uncorrelated) scalar subquery, whereas the third is a correlated scalar subquery; the latter is the per-row check, but it can be skipped thanks to the second subquery.

To test the performance of each of these queries, I used a simple bash script that ran each query 100 times, each in a freshly started process, on my laptop. Starting up the process each time adds overhead, but we're comparing relative differences. At the time of this benchmark, the last week had 235,975 items, 144,229 of which were in English. The two example users I tested this for below only look for English content.

The first test represents most users, who have not configured any excluded domains. It shows that the short-circuit query adds practically no overhead for users without excluded domains, whereas the correlated subquery alone makes queries 17% slower for these users. The second test uses an example user who has excluded content from 2 domains. In this case, we do need to check each row against the domain filter, but the short-circuit still adds no overhead on top of that query.
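To make the pattern concrete, here is a minimal, self-contained sketch of the short-circuit trick. It uses rusqlite rather than the SQLx setup Scour actually uses, and the table and column names (items, excluded_domains, domain) are invented for illustration rather than taken from Scour's schema.

```rust
// Minimal sketch of the short-circuit pattern with rusqlite and made-up
// table/column names; Scour's real schema and queries differ.
use rusqlite::{params, Connection, Result};

fn main() -> Result<()> {
    let conn = Connection::open_in_memory()?;
    conn.execute_batch(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, domain TEXT, title TEXT);
         CREATE TABLE excluded_domains (user_id INTEGER, domain TEXT,
                                        PRIMARY KEY (user_id, domain));
         INSERT INTO items (domain, title) VALUES
             ('example.com', 'A post'), ('blocked.example', 'Another post');
         INSERT INTO excluded_domains VALUES (2, 'blocked.example');",
    )?;

    // The first NOT EXISTS is uncorrelated: it never mentions `items`, so
    // SQLite evaluates it once. When the user has no exclusions it is TRUE,
    // the OR short-circuits, and the correlated per-row check never runs.
    let sql = "
        SELECT id, title FROM items
        WHERE (
            NOT EXISTS (SELECT 1 FROM excluded_domains WHERE user_id = ?1)
            OR NOT EXISTS (
                SELECT 1 FROM excluded_domains
                WHERE user_id = ?1 AND domain = items.domain
            )
        )";

    for user_id in [1_i64, 2] {
        let mut stmt = conn.prepare(sql)?;
        let titles: Vec<String> = stmt
            .query_map(params![user_id], |row| row.get(1))?
            .collect::<Result<_>>()?;
        println!("user {user_id}: {titles:?}");
    }
    Ok(())
}
```

Running it, the user with no exclusions gets every item without the per-row check ever mattering, while the user with an exclusion has the blocked domain filtered out by the correlated subquery.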
When using SQL subqueries to filter down result sets, it's worth thinking about whether each subquery is really needed for most users or most queries. If the check is needed most of the time, this approach won't help. However, if the per-row check isn't always needed, using an uncorrelated scalar subquery to short-circuit the condition can dramatically speed up the average case with practically zero overhead.

This is extra important because the slow-down from each additional subquery compounds. In this blog post, I described and benchmarked a single additional filter, but it is only one of multiple subquery filters. Earlier, I also mentioned that users had asked for a way to filter out paywalled content. This works similarly to filtering out content from excluded domains. Some users opt in to hiding paywalled content. For those users, we check whether each item is paywalled and, if so, whether it comes from a site the user has specifically allowed paywalled content from (because they have a subscription). I used the same uncorrelated subquery approach: first check whether the feature is enabled for the user, and only then does SQLite need to check each row. Concretely, the paywalled content filter is another OR of a cheap uncorrelated settings check and a per-row correlated check.

In short, a trivial uncorrelated scalar subquery can help us short-circuit and avoid a more expensive per-row check when we don't need it.

There are multiple ways to exclude rows from an SQL query. I also ran the same benchmark with two other ways of checking whether an item comes from an excluded domain: one variation uses a NOT IN subquery, and the other joins against the exclusion table and then checks that no match was found. For users without excluded domains, the query using the short-circuit wins and adds no overhead. For users who do have excluded domains, the join-based version is faster than the subquery version. However, the join-based version raises the exact problem this whole blog post is designed to address: since joins happen no matter what, we cannot use the short-circuit to avoid the overhead for users without excluded domains. At least for now, this is why I've gone with the subquery using the short-circuit.

Discuss on Hacker News, Lobsters, r/programming, r/sqlite.
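The post doesn't show the paywall filter itself, so the block below is only a hedged guess at its shape, using hypothetical table and column names (user_settings.hide_paywalled, items.is_paywalled, paywall_allowed_domains); the real query in Scour is almost certainly different in its details.

```rust
// Hypothetical shape of a paywalled-content filter clause, following the
// same short-circuit pattern. All names here are invented for illustration.
const PAYWALL_FILTER: &str = "
    -- Uncorrelated: TRUE for users who have not opted in to hiding paywalled
    -- content, so the OR short-circuits and the per-row checks are skipped.
    NOT EXISTS (
        SELECT 1 FROM user_settings
        WHERE user_id = ?1 AND hide_paywalled = 1
    )
    -- Per-row: keep items that are not paywalled...
    OR items.is_paywalled = 0
    -- ...or that come from a domain the user explicitly allows.
    OR EXISTS (
        SELECT 1 FROM paywall_allowed_domains
        WHERE user_id = ?1 AND domain = items.domain
    )";

fn main() {
    // In a real application this fragment would be spliced into the WHERE
    // clause of the ranking query; here we just print it.
    println!("{PAYWALL_FILTER}");
}
```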

0 views
Evan Schwartz 2 months ago

Scour - October Update

Hi friends,

In October, Scour ingested 1,042,894 new posts from 14,140 sources. I was also training for the NYC Marathon (which is why this email comes a few days into November)!

Last month was all about Interests:

- Your weekly email digest now includes a couple of topic recommendations at the end. And, if you use an RSS reader to consume your Scour feed, you’ll also find interest recommendations in that feed as well.
- When you add a new interest on the Interests page, you’ll now see a menu of similar topics that you can click to quickly add.
- You can browse the new Popular Interests page to find other topics you might want to add.
- Infinite scrolling is now optional. You can disable it and switch back to explicit pages on your Settings page. Thanks Tomáš Burkert for this suggestion!

Earlier, Scour’s topic recommendations were a little too broad. I tried to fix that and now, as you might have noticed, they’re often too specific. I’m still working on solving this “Goldilocks problem”, so more on this to come!

Finally, here were a couple of my favorite posts that I found on Scour in October:

- Introducing RTEB: A New Standard for Retrieval Evaluation
- Everything About Transformers
- Turn off Cursor, turn on your mind

Happy Scouring!
- Evan

1 views
Evan Schwartz 3 months ago

Scour - September Update

Hi friends,

Welcome if you've recently signed up! And if you've been using Scour for a while, product updates are back! This summer was one where life got in the way of me working on Scour, but I'm back at it now. Please share any feedback you have on the feedback board to help me prioritize what to work on next!

Since the last update in May, Scour has tripled the amount of content it is ingesting per month, and it scoured 1,535,995 posts from 13,875 sources in September.

- Interest recommendations are now more specific and, hopefully, more interesting! Take a look at what it suggests for you and let me know what you think.
- You can now see the history of all the articles you clicked on through Scour. That should make it easier to find those good posts you read without needing to search for them through your feed. Relatedly, Scour now syncs the read state across devices (so links you've clicked will appear muted, even if you clicked them on another device). Thanks to u/BokenPhilia for the suggestion!
- Scour's weekly update emails and the RSS/Atom/JSON feeds it produces now include links to 🌰 Love, 👍 Like, or 👎 Dislike posts. Thank you to @ashishb and an anonymous user for suggesting this addition!
- You can now flag posts as Harmful, Off-Topic, Low-Quality, etc. (look for the flag icon below the link). Thank you to Andy Piper for the suggestion! On a related note, I also switched away from using LlamaGuard to identify harmful content. It was flagging too many okay posts, not finding many bad ones, and it became the most expensive single cost for operating this service. Scour now uses a domain blocklist along with explicit flagging to remove harmful content. Thank you to an anonymous user for the feedback!
- The RSS/Atom/JSON feeds produced by Scour now include a short preview of the content to help you decide if the article is worth reading.

Tripling the number of items Scour ingests means there are more posts to search through to find your hidden gems when you load your feed. That, unfortunately, slowed down the page load speed, especially when you're scouring All Feeds and/or looking at a timeframe of the past week or month. (Thank you to Adam Gluck for pointing out this slowness!) I spent quite a bit of time working on speeding up the feed loading again and cut the time by ~35% when you're scouring All Feeds. If you're interested in reading about the technical details behind this speedup, you can read this blog post I wrote about it: Subtleties of SQLite Indexes.

Finally, here were a couple of my favorite posts that I found on Scour in September:

- Identifying and Upweighting Power-Niche Users to Mitigate Popularity Bias in Recommendations
- Rules for creating good-looking user interfaces, from a developer
- Improving Cursor Tab with online RL

Happy Scouring!

4 views
Evan Schwartz 3 months ago

Subtleties of SQLite Indexes

In the last 6 months, Scour has gone from ingesting 330,000 pieces of content per month to over 1.4 million this month. The massive increase in the number of items slowed down the ranking for users' feeds and sent me looking for ways to speed it up again. After spending too many hours trying in vain to squeeze more performance out of my queries and indexes, I dug into how SQLite's query planner uses indexes, learned some of the subtleties that explained why my initial tweaks weren't working, and sped up one of my main queries by ~35%.

Scour is a personalized content feed that finds articles, blog posts, etc. related to users' interests. For better and for worse, Scour does its ranking on the fly whenever users load their feeds page. Initially, this took 100 milliseconds or less, thanks to binary vector embeddings and the fact that it's using SQLite, so there is no network latency to load data.

The most important table in Scour's database is the items table. It includes an ID, URL, title, language, publish date (stored as a Unix timestamp), and a text quality rating. Scour's main ranking query filters items based on when they were published, whether they are in a language the user understands, and whether they are above a certain quality threshold. The question is: what indexes do we need to speed up this query?

When I first set up Scour's database, I put a bunch of indexes on this table without really thinking about whether they would help. For example, I had separate indexes on the published date, the language, and the quality rating. Useless. It's more important to have one or a small handful of good composite indexes on multiple columns than to have separate indexes on each column. In most cases, the query planner won't bother merging the results from two indexes on the same table. Instead, it will use one of the indexes and then scan all of the rows that match the filter for that index's column.

It's also worth being careful to only add indexes that will be used by real queries. Having additional indexes on each column won't hurt read performance, but each index takes up storage space, and more indexes will slow down writes, because all of the indexes need to be updated when new rows are inserted into the table.

If we're going to have an index on multiple columns, which columns should we include and what order should we put them in? The order of conditions in a query doesn't matter, but the order of columns in an index very much does. Columns that come earlier in the index should be more "selective": they should help the database narrow the result set as much as possible. In Scour's case, the most selective column is the publish date, followed by the quality rating, followed by the language, so I put an index on those columns in that order... and found that SQLite was only using one of the columns. The query plan showed that it was using the right index but only filtering by the publish date. Puzzling.

My aha moment came while watching Aaron Francis' High Performance SQLite course. He said the main rule for SQLite indexes is: "Left to right, no skipping, stops at the first range." (This is a much clearer statement of the implications of the Where Clause Analysis buried in the Query Optimizer Overview section of the official docs.) This rule means that the query planner will:

- use the columns of an index from left to right,
- only use a column if every column before it in the index is also constrained (no skipping), and
- stop using the index at the first column that is constrained by a range rather than an equality.

My query includes two ranges (the publish date and the quality rating), so the "stops at the first range" rule explains why I was only seeing the query planner use one of those columns.
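Here is a small sketch of the rule in action, again with an invented items table rather than Scour's real schema. It prints SQLite's query plan for the same filter against two different composite indexes, so you can see which columns the planner actually gets to use.

```rust
// Sketch of "left to right, no skipping, stops at the first range" using a
// made-up items table. Running it prints SQLite's query plan twice.
use rusqlite::{Connection, Result};

fn explain(conn: &Connection, sql: &str) -> Result<()> {
    let mut stmt = conn.prepare(&format!("EXPLAIN QUERY PLAN {sql}"))?;
    // The `detail` column of EXPLAIN QUERY PLAN output is at index 3.
    for detail in stmt.query_map([], |row| row.get::<_, String>(3))? {
        println!("{}", detail?);
    }
    Ok(())
}

fn main() -> Result<()> {
    let conn = Connection::open_in_memory()?;
    conn.execute_batch(
        "CREATE TABLE items (
             id INTEGER PRIMARY KEY,
             published_at INTEGER,
             quality INTEGER,
             language TEXT
         );
         CREATE INDEX idx_pub_quality_lang
             ON items (published_at, quality, language);",
    )?;

    let query = "SELECT id FROM items
                 WHERE published_at >= 1700000000
                   AND quality <= 90
                   AND language IN ('en')";

    // With (published_at, quality, language): published_at is a range, so the
    // planner stops there and the other two columns can't narrow the scan.
    explain(&conn, query)?;

    // Equality column first, then a range: now both language and published_at
    // can be used before the planner stops.
    conn.execute_batch("CREATE INDEX idx_lang_pub ON items (language, published_at);")?;
    explain(&conn, query)?;
    Ok(())
}
```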
This rule does, however, suggest that I can create an index that will allow SQLite to filter by the one non-range condition (the language) before using one of the ranges. With an index that puts the language column first and the publish date after it, the query plan shows that SQLite can use both the language and publish date columns. Now we're using two out of the three columns to quickly filter the rows. Can we use all three? (Remember, the query planner won't be able to use multiple range conditions on the same index, so we'll need something else.)

SQLite has a nice feature called Partial Indexes that allows you to define an index that only applies to the subset of rows matching some condition. In Scour's case, we only really want items that pass a fixed quality-rating cutoff. The model I'm using to judge quality isn't that great, so I only trust it to filter out items if it's really sure they're low quality. This means I can create a partial index with that quality condition, update the query to use the same condition, and it should use our new partial index... right?

Wrong. The query was still using the previous index. There's a subtle mistake in the relationship between our index and our query. Can you spot it? The condition in the index and the condition in the query are mathematically equivalent, but they are not written exactly the same way. SQLite's query planner requires the conditions to match exactly in order for it to use a partial index. Relatedly, other ways of writing an equivalent condition in the query would also not utilize the partial index. If we rewrite the query to use the exact same condition as the index, the query plan shows what we want: we start with the items that pass the quality condition (via the partial index), then use the index to find the items in the desired language(s), and lastly narrow down the results to the items that were published in the correct time range.

As mentioned in the intro, these changes to the indexes and one of Scour's main ranking queries yielded a ~35% speedup. Enabling the query planner to make better use of the indexes means that SQLite doesn't need to scan as many rows to find the ones that match the query conditions. Concretely, in Scour's case, filtering by language removes about 30% of items for most users and filtering out low quality content removes a further 50%. Together, these changes reduce the number of rows scanned by around 66%.

Sadly, however, a 66% reduction in the number of rows scanned does not directly translate to a 66% speedup in the query. If we're doing more than counting rows, the work to load the data out of the database and process it can be more resource intensive than scanning rows to match conditions. (The optimized queries and indexes still load the same number of rows as before; they just identify the desired rows faster.) Nevertheless, a 35% speedup is a noticeable improvement.

It's worth digging into how your database's query planner uses indexes to help get to the bottom of performance issues. If you're working with SQLite, remember that:

- a few good composite indexes beat a pile of single-column indexes,
- the planner uses index columns left to right, without skipping, and stops at the first range, and
- a partial index is only used when the query repeats its condition exactly.

Thanks to Aaron Francis for putting together the High Performance SQLite course! (I have no personal or financial relationship to him, but I appreciate his course unblocking me and helping me speed up Scour's ranking.) Thank you also to Adam Gluck and Alex Kesling for feedback on this post.
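And a companion sketch for the partial-index gotcha, with the same invented schema and an integer quality column standing in for Scour's quality rating. The comments describe what the post observed: only the query whose condition textually matches the index definition gets to use the partial index.

```rust
// Sketch of the partial-index exact-match requirement, with made-up names.
use rusqlite::{Connection, Result};

fn explain(conn: &Connection, sql: &str) -> Result<()> {
    let mut stmt = conn.prepare(&format!("EXPLAIN QUERY PLAN {sql}"))?;
    for detail in stmt.query_map([], |row| row.get::<_, String>(3))? {
        println!("{}", detail?);
    }
    Ok(())
}

fn main() -> Result<()> {
    let conn = Connection::open_in_memory()?;
    conn.execute_batch(
        "CREATE TABLE items (
             id INTEGER PRIMARY KEY,
             published_at INTEGER,
             quality INTEGER,
             language TEXT
         );
         CREATE INDEX idx_quality_partial
             ON items (language, published_at)
             WHERE quality <= 90;",
    )?;

    // Equivalent for integer values, but written differently from the index
    // definition; per the post, SQLite will not match this against the
    // partial index.
    explain(
        &conn,
        "SELECT id FROM items
         WHERE quality < 91 AND language = 'en' AND published_at >= 1700000000",
    )?;

    // Exactly the same condition as the index definition, so the partial
    // index is eligible to be used.
    explain(
        &conn,
        "SELECT id FROM items
         WHERE quality <= 90 AND language = 'en' AND published_at >= 1700000000",
    )?;
    Ok(())
}
```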

5 views
Evan Schwartz 7 months ago

Scour - May Update

Hi friends, First off, I'd love your feedback on Scour! Please email me at evan at scour.ing if you'd be willing to get on a quick call with me to talk about your experience with and opinions on Scour. As always, you can also leave suggestions on the feedback board . Now, on to the update! In May, Scour scoured 458,947 posts from 4,323 feeds looking for great content for you. Here are the new features I added in the past month: Scour now uses infinite scroll instead of pagination, which should make it easier to keep reading. Thanks Vahe for the suggestion! Topic recommendations are interspersed into the main feed to make them easier to see and add (you can also find them on the Interests page ). Let me know what you think of these suggestions! Every interest is assigned an emoji and those are used to identify which topics are most closely related to any given post. This should make it easier to visually scan through the feed. Thanks Vahe for this suggestion as well! If you haven't tried it already, you can click on any one of your interests to see all of the posts related to that specific interest. Email digests have landed! They're currently sent out every week on Fridays. The contents are made up of the first 10 results from the weekly feed view (which is why you'll see different items if you go to your feed and have it set to Hot mode or another timeframe). Let me know if you find these useful -- or if you'd prefer to get them at another day or time! Thanks to Allen and Laurynas for this suggestion, and to Alex for his feedback on the styling! Previously, Scour would use light or dark mode based on your browser settings. Now, you can toggle it for yourself using the switch in the dropdown menu. Thanks to u/Hary06 on Reddit for the suggestion! For those of you that use Scour along with an RSS reader, you can now consume each of your interest pages as a separate feed. You can also export all of your interest feeds as an OPML file on your Interests page or via this link . This idea came out of a great feedback call I had with Connor Taffe . Thanks Connor for the idea and for taking the time to chat! The Browse Popular Feeds and Browse Newest Feeds pages now hide feeds you are already subscribed to by default. This should make it easier to use those to find new feeds you want to subscribe to. If you haven't tried these out yet, you can also find personalized feed recommendations on your Feeds page . Scour has been using Mixedbread's embedding model, mxbai-embed-large-v1 , since its inception and I have been a happy user of the company's hosted embeddings API for the past 7 months. This month, I chatted with one of their co-founders, they liked Scour, and they offered to sponsor Scour's embeddings. I think they're doing interesting work on embeddings, search, and information retrieval, so I was happy to take them up on the offer. I fixed a bug that caused some posts to have the wrong embedding assigned, which meant that some posts would be shown as being related to the wrong interests. If you noticed some strange post recommendations before, sorry about that! Shouldn't happen anymore. I wrote two posts in May, one about Scour and the other not: Finally, here were a couple of my favorite posts that I found on Scour in May: (You can also save posts that you find on Scour using the love / like icons next to each post's title.) Happy Scouring!

0 views
Evan Schwartz 7 months ago

Announcing Scour on r/RSS

Last week, I wrote up a little announcement post about Scour on the r/RSS subreddit. I'm copying that post over here for anyone else that might be interested: Hi everyone, A service I'm building has been mentioned a couple of times in this subreddit and it's now in a spot where I want to 'officially' tell you all about it. Feedback is very welcome! Scour searches through noisy feeds for content matching your interests. You can sign up for free, add topics you're interested in (anything from "RSS" to "Pourover coffee brewing techniques"), and import feeds via OPML or scour the 3,200+ feeds that have already been added. You'll have a feed tailored to your interests in under a minute. I'd love to hear what you think! You can respond to this thread with any feedback or put suggestions on the public feedback board . News aggregators like Hacker News and Reddit can be great sources of content. HN especially, though, is so popular that tons of good posts get buried in the constant stream of new submissions. I wanted something that would watch noisy feeds like the HN firehose and find content related to topics I'm interested in. After building a prototype, I immediately started finding lots of good posts that only had 1 or 2 upvotes on HN but were plenty interesting to me. It's been a couple of months since then, there are a few hundred users, and I'm hoping to turn it into a little solo-dev business (more on this below). Scour checks feeds for new content every ~15 minutes. It runs the text of each post through an embedding model, a text quality model, and language classifier. When you load your feed, Scour compares the post embedding to each of your interests to find relevant content (using Hamming Distance between binary vector embeddings , in case you're curious). Scour also reranks posts to make sure your view isn't dominated by any one story, interest, source, or feed. And, it'll try to filter out low-quality posts and content in languages that you don't understand. Everything Scour currently does is free and I plan to keep it that way. I am working on this full time and hoping to make a small business of it, so I'll be adding some additional paid features . I'm taking inspiration from Herman from BearBlog : try to build a useful free service and offer premium features at a reasonable price that are a no-brainer for power users and can support a single developer. I blog about the technical details of building Scour and write a monthly product update. You can find those on emschwartz.me and you can subscribe via RSS or add it on Scour . Looking forward to hearing what you think! P.S. If you read this far and need the link to sign up, it's: scour.ing/signup .
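For the curious, the Hamming distance comparison mentioned above boils down to XOR plus a popcount over the packed bits of two binary embeddings. Here's a minimal sketch; the names and sizes are illustrative, not Scour's actual code.

```rust
// Compare binary-quantized embeddings with Hamming distance: each embedding
// is packed into u64 words, XOR finds the differing bits, and count_ones
// tallies them. Lower distance = more similar.
fn hamming_distance(a: &[u64], b: &[u64]) -> u32 {
    debug_assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

fn main() {
    // Two hypothetical 256-bit embeddings packed as 4 x u64.
    let interest = [0xDEAD_BEEF_u64, 0x0123_4567, 0x89AB_CDEF, 0xFFFF_0000];
    let post = [0xDEAD_BEEA_u64, 0x0123_4567, 0x89AB_CDEF, 0x0000_0000];
    println!("distance = {}", hamming_distance(&interest, &post));
}
```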

0 views
Evan Schwartz 8 months ago

New Life Hack: Using LLMs to Generate Constraint Solver Programs for Personal Logistics Tasks

I enjoy doing escape rooms and was planning to do a couple of them with a group of friends this weekend. The very minor and not-very-important challenge, however, was that I couldn't figure out how to assign friends to rooms. I want to do at least one room with each person, different people are arriving and leaving at different times, and there are only so many time slots. Both Claude 3.7 Sonnet and ChatGPT o3 tried and failed to figure out a workable solution given my constraints. However, after asking the LLMs to generate a constraint solver program and massaging the constraints a bit, I was able to find a solution.

TL;DR: If you need help juggling a bunch of constraints for some personal logistical task, try asking an LLM to translate your requirements into a constraint solver program. You'll quickly find out whether a solution exists and, if not, you can edit the constraints to find the best approach.

The escape room location we're going to is PuzzleConnect in New Jersey. They have 3 rooms that are all Overwhelmingly Positively rated on Morty, which practically guarantees that they're going to be good. Side note on Morty: escape rooms as a category are fun but very hit-or-miss. Knowing which rooms are going to be great and which aren't is a game changer because you can skip the duds. Morty is an app specifically for reviewing escape rooms, and it shows you how many rooms each reviewer has played. If someone who has gone out of their way to play 100+ escape rooms says a room is great, you can probably take their word for it. If you happen to be in the Netherlands, Belgium, or Luxembourg, EscapeTalk.nl is also fantastic. Side side note: I'm not affiliated with Morty or EscapeTalk in any way. I just appreciate what they do.

As you can see, these are a fair number of constraints to juggle. I started with pen and paper, then a spreadsheet, but quickly gave up and asked the LLMs.

Constraint solvers are programs in which you declaratively express constraints on the solution, and the solver effectively tries to explore all possible states to find a valid one. In practice, they use lots of clever methods and heuristics to efficiently explore the state space. Unlike imperative programming, where you specify the steps for the program to take, under this paradigm you are just describing a valid final state. In addition to specifying hard constraints, you can also provide soft constraints that the solver will attempt to maximize.

I would not have thought to ask the LLMs to build a constraint solver program for me if not for Carolyn Zech's talk at this week's Rust NYC meetup about verifying the Rust standard library (see the announcement and the project on GitHub). I have no experience writing programs for constraint solvers, but I was able to describe all of my constraints and ChatGPT was perfectly capable of translating those requirements into code. In this case, we used Google's OR-Tools Python package. The first version was impossible to satisfy, but it worked once I added and moved around some time slots.

After finding a workable solution, I think it's interesting to note how hard and soft constraints are expressed: each player playing each room at most once is a hard constraint, while my desire to play at least one room with each person is expressed as a soft constraint that the solver maximizes. You can find the full code below. I'm not a big fan of balancing logistical tasks and constraints, but I do like finding optimal solutions.

Getting an LLM to generate a constraint solver program for me to find optimal or feasible solutions is a nice new life hack that I'm sure I'll be using again. ChatGPT can run Python code that it generates for you, but the OR-Tools package isn't available as an import in that environment (for now!). It would be neat if OpenAI or Anthropic added it as a dependency and trained the models to reach for it when given some set of hard constraints or an optimization problem to solve. In the meantime, though, I'll just use constraint solver programs to optimize random life logistics.

0 views
Evan Schwartz 8 months ago

Scour - April Update

Hi friends, Scour is now checking over 3,000 sources (3,034 to be precise) for content matching your interests. This month, it scoured 281,968 posts. Here are the new features I added: Scour now has a hot mode that shows content from the past week while balancing between relevance and recency. This will show you a mix of very recent posts with especially relevant ones from a few days ago that you might have missed. Right now, Hot mode is the default. If you prefer a different timeframe (for example, the past day), you can set that in your settings . Thanks justmoon for the suggestion! (One thing I'm not so happy about with Hot mode is that it takes closer to ~250ms to load, rather than the <100ms I'm aiming for. I'm working on getting that back down so your feed feels extra snappy, no matter which timeframe you pick.) You can now Love especially great posts you find on Scour using the acorn icon. It's up to you what counts as a Love versus a Like, but I'm personally using it to mark those posts that make me think "wow, I'm so glad I found that! That's so timely and I definitely would have missed it otherwise." Thanks to everyone who voted for this idea . That said, Vahe Hovhannisyan brought up a good point that thinking about whether to like or love a post introduces extra decision fatigue. Please weigh in on this if you agree or disagree! I might make the love reaction an optional setting, hide it somehow, or remove it in the future. You can click on any one of your interests on the Interests page or on the tags next to post titles to see only posts related to that interest. If you browse other users' pages , you can also see posts related to any one of their interests. And, if you like a topic someone else has added, you can easily add that interest to your set. Thanks also to justmoon for this suggestion! Scour got a design refresh! The website should look and feel a bit more modern and polished now. Let me know if you run into any little design issues. (For those that are interested, I switched from PicoCSS to Tailwind. Pico was great for getting off the ground quickly, but I found I was fighting with it more and more as I built out more features.) I'm almost done building out automatic email updates so you'll hopefully have the first personalized weekly roundup in your inboxes this Friday. Thanks to Allen Kramer for this suggestion! Here were a couple of my favorite posts that I found on Scour this month: Happy Scouring — and keep the feedback coming!

0 views
Evan Schwartz 9 months ago

Scour - March Update

Hi friends, In March, Scour scoured 276,710 posts from 2,872 sources, you gave lots of good suggestions , and we've got some new features that I hope you'll like. We also have quite a few new users, so welcome to everyone who recently signed up! Also, I gave the first presentation about Scour and wrote a blog post about it called Building a fast website with the MASH stack in Rust . The big feature this month is likes and dislikes. Next to each post in your feed you'll find the classic thumbs up and thumbs down buttons, which do what you might expect. You can find all of your likes here and you can use that to save posts for later. Likes are also used to help recommend topics you might be interested in. (Personally, I've found that I sometimes want to "love" a post, particularly when Scour finds me a gem that I'm especially glad to have found and might not have seen otherwise. If you'd also be interested in that, please let me know by upvoting that idea here: Extra-like reaction (❤️ / 💎) .) To go along with the Likes feature, you can now see Popular Posts from across Scour and you can see what specific users have liked as well. And, naturally, all of those are feeds you can subscribe to — just visit the Popular Posts or a user's Likes page and click the Subscribe link. Under each post in your feed, you can now find a Show Feeds link that shows which feeds that post appeared in. Feeds you are subscribed to appear first (so you can unsubscribe if you want), followed by others that you could subscribe to if you want more content from that source. Thanks to Vahe Hovannisyan for the suggestion! Thanks again to everyone who submitted ideas, feedback, and bug reports ! They help me figure out what to work on, so please try out these new features and let me know what else you'd like to see next! Happy scouring! - Evan P.S. Three posts I was especially happy to find through Scour this month were:

0 views
Evan Schwartz 9 months ago

Building a fast website with the MASH stack in Rust

I'm building Scour, a personalized content feed that sifts through noisy feeds like Hacker News Newest, subreddits, and blogs to find great content for you. It works pretty well -- and it's fast. Scour is written in Rust, and if you're building a website or service in Rust, you should consider using this "stack". After evaluating various frameworks and libraries, I settled on a couple of key ones and then discovered that someone had written it up as a stack: Shantanu Mishra described the same set of libraries I landed on as the "mash 🥔 stack" and gave it the tagline "as simple as potatoes". This stack is fast and nice to work with, so I wanted to write up my experience building with it to help spread the word.

TL;DR: The stack is made up of Maud, Axum, SQLx, and HTMX and, if you want, you can skip down to where I talk about synergies between these libraries. (Also, Scour is free to use and I'd love it if you tried it out and posted feedback on the suggestions board!)

Scour uses server-side rendered HTML, as opposed to a Javascript or WebAssembly frontend framework. Why? First, browsers are fast at rendering HTML. Really fast. Second, Scour doesn't need a ton of fancy interactivity and I've tried to apply the "You aren't gonna need it" principle while building it. Holding off on adding new tools helps me understand the tools I do use better. I've also tried to take some inspiration from Herman from BearBlog's approach to "Building software to last forever". HTML templating is simple, reliable, and fast.

Since I wanted server-side rendered HTML, I needed a templating library, and Rust has plenty to choose from. The main two decisions to make were whether templates are evaluated at compile time or at runtime, and whether they live in separate template files or inline in your Rust source code. Of the popular template engines that fall along those two axes, I initially picked Askama because of its popularity, performance, and type safety. (I quickly passed on all of the runtime-evaluated options because I couldn't imagine going back to a world of runtime type errors. Part of the reason I'm writing Rust in the first place is compile-time type safety!)

After two months of using Askama, however, I got frustrated with its developer experience. Every addition to a page required editing both the Rust struct and the corresponding HTML template. Furthermore, extending a base template for the page header and footer was surprisingly tedious. Askama templates can inherit from other templates; however, any values passed to the base template (such as whether a user is logged in) must be included in every page's Rust struct, which led to a lot of duplication. This experience sent me looking for alternatives.

Maud is a macro for writing fast, type-safe HTML templates right in your Rust source code. The format is concise and makes it easy to include values from Rust code. The Hello World example shows how you can write HTML tags, classes, and attributes without the visual noise of angle brackets and closing tags. Rust values can be easily spliced into templates (HTML special characters are automatically escaped), control structures like @if, @else, @for, @while, and @match are very straightforward, and partial templates are easy to reuse by turning them into small functions that return Markup. All in all, Maud provides a pleasant way to write HTML components and pages. It also ties in nicely with the rest of the stack (more on that later).
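Here's a small, self-contained sketch of those Maud patterns (splicing, @if/@for, and partials as plain functions returning Markup). The page and nav below are invented examples, not Scour's actual templates.

```rust
// Sketch of the Maud patterns described above, with made-up page content.
use maud::{html, Markup, DOCTYPE};

// A reusable partial: just a function that returns Markup.
fn nav(logged_in: bool) -> Markup {
    html! {
        nav {
            a href="/" { "Home" }
            @if logged_in { a href="/settings" { "Settings" } }
            @else { a href="/login" { "Log in" } }
        }
    }
}

fn page(title: &str, posts: &[&str], logged_in: bool) -> Markup {
    html! {
        (DOCTYPE)
        html {
            head { title { (title) } }
            body {
                (nav(logged_in))
                // Spliced values are HTML-escaped automatically.
                h1 { (title) }
                ul {
                    @for post in posts { li { (post) } }
                }
            }
        }
    }
}

fn main() {
    let markup = page("Latest posts", &["First post", "Second post"], true);
    println!("{}", markup.into_string());
}
```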
Axum is a popular web framework built by the Tokio team. The framework uses functions with extractors to declaratively parse HTTP requests. The Hello World example illustrates building a router with multiple routes, including one that handles a POST request with a JSON body and returns a JSON response. Axum extractors make it easy to parse values from HTTP bodies, paths, and query parameters and turn them into well-defined Rust structs. And, as we'll see later, it plays nicely with the rest of this stack.

Every named stack needs a persistence layer. SQLx is a library for working with SQLite, Postgres, and MySQL from async Rust. SQLx has a number of different ways of working with it, but the flavor I use is deriving the FromRow trait on structs to map between database rows and Rust types. Note that you can derive both SQLx's FromRow and serde's Serialize and Deserialize on the same structs to use them all the way from your database to the Axum layer. However, in practice I've often found that it is useful to separate the database types from those used in the server API -- and it's easy to define From implementations to map between them.

The last part of the stack is HTMX. It is a library that enables you to build fairly interactive websites using a handful of HTML attributes that control sending HTTP requests and handling their responses. While HTMX itself is a Javascript library, websites built with it often avoid needing to use custom Javascript directly. For example, a button annotated with HTMX attributes can mean "when a user clicks on this button, issue an AJAX request to /clicked, and replace the entire button with the HTML response". Notably, that replaces just the button with the HTML returned from /clicked, rather than the whole page like a plain HTML form would.

HTMX has been having a moment, in part due to essays like The future of HTMX, where they talked about "Stability as a Feature" and "No New Features as a Feature". This obviously stands in stark contrast to the churn that the world of frontend Javascript frameworks is known for. There is a lot that can and has been written about HTMX, but the logic clicked for me after watching this interview with the creator of it. The elegance of HTMX -- and the part that makes its promise of stability credible -- is that it was built from first principles to generalize the behavior already present in HTML forms and links. Specifically, (1) HTML forms and links (2) submit GET or POST HTTP requests (3) when you click a Submit button and (4) replace the entire screen with the response. HTMX asks and answers the questions: why should only forms and links make HTTP requests? Why only on click or submit? Why only GET and POST? And why replace the entire screen? By generalizing these behaviors, HTMX makes it possible to build more interactive websites without writing custom Javascript -- and it plays nicely with backends written in other languages like Rust.

Since we're talking about Rust and building fast websites, it's worth emphasizing that while HTMX is a Javascript library, it only needs to be loaded once. Updating your code or website behavior will have no effect on the HTMX libraries, so you can use a long-lived, immutable Cache-Control directive to tell browsers or other caches to indefinitely store the specific versions of HTMX and any extensions you're using. The first visit loads those scripts, but subsequent visits only need to load the HTML. This makes for even faster page loads for return users.
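A minimal sketch of the pieces working together: an SQLx query feeding a Maud fragment served from an Axum route, with an HTMX attribute on a button. It assumes a recent axum, sqlx with its sqlite and Tokio runtime features, and maud with its axum feature enabled; the posts table and routes are invented for illustration.

```rust
// Sketch: Axum route + SQLx query + Maud fragment + HTMX attribute.
use axum::{extract::State, routing::get, Router};
use maud::{html, Markup};
use sqlx::SqlitePool;

#[derive(sqlx::FromRow)]
struct Post {
    title: String,
}

async fn latest_posts(State(pool): State<SqlitePool>) -> Markup {
    // Falls back to an empty list if the query fails; a real app would
    // handle the error properly.
    let posts: Vec<Post> = sqlx::query_as("SELECT title FROM posts ORDER BY id DESC LIMIT 10")
        .fetch_all(&pool)
        .await
        .unwrap_or_default();

    html! {
        ul {
            @for post in &posts { li { (post.title) } }
        }
        // hx-get/hx-swap: clicking fetches /posts and swaps this button out
        // for the returned fragment instead of reloading the whole page.
        button hx-get="/posts" hx-swap="outerHTML" { "Refresh" }
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let pool = SqlitePool::connect("sqlite::memory:").await?;
    sqlx::query("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
        .execute(&pool)
        .await?;

    let app = Router::new().route("/posts", get(latest_posts)).with_state(pool);
    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000").await?;
    axum::serve(listener, app).await?;
    Ok(())
}
```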
Overall, I've had a good experience building with this stack, but I wanted to highlight a couple of places where the various components complemented one another in nice ways.

Earlier, I mentioned my frustration with Askama, specifically around reusing a base template that includes different top navigation bar items based on whether a user is logged in or not. I was wondering how to do this with Maud when I came across this Reddit question: "Users of maud (and axum): how do you handle partials/layouting?" David Pedersen, the developer of Axum, had responded with this gist. In short, you can make a page layout struct that is an Axum extractor and provides a render method that returns Markup. When you use the extractor in your page handler functions, the base template automatically has access to the components it needs from the request. This approach makes it easy to reuse the base page template without needing to explicitly pass it any request data it might need. (Thanks David Pedersen for the write-up -- and for your work on Axum!)

This is somewhat table stakes for HTML templating libraries, but it is a nice convenience that Maud has an Axum integration that enables returning Markup directly from Axum routes.

HTMX has a number of very useful extensions, including the Preload extension. It preloads HTML pages and fragments into the browser's cache when users hover or start clicking on elements, such that the transitions happen nearly instantly. The Preload extension marks the requests it initiates with a request header, which pairs nicely with Axum middleware that sets the cache response headers. (Of course, this same approach can be implemented with any HTTP framework, not just Axum.)

Update: after writing this post, u/PwnMasterGeno on Reddit pointed out a crate that includes Axum extractors and responders for all of the headers that HTMX uses. For example, you can use the HX-Request header to determine whether you need to send the full page or just the body content. It also has a nice feature for cache management: it can automatically set the Vary component of the HTTP cache headers based on the request headers you use, which will ensure the browser correctly resends the request when the request changes in a meaningful way.
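To make the layout-as-extractor idea from David Pedersen's gist concrete, here's a rough sketch, assuming axum 0.8+ (where FromRequestParts can be implemented with a plain async fn) and maud's axum feature. The struct, fields, and login check are invented stand-ins, not the gist's or Scour's actual code.

```rust
// Sketch of a page layout as an Axum extractor; names are illustrative.
use std::convert::Infallible;

use axum::{extract::FromRequestParts, http::request::Parts, routing::get, Router};
use maud::{html, Markup, DOCTYPE};

struct Layout {
    logged_in: bool,
}

impl<S> FromRequestParts<S> for Layout
where
    S: Send + Sync,
{
    type Rejection = Infallible;

    async fn from_request_parts(parts: &mut Parts, _state: &S) -> Result<Self, Self::Rejection> {
        // Stand-in for real session handling: pretend any cookie means logged in.
        let logged_in = parts.headers.contains_key(axum::http::header::COOKIE);
        Ok(Layout { logged_in })
    }
}

impl Layout {
    fn render(&self, content: Markup) -> Markup {
        html! {
            (DOCTYPE)
            html {
                body {
                    nav {
                        @if self.logged_in { a href="/settings" { "Settings" } }
                        @else { a href="/login" { "Log in" } }
                    }
                    (content)
                }
            }
        }
    }
}

// The handler never threads login state around by hand: the extractor pulls
// it from the request, and the base template is reused everywhere.
async fn home(layout: Layout) -> Markup {
    layout.render(html! { h1 { "Your feed" } })
}

#[tokio::main]
async fn main() {
    let app = Router::new().route("/", get(home));
    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}
```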
While I've overall been happy building with the MASH stack, here are the things that I've found to be less than ideal.

I would be remiss talking up this stack without mentioning one of the top complaints about most Rust development: compile times. When building purely backend services, I've generally found that Rust Analyzer does the trick well enough that I don't need to recompile in my normal development flow. However, with frontend changes, you want to see the effects of your edits right away. During development, I use Bacon for recompiling and rerunning my code, along with a live-reload setup so the frontend refreshes automatically. Using some of Corrode's Tips For Faster Rust Compile Times, I've gotten it down to around 2.5 seconds from save to page reload. I'd love it to be faster, but it's not a deal-breaker for me. For anyone building with the MASH stack, I would highly recommend splitting your code into smaller crates so that the compiler only has to recompile the code you actually changed. Also, there's an unmerged PR for Maud to enable updating templates without recompiling, but I'm not sure if that will end up being merged. If you have any other suggestions for bringing down compile times, I'd love to hear them!

HTMX's focus on building interactivity through swapping HTML chunks sent from the backend sometimes feels overly clunky. For example, the Click To Edit example is a common pattern involving replacing an Edit button with a form to update some information such as a user's contact details. The stock HTMX way of doing this is fetching the form component from the backend when the user clicks the button and swapping out the button for the form. This feels inelegant because all of the necessary information is already present on the page, save for the actual form layout. It seems like some users of HTMX combine it with Alpine.js, Web Components, or a little custom Javascript to handle this. For the moment, I've opted for a pattern lifted from the HTMX docs, but I don't love it.

If you're building a website and using Rust, give the MASH stack a try! Maud is a pleasure to use. Axum and SQLx are excellent. And HTMX provides a refreshing rethink of web frontends. That said, I'm not yet sure if I would recommend this stack to everyone doing web development. If I were building a startup making a normal web app, there's a good chance that TypeScript is still your best bet. But if you are working on a solo project or have other reasons that you're already using Rust, give this stack a shot!

If you're already building with these libraries, what do you think? I'd love to hear about others' experiences. Thanks to Alex Kesling for feedback on a draft of this post! Discuss on r/rust, r/htmx or Hacker News. If you haven't already signed up for Scour, give it a try and let me know what you think!

0 views
Evan Schwartz 10 months ago

[Video] Scour Presentation @ Rust NYC

Here’s the recording of my presentation about Scour at the Rust NYC Meetup (link). The presentation goes through the motivation for Scour and aspects of the architecture that make it fast. You can read more about some of the topics mentioned in the presentation on this blog. If you haven't already signed up for Scour, give it a try and let me know what you think!

0 views
Evan Schwartz 10 months ago

Scour - February Update

Hi friends, In February, Scour scoured 224,806 posts from 2,140 sources looking for great content for you. It's not the most important addition, but you can find these stats on the home page . Now, on to more useful features. Thanks everyone for your feedback and suggestions ! Here's what's new this month: Adding interests and feeds can be a little tedious. Now, Scour gives you recommendations for both. On your Interests page, you'll find a list of recommended topics. These are AI-generated and I'm still tweaking it, so I'd love to hear what you think of the suggestions. (This isn't quite the "more of this" that Laurynas asked for, but that idea prompted me to get moving on this feature.) You can also find recommended feeds on your Feeds page. These are the sources that had the most relevant content for you over the last month. (This was previously in the Browse section, but that might not have been obvious. Thanks to justmoon for the suggestion to move it over.) Scour now uses a model to determine the language of each article and will filter out posts in languages you don't speak. If you understand languages other than English, you can add them in your settings . (Thanks Cairinmichie for the suggestion!) Scour wouldn't be all that useful if it just told you to go read a single newspaper or only showed you content from Hacker News. Now, it automatically tries to diversify the domains and feeds it shows you content from. (Thanks to Allen for the feedback!) Happy scouring — and keep the feedback coming ! P.S. For any Rust programmers, I also wrote a blog post on Pinning Down "Future Is Not Send" Errors .

0 views
Evan Schwartz 11 months ago

Pinning Down "Future Is Not Send" Errors

If you use async Rust and Tokio, you are likely to run into some variant of the "future is not Send" compiler error. While transitioning some sequential async code to use streams, a friend suggested a small technique for pinning down the source of the non-Send errors. It helped a lot, so I thought it would be worth writing up in case it saves others some annoying debugging time.

I'll give a bit of background on Futures and Send bounds first, but if you want to skip past that you can jump to the sections on the DX problem with non-Send Futures or on pinning down the source of non-Send errors. I wrote another blog post about the relationship between async Rust and Send, so we won't go into detail about that here. The main thing we'll focus on here is that if you're using Tokio, you're probably going to be spawning some Futures, and if you spawn a Future it must be Send.

Most types are automatically marked as Send, meaning they can be safely moved between threads. As the Rustonomicon says, the major exceptions include raw pointers and Rc -- and nothing that contains them is Send either. Futures are structs that represent the state machine for each step of the asynchronous operation. When a value is used across an await point, that value must be stored in the Future. As a result, using a non-Send value across an await point makes the whole Future non-Send.

To illustrate the problem in the simplest way possible, let's take an extremely simplified example with two async functions. One of them holds an Rc across an await point and thus loses its Send bound -- but shhh! let's pretend we don't know that yet. We then have an async function that calls both of them and a function that spawns the resulting Future. This code doesn't compile (playground link). But where does the compiler direct our attention? If we only take a quick look at the error message, it seems like the error is coming from the spawn call.

In this example, it's easy to spot the mention of the Rc not being Send -- but we know what we're looking for! Also, our async chain is pretty short, so the types and error messages are still somewhat readable. The longer that chain grows, the harder it is to spot the actual source of the problem. The crux of the issue is that the compiler draws our attention first to the place where the bounds check fails. In this case, it fails when we try to spawn a non-Send Future -- rather than where the Future loses its Send bound.

There are a number of different ways we could pin down the source of these errors, but here are two.

First, instead of using an async fn, we can use a normal function that returns a Future. (This is what the async keyword does under the hood, so we can just forego that bit of syntactic sugar.) We can transform our example into functions that return boxed Futures with an explicit Send bound, either using async blocks or using combinators. Neither version compiles (playground link), but this time the compiler errors point to the Futures returned by the inner functions not fulfilling the Send bound that we are specifying. The idea here is that we are foregoing the async fn syntax in order to explicitly state that the Futures our functions return must be Send.

Second, we can keep our original async fns but use a helper function to ensure that the value we pass to it implements Send. Here, the compiler will also point us to the right place. While debugging, you can wrap any part of the async chain with the helper function call until you've pinpointed the non-Send part. (This is similar to what some assertion macros generate under the hood, and using one of those crates is another option.)
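Here's a minimal sketch of that helper-function technique (the function and example names are mine, not from the post): a generic function with a Send bound that you wrap around pieces of the async chain until the compiler error lands on the real culprit.

```rust
// A generic helper with a Send bound: wrapping futures in it moves the
// compiler error to the exact expression that is not Send.
use std::rc::Rc;

fn assert_send<T: Send>(value: T) -> T {
    value
}

async fn uses_rc() {
    let rc = Rc::new(1);
    tokio::task::yield_now().await; // Rc held across an await => future is not Send
    println!("{rc}");
}

async fn fine() {
    tokio::task::yield_now().await;
}

#[tokio::main]
async fn main() {
    // Wrapping each future pinpoints the offender: this line compiles...
    tokio::spawn(assert_send(fine())).await.unwrap();
    // ...while uncommenting the next line produces an error at this call,
    // pointing at the Rc held across the await inside `uses_rc`.
    // tokio::spawn(assert_send(uses_rc())).await.unwrap();
}
```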
Since the introduction of async/await, I have mostly stopped using Future combinators. However, combinators still seem like the way to go when working with Streams, and Streams present the same DX problems we've seen above when a combinator produces a non-Send result. A simple example (playground link) demonstrates the same issue we had with Futures, and the same type of helper function can identify which of our closures is returning a non-Send Future (playground link).

Async Rust is powerful, but it sometimes comes with the frustrating experience of hunting down the source of trait implementation errors. I ran into this while working on Scour, a personalized content feed. The MVP used a set of sequential async steps to scrape and process feeds. However, that became too slow when the number of feeds grew to the thousands. Transitioning to Streams allows me to take advantage of combinators that poll nested streams with a configurable level of concurrency. This works well for my use case, but writing the code initially involved plenty of hunting for non-Send Futures. The techniques described above helped me narrow down why my combinator chains were becoming non-Send. I hope you find them useful as well!

Thanks to Alex Kesling for mentioning this technique and saving me a couple hours of fighting with the compiler. If you're working with Rust streams, you might also want to check out the related posts linked here. Discuss on r/rust, Lobsters, or Hacker News.
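And the same trick applied to a Stream pipeline, which is where the post says the problem usually shows up. The assert_send_stream helper and the fetch stand-in are illustrative, and the buffer_unordered chain is just one example of the configurable-concurrency combinators described above.

```rust
// Wrapping intermediate stream stages in a Send-bounded helper narrows down
// which closure or nested future strips Send from the whole chain.
use futures::stream::{self, StreamExt};

fn assert_send_stream<S: futures::Stream + Send>(s: S) -> S {
    s
}

async fn fetch(url: String) -> usize {
    // Stand-in for real async work (e.g. fetching and parsing a feed).
    url.len()
}

#[tokio::main]
async fn main() {
    let urls = vec![
        "https://example.com/a".to_string(),
        "https://example.com/b".to_string(),
    ];

    let results: Vec<usize> = assert_send_stream(
        stream::iter(urls)
            .map(fetch)
            .buffer_unordered(10), // run up to 10 fetches concurrently
    )
    .collect()
    .await;

    println!("{results:?}");
}
```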

0 views