Latest Posts (17 found)
Emil Privér 1 month ago

We Re-Built Our Integration Service Using Postgres and Go

Our integration service connects our platform to external systems. Earlier this year, we reached a scaling limit at 40 integrations and rebuilt the service from the ground up. It handles three primary responsibilities: sending data to external systems, managing job queues, and prioritizing work based on criticality. The original implementation functioned but had architectural constraints that prevented horizontal scaling.

We use microservices because different components have conflicting requirements. The management API handles complex business logic with normalized schemas: separate tables for translations and categories. The public API optimizes for read performance under load, using denormalized data by adding translations directly into category tables and handling filtering in Go. A monolithic architecture would require compromising performance in one area to accommodate the other. The integration service currently processes millions of events daily, with volume increasing as we onboard new customers. This post describes our implementation of a queue system using PostgreSQL and Go, focusing on design decisions and technical trade-offs.

The first implementation used GCP Pub/Sub, a topic-to-many-subscription service where messages are replicated across multiple queues. This architecture introduced several scalability issues. The integration service maintained a database for integration configurations but lacked ownership of its operational data. This violated a distributed-systems principle: services should own their data rather than depend on other services for it. The dependency forced our management service to serialize complete payloads into the queue. Updating a single attribute on a sub-object required sending the entire parent object with all nested sub-objects, metadata, and relationships. Different external APIs have varying data requirements: some need individual sub-objects while others require complete hierarchies.
For clients with records containing 300-500 sub-objects, this resulted in significant message-size inflation. GCP charges by message size rather than count, making large messages substantially more expensive than smaller ones. GCP’s WebSocket delivery requires clients to buffer messages internally. With 40 integrations running separate consumers with filters, traffic spikes created memory pressure (the symptoms are listed at the end of this post). This prevented horizontal scaling and limited us to vertical scaling approaches.

External APIs enforce varying rate limits. Our in-memory rate limiter tracked requests per integration but prevented horizontal scaling, since state couldn’t be shared across instances without risking rate-limit violations.

By early 2025, these issues had compounded: excessive message sizes increasing costs, memory bloat requiring oversized containers, vertical-only scaling, high operational expenses, rate limiting preventing horizontal scale, and a lack of data independence. The system couldn’t accommodate our growth trajectory. A complete rebuild was necessary. The v2 design addressed these specific limitations and added several further improvements (the full list of goals appears at the end of this post).

The standard approach involves the producer computing payloads and sending them to the queue for consumer processing. We used this in v1 but rejected it for v2. Customers frequently make multiple rapid changes to the same record: updating a title, then a price, then a description. Each change triggers an event. Instead of sending three separate updates, we consolidate changes into a single update. We implemented a deduplication window in the jobs table: multiple updates to the same record within a short time window are deduplicated into a single job, reducing load on both our system and recipient systems.

We chose PostgreSQL as our queue backend for several reasons. Often we think we need something bigger, like Apache Kafka, when a relational database like PostgreSQL is sufficient for our requirements.
Each job in the jobs table tracks a trace ID, the action to perform, error details, the current status, a retry counter, retry scheduling, timestamps, the integration and platform it belongs to, the payload, and an idempotency key (the full field list is at the end of this post).

Postgres-backed queues require careful indexing. We use partial indexes (with WHERE clauses) only for the actively queried states. We don’t index the terminal states, such as completed jobs; those statuses contain the majority of rows in the table and aren’t needed in the job processing flow, so indexing them would only pull data into memory that the flow never uses. Jobs are ordered by creation time for FIFO processing, with priority-queue overrides when applicable.

Jobs follow a defined lifecycle (detailed at the end of this post). Timestamp fields serve observability purposes, measuring job duration and identifying bottlenecks. For retried jobs, retry timing is calculated using exponential backoff.

The worker system has to support parallel execution, horizontal scaling, graceful shutdowns, and distributed rate limiting. We evaluated two approaches: maintaining in-memory queues, with multiple goroutines using for/select loops to fetch jobs from channels, or having goroutines fetch rows from the database and iterate over the results. We chose the database-iteration approach for its simplicity; pgxpool handles connection pooling, eliminating the need for channel-based in-memory queues.

Each worker runs in a separate goroutine, using a ticker to poll for jobs every second. Before processing, workers check for shutdown signals (context cancellation or a shutdown channel). When shutdown is initiated, workers stop accepting new jobs and mark in-flight jobs for retry. This prevents stalled jobs from blocking integration queues, and checking shutdown signals between jobs ensures clean shutdowns. During shutdown, we create a fresh context with its own timeout for re-queueing jobs, which prevents database write failures when the main context is canceled.

The fetch query implements fair scheduling to prevent high-volume integrations from monopolizing workers. Step 1: identify busy integrations. A CTE identifies integrations with 50+ concurrently processing jobs. Step 2: select jobs with priority ordering. Jobs are selected from integrations not in the busy list; priority updates are ordered first, followed by FIFO ordering.
A row-locking clause (FOR UPDATE with SKIP LOCKED is the standard choice for Postgres queues) locks the selected rows to the current transaction, preventing duplicate processing by concurrent workers. Step 3: update job status. Selected jobs are updated to the processing status with a recorded start time. This ensures fair resource allocation across integrations.

Job timeouts are critical for queue health. In the initial release, we reused the global context for job processing. When jobs hung waiting for slow external APIs, they couldn’t be marked completed or failed due to context-lifecycle coupling, and jobs accumulated in the processing state indefinitely. The solution: context separation. The global context controls worker lifecycle, while each job receives its own context with a timeout. Timed-out jobs are marked as failed, allowing queue progression. This also enables database writes during shutdown using a fresh context, even when the global context is canceled.

Failed jobs require retry logic with appropriate timing; immediate retries against failing external APIs are counterproductive. We implement exponential backoff: an instant first retry, 10 seconds for the second, 30 seconds for the third, up to 30 minutes. The retry counter drives the backoff calculation. After 10 attempts, jobs are marked permanently failed.

Error types guide retry behavior (the types are listed at the end of this post). This allows each integration to decide how to handle errors based on the external API’s response. For example, a 400 Bad Request might be a permanent validation failure (NonRetryableError), while a 503 Service Unavailable is transient and should retry (RetryableError). The integration implementation determines the appropriate error type for each scenario.

Jobs occasionally become stuck in the processing state due to worker panics, database connection failures, or unexpected container termination. A cron job runs every minute, identifying jobs that have been processing beyond the expected duration. These jobs are moved back into the retry flow with incremented retry counts, treating them as standard failures. This ensures queue progression despite unexpected failures.

Rate limiting across multiple containers was v2’s most complex challenge.
V1’s in-memory rate limiter worked for single containers but couldn’t share state across instances. While Redis was an option, we already had PostgreSQL with sufficient performance. The solution: a table tracking request counts per integration per second. Before each external API request, we increment the counter for the integration’s current time window (rounded to the second), and PostgreSQL returns the new count. If the count exceeds the limit, we sleep 250ms and retry; if under the limit, we proceed. This works because all containers share the database as the source of truth for rate limiting. Occasionally, jobs are rate-limited during heavy load due to the gap between count checking and request sending; these jobs retry immediately. The occurrence rate is acceptable.

For reference, here are the lists from the sections above.

Memory pressure symptoms under traffic spikes (v1):
- Mass updates generate large objects per record
- Objects are duplicated for each configured integration
- Copies buffer across 5-10 consumer instances
- Infrastructure requires 2GB RAM and 2 cores to handle spikes, despite needing only 512MB and 1 core during normal operation

V2 design goals:
- Horizontal scaling - Enable scaling across multiple containers
- Distributed rate limiting - Coordinate rate limits across instances
- Data ownership - Store operational data within the service
- Delta updates - Send only changed data rather than complete records

Additional improvements:
- Fair scheduling - Prevent single integrations from monopolizing resources
- Priority queuing - Process critical updates before lower-priority changes
- Self-service re-sync - Enable customers to re-sync catalogs independently
- Visibility - Provide APIs for customers to monitor sent data and queue status

Why PostgreSQL:
- Performance - PostgreSQL is fast enough for our use case. We don’t need sub-second message delivery.
- Simplicity - Using a managed PostgreSQL instance on GCP is significantly simpler than introducing new infrastructure.
- Familiarity - Most developers understand SQL, reducing onboarding time.
- Existing infrastructure - We already use PostgreSQL for our data, eliminating the need for additional systems.

What each job tracks:
- A trace ID - Links logs across services
- The action to perform
- Failure details for the last error
- The current workflow state
- A retry counter - Counts retry attempts
- The next scheduled retry time
- Created, started, and completed timestamps - Provide metrics for observability
- The integration the job belongs to
- The external platform
- The job payload
- An idempotency key - Prevents duplicate execution

Job lifecycle:
- Created → initial state
- Picked up → transitions to processing
- Success → marked completed, and the completion time is recorded
- Failed after 10 retries → marked permanently failed, and the failure is recorded
- Failed with retries remaining → moved to retry, the retry counter is incremented, and the next retry time is calculated

Worker system requirements:
- Parallel worker execution
- Horizontal scaling across containers
- Graceful shutdowns without job loss
- Distributed rate limit enforcement: we need to respect rate limits no matter how many containers we run

Error types:
- NonRetryableError - Permanent failures (e.g., validation errors). No retry.
- RetryableError - Transient failures (e.g., 500 Internal Server Error). Retry with backoff.
- Retry limit reached - Mark the job permanently failed.

Hope you enjoyed this article and learned something new. This system has worked really well so far, and we’ve had only a few minor issues that we fixed quickly. I will update this article over time.

Emil Privér 1 month ago

Proposal for better code reviews

I’ve been reviewing tons of code throughout my life, both professionally at work and on my open-source projects such as Geni, and something that has always bothered me about code reviews is how much time you can spend on them. And we’re all different: some review every commit, and some just check the end result. Some of us want to follow a way-of-working guideline, and some just want a basic explanation in the PR description.

This is something I also find quite interesting, because I’ve met three types of code reviewers (described at the end of this post). And don’t get me wrong, there are pros and cons to all three of them, but the one that probably provides the best review is the last one, mostly because of the amount of time they spend understanding how the code is used.

I don’t think the PR code diff we have at GitHub, GitLab, and so on is good enough to achieve its main goal: reviewing code. Sure, if you just want to see that a change has happened, the existing code review UIs are quite good, because you see the changes. But I guess you, like me, also want to understand how the code is used, because we do have logical bugs. That said, these UIs are phenomenally good at keeping other developers up to date.

It can be very hard to understand logical bugs (which is the biggest reason I want a PR) when all the UI shows is the diff. Granted, a tiny change is a bad example, but you get the point: it’s hard to follow the code when there are a lot of changes, because the only thing you see is changes. And the hard truth is that we can’t always deliver tiny PRs to make things easier for the reviewer, because the changes become incoherent, so we don’t really get the value from it. It’s also quite interesting to see how different we are when it comes to code reviews.
Some developers don’t do reviews at all because they pair-program and review the code on the fly, some people want in-person meetings where they walk through the code, and some just want a link to a GitHub PR so they can review it on the bus home. It’s fun that we’re so different.

Every time I’ve read about code reviews, people speak about how we can be better colleagues by being more respectful, asking questions in a better way, creating smaller PRs to reduce the amount of code to review, and so on. I think this mostly covers the easy problems with code reviews. So I’ve been thinking a bit about how we can make reviews easier. Sure, there’s stuff like writing a better changelog, reviewing the code yourself before asking someone else, and adding a bunch of context to the PR as comments. But there is something I’ve always wanted when I am reviewing code: context.

Many times, when I look at changes in a function and I am not that familiar with the codebase, understanding how that function is used is quite time-consuming. This is also something I’ve noticed that people who spend a lot of time on PRs do: they open the same file at the commit SHA and look at how the function is used. I think we need to improve the tools we have, because the human side depends a lot on who is involved in the review, while the tools are almost always the same.

The hardest problem, for me at least, is understanding the context: the how, where, and why. For instance, imagine a function that now chunks the items into groups of 20 before we print them. That is clearly a logical change, and also the hardest type to review. To review it properly, we need to check how the function is used to see if the change introduces a new bug. The problem is that GitHub, GitLab, and even “AI-powered code review” tools are bad at helping us understand the context of how a piece of code is used. In general, there is extremely little context on how a given piece of code is used.
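To make the example concrete, here is a hypothetical sketch of that kind of change: a print function that now chunks its input into groups of 20. Whether this breaks anything lives entirely at the call sites, which the diff never shows.

```go
package main

import "fmt"

// chunk splits items into groups of at most size elements.
func chunk(items []string, size int) [][]string {
	var out [][]string
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

// printItems previously printed every item in one pass; after the change it
// prints the items in chunks of 20. A hypothetical reconstruction of the
// kind of logical change discussed above.
func printItems(items []string) {
	for _, c := range chunk(items, 20) {
		fmt.Println(c) // one chunk per line instead of one big dump
	}
}

func main() {
	items := make([]string, 45)
	for i := range items {
		items[i] = fmt.Sprintf("item-%d", i)
	}
	printItems(items) // 45 items → three lines of 20, 20, and 5 items
}
```

If a caller depended on the old single-pass output format, this change is a logical bug, and nothing in the diff view alone would tell you.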
I don’t think the correct thing is to add a bunch of AI to code reviews, because it mostly bloats the pull request with things that make us less focused on what matters. My proposal for code review tools is to add a section that shows where the code is used, directly in the UI.

So I used v0 by Vercel to create a diff UI that adds simple context to the function, showing how and where the code is used. This makes it much more efficient to understand the impact of a change and whether it alters the logic, which is crucial since logical bugs are the hardest to find. This UI makes it easy to see where and how code is used, adding tons of context to the review with little hassle. I think this would be a great improvement for GitHub and would speed up the time we spend on code reviews. If you want to see and “feel” this UI for yourself, you can visit the website here: https://v0-code-review-beta.vercel.app/

The three types of code reviewers I’ve met:

- The person who just skims through the code and gives LGTM 👍. This is also the type of person you ask for a code review when you just need an approval from someone. An AI would give a better code review, even if the AI mostly gives feedback on misspellings and hallucinates reviews.
- The person in between, who looks for common issues in the code and gives feedback, but doesn’t give much feedback on the actual logic outside the function that you changed.
- The person who dedicates their life to code reviews, checks every new diff, how the code/function is used, and in what files it’s used.

Emil Privér 2 months ago

Improving My Health Improved My Code

I think something important in life is to find the thing we love to do—a passion that doesn’t depend on other people. It should be this one thing we can do for ourselves that drives us forward in life. I’m speaking about that one thing we can’t live without—such as reading books, painting, building stuff, cars, whatever. For me, this is software engineering.

I love software engineering and I have tremendous passion for it - I literally see myself being a software engineer until I die. I love everything about it: meetups, reading tech articles, debates, having problems that frustrate you until you solve them. I love learning new frameworks, languages, techniques, and ways to build systems. I enjoy going to meetups where you meet individuals who also love programming and debate topics even when I think they’re completely crazy and wrong. I love flashing my computer to Ubuntu, then trying Arch, Nix and PopOS, only to go back to Arch again. My fiancée bought me a shirt that says “In data we trust” - one of several tech-themed shirts I own. My Sunday mornings with coffee and my dog on my lap are pure gold. As you can probably tell, system development is my greatest passion, and even though I’ve been told that my job isn’t who I am, I still want to claim there are exceptions.

Six years ago, I embarked on a journey that took considerable time and that I’m still not finished with. The primary goal was not to lose weight - it was to become healthier and have more energy, and losing weight was the solution to my goal. At my peak, I weighed 143.7 kg and was 194 cm tall, with much of it being fat. I was constantly tired and never had energy to try anything new. I ate enormous amounts of food, candy, and soda - easily 3-4 kilos of candy per week. If I had gone to a doctor, they probably would have said I was on the verge of diabetes. It was bad, really bad. But what frustrated me wasn’t that I was overweight and ate so much junk food. It was that I was constantly exhausted.
I slept poorly and was constantly tired during the day, which in turn made me mentally fatigued when doing what I love - building systems. There wasn’t much energy left for innovation, and I couldn’t maintain focus - it was literally shit. The poor sleep also left me often stressed and wound up.

When COVID-19 began, I started working from home like many others. Unlike in some other countries, we in Sweden were allowed to go outside during restrictions, so I could go out for walks. I began taking lunch walks, which became the start of losing 40 kilos and beginning a much healthier life that includes exercise and better sleep.

Often when I talk about system development with people who don’t program themselves, I get the question “isn’t it a lot of math?”, and the answer is yes and no. The math most developers use isn’t the usual addition, subtraction, multiplication, and division, but the problem-solving you learn from math. There are absolutely those who use much more math than most, but usually it’s pretty basic.

When I solved problem-solving tasks in school over long periods, I was usually tired, and it was easy to see: the first problems were solved well, but the longer the time went on, the sloppier I became. Programming is somewhat similar - we do a lot of problem-solving. When I had a poor lifestyle, I lost focus quickly and became tired quickly. My solutions were poor and had little creativity.

Already during my transformation, but especially after, I noticed clear differences: I had significantly more energy when programming. My solutions became much better, more thoughtful, less stressed, and more elegant. My energy for learning more, exploring, and testing new software has increased tremendously. There’s much more love in the solutions that get built. But I didn’t just improve in the code I wrote.
I also started to improve at my work: I was more involved in conversations both with clients and within the company. I performed better at the companies I worked for and was rewarded with a higher salary. All because I improved my health.

Now, six years later, I still have some fat on my stomach. When I lost 40 kilos, I started going to the gym more and did a dirty bulk where I gained weight in both fat and muscle, which doesn’t bother me at all, because I’ve achieved one of my goals - a much healthier life that has given me much more energy for my passion. My favorite food is still sausage stroganoff and hamburgers. I’m more than happy to drink a Stigberget beer on Fridays with some chips. But the biggest difference now compared to then is that today I get much more physical activity - I walk more with my dog, play basketball, and go to the gym.

There are days when I’m not so keen on working out or walking, especially when it’s raining outside, but the most important thing is to be consistent. Motivation comes and goes, but consistency doesn’t, unless we choose to break it. Sometimes the best thing we can do to solve a problem is to take a 20-minute walk or go to the gym. Software engineering and physical activity are closely tied together, both are super important, and we need to take care of both parts.

Emil Privér 3 months ago

About AI

For the last 1.5 years, I have forced myself to work with and learn AI, mostly because the future of software engineering will inevitably have more AI within it. I’ve focused on optimizing my workflow to understand when AI is a genuinely useful tool versus when it’s a hindrance. Now, 1.5 years later, I feel confident enough to say I’ve learned enough about AI to have some opinions, which is why I’m writing this post.

AI has become a race between countries and companies, mostly due to status. The company that creates an AGI first will win and get the most status. The models provided by USA-based companies are heavy and require a lot of resources to operate, which is why we build data centers of GPUs in the USA and Norway. At the same time, China delivers models that are almost as good as Claude Opus but need a fraction of the resources to deliver a result.

There’s a strange disconnect in the industry. On one hand, GitHub claims that 20 million users are on Copilot, and Sundar Pichai says that over 25% of the code at Google is now written by AI. On the other hand, independent studies show that AI actually makes experienced developers slower. The common thread seems to be that companies selling AI solutions advocate for their efficiency, while independent sources tell a different story. It’s also incredibly difficult to measure AI’s true efficiency. Most metrics focus on whether we accept an AI’s suggestion, not on whether we accept that code, leave it unedited, and ship it to production, mostly because tracking that is a profoundly difficult task.

My experience lands somewhere in the middle. I’ve learned that AI is phenomenally good at helping me with all the “bullshit code”: refactoring, simple tasks that take two minutes to develop, or analyzing a piece of code. But for anything else, AI is mostly in my way, especially when developing something new.
The reason is that AI can lure you down a deep rabbit hole of bad abstractions that can take a significant amount of time to fix. I’ve learned that you must understand in detail how you want to solve a problem to even have a fair shot at AI helping you. When a task is more than just busywork, AI gets in the way. The many times I’ve let AI do most of the job, I’ve been left with more bugs and poorly considered details in the implementation. Programming is the type of work where there often is no obvious solution; we need to “feel” the code as we work with it to truly understand the problem.

When I work on something I’ve never worked on before, AI is a nice tool to get some direction. Sometimes this direction is really good, and sometimes it’s horrible and makes me lose a day or two of work. It’s a gamble, but the more experienced you become, the easier it is to know when you’re going in the wrong direction. But I am optimistic. I do think we can have a beautiful future where engineers and AI can work side-by-side together and create cool stuff.

I’ve used IDEs, chat interfaces, web interfaces like Lovable, and CLIs, and it’s with CLIs that I’ve gained the most value. So far, CLIs are the best way to work with AI because you have full control over the context, and you are forced to think through the solution before you hit enter. In contrast, IDEs often suggest code and make changes automatically, sometimes without my full awareness. In a way, CLIs keep me in the loop as an active participant, not a passive observer.

For everything I don’t like doing, AI is phenomenally good. Take design, for instance. I’ve used Lovable and Figma to generate beautiful UIs and then copied the HTML to implement in an Elixir dashboard, and the results have been stunning. I also use AI when I write articles to help with spelling and maintaining a clear narrative thread. It’s rare, but sometimes you get lucky and the AI handles a simple task for you perfectly.
There are a couple of really nice things about AI that I’ve learned. I previously mentioned getting rid of the boring bullshit stuff I do in my daily work; this bullshit stuff is about 5% of everything I do on a daily basis. For example, today I managed to one-shot a new feature using AI. The feature was adding search to one of our APIs using a Generalized Search Tree (GiST) index. I added the migration file with the CREATE INDEX, and the AI added query parameters to the HTTP handler and updated the database function to apply the search filter when a search parameter is provided.

But the thing where AI has provided the most value in my day-to-day work is that instead of bringing a design to a meeting, we can bring a proof of concept. This changes a lot, because developers can now easily understand what we want to build, and we can save tons of time on meetings. Building PoCs is also really good when we enter sales meetings, as a common hard problem in sales is understanding what the customer really wants. When we can bring a PoC to the meeting, it’s easier to get feedback faster, because the customer can try it and give direct feedback on something they can press on. This is an absolute game-changer.

Another really nice thing is when we have a design and we ask the AI to turn a Figma design into some React code: it can generate the boilerplate and some of the design. It doesn’t one-shot it, but it gives us some kind of start, which saves time. The reason AI works well with frontend is that the hard part about frontend is not writing the code from a design; it’s optimizing it for 7 billion people with different minds and thoughts, different devices, internet connections, and different disabilities.

I use research mode from time to time when I have some bug that is super hard to find. For instance, I switched database drivers in Go a while ago.
When I pushed to production, we got an error where the connection pool manager complained that the transaction ID already exists, so I rolled back. I did some quick DuckDuckGoing to find the issue but didn’t find it. So I asked Gemini to do research on the error, and a comment in some GitHub issue suggested changing the driver’s default query exec mode from cache statement to cache describe. By making this change, I solved the problem. I probably would have spent another 20 minutes debugging or searching for the solution; research mode pointed me in the correct direction, which helped me solve it faster. But it’s a gamble: sometimes what the AI suggests is the wrong direction and you lose some time.

Vibe coding, for those who don’t know what it is, is when you prompt your way to working software. You instruct an AI to build something and only look at the result, rather than using AI as a pair-programmer where you check the code you add. There is a major problem with this: AI is, frankly, terrible at writing code and introduces security problems. It also doesn’t understand the concept of optimizing code; it just generates a result.

For instance, I asked Claude in a project to fetch all books, then fetch the authors, and merge the books and authors into a schema structure where the book includes the author’s name. This is a simple task, but Claude preferred to fetch the author with a database query inside a loop instead of fetching all books first, then their authors, and merging them together in a loop. This created an N+1 query. If we have 10,000 books, this means 10,001 queries instead of two, which is not a smart way of fetching data: it drives up database load and cost because we issue so many round-trips.

Vibe coding will pass.
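The N+1 pattern described above, contrasted with the batched version, can be sketched with in-memory maps standing in for database tables (names are hypothetical; the counter tracks round-trips to the pretend database):

```go
package main

import "fmt"

// Hypothetical data model: books reference authors by ID. The maps below
// stand in for database tables, and queries counts round-trips.
type Book struct{ Title, AuthorID string }

var (
	books   = []Book{{"Dune", "a1"}, {"Hyperion", "a2"}, {"Ubik", "a1"}}
	authors = map[string]string{"a1": "Herbert", "a2": "Simmons"}
	queries int
)

func fetchBooks() []Book { queries++; return books }

func fetchAuthor(id string) string { queries++; return authors[id] } // one query per call

func fetchAuthors(ids []string) map[string]string { // one query for the whole batch
	queries++
	out := map[string]string{}
	for _, id := range ids {
		out[id] = authors[id]
	}
	return out
}

func main() {
	// N+1 pattern: one query for the books, then one per book for its author.
	queries = 0
	for _, b := range fetchBooks() {
		_ = fetchAuthor(b.AuthorID)
	}
	fmt.Println("N+1 queries:", queries) // 1 + len(books) = 4

	// Batched pattern: one query for books, one for all authors, merge in a loop.
	queries = 0
	bs := fetchBooks()
	ids := make([]string, 0, len(bs))
	for _, b := range bs {
		ids = append(ids, b.AuthorID)
	}
	byID := fetchAuthors(ids)
	for _, b := range bs {
		_ = byID[b.AuthorID] // attach the author name to the book schema here
	}
	fmt.Println("batched queries:", queries) // always 2, regardless of book count
}
```

The batched version stays at two queries whether there are 3 books or 10,000, which is the fix the post describes.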
I don’t think it will stay for much longer, mostly because we require developers to actually understand what we’re working with, and vibe coding doesn’t teach you; plus, vibe coding creates too much slop. Instead, it will be more important to learn how to work with context and instructions to the AI.

But there is one aspect of it that I think is both great and dangerous. When you’re a new developer, it’s really easy to struggle with programming, because it sometimes takes time to get feedback on what you’re working with. This is why Python or JavaScript are great languages to start with (even if JavaScript is a horrible language): it’s easy to get feedback on what you’re building. This is also why it’s great to start with frontend when you’re new, because the hardest part of building frontends is not the design, and you can see feedback in the UI for every change you make, which makes you feel more motivated and happy to continue programming. If you don’t get that feedback, it’s easy to end up feeling that you’re not getting anywhere, and you give up. With an AI, we can get a lot of help on the way to building somewhat more advanced stuff. When you’re a new developer, the most important thing isn’t to learn everything; it’s to keep going forward and building more stuff, as you will learn more over time. The only problem with vibe coding and learning is that it takes you longer to learn new things when you don’t do them yourself.

There are some services that claim non-technical people can build SaaS companies using their services, which of course is a lie. For the non-technical people I’ve talked to regarding this matter, a tool like Replit can be good when what they sell is not a SaaS offering: for example, they are a barber and need a website.
When non-technical founders try to vibe code a SaaS company using a prompt web interface, they often throw their money into a lake, because what they want to build doesn’t work and ends up with tons of bugs and wrongly built solutions: you need to understand software in order to write software. There is a saying in programming: the last 10% of the solution is 90% of the work. The last 10% is the details. How should we handle this volume of messages: an HTTP handler or a message queue? How should my system and the other system communicate? What is a product and what is an ingredient? This is the type of thing that non-technical people don’t understand, and the AI doesn’t understand it either; the AI just generates code.

AI is fundamentally the biggest tech debt generator I’ve worked with in my life. Tech debt, for those who don’t know the term, is the trade-off created when we develop software. Tech debt can create situations where the system is too expensive, or where it becomes too hard to get new features into production due to limitations in other parts of the system. Every company has some kind of tech debt; some have more and some have less. Limiting tech debt is easy: you only need to say no to stuff, or not write any code at all, or focus on building software in a standard and simple way. The simpler (but not stupid), the lower the amount of tech debt.

For example, the trade-off in choosing PostgreSQL over DynamoDB is that it requires more manual work to scale a PostgreSQL database, while DynamoDB scales automatically as a managed service. When we get traffic spikes, DynamoDB handles them better, but at the same time DynamoDB forces us to structure our data differently, and you don’t really know what you will pay for the usage. It also increases memory usage, as you need to fetch entire documents and not just specific fields, which is common for NoSQL databases.
The reason why AI is the biggest tech debt generator I’ve seen is that it leads us into rabbit holes that can be really hard to get out of. For instance, the AI might suggest storing data in big JSON blobs in a relational database for “flexibility”. Another problem is that AI loves complexity, and the real problem is that we accept that complexity. Complexity makes code harder to maintain and creates bugs. AI often prefers not to reuse code and instead writes the same logic again. Another issue with AI and tech debt is that AI creates tons of security vulnerabilities, as it still doesn’t understand security well, and many new developers don’t understand it either—for instance, how to prevent SQL injection. A general claim from companies who sell AI services is that developers become more efficient, and some developers think they are, mostly because the only thing they look at is the time spent writing code, not the part that comes after, such as debugging. For developers who use AI, part of the job shifts from writing code to debugging, code review, and fixing security vulnerabilities. More experienced developers also feel that they are being held back by AI-generated slop code because of how poorly the AI reasons, while more inexperienced developers feel the opposite. For instance, a while ago I added bulk endpoints to our existing API and asked AI to do the job. I thought I’d save some time, as most of it was boilerplate code and some copying: the task was to reuse the existing logic for creating, updating, and deleting objects. I prompted Claude to do updates and creations in parallel, so it created channels and goroutines to process the data concurrently. When I started to test this solution, I quite quickly saw some problems.
For example, the AI didn’t handle channels correctly—it gave the channel a fixed capacity, which created a situation where the code would hang because we kept trying to add more data to the channel than was being consumed. This was an easy fix, but I spent some time debugging it. When we added deletion of objects, the AI didn’t delete multiple IDs in a single query; instead it used goroutines and channels to send one delete SQL query per object, which was a much slower and more expensive solution. This was also fixed. And instead of logging the error and returning a human-readable message, the code returned the raw internal error to the client, which could create security vulnerabilities. After my attempt at offloading more to the AI, I stopped and went back to writing everything myself, as it generally took less time. It’s also a really weird way of measuring efficiency by how much code we accept, especially when most of the code won’t even hit production. But I guess they need to sell their software in some way. So with all these tools built for developers, I realized that the people who gain the most from them are not the developers—it’s all the people around who don’t write code. It’s easier for customers to show what they really want, we can enter sales meetings with a PoC which makes the selling part easier, and product owners can generate a PoC to show the developers how they think and get quicker feedback. Another good thing is when we can summarize text from different sources into one concise piece of text, which can save us from reading through multiple Slack channels, emails, and calendar entries. This saves us a lot of time as well. As a manager, AI is really nice for getting a summary of how everything is going at the company, what tasks everyone is working on, and the status of those tasks, instead of having refinement meetings to get status updates.
There are probably multiple things we do on a daily basis that AI can help us with to prevent needless meetings and organizing, removing the need for many managers. The biggest gain we get so far is on all the easy tasks we repeat daily. This is a new part of this article, added 7 August 2025. About 40% of NVIDIA’s revenue comes from 5 companies (Microsoft, Amazon, Meta, Google, and Tesla), which means that if one of these companies—for example, Amazon—decides to cut its AI spending in half, NVIDIA would lose half of the revenue it gets from Amazon. This would adjust the stock price, which would later affect the other big tech companies. Most of these AI companies are not even profitable. NVIDIA is 7.74% of the index value of the S&P 500 according to https://slickcharts.com/sp500 . The biggest tech companies represent about 35% of the US stock market, and NVIDIA is 19% of them. The difference between NVIDIA and other companies selling hardware is that other companies have a broader offering. If one customer of NVIDIA says they are not interested in buying more GPUs, there’s a big chance that more companies will follow (like a domino effect), and NVIDIA would have little to fall back on. The chance of this happening is quite high, as these big tech companies’ AI ventures are not profitable and they are all pressured to show returns on their AI investments. Imagine the cost this would have for the individual investor and our savings accounts. One problem for NVIDIA is that they need to sell more GPUs every quarter in order to keep revenue growing. This fragility extends beyond NVIDIA to the entire AI service ecosystem. Most companies that sell services on top of LLMs price them at costs that aren’t sustainable long-term. Cursor is a good example—their rapid growth came from a business plan that didn’t generate profit, which means they needed to heavily restrict model usage and add rate limits to avoid losing money on customer usage.
The irony is that the biggest income will come from the infrastructure to run LLMs and AI, not the actual LLMs themselves. The reason I added this is that the AI market is built on hyperscalers and has created a bubble more dangerous than the dot-com era. I think we will have a bright future with AI and that AI can help us in so many ways—for example, removing stress from our day-to-day tasks that might not even be related to jobs, such as repetitive tasks we do with our family. If an AI can replace these repeated tasks, I could spend more time with my fiancé, family, friends, and dog, which is awesome, and I am looking forward to that. It is also sad how companies use AI as an excuse to fire people to “optimize” while the real problem is the company structure, the ways of working, and the type of people working at the company.

Emil Privér 5 months ago

We Should All Code Like Steve Jobs

I really don’t like Test-Driven Development (TDD), Domain-Driven Design (DDD), the Clean Code philosophy, or most of the methodologies that tell us how we should write code. This is mostly because they never seem to work in codebases designed to last for years. And, even though the title includes the name of Apple’s creator, I’m not really a fan of Apple. As a matter of fact, I’m the only one in my big family who moved away from Apple entirely and now uses Linux and Android. This is mostly because it brings more peace to my everyday work, but also because I genuinely don’t understand the appeal of an iPhone when Apple’s latest big innovation was adding 0.8 opacity to all UI elements. Anyway, the reason I titled this article “We Should All Code Like Steve Jobs” isn’t because of any code he wrote; it’s about his preference for simplicity and ease of understanding. When the iPod was released, you could look at it and, with minimal mental effort, figure out how to play a song. For me, it’s the same with writing code: you should be able to look at a piece of code and quickly grasp what it’s doing. The code should be simple, not smart. Of course, this doesn’t mean sacrificing essential functionality or creating naive solutions in the name of simplicity. The goal isn’t to make code ‘stupid’ by oversimplifying; that would indeed be counterproductive stupidity. In my experience with large Domain-Driven Design projects, or any project that rigidly follows a set of “rules,” adding new features often becomes a struggle. The effort to get them into production increases because the existing structure, with its self-imposed limits, forces extensive refactoring instead of allowing us to directly solve the problem at hand. This complexity often arises from trying to make one domain serve the diverse needs of all other domains. Since each domain has different requirements, this approach frequently leads to convoluted systems and unnecessary overhead. 
Methodologies like Clean Code and DDD often result in layers of abstraction calling other abstractions, when sometimes, duplicating a bit of code and moving on would be far more practical. A common reason I see for developers wanting to restart a project is that they’ve locked themselves into so many rules that the project becomes overly complex. Instead of fixing this inherent complexity, they opt for a restart. Ironically, this often leads to a similarly architected project, especially if the same developers are involved—it’s often said that the same developer rebuilding a project is likely to make many of the same choices, leading to a similar outcome. Domain-Driven Design is just one of the philosophies I find problematic to work with; I could go on about this topic forever 😊 What I believe is most important about software code, and how we should measure its quality, includes: If any of these four points aren’t being met, it’s a strong indicator that the codebase might be veering away from the simplicity I advocate. It could mean that some refactoring is in order—perhaps to choose clearer function or variable names. Or, it might simply mean it’s time to step away, clear your head, and return to the problem with a fresh perspective, ensuring the solution remains straightforward and understandable. When I worked at CarbonCloud , I learned a valuable lesson about this from an old co-worker I asked for feedback. Essentially, the feedback was that instead of solving the higher-level problem first, I had jumped straight into the code and built something. This is something I now consciously consider before starting larger projects or features. While tiny, straightforward tasks might not need extensive upfront thinking, something like the bulk API I am currently building certainly did. It required careful consideration of what we wanted to solve and how we wanted to solve it before any code was written. 
By focusing on the higher problem first, I managed to deliver a much simpler solution that works great. It isn’t a difficult concept, but it’s one that significantly improves the product: solve the actual problem before writing any code. Sometimes this is super obvious, and sometimes you need to talk to your teammates before you write any code. Sometimes, we jump into coding and ship something we think solves the problem, rather than what actually solves it, simply because we haven’t fully understood the requirements. A good example of this is when you need to add a new queue for asynchronous message handling. One way is to introduce a new system to the stack, like AWS SQS or Kafka. Another way is to add a new table to the existing database and create a worker that polls for jobs every few seconds. One approach is more complicated and potentially requires more people to maintain; the other is simpler and might not. By focusing on the higher-level problem, you might realize that: However, if speed were a critical requirement, then using the existing database might indeed lead to a suboptimal or ‘stupid’ solution because it wouldn’t meet those specific needs. What I find interesting is that the developers I personally look up to, who are all far more experienced than I am, almost always say the same thing when I ask for feedback on my systems: keep it simple, and don’t add any more than what is truly necessary. This is also why I would recommend a new company build a monolith if its developers aren’t deeply familiar with distributed systems. It’s generally easier to run and scale a monolith initially, and they can grow with it for a long time before needing to consider more complex architectures. Systems built simply (not naively simple, but thoughtfully simple) are often the ones that work the best and last the longest. 
I decided to write this article after I’ve worked in enough codebases where developers’ opinions on what “clean code” truly means have influenced projects far too much. Hopefully, this article helps you understand my firm belief: many so-called “clean code” ideas are concepts we might have been better off without. I genuinely love simple, easy-to-understand code that isn’t bogged down by a bunch of rules. The projects I’ve maintained that have remained in use for the longest time are precisely these types of projects. Conversely, projects that rigidly adhere to some “clean code” philosophy have often required complete rewrites. This frequently happens because developers’ personal preferences for these abstract principles take precedence over the primary goal of shipping a working solution. Their pursuit of an idealized “clean code” state can complicate development, leading to extensive refactoring that, in turn, introduces bugs and devilish complexity. To be clear, I don’t want to dunk on developers who like Clean Code, Domain-Driven Design, or any similar methodologies. I do appreciate developers who stand by their convictions and keep pushing for them. I just firmly believe these methodologies often work better in theory than in practice. Anyway, if you liked this article and want to follow me on X, you can find me at https://x.com/emil_priver

Emil Privér 7 months ago

You Have Imposter Syndrome? Good for You!

You might read this article and think that having anxiety about not being good enough is, in itself, good enough. And no, it probably isn’t. But the reason why you have that anxiety can be a good thing. The idea behind this post is to discuss something I’ve spoken about with a lot of people: junior developers, senior developers, non-software engineers—many people. Imposter Syndrome. Imposter Syndrome is this thing where we think our skills are not good enough. It can happen in many situations, such as when we switch jobs or join a team with more experienced developers. It’s part of life to experience imposter syndrome, and it’s actually really good that you get it; it’s a sign that you’re doing something good for yourself as you’re evolving. Imposter syndrome only means that you’re putting yourself out of your comfort zone and that you’re evolving as a person. Unfortunately, this feeling can make us doubt ourselves, and that sure does suck. I’ve had imposter syndrome plenty of times, and it happens less and less the longer I’ve been an engineer. I’ve thought about this topic every now and then and how big a part of life and our development it is. Iteration is everywhere and happens every day. It’s simply a big feedback loop. When we’re young, we do assignments, give the teacher the results, and get feedback on why we didn’t get the highest grade. We might decide one day to try something new in the kitchen, so we mix a stock cube and salt in a sauce and realize that both the stock cube and the salt add salt to the dish, so the dish becomes too salty, and we learn. We build a new feature and release it, it breaks and kills production, and then we learn that if we don’t want to lock the database while we create a new index, we need to create it concurrently. So we learn. And this is how your experience as a junior engineer should and probably is. You build something, push it to a PR, and get feedback that something might not work, and you learn and evolve. 
The reason why junior roles exist in the world is that in order to be good at a job, you need to learn the job. For example, the role essentially signifies that you’re a person with less experience in the field. That’s why there are fewer expectations on a junior than on a senior, which is something I think is really important to understand. I hope this doesn’t sound too harsh, but no one expects you to deliver top-notch results, except yourself. I started as a frontend developer at Rivercode, an agency building (at the time) mostly websites and e-commerce sites. I remember when I did the interviews, especially the technical part of it. The goal of the interview was to design and build a simple frontend that talked to the Unsplash API. I went home, built it, and sent it to my former boss, and went to the technical interview with a lot of confidence. And the reason why I went there with a lot of confidence was that I knew exactly what I was talking about because it was in my comfort zone. I’d built a million React frontends using create-react-app and similar APIs, such as the Star Wars API. Piece of cake. But then the real world got to me, and I started as a junior frontend developer and was assigned a WordPress site where my goal was to add all the PHP code needed to an existing template. I needed to understand PHP, SQL, MariaDB, and FTP, and I just realized that I knew nothing, and I got imposter syndrome, and I did all the things I shouldn’t do. It took a week or two before my boss at the time asked how it was going and realized that I hadn’t gotten far. He helped me to move forward, and I realized that I should have asked for help way earlier. What I realized after that project and every new project I worked on was that I got the exact same feeling every time I did something I had never done before, and every time I finished that project, I felt more confident as I knew way more stuff. A while later, I switched jobs, and imposter syndrome hit me again. 
I was building in Go, which I had only done in hobby projects at the time, but the key difference between this time and the first time was that I asked for help way faster. I then realized that every time I had this feeling, I evolved, so I started to use this feeling as something positive and accepted that this feeling of imposter syndrome happens and it’s just a part of the process of learning. So why did I tell you about iterations and my first 4 years as a developer? The idea was to tell you that imposter syndrome is something we need to accept and learn to work with because it happens everywhere, at any time, and to anyone. It happened to me in my first game of basketball, and it happens only because of the pressure you put on yourself, really. So the reason why imposter syndrome is a sign of a good thing is that you’re probably outside your comfort zone, and it’s outside of the comfort zone where you learn the most. So what we need to do is change our mindset when it hits, and what I usually change my mindset into is: Good software is built on iterations in pull requests, bug & crash reports, production crashes, and data on how it performs. So there is no need for you to expect that you need to deliver a perfect, top-notch solution; you will get there eventually. Take tiny steps, tiny iterations. It’s really easy to get stuck on the same problem for a long time and not get anywhere. This is what happened when I built the first WordPress site. I was stuck and didn’t get anywhere until my boss helped me. So asking for help can make you take another step, and yes, this even includes asking AI for its opinion. The only important thing with asking AI is to take its response with a grain of salt and not copy any code, but rather understand it. The only person you are battling is yourself. The reason why you have these thoughts that you’ve taken on more than you can handle is that you have set high requirements for yourself. 
And I can assure you that a project built alone works way worse than a project built with multiple people, so use your team members. You’re not hired to know everything, especially not if you’re a junior developer. Your job as a software engineer is to get a problem, understand it, work on a solution, build the solution, and ship the solution. The part where you learn the most is while you’re understanding the problem and working on a solution. Also, tech changes a lot. In 2 years, we will have new ways of working with certain technologies. Just look at how Next.js has evolved from only server-side rendering to building static pages, and back to rendering on the server side but with streaming HTML to the client. See it from the bright side: you’re evolving. It’s a good thing. I hope this post might help someone and maybe teach you that imposter syndrome is something we need to accept and work with, and it’s really not a bad sign that we get it; the feeling itself just sucks. I wrote this article head-banging to Skrillex’s new album while preparing to head to the basketball court, which means it’s about time for me to end this article.

Emil Privér 11 months ago

It's Time to Move on From NoSQL

There are a few things and directions we developers have taken that I think were probably not good things, such as NoSQL or thinking that edge runtime will make our websites faster. Back in 2019 and probably even earlier, many developers looked into Lambda functions because they solved the problems of unexpected traffic spikes for our system - and don’t get me wrong, they probably did. But they didn’t solve the problem of our wallets not printing money on demand, but that’s a different story. When we started to use Lambda functions even more, we realized that if we use Lambda functions which we don’t have a lot of control over, we also need to have a database which can scale up automatically when our traffic scales up too. At the time, the traditional way to use a database was to spin up a MariaDB, MySQL, or Postgres database and set up a connection between the application and the database. But the problem when you use Lambda and a normal relational database is that it’s hard to control how many connections you spin up. It’s possible for the Lambda scheduler to spin up 100 functions which all need connections to the database, and we might not have control over when these connections spin up, so the database gets too much work and probably dies. When we realized this, we probably also realized that we need something which can handle unexpected connections, which became NoSQL. NoSQL worked as an alternative due to its design. NoSQL is essentially an object-storage system that comes with an interface so we can more easily query the data, but it has no rules or structure for the data. It simply stores it and lets you do basic operations on the data such as where, sum, and count. But there is one trade-off - due to its design with files in object storage, you are now forced to do many queries instead of one to fetch all the data you need, and you pay per request you make, and there is no way to control which data you get back; you simply fetch all of the data. 
The idea of spinning up a database with no requirements on our data and using it as our main database is, for me, an interesting and weird move. Scale in software engineering refers to a system’s ability to handle growing demands while maintaining performance and reliability. If you look at NoSQL databases, they indeed handle increased traffic very well; however, there is more to scale than traffic. I also count the ability to grow over time, in terms of database size and how easy the data is to maintain, as part of scaling, and this is where NoSQL doesn’t scale. I generally believe that in programming, working more strictly makes it easier to maintain software over time, such as using a strictly typed language like Rust. A strictly designed relational database can grow more easily with you over time because you design a schema which your data follows. This is not the case with NoSQL because there is no such thing as a schema in a NoSQL database; sure, the application might have a schema, but not the database. If your product manager comes to you and tells you that the requirements have changed for the application, it can be really hard to make a good change with a NoSQL database because you might have a big migration job in front of you. If you simply change a type in the application without changing the data in all documents, you might request a string and get back an int, which creates issues. If you decide to make a big migration job, you will probably need to pull a lot of data and then write it again, which can be pricey, compared to a SQL database where you can change the schema and tell the database how to make a migration if it’s needed. And this is my biggest problem with NoSQL: the amount of work you need to do if you want to maintain a NoSQL database over a period of time and keep it in a good state, especially when the amount of data you store increases.
A NoSQL database might solve your needs, but the problem isn’t when things are working fine - it’s when things are not working fine, as it’s much easier to break the good state of a system where there are no rules to follow. Another thing with NoSQL is that we can’t treat a NoSQL database as a relational database, which is obvious. But what I mean is that it will be more expensive if you treat your NoSQL database as a relational one by splitting the collections (tables) into many and then making many queries to get all the data you need. This is why you need to think about how you build your data model so it’s easier to fetch the data you need. For instance, if you want to build an API which shows products and want the category information to be on each product’s model, it is probably better to store the category data on every product and, when a category changes, update all products with that category, rather than updating a single category in a separate category collection, because it reduces the number of documents we need to read to get the data we need. However, it’s a massive job to update all those products when the category changes. And this is one of the biggest differences I realized when I was working with NoSQL: with NoSQL, you need to prepare the data before you make the request to keep it in a good state and prevent unnecessary costs for your database. Now, this is probably largely a matter of preference in how we work, but I think we shouldn’t have adopted NoSQL just to handle traffic in the first place; the problem has always been the tech stack. But I definitely also think we shouldn’t use NoSQL as our main DB.
Either we should have used it as a persistent cache in front of our relational DB for the services we expect to be under heavy load, or we should have treated it as a secondary database for data we can’t describe a schema for: we simply dump it somewhere, and a data engineer or someone else can use that “dumped” data to do their job. But I also think we should have looked into other options which could be good candidates in the serverless world of Lambda functions, such as LibSQL. LibSQL is a SQLite fork by the company named Turso. It gives us an HTTP interface we can use to query the data from a Lambda function. Something I didn’t mention earlier is that as SSDs have gotten faster, NoSQL databases have improved in speed, and so has LibSQL. LibSQL allows us to achieve the same kind of speed and scale as NoSQL, but it also comes with requirements on our data.

Emil Privér 1 years ago

Monolith Is the Heaven, Right?

How many times have we, as developers, discussed the Monolith vs Microservices debate, and every time we talk about it, the conversation revolves around how much easier it is to host a monolith and how developers often over-engineer today? Believe it or not, I totally agree with you. Monoliths are indeed easier to host. Monoliths are cheaper to host. Microservices increase the complexity of the system. However, companies such as DoorDash still adopt microservices. Why? Because the problem isn’t about hosting or cost, or that the system is over-engineered and becomes more complicated. The issue is that developers have differing opinions, and this prevents organizations from shipping because we create blockers for each other. When microservices were introduced, one of the selling points was that they allow us to scale apps independently depending on their traffic, and that we can build an application in more than one language. Both of these arguments made sense in 2012–2017. And then, a few years ago, COVID happened, and hardware skyrocketed to a new level of speed and stability. Today’s chips are so fast that the language can be a bottleneck. Some time ago, the problem was often that the CPUs weren’t fast enough to process the code; we could write slow, inefficient code, deploy it, and still be okay with it. Today, the problem is often that we write slow, inefficient code that doesn’t utilize the full capacity of the server. With the improvements in hardware that we have today, we’ve finally seen that running a monolith isn’t as difficult as it was earlier, thanks to how much better the CPUs are. Just look at DHH, who walked away from the cloud and went into self-hosting a monolith on bare-metal servers, and how well it works for them. Microservices are considered harder to host because when we move out of the context of a single service, we drastically increase the number of things we need to think about.
We’re moving from calling a function to either making an HTTP call or creating a new event in the system. Just imagine that you’re cutting down a tree. When the context is just you, it’s easier because you don’t need to communicate with anyone. But when another person joins you, you need to communicate that you’re cutting the tree, and this is similar to what happens between services. For example, what happens if one service returns a 500 error or the human doesn’t respond when you yell “TIMBER”? But I do understand why we often say that microservices are harder to host, and it’s not just because of what I said in the previous sentence. I started my career as a system engineer back in 2019, and I wanted to learn more about the history of system engineering, mainly because I thought that I could learn some valuable lessons from the “older” days, and I did. I read many horror stories that took the term “microservices” a bit too far, creating tiny services that did very little and then creating their own Function-as-a-Service type of architecture, where they treated different services as functions by switching the code from invoking a function into an HTTP call, and then creating a chain of microservices calling each other just because the type of product differed between two services. And this is where I think we went terribly wrong with microservices. When it comes to developers, we all have different ways of thinking about systems and code. Some ideas are better than others, but it’s often the details where the most issues arise for developers. For example, do you return a pointer or not return a pointer in Go? How do you define clean code and how do you write it? Ask these questions to a group of developers where the number of developers is more than 10. Which answer is correct? 
You might understand that everyone has different opinions on the subject, which is why you would get a lot of different answers, and you might question whether these developers are really smart or not. In a real-world example, we also encounter situations where we want to build the same solution, but we think differently, which is normal. We are humans, not robots. When we build big monoliths, it’s quite easy to get stuck in a feedback loop where a group of developers can’t decide on how to build the solution, creating blockers because we spend more time discussing than writing code. This easily happens when there are more than 10–20 developers working on the same monolith, especially if we’re stubborn. So when this happens and management notices it, they probably think about creating smaller teams. Fewer developers equals fewer opinions. More code shipped, problem solved. But that’s not how it works if we don’t create good boundaries between the teams. What we do is start to split the monolith into parts, so each team owns different parts. But (always a but) the problem now is that we start to write different types of code. Some teams want to change the database because it makes sense to them, but it doesn’t make sense to the other teams, and we’re back to square one, where we now have discussions and less shipping again. So at this point, it’s time to split the service into apps in a microservices architecture, so we don’t encounter these problems again. And I haven’t even started on the horror stories about the hours of testing that some monoliths require due to the sheer amount of code they contain. So, microservices are not just about technical architecture, but also about organizational structure. Teams can now ship independently without blocking each other as often as they would in a monolith.
Discussions become more about the communication between teams, such as API design, and not really about how the code looks or how it operates. So for many companies, the trade-offs of microservices are better than staying with a monolith, even if microservices are more expensive to host and harder to run. After hearing all these horror stories, I would still start a project with a microservices architecture in mind, mainly because if we have more developers in the future, it will be easier. However, I would relax a lot on the “splitting the service” part. What I mean is that the microservices architecture would host not big monoliths, but services whose requirements cover more than just a few things.

Let me give you an example. A while ago, a couple of friends and I had a hobby project where the idea was to build something that integrated with Discord, Twitch, YouTube, and so on, and then sent data to different places; imagine it as an analytics service. When we started, we created three services, all named after Super Mario Bros characters: Nabbit (holding all the content data), Magikoopa (auth and users), and Toadsworth (listening to events and sending data on to other systems). After a while, though, we asked ourselves: “Why are Nabbit and Magikoopa split?” We had thought it made sense because they did different things, but we realized this was not the case. So we merged them, which left us running only two services, and running Nabbit and Toadsworth as two microservices just made life so much easier for us. The problem we have now solved is ownership: two of us (we’re four friends developing this) own Nabbit, and two own Toadsworth. We have clear boundaries between each of the “teams,” and we ship fast. All system requirements are different, and all of this was just my thoughts on the subject.
I’ve worked on big monoliths, over-engineered microservices, a mix of both, and entire services running in cloud functions. Anyway, have a good one. I talk about this kind of thing every now and then on X and Bluesky. Follow me! :D

Emil Privér 1 year ago

The On-Call Driven Development

I’ve been thinking about this topic for a while now. On-call duty is something we rely on, but also dislike. Many businesses believe it’s necessary for reliability in case something goes wrong. My issue isn’t with having someone available to respond to alerts, which is always beneficial. My concern is with how on-call duty affects us engineers. At one of the companies I worked for, I was on-call every six weeks, and sometimes for a few days in between those six-week periods. The extra pay was nice, and we didn’t have much to do while on-call, thanks to our automated systems and well-designed processes. The requirements for on-call developers at that company were clear: if something went down, our job was to get it up and running again, either by restarting a service or applying one of the scripts from the “run-this-scripts-if-prod-is-down” repository. However, we weren’t expected to start debugging to find the root cause of the issue; that was left to the team owning the service on the next working day. I thought this approach was reasonable. My problem with on-call duty isn’t the concept itself, but rather how we engineers use it, often due to our own laziness or because we’re put in situations that encourage laziness. This is my main pain point: the mindset that develops when we know we have an on-call team with a “fire extinguisher” ready to go when production goes down. The issue is that we rely on the on-call team as a safety net to ensure the services we’re responsible for stay up and running. When we have on-call as a safety net, we might build worse systems, because less care is taken when we know someone has our back. We should write software so that when it’s time to go to bed, we can feel confident it will continue to run. This includes testing solutions in different ways, such as stress tests and integration tests, and questioning the system, rather than just rushing to get it out there.
There are many reasons why rushing out a change happens, such as bad management creating hard deadlines that put developers under stress, or a bad overall structure of the company. Many companies use sprints, SAFe, or other methodologies that can be misused by bad managers to control developers, where the deadlines are short. At the job I spoke about earlier, we had some really good fundamentals about how to operate and how to think. We had on-call, but on-call wasn’t seen as a safety net that we could rely on. Instead, management enforced a culture where it was better to take one more week to work on something rather than shipping it quickly. This allowed us to take the time to think things through thoroughly and test them out before releasing. As a result of this way of working, we had an extremely well-working solution, and new developers to the company were surprised that we never did any fire-fighting. I’ve come up with the term “On-Call driven development” to describe how the presence of on-call support can shape the way developers design and build systems. Essentially, it’s about how having a safety net can influence the level of caution and thoroughness developers bring to their work. To illustrate this, consider rock climbing with a safety rope. With the rope in place, you might feel more at ease taking risks and skipping double-checks, knowing that the rope will catch you if something goes wrong. Similarly, in software development, on-call support can create a similar mindset, where developers might be less inclined to thoroughly test and validate their code, knowing that the on-call team will handle any issues that arise. This approach can lead to a culture where developers prioritize rapid feature deployment over building robust and reliable systems, knowing that someone else will be responsible for addressing any problems that come up. My biggest problem is really not that on-call exists or that we use it, and I totally understand why we use it. 
My issue is more that we have a team existing to make sure our system works. I think that some systems could improve how they work by removing on-call when it’s not needed. This one was brief, as I struggled to articulate it without expanding the topic too far beyond its original scope. By the way, I’m on Bluesky - you can find me at https://bsky.app/profile/priver.dev .

Emil Privér 1 year ago

Implementing V7 UUID in Postgres

Everyone likes fast Postgres databases, and so do I. Something developers have been talking about recently is the use of UUID v7 in Postgres databases, because it is quicker to search on. I wanted to use v7 as IDs for the service I built, but I also didn’t want to generate the UUID in the application layer, as I think it’s really nice to generate it in SQL. This article shows a quick example of how I implemented it for my services, as Postgres doesn’t support v7 natively yet. If you are unfamiliar with the differences between the various UUID versions, here is a quick overview: UUID versions 1, 6, and 7 are generated using a timestamp, a monotonic counter, and a MAC address. Version 2 is specifically for security IDs. Version 3 is created from MD5 hashes of given data. Version 4 is generated from completely random data. Version 5 is generated from SHA1 hashes of provided data. Version 8 is completely customizable. For most developers, version 4 is sufficient and performs well. However, if you plan to use UUIDs for sorting purposes, you may experience slower sorting queries due to the randomness of the data; in that case, version 7 is preferred for faster queries.

A disclaimer: I did not write this function myself; I found it in a GitHub thread. The function builds on the existing v4 implementation, gen_random_uuid(). We obtain the current time, extract the epoch time in milliseconds (as v7 uses milliseconds), and convert the millisecond timestamp to a byte sequence. To incorporate the timestamp byte sequence into the UUID, we use overlay to replace the first part of the UUID with the byte sequence. Additionally, we need to set the version of the UUID by changing the 52nd and 53rd bits of the byte array with set_bit: we simply set both bits to 1 to indicate version 7. Finally, we use encode to convert it back to a uuid. As you may have noticed, it generates a v7 UUID based on v4, which also explains why it is a bit slower at generating a v7.
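For reference, the widely shared implementation from that GitHub thread looks roughly like this (reproduced from memory, so treat it as a sketch rather than the exact function from the post):

```sql
create or replace function uuid_generate_v7()
returns uuid
as $$
begin
  -- Start from a v4 UUID, then overwrite the first 48 bits with the
  -- current Unix timestamp in milliseconds, and flip the version bits to 7.
  return encode(
    set_bit(
      set_bit(
        overlay(
          uuid_send(gen_random_uuid())
          placing substring(
            int8send(floor(extract(epoch from clock_timestamp()) * 1000)::bigint)
            from 3)
          from 1 for 6
        ),
        52, 1),
      53, 1),
    'hex')::uuid;
end
$$ language plpgsql volatile;
```

After creating the function, `select uuid_generate_v7();` can be used anywhere a default value or ID is needed, e.g. `id uuid default uuid_generate_v7()`.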
The most interesting part is using the v7 UUID. I ran a super simple test just to see if it’s faster, using EXPLAIN ANALYZE in Postgres to see how long a query takes. I created 2 new tables, each with 1 column of type UUID, inserted 1 million rows into each table with the respective UUID version, and queried them with a simple sort. With this test, we can see that v7 is 13.44% faster (42.042 ms). I also ran a quick test of generating v7 values. I hope you enjoyed this article. If you have any suggestions for changes, please let me know. You can reach me at X . Have a great day!

Emil Privér 1 year ago

Why I Like Ocaml

According to my LinkedIn profile, I have been writing code professionally for almost 6 years. During this time, I have worked on PHP and WordPress projects, built e-commerce websites using NextJS and JavaScript, written small backends in Python with Django/Flask/FastAPI, and developed fintech systems in Go, among other things. I have come to realize that I value a good type system and prefer writing code in a more functional way rather than using object-oriented programming. For example, in Go, I prefer passing in arguments rather than creating a method. This is why I will be discussing OCaml in this article. If you are not familiar with OCaml or need a brief overview, I recommend reading my post OCaml introduction before continuing; it will help you better understand the topic I am discussing. Almost every time I ask someone what they like about OCaml, they say something like “oh, the type system is really nice” or “I really like the Hindley-Milner type system.” New OCaml developers often add “this type system is really nice; TypeScript’s type system is actually quite garbage.” I am not surprised that people say this, as I agree 100%. I really enjoy the Hindley-Milner type system, and it is the biggest reason I write in this language. A good type system can make a huge difference to your developer experience. For those who may not be familiar with it, Hindley-Milner can be described as a system where you write a program with strict types, but you are not required to explicitly state the types. Instead, each type is inferred from how the variable is used. Let’s look at some code to demonstrate what I mean.
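As an illustration, here is a small sketch of my own (the function names are mine, not the snippets from the original post):

```ocaml
(* No annotations needed: types are inferred from usage. *)

(* (^) concatenates strings, so [name] is inferred as a string.
   Inferred signature: val greet : string -> string *)
let greet name = "Hello, " ^ name

(* (+) works only on ints, so: val add : int -> int -> int *)
let add x y = x + y

(* Annotations are still allowed when you want them: *)
let double (x : int) : int = x * 2

let () =
  print_endline (greet "Emil");
  Printf.printf "%d\n" (add 2 3);
  Printf.printf "%d\n" (double 21)
```

Running this prints the greeting followed by 5 and 42; the compiler checked every type without a single annotation in the first two functions.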
In Go, you would be required to define the type of the arguments. In OCaml, you don’t need to specify the type: if a function passes its argument to something that expects a string, the argument is inferred as a string, and the function’s signature reflects that. It’s not just arguments; the same applies to return values. A function that tries to return a string in one branch and an integer in another will not compile. I also want to describe a larger example of the Hindley-Milner type system: a module that exposes 3 functions, make, print_car_age, and print_car_name, along with a type for the car itself. One thing to note is that the type only needs to be defined once; OCaml infers it within the functions, since the type is in scope for the whole module. There is an OCaml playground for this code. Something important to note before concluding this section is that you can still annotate both the argument types and the return type of a function if you want to. The next topic is pattern matching. I really enjoy pattern matching in programming languages. I have written a lot of Rust, and pattern matching is something I use constantly when I write Rust. Rich pattern matching is beneficial as it eliminates the need for many if statements. Additionally, in OCaml, you are required to handle every case of a match statement. For example, when matching on a name, I am required to include a final wildcard case because we have not handled every possible value: what should the compiler do if the name is Adam? That example is very simple. We can also match on an integer and perform different actions based on its value; for instance, we can determine whether someone is allowed to enter the party using pattern matching. OCaml playground But the reason I mention variants in this section is that variants and pattern matching go quite nicely hand in hand. A variant is like an enumeration with more features, and I will show you what I mean.
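To make the matching examples concrete, here is a sketch in my own words (the names and the age rule are assumptions for illustration):

```ocaml
(* Matching on a string: the compiler demands the wildcard case,
   otherwise it asks what to do when the name is, say, "Adam". *)
let greeting name =
  match name with
  | "Emil" -> "Hey Emil!"
  | _ -> "Hello there!"

(* Matching on an integer: who is allowed into the party? *)
let allowed_to_enter age =
  match age with
  | a when a >= 18 -> true
  | _ -> false

let () =
  print_endline (greeting "Adam");
  Printf.printf "%b\n" (allowed_to_enter 20)
```

Deleting the `_` branch from `greeting` turns the missing case into a compile-time warning about non-exhaustive matching, which is exactly the safety net described above.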
We can use them as a basic enumeration and then do different things depending on which constructor we get. But I did mention that variants are similar to enumerations with additional features: you can attach a payload type to a constructor. Once we add types to our variants, we can adjust our pattern matching to bind those payloads. OCaml Playground We can now assign a value to a variant and use it in pattern matching to print different values. I am not forced to add a value to every variant; if a constructor doesn’t need a payload, I simply don’t add one. I often use variants, such as in DBCaml, where I use variants to describe responses from a database. For example, I return a dedicated constructor if I did not receive any rows back but there was no error either. OCaml also comes with exhaustiveness checking, meaning that if we don’t handle each case in a pattern match, we get an error: if we forget to add a case, OCaml will reject the program at compile time. The next topic is operators, specifically binding operators. OCaml has more kinds of operators, but binding operators are something I use in every project. A binding operator can be described as something that extends how let works in OCaml by adding extra logic before binding the value to a name. A plain let simply takes a value such as “Emil” and assigns it to a variable. We can extend this with a binding operator: for instance, if we don’t want to write a match statement on every function’s return value, we can define a binding that checks the value and, if it is an Error, bubbles the error up. This allows me to reduce the amount of code I write while maintaining the same functionality: if one of the bound values is an Error, the binding returns the error instead of continuing on to produce the first name and last name.
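A sketch of such a binding operator, with names of my own (the original post’s code is not reproduced here):

```ocaml
(* ( let* ) extends plain [let]: if the value is an Error, the whole
   expression short-circuits and returns that error. *)
let ( let* ) res f = Result.bind res f

let full_name first_name last_name =
  let* first = first_name in
  let* last = last_name in   (* an Error bubbles up from here *)
  Ok (first ^ " " ^ last)

let () =
  (match full_name (Ok "Emil") (Ok "Priver") with
   | Ok name -> print_endline name
   | Error e -> print_endline ("error: " ^ e));
  (match full_name (Ok "Emil") (Error "missing last name") with
   | Ok name -> print_endline name
   | Error e -> print_endline ("error: " ^ e))
```

The second call never reaches the `Ok` branch: the `let*` on the error value returns the `Error` immediately, without any nested match statements.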
I really like the concepts of functional programming, such as immutability and avoiding side effects as much as possible. However, I believe that a purely functional language can force us to write code in a way that becomes too complex, and this is where I think OCaml does a good job. OCaml is clearly designed to be a functional language, but it allows updating existing values rather than always returning new ones. Immutability means that you cannot change an already existing value and must create a new value instead. I have written about the Concepts of Functional Programming and recommend reading it if you want to learn more. One example where functional programming might make the code more complex is writing a reader that reads some bytes: if we strictly follow the rule of immutability, we need to return new bytes instead of updating existing ones, which can be inefficient in terms of memory usage. As an example of mutating an existing value in OCaml, imagine updating a user’s age by 1 because it is their birthday. What I mean by “it’s functional on easy mode” is simply that the language is designed to be a functional language, but you are not forced to strictly adhere to functional programming rules. It is clear to me that a good type system can greatly improve the developer experience. I particularly appreciate OCaml’s type system, as well as its option and result types, which I use frequently. In languages like Haskell, you can extend the type system significantly, to the point where you can write an entire application using only types; I believe that can lead to overly complex code. This is another aspect of OCaml that I appreciate: it has a strong type system, but there are limits on how far you can extend it. I hope you enjoyed this article.
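The birthday example could look something like this (a sketch; the record and its fields are my own, not the original post’s code):

```ocaml
(* A record with one mutable field: age can be updated in place. *)
type user = {
  name : string;
  mutable age : int;
}

let () =
  let user = { name = "Emil"; age = 30 } in
  (* It's the user's birthday: mutate the existing value
     instead of building a whole new record. *)
  user.age <- user.age + 1;
  Printf.printf "%s is now %d\n" user.name user.age
```

Only fields marked `mutable` allow `<-` assignment, so the escape hatch from immutability is explicit and visible in the type definition.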
If you are interested in joining a community of people who also enjoy functional programming, I recommend joining this Discord server.

Emil Privér 1 year ago

From Computer to Production With Nix

A while ago, I wrote “ Bye Opam, Hello Nix ” where the topic of that post was that I replaced Opam with Nix as it works much better. This post is about taking this a bit further, discussing how I use Nix for local development, testing, and building Docker images. The core concept of Nix is “reproducible builds,” which means that “it works on my machine” is actually true. The idea of Nix is that you should be able to make an exact copy of something and send it to someone else’s computer, and they should get the same environment. The good thing about this is that we can extend it further to the cloud by building Docker images. Even if Docker’s goal was to also solve the “it works on my machine” problem, it only does so to a certain level as it is theoretically possible to change the content of a tag (I guess that you also tag a new image ;) ? ) by building a new image and pushing it to the same tag. Another thing I like about Nix is that it allows me to create a copy of my machine and send it to production. I can create layers, import them using Docker, and then tag and push them to my registry. This specific post was written after working with and using Nix at work. However, the code in this post won’t be work-related, but I will show code that accomplishes the same task in OCaml instead of Python. The problems I wanted to solve at work were: In this article, we will create a new basic setup for an OCaml project using Nix for building and development. The initial code will be as follows, and it will also be available at https://github.com/emilpriver/ocaml-nix-template The code in this Nix config is for building OCaml projects, so there will be OCaml related code in the config. However, you can customize your config to suit the language and the tools you work with. The content of this config informs us that we can’t use the unstable channel of Nix packages, as it often provides us with newer versions of packages. 
We also define the systems we build for, since Riot does not support Windows. Additionally, we create an empty devShells and packages config, which we will populate later, and we specify the formatter we want Nix to use. It’s important to note that this article is based on Nix flakes; you can read more about Nix here: https://shopify.engineering/what-is-nix The first thing I wanted to fix is the development environment for everyone working on the project. The goal is to simplify setup for contributors and achieve the magical “one command to get up and running” situation, which is something we can easily accomplish with Nix. The first thing we need to do is define our package in flake.nix: I tell Nix that I want to build a dune package and that I need Riot, which is added to inputs. This also makes it possible to add a dev shell that takes the package as input. So when we run `nix develop`, we get everything the package needs, plus the tools we need to develop, such as the LSP, dune, and ocamlformat. When working with Nix, I prefer to use it for running the necessary tools both locally and in the CI. This helps prevent discrepancies between local and CI environments. It also simplifies the process for others to run tests locally: they only need to execute a single command to replicate my setup. For example, I like to run my tests using Nix; it runs the tests, including setting up everything I need such as Docker, with just one command. To do this, we add a test package to the packages object in our flake.nix. It executes the test suite to verify our code, and we can run it in CI or locally with a single `nix build` command.
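Putting those pieces together, a flake.nix along these lines would do it; this is a sketch with assumed names (nix_template, the test derivation, the pinned channel), not the exact config from the post:

```nix
{
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-23.11";

  outputs = { self, nixpkgs }:
    let
      # Riot does not support Windows, so we only build for these systems.
      systems = [ "x86_64-linux" "aarch64-linux" "aarch64-darwin" ];
      forAll = f: nixpkgs.lib.genAttrs systems
        (system: f nixpkgs.legacyPackages.${system});
    in
    {
      packages = forAll (pkgs: rec {
        default = pkgs.ocamlPackages.buildDunePackage {
          pname = "nix_template";
          version = "0.1.0";
          src = ./.;
        };

        # `nix build .#test` runs the test suite inside the sandbox.
        test = pkgs.stdenv.mkDerivation {
          name = "nix_template-test";
          src = ./.;
          buildInputs = [ pkgs.dune_3 pkgs.ocaml ];
          buildPhase = "dune runtest";
          installPhase = "touch $out";
        };
      });

      # `nix develop` drops you into a shell with the package's
      # dependencies plus dev tooling.
      devShells = forAll (pkgs: {
        default = pkgs.mkShell {
          inputsFrom = [ self.packages.${pkgs.system}.default ];
          buildInputs = [ pkgs.dune_3 pkgs.ocamlformat pkgs.ocamlPackages.ocaml-lsp ];
        };
      });
    };
}
```

With this shape, `nix develop` is the one-command onboarding and `nix build .#test` is the one-command test run, locally and in CI alike.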
This method could potentially eliminate the need to install tools directly in the CI: a single Nix command replaces all of it. Including Docker as a package and running a Docker container in the buildPhase is also possible. This is just one effective method I’ve discovered in my workflows, but there are other ways to achieve this as well. You can also execute tasks like linting or security checks: replace the test command with whatever you need, then write the output, such as coverage, into the output folder so you can read it later. I have tried to use Nix apps for this type of task, but I have always fallen back to adding a new package and building it, as that has always been simpler for me. Now it’s time to build for release, the part where we make an optimized build that we can ship to production. How this works depends on what you want to achieve, but I will cover 2 common ways of building for release: building the binary, or building a Docker image. To enable binary building, we only need to add a buildPhase and an installPhase to the default package used for building. This implies that when we build the project with `nix build`, we build it in an isolated sandbox environment and return only the required binary; the result folder then contains main.exe, the binary we built. Another way to achieve a release is by building Docker image layers using Nix, which we later import into Docker so we can run the image. The benefit of this is that we get a reproducible Docker image, as we don’t use a Dockerfile to build it, and we can reuse a lot of the existing code to build the image. We achieve this by creating a new package for the image. To build the Docker image we then only need to run a single `nix build` command, and we can later load the layers into Docker with `docker load`. Afterwards, we can tag the image and distribute it. Quite convenient.
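For the Docker route, a sketch of what such an image package could look like (names are assumed; this sits inside the same `rec` packages set as the default package above):

```nix
# `default` refers to the dune package defined alongside it.
dockerImage = pkgs.dockerTools.buildLayeredImage {
  name = "nix-template";
  tag = "latest";
  config.Cmd = [ "${default}/bin/main.exe" ];
};
```

Building and loading then looks roughly like `nix build .#dockerImage` followed by `docker load < result`, after which the image can be tagged and pushed like any other.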
There are some tools specifically designed for this purpose which are very useful; for example, tools that tag and push an image to a container registry, such as from a GitHub action. What Nix does when building a Docker image is replace the Docker build system: instead of a Dockerfile, we build layers that we then import into Docker. Not every package exists on https://search.nixos.org/packages , but that doesn’t stop you from using a library that’s missing. Under the hood, all the packages on the Nix packages page are just Nix configs that build projects, which means it’s possible to build projects directly from source as well. This is how I do it with the package below. This lets me refer to the package from other packages, so Nix knows I need it and builds it for me. Something to keep in mind when fetching from sources is that some fetchers use the host machine’s ssh-agent while others use the sandbox environment’s ssh-agent, if it has one. This means some requests won’t work unless you either use a fetcher that runs on the host or add your ssh config during the build step. After all these configurations, we should now have a flake.nix file that matches the code below. This code also exists at github.com/emilpriver/ocaml-nix-template I hope this article has helped you with working with Nix. In this post, I built a flake.nix for OCaml projects, but it shouldn’t be too hard to swap the OCaml components for whatever language you use; for instance, packages exist for JavaScript to replace NPM and for Rust to replace Cargo. These days, I use Nix for the development environment, testing, and building, and it has been quite a good experience, especially when working with prebuilt flakes. My goal with this post was just to show “a way” of doing it. I’ve noticed that the Nix community tends to have a lot of opinions about how you should do things in Nix.
The hard truth is that there are a lot of different ways to solve the same problem in Nix, and you should pick a way that suits you. If you like this type of content and want to follow me to get more information on when I post stuff, I recommend following me on Twitter: https://x.com/emil_priver

Emil Privér 1 year ago

Bye Opam, Hello Nix

I’ve been writing OCaml since November 2023 and I enjoy the language; it’s fun to write and has some features I really appreciate. However, you may have noticed I only mentioned the “language” in the first sentence. That’s because I have issues with Opam, the package manager for OCaml. It has been a pain in my development workflow and I want to eliminate it. Not long ago, I was browsing Twitch and saw some content on Nix hosted by BlackGlasses ( altf4stream ), Metameeee and dmmulroy . They discussed how to use Nix to manage your workspace, which intrigued me. Around the same time, I started working at CarbonCloud, my current employer, where we use Haskell with Nix for some apps. Having seen how they utilize Nix and its potential, I decided to try it out with OCaml. In short, I’ve experienced numerous frustrations with Opam when working on multiple projects that use different versions of a library. This scenario often necessitates creating new switches and reinstalling everything. Though I’ve heard that OCaml libraries should be backward compatible, I’ve never found this to be the case in practice: if we need to modify a function in version 2.0.0 due to a security issue, that challenges the notion of “backward compatibility”. Challenges also arise when opening the same folder in different terminal sessions, such as whether the environment command needs to be run again to make the terminal use the local switch instead of the global one. To clarify, a standard Opam installation puts all packages in a global environment. To avoid using this global environment, a local switch can be created in the desired folder; this creates a switch folder in the directory, which can help avoid some complications. Furthermore, Opam allows you to pin a specific package version in your opam files. This is useful even for packages that strive for backward compatibility, as it allows two repositories to require different versions of the same package.
However, it can also lead to version conflicts, as packages are typically installed in the global environment. Another difficulty is the time it takes to release something on the Opam repository; as a result, you may find yourself installing some packages from Opam while pinning others directly to a Git reference. Another issue I noticed is that Opam sometimes installs non-OCaml dependencies, like PostgreSQL, without asking, if a specific library requires it. This situation feels a bit odd. Therefore, I replaced Opam with Nix. This transition allows me to completely remove Opam from my system, since OCaml can function without it. (Another approach is to vendor libraries by cloning them with Git into a folder inside the project, as Dune handles monorepos very efficiently.) If you’re unfamiliar with Nix, I recommend reading this article: https://shopify.engineering/what-is-nix . It provides a good summary. I use Nix across several projects, but I will demonstrate examples and code from my project “ocamlbyexample”, which is similar to https://gobyexample.com but for OCaml. I am using Nix for two purposes in this project, and Nix handles both very efficiently for me. Integrating it is fairly straightforward because the work has already been accomplished in the nixpkgs repository, and some OCaml packages have already been published to Nix; I just need to specify the dependencies I require. Please note that this code may not work perfectly as it could be missing some steps. Each dependency I specify is installed when I enter the dev shell or build the project. However, not all packages exist on Nix yet, but it’s possible to install packages directly from source, as in the example below, where we build the package from its own Dune project. Dune is the build system for OCaml.
It uses ocamlc under the hood to compile your project; the homepage can be found at: https://dune.build/ There’s a difference between using a prebuilt package and installing a package from source: in the latter, a hash of the downloaded file is required. Fortunately, you can easily obtain this hash. By setting the hash to an empty string, Nix will calculate it for you and return it in the terminal as an error message; simply copy the value and paste it into the hash field, and you have the correct hash ready. Of course, other developers aren’t required to use Nix if they prefer not to. However, if you want to contribute to the project with as little hassle as possible, simply enter the dev shell: this provides you with the exact environment I’m using, because Nix operates with reproducible environments. With my devShell config for ocamlbyexample.com, a contributor only needs to execute one command to generate CSS, HTML, and JS files and then preview the changes locally. The advantage of using Nix is that it lets me build the project both locally and in CI with the same `nix build` invocation, which ensures consistent output and simplifies the entire build process in CI. Previously, I had to install opam and OCaml, create a switch, and install the libraries; now, all those steps are replaced by a single command, and I can directly take the files produced by the build and publish them to my CDN. Will I use Nix for all my OCaml projects? Not likely, as Nix can sometimes seem excessive for small, short-term projects. Additionally, the team building Dune is adding package management, which I might use. However, I appreciate the simplicity of Nix in continuous integration (CI). While not all packages are available on Nix, there’s a concerted effort to increase the number of libraries installable with Nix Flakes. For example, there’s now a flake in the Riot GitHub repo that we can use to add Riot to our stack.
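As a sketch of the build-from-source approach, something like this can package a library straight from GitHub (owner, rev, and version are placeholder assumptions, not values from the post):

```nix
riot = pkgs.ocamlPackages.buildDunePackage {
  pname = "riot";
  version = "dev";
  src = pkgs.fetchFromGitHub {
    owner = "riot-ml";
    repo = "riot";
    rev = "main";
    # Leave this empty the first time: Nix fails the build and
    # prints the real hash, which you then paste in here.
    sha256 = "";
  };
};
```

Other packages can then list `riot` as a dependency, and Nix will fetch and build it from source as part of the same build graph.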
The only downside I’ve found so far is that it sometimes takes a while to set up a new dev environment the first time you enter the shell. I hope this article was inspiring. If you wish to contact or follow me, you can do so on Twitter: https://twitter.com/emil_priver

Emil Privér 1 year ago

Announcing DBCaml, Silo, Serde Postgres and a new driver for postgres

I’ve spent the last four months working on DBCaml. Some of you may be familiar with this project, while others may not. When I started DBCaml, I had one primary goal: to build a toolkit for OCaml that handles pooling, mapping to types, and more. This toolkit would also run on Riot, which is an actor-model multi-core scheduler for OCaml 5. An issue I’ve found with databases in the OCaml space is that most of the existing database libraries either don’t support Postgres version 14 and higher, or they run on the PostgreSQL library, which is a C-binding library. The initial release of DBCaml also used the Postgresql library just to get something published. However, this wasn’t something I wanted, as I felt really limited in what I was able to do, and the C-bindings library would also limit the number of processes I could run with Riot, which is something I didn’t want. So, I decided to work hard on the Postgres driver and write a native OCaml driver which uses Riot’s socket connection for the database. This post describes the new changes by talking about each library. The GitHub repo for this project exists here: https://github.com/dbcaml/dbcaml Before I continue, I want to say a big thank you to Leandro Ostera , Antonio Monteiro and many more in the OCaml community. When I’ve been in need of help, you have provided me with information and code to fix the issues I encountered. Thank you tons! <3 Now that DBCaml has expanded into multiple libraries, I will refer to these as “The DBCaml project”. I felt it was important to write about this project again because the direction has changed since v0.0.1. DBCaml, the central library in this project, was initially designed to handle queries, type mapping, and pooling. As the project expanded, I decided to make DBCaml more developer-friendly: it now aids in pooling and sending queries to the database, returning raw bytes in response. DBCaml’s pool takes inspiration from Elixir’s Ecto.
Currently, I recommend developers use DBCaml for querying the database and receiving raw bytes, which they can then use to build any desired features. However, my vision for DBCaml is not yet complete. I plan to extract the pooling function from DBCaml into a separate pool manager, inspired by Elixir’s Ecto, which developers can use to build features such as a Redis pool. If you’re interested in learning more about how DBCaml works, I recommend reading these articles: ”Building a Connection Pool for DBCaml on top of Riot” and ”Introducing DBCaml, Database toolkit for OCaml”.

A driver essentially serves as the bridge between your code and the database. It’s responsible for making queries to the database, setting up the connection, handling security, and managing TLS. In other words, it performs “the real job.” The first version of the driver was built for Postgresql using a C-binding library. I wasn’t fond of this library because it didn’t provide raw bytes, which are crucial when mapping data to types. The driver has since been rewritten in native OCaml, using Riot’s sockets to connect to the database.

The next library to discuss is Serde Postgres, a deserializer for the Postgres wire format. The wire protocol defines the structure of the bytes Postgres sends and receives, enabling us to create clients for Postgres; you can read about it at https://www.postgresql.org/docs/current/protocol.html With Serde Postgres, it’s now possible to deserialize Postgres wire data and map it to types. Because it is a separate library, developers can combine Serde, Serde Postgres, and DBCaml to make queries and later parse the data into types.

The final library to discuss is Silo. This is the high-level library I envisioned for DBCaml: one that handles everything for you and allows you to simply write your queries and work with the necessary types.
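Back to the wire protocol for a moment, to make it concrete: every regular backend message is framed as a one-byte type tag followed by a big-endian int32 length that counts itself but not the tag. A minimal framing decoder might look like this (a Go sketch for illustration, not Serde Postgres’s actual code):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// parseHeader reads the Postgres wire-protocol frame header: one
// type-tag byte, then a big-endian int32 length that includes the
// 4 length bytes themselves but not the tag byte.
func parseHeader(msg []byte) (tag byte, payloadLen int) {
	tag = msg[0]
	length := int(binary.BigEndian.Uint32(msg[1:5]))
	return tag, length - 4 // subtract the length field itself
}

func main() {
	// An 'R' (Authentication) message: length 8, payload int32 0,
	// which the protocol defines as AuthenticationOk.
	msg := []byte{'R', 0, 0, 0, 8, 0, 0, 0, 0}
	tag, n := parseHeader(msg)
	fmt.Printf("tag=%c payload=%d bytes\n", tag, n)
}
```

A deserializer like Serde Postgres sits on top of framing like this, interpreting each payload (row descriptions, data rows, and so on) and mapping it to user types.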
Silo uses DBCaml to make raw queries to the database and then maps the bytes from Postgres to types using Serde Postgres. Silo is the library I expect most developers will use, unless they create their own database library or need finer control over functionality. There is more planned for this project, such as building additional drivers and deserializers for other databases, as well as more tools for you as a developer when writing your OCaml projects. I hope you appreciate these changes. If you’re interested in contributing to the libraries or discussing them, I recommend joining the Discord: https://discord.gg/wqbprMmgaD For more minor updates, follow my Twitter page: https://twitter.com/emil_priver If you find a bug, I would love for you to create an issue here: https://github.com/dbcaml/dbcaml/issues

Emil Privér 1 year ago

Running schema/database migrations using Geni

A while ago, I developed Geni , a CLI database migration tool. The goal of the app was to make migrations for Turso databases easier. When I developed Geni, I also decided to add support for Postgres, MariaDB, MySQL, and SQLite. The goal of this post is to describe what database/schema migrations are and how to perform them with Geni.

A schema within a database describes its current structure, which includes elements such as tables, indexes, and constraints. When you connect to a PostgreSQL database and describe a table, you see that table’s current structure. Essentially, migrations are used to modify the database structure. In other words, a migration is a set of instructions for making changes to your database.

Suppose you receive a new task at work that requires you to add a table and a new column to another table. One approach is to connect to the database and make these changes manually. Alternatively, you could use a tool, such as Geni, for this task. Using Geni offers several advantages. It provides a reproducible database structure, useful for deploying to different environments or running tests within a CI. Because migrations are just normal SQL files, you can version control them within your repository. Additionally, Geni allows you to make changes to your database programmatically, without human input. I’ve applied this within a project running on Kubernetes: when we released a new version of our app within the Kubernetes cluster, we spun up a Geni container in the same Kubernetes namespace. This container checked for any migrations and, if found, ran them before terminating itself. You can also utilize migrations in your integration tests to evaluate the entire application against its actual structure.

There are specific requirements for using this type of tool. Firstly, each migration can only be executed once; if a migration has been applied, it cannot be reapplied.
Secondly, migrations need to be run in the correct sequence; for instance, migration 3 cannot be executed before migration 2. Geni handles this by checking whether its tracking table exists in your database before executing migrations. If it doesn’t, Geni creates the table, and it inserts each migration’s ID there once that migration has run. This table is used to keep track of which migrations have been applied. When Geni creates migrations, it uses a timestamp as the ID, followed by the name of the migration. Geni orders migrations by this ID, and because it is incremental, migrations cannot be run in the wrong order. This is why Geni scales effectively with your projects: you can track each migration within your version control, and at the same time you can view the current database structure by reading the schema.sql file that Geni also produces.

There are several ways to install Geni. You can download the official binaries directly from GitHub here: https://github.com/emilpriver/geni/releases You can also install Geni using the Homebrew package manager or Cargo, or run it via PKGX.

Creating a migration is straightforward. Start by running Geni’s migration-creation command in your repository, and it will create two new files in the migrations folder, ending with .up.sql and .down.sql respectively. The .up.sql file is where you write new changes, for instance creating a new users table. The .down.sql file is for rolling back changes; ideally, the SQL code in the .down.sql file should revert the changes made in the .up.sql file. After creating your new migration, run the apply command, which prompts Geni to read the migrations folder and apply any pending migrations.
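The run-once, in-order bookkeeping described above can be sketched in a few lines. This hypothetical helper (in Go, not Geni’s actual code) takes the timestamp IDs found in the migrations folder plus the set already recorded in the tracking table, and returns what still needs to run, in ascending order:

```go
package main

import (
	"fmt"
	"sort"
)

// pending returns the migration IDs that have not been applied yet,
// sorted ascending: skip anything recorded in the tracking table,
// then order the rest by their timestamp-based ID so migration 2
// always runs before migration 3.
func pending(available []int64, applied map[int64]bool) []int64 {
	var out []int64
	for _, id := range available {
		if !applied[id] {
			out = append(out, id)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	// Timestamp-style IDs as a migration tool might generate them.
	available := []int64{20240103150000, 20240101120000, 20240102090000}
	applied := map[int64]bool{20240101120000: true}
	fmt.Println(pending(available, applied))
}
```

Because timestamps only grow, sorting by ID reproduces the order in which the migrations were written, which is what makes this scheme safe across environments.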
You can also run Geni directly via GitHub Actions as part of your CI flow, or include it as a container in your docker-compose file. I use Geni as a container for projects where I want others to easily set up and run an environment. I hope this article has inspired you to use migrations for your databases, and perhaps even to try Geni. If you’re interested in tracking the development of Geni, I recommend starring it on GitHub: https://github.com/emilpriver/geni For more of my content, consider following me on Twitter: https://twitter.com/emil_priver

Emil Privér 1 year ago

My Love/Hate Letter to Copilot

Just as Post Malone expressed a love-hate relationship with alcohol, I’m here to share my mixed feelings about Copilot. This bittersweet tool has been in my toolkit for a while. Despite its frequent frustrations, I find myself relying on it more than I’d like. Only after disabling the AI did I notice how much it had changed my programming habits.

For those unfamiliar with Copilot, here’s a quick introduction: Copilot is a tool that attempts to understand your problem, searches GitHub for matching solutions, and suggests them to you. For many, this is something to love: they can build solutions faster by simply waiting for a suggestion, applying it as fast as they can, and moving on with their life. When I began using Copilot and ChatGPT, I was amazed at how much faster I could create things. However, I hadn’t anticipated how it would change my approach to building systems, affect my perspective during development, and shrink the engineer within me.

As software engineers, developers, or programmers, our role is not just to understand a problem and find a solution, but also to write good software. This means that while certain solutions, such as using .clone() in Rust, may technically work, they may not be best practice. For example, it’s generally advisable to minimize the use of .clone() in order to conserve memory. If you’re unfamiliar with .clone(), it essentially copies existing memory into another part of memory and provides a new reference to this new memory. While this might not sound problematic, it can be quite detrimental: if you clone something that is 1GB in size, you allocate another 1GB of memory, which could likely be avoided by properly utilizing reference and ownership rules. You could also potentially exhaust the memory, causing your service to crash. Copilot might suggest using .clone(), and we could apply this suggestion and proceed.
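A rough analogue of the clone-versus-borrow point, shown in Go for illustration (Rust’s .clone() on owned data behaves similarly): assigning a slice shares the backing memory, while an explicit copy allocates a second buffer of the same size.

```go
package main

import "fmt"

func main() {
	original := []byte{1, 2, 3}

	shared := original                         // shares the backing array, no new allocation
	cloned := append([]byte(nil), original...) // full copy: a second allocation of equal size

	original[0] = 9
	fmt.Println(shared[0]) // sees the write through the shared memory
	fmt.Println(cloned[0]) // unchanged: independent memory
}
```

Scale the same difference up to a 1GB value and the copy doubles the program’s memory footprint, which is exactly the cost the post warns about.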
However, instead of doing our job, we may end up relying on a solution provided by an AI that might not understand the real context. This can be problematic. I noticed this happening when I first started learning OCaml: there were instances when I waited for a suggestion, even though the solution was just two lines of straightforward code. One instance I recall is when I was parsing a URL. Copilot suggested using regex, which is not ideal due to its potential for bugs; instead, I solved it with a few lines of plain parsing code.

One of my major issues with Copilot is that it can diminish our problem-solving skills. Instead of analyzing a situation and finding a good solution ourselves, we increasingly delegate this task to AI. Over time, this could erode our engineering mindset. For instance, when we encounter a bug, our first instinct might be to ask an AI about it, rather than trying to figure it out ourselves. Believe it or not, at this stage, the AI is less capable than you. You may find yourself in a loop where you keep telling the AI, “No, this doesn’t work,” and the AI keeps suggesting new code, when, ironically, the solution could be as simple as changing a single character. Another issue is that you might create a solution that, although functional, is subpar because you didn’t leverage your own skills.

I believe it’s common for us to become complacent when AI becomes a significant part of our development process. I experienced this myself recently when I was building SASL authentication for Postgres in OCaml and encountered a tricky bug. Instead of manually inserting print statements into the code to debug it, I copied the code and the error and handed them over to ChatGPT. The actual solution came from a combination of reading the sqlx code and realizing that I had overlooked a small detail. As software engineers, continuous learning is essential, and we often learn by problem-solving: addressing issues and devising solutions.
However, over-reliance on AI in our development process can hinder this learning. We may apply code without fully understanding it, which can be detrimental in the long run. Just because you used an AI to solve a bug doesn’t mean you should rely on it every time a similar issue arises. This can be a significant problem, especially for new developers. It’s crucial that we’re able to “feel” the code we’re working on. Programming is not just about understanding code; it’s about connecting the pieces of a larger puzzle to build a solution. Understanding this takes time, it matters most at the beginning of a career, and it’s also why learning programming takes time.

Imagine having a bytestring from which you need to parse values. Ideally, you would do this piece by piece: extract the first value, print it, then move on to the next value, repeating until you’ve gathered all the necessary values. It’s common to print the initial data for transparency. However, as AI becomes more integral to development and begins to handle such tasks for us, there may be situations where starting from scratch becomes challenging because of our reliance on these tools, and we end up stuck. One day, we may face a problem that AI can’t solve, and we might not know where to start debugging, because we’re so used to having someone else handle it for us. This is why I always advise new developers to try to solve problems themselves before asking for help: I want them to understand what they’re doing and not rely on me.

I also don’t believe that we necessarily become more effective by using AI. Often, we might find ourselves stuck in a loop, waiting for new suggestions repeatedly. In such situations, we could likely solve the problem faster, and perhaps even better, by using our own brains instead. We often associate efficient programming with the ability to produce large amounts of code quickly. However, this isn’t necessarily true.
A single line of code can sometimes be more efficient and easier to work with than ten lines. AI is effective at generating boilerplate but often falls short of providing quality solutions. I’ve critiqued Copilot for a while, but it’s worth mentioning that it’s not necessarily bad to use it, provided you choose the appropriate time. I still use Copilot, but only when I’m working on simpler tasks that it can easily handle, like generating boilerplate code, and I only enable it occasionally. I’ve noticed that it’s crucial not to lean heavily on such tools, as doing so can lead to negative habits, like waiting for a suggestion and hitting enter repeatedly until a solution appears. Another area where I’ve noticed it works quite well is when you’re programming in Go. Go is designed to be simple, and Copilot handles simple tasks well, so the code it recommends is mostly okay.

AI can pose a significant challenge for new developers. It’s tempting to let AI dictate the path to a solution, rather than using it as one of many potential paths. This often leads to developers accepting the code returned by AI without truly understanding it. However, understanding the code is essential for new developers.

The easiest way to contact me is through Twitter: https://twitter.com/emil_priver

Emil Privér 1 year ago

Introducing DBCaml, Database toolkit for OCaml

It’s time for me to discuss what I’ve been working on recently: DBCaml. DBCaml is a database toolkit built for OCaml, based on Riot. Riot is an actor-model multi-core scheduler for OCaml 5: it works by creating lightweight processes that execute code and communicate with the rest of the system using message-passing. You can find Riot on GitHub .

The core idea of DBCaml is to provide a toolkit that handles the “boring” tasks you don’t want to deal with, such as pooling and mapping results to types, allowing you to focus on your queries. DBCaml is installable as of the initial v0.0.1 release.

I wanted to learn a new language and decided to explore functional programming. I came across OCaml online and found it interesting. When Advent of Code 2023 started, I chose OCaml as the language for my solutions. However, I didn’t build them using a functional approach; instead, I wrote them in a non-functional way, using a lot of references. My solutions turned out to be so bad that a colleague had to rewrite my code. Even so, the experience further sparked my interest. One day, I came across Leostera , a developer working on Riot, an actor-model multi-core scheduler for OCaml 5. Riot is similar to Erlang’s BEAM, which intrigued me. It dawned on me that if I wanted to explore OCaml further, I needed a project to work on. That’s when I decided to build a database library for OCaml, believing it would be a useful addition to the Riot ecosystem.

DBCaml can be divided into three layers: the driver, the connection pool, and the interface the developer works with. I have already explained how the connection pool works in a previous post, which you can find here: Building a Connection Pool . However, I would like to explain the drivers and the interface further. The driver is responsible for communicating with the database: it acts as the bridge between DBCaml and the database.
The main idea behind shipping separate drivers as libraries is to avoid installing unnecessary dependencies. For example, if you are working with a Postgres database, there is no need to install any MySQL drivers. The driver also takes care of all the security measures within the library: DBCaml simply provides the necessary data to the drivers, which handle the security aspects.

I will describe the current functionality and explain my vision for how I believe this library will evolve in future releases. Currently, DBCaml provides four methods: start_link, fetch_one, fetch_many, and exec. These methods are the highest level of functionality in the package and the primary interface developers use in v0.0.1. They handle most of the tasks that developers shouldn’t need to worry about, such as requesting a connection from the pool.

I have a broad vision for DBCaml, which encompasses three categories: testing, development, and runtime. The specifics of the testing and development areas will become clearer as work on them starts. Currently, the most important goal is the v0.0.1 release of the connection pool: it is the critical component of the system, and we need feedback on its functionality to identify any potential bugs or issues.

Writing effective tests can be challenging, particularly when it is not possible to mock queries. One solution to this problem is to use DBCaml, which can help you write tests by providing reusable code snippets. This includes the ability to define rows, close a database, and more, giving you control over how you test your application. I believe SQLx for Rust ( https://github.com/launchbadge/sqlx ) has done an excellent job of providing a great developer experience (DX).
It allows users to receive feedback on the queries they write without needing to test them at runtime. In other words, SQLx uses macros to execute code against the database during compilation, so any issues with the queries can be identified early on. It is, of course, optional for users to opt in to this feature. The advantage of this feedback during development is that users can work quickly without having to manually send additional HTTP requests in tools like Postman to trigger the queries they want to test, which saves valuable time. By testing queries during compilation, users can also skip writing tests for those queries, since they already get feedback on whether a query works while developing.

During runtime, it is important to have a system that handles pooling for your application, ensuring that if a connection dies, it is recreated and booted again.

Currently, we are at version v0.0.1, a small release with limited functionality, but I have big plans for the future of this package. The purpose of creating v0.0.1, despite knowing that there will be upcoming changes, is to test the connection pool and ensure its functionality. The v0.0.1 release includes the ability to fetch data from the database and use it, along with a connection pool and a PostgreSQL driver. However, I will soon be branching DBCaml out into three new packages; this significant change will be implemented in the v0.0.2 milestone.

I want to give a special thank you to Leostera, who has helped me a lot during development. I wouldn’t say this is something I’ve built alone: it is a joint effort between me, Leostera, and other members of the Riot Discord. If you are interested and would like to follow along with the development, I recommend the DBCaml GitHub repository and the Riot Discord.
