Latest Posts (20 found)
Anton Zhiyanov 1 week ago

Go 1.26 interactive tour

Go 1.26 is coming out in February, so it's a good time to explore what's new. The official release notes are pretty dry, so I prepared an interactive version with lots of examples showing what has changed and what the new behavior is. Read on and see!

new(expr) • Type-safe error checking • Green Tea GC • Faster cgo and syscalls • Faster memory allocation • Vectorized operations • Secret mode • Reader-less cryptography • Goroutine leak profile • Goroutine metrics • Reflective iterators • Peek into a buffer • Process handle • Signal as cause • Compare IP subnets • Context-aware dialing • Fake example.com • Optimized fmt.Errorf • Optimized io.ReadAll • Multiple log handlers • Test artifacts • Modernized go fix • Final thoughts

This article is based on the official release notes from The Go Authors and the Go source code, licensed under the BSD-3-Clause license. This is not an exhaustive list; see the official release notes for that. I provide links to the documentation (𝗗), proposals (𝗣), commits (𝗖𝗟), and authors (𝗔) for the features described. Check them out for motivation, usage, and implementation details. I also have dedicated guides (𝗚) for some of the features. Error handling is often skipped to keep things simple. Don't do this in production ツ

Previously, you could only use the built-in new with a type: Now you can also use it with expressions: If the argument is an expression of type T, then new(expr) allocates a variable of type T, initializes it to the value of the expression, and returns its address, a value of type *T. This feature is especially helpful if you use pointer fields in a struct to represent optional values that you marshal to JSON or Protobuf: You can use new with composite values: And function calls: Passing some kinds of arguments is still not allowed:

𝗗 spec • 𝗣 45624 • 𝗖𝗟 704935 , 704737 , 704955 , 705157 • 𝗔 Alan Donovan

The new errors.AsType function is a generic version of errors.As: It's type-safe and easier to use: errors.AsType is especially handy when checking for multiple types of errors.
It makes the code shorter and keeps error variables scoped to their blocks: Another issue with errors.As is that it uses reflection and can cause runtime panics if used incorrectly (like if you pass a non-pointer or a type that doesn't implement error): errors.AsType doesn't cause a runtime panic; it gives a clear compile-time error instead: errors.AsType doesn't use reflection, executes faster, and allocates less than errors.As: Since errors.AsType can handle everything that errors.As does, it's a recommended drop-in replacement for new code.

𝗗 errors.AsType • 𝗣 51945 • 𝗖𝗟 707235 • 𝗔 Julien Cretel

The new garbage collector (first introduced as experimental in 1.25) is designed to make memory management more efficient on modern computers with many CPU cores. Go's traditional garbage collection algorithm operates on the object graph, treating objects as nodes and pointers as edges, without considering their physical location in memory. The scanner jumps between distant memory locations, causing frequent cache misses. As a result, the CPU spends too much time waiting for data to arrive from memory: more than 35% of the time spent scanning memory is wasted just stalling on memory accesses. As computers get more CPU cores, this problem gets even worse.

Green Tea shifts the focus from being processor-centered to being memory-aware. Instead of scanning individual objects, it scans memory in contiguous 8 KiB blocks called spans. The algorithm focuses on small objects (up to 512 bytes) because they are the most common and the hardest to scan efficiently. Each span is divided into equal slots based on its assigned size class, and it only contains objects of that size class. For example, if a span is assigned to the 32-byte size class, the whole block is split into 32-byte slots, and objects are placed directly into these slots, each starting at the beginning of its slot. Because of this fixed layout, the garbage collector can easily find an object's metadata using simple address arithmetic, without checking the size of each object it finds.
When the algorithm finds an object that needs to be scanned, it marks the object's location in its span but doesn't scan it immediately. Instead, it waits until there are several objects in the same span that need scanning. Then, when the garbage collector processes that span, it scans multiple objects at once. This is much faster than going over the same area of memory multiple times.

To make better use of CPU cores, GC workers share the workload by stealing tasks from each other. Each worker has its own local queue of spans to scan, and if a worker is idle, it can grab tasks from the queues of other busy workers. This decentralized approach removes the need for a central global list, prevents delays, and reduces contention between CPU cores. Green Tea uses vectorized CPU instructions (only on amd64 architectures) to process memory spans in bulk when there are enough objects.

Benchmark results vary, but the Go team expects a 10–40% reduction in garbage collection overhead in real-world programs that rely heavily on the garbage collector. Plus, the vectorized implementation brings an extra 10% reduction in GC overhead when running on CPUs like Intel Ice Lake or AMD Zen 4 and newer. Unfortunately, I couldn't find any public benchmark results from the Go team for the latest version of Green Tea, and I wasn't able to create a good synthetic benchmark myself. So, no details this time :(

The new garbage collector is enabled by default. To use the old garbage collector, set at build time (this option is expected to be removed in Go 1.27).

𝗣 73581 • 𝗔 Michael Knyszek

In the Go runtime, a processor (often referred to as a P) is a resource required to run the code. For a thread (a machine or M) to execute a goroutine (G), it must first acquire a processor. Processors move through different states: they can be running (executing code), idle (waiting for work), or stopped (paused because of garbage collection).
Previously, processors had a dedicated state, used when a goroutine is making a system or cgo call. Now, this state has been removed. Instead of using a separate processor state, the runtime checks the status of the goroutine assigned to the processor to see if it's involved in a system call. This reduces internal runtime overhead and simplifies code paths for cgo and syscalls.

The Go release notes say -30% in cgo runtime overhead, and the commit mentions an 18% sec/op improvement: I decided to run the CgoCall benchmarks locally as well: Either way, both a 20% and a 30% improvement are pretty impressive. And here are the results from a local syscall benchmark: That's pretty good too.

𝗖𝗟 646198 • 𝗔 Michael Knyszek

The Go runtime now has specialized versions of its memory allocation function for small objects (from 1 to 512 bytes). It uses jump tables to quickly choose the right function for each size, instead of relying on a single general-purpose implementation. The Go release notes say "the compiler will now generate calls to size-specialized memory allocation routines". But based on the code, that's not completely accurate: the compiler still emits calls to the general-purpose function, which then, at runtime, dispatches those calls to the new specialized allocation functions.

This change reduces the cost of small object memory allocations by up to 30%. The Go team expects the overall improvement to be ~1% in real allocation-heavy programs. I couldn't find any existing benchmarks, so I came up with my own. And indeed, running it on Go 1.25 compared to 1.26 shows a significant improvement: The new implementation is enabled by default. You can disable it by setting at build time (this option is expected to be removed in Go 1.27).

𝗖𝗟 665835 • 𝗔 Michael Matloob

The new package provides access to architecture-specific vectorized operations (SIMD — single instruction, multiple data). This is a low-level package that exposes hardware-specific functionality.
It currently only supports amd64 platforms. Because different CPU architectures have very different SIMD operations, it's hard to create a single portable API that works for all of them. So the Go team decided to start with a low-level, architecture-specific API first, giving "power users" immediate access to SIMD features on the most common server platform — amd64.

The package defines vector types as structs, like Int8x16 (a 128-bit SIMD vector with sixteen 8-bit integers) and Float64x8 (a 512-bit SIMD vector with eight 64-bit floats). These match the hardware's vector registers. The package supports vectors that are 128, 256, or 512 bits wide. Most operations are defined as methods on vector types. They usually map directly to hardware instructions with zero overhead. To give you a taste, here's a custom function that uses SIMD instructions to add 32-bit float vectors: Let's try it on two vectors: Common operations in the package include: The package uses only AVX instructions, not SSE. Here's a simple benchmark for adding two vectors (both the "plain" and SIMD versions use pre-allocated slices): The package is experimental and can be enabled by setting at build time.

𝗗 simd/archsimd • 𝗣 73787 • 𝗖𝗟 701915 , 712880 , 729900 , 732020 • 𝗔 Junyang Shao , Sean Liao , Tom Thorogood

Cryptographic protocols like WireGuard or TLS have a property called "forward secrecy". This means that even if an attacker gains access to long-term secrets (like a private key in TLS), they shouldn't be able to decrypt past communication sessions. To make this work, ephemeral keys (temporary keys used to negotiate the session) need to be erased from memory immediately after the handshake. If there's no reliable way to clear this memory, these keys could stay there indefinitely. An attacker who finds them later could re-derive the session key and decrypt past traffic, breaking forward secrecy. In Go, the runtime manages memory, and it doesn't guarantee when or how memory is cleared.
Sensitive data might remain in heap allocations or stack frames, potentially exposed in core dumps or through memory attacks. Developers often have to use unreliable "hacks" with reflection to try to zero out internal buffers in cryptographic libraries. Even so, some data might still stay in memory where the developer can't reach or control it.

The Go team's solution to this problem is the new runtime/secret package. It lets you run a function in secret mode. After the function finishes, it immediately erases (zeroes out) the registers and stack it used. Heap allocations made by the function are erased as soon as the garbage collector decides they are no longer reachable. This helps make sure sensitive information doesn't stay in memory longer than needed, lowering the risk of attackers getting to it.

Here's an example that shows how secret mode might be used in a more or less realistic setting. Let's say you want to generate a session key while keeping the ephemeral private key and shared secret safe: Here, the ephemeral private key and the raw shared secret are effectively "toxic waste" — they are necessary to create the final session key, but dangerous to keep around. If these values stay in the heap and an attacker later gets access to the application's memory (for example, via a core dump or a vulnerability like Heartbleed), they could use these intermediates to re-derive the session key and decrypt past conversations. By wrapping the calculation in secret mode, we make sure that as soon as the session key is created, the "ingredients" used to make it are permanently destroyed. This means that even if the server is compromised in the future, this specific past session can't be exposed, which ensures forward secrecy.

The current implementation only supports Linux (amd64 and arm64). On unsupported platforms, the package invokes the function directly. Also, trying to start a goroutine within the function causes a panic (this will be fixed in Go 1.27).
The runtime/secret package is mainly for developers who work on cryptographic libraries. Most apps should use higher-level libraries that use secret mode behind the scenes. The package is experimental and can be enabled by setting at build time.

𝗗 runtime/secret • 𝗣 21865 • 𝗖𝗟 704615 • 𝗔 Daniel Morsing

Current cryptographic APIs often accept an io.Reader as the source of random data: These APIs don't commit to a specific way of using random bytes from the reader. Any change to the underlying cryptographic algorithms can change the sequence or amount of bytes read. Because of this, if application code (mistakenly) relies on a specific implementation in Go version X, it might fail or behave differently in version X+1.

The Go team chose a pretty bold solution to this problem. Now, most crypto APIs will just ignore the random parameter and always use the system random source. The change applies to the following subpackages: One exception still uses the random reader if provided; but if the reader is nil, it uses an internal secure source of random bytes instead of crypto/rand.Reader (which could be overridden).

To support deterministic testing, there's a new cryptotest package with a single function. It sets a global, deterministic cryptographic randomness source for the duration of the given test: It affects crypto/rand.Reader and all implicit sources of cryptographic randomness in the affected packages: To temporarily restore the old reader-respecting behavior, set (this option will be removed in a future release).

𝗗 testing/cryptotest • 𝗣 70942 • 𝗖𝗟 724480 • 𝗔 Filippo Valsorda , qiulaidongfeng

A leak occurs when one or more goroutines are indefinitely blocked on synchronization primitives like channels, while other goroutines continue running and the program as a whole keeps functioning. Here's a simple example: If we call the function and don't read from the output channel, the inner goroutine will stay blocked trying to send to the channel for the rest of the program: Unlike deadlocks, leaks do not cause panics, so they are much harder to spot.
Also, unlike data races, Go's tooling did not address them for a long time. Things started to change in Go 1.24 with the introduction of the testing/synctest package. Not many people talk about it, but synctest is a great tool for catching leaks during testing. Go 1.26 adds a new experimental profile designed to report leaked goroutines in production. Here's how we can use it in the example above: As you can see, we have a nice goroutine stack trace that shows exactly where the leak happens.

The profile finds leaks by using the garbage collector's marking phase to check which blocked goroutines are still connected to active code. It starts with runnable goroutines, marks all sync objects they can reach, and keeps adding any blocked goroutines waiting on those objects. When it can't add any more, any blocked goroutines left are waiting on resources that can't be reached — so they're considered leaked. Here's the gist of it: For even more details, see the paper by Saioc et al. If you want to see how the profile (and synctest) can catch typical leaks that often happen in production — check out my article on goroutine leaks.

The profile is experimental and can be enabled by setting at build time. Enabling the experiment also makes the profile available as a net/http/pprof endpoint. According to the authors, the implementation is already production-ready. It's only marked as experimental so they can get feedback on the API, especially about making it a new profile.

𝗗 runtime/pprof • 𝗚 Detecting leaks • 𝗣 74609 , 75280 • 𝗖𝗟 688335 • 𝗔 Vlad Saioc

New metrics in the runtime/metrics package give better insight into goroutine scheduling: Here's the full list: Per-state goroutine metrics can be linked to common production issues. For example, an increasing waiting count can show a lock contention problem. A high not-in-go count means goroutines are stuck in syscalls or cgo. A growing runnable backlog suggests the CPUs can't keep up with demand.
You can read the new metric values using the regular metrics.Read function: The per-state numbers (not-in-go + runnable + running + waiting) are not guaranteed to add up to the live goroutine count (/sched/goroutines:goroutines, available since Go 1.16). All new metrics use counters.

𝗗 runtime/metrics • 𝗣 15490 • 𝗖𝗟 690397 , 690398 , 690399 • 𝗔 Michael Knyszek

The new Fields and Methods methods in the reflect package return iterators for a type's fields and methods: Two more new methods return iterators for the input and output parameters of a function type: Matching methods on reflect.Value return iterators for a value's fields and methods. Each iteration yields both the type information (StructField or Method) and the value: Previously, you could get all this information by using a for loop with the NumField/Field-style methods (which is what the iterators do internally): Using an iterator is more concise. I hope it justifies the increased API surface.

𝗗 reflect • 𝗣 66631 • 𝗖𝗟 707356 • 𝗔 Quentin Quaadgras

The new Peek method in the bytes package returns the next N bytes from the buffer without advancing it: If the buffer holds fewer than N bytes, Peek also returns an error: The slice returned by Peek points to the buffer's content and stays valid until the buffer is changed. So, if you change the slice right away, it will affect future reads: The slice returned by Peek is only valid until the next call to a read or write method.

𝗗 Buffer.Peek • 𝗣 73794 • 𝗖𝗟 674415 • 𝗔 Ilia Choly

After you start a process in Go, you can access its ID: Internally, the os.Process type uses a process handle instead of the PID (which is just an integer), if the operating system supports it. Specifically, on Linux it uses pidfd, which is a file descriptor that refers to a process. Using the handle instead of the PID makes sure that methods always work with the same OS process, and not a different process that just happens to have the same ID. Previously, you couldn't access the process handle.
Now you can, thanks to the new WithHandle method: WithHandle calls a specified function and passes a process handle as an argument: The handle is guaranteed to refer to the process until the callback function returns, even if the process has already terminated. That's why it's implemented as a callback instead of a field or method. WithHandle is only supported on Linux 5.4+ and Windows. On other operating systems, it doesn't execute the callback and returns an error.

𝗗 Process.WithHandle • 𝗣 70352 • 𝗖𝗟 699615 • 𝗔 Kir Kolyshkin

signal.NotifyContext returns a context that gets canceled when any of the specified signals is received. Previously, the canceled context only showed the standard "context canceled" cause: Now the context's cause shows exactly which signal was received: The returned cause is based on the signal's string representation, so it doesn't provide the actual signal value — just its name.

𝗗 signal.NotifyContext • 𝗖𝗟 721700 • 𝗔 Filippo Valsorda

An IP address prefix represents an IP subnet. These prefixes are usually written in CIDR notation: In Go, an IP prefix is represented by the netip.Prefix type. The new Compare method lets you compare two IP prefixes, making it easy to sort them without having to write your own comparison code: Compare orders two prefixes as follows: This follows the same order as Python's ipaddress module and the standard IANA (Internet Assigned Numbers Authority) convention.

𝗗 Prefix.Compare • 𝗣 61642 • 𝗖𝗟 700355 • 𝗔 database64128

The net package has top-level functions for connecting to an address using different networks (protocols) — DialTCP, DialUDP, DialIP, and DialUnix. They were made before context.Context was introduced, so they don't support cancellation: There's also a Dialer type with a general-purpose DialContext method. It supports cancellation and can be used to connect to any of the known networks: However, it's a bit less efficient than network-specific functions like DialTCP — because of the extra overhead from address resolution and network type dispatching. So, the network-specific functions in the net package are more efficient, but they don't support cancellation.
The Dialer type supports cancellation, but it's less efficient. The Go team decided to resolve this contradiction. The new context-aware methods combine the efficiency of the existing network-specific functions with the cancellation capabilities of DialContext: I wouldn't say that having three different ways to dial is very convenient, but that's the price of backward compatibility.

𝗗 net.Dialer • 𝗣 49097 • 𝗖𝗟 490975 • 𝗔 Michael Fraenkel

The default httptest certificate already lists example.com in its DNSNames (a list of hostnames or domain names that the certificate is authorized to secure). Because of this, the test client doesn't trust responses from the real example.com: To fix this issue, the HTTP client returned by Server.Client now redirects requests for example.com and its subdomains to the test server:

𝗗 Server.Client • 𝗖𝗟 666855 • 𝗔 Sean Liao

People often point out that using fmt.Errorf for plain strings causes more memory allocations than errors.New. Because of this, some suggest switching code from fmt.Errorf to errors.New when formatting isn't needed. The Go team disagrees. Here's a quote from Russ Cox: Using fmt.Errorf is completely fine, especially in a program where all the errors are constructed with fmt.Errorf. Having to mentally switch between two functions based on the argument is unnecessary noise.

With the new Go release, this debate should finally be settled. For unformatted strings, fmt.Errorf now allocates less and generally matches the allocations for errors.New. Specifically, fmt.Errorf goes from 2 allocations to 0 for a non-escaping error, and from 2 allocations to 1 for an escaping error: This matches the allocations for errors.New in both cases. The difference in CPU cost is also much smaller now. Previously, it was ~64ns vs. ~21ns for fmt.Errorf vs. errors.New for escaping errors; now it's ~25ns vs. ~21ns.

Here are the "before and after" benchmarks for the change. The benchmark names distinguish the non-escaping and escaping cases, as well as plain error strings vs. errors that include formatting.
Seconds per operation: Bytes per operation: Allocations per operation: If you're interested in the details, I highly recommend reading the CL — it's perfectly written.

𝗗 fmt.Errorf • 𝗖𝗟 708836 • 𝗔 thepudds

Previously, io.ReadAll allocated a lot of intermediate memory as it grew its result slice to the size of the input data. Now, it uses intermediate slices of exponentially growing size, and then copies them into a final perfectly-sized slice at the end. The new implementation is about twice as fast and uses roughly half the memory for a 65KiB input; it's even more efficient with larger inputs. Here are the geomean results comparing the old and new versions for different input sizes: See the full benchmark results in the commit. Unfortunately, the author didn't provide the benchmark source code.

Ensuring the final slice is minimally sized is also quite helpful. The slice might persist for a long time, and the unused capacity in a backing array (as in the old version) would just waste memory. As with the fmt.Errorf optimization, I recommend reading the CL — it's very good. Both changes come from thepudds, whose change descriptions are every reviewer's dream come true.

𝗗 io.ReadAll • 𝗖𝗟 722500 • 𝗔 thepudds

The log/slog package, introduced in Go 1.21, offers a reliable, production-ready logging solution. Since its release, many projects have switched from third-party logging packages to use it. However, it was missing one key feature: the ability to send log records to multiple handlers, such as stdout and a log file. The new MultiHandler type solves this problem. It implements the standard slog.Handler interface and calls all the handlers you set up. For example, we can create a log handler that writes to stdout: And another handler that writes to a file: Finally, combine them using a MultiHandler: I'm also printing the file contents here to show the results. When the MultiHandler receives a log record, it sends it to each enabled handler one by one.
If any handler returns an error, MultiHandler doesn't stop; instead, it combines all the errors using errors.Join: The Enabled method reports whether any of the configured handlers is enabled: The other methods — WithAttrs and WithGroup — call the corresponding methods on each of the enabled handlers.

𝗗 slog.MultiHandler • 𝗣 65954 • 𝗖𝗟 692237 • 𝗔 Jes Cok

Test artifacts are files created by tests or benchmarks, such as execution logs, memory dumps, or analysis reports. They are important for debugging failures in remote environments (like CI), where developers can't step through the code manually. Previously, the Go test framework and tools didn't support test artifacts. Now they do.

The new ArtifactDir methods on testing.T, testing.B, and testing.F return a directory where you can write test output files: If you run go test with the artifacts flag, this directory will be inside the output directory (specified by -outputdir, or the current directory by default): As you can see, the first time ArtifactDir is called, it writes the directory location to the test log, which is quite handy. If you don't use the flag, artifacts are stored in a temporary directory which is deleted after the test completes.

Each test or subtest within each package has its own unique artifact directory. Subtest outputs are not stored inside the parent test's output directory — all artifact directories for a given package are created at the same level: The artifact directory path normally looks like this: But if this path can't be safely converted into a local file path (which, for some reason, always happens on my machine), the path will simply be: (which is what happens in the examples above) Repeated calls to ArtifactDir in the same test or subtest return the same directory.

𝗗 T.ArtifactDir • 𝗣 71287 • 𝗖𝗟 696399 • 𝗔 Damien Neil

Over the years, the go fix command became a sad, neglected bag of rewrites for very ancient Go features. But now, it's making a comeback. The new go fix is re-implemented using the Go analysis framework — the same one go vet uses.
While and now use the same infrastructure, they have different purposes and use different sets of analyzers: By default, runs a full set of analyzers (currently, there are more than 20). To choose specific analyzers, use the flag for each one, or use to run all analyzers except the ones you turned off. For example, here we only enable the analyzer: And here, we enable all analyzers except : Currently, there's no way to suppress specific analyzers for certain files or sections of code. To give you a taste of analyzers, here's one of them in action. It replaces loops with or : If you're interested, check out the dedicated blog post for the full list of analyzers with examples. 𝗗 cmd/fix • 𝗚 go fix • 𝗣 71859 • 𝗔 Alan Donovan Go 1.26 is incredibly big — it's the largest release I've ever seen, and for good reason: All in all, a great release! You might be wondering about the package that was introduced as experimental in 1.25. It's still experimental and available with the flag. P.S. To catch up on other Go releases, check out the Go features by version list or explore the interactive tours for Go 1.25 and 1.24 . P.P.S. Want to learn more about Go? Check out my interactive book on concurrency a vector from array/slice, or a vector to array/slice. Arithmetic: , , , , . Bitwise: , , , , . Comparison: , , , , . Conversion: , , . Masking: , , . Rearrangement: . Collect live goroutines . Start with currently active (runnable or running) goroutines as roots. Ignore blocked goroutines for now. Mark reachable memory . Trace pointers from roots to find which synchronization objects (like channels or wait groups) are currently reachable by these roots. Resurrect blocked goroutines . Check all currently blocked goroutines. If a blocked goroutine is waiting for a synchronization resource that was just marked as reachable — add that goroutine to the roots. Iterate . Repeat steps 2 and 3 until there are no more new goroutines blocked on reachable objects. Report the leaks . 
Any goroutines left in the blocked state are waiting for resources that no active part of the program can access. They're considered leaked.

Total number of goroutines since the program started. Number of goroutines in each state. Number of active threads.

First by validity (invalid before valid). Then by address family (IPv4 before IPv6). Then by masked IP address (network IP). Then by prefix length. Then by unmasked address (original IP).

Vet is for reporting problems. Its analyzers describe actual issues, but they don't always suggest fixes, and the fixes aren't always safe to apply. Fix is (mostly) for modernizing the code to use newer language and library features. Its analyzers produce fixes that are always safe to apply, but don't necessarily indicate problems with the code.

It brings a lot of useful updates, like the improved new builtin, type-safe error checking, and the goroutine leak detector. There are also many performance upgrades, including the new garbage collector, faster cgo and memory allocation, and optimized fmt.Errorf and io.ReadAll. On top of that, it adds quality-of-life features like multiple log handlers, test artifacts, and the updated go fix tool. Finally, there are two specialized experimental packages: one with SIMD support and another with protected mode for forward secrecy.

Anton Zhiyanov 1 week ago

Fear is not advocacy

AI advocates seem to be the only kind of technology advocates who feel this imminent urge to constantly criticize developers for not being excited enough about their tech. It would be crazy if I presented new Go features like this: If you still don't use the package, all your systems will eventually succumb to concurrency bugs. If you don't use iterators, you have absolutely nothing interesting to build. The job of an advocate is to spark interest, not to reproach people or instill FOMO. And yet that's exactly what AI advocates do. What a weird way to advocate. This whole "devote your life to AI right now, or you'll be out of a job soon" narrative is false. You don't have to be a world-class algorithm expert to write good software. You don't have to be a Linux expert to use containers. And you don't have to spend all your time now trying to become an expert in chasing ever-changing AI tech. As with any new technology, developers adopting AI typically fall into four groups: early adopters, early majority, late majority, and laggards. Right now, AI advocates are trying to shame everyone into becoming early adopters. But it's perfectly okay to wait if you're sceptical. Being part of the late majority is a safe and reasonable choice. If anything, you'll have fewer bugs to deal with. As the industry adopts AI practices, you'll naturally absorb just the right amount of them. You are going to be fine.

Anton Zhiyanov 2 weeks ago

'Better C' playgrounds

I have a soft spot for the "better C" family of languages: C3, Hare, Odin, V, and Zig. I'm not saying these languages are actually better than C — they're just different. But I needed to come up with an umbrella term for them, and "better C" was the only thing that came to mind. I believe playgrounds and interactive documentation make programming languages easier for more people to learn. That's why I created online sandboxes for these langs. You can try them out below, embed them on your own website, or self-host and customize them. If you're already familiar with one of these languages, maybe you could even create an interactive guide for it? I'm happy to help if you want to give it a try. C3  • Hare  • Odin  • V  • Zig  • Editors An ergonomic, safe, and familiar evolution of C. ⛫  homepage • αω  tutorial • ⚘  community A systems programming language designed to be simple, stable, and robust. ⛫  homepage • αω  tutorial • ⚘  community A high-performance, data-oriented systems programming language. ⛫  homepage • αω  tutorial • ⚘  community A language with C-level performance and rapid compilation speeds. ⛫  homepage • αω  tutorial • ⚘  community A language designed for performance and explicit control with powerful metaprogramming. ⛫  homepage • αω  tutorial • ⚘  community If you want to do more than just "hello world," there are also full-size online editors . They're pretty basic, but still can be useful.

Anton Zhiyanov 3 weeks ago

Go feature: Modernized go fix

Part of the Accepted! series: Go proposals and features explained in simple terms. The modernized go fix command uses a fresh set of analyzers and the same infrastructure as go vet. Ver. 1.26 • Tools • Medium impact

The new go fix is re-implemented using the Go analysis framework — the same one go vet uses. While vet and fix now use the same infrastructure, they have different purposes and use different sets of analyzers: See the full set of fix's analyzers in the Analyzers section.

The main goal is to bring modernization tools from the Go language server (gopls) to the command line. If go fix includes the modernize suite, developers can easily and safely update their entire codebase after a new Go release with just one command. Re-implementing go fix also makes the Go toolchain simpler. The unified vet and fix use the same backend framework and extension mechanism. This makes the tools more consistent, easier to maintain, and more flexible for developers who want to use custom analysis tools. Implementing the new command:

By default, go fix runs a full set of analyzers (see the list below). To choose specific analyzers, use the flag for each one, or use to run all analyzers except the ones you turned off. For example, here we only enable one analyzer: And here, we enable all analyzers except one: Currently, there's no way to suppress specific analyzers for certain files or sections of code. Here's the list of fixes currently available in go fix, along with examples.
any  • bloop  • fmtappendf  • forvar  • hostport  • inline  • mapsloop  • minmax  • newexpr  • omitzero  • plusbuild  • rangeint  • reflecttypefor  • slicescontains  • slicessort  • stditerators  • stringsbuilder  • stringscut  • stringscutprefix  • stringsseq  • testingcontext  • waitgroup Replace with : Replace for-range over with and remove unnecessary manual timer control: Replace with to avoid intermediate string allocation: Remove unnecessary shadowing of loop variables: Replace network addresses created with by using instead, because host-port pairs made with don't work with IPv6: Inline function calls according to the comment directives: Replace explicit loops over maps with calls to package ( , , , or depending on the context): Replace if/else statements with calls to or : Replace custom "pointer to" functions with : Remove from struct-type fields because this tag doesn't have any effect on them: Remove obsolete comments: Replace 3-clause for loops with for-range over integers: Replace with : Replace loops with or : Replace with for basic types: Use iterators instead of / -style APIs for certain types in the standard library: Replace repeated with : Replace some uses of and string slicing with or : Replace / with and / with : Replace ranging over / with / : Replace with in tests: Replace + with : 𝗣 71859 👥 Alan Donovan , Jonathan Amsterdam Vet is for reporting problems. Its analyzers describe actual issues, but they don't always suggest fixes, and the fixes aren't always safe to apply. Fix is (mostly) for modernizing the code to use newer language and library features. Its analyzers produce fixes that are always safe to apply, but don't necessarily indicate problems with the code.

Anton Zhiyanov 3 weeks ago

Detecting goroutine leaks in modern Go

Deadlocks, race conditions, and goroutine leaks are probably the three most common problems in concurrent Go programming. Deadlocks usually cause panics, so they're easier to spot. The race detector can help find data races (although it doesn't catch everything and doesn't help with other types of race conditions). As for goroutine leaks, Go's tooling did not address them for a long time. A leak occurs when one or more goroutines are indefinitely blocked on synchronization primitives like channels, while other goroutines continue running and the program as a whole keeps functioning. We'll look at some examples shortly. Things started to change in Go 1.24 with the introduction of the package. There will be even bigger changes in Go 1.26, which adds a new experimental profile that reports leaked goroutines. Let's take a look! A simple leak  • Detection: goleak  • Detection: synctest  • Detection: pprof  • Algorithm  • Range over channel  • Double send  • Early return  • Take first  • Cancel/timeout  • Orphans  • Final thoughts Let's say there's a function that runs the given functions concurrently and sends their results to an output channel: And a simple test: Send three functions to be executed and collect the results from the output channel. The test passed, so the function works correctly. But does it really? Let's pass three functions to without collecting the results, and count the goroutines: After 50 ms — when all the functions should definitely have finished — there are still three running goroutines ( ). In other words, all the goroutines are stuck. The reason is that the channel is unbuffered. If the client doesn't read from it, or doesn't read all the results, the goroutines inside get blocked on sending the result to . Let's modify the test to catch the leak. Obviously, we don't want to rely on in tests — such a check is too fragile. Let's use a third-party goleak package instead: playground ▶ The test output clearly shows where the leak occurs.
Goleak uses internally, but it does so quite efficiently. It inspects the stack for unexpected goroutines up to 20 times, with the wait time between checks increasing exponentially, starting at 1 microsecond and going up to 100 milliseconds. This way, the test runs almost instantly. Still, I'd prefer not to use third-party packages and . Let's check for leaks without any third-party packages by using the package (experimental in Go 1.24, production-ready in Go 1.25+): I'll keep this explanation short since isn't the main focus of this article. If you want to learn more about it, check out the Concurrency testing guide. I highly recommend it — is super useful! Here's what happens: Next, comes into play. It tries to wait for all child goroutines to finish before it returns. But if it sees that some goroutines are durably blocked (in our case, all three are blocked trying to send to the channel), it panics: main bubble goroutine has exited but blocked goroutines remain So, here we found the leak without using or goleak. Pretty useful! Let's check for leaks using the new profile type (experimental in Go 1.26). We'll use a helper function to run the profiled code and print the results when the profile is ready: Call with three functions and observe all three leaks: We have a nice goroutine stack trace that shows exactly where the leak happens. Unfortunately, we had to use again, so this probably isn't the best way to test — unless we combine it with to use the fake clock. On the other hand, we can collect a from a running program, which makes it really useful for finding leaks in production systems (unlike ). Pretty neat. This profile uses the garbage collector's marking phase to find goroutines that are permanently blocked (leaked). The approach is explained in detail in the proposal and the paper by Saioc et al. — check it out if you're interested. 
Here's the gist of it: In the rest of the article, we'll review the different types of leaks often observed in production and see whether and are able to detect each of them (spoiler: they are). Based on the code examples from the common-goroutine-leak-patterns repository by Georgian-Vlad Saioc, licensed under the Apache-2.0 license. One or more goroutines receive from a channel using , but the sender never closes the channel, so all the receivers eventually leak: Notice how and give almost the same stack traces, clearly showing the root cause of the problem. You'll see this in the next examples as well. Fix: The sender should close the channel after it finishes sending. Try uncommenting the ⓧ line and see if both checks pass. The sender accidentally sends more values to a channel than intended, and leaks: Fix: Make sure that each possible path in the code sends to the channel no more times than the receiver is ready for. Alternatively, make the channel's buffer large enough to handle all possible sends. Try uncommenting the ⓧ line and see if both checks pass. The parent goroutine exits without receiving a value from the child goroutine, so the child leaks: Fix: Make the channel buffered so the child goroutine doesn't get blocked when sending. Try making the channel buffered at line ⓧ and see if both checks pass. Similar to "early return". If the parent is canceled before receiving a value from the child goroutine, the child leaks: Fix: Make the channel buffered so the child goroutine doesn't get blocked when sending. Try making the channel buffered at line ⓧ and see if both checks pass. The parent launches N child goroutines, but is only interested in the first result. The remaining N-1 children leak: Using (zero items, the parent leaks): Using (multiple items, children leak): Using (zero items, the parent leaks): Using (multiple items, children leak): Fix: Make the channel's buffer large enough to hold values from all child goroutines.
Also, return early if the source collection is empty. Try changing the implementation as follows and see if both checks pass: Inner goroutines leak because the client doesn't follow the contract described in the type's interface and documentation. Let's say we have a type with the following contract: The implementation isn't particularly important — what really matters is the public contract. Let's say the client breaks the contract and doesn't stop the worker: Then the worker goroutines will leak, just like the documentation says. Fix: Follow the contract and stop the worker to make sure all goroutines are stopped. Try uncommenting the ⓧ line and see if both checks pass. Thanks to improvements in Go 1.24-1.26, it's now much easier to catch goroutine leaks, both during testing and in production. The package is available in 1.24 (experimental) and 1.25+ (production-ready). If you're interested, I have a detailed interactive guide on it. The profile will be available in 1.26 (experimental). According to the authors, the implementation is already production-ready. It's only marked as experimental so they can get feedback on the API, especially about making it a new profile. Check the proposal and the commits for more details on : P.S. If you are into concurrency, check out my interactive book . The call to starts a testing bubble in a separate goroutine. The call to starts three goroutines. The call to blocks the root bubble goroutine. One of the goroutines executes , tries to write to , and gets blocked (because no one is reading from ). The same thing happens to the other two goroutines. sees that all the child goroutines in the bubble are durably blocked, so it unblocks the root goroutine. The inner test function finishes. Collect live goroutines . Start with currently active (runnable or running) goroutines as roots. Ignore blocked goroutines for now. Mark reachable memory . 
Trace pointers from roots to find which memory objects (like channels or mutexes) are currently reachable by these roots. Resurrect blocked goroutines . Check all currently blocked goroutines. If a blocked goroutine is waiting for a synchronization resource that was just marked as reachable — add that goroutine to the roots. Iterate . Repeat steps 2 and 3 until there are no more new goroutines blocked on reachable objects. Report the leaks . Any goroutines left in the blocked state are waiting for resources that no active part of the program can access. They're considered leaked. 𝗣 74609 , 75280 👥 Vlad Saioc , Michael Knyszek 𝗖𝗟 688335 👥 Vlad Saioc

Anton Zhiyanov 1 months ago

Timing 'Hello, world'

Here's a little unscientific chart showing the compile/run times of a "hello world" program in different languages: For interpreted languages, the times shown are only for running the program, since there's no separate compilation step. I had to shorten the Kotlin bar a bit to make it fit within 80 characters. All measurements were done in single-core, containerized sandboxes on an ancient CPU, and the timings include the overhead of . So the exact times aren't very interesting, especially for the top group (Bash to Ruby) — they all took about the same amount of time. Here is the program source code in C: Other languages: Bash · C# · C++ · Dart · Elixir · Go · Haskell · Java · JavaScript · Kotlin · Lua · Odin · PHP · Python · R · Ruby · Rust · Swift · V · Zig Of course, this ranking will be different for real-world projects with lots of code and dependencies. Still, I found it curious to see how each language performs on a simple "hello world" task.

Anton Zhiyanov 1 months ago

Gist of Go: Concurrency is out!

My book on concurrent programming in Go is finally finished. It walks you through goroutines, channels, select, pipelines, synchronization, race prevention, time handling, signaling, atomicity, testing, and concurrency internals. The book follows my usual style: clear explanations with interactive examples, plus auto-tested exercises so you can practice as you go. I genuinely think it's the best practical guide for everyone learning concurrency from scratch or looking to go beyond the basics. There's a dedicated page with all the book details — check it out !

Anton Zhiyanov 1 months ago

Go proposal: Secret mode

Part of the Accepted! series, explaining the upcoming Go changes in simple terms. Automatically erase used memory to prevent secret leaks. Ver. 1.26 • Stdlib • Low impact The new package lets you run a function in secret mode . After the function finishes, it immediately erases (zeroes out) the registers and stack it used. Heap allocations made by the function are erased as soon as the garbage collector decides they are no longer reachable. This helps make sure sensitive information doesn't stay in memory longer than needed, lowering the risk of attackers getting to it. The package is experimental and is mainly for developers of cryptographic libraries, not for application developers. Cryptographic protocols like WireGuard or TLS have a property called "forward secrecy". This means that even if an attacker gains access to long-term secrets (like a private key in TLS), they shouldn't be able to decrypt past communication sessions. To make this work, session keys (used to encrypt and decrypt data during a specific communication session) need to be erased from memory after they're used. If there's no reliable way to clear this memory, the keys could stay there indefinitely, which would break forward secrecy. In Go, the runtime manages memory, and it doesn't guarantee when or how memory is cleared. Sensitive data might remain in heap allocations or stack frames, potentially exposed in core dumps or through memory attacks. Developers often have to use unreliable "hacks" with reflection to try to zero out internal buffers in cryptographic libraries. Even so, some data might still stay in memory where the developer can't reach or control it. The solution is to provide a runtime mechanism that automatically erases all temporary storage used during sensitive operations. This will make it easier for library developers to write secure code without using workarounds. 
Add the package with and functions: The current implementation has several limitations: The last point might not be immediately obvious, so here's an example. If an offset in an array is itself secret (you have a array and the secret key always starts at ), don't create a pointer to that location (don't create a pointer to ). Otherwise, the garbage collector might store this pointer, since it needs to know about all active pointers to do its job. If someone launches an attack to access the GC's memory, your secret offset could be exposed. The package is mainly for developers who work on cryptographic libraries. Most apps should use higher-level libraries that use behind the scenes. As of Go 1.26, the package is experimental and can be enabled by setting at build time. Use to generate a session key and encrypt a message using AES-GCM: Note that protects not just the raw key, but also the structure (which contains the expanded key schedule) created inside the function. This is a simplified example, of course — it only shows how memory erasure works, not a full cryptographic exchange. In real situations, the key needs to be shared securely with the receiver (for example, through key exchange) so decryption can work. 𝗣 21865 • 𝗖𝗟 704615 • 👥 Daniel Morsing , Dave Anderson , Filippo Valsorda , Jason A. Donenfeld , Keith Randall , Russ Cox Only supported on linux/amd64 and linux/arm64. On unsupported platforms, invokes directly. Protection does not cover any global variables that writes to. Trying to start a goroutine within causes a panic. If calls , erasure is delayed until all deferred functions are executed. Heap allocations are only erased if ➊ the program drops all references to them, and ➋ then the garbage collector notices that those references are gone. The program controls the first part, but the second part depends on when the runtime decides to act. If panics, the panicked value might reference memory allocated inside . 
That memory won't be erased until (at least) the panicked value is no longer reachable. Pointer addresses might leak into data buffers that the runtime uses for garbage collection. Do not put confidential information into pointers.

Anton Zhiyanov 1 months ago

Gist of Go: Concurrency internals

This is a chapter from my book on Go concurrency , which teaches the topic from the ground up through interactive examples. Here's where we started this book: Functions that run with are called goroutines. The Go runtime juggles these goroutines and distributes them among operating system threads running on CPU cores. Compared to OS threads, goroutines are lightweight, so you can create hundreds or thousands of them. That's generally correct, but it's a little too brief. In this chapter, we'll take a closer look at how goroutines work. We'll still use a simplified model, but it should help you understand how everything fits together. Concurrency • Goroutine scheduler • GOMAXPROCS • Concurrency primitives • Scheduler metrics • Profiling • Tracing • Keep it up At the hardware level, CPU cores are responsible for running parallel tasks. If a processor has 4 cores, it can run 4 instructions at the same time — one on each core. At the operating system level, a thread is the basic unit of execution. There are usually many more threads than CPU cores, so the operating system's scheduler decides which threads to run and which ones to pause. The scheduler keeps switching between threads to make sure each one gets a turn to run on a CPU, instead of waiting in line forever. This is how the operating system handles concurrency. At the Go runtime level, a goroutine is the basic unit of execution. The runtime scheduler runs a fixed number of OS threads, often one per CPU core. There can be many more goroutines than threads, so the scheduler decides which goroutines to run on the available threads and which ones to pause. The scheduler keeps switching between goroutines to make sure each one gets a turn to run on a thread, instead of waiting in line forever. This is how Go handles concurrency. The Go runtime scheduler doesn't decide which threads run on the CPU — that's the operating system scheduler's job. 
The Go runtime makes sure all goroutines run on the threads it manages, but the OS controls how and when those threads actually get CPU time. The scheduler's job is to run M goroutines on N operating system threads, where M can be much larger than N. Here's a simple way to do it: Take goroutines G11-G14 and run them: Goroutine G12 got blocked while reading from the channel. Put it back in the queue and replace it with G15: But there are a few things to keep in mind. Let's say goroutines G11–G14 are running smoothly without getting blocked by mutexes or channels. Does that mean goroutines G15–G20 won't run at all and will just have to wait ( starve ) until one of G11–G14 finally finishes? That would be unfortunate. That's why the scheduler checks each running goroutine roughly every 10 ms to decide if it's time to pause it and put it back in the queue. This approach is called preemptive scheduling: the scheduler can interrupt running goroutines when needed so others have a chance to run too. System calls The scheduler can manage a goroutine while it's running Go code. But what happens if a goroutine makes a system call, like reading from disk? In that case, the scheduler can't take the goroutine off the thread, and there's no way to know how long the system call will take. For example, if goroutines G11–G14 in our example spend a long time in system calls, all worker threads will be blocked, and the program will basically "freeze". To solve this problem, the scheduler starts new threads if the existing ones get blocked in a system call. For example, here's what happens if G11 and G12 make system calls: Here, the scheduler started two new threads, E and F, and assigned goroutines G15 and G16 from the queue to these threads. When G11 and G12 finish their system calls, the scheduler will stop or terminate the extra threads (E and F) and keep running the goroutines on four threads: A-B-C-D. This is a simplified model of how the goroutine scheduler works in Go. 
If you want to learn more, I recommend watching the talk by Dmitry Vyukov, one of the scheduler's developers: Go scheduler: Implementing language with lightweight concurrency ( video , slides ) We said that the scheduler uses N threads to run goroutines. In the Go runtime, the value of N is set by a parameter called . The runtime setting controls the maximum number of operating system threads the Go scheduler can use to execute goroutines concurrently. It defaults to the value of , which is the number of logical CPUs on the machine. Strictly speaking, is either the total number of logical CPUs or the number allowed by the CPU affinity mask, whichever is lower. This can be adjusted by the CPU quota, as explained below. For example, on my 8-core laptop, the default value of is also 8: You can change by setting environment variable or calling : You can also undo the manual changes and go back to the default value set by the runtime. To do this, use the function (Go 1.25+): Go programs often run in containers, like those managed by Docker or Kubernetes. These systems let you limit the CPU resources for a container using a Linux feature called cgroups . A cgroup (control group) in Linux lets you group processes together and control how much CPU, memory, and network I/O they can use by setting limits and priorities. For example, here's how you can limit a Docker container to use only four CPUs: Before version 1.25, the Go runtime didn't consider the CPU quota when setting the value. No matter how you limited CPU resources, was always set to the number of logical CPUs on the host machine: Starting with version 1.25, the Go runtime respects the CPU quota: So, the default value is set to either the number of logical CPUs or the CPU limit enforced by cgroup settings for the process, whichever is lower. Note on CPU limits Cgroups actually offer not just one, but two ways to limit CPU resources: Docker's and / set the quota, while sets the shares. 
Kubernetes' CPU limit sets the quota, while CPU request sets the shares. Go's runtime only takes the CPU quota into account, not the shares. Fractional CPU limits are rounded up: On a machine with multiple CPUs, the minimum default value for is 2, even if the CPU limit is set lower: The Go runtime automatically updates if the CPU limit changes. It happens up to once per second (less frequently if the application is idle). Let's take a quick look at the three main concurrency tools for Go: goroutines, channels, and select. A goroutine is implemented as a pointer to a structure. Here's what it looks like: The structure has many fields, but most of its memory is taken up by the stack, which holds the goroutine's local variables. By default, each stack gets 2 KB of memory, and it grows if needed. Because goroutines use very little memory, they're much more efficient than operating system threads, which usually need about 1 MB each. Their small size lets you run tens (or even hundreds) of thousands of goroutines on a single machine. A channel is implemented as a pointer to a structure. Here's what it looks like: The buffer array ( ) has a fixed size ( , which you can get with the builtin). It's created when you make a buffered channel. The number of items in the channel ( , which you can get with the builtin) increases when you send to the channel and decreases when you receive from it. The builtin sets the field to 1. Sending an item to an unbuffered channel, or to a buffered channel that's already full, puts the goroutine into the queue. Receiving from an empty channel puts the goroutine into the queue. The select logic is implemented in the function. It's a huge function that takes a list of select cases and (very simply put) works as follows: ✎ Exercise: Runtime simulator Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it . 
If you are okay with just theory for now, let's continue. Metrics show how the Go runtime is performing, like how much heap memory it uses or how long garbage collection pauses take. Each metric has a unique name (for example, ) and a value, which can be a number or a histogram. We use the package to work with metrics. List all available metrics with descriptions: Get the value of a specific metric: Here are some goroutine-related metrics: In real projects, runtime metrics are usually exported automatically with client libraries for Prometheus, OpenTelemetry, or other observability tools. Here's an example for Prometheus: The exported metrics are then collected by Prometheus, visualized, and used to set up alerts. Profiling helps you understand exactly what the program is doing, what resources it uses, and where in the code this happens. Profiling is often not recommended in production because it's a "heavy" process that can slow things down. But that's not the case with Go. Go's profiler is designed for production use. It uses sampling, so it doesn't track every single operation. Instead, it takes quick snapshots of the runtime every 10 ms and puts them together to give you a full picture. Go supports the following profiles: The easiest way to add a profiler to your app is by using the package. When you import it, it automatically registers HTTP handlers for collecting profiles: Or you can register profiler handlers manually: After that, you can start profiling with a specific profile by running the command with the matching URL, or just open that URL in your browser: For the CPU profile, you can choose how long the profiler runs (the default is 30 seconds). Other profiles are taken instantly. After running the profiler, you'll get a binary file that you can open in the browser using the same utility. For example: The pprof web interface lets you view the same profile in different ways. 
My personal favorites are the flame graph , which clearly shows the call hierarchy and resource usage, and the source view, which shows the exact lines of code. You can also profile manually. To collect a CPU profile, use and : To collect other profiles, use : Profiling is a broad topic, and we've only touched the surface. To learn more, start with these articles: Tracing records certain types of events while the program is running, mainly those related to concurrency and memory: If you enabled the profiling server as described earlier, you can collect a trace using this URL: Trace files can be quite large, so it's better to use a small N value. After tracing is complete, you'll get a binary file that you can open in the browser using the utility: In the trace web interface, you'll see each goroutine's "lifecycle" on its own line. You can zoom in and out of the trace with the W and S keys, and you can click on any event to see more details: You can also collect a trace manually: Flight recording is a tracing technique that collects execution data, such as function calls and memory allocations, within a sliding window that's limited by size or duration. It helps to record traces of interesting program behavior, even if you don't know in advance when it will happen. The type (Go 1.25+) implements a flight recorder in Go. It tracks a moving window over the execution trace produced by the runtime, always containing the most recent trace data. Here's an example of how you might use it. First, configure the sliding window: Then create the recorder and start it: Continue with the application code as usual: Finally, save the trace snapshot to a file when an important event occurs: Use to view the trace in the browser: ✎ Exercise: Comparing blocks Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it . 
If you are okay with just theory for now, let's continue. Now you can see how challenging the Go scheduler's job is. Fortunately, most of the time you don't need to worry about how it works behind the scenes — sticking to goroutines, channels, select, and other synchronization primitives is usually enough. This is the final chapter of my "Gist of Go: Concurrency" book. I invite you to read it — the book is an easy-to-understand, interactive guide to concurrency programming in Go. Pre-order for $10   or read online Put all goroutines in a queue. Take N goroutines from the queue and run them. If a running goroutine gets blocked (for example, waiting to read from a channel or waiting on a mutex), put it back in the queue and run the next goroutine from the queue. CPU quota — the maximum CPU time the cgroup may use within some period window. CPU shares — relative CPU priorities given to the kernel scheduler. Go through the cases and check if the matching channels are ready to send or receive. If several cases are ready, choose one at random (to prevent starvation, where some cases are always chosen and others are never chosen). Once a case is selected, perform the send or receive operation on the matching channel. If there is a default case and no other cases are ready, pick the default. If no cases are ready, block the goroutine and add it to the channel queue for each case. Count of goroutines created since program start (Go 1.26+). Count of live goroutines (created but not finished yet). An increase in this metric may indicate a goroutine leak. Approximate count of goroutines running or blocked in a system call or cgo call (Go 1.26+). An increase in this metric may indicate problems with such calls. Approximate count of goroutines ready to execute, but not executing (Go 1.26+). An increase in this metric may mean the system is overloaded and the CPU can't keep up with the growing number of goroutines. Approximate count of goroutines executing (Go 1.26+). 
Always less than or equal to . Approximate count of goroutines waiting on a resource — I/O or sync primitives (Go 1.26+). An increase in this metric may indicate issues with mutex locks, other synchronization blocks, or I/O issues. The current count of live threads that are owned by the runtime (Go 1.26+). The current setting — the maximum number of operating system threads the scheduler can use to execute goroutines concurrently. CPU . Shows how much CPU time each function uses. Use it to find performance bottlenecks if your program is running slowly because of CPU-heavy tasks. Heap . Shows the heap memory currently used by each function. Use it to detect memory leaks or excessive memory usage. Allocs . Shows which functions have used heap memory since the profiler started (not just currently). Use it to optimize garbage collection or reduce allocations that impact performance. Goroutine . Shows the stack traces of all current goroutines. Use it to get an overview of what the program is doing. Block . Shows where goroutines block waiting on synchronization primitives like channels, mutexes and wait groups. Use it to identify synchronization bottlenecks and issues in data exchange between goroutines. Disabled by default. Mutex . Shows lock contentions on mutexes and internal runtime locks. Use it to find "problematic" mutexes that goroutines are frequently waiting for. Disabled by default. Profiling Go Programs Diagnostics goroutine creation and state changes; system calls; garbage collection; heap size changes;

Anton Zhiyanov 1 months ago

Go proposal: Type-safe error checking

Part of the Accepted! series, explaining the upcoming Go changes in simple terms. Introducing — a modern, type-safe alternative to . Ver. 1.26 • Stdlib • High impact The new function is a generic version of : It's type-safe, faster, and easier to use: is not deprecated (yet), but is recommended for new code. The function requires you to declare a variable of the target error type and pass a pointer to it: It makes the code quite verbose, especially when checking for multiple types of errors: With a generic , you can specify the error type right in the function call. This makes the code shorter and keeps error variables scoped to their blocks: Another issue with is that it uses reflection and can cause runtime panics if used incorrectly (like if you pass a non-pointer or a type that doesn't implement ). While static analysis tools usually catch these issues, using the generic has several benefits: Finally, can handle everything that does, so it's a drop-in improvement for new code. Add the function to the package: Recommend using instead of : Open a file and check if the error is related to the file path: 𝗣 51945 • 𝗖𝗟 707235 No reflection 1 . No runtime panics. Fewer allocations. Compile-time type safety. Unlike , doesn't use the package, but it still relies on type assertions and interface checks. These operations access runtime type metadata, so isn't completely "reflection-free" in the strict sense.  ↩︎

Anton Zhiyanov 1 month ago

Go proposal: Goroutine metrics

Part of the Accepted! series, explaining the upcoming Go changes in simple terms. Export goroutine-related metrics from the Go runtime. Ver. 1.26 • Stdlib • Medium impact New metrics in the runtime/metrics package give better insight into goroutine scheduling. Go's runtime/metrics package already provides a lot of runtime stats, but it doesn't include metrics for goroutine states or thread counts. Per-state goroutine metrics can be linked to common production issues: an increasing waiting count can show a lock contention problem; a high not-in-go count means goroutines are stuck in syscalls or cgo; a growing runnable backlog suggests the CPUs can't keep up with demand. Observability systems can track these counters to spot regressions, find scheduler bottlenecks, and send alerts when goroutine behavior deviates from the usual patterns. Developers can use them to catch problems early without needing full traces. The proposal adds the following metrics to the runtime/metrics package: the total number of goroutines since the program started, the number of goroutines in each state, and the number of active threads. The per-state numbers are not guaranteed to add up to the live goroutine count (/sched/goroutines:goroutines, available since Go 1.16). All metrics use uint64 counters. Example: start some goroutines and print the metrics after 100 ms of activity. No surprises here: we read the new metric values the same way as before — using metrics.Read . 𝗣 15490 • 𝗖𝗟 690397 , 690398 , 690399 P.S. If you are into goroutines, check out my interactive book on concurrency

Anton Zhiyanov 1 month ago

Gist of Go: Concurrency testing

This is a chapter from my book on Go concurrency , which teaches the topic from the ground up through interactive examples. Testing concurrent programs is a lot like testing single-task programs. If the code is well-designed, you can test the state of a concurrent program with standard tools like channels, wait groups, and other abstractions built on top of them. But if you've made it this far, you know that concurrency is never that easy. In this chapter, we'll go over common testing problems and the solutions that Go offers. Waiting for goroutines • Checking channels • Checking for leaks • Durable blocking • Instant waiting • Time inside the bubble • Thoughts on time 1  ✎ • Thoughts on time 2  ✎ • Checking for cleanup • Bubble rules • Keep it up Let's say we want to test a function whose calculations run asynchronously in a separate goroutine. The function returns a result channel, so this isn't a problem: at point ⓧ, the test is guaranteed to wait for the inner goroutine to finish. The rest of the test code doesn't need to know anything about how concurrency works inside the function. Overall, the test isn't any more complicated than if the function were synchronous. But we're lucky that it returns a channel. What if it doesn't? Suppose the function writes its result to a shared variable instead. We write a simple test and run it: the assertion fails because at point ⓧ, we didn't wait for the inner goroutine to finish. In other words, we didn't synchronize the test goroutine with the inner goroutine. That's why the result variable still has its initial value (0) when we do the check. We can add a short delay with time.Sleep: the test is now passing. But using time.Sleep to sync goroutines isn't a great idea, even in tests. We don't want to pick a custom delay for every function we're testing. Also, the function's execution time may differ between the local machine and a CI server. If we use a longer delay just to be safe, the tests will end up taking too long to run.
Sometimes you can't avoid using time.Sleep in tests, but since Go 1.25, the testing/synctest package has made these cases much less common. Let's see how it works. The synctest package has a lot going on under the hood, but its public API is very simple: two functions, synctest.Test and synctest.Wait. The synctest.Test function creates an isolated bubble where you can control time to some extent. Any new goroutines started inside this bubble become part of the bubble. So, if we wrap the test code with synctest.Test, everything will run inside the bubble — the test code, the function we're testing, and its goroutine. At point ⓧ, we want to wait for the goroutine to finish. The synctest.Wait function comes to the rescue! It blocks the calling goroutine until all other goroutines in the bubble are finished. (It's actually a bit more complicated than that, but we'll talk about it later.) In our case, there's only one other goroutine (the inner goroutine), so Wait will pause until it finishes, and then the test will move on. Now the test passes instantly. That's better! ✎ Exercise: Wait until done Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it . If you are okay with just theory for now, let's continue. As we've seen, you can use synctest.Wait to wait for the tested goroutine to finish, and then check the state of the data you are interested in. You can also use it to check the state of channels. Let's say there's a function that generates N numbers like 11, 22, 33, and so on. And a simple test: set N=2, get the first number from the generator's output channel, then get the second number. The test passed, so the function works correctly. But does it really? Let's use the generator in "production": panic! We forgot to close the channel when exiting the inner goroutine, so the for-range loop waiting on that channel got stuck. Let's fix the code and add a test for the channel state. The test is still failing, even though we're now closing the channel when the goroutine exits.
This is a familiar problem: at point ⓧ, we didn't wait for the inner goroutine to finish. So when we check the channel, it hasn't been closed yet. That's why the test fails. We can delay the check using time.Sleep, but it's better to use synctest.Wait: at point ⓧ, Wait blocks the test until the only other goroutine (the inner goroutine) finishes. Once the goroutine has exited, the channel is already closed. So, in the select statement, the closed-channel case triggers with ok set to false, allowing the test to pass. As you can see, the synctest package helped us avoid delays in the test, and the test itself didn't get much more complicated. As we've seen, you can use synctest.Wait to wait for the tested goroutine to finish, and then check the state of the data or channels. You can also use it to detect goroutine leaks. Let's say there's a function that runs the given functions concurrently and sends their results to an output channel. And a simple test: send three functions to be executed, get the first result from the output channel, and check it. The test passed, so the function works correctly. But does it really? Let's call the function three times, passing three functions each time: after 50 ms — when all the functions should definitely have finished — there are still 9 running goroutines. In other words, all the goroutines are stuck. The reason is that the output channel is unbuffered. If the client doesn't read from it, or doesn't read all the results, the inner goroutines get blocked when they try to send their results to the channel. Let's fix this by adding a buffer of the right size to the channel. Then add a test to check the number of goroutines: the test is still failing, even though the channel is now buffered and the goroutines shouldn't block on sending to it. This is a familiar problem: at point ⓧ, we didn't wait for the running goroutines to finish. So the goroutine count is greater than zero, which makes the test fail. We can delay the check using time.Sleep (not recommended), or use a third-party package like goleak (a better option): the test passes now.
By the way, goleak also uses time.Sleep internally, but it does so much more efficiently. It retries up to 20 times, with the wait time between checks increasing exponentially, starting at 1 microsecond and going up to 100 milliseconds. This way, the test runs almost instantly. Even better, we can check for leaks without any third-party packages by using synctest. Earlier, I said that synctest.Wait blocks the calling goroutine until all other goroutines finish. Actually, it's a bit more complicated. Wait blocks until all other goroutines either finish or become durably blocked . We'll talk about "durably" later. For now, let's focus on "become blocked." Let's temporarily remove the buffer from the channel and check the test results. Here's what happens: Next, synctest.Test comes into play. It not only starts the bubble goroutine, but also tries to wait for all child goroutines to finish before it returns. If synctest.Test sees that some goroutines are stuck (in our case, all 9 are blocked trying to send to the channel), it panics: main bubble goroutine has exited but blocked goroutines remain So, we found the leak without using time.Sleep or goleak, thanks to the useful features of synctest.Wait and synctest.Test: Now let's make the channel buffered and run the test again: As we've found, synctest.Wait blocks until all goroutines in the bubble — except the one that called it — have either finished or are durably blocked. Let's figure out what "durably blocked" means. For synctest, a goroutine inside a bubble is considered durably blocked if it is blocked by any of the following operations: Other blocking operations are not considered durable, and synctest.Wait ignores them. For example: The distinction between "durable" and other types of blocks is just an implementation detail of the synctest package. It's not a fundamental property of the blocking operations themselves. In real-world applications, this distinction doesn't exist, and "durable" blocks are neither better nor worse than any others. Let's look at an example.
Let's say there's a type that performs some asynchronous computation. Our goal is to write a test that checks the result while the calculation is still running . Let's see how the test changes depending on how the type is implemented (except for the mutex-based version — we'll cover that one a bit later). Suppose it's implemented using a done channel. A naive test fails because when the check runs, the goroutine hasn't set the result yet. Let's use synctest.Wait to wait until the goroutine is blocked at point ⓧ: in ⓧ, the goroutine is blocked on reading from the channel. This channel is created inside the bubble, so the block is durable. The Wait call in the test returns as soon as the block happens, and we get the current value of the result. Now suppose it's implemented using select. Let's use synctest.Wait to wait until the goroutine is blocked at point ⓧ: in ⓧ, the goroutine is blocked on a select statement. Both channels used in the select are created inside the bubble, so the block is durable. The Wait call in the test returns as soon as the block happens, and we get the current value of the result. Now suppose it's implemented using a wait group. Let's use synctest.Wait to wait until the goroutine is blocked at point ⓧ: in ⓧ, the goroutine is blocked on the wait group's Wait call. The group's Add method was called inside the bubble, so this is a durable block. The synctest.Wait call in the test returns as soon as the block happens, and we get the current value of the result. Now suppose it's implemented using a condition variable. Let's use synctest.Wait to wait until the goroutine is blocked at point ⓧ: in ⓧ, the goroutine is blocked on the condition variable's Wait call. This is a durable block. The synctest.Wait call returns as soon as the block happens, and we get the current value of the result. Finally, suppose it's implemented using a mutex. Let's try using synctest.Wait to wait until the goroutine is blocked at point ⓧ: in ⓧ, the goroutine is blocked on the mutex's Lock call. synctest doesn't consider blocking on a mutex to be durable. The synctest.Wait call ignores the block and never returns. The test hangs and only fails when the overall timeout is reached.
You might be wondering why the authors didn't consider blocking on mutexes to be durable. There are a couple of reasons: ⌘ ⌘ ⌘ Let's go back to the original question: how does the test change depending on how the type is implemented? It doesn't change at all. We used the exact same test code every time. If your program uses durably blocking operations, synctest.Wait always works the same way. Very convenient! ✎ Exercise: Blocking queue Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it . If you are okay with just theory for now, let's continue. Inside the bubble, time works differently. Instead of using a regular wall clock, the bubble uses a fake clock that can jump forward to any point in the future. This can be quite handy when testing time-sensitive code. Let's say we want to test a function that waits for a value with a timeout. The positive scenario is straightforward: send a value to the channel, call the function, and check the result. The negative scenario, where the function times out, is also pretty straightforward. But the test takes the full three seconds to complete. We're actually lucky the timeout is only three seconds. It could have been as long as sixty! To make the test run instantly, let's wrap it in synctest.Test. Note that there is no synctest.Wait call here, and the only goroutine in the bubble (the root one) gets durably blocked on a select statement inside the tested function. Here's what happens next: Thanks to the fake clock, the test runs instantly instead of taking three seconds like it would with the "naive" approach. You might have noticed that quite a few circumstances coincided here: We'll look at the alternatives soon, but first, here's a quick exercise. ✎ Exercise: Wait, repeat Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it .
If you are okay with just theory for now, let's continue. The fake clock in synctest can be tricky. It moves forward only if: ➊ all goroutines in the bubble are durably blocked; ➋ there's a future moment when at least one goroutine will unblock; and ➌ synctest.Wait isn't running. Let's look at the alternatives. I'll say right away, this isn't an easy topic. But when has time travel ever been easy? :) Here's the function we're testing. Let's run it in a separate goroutine, so there will be two goroutines in the bubble: synctest.Test panicked because the root bubble goroutine finished while the inner goroutine was still blocked on a select. Reason: synctest only advances the clock if all goroutines are blocked — including the root bubble goroutine. How to fix: use time.Sleep to make sure the root goroutine is also durably blocked. Now all three conditions are met again (all goroutines are durably blocked; the moment of future unblocking is known; there is no call to synctest.Wait). The fake clock moves forward 3 seconds, which unblocks the inner goroutine. That goroutine finishes, leaving only the root one, which is still blocked on time.Sleep. The clock moves forward another 2 seconds, unblocking the root goroutine. The assertion passes, and the test completes successfully. But if we run the test with the race detector enabled (using the -race flag), it reports a data race on the result variable. Logically, using time.Sleep in the root goroutine doesn't guarantee that the inner goroutine (which writes to the variable) will finish before the root goroutine reads from it. That's why the race detector reports a problem. Technically, the test passes because of how synctest is implemented, but the race still exists in the code. The right way to handle this is to call synctest.Wait after time.Sleep: calling Wait ensures that the inner goroutine finishes before the root goroutine reads the variable, so there's no data race anymore. Here's the next function we're testing. Let's replace time.Sleep in the root goroutine with synctest.Wait: synctest.Test panicked because the root bubble goroutine finished while the inner goroutine was still blocked on a select.
Reason: synctest only advances the clock if there is no active synctest.Wait running. If all bubble goroutines are durably blocked but a Wait is running, synctest won't advance the clock. Instead, it will simply finish the Wait call and return control to the goroutine that called it (in this case, the root bubble goroutine). How to fix: don't use synctest.Wait here. Let's update the function to use context cancellation instead of a timer. We won't cancel the context in the test: synctest.Test panicked because all goroutines in the bubble are hopelessly blocked. Reason: synctest only advances the clock if it knows how far to advance it. In this case, there is no future moment that would unblock the select in the function. How to fix: manually unblock the goroutine and call synctest.Wait to wait for it to finish. Now, canceling the context unblocks the select in the function, while synctest.Wait makes sure the goroutine finishes before the test checks the results. Let's update the function to lock a mutex before doing any calculations. In the test, we'll lock the mutex before calling the function, so it will block: the test failed because it hit the overall test timeout. Reason: synctest only works with durable blocks. Blocking on a mutex lock isn't considered durable, so the bubble can't do anything about it — even though the sleeping inner goroutine would have unlocked the mutex in 10 ms if the bubble had used the wall clock. How to fix: don't use synctest here. Now the mutex unlocks after 10 milliseconds (wall clock), the function finishes successfully, and the check passes. The clock inside the bubble won't move forward if: ✎ Exercise: Asynchronous repeater Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it . If you are okay with just theory for now, let's continue. Let's practice understanding time in the bubble with some thinking exercises. Try to solve the problem in your head before using the playground. Here's a function that performs synchronous work, and a test for it. What is the test missing at point ⓧ?
✓ Thoughts on time 1 There's only one goroutine in the test, so when it gets blocked by time.Sleep, the time in the bubble jumps forward by 3 seconds. Then the function sets the flag and finishes. Finally, the test checks the flag and passes successfully. No need to add anything. Let's keep practicing our understanding of time in the bubble with some thinking exercises. Try to solve the problem in your head before using the playground. Here's a function that performs asynchronous work, and a test for it. What is the test missing at point ⓧ? ✓ Thoughts on time 2 Let's go over the options. ✘ synctest.Wait This won't help because Wait returns as soon as time.Sleep inside the function is called. The check fails, and synctest.Test panics with the error: "main bubble goroutine has exited but blocked goroutines remain". ✘ time.Sleep Because of the time.Sleep call in the root goroutine, the wait inside the function is already over by the time the flag is checked. However, there's no guarantee that the assignment has run yet. That's why the test might pass or might fail. ✘ synctest.Wait, then time.Sleep This option is basically the same as just using synctest.Wait, because Wait returns before the time.Sleep in the function even starts. The test might pass or might fail. ✓ time.Sleep, then synctest.Wait This is the correct answer. Without it, the root goroutine isn't blocked, so it checks the flag while the inner goroutine is blocked by the time.Sleep call. The check fails, and synctest.Test panics with the message: "main bubble goroutine has exited but blocked goroutines remain". Sometimes you need to test objects that use resources and should be able to release them. For example, this could be a server that, when started, creates a pool of network connections, connects to a database, and writes file caches. When stopped, it should clean all this up. Let's see how we can make sure everything is properly stopped in the tests. We're going to test this server. Let's say we wrote a basic functional test: the test passes, but does that really mean the server stopped when we called Stop? Not necessarily.
For example, here's a buggy implementation where our test would still pass: as you can see, the author simply forgot to stop the server here. To detect the problem, we can wrap the test in synctest.Test and see it panic: the server ignores the Stop call and doesn't stop the goroutine running inside Start. Because of this, the goroutine gets blocked while writing to the channel. When synctest.Test finishes, it detects the blocked goroutine and panics. Let's fix the server code (to keep things simple, we won't support multiple Start or Stop calls): now the test passes. Here's how it works: Instead of using defer to stop something, it's common to use the t.Cleanup method. It registers a function that will run when the test finishes: functions registered with t.Cleanup run in last-in, first-out (LIFO) order, after all deferred functions have executed. In the test above, there's not much difference between using defer and t.Cleanup. But the difference becomes important if we move the server setup into a separate helper function, so we don't have to repeat the setup code in different tests: the defer approach doesn't work because it calls Stop when the helper returns — before the test assertions run. The t.Cleanup approach works because it calls Stop when the test has finished — after all the assertions have already run. Sometimes, a context (context.Context) is used to stop the server instead of a separate Stop method. In that case, our server interface might look like this: now we don't even need to use defer or t.Cleanup to check whether the server stops when the context is canceled. Just pass t.Context() as the context: it returns a context that is automatically created when the test starts and is automatically canceled when the test finishes. Here's how it works: To check for stopping via a method or function, use defer or t.Cleanup. To check for cancellation or stopping via context, use t.Context(). Inside a bubble, t.Context() returns a context whose Done channel is associated with the bubble. The context is automatically canceled when synctest.Test ends. Functions registered with t.Cleanup inside the bubble run just before synctest.Test finishes. Let's go over the rules for living in the bubble.
The following operations durably block a goroutine: The limitations are quite logical, and you probably won't run into them. Don't create channels or objects that contain channels (like tickers or timers) outside the bubble. Otherwise, the bubble won't be able to manage them, and the test will hang. Don't access synchronization primitives associated with a bubble from outside the bubble. Don't call T.Run, T.Parallel, or T.Deadline inside a bubble. Don't call synctest.Test inside the bubble: bubbles can't be nested. Don't call synctest.Wait from outside the bubble. Don't call synctest.Wait concurrently from multiple goroutines. ✎ Exercise: Testing a pipeline Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it . If you are okay with just theory for now, let's continue. The testing/synctest package is a complicated beast. But now that you've studied it, you can test concurrent programs no matter what synchronization tools they use — channels, selects, wait groups, timers or tickers, or even time.Sleep. In the next chapter, we'll talk about concurrency internals (coming soon). Pre-order for $10   or read online Three calls start 9 goroutines. The call to synctest.Wait blocks the root bubble goroutine. One of the goroutines finishes its work, tries to write to the output channel, and gets blocked (because no one is reading from it). The same thing happens to the other 8 goroutines. synctest.Wait sees that all the child goroutines in the bubble are blocked, so it unblocks the root goroutine. The root goroutine finishes. synctest.Wait unblocks as soon as all other goroutines are durably blocked. synctest.Test panics when finished if there are still blocked goroutines left in the bubble. Sending to or receiving from a channel created within the bubble. A select statement where every case is a channel created within the bubble. Calling WaitGroup.Wait if all Add calls were made inside the bubble. Sending to or receiving from a channel created outside the bubble. Locking a mutex.
I/O operations (like reading a file from disk or waiting for a network response). System calls and cgo calls. Mutexes are usually used to protect shared state, not to coordinate goroutines (the example above is completely unrealistic). In tests, you usually don't need to pause before locking a mutex to check something. Mutex locks are usually held for a very short time, and mutexes themselves need to be as fast as possible. Adding extra logic to support synctest could slow them down in normal (non-test) situations. synctest.Wait waits until all other goroutines in the bubble are blocked. Then, it unblocks the goroutine that called it. The bubble checks if the goroutine can be unblocked by waiting. In our case, it can — we just need to wait 3 seconds. The bubble's clock instantly jumps forward 3 seconds. The select in the function chooses the timeout case, and the function returns its timeout result. The test assertions both pass successfully. There's no synctest.Wait call. There's only one goroutine. The goroutine is durably blocked. It will be unblocked at a certain point in the future. There are goroutines that aren't durably blocked. It's unclear how much time to advance. synctest.Wait is running. Because of the time.Sleep call in the root goroutine, the wait inside the function is already over by the time the flag is checked. Because of the synctest.Wait call, the goroutine is guaranteed to finish (and hence set the flag) before it is checked. The main test code runs. Before the test finishes, the deferred Stop is called. In the server goroutine, the stop case in the select statement triggers, and the goroutine ends. synctest.Test sees that there are no blocked goroutines and finishes without panicking. The main test code runs. Before the test finishes, the context is automatically canceled. The server goroutine stops (as long as the server is implemented correctly and checks for context cancellation). synctest.Test sees that there are no blocked goroutines and finishes without panicking. A bubble is created by calling synctest.Test. Each call creates a separate bubble. Goroutines started inside the bubble become part of it.
The bubble can only manage durable blocks. Other types of blocks are invisible to it. If all goroutines in the bubble are durably blocked with no way to unblock them (such as by advancing the clock or returning from a synctest.Wait call), synctest.Test panics. When synctest.Test finishes, it tries to wait for all child goroutines to complete. However, if even a single goroutine is durably blocked, it panics. Calling t.Context() returns a context whose Done channel is associated with the bubble. Functions registered with t.Cleanup run inside the bubble, immediately before synctest.Test returns. Calling synctest.Wait in a bubble blocks the goroutine that called it. synctest.Wait returns when all other goroutines in the bubble are durably blocked. synctest.Test returns when all other goroutines in the bubble have finished. The bubble uses a fake clock (starting at 2000-01-01 00:00:00 UTC). Time in the bubble only moves forward if all goroutines are durably blocked. Time advances by the smallest amount needed to unblock at least one goroutine. If the bubble has to choose between moving time forward or returning from a running synctest.Wait, it returns from Wait. A blocking send or receive on a channel created within the bubble. A blocking select statement where every case is a channel created within the bubble. Calling WaitGroup.Wait if all Add calls were made inside the bubble.

Anton Zhiyanov 2 months ago

Go proposal: Context-aware Dialer methods

Part of the Accepted! series, explaining the upcoming Go changes in simple terms. Add context-aware, network-specific methods to the net.Dialer type. Ver. 1.26 • Stdlib • Low impact The net.Dialer type connects to an address using a given network (protocol) — TCP, UDP, IP, or Unix sockets. The new context-aware methods (DialTCP, DialUDP, DialIP, and DialUnix) combine the efficiency of the existing network-specific functions (which skip address resolution and dispatch) with the cancellation capabilities of context.Context. The net package already has top-level functions for different networks (net.DialTCP, net.DialUDP, net.DialIP, and net.DialUnix), but these were made before context.Context was introduced, so they don't support cancellation. On the other hand, the net.Dialer type has a general-purpose DialContext method. It supports cancellation and can be used to connect to any of the known networks. However, if you already know the network type and address, using DialContext is a bit less efficient than network-specific functions like net.DialTCP due to: Address resolution overhead: DialContext handles address resolution internally (like DNS lookups) using the network and address strings you provide. Network-specific functions accept a pre-resolved address object, so they skip this step. Network type dispatch: DialContext must route the call to the protocol-specific dialer. Network-specific functions already know which protocol to use, so they skip this step. So, network-specific functions in the net package are more efficient, but they don't support cancellation. The Dialer type supports cancellation, but it's less efficient. This proposal aims to solve the mismatch by adding context-aware, network-specific methods to the Dialer type. Also, adding new methods to the Dialer lets you use the newer address types from the netip package, which are preferred in modern Go code. Add four new methods to the Dialer: the method signatures are similar to the existing top-level functions, but they also accept a context and use the newer address types from the netip package.
Use the DialTCP method to connect to a TCP server, or the DialUnix method to connect to a Unix socket. In both cases, the dialing fails because I didn't bother to start the server in the playground :) 𝗣 49097 • 𝗖𝗟 657296

Anton Zhiyanov 2 months ago

Go proposal: Compare IP subnets

Part of the Accepted! series, explaining the upcoming Go changes in simple terms. Compare IP address prefixes the same way IANA does. Ver. 1.26 • Stdlib • Low impact An IP address prefix represents an IP subnet. These prefixes are usually written in CIDR notation. In Go, an IP prefix is represented by the netip.Prefix type. The new Compare method lets you compare two IP prefixes, making it easy to sort them without having to write your own comparison code. The imposed order matches both Python's implementation and the assumed order from IANA. When the Go team initially designed the IP subnet type (netip.Prefix), they chose not to add a Compare method because there wasn't a widely accepted way to order these values. Because of this, if a developer needs to sort IP subnets — for example, to organize routing tables or run tests — they have to write their own comparison logic. This results in repetitive and error-prone code. The proposal aims to provide a standard way to compare IP prefixes. This should reduce boilerplate code and help programs sort IP subnets consistently. Add the Compare method to the netip.Prefix type. Compare orders two prefixes as follows: first by validity (invalid before valid), then by address family (IPv4 before IPv6), then by masked IP address (network IP), then by prefix length, then by unmasked address (original IP). This follows the same order as Python's implementation and the standard IANA convention . Example: sort a list of IP prefixes. 𝗣 61642 • 𝗖𝗟 700355

Anton Zhiyanov 3 months ago

High-precision date/time in C

I've created a C library called vaqt that offers data types and functions for handling time and duration, with nanosecond precision. It works with C99 (C11 is recommended on Windows for higher precision). vaqt is a partial port of Go's time package. It works with two types of values: time and duration. A time value is a pair (seconds, nanoseconds), where seconds is the 64-bit number of seconds since zero time (0001-01-01 00:00:00 UTC) and nanoseconds is the number of nanoseconds within the current second (0-999999999). Time can represent dates billions of years in the past or future with nanosecond precision. Time is always operated in UTC, but you can convert it from/to a specific timezone. A duration is a 64-bit number of nanoseconds. It can represent values up to about 290 years. The library provides functions for common date and time operations: creating time values; extracting time fields; calendar time; time comparison; time arithmetic; formatting; marshaling. Check the API reference for more details. Here's a basic example of how to use vaqt to work with time: if you work with date and time in C, you might find it useful. See the nalgeon/vaqt repo for all the details.

Anton Zhiyanov 3 months ago

Gist of Go: Atomics

This is a chapter from my book on Go concurrency , which teaches the topic from the ground up through interactive examples. Some concurrent operations don't require explicit synchronization. We can use these to create lock-free types and functions that are safe to use from multiple goroutines. Let's dive into the topic! Non-atomic increment • Atomic operations • Composition • Atomic vs. mutex • Keep it up Suppose multiple goroutines increment a shared counter. There are 5 goroutines, and each one increments the counter 10,000 times, so the final result should be 50,000. But it's usually less. Let's run the code a few more times: the race detector is reporting a problem. This might seem strange — shouldn't the increment operation be atomic? Actually, it's not. It involves three steps (read-modify-write): read the current value, increment it, and write it back. If two goroutines both read the same value, then each increments it and writes it back, one of the increments is lost. As a result, some increments to the counter will be lost, and the final value will be less than 50,000. As we talked about in the Race conditions chapter, you can make an operation atomic by using mutexes or other synchronization tools. But for this chapter, let's agree not to use them. Here, when I say "atomic operation", I mean an operation that doesn't require the caller to use explicit locks, but is still safe to use in a concurrent environment. An operation without synchronization can only be truly atomic if it translates to a single processor instruction. Such operations don't need locks and won't cause issues when called concurrently (even the write operations). In a perfect world, every operation would be atomic, and we wouldn't have to deal with mutexes. But in reality, there are only a few atomics, and they're all found in the sync/atomic package.
This package provides a set of atomic types: Bool (a boolean value), Int32 and Int64 (4- and 8-byte integers), Uint32 and Uint64 (4- and 8-byte unsigned integers), Value (a value of any type), and the generic Pointer[T] (a pointer to a value of type T). Each atomic type provides the following methods: Load reads the value of a variable; Store sets a new value; Swap sets a new value (like Store) and returns the old one; CompareAndSwap sets a new value only if the current value is still what you expect it to be. Numeric types also provide an Add method that increments the value by the specified amount, and the And/Or methods for bitwise operations (Go 1.23+). All methods are translated to a single CPU instruction, so they are safe for concurrent calls. Strictly speaking, this isn't always true: not all processors support the full set of concurrent operations, so sometimes more than one instruction is needed. But we don't have to worry about that — Go guarantees the atomicity of operations for the caller. It uses low-level mechanisms specific to each processor architecture to do this. Like other synchronization primitives, each atomic variable has its own internal state. So you should only pass it as a pointer, not by value, to avoid accidentally copying the state. When using Value, all loads and stores should use the same concrete type; mixing types (say, storing an int and then a string) causes a panic. Now, let's go back to the counter program and rewrite it to use an atomic counter. Much better! ✎ Exercise: Atomic counter +1 more Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it. If you are okay with just theory for now, let's continue. An atomic operation in a concurrent program is a great thing. Such an operation usually translates into a single processor instruction, and it does not require locks. You can safely call it from different goroutines and receive a predictable result. But what happens if you combine atomic operations? Let's find out. Let's look at a function that increments a counter: As you already know, it isn't safe to call from multiple goroutines, because the unsynchronized increment causes a data race.
Now I will try to fix the problem and propose several options. In each case, answer the question: if you call the function from 100 goroutines, is the final value of the counter guaranteed? First option: it is guaranteed. Second option: it's not guaranteed. Third option: it's not guaranteed. People sometimes think that a composition of atomic operations also magically becomes an atomic operation. But it doesn't. Take the second of the above examples and call it 100 times from different goroutines. Run the program with the -race flag — there are no data races. But can we be sure what the final value of the counter will be? Nope. The load and store calls from different goroutines are interleaved. This causes a race condition (not to be confused with a data race) and leads to an unpredictable value. Check yourself by answering the question: in which example is the increment an atomic operation? In none of them. In all examples, the increment is not an atomic operation. The composition of atomics is always non-atomic. The first example, however, guarantees the final value of the counter in a concurrent environment: if we run 100 goroutines, the counter will ultimately equal 200. The reason is that Add is a sequence-independent operation. The runtime can perform such operations in any order, and the result will not change. The second and third examples use sequence-dependent operations. When we run 100 goroutines, the order of operations is different each time, so the result is also different. A bulletproof way to make a composite operation atomic and prevent race conditions is to use a mutex. But sometimes an atomic variable with CompareAndSwap is all you need. Let's look at an example. ✎ Exercise: Concurrent-safe stack Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it. If you are okay with just theory for now, let's continue.
Let's say we have a gate that needs to be closed: In a concurrent environment, there are data races on the boolean field. We can fix this with a mutex. Alternatively, we can use CompareAndSwap on an atomic boolean instead of a mutex. The type is now more compact and simpler. This isn't a very common use case — we usually want a goroutine to wait on a locked mutex and continue once it's unlocked. But for "early exit" situations, it's perfect. Atomics are a specialized but useful tool. You can use them for simple counters and flags, but be very careful when using them for more complex operations. You can also use them instead of mutexes to exit early. In the next chapter, we'll talk about testing concurrent code (coming soon). Pre-order for $10   or read online

Anton Zhiyanov 3 months ago

Go proposal: Hashers

Part of the Accepted! series, explaining the upcoming Go changes in simple terms. Provide a consistent approach to hashing and equality checks in custom data structures. Ver. 1.26 • Stdlib • Medium impact The new hasher interface is the standard way to hash and compare elements in custom collections, and a default hasher implementation covers comparable types, like numbers, strings, and structs with comparable fields. The maphash package offers hash functions for byte slices and strings, but it doesn't provide any guidance on how to create custom hash-based data structures. The proposal aims to improve this by introducing hasher — a standardized interface for hashing and comparing the members of a collection, along with a default implementation for comparable types. As an example, consider a case-insensitive string hasher and a generic set that takes a pluggable hasher for custom equality and hashing. A helper method uses the hasher to compute the hash of a value; this hash acts as a key in the bucket map to find the right bucket. The lookup method checks if the value exists in the corresponding bucket, and the add method appends a value to its bucket. With these pieces, we can create a case-insensitive string set, or a regular string set using the default hasher. 𝗣 70471 • 𝗖𝗟 657296 (in progress)

Anton Zhiyanov 3 months ago

Write the damn code

Here's some popular programming advice these days: learn to decompose problems into smaller chunks, be specific about what you want, pick the right AI model for the task, and iterate on your prompts. Don't do this. I mean, "learn to decompose the problem" — sure. "Iterate on your prompts" — not so much. Write the actual code instead: You probably see the pattern now. Get involved with the code; don't leave it all to AI. If, given the prompt, AI does the job perfectly on the first or second iteration — fine. Otherwise, stop refining the prompt. Go write some code, then get back to the AI. You'll get much better results. Don't get me wrong: this is not anti-AI advice. Use it, by all means. Use it a lot if you want to. But don't fall into the trap of endless back-and-forth prompt refinement, trying to get the perfect result from AI by "programming in English". It's an imprecise, slow, and terribly painful way to get things done. Get your hands dirty. Write the code. It's what you are good at. You are a software engineer. Don't become a prompt refiner.

Anton Zhiyanov 3 months ago

Go is #2 among newer languages

I checked out several programming language rankings. If you only include newer languages (version 1.0 released after 2010), the top 6 are: ➀ TypeScript, ➁ Go, ➂ Rust, ➃ Kotlin, ➄ Dart, and ➅ Swift. Sources: IEEE , Stack Overflow , Languish . I'm not using TIOBE because their methodology has major flaws. TypeScript's position is very strong, of course (I guess no one likes JavaScript these days). And it's great to see that more and more developers are choosing Go for the backend. Also, Rust scores very close to Go in all rankings except IEEE, so we'll see what happens in the coming years.

Anton Zhiyanov 3 months ago

Go proposal: new(expr)

Part of the Accepted! series, explaining the upcoming Go changes in simple terms. Allow the built-in new to be called on expressions. Ver. 1.26 • Language • High impact Previously, you could only use the built-in new with types, like new(int). Now you can also use it with expressions, like new(42). If the argument is an expression of type T, then new allocates a variable of type T, initializes it to the value of the expression, and returns its address, a value of type *T. There has always been an easy way to create a pointer to a composite literal (you can take its address with &), but no easy way to create a pointer to a value of a simple type. The proposal aims to fix this. Update the Allocation section of the language specification as follows: The built-in function new creates a new, initialized variable and returns a pointer to it. It accepts a single argument, which may be either an expression or a type. ➀ If the argument is an expression of type T, or an untyped constant expression whose default type is T, then new allocates a variable of type T, initializes it to the value of the expression, and returns its address, a value of type *T. ➁ If the argument is a type T, then new allocates a variable initialized to the zero value of type T. For example, new(123) and new(int) each return a pointer to a new variable of type int. The value of the first variable is 123, and the value of the second is 0. ➀ is the new part; ➁ already worked as described. The feature covers a pointer to a simple type, a pointer to a composite value, and a pointer to the result of a function call. Passing nil is still not allowed. 𝗣 45624 • 𝗖𝗟 704935 , 704737 , 704955 , 705157
