Emulating old junk from yesteryear – or my obsession with making native resolution PS2 emulation look good
Lately I've been on a kick to tackle the part of PS2 graphics emulation that never seems to come up: analog TV emulation. 99% of the PS2 emulation space is all about upscaling to the max, cranking out 4K with polygons so sharp they can cleave mountains in half, but that's not particularly to my taste, which is why paraLLEl-GS focuses more on the super-sampling aspects rather than raw upscaling. There's an earlier blog post on paraLLEl-GS here if you have no idea what I'm referring to.

For example, here's a raw native render with progressive scan (640×448) with the paraLLEl-GS backend:

With the Vulkan HW renderer in PCSX2, a basic 2x upscale would look like:

That is obviously higher resolution, but it does not attempt to resolve aliasing either, and as upscaling factors go up, the mismatch between texture resolution, polygon counts and output resolution creates a jarring effect for me. The approach I tend to favor is super-sampling while keeping the resolution native, even if it means a more blurred resolve. Here's how it'd look with 16x SSAA. This is quite overkill for most cases, but why not.

There is a special mode in paraLLEl-GS that can scan out the 16x super-samples at double the resolution, which is effectively 4x SSAA at double the resolution. It can look great when it works well, but the key word is when. This approach is not playing to the strengths of paraLLEl-GS at all, but it's a thing. The main problem is that 2D images like HUD elements remain at native resolution with integer scale, which can create a jarring look, and post-processing passes can mess things up in some cases. My main focus is on the native resolution, super-sampled output, since it has very few gotchas compared to other upscaling styles.

For a while, I've relied on FSR1 to do the upscaling, which is just a temporary hack. While properly anti-aliased 3D can look surprisingly good with FSR1 when blown up to large resolutions (what it is designed for), 2D game elements still create questionable artifacts. Text upscaling starts looking like those old HQ2x, SuperEagle and xBR filters that used to be popular with SNES emulators. That's likely because FSR1 is "just" edge-aware Lanczos filtering at its core. Still, at SD resolutions there's only so much you can do to blow it up to a modern display. The result here is of course quite blurry, but looking at it from a reasonable distance, it can look sort-of okay on a good day.

The only proper way to display SD content is, in my opinion, to use a CRT, or in lieu of getting a nerd tan, CRT shaders. Simulating CRTs has been done to death to the point Hel is blushing, so I feel a bit uncomfortable even trying to write about it, but this post wouldn't be complete without it. RetroArch has like 23428323 CRT shader presets already, and there's nothing novel about any of this. However, there are some considerations for PS2 that most CRT shaders don't target: a lot of CRT shaders assume a high quality VGA-style monitor. Nothing wrong with that of course, but I find most of them a bit "too" good for PS2. Where we're going, we'll need that analog fuzz too.

I've been deep down the rabbit hole looking up old specifications to get a deeper appreciation for the brilliant engineering involved in making color TV work all those years ago, and the PS2 era was the last hurrah for SD CRTs, 480i warts and all. I've sort of gone through all of this stuff before back in my early days of programming, but I'd like to think I'm a little smarter than I was back then, and I learned far more details than I knew before.
From a graphics programming PoV this is hardly "difficult" stuff like debugging random GPU hangs at 10pm on a Friday coming from the latest and greatest AAA game, but hey, sometimes you just gotta relax with some good old DSP coding to stay sane. The signal processing for NTSC and PAL is fairly straightforward, and it's actually a good entry point into signal processing for graphics programmers since it combines very visual things with actual real world problems.

Component is the highest quality analog cable (actually, 3 cables!) available to consumers. It supports progressive scan, which very few PS2 games natively supported. Simulating this cable is quite trivial.

Composite is the yellow cable "everyone" used. This one is the hardest, since separating luma from chroma in the same single-channel signal is actually kinda complicated to do well. Most of my time was spent trying various strategies for this problem, trying to come up with a look that seems authentic and "tastefully shitty".

S-Video is very similar to composite, except that luma and chroma are separate signals. Chroma is packed into one cable, and chroma decoding is basically the same as for composite signals, where phase and amplitude dictate the hue and saturation. The main difference from component is that the chroma bandwidth should be quite low, so more color smearing should be present.

Us Euro-bros technically had RGB SCART too, but I've never seen those cables myself for a Sony console. I believe our old N64 and GameCube had those, but I'd have to double check next time I visit the family …

While it is very easy to find a ton of content online about old TV standards, your random YouTube video is not going to have the minute detail needed to implement much. I found "Video Demystified – A Handbook for the Digital Engineer (2005)" by Keith Jack, which has a ton of great detail that is often left out, to make sure my implementation stays grounded.

BT.601 defined how to take NTSC and PAL signals and turn them into the digital domain. It is the foundation for all digital video today. The primary interest for us is that the standard defines a 13.5 MHz sampling rate. This was supposedly chosen since it was convenient for both NTSC and PAL, and for filtering requirements. This scheme works out to 720 horizontal pixels for both NTSC and PAL when you account for 525 lines at 29.97 Hz and 625 lines at 25 Hz. The H-sync part of the analog signal pads the total out to a bit over 800 pixels per line, but that's irrelevant here.

I also learned just recently why it's called YUV 4:4:4, 4:2:2, 4:2:0, etc.: the "4:2:2" notation originally applied to NTSC and PAL video systems, implying that Y, U and V were sampled at 4×, 2× and 2× the color subcarrier frequency, respectively. The notation was then adapted to BT.601 digital component video, implying that the sampling frequencies of Y, Cb and Cr were 4×, 2× and 2× 3.375 MHz, respectively. Now you know!

The Nyquist frequency of SD video is thus 6.75 MHz. Talking about video like this is a little weird, but it will make more sense later, since analog signals like NTSC and PAL have a certain bandwidth, which limits how much horizontal resolution we can cram into them. The number of video lines is hardcoded. The video signal is really just a 1D signal if you look at it on an oscilloscope (haven't seen one of those since my Bachelor's …).
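As a quick sanity check of why 13.5 MHz was convenient, it works out to a whole number of samples per total scanline for both systems. The numbers are standard; the snippet is just the arithmetic:

```python
fs = 13.5e6                         # BT.601 sampling rate
ntsc = fs / (525 * 30000 / 1001)    # = 858.0 samples per total line (NTSC)
pal = fs / (625 * 25)               # = 864.0 samples per total line (PAL)
# Both are integers, both share the same 720-sample active picture region,
# and the remainder is blanking / H-sync (the "bit over 800 pixels").
```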
The next question was to determine the sampling rate of the PS2 CRTC. Given that it generates analog signals, it's not obvious that the PS2 would use a standard frequency here. However, several old forum threads I found did indeed suggest that the PS2 is built around the 13.5 MHz rate. This is likely because it could use off-the-shelf video DACs at the time. The nominal maximum horizontal resolution of the PS2 is 640 pixels, but it seems like overscan is supposed to stretch the 640 pixels out to fill the screen anyway. Even if 480 lines are "visible" in NTSC, games typically just render 448 lines, since the top and bottom are eaten by overscan on most TVs.

The actual CRTC clock seems to be 54 MHz, because when programming the CRTC, you're supposed to set some dividers which end up determining the resolution. E.g. a 640 pixel width is a divider of 4, 512 pixels a divider of 5, and so on. It conveniently supports most relevant horizontal resolutions like this. My armchair theory for how this works is that the CRTC runs at 54 MHz and uses the divider to do a zero-order hold (aka nearest filtering) which is then fed into the DAC. Either that, or the console is really doing horizontal linear filtering up to 640 pixels and does composite video generation at 13.5 MHz. I'm not sure what really happens, so I just have to guess. I'm not quite obsessed enough to have an oscilloscope to suss out micro-details like these from a real PS2. The PS2 is technically able to emit 1080i and 720p signals, so there's no reason why it couldn't do analog video processing at the full 54 MHz.

To debug any of this we need test images. I don't exactly have the test signal generators that TV engineers had back in the day, so I had to synthesize something. The use for these is to validate various edge cases in the NTSC and PAL decoding process. The basic idea is to have some 75% color bars at the top to validate the color pipeline. The middle section is a sweep of increasing horizontal frequency up to the Nyquist limit of 6.75 MHz, where each 1 MHz section is delimited by a single-line color bar. The lower portion is a similar sweep with increasing frequencies in the vertical direction; this is used to test the comb filter.

Doing FIR filter design by hand is uh … not something I have time for. Anything beyond the basic windowed sinc design process gets annoying very quickly. GNU Octave is a free alternative to Matlab that does what we need for these tasks.

While we can do composite video generation at the 13.5 MHz rate, it is not easy to avoid aliasing, especially since we will be modulating with a chroma subcarrier that shifts the spectrum all over the place, potentially creating aliases out of thin air. The handbook calls for 2x oversampling to avoid this. First, the image is padded out to fill 720 horizontal pixels for BT.601 reasons. Then it is integer-scaled up to 2880×N (to match how I understand the CRTC to work), and that is downsampled with a low-pass to 1440×N, which completes a clean 2x oversampling. A polyphase upsampling filter is of course possible too, to avoid the intermediate upscale, but that's needless complexity for something that literally takes 10 microseconds on the GPU. The intermediate upscale is not written to memory anyway; the filter just assumes a 2880-pixel input and samples the input texture redundantly instead.

Since these are fundamentally analog signals, we only filter horizontally. The natural image type for these is a 1D array, and all the processing shaders are 1D as well. (The 1504 width is just extra padding for convolutions.)
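A sketch of that oversampling chain on one scanline, assuming the zero-order-hold interpretation of the CRTC above. The names and the use of SciPy here are mine; the real thing is a compute shader:

```python
import numpy as np
from scipy.signal import resample_poly

line = np.zeros(720)                # one scanline, padded to BT.601's 720 samples
line_54mhz = np.repeat(line, 4)     # integer scale to 2880: zero-order hold at 54 MHz
line_27mhz = resample_poly(line_54mhz, up=1, down=2)  # low-pass + decimate to 1440
# 1440 samples = 27 MHz, i.e. a clean 2x oversampling of the 13.5 MHz signal.
```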
It's all the same thing really. NTSC is the oddball one with YIQ, but in practice this difference is completely irrelevant. The original idea of IQ was to take the blue and red difference signals and rotate them by 33 degrees so that the I component would align better with skin tones, and then give I more bandwidth than Q. However, the handbook doesn't seem to give it much consideration, especially not for composite signals, which are not meant to be sent over the air. The implementation of this is really just a 3×3 matrix multiply with RGB, nothing special.

NTSC and PAL have fairly narrow bandwidths defined for their broadcast signals, but composite signals don't really have those strict limits. However, since the digital input has its Nyquist at 6.75 MHz, the handbook calls for bandlimiting the signal to about 6 MHz anyway. Super sharp falloff filters just lead to ringing. Component video is specified by BT.1358 and doubles the sampling rate to 27 MHz. Y should fall off at about 11 MHz and chroma at half that. Interpreting that in interlaced terms, the falloff should start at 5.5 MHz, getting close-ish to the Nyquist limit for SD video. After filtering luma and chroma, just decode back to RGB and we have a nice signal, done.

Chroma is nuked down to ~1.3 MHz or so for composite / S-Video. Sometimes 0.6 MHz is called for, it seems, but it's quite unclear … S-Video could ignore the modulation + demodulation part if we just want to simulate the smear, but it was easier to let it go through the full chain for completeness. Despite the chroma bandwidth being so horrible, it sort of looks good. Amazing how terrible our eyes are at seeing color detail. I wonder if this is where the esoteric 4:1:1 YCbCr subsampling mode comes from, now that I think about it …

E.g. here's the NTSC luma filter, which starts falling off at about 4.2 MHz and stops at 6.75 MHz-ish. Broadcast NTSC is capped well below this bandwidth. And the chroma encode, with its ~1.3 MHz passband:

The main difference for PAL is that the luma passband is a bit higher. PAL-B (which Norway used) seems to specify 5 MHz rather than 4.2 MHz. Generating these filters can be done easily with firls and friends in Octave. The handbook calls for a gaussian kernel for chroma to avoid any ringing, but I missed that memo. Either way, these are implemented with trivial convolutions, which GPUs eat up like butter.

This is the first point where NTSC and PAL differ significantly. While NTSC and PAL have different color subcarrier frequencies, PAL is also a bit more sophisticated. The "sign" of V is where Phase Alternating Line comes in: it flips every scanline. In broadcasting, this was meant to fix bad colors being introduced when broadcasting through complicated terrain (and boy do we have that over here in Norway). Phase shifts introduced into an NTSC signal manifest as hue shifts. The basic idea behind PAL is that if phase shifts are introduced by signal reflections, the sign flipping of V every scanline ensures that the decoded hue error is complementary from scanline to scanline. This manifests as the Hanover bar artifact. By averaging chroma across scanlines, the errors cancel out and we recover the correct color with slightly less saturation.

The cost is of course reduced vertical chroma precision, but given how comically smeared chroma is horizontally, I'm not sure this matters, and digital video uses 4:2:0 subsampling anyway. Now, broadcasting considerations are kind of irrelevant for something like composite video (I would think), but I'm not sure if PAL TVs skip the filter for anything not coming from an antenna. I kept the vertical chroma filter in my implementation because it's neat to have.
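Two quick asides before moving on to modulation. First, the 3×3 mentioned earlier is just the standard BT.601 matrix; for reference (the coefficients are the textbook values, and YIQ would just rotate the U/V rows by 33 degrees):

```python
import numpy as np
# BT.601 RGB -> YUV, the "nothing special" matrix multiply.
rgb_to_yuv = np.array([
    [ 0.299,  0.587,  0.114],   # Y
    [-0.147, -0.289,  0.436],   # U = 0.492 * (B - Y)
    [ 0.615, -0.515, -0.100],   # V = 0.877 * (R - Y)
])
yuv = rgb_to_yuv @ np.array([1.0, 1.0, 0.0])   # e.g. pure yellow
```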
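Second, the filter-design step, shown here with SciPy's firls as a stand-in for the Octave workflow. The tap count and band edges are illustrative, loosely based on the NTSC luma numbers above, not the shader's actual filters:

```python
from scipy.signal import firls
# Least-squares FIR design: flat to ~4.2 MHz, fully stopped by Nyquist,
# designed at the 27 MHz oversampled processing rate.
fs = 27e6
luma_lp = firls(31, bands=[0, 4.2e6, 6.75e6, fs / 2],
                desired=[1, 1, 0, 0], fs=fs)
```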
The NTSC chroma subcarrier is constructed such that every scanline completes 227.5 cycles. Every line thus flips chroma phase, which is very convenient and makes luma and chroma separation less complicated. The NTSC chroma pattern is a checkerboard as a result. PAL is more annoying here. A half-cycle method would not work since V is already flipping every line, so PAL chose 283.75 cycles. On top of that, a tiny 1 / 625 cycle offset per line is added for … reasons. The V flipping leads to a chroma pattern where U takes a diagonal pattern while V has a pattern along the other diagonal.

Frame progression is also a concern. As fields are scanned, each time the same field is drawn, the chroma subcarrier should have opposite phases. This follows naturally from how fields are drawn. The period of NTSC is 4 fields:

Field 0: 262 lines, chroma carrier starts in + phase, ends in + phase due to even number of lines
Field 1: 263 lines, chroma carrier starts in + phase, ends in – phase due to odd number of lines
Field 2: 262 lines, chroma carrier starts in – phase, …

As we can see, things repeat after 4 fields, and this point is easy to miss. The artifacts introduced by composite video should ideally cancel themselves out over time, which manifests as flickery noise rather than a horribly glitched image. PAL is annoying and has a longer cycle of 8 fields due to the three-quarters of a cycle setup. After completing 2500 lines (625 * 4), the chroma subcarrier has completed an integer number of cycles, and the sign of V is back to where it started.

After modulation, the NTSC color bar signal looks more like this, and the next line flips phase as expected.

Some old Nintendo consoles (and likely others) emit NTSC and PAL in non-standard ways. E.g. the NES is infamous for shifting the chroma carrier by 120 degrees instead of 180, which leads to very particular artifacts. See the NESdev Wiki for more detail.

It's easy to mess up RGB to YUV conversions and the compositing process. The handbook had reference outputs for RGB inputs in NTSC and PAL where I could confirm that the math was indeed correct. The things to check are that the minimum and maximum of the signal are what they should be. NTSC, at least the US variant of it, maps black to 7.5 IRE (just think of it as some abstract voltage) and white to 100.0 IRE, and it tripped me up a bit at first, since the NTSC color bars were defined in terms of this shifted and scaled IRE value. Looking at the peaks and valleys of the generated composite signal in RenderDoc is enough, since we just need to eyeball it. Close enough is good enough.

Then there's the actual luma/chroma separation, which is non-trivial and a source of endless head scratching. Chroma information lives in the frequency spectrum around the carrier, but so does higher frequency luma detail. The basic theory for comb filters is to take advantage of the opposing phase of the chroma carrier. For NTSC, I used the basic 3-line structure from the handbook, which in code translates to a couple of adds and subtracts (see the sketch below). Ideally, by adding two neighbor lines, chroma should cancel out and only luma remains. Subtracting, we cancel luma and only chroma remains. We know that chroma won't (or at least shouldn't) exist outside its bandwidth, so the result is run through a bandpass filter centered around the carrier frequency, and we have an estimate for the modulated chroma signal. Since the composite signal is Y + C, we subtract the chroma estimate from the composite signal to form a Y estimate. Chroma can now be demodulated and low-passed to remove the harmonics introduced by demodulation.

This filter works "perfectly" for regions where the chroma is constant, but not so much where there are discontinuities. This results in "chroma dots", where the color subcarrier bleeds into decoded luminance. Notice the dot pattern on the bottom of the image.
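The sketch promised above: the 3-line comb boils down to a couple of adds and subtracts per sample. This is a plain NumPy stand-in for the shader, where each line is an array of composite samples:

```python
import numpy as np
# NTSC flips chroma phase every line, so summing neighbors cancels chroma
# and differencing cancels luma. This is the raw comb estimate, before the
# bandpass around the subcarrier cleans up the chroma side.
def three_line_comb(prev_line, cur_line, next_line):
    luma_est = 0.25 * prev_line + 0.5 * cur_line + 0.25 * next_line
    chroma_est = cur_line - luma_est   # = 0.5*cur - 0.25*prev - 0.25*next
    return luma_est, chroma_est
```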
As the handbook puts it: "Thus, different colors modulate the luminance intensity differently, creating a "dot" pattern on the scan line between two colors. To eliminate these "hanging dots," a chroma trap filter is sometimes used after the comb filter." In the real world of analog circuitry, having a perfectly locked signal like this is probably also not too realistic to assume.

The literature also calls out notch filtering as an approach. I attempted combining a comb filter with a notch filter on top to reduce the artifacts, but it is quite tricky to create a notch filter that works well. A simple FIR notch filter with zeroes at the carrier is easy enough to make, and that filter is then convolved with a simple low-pass to complete the luma decoding filter. This approach just leads to severe blurring for NTSC, and a band-stop filter approach just leads to less severe blurring and ringing instead, so I'm not sure what should be done. Unfortunately, the handbook isn't clear on what kind of filtering is called for here. IIR notch filter designs can be super sharp to surgically carve out the carrier, but IIR filtering is also a massive pain in the ass on GPUs. It's also likely to ring heavily, which I found rather annoying in my testing.

E.g. here's a 3-line comb for NTSC without the notch (integer nearest upscale from 640×240), and with the notch: Yikes. There's no way this notch approach is correct. It's like we're getting double vision here. It does clean up the chroma dots though, so … yay?

Going beyond these base techniques, there's adaptive filtering, where the filtering strategy changes based on which kind of case we're dealing with. Even more sophisticated is taking advantage of temporal information (look ma', TAA has been a thing since forever :D), since N fields in the past we have complementary chroma phase perfectly aligned to our pixel grid. Very cool stuff, but I doubt consumer TVs at the time would have those. The added latency for doing this kind of analysis doesn't sound like something you'd want for games at least … Either way, I'm not designing high-end TV circuitry from the late 90s/early 2000s here. We can just flip on S-Video to simulate the perfect Y/C separator, so at some point I have to decide that I've done enough filter masturbation and move on.

PAL was way harder than expected, and I had to bang my head against the wall for a while to come up with a good solution. The ~90 degree shift every scanline means the basic comb filter for NTSC won't work at all. The handbook has two main strategies here: either a delay line which is slightly longer than a scanline to align the phases, or a highly magical "PAL modifier". The function of this modifier is esoteric as all hell, but I think its purpose is to phase shift the signal by 90 degrees to "realign" the carrier somehow (it will still be off by 0.6 degrees). This filter path with its two bandpass filters got so messy, and I couldn't figure out how to debug it effectively (were the inevitable visual artifacts my bugs, or just the filter being bad?), so I eventually gave up and designed my own filter. That's more fun anyway.

I started from first principles and designed a 3×3 kernel that should be able to perfectly pass a chrominance signal and 100% reject any luminance signal that is DC in either the horizontal or vertical direction. To keep things simple, I started with the assumption of 4 samples per subcarrier cycle to make the examples easier. Given a constant value of 1.0 for U, the signal is U = sin(wt), with N + 0.75 cycles per line here.
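To make that concrete before getting to the kernel, here's how the sampled U pattern walks from line to line under the 4-samples-per-cycle assumption. Pure illustration, not shader code:

```python
import numpy as np
# Each line adds N + 0.75 cycles, so at 4 samples per cycle the sampled
# pattern shifts by 3 samples (270 degrees) from one line to the next.
t = np.arange(8)                    # sample index within a line
w = 2 * np.pi / 4                   # 4 samples per subcarrier cycle
lines = np.array([np.sin(w * (t + 3 * n)) for n in range(3)])
print(np.round(lines).astype(int))
# [[ 0  1  0 -1  0  1  0 -1]
#  [-1  0  1  0 -1  0  1  0]
#  [ 0 -1  0  1  0 -1  0  1]]
```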
A kernel that satisfies the criteria has every row and column summing to 0, meaning that if the signal is DC either horizontally or vertically, the result is completely filtered out. The filter also rejects V signals: V = ±cos(wt), which is just the same signal flipped horizontally. Then a combined filter is made that accepts U and V signals together. U and V can be perfectly split later during demodulation, so that is okay. This just boils down to a simple diagonal edge detection filter (high-pass both vertically and horizontally), but it actually works quite well.

To deal with the actual 2x oversampled rate of 27 MHz and the PAL subcarrier at ~4.433 MHz, the ~90 degree shift per line is about 1.51 samples, so to make this sort of work, I stretched the horizontal kernel out to a 5-tap filter. The vertical kernel remains the same. Some error is introduced since we're not sampling the signal 100% correctly anymore (theoretically we need a sinc to reconstruct the signal perfectly), but I measured the reconstruction error to be -40 dB, which is good enough I think. The measured errors for U and V were also similar, which indicates no weird artifacts from the V flips. With this 15-tap kernel, we get a pretty good chroma estimate even in PAL. From here the same ideas as NTSC apply: bandpass the estimate and subtract it from the composite signal to get luma.

Notch filtering to clean up the chroma dots worked way, way better for PAL than NTSC, likely because the carrier has a much higher frequency in PAL, so the low-pass behavior of the notch isn't as devastating to image quality as it is in NTSC. In the end I think I prefer 3-line comb + notch for PAL, and just a plain 3-line comb for NTSC. These screenshots are just one still frame (or rather, field). The color fringing will cancel out the next field, and it's hard to show the effect without seeing it at full 60 fields per second.

While the PlayStation 2 didn't support the PAL60 hack mode, the GameCube did back in the day. It's a non-standard video mode that has the same refresh rate and vertical resolution as NTSC, but retains the bandwidth and chroma encoding system of PAL: the best of both worlds! My implementation can support this trivially by just enabling PAL on 60 Hz games. The only thing I'm not quite clear on is how the 1 / 625 subcarrier offset per line is supposed to work, but it's a non-standard mode anyway, so eeeeeh.

With comb filter and notch on NTSC: as expected, luma detail is murdered around the ~3.58 MHz carrier. There's also serious color fringing due to the extremely high frequency diagonal patterns. In the pattern at the bottom, no fringing is observed, since the comb filter did its job as expected. PAL is similar, but the carrier and notch move to ~4.43 MHz instead:

The main feature of PAL is being robust against phase shifts during analog broadcasts. It's a little unclear if composite inputs cared about this case, but for completeness' sake, I implemented it. This path is naturally skipped for S-Video and Component outputs, since I can't imagine a TV caring about that for the more luxurious inputs. It doesn't take many degrees of phase error to get quite different colors in NTSC. For PAL, the phase error manifests as complementary errors every line. However, by averaging out chroma vertically, we can recover the original chroma almost perfectly; at worst, a little less saturation.
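That recovery step is easy to convince yourself of with a toy example: a phase error rotates chroma one way on one line and the opposite way on the next (once the V flip is undone), so the average lands back on the correct hue. A sketch, with theta as a hypothetical transmission phase error:

```python
import numpy as np
theta = np.radians(10.0)                 # hypothetical phase error
rot = lambda a: np.array([[np.cos(a), -np.sin(a)],
                          [np.sin(a),  np.cos(a)]])
uv = np.array([0.3, 0.2])                # true chroma for this region
line_a = rot(+theta) @ uv                # decoded chroma, one line
line_b = rot(-theta) @ uv                # next line: complementary error
recovered = 0.5 * (line_a + line_b)      # = cos(theta) * uv
# Correct hue, saturation reduced by cos(theta) (~1.5% at 10 degrees).
```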
Then there's the CRT itself. This topic is kind of unfortunate in that it's been done to death already, and it's a 100% subjective topic, meaning that everyone has some kind of opinion, none of which agree with each other. Holy wars have been fought over less. Trying to write anything fresh about this topic is futile in 2026 – the heyday of CRT hobbyist shader development was in the early 2010s – but I felt the need to explain what I did at the very least. If anything, it's a useful intro to writing your own shader. https://nyanpasu64.gitlab.io/blog/crt-appearance-tv-monitor/ is a good read too for more background information.

The most obvious part of a CRT filter is scanlines. However, the idea that CRT images should have clearly visible scanlines is actually an artifact of 240p. For PS2 games, we're operating at either 480i or 480p for NTSC. For interlaced video, we expect each individual field to have clearly visible scanlines, but the complete image (fused together by our brains) should not. The beam profile should be tuned as such. What most shaders do is give each scanline a gaussian distribution in the Y direction, sampled across the neighboring lines to cover the useful portion of the kernel. Another common effect is that very bright scanlines are smeared out, supposedly due to the electron guns not being as stable when they're driven at high voltages. This can be simulated by varying the standard deviation. It can be subtle, but it creates a neat effect I think. Exactly how to come up with the beam profile for a given input voltage is purely up to taste, I suppose; I doubt there is a linear relationship between the R'G'B' value and the standard deviation of the beam.

The sampled RGB value is still in gamma space, since the CRT gamma curve is due to the phosphor response, not the CRT itself adjusting the gamma curve. The BT.1886 standard calls for a 2.4 gamma for SD content, which is the default and looks good. I also added options to use 2.2 (NTSC legacy) and 2.8 (?!, PAL legacy) for fun. Most CRT shaders I've seen apply gamma the same way: linearize the sampled value, then weigh it by the gaussian beam profile. I think the right choice would depend on whether the phosphor's response is a function of how many electrons hit it, or whether individual "particles" respond to the energy of the electrons hitting them, where the gaussian beam profile is just a distribution of how many particles light up. In the latter interpretation, the code as-is makes sense, while the first interpretation would call for applying a gamma function to the gaussian profile. The visual output as-is looks good to me at any rate. After this point, all color math happens in linear light, so floating point render targets are a must.

Color CRTs get their colors from colored phosphors that are arranged in some kind of grid. The venerable Trinitrons use vertical stripes of RGB, and I like that look. While CRTs don't really have a horizontal resolution, there is a "dot pitch", which sort of dictates resolution. This part is the key to creating the "texture" of a CRT. From what I read online, a typical dot pitch for consumer TVs was 0.5 – 1.0 mm, and for a typical 20″ CRT, I estimated a reasonable number of RGB triads to be ~640 or so. Close enough to the BT.601 standard horizontal resolution, neat. From what I understand, this value is also referred to as "TVL", and these values seem ballpark reasonable. When looked at from a distance, the dots blend together nicely, as we'd expect. It's basically just LCD subpixels, only larger.
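Going back to the beam profile for a second, the gist of the scanline weighting is something like the sketch below. The sigma mapping is entirely made up for illustration; as said, it's purely to taste:

```python
import numpy as np
# Each output pixel sums gaussian contributions from nearby source scanlines.
# Brighter lines get a wider sigma to imitate the smeared, overdriven beam.
def beam_weight(dist_in_lines, line_brightness):
    sigma = 0.30 + 0.25 * line_brightness   # invented mapping, tweak to taste
    return np.exp(-0.5 * (dist_in_lines / sigma) ** 2)
```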
The dot layout I used was mostly lifted from Lottes' CRT shader, but the approach can probably be found in a million shaders already: just alternating stripes of R, G and B. I suppose a perfect mask of 0.0 should be used here, but it doesn't end up looking as good as I'd like, even after adding glow effects, so I think the intent behind passing through a portion of the other colors is more of a pragmatic decision. At least I cannot think of a physical interpretation of why we'd want to do it.

The signal we are creating has a ton of high frequency information, and we need to be very careful sampling it so that obvious aliasing is avoided. The common mistake is to just render this shader at output resolution and hope for the best. This will almost surely lead to terrible aliasing patterns in the image. Bad aliasing of the scanlines in the Y direction leads to a low frequency pumping pattern in the image which is extremely distracting, and bad aliasing in the X direction leads to a horribly noisy pattern due to uneven sampling of the dot mask. The easy fix is to render the effect at an integer multiple. E.g. if the input image is 240 lines, render to a height that is an integer multiple of that. For the color dot mask, make sure the horizontal resolution is e.g. 3 times the dot resolution (one pixel each for the R, G and B dots). I ended up with something like width = 640 * 3 and height = 240 * 6 (3x sampling for progressive).

A nebulous effect of the CRT is the glow aspect of it. Anyone can tell it's there, but it's not entirely clear to me why it happens. Google searches don't turn up anything useful either. Without knowing the physical reason for it, it's hard to emulate accurately. Could it be scattering effects inside the thick CRT glass, perhaps? Either way, the common way to emulate this effect is to compute a gaussian blur (lots of those around here) and composite it over the original image, very similar to the usual HDR + bloom effect that games in the 2010s loved to overuse. The main effect here is that the phosphor dots end up blending together nicely, yet retain the added "texture" that the aperture grille pattern gives. Humans like to see some high frequency detail, even if that detail is completely bogus. That's the common trick behind video compression after all. Here's the glow component, boosted up a ton to make it very visible:

With the typical HDR effect in HD games, only very bright pixels participate in the effect, effectively spreading excess light energy over a larger area of pixels. It makes little sense to do anything like this for a CRT shader unless there is a physical threshold where phosphors just randomly start to glow more than they should, but all of this is purely up to taste anyway. Here's Soul Calibur II with progressive scan, component cable emulation and 16x SSAA, without any glow added. The look is very harsh to me. (The full-screen image is needed to see it without the extreme aliasing caused by thumbnailing.) Some glow on top and it looks like this: purely up to taste how much to add, of course. I like a decent amount of glow.

Phosphors don't turn off right away when they're lit. It's very quick, but adding a few percent of feedback between frames seems to help a bit with making 480i games look better with less flicker. This is not really how things work in the real world I think, but it's a reasonable approximation.
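The glow pass itself is nothing more exotic than the sketch below, done in linear light. Both the blur radius and the mix amount are my placeholder parameters, not the shader's actual values:

```python
import numpy as np
from scipy.ndimage import gaussian_filter
# Blur the linear-light image and blend a slice of it back on top,
# letting the phosphor dots bleed together while keeping their texture.
def add_glow(linear_rgb, amount=0.15, sigma_px=6.0):
    blurred = gaussian_filter(linear_rgb, sigma=(sigma_px, sigma_px, 0.0))
    return (1.0 - amount) * linear_rgb + amount * blurred
```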
We need a high quality rescaler to get the integer-sampled CRT to the screen without introducing significant aliasing from the aperture grille or scanlines. The way to go here is simply to use a proper windowed sinc or something like that. I don't like it, so I don't implement it. If you do, make sure to consider resampling it properly to not introduce more aliasing.

A point many shaders miss is that the RGB of a modern monitor is not the same as the RGB of an old CRT TV. What we think of as RGB today is usually BT.709 / sRGB, which defines a set of color primaries and a white point. Old SDTV-era video uses BT.601, which is a bit narrower than BT.709. In linear RGB space, this transform is a trivial 3×3 matrix multiply. I actually learned that the very old NTSC 1953 standard defines a set of primaries that were extremely saturated compared to the standards of today. While the primaries of 1953 were aspirational, it was clearly way too early, and SMPTE refined NTSC to use more reasonable primaries as part of BT.601. PAL primaries are almost exactly the same (TVs tended to use the same phosphor formulations across the world, I suppose), but there's a theoretical difference, so I added both for good measure.

Supposedly, Japan kept using the legacy NTSC 1953 primaries, which opens an interesting question: did the same games actually look vastly different in Japan compared to the rest of the world? I've never heard anyone mention this before, so who knows. Either way, I support enabling those primaries for fun. The look of it is quite … something? It would need an HDR monitor with a solid gamut to do it justice. Here's with standard BT.601 primaries (from Legaia 2, which is a purely field-rendered game, hence the scanlines), and with NTSC 1953 primaries: almost like the "Interpret sRGB as Display P3" bullshit that phones do these days to "pop".

Given all the masking we're doing, which lowers average brightness, it's beneficial to support HDR10 rendering. Now that HDR is widely available on Linux too, enjoying some HDR CRT shaders is a good time. I added a few modes where I can target a specified maximum nit level. KDE at least respects MaxCLL HDR metadata and disables tonemapping if MaxCLL falls within the bounds of the display. I also added a no-tonemap option where MaxCLL = 0 (unknown), which makes KDE tonemap however it wants to.

Black Frame Insertion and friends have been a thing in emulation for ages, and it works quite well with CRT simulation, especially to sell the de-interlacing effect properly. In my implementation, I query EXT_present_timing and decide how many frames I should insert in-between, with a gentler falloff between the frames. The overall screen brightness decreases a lot as expected, but with HDR, we can crank the brightness of the proper frame up to compensate. It's still very experimental, and any missed frame leads to horrrrrrrible flicker at the moment (big epilepsy warning), so it's not something I actually recommend, but it's fun to experiment with.
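Circling back to the primaries for a moment: rather than hardcoding the 3×3, it can be derived from the published chromaticities. A sketch using the SMPTE C primaries for BT.601-525 and D65 white for both; the helper is the standard primaries-to-XYZ recipe:

```python
import numpy as np

def primaries_to_xyz(prim, white):
    # Columns are the XYZ of each primary, scaled so RGB=(1,1,1) hits the white point.
    m = np.array([[x / y, 1.0, (1.0 - x - y) / y] for x, y in prim]).T
    wx, wy = white
    w = np.array([wx / wy, 1.0, (1.0 - wx - wy) / wy])
    return m * np.linalg.solve(m, w)

smpte_c = [(0.630, 0.340), (0.310, 0.595), (0.155, 0.070)]  # BT.601-525 R, G, B
bt709   = [(0.640, 0.330), (0.300, 0.600), (0.150, 0.060)]
d65     = (0.3127, 0.3290)
# Linear-light BT.601 RGB -> BT.709 RGB:
m601_to_709 = np.linalg.solve(primaries_to_xyz(bt709, d65),
                              primaries_to_xyz(smpte_c, d65))
```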
While simulating each field independently with scanlines, we get de-interlacing the same way a CRT would, in theory. This is not free from flicker of course, but most games had mitigation strategies for this. The PS2 supported blending the two frames being sent to the video output. Most games render internally at e.g. 448p @ 30 FPS, but since they cannot output that resolution without component cables, the frame is output interlaced over two fields where the CRTC scans every other line rather than every line. That tends to look quite flickery if done as-is, given how aliased PS2 graphics are, so what pretty much every game did was use the two CRTCs to blend the two frames vertically before sending them out, by programming a 1 pixel offset with duplicate inputs. By shifting the offset every field, the 30 FPS progressive image could be scanned out nicely into a 60 Hz interlaced image. This is the "flicker filter" that some games allowed toggling. Here's a RenderDoc capture showing that CRTC 1 and 2 are configured with a one pixel offset in Y: after merging and blending the frames together, a smoothed image is sent.

PCSX2 GSdx and paraLLEl-GS have modes to detect this pattern and just scan out the full 480p of course, without the added blurring. Most games fall into this pattern, which is fortunate if we want to avoid interlacing shenanigans, but not all games are so nice to deal with. The "Anti-Blur" option is designed precisely for disabling this filter. I also added an option to force-disable the automatic progressive scan, mostly to test the video output the games would actually have produced back in the day, which is interlaced video.

Some games decided that they wanted to render at 60 FPS and sacrifice half the vertical resolution to do so, jittering the rendering to stay in sync with the interlaced output. These are painful to deal with to this day, since they absolutely require some kind of de-interlacing solution to look good. I never got satisfactory results with a typical de-interlacer, but the CRT simulation does quite a good job at it, I think. It's not perfect (interlacing wasn't exactly perfect on CRTs either), but it's usable for me to the point that I can now play interlaced games as intended. It's not really possible to demonstrate this with screenshots.

Some games break if you try to promote them to progressive scan, because the games might decide for some stupid reason to use the SCANMSK feature to discard pixels every other line, and rely on the FRAME scanout mode to scan out exactly the pixels that were not masked. King's Field IV is an example of this absolute insanity.

I maintain a patch for PCSX2-git which supports parallel-gs, and now also:

Analog TV input path
480i instead of 240p
High refresh rate simulation

This is super niche stuff that I don't really expect many people to actually use, but it's there for those who are interested. It does what I need at least, and that's what's important to me.

I put together some test video clips in HEVC/PQ/4:4:4/1440p at ridiculously high bitrates. I tried AV1, but my CPU could not decode it in real time, so it is what it is ._. Clip 1 is raw RGB passed into the CRT. It uses the game's native 480p and widescreen support, with 4x SSAA super-sampling. Clip 2 is "PAL60" with 3-line comb + notch, using the more default 4:3 and interlaced video. It has some frame drops which completely botch the interlacing, and even at comically large bitrates the aperture grille effect doesn't translate well, so it is what it is; a rough approximation of what it should look like.

Anyway, I'm happy with the results. Time to actually play something instead of debugging stuff.