Latest Posts (20 found)
Ginger Bill 1 month ago

Package Managers are Evil

n.b. This is a written version of a dialogue from a YouTube video: 2 Language Creators vs 2 Idiots | The Standup.

Package managers (for programming languages) are evil [1]. To start, I need to make a few distinctions between concepts a lot of programmers mix up:

- Packages
- Package repositories
- Build systems
- Package managers

These are all separate and can have no relation to one another. I have no problem with packages; in fact Odin has packages built into the language. I have no problem with repositories either, as that’s how a lot of people discover new packages—a search engine, something I think everyone uses on a daily basis [2]. Build systems are usually language dependent/specific, and for Odin I have tried to minimize the need for a build system entirely (at least as a separate thing): most projects will build with a single compiler invocation, which works because the linking information is defined in the source code itself via Odin’s foreign system.

This leaves package managers; what do they do? A package manager downloads packages from a repository, handles the dependencies and tries to resolve them, and then downloads their dependencies, and their dependencies, and their dependencies… and you can probably see where my criticism is going. This is the automation of dependency hell. The problem is that not everything needs to be automated, especially hell.

Dependency hell is a real thing which anyone who has worked on a large project has experienced. Projects end up with thousands, if not tens of thousands, of dependencies where you don’t know if they work properly, where the bugs are, or how anything is being handled—it’s awful. This is the wrong thing to automate. You can do all of this manually; that doesn’t stop you getting into hell, it just slows you down, as you can still put yourself into hell (in fact everyone puts themselves into hell voluntarily). The point is that it makes you think about how you get there, so if you have to download things manually, you will start thinking “maybe I don’t want this” or “maybe I can do this instead”. And when you need to update packages, being manual forces you to be very careful. That’s my general criticism: the unnecessary automation.

Most package managers also have to define what a package is, because the language itself has no well-defined concept of a package. JavaScript is a great example of this, as there are multiple different package managers for the language (npm being one of the most popular), but because each package manager defines the concept of a package differently, it results in the need for a package manager manager. Yes… this is a real thing. This is why I am saying it is evil, as it will send you to hell quicker.

When using some languages, such as Go, most people don’t seem to need many third-party packages even though Go has a built-in package manager. The entrance to hell seems too far away and hard to get to [3]. The reason such languages don’t fall into this trap as quickly is that they have a really good core/standard library—batteries included. When using Go, for example, you don’t need any third-party libraries to make a web server; Go has it all there and you are done. Go even has a Go compiler built into the standard library; in fact it has two, a high-level one for tooling and one which is the actual compiler itself [4].

In real life, when you have a dependency, you are responsible for it [5]. If the thing that is dependent on you does something wrong, like a child or a business, you might end up in jail, as you are responsible for it.
Package dependencies are not that far different, but people trust them with little-to-no verification. And when something goes wrong, you are on the hook to maintain it. It is a thing you should worry about and take care of.

A common thing that people bring up about package managers is security risks. Those are indeed serious problems, especially when you blindly trust things you have just randomly started depending on off the internet. However, for my needs, those are not even the biggest worries for what I work on—but they might be for you!

At work we currently use SDL2 for our windowing, and we have found a huge number of bugs and hate it to the point that I/we will probably write our own window and input handling system from scratch for each operating system we target. At least it is our code; we can depend on it and correct it when things go wrong, and we are not taking on an extra dependency. I know SDL2 is used by millions of people, but we keep hitting all of the bugs. “But it’s great though!”. SDL3 might fix it all, but the time to integrate SDL3 would be the same time in which I could write it from scratch.

I am not advocating writing things from scratch. I wish there were libraries I could say “Just Work™”, but I still have to depend on them, and they are a liability; not just security liabilities but bug liabilities. Each dependency is a potential liability. People rarely, if ever, vet their code, especially third-party code. Most people assume random code off the internet works. This is a societal issue where programmers are very high-trusting in a place where you should have the least amount of trust possible. To put it bluntly, a lot of programmers come from highly developed countries which are in general high-trust societies, and then they apply that trust to the rest of their online world. This means you only need one person to do something malicious to something millions depend on to screw everything up. It doesn’t even have to be malicious, just a funny bug, where if you click one pixel on the screen, it starts Rick Rolling you.

n.b. This argument was made by ThePrimeagen, not myself:

We’ve had an explosion of engineers over the past ten years, who have come in just at the advent of all of these package managers coming out, for all of these languages, all at the same time. So programming felt very daunting; when you don’t know how something works, it feels very daunting, especially when you first start out. The thing that is confusing, especially with the high-trust argument that was being made, is that there is this weird Gell-Mann amnesia effect going on. You read one page and it’s all about horses and you feel “man, I know a lot about horses”. Then you flip to the next page and it’s about JavaScript and you go “man, they got everything wrong about JavaScript”. Then you flip to the next page and “man, I know a lot about beetles”. You’ve just forgotten that they were super wrong on the thing you understood, but you think everything else is correct. You’ll find engineers who will go “some of my coworkers are so horrible—hey, let me download this library off the internet, this is going to be awesome”. It’s crazy, as if they look and go “wow, one third of our staff cannot program anything; also I am going to trust every open source package I’ve downloaded”. So there is this Gell-Mann amnesia in programming, where people who do open source or open things are viewed as the best of the engineers, when that isn’t true.
Most people assume programming is like every other industry, like actual engineering which has been around for thousands of years, or modern science which has been around for about half a millennium. People trust whoever they perceive to be the “experts”, as you see all of these articles, books, conference videos, etc., and they all tell you stuff which for the most part does not necessarily seem true. I remember trusting those who were perceived to be “experts” espousing “wisdom”. However, as I have programmed more over the years, I have realized there is very, very little wisdom in this industry. This industry is 70–75 years old at best, and that is not old enough to have any good evolutionary selection pressure. It is not old enough to get rid of the bad things—it hasn’t evolved quickly enough. We will find out in a few HUNDRED years, and I mean hundreds, what actual good wisdom in programming is.

There are some laws we know, like Conway’s Law: “organizations which design systems (in the broad sense used here) are constrained to produce designs which are copies of the communication structures of these organizations”. Or to rephrase it in programming terms, the structure of the code will reflect the company that programs it. But that is one of the only laws that we know to exist.

My general view is that package managers (and not the other things I made distinctions about) are probably in general a net negative for the entire programming landscape, and should be avoided if possible.

Excerpt from the Odin FAQ: https://odin-lang.org/docs/faq/#how-do-i-manage-my-code-without-a-package-manager

Through manual dependency management. Regardless of the language, it is a very good idea that you know what you are depending on in your project. Copying and vendoring each package manually, and fixing the specific versions down, is the most practical approach to keeping a code-base stable, reliable, and maintainable. Automated systems such as generic package managers hide the complexity and complications in a project which are much better not hidden away. Not everything that can be automated ought to be automated. The automation of dependency hell is a case which should not be encouraged. People love to put themselves in hell, dragging others down with them, and a package manager enables that. Another issue is that for other languages, the concept of a package is ill-defined in the language itself. And as such, the package manager itself is usually trying to define the concept of what a package is, which leads to many issues. Sometimes, if there are multiple competing package managers with different definitions of what a package is, the monstrosity of a package-manager-manager arises, and the hell that brings with it.

[1] The term “evil” is being used partially hyperbolically to make a point. ↩︎
[2] I primarily use DuckDuckGo, but I also use Google and many others because they are pretty much all bad. ↩︎
[3] This sentence is a quote from ThePrimeagen from that video. ↩︎
[4] ThePrimeagen mentioned the “Klingon Approach” as a joke. This refers to Klingons (a species of humanoids in Star Trek) which have redundant organs. A very nerdy joke. ↩︎
[5] This comment was from José Valim (the creator of the Elixir programming language) and it was a very good point which I wanted to add to this article. ↩︎

Ginger Bill 2 months ago

If Odin Had Macros

I sometimes get asked if Odin has any plans to add hygienic macros or some other similar construct. My general, and now (in)famous, answer to many such questions is: No.

I am not against macros nor metaprogramming in general, and in fact I make metaprograms quite often (i.e. programs that make/analyse a program). However my approach with the design of Odin has been extremely pragmatic. I commonly ask people who ask for such things what they are specifically trying to solve. Usually they are not trying to do anything whatsoever and are just thinking about non-existent hypotheticals. However, in the cases when people are trying to do something that they believe they need a macro for, Odin usually has a different approach to doing it, and it is usually much better and more suited to the specific need of the problem.

This is the surprising thing about designing Odin: I was expecting I would need some form of compile-time metaprogramming at the language level, be it hygienic macros, compile-time execution, or even complete compile-time AST modification. But for every problem I came across, I found a better solution that could be solved with a different language construct or idiom. My original hypothesis has in general been shown to be wrong.

n.b. This is all hypothetical, and the construct is very unlikely to happen in Odin.

One of the best cases for a macro-like construct that I can think of which Odin does not support would be push-based iterators. Since Go 1.23, Go has a way of doing push-based iterators. I’ve written on this topic before (Why People are Angry over Go 1.23 Iterators) and how they work. The main issue I have with them, and which many other individuals have complaints with too, is that they effectively rely on closures nested three levels deep. You have a function that returns a closure that in turn takes in a closure which is automatically and implicitly generated from the body of a loop. This is honestly not going to be the best construct for performance by any stretch because it relies so heavily on closures.

Since Odin has no concept of a closure, being a language with manual memory management [1], it would not be possible to add the same approach that Go does. However, there is a way of achieving this in a similar fashion which requires no closures, and which would produce very good code due to its imperative nature. Using the example from the previous Go article I wrote, one can imagine a pseudo-syntax for such a construct; the internally expanded code in the backend would be a plain, flat imperative loop. This is of course a restricted form of a hygienic macro, only applicable to iterators, and it is the only way to achieve such a construct in the language. However, the way you’d write the macro is still extremely imperative in nature. The push-iterator approach allows you to store state/data in the control flow itself without the need for the explicit state which a pull-iterator would require. A more common example would be the iteration over a custom hash map.

This approach to iterators is very imperative and not very “composable” in the functional sense. You cannot chain multiple iterators together using this approach. I personally don’t have much need for composing iterators in practice; I usually just want the ability to iterate across a custom data structure and that’s it. I honestly don’t think the composability of iterators is an actual need most programmers have, but rather something that “seems cool” [2] to use.
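To make the custom-container case mentioned above concrete, here is roughly what iterating a toy hash map looks like with Go 1.23’s closure-based push iterators, the construct being discussed (this is a sketch; all types and names here are illustrative, and the hypothetical Odin pseudo-syntax is not what is shown):

```go
package main

import "fmt"

type entry struct {
	key string
	val int
}

// CustomMap is a toy custom hash map: a fixed bucket array of key/value slices.
type CustomMap struct {
	buckets [8][]entry
}

func (m *CustomMap) Put(key string, val int) {
	i := len(key) % len(m.buckets) // toy "hash"
	m.buckets[i] = append(m.buckets[i], entry{key, val})
}

// All returns a push iterator: the loop body (compiled into `yield`) is called
// for every entry, and the bucket/index state lives in this function's control
// flow rather than in an explicit iterator struct.
func (m *CustomMap) All() func(yield func(string, int) bool) {
	return func(yield func(string, int) bool) {
		for _, bucket := range m.buckets {
			for _, e := range bucket {
				if !yield(e.key, e.val) {
					return
				}
			}
		}
	}
}

func main() {
	var m CustomMap
	m.Put("a", 1)
	m.Put("bb", 2)
	for k, v := range m.All() { // requires Go 1.23+
		fmt.Println(k, v)
	}
}
```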
I don’t think I can think of a case when I’ve actually wanted to use reusable composable iterators either, and when I’ve had something near to that, I’ve just written the code in-line since it was always a one-off.

It’s unlikely that I would ever add a construct like this to Odin. Not because I don’t think it’s useful (it obviously is) but rather because it is a slippery-slope construct. A lot of features and constructs that have been proposed to me over the years fall into this category. A slippery slope is rarely a fallacy in my opinion [3] when it comes to design: if you allow it at one point, there isn’t much justification to stop it later on. Giving in to one whim does lead to giving in to another whim. In this case, I’d argue the slippery slope is designing more hygienic macros throughout the language, not just to modify pre-existing control flow but to add new control flow, or to add other constructs to the language.

This is why I have to say no. To keep Odin the language that is loved by so many people, I have to hold the reins and steer it with a clear vision in the right direction. If I allow anyone’s and everyone’s desire to slip into the language, it will become worse than a design-by-committee language such as C++. The road to hell is not paved with good intentions but rather with the lack of intention. So my current answer, at the time of writing, to this construct is: No.

[1] I’d argue that actual closures which are unified everywhere as a single procedure type with non-capturing procedure values require some form of automatic memory management. That does not necessarily mean garbage collection nor ARC; it could be something akin to RAII. But it is all still automatic, and against the philosophy of Odin. ↩︎
[2] Remember, you’re a bunch of programmers—you’re not cool. ↩︎
[3] I’d easily argue that if it actually is a slippery slope, it cannot be a fallacy. ↩︎

Ginger Bill 5 months ago

Unstructured Thoughts on the Problems of OSS/FOSS

Originally from replies to a Twitter thread: https://x.com/TheGingerBill/status/1914389352416993395

This is not a structured argument against FOSS/OSS but my uncommon thoughts on the topic.

I am not sure if I agree [that FOSS/OSS derives from the same thinking process as the ideology of communism], but I understand the sentiment. The fundamental issue is that software is trivially copyable.

I have loads of issues with FOSS and OSS [1]. And part of this “ideology” (as presented in the original post) is naïvety coupled with only first-order thinking and a poor understanding of ownership. Software isn’t property in the normal sense of that word, so trying to apply the same rules to it is why it starts to break down; coupled with the first-order thinking that many FOSS advocates do, it leads to unsustainable practices and single sources of dependency failure.

However, the simplest argument (which is technically wrong) is due to what I already said: it’s trivially copyable, and thus not really “property” in any traditional sense of that word. Software at the end of the day is a very complicated and complex implementation of an algorithm. It is neither physical property nor intellectual property, the latter of which is closer to an honour system than a property rights system, even if it has the term “property” in its phrase. So in a weird sense, it fits into a different category (or maybe a subcategory) because it’s not the algorithm/idea itself which is “precious” but rather the implementation of it—the software itself.

The foundations of the assumptions FOSS/OSS makes about the “right” to “redistribute” blend a few different aspects together. The licence itself is an honour system applied to the “software”. But the question is, is that applicable in the first place? There are a lot of oddities when it comes to copyright and trademark law, which are mostly done due to practicality rather than based on principles. A good example is that recipes [2] cannot be copyrighted, but from a principled aspect there isn’t any reason why not. Recipes have been passed around all over the place by numerous people over the years, so the origins are hard to trace, and even harder to enforce. This is why many industries have “trade secrets”, to protect their place in industry. Letting people know your “secrets” means that they are “secrets” no more. Even legally, (secret) recipes are classed the same as “trade secrets”.

You could argue that letting people have more knowledge is a “net benefit for society”, but that is the first-order thinking I am talking about. The assumption that “the truth will set you free” adds the assumption that everyone should know it in the first place. I am not making a pseudo-gnostic argument, but rather that some secrets are best kept, well, secret. It also makes sense from a business perspective not to let your competitors know how you do things if you want some sort of technical advantage. But this is still first-order thinking. To go second-order (which is still not that deep), it also means that people tend to rely on those “ideas” rather than evolving and generating more. It means that people just rely on the free things rather than trying to come up with other approaches.

To clarify further what I mean by first-order thinking: it’s thinking about the immediate results rather than the longer-term, more complex and complicated results which require thinking in higher orders. A good analogy would be a Taylor series.
In this analogy, first-order is a linear approximation, whilst second-order would be quadratic, etc. As you add more terms, you get a more accurate approximation of the actual function, but this fitting approach still has numerous limitations and flaws. And in the case of thinking, more and more orders might not be easy (or even possible) to think about.

Ideas have virtually no cost to them, or even negative cost, and as such, when something is free, people cannot rationally calculate a cost–benefit analysis of such ideas. It assumes a “marketplace of ideas” is possible, when a market requires a price mechanism to work. A rational marketplace requires a way to rationally calculate costs from prices (as costs are determined by prices, not the other way around). There is a reason the XKCD 2347 meme exists. People will rely on something just because it is free and “forget” (I am using that term loosely) that everything has a cost to it.

And I do run an Open Source (not FOSS) project: the Odin Programming Language. If there was a possibility of actually selling a compiler nowadays, I would, but because the expected price of a compiler for a new language is now free, that is nigh impossible. You have to either rely on charity or on companies that rely on the product paying for support. I am grateful for the amount of bug reports, Pull Requests, and general usage of Odin. It is extremely surreal that I work with a company that uses my language for all of their products, and I get paid to do it. Some of my time is spent working on the Odin compiler and Odin core library, but a lot of it is actually just working on the products themselves. And that’s what I made Odin for in the first place: a language I could actually program in to make things—a means to an end; not an end in itself.

There does seem to be a common feeling of guilt programmers have that they should give their knowledge to the world freely. But why are they feeling guilty? Is that a correctly placed emotion? Is it even valid? And why should you give your knowledge and wisdom away for free? Just because you got it for free?

I could also add, though I am not going to make this argument in general, that FOSS is specifically “forcing charity” on others, and that the act itself is not virtuous but vicious. This is why I assume the original poster is kind of saying it is similar to the “ideology of communism”, if I am guessing. He is viewing that as the “forced charity” aspect of the “ideology”. It is also a very specific conception of charity: the view that charity is free-as-in-beer rather than a virtue of the friendship of man (or in the traditional theological conception, “the friendship of man for God”). A lot of charity is “free” but a lot is not. You can still be beneficial to others whilst making a profit. There is nothing intrinsically wrong with making a profit in itself [3]. I’d trivially argue that people who release their source code when you pay for it, when they don’t have to, are still being charitable. Yes it does have a monetary cost, but that does not mean it isn’t a form of charity. But OSS/FOSS as a practice encourages, not forces, by telling people they ought to work for free and give the services away for free.

To be clear, my position on this topic is far from a common one, and I don’t think it’s a “strawman” by any stretch of the imagination.
There are many “definitions” of what OSS and FOSS are, but I’d argue most are idealistic forms which on the whole (except in certain instances) do not bring forth the ideals that they set out. To use a common cybernetics phrase (POSIWID): the purpose of a system is what it does, and there is no point claiming that the purpose of a system is to do what it constantly fails to do. “Of course” you can hypothetically make money from OSS/FOSS, but that doesn’t mean it is possible, sustainable, or even desirable. And I know this from first-hand experience. I am always grateful for all of the donations received for the Odin programming language through people’s acts of charity, and all of that money does go towards the development of Odin. However, it’s not a sustainable nor maintainable model—and I’d argue it has never been, nor could ever be. And for every purported exception to this rule, I’d try to argue that none of them succeed because of the OSS/FOSS model, but purely because of the individual programmer/developer.

The OSS/FOSS dream is just that, a dream that cannot live up to its “ideals”. For every hypothetical benefit it has, it should be stated that it is a hypothesis and not a theory—theories require evidence to be classed as such. Most of the benefits and freedoms of OSS/FOSS are double-edged swords which are also huge attack vectors. Vectors for security, sustainability, maintainability, and redistribution. Most of the industry is based on blind faith without any way to verify that blind trust. Regardless of whether I am correctly or incorrectly “defining” OSS/FOSS according to your preferred “definition”, the multi-order effects are rarely considered. And to bastardize Lysander Spooner: this much is certain—that it has either authorized such a tech oligarchy as we have had, or has been powerless to prevent it. In either case, it is unfit to exist [4].

All of these lists of ideals of essential freedoms—and I’d argue they are not principles in the slightest—have not aided in anything in the long run. To use the GNU list (the “four essential freedoms”) for example:

- Freedom 0 (the freedom to run the program as you wish, for any purpose): Assuming you can even run it in the first place. This statement is also kind of vague, but I won’t go into it too much.
- Freedom 1 (the freedom to study how the program works and change it, with access to the source code as a precondition): This is actually two “freedoms” combined together. The first is access to source code, and the second is that “secrets” should not exist (i.e. secret sauce/source).
- Freedom 2 (the freedom to redistribute copies): And why should that be free-as-in-beer in practice? I understand the “ideal” here is not suggesting it ought to be free-as-in-beer, but that is the end result. And I’d argue the vast majority of FOSS advocates would say paying for source (i.e. Source Available) is not Open Source. Why?! Do we allow this for other forms of intellectual property? If software is intellectual property, why is it different? I know I’ve made the argument it is in a separate category, but if it is actually a form of intellectual property, then our legal systems do not and should not allow for this.
- Freedom 3 (the freedom to distribute copies of your modified versions to others): This “freedom” is probably the most egregious with respect to the “ideology of communism” proclamation. The viral nature of licences like the GPL is a fundamentally pernicious aspect of the FOSS (not necessarily OSS) movement. It’s this idea of “forcing” charity on others. Of course you are “free” to not use the software, but if there are virtually no other options, you either have to write the thing yourself, or have the potential to have virtually no business model. This point is also an extension of freedom 2 and as such, not helping the case.

As to my previous statement, none of these are principles. “Freedom 0”, the foundation for the rest, isn’t even foundational. It pretty much relies on time preference between using pre-made software and home-made software. Software could be a service, but it’s also, again, an implementation of an algorithm/idea. Of course I know these “ideals” only apply to some FOSS advocates, and not necessarily any OSS advocates, but it’s still the general perception of it.

To conclude from this very unstructured brain-dump, I can understand the original sentiment that a lot of the mentality of advocating for OSS/FOSS comes from a similar standpoint to the “ideology of communism”, but I do not conceptualize it that way. I don’t think OSS nor FOSS has been a good thing for software, and it has probably been a huge accelerant of why software has been getting worse over the years.
[1] I am making a distinction between OSS (Open Source Software) and FOSS (Free [and] Open Source Software) in this article, but most of my critiques relate to both. ↩︎
[2] I’d argue patents are fundamentally of the same ilk as “recipes”, and software patents even more so. I personally don’t like patents that much except in a few cases, but I really do dislike software patents with a passion. However, that’s a different topic for a different day. ↩︎
[3] Unless you are of course a communist, who views that it is, but that’s a different discussion for a different day. ↩︎
[4] The original quote: “But whether the Constitution really be one thing, or another, this much is certain—that it has either authorized such a government as we have had, or has been powerless to prevent it. In either case, it is unfit to exist.” — Lysander Spooner, No Treason: The Constitution of No Authority. ↩︎

Ginger Bill 11 months ago

OpenGL is not Right-Handed

The original Twitter thread: https://x.com/TheGingerBill/status/1508833104567414785

I have a huge gripe when I read articles/tutorials on OpenGL: most people have no idea what they are talking about when it comes to coordinate systems and matrices. Specifically: OpenGL is NOT right-handed, and there is endless confusion over column-major “matrices”.

Let’s clear up the first point. Many people will say OpenGL uses a right-handed coordinate system, and loads of articles/tutorials keep repeating that view. So the question is, why do people think this? Modern OpenGL only knows about Normalized Device Coordinates (NDC), which are treated as a left-handed 3D coordinate space. This means that OpenGL “is” left-handed, not right-handed as many articles will tell you. The origin of “right-handed” comes from the fixed-function days, where the Z entries in certain fixed-function matrix functions were negated, which flips the handedness. Back in those days, this pretty much forced users to use a right-handed world-space coordinate system. The Z-axis flip is there to convert from the conventional world/entity space to the left-handed Normalized Device Coordinates (NDC)—and modern OpenGL only knows the NDC. PLEASE READ THE SPEC BEFORE MAKING TUTORIALS!

Now for the “column-major” part. This term is overloaded and means two things: what the default vector kind is (column-vector vs row-vector), and what the internal memory layout of a matrix is (array-of-column-vectors vs array-of-row-vectors). A good article on this has been written by @rygorous, so I won’t repeat it too much: Row major vs. column major, row vectors vs. column vectors. It works because (A B)^T = B^T A^T (where T is the transpose).

But where does this difference in convention come from? My best hypothesis is the physics vs mathematics distinction (just like everything else). In physics you default to using column-vectors and in mathematics you default to using row-vectors. It’s just a convention. It’s just another distinction between OpenGL and Direct3D which makes very little difference at the end of the day, especially since in both GLSL and HLSL you can specify the matrix layout if necessary. I personally prefer column-vectors because of my background in physics, but all that is important is that you are consistent/coherent in your codebase, and that all conventions are made extremely clear: handedness, axis direction, “Euler”-angle conventions, units, etc.

GLSL also doesn’t help in that the matrix types are written “backwards” compared to most other conventions: mat2x3 in GLSL means a 3-row-by-2-column matrix. GLSL therefore does not work with the well-known mnemonic for matrix dimensions: (A, B) x (B, C) = (A, C), e.g. (3 by 4) x (4 by 1) = (3 by 1).

Then there is the other related issue of having matrices that are stored in the array-of-column-vectors layout: many programming languages don’t have a built-in concept of a matrix, and an array-of-row-vectors is easier to write as text than an array-of-column-vectors. It is common to write a matrix literal in such a language, and if you are trying to adhere to “column-major” memory layout, then you will have to write it down transposed compared to how you would write it down on paper.

The Odin programming language has both built-in array programming (vectors & swizzling) and matrices, and as a result this textual issue is completely removed! It even allows you to treat vectors as if they are column-vectors or row-vectors, and things will “just work”!
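To make the transposition issue concrete in a language without a built-in matrix type, here is a small sketch in Go (the type and helper names are my own, and the column-vector convention is assumed):

```go
package main

import "fmt"

// Mat4 is a 4x4 matrix stored as a flat array of column vectors
// ("column-major" memory layout), as typically handed to OpenGL.
type Mat4 [16]float32

// On paper, a translation by (tx, ty, tz) is written with the translation in
// the last COLUMN. Stored as an array of column vectors, that column becomes
// the last four elements, so the literal below reads as the TRANSPOSE of what
// you would write on paper.
func translation(tx, ty, tz float32) Mat4 {
	return Mat4{
		1, 0, 0, 0, // column 0
		0, 1, 0, 0, // column 1
		0, 0, 1, 0, // column 2
		tx, ty, tz, 1, // column 3: the translation
	}
}

func main() {
	fmt.Println(translation(1, 2, 3))
}
```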
You can even specify the memory layout of the matrices if needed, with column-major being the default and row-major available as an alternative.

As for why the “right-handed” claim keeps being repeated: it “was” true (in the fixed-function days); it distinguishes OpenGL from Direct3D; people just repeat things without understanding them; and most who use OpenGL don’t really know anything about the GPU nor what the NDC is.

Ginger Bill 1 year ago

Marketing the Odin Programming Language is Weird

[Originally from a Twitter Thread] Original Twitter Post

Odin is a weird programming language to advertise/market. Odin is very pragmatic in terms of its design and overall philosophy. Unlike most popular languages out there, it has no “killer feature”. I’ve tried to design it to solve actual problems with actual solutions. Languages with a “killer feature” do make themselves “stand out” and consequently more “hypeable”. The problem is that those “killer features” are usually absolute nonsense, very niche, or rarely of any big benefit. Hype doesn’t make software better.

Odin isn’t a “big idea” language; rather, it aims to be an alternative to C on modern systems, and it tries to solve the problems that other systems languages have failed to address. The problems are usually small and unrelated to each other; not solvable with a single “big idea” [1]. And before people say “Odin’s ‘killer feature’ is that it has none”: how the heck do you market that? That seems like an anti-marketing feature. There isn’t an example of a popular programming language out there which hasn’t got a “killer feature”; even C’s was that in many ways it is a “portable assembly” (even if that is not actually true).

I know I have a bad habit when people ask me “why should I use Odin?”: I ask them what their actual needs are, and then tell them to try Odin as well as the other “competition” to see which they prefer. I know that’s honest English politeness to a tee, but that’s me as a man. I want Odin to be the best it can be, but without trying to sell the world to someone in the process. I want to show people the trade-offs that each design has, even in other languages, and not ignore those trade-offs. There are no solutions, only trade-offs.

The lack of “hypeableness” that Odin offers is kind of reflected in the people who seem to be attracted to Odin in the first place. They seem very pragmatic, and just want to get on with programming. And as such, they are the kind of people who don’t even share/hype Odin. I guess I don’t want to easily attract the people who are driven more by hype than by pragmatic concerns. I want to make software a better place, and attracting such people is detrimental to that endeavour in the first place. Just look at all the hyped JavaScript frameworks out there: do they really make better software? Or do they just optimize for a mythical “Developer Experience (DX)” [2] which just results in more crap which gets slower, bulkier, and offers less for the actual user?

This is probably why some of my “hot takes” have been doing the rounds every now and then. I am trying to find out what the actual problems are and see what possible options there are to either solve or mitigate them on a case-by-case basis. A lot of those “hot takes” have been a form of marketing, and I am trying to at least give myself some exposure. Every single one of them is just my opinion, which I usually think is quite mundane too. The web is huge, and thus there will be people who think those takes are shocking.

And to be clear, I don’t want to make Odin “hypeable” in the first place. I am glad with the steady, stable, albeit slow, growth that Odin has been getting. The people who try Odin out pretty much always stay for the long haul, as they fall in love with the language since it brings them the “joy of programming” back.
Something which I do advertise on the Odin website is the “joy of programming” aspect, but that is something that cannot be explained in words; it has to be experienced to be believed.

Another issue with advertising/marketing a systems-level programming language is that it is niche. It has manual memory management, high control over memory layout, SIMD and SOA support, etc. Great for people who need that level of control, but not needed for the general webdev. Obviously that isn’t the intended audience for Odin, but the problem is that in the social media landscape, those voices are the loudest, and many will actually shut down the voices of people who disagree with them just because they are not in the webdev domain.

A minor issue is that people are starting to think that Odin is “just” for gamedev. It makes me laugh because gamedev is pretty much the widest domain possible, where you will do virtually every area of programming. So “just” is a huge compliment, but clearly the wrong image. It’s like saying C++ is “just” for gamedev, when obviously it can be used for anything. The same goes for Odin, because it’s a systems programming language. Odin does bundle with many game/application oriented packages, but they are just that: packages.

This is another problem. “Odin” can be thought of in a few different ways:

- The language itself
- The language + the compiler
- The language + the compiler + the core library + the vendor library
- The entire ecosystem

When people speak of Python, they usually think of the entire ecosystem. I’ve worked with people who honestly thought Python was Numpy etc., and that you just had to download them. They had no distinction between any of the concepts; “Python” was just the “tool itself”. Since I am originally a C programmer (and language designer), all of those distinctions are made obviously clear to me. There is no single C compiler, and they are all different. The stdlib is dreadful and you want to replace it with your own thing straight away. But C still prevails. I make those distinctions because I believe it makes things a lot clearer about programming itself, and helps you understand what the flaws in the tool are, and thus what you can do to mitigate/work around those issues. But this does require a higher quality standard than the norm.

Another issue is that Odin is free. As weird as it sounds, for about 20 years now it has been nigh impossible to sell a compiler. People expect a programming language and compiler to be free, without caring how much time, money, and effort goes into building such a tool. Odin does have a GitHub Sponsors page (https://github.com/sponsors/odin-lang/) but we make very little, and definitely not enough to pay anyone full-time yet. We will pay for the odd bit of paid work when we have the money, but only for a few weeks here and there. I would love to have a few people working full-time on Odin, but it’s something we cannot afford. It’s also one of the main motivations too: to actually pay people for their work.

So I ask you, fellow internet users: how the heck do you advertise/market Odin (a systems-level programming language) when it does not have a discernible “killer feature” nor is “hypeable”, by its very nature of being a pragmatic language? [This bit is rhetorical; I won’t reply to it.]
[1] Name a popular language out there today, and I can name the “killer feature” for that language and why it became popular because of it. People may complain about the “feature” after many years, but it’s what brought them to it. ↩︎
[2] Developer Experience (DX) is probably a psyop that makes software even worse, at the expense of making the programmer think he is being more productive, when in reality he is being less so, because the DX is optimizing for that dopamine hit of “felt productivity” rather than actual productivity and quality. ↩︎

Ginger Bill 1 year ago

Why People are Angry over Go 1.23 Iterators

NOTE: This is based on, but completely rewritten from, a Twitter post: https://x.com/TheGingerBill/status/1802645945642799423

TL;DR: It makes Go feel too “functional” rather than being an unabashed imperative language.

I recently saw a post on Twitter showing the upcoming Go iterator design for Go 1.23 (August 2024). From what I can gather, many people seem to dislike the design. I wanted to give my thoughts on it as a language designer. The merged proposal can be found here: https://github.com/golang/go/issues/61897. It has an in-depth explanation of the design, explaining why certain approaches were chosen, so I do recommend reading it if you are familiar with Go.

The example in the original Tweet is clear enough in what it does, but the entire design of it is a bit crazy to me for the general/majority use case. From what I understand, the loop body is transformed into a callback which is handed to the iterator function. This means that Go’s iterators are much closer to what some languages have with a “for each” method (e.g. forEach in JavaScript) where you pass a callback to it. And fun fact, this approach is already possible in Go <1.23; it just did not have the syntactic sugar to use it within a for range statement.

I will try to summarize the rationale for the Go 1.23 iterators; it seems they were wanting to minimize a few factors:

- Make the iterator look/act like a generator from other languages (thus the yield)
- Minimize the need for sharing too many stack frames
- Allow for clean-up with defer
- Reduce data being stored outside of the control flow

As Russ Cox (rsc) explains in the original proposal:

Note regarding push vs pull iterator types: The vast majority of the time, push iterators are more convenient to implement and to use, because setup and teardown can be done around the yield calls rather than having to implement those as separate operations and then expose them to the caller. Direct use (including with a range loop) of the push iterator requires giving up storing any data in control flow, so individual clients may occasionally want a pull iterator instead. Any such code can trivially call Pull and defer stop.

Russ Cox goes into more detail in his article Storing Data in Control Flow about why he likes this approach to design. An example from the original PR shows a much more complex case requiring clean-up, where values are pulled directly; do not worry about what it actually does, the point is just the clean-up needed with something like defer.

NOTE: I am not suggesting Go does this whatsoever. When designing Odin, I wanted the ability for the user to design their own kind of “iterators”, but have them be very simple; in fact, just normal procedures. I didn’t want to add a special construct to the language just for this—that would complicate the language too much, which is what I wanted to minimize in Odin. One possible pseudo-proposal I could give for Go iterators would keep the iterator as a plain procedure which is called each iteration, with its state stored explicitly outside the loop. This is similar to what I do in Odin, BUT Odin does not support stack-frame-scope-capturing closures, only non-scope-capturing procedure literals. Because Go is garbage-collected, I see little reason not to utilize closures like this. The main difference is that Odin does not try to unify the ideas into one construct.

I know some people will think this approach is a lot more complicated. It is doing the opposite of what Cox prefers with storing data in control flow, and stores the data outside of it. But this is usually what I want from an iterator, rather than what Go is going to do.
And this is the problem: it removes the elegance of storing the data in the control flow—the push/pull distinction that Cox explains.

NOTE: I am very much an imperative programmer and I like to know how things actually execute, rather than trying to write “elegant looking” code. So the approach I wrote above is fundamentally about thinking in terms of execution.

n.b. The typeclass/interface route would not work in Go because it would not be an orthogonal design concept and would actually be more confusing than necessary—this is why I did not originally propose it. Different languages have different requirements as to what works in them.

The approach that Go 1.23 takes seems to fly in the face of the apparent philosophy of making Go for the general (frankly mediocre) programmer at Google who doesn’t want to (nor can) use a “complex” language like C++. To quote Rob Pike:

The key point here is our programmers are Googlers, they’re not researchers. They’re typically, fairly young, fresh out of school, probably learned Java, maybe learned C or C++, probably learned Python. They’re not capable of understanding a brilliant language but we want to use them to build good software. So, the language that we give them has to be easy for them to understand and easy to adopt.

I know many people are offended by this comment, but it’s brilliant language design by understanding who you are designing the language for. It is not insulting, but rather a matter-of-fact statement: Go was originally for the people who work(ed) at Google, and similar industries. You might be a “better and more capable” programmer than the average Googler, but that doesn’t matter. There is a reason people love Go: it’s simple, opinionated, and most people can pick it up very quickly.

However, this iterator design does seem out of character for Go, especially coming from someone like the proposer Russ Cox (assuming he was actually the original proposer) on the Go team. It makes Go a lot more complicated, and even more “magical” too. I understand how the iterator system works because I am literally a language designer and compiler implementer, but it also has the possible issue that it won’t be a well-performing approach, because of its need for closures and callbacks.

Maybe the argument for the design is that the average Go programmer is not meant to implement iterators but just use them, and that the majority of the iterators people will need will already be available in Go’s standard library, or provided by the third-party package itself. So the onus is put on the package writer, not the package user.

This is why I think a lot of people seem to be “angry” over the design. It goes against everything Go was originally “meant to be” in the eyes of a lot of people, and it seems like a really complicated “mess”. I understand the “beauty” in that it looks like a generator, with the yield callback and inline code approach, but I do not think that is necessarily in the vein of what Go is to a lot of people. Go does hide a lot of how the magic works under the scenes, especially with garbage collection, goroutines, the defer statement, and many other constructs. However, I think this is a little too magical, in that it exposes the magic to the user a little too much, whilst looking overly complex to the average Go programmer.

The other aspect people find “confusing” is that it’s a func that returns a func that takes a func as an argument; the body of the for loop is transformed into that innermost func, and all breaks (and other escaping control flow) are converted into a return false.
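To make that shape concrete, here is a minimal sketch of a Go 1.23 push iterator used with range-over-func (the example and names are illustrative, not taken from the original Tweet):

```go
package main

import "fmt"

// Backwards is level one: a function that returns...
// ...level two: the iterator function, which receives...
// ...level three: the yield callback the compiler generates from the body of
// the range loop.
func Backwards[E any](s []E) func(yield func(int, E) bool) {
	return func(yield func(int, E) bool) {
		for i := len(s) - 1; i >= 0; i-- {
			if !yield(i, s[i]) {
				// yield returned false: a break (or return, etc.) in the
				// user's loop body was rewritten into that false.
				return
			}
		}
	}
}

func main() {
	s := []string{"hello", "stop", "world"}
	// The body of this loop is compiled into the yield callback above.
	for i, v := range Backwards(s) { // requires Go 1.23+
		if v == "stop" {
			break // becomes `return false` inside the generated callback
		}
		fmt.Println(i, v)
	}
}
```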
It’s just three levels of procedures deep, which again feels like a functional language rather than an imperative one.

NOTE: I am not suggesting they replace the iterator design with what I am suggesting, but rather that a generalized iterator approach may not have been a good thing for Go in the first place.

For me at least, Go is an unapologetic imperative language with first-class CSP-like constructs. It’s not trying to be a functional-like language. Iterators are in that weird place where they do exist in imperative languages, but they are very “functional” as a concept. Iterators can be very elegant in functional languages, but in many unabashed imperative languages they always feel “weird” somehow, because they are being unified into a single separate construct rather than separating out the parts (initialize + iterate + destroy).

As I alluded to previously, in Odin an iterator is just a procedure call where the last value of the multiple return is a boolean indicating whether to continue or not. And because Odin does not support closures, the equivalent of the Go iterator in Odin is a little more code to type.

NOTE: Before people say “that looks even more complex”, please continue reading the article. Most Odin iterators are not like this, and I would never recommend writing such an iterator where a trivial for-loop would be preferred for both the reader and the writer of the code.

This does appear to be more complicated than the Go approach because it requires you to write more code. However, it’s actually a hell of a lot simpler to understand and comprehend, and even faster to execute. The iterator does not call the for-loop’s body; rather, the body calls the iterator. I know Cox loves the ability of Storing Data in Control Flow, and I do agree it is nice, but it does not fit well within Odin, especially with the lack of closures (because Odin is a manual memory managed language). An “iterator” is just syntactic sugar for calling a procedure each time around a loop and breaking when its final boolean return value is false.

Odin’s approach just removes the magic and makes it extremely clear what is going on. “Construction” and “destruction” must be handled manually with explicit procedures, and iteration is just a simple procedure which is called on each loop. All three constructs are handled separately rather than merged into one confusing construct as in Go 1.23. Odin does not hide the magic, whilst Go’s approach is actually very magical. Odin makes you handle the “closure-like” values manually, along with the construction and destruction of the “iterator” itself. Odin’s approach also trivially allows you to have as many multiple return values as you want! A good example of this is one of Odin’s core library packages, where the reader type can be treated like an iterator.

I will try not to get into a huge rant about C++ “iterators” in this article. C++ iterators are much more than mere iterators, whilst at least Go’s approach is still a mere iterator. I completely understand why C++ “iterators” do what they do, but 99.99% of the time I just want a mere iterator; not something that has all of the algebraic properties that allow it to be utilized in more “general” places. For people who don’t know C++ very well, an iterator is a custom struct/class which requires operators to be overloaded on it to make it act like a “pointer”, and historically a C++ “iterator” loop would be wrapped in a “macro” before the C++11 ranged-for loop syntax (and auto) became a thing. The biggest issue is that C++ “iterators” require you to define five different operations at a minimum.
The following three operator overloads:

- dereference (operator*)
- increment (operator++)
- inequality comparison (operator!=)

Along with two stand-alone procedures or bound methods which return an iterator value:

- begin
- end

If I were to design C++ mere iterators, I would have just added a simple method on a struct/class called something like next. And that’s it. Yes, it does mean the other algebraic properties are lost, but I honestly do not need those EVER for any problem I am working on. When I am working on those kinds of problems, the data will always be either a contiguous array, or I will implement the algorithm manually because I want to guarantee the performance will be good for that data structure. However, there is a reason I made my own language (Odin): I completely disagree with the entire C++ philosophy, and I want to get away from that madness. C++ “iterators” are a hell of a lot more complicated than Go’s iterators, but they are much more “direct” in how they execute. At least with Go, you don’t need to construct a type with five different properties.

I feel like Go’s iterators do make sense given the design principles applied to them, BUT they seem antithetical to what most people view Go as being. I know Go “has had to” get more complex over the years, especially with the introduction of generics (which I do think are actually well designed, with only a few syntax quibbles), but the introduction of iterators of this ilk feels wrong. I think the short of it is that it feels like it goes against the apparent philosophy of Go that many people believe in, coupled with it being a very functional way of doing things rather than an imperative one. And because of those reasons, I think that is why people don’t like the iterator stuff, even if I completely understand the design choices made. It doesn’t “feel” like what Go originally was to many people. Maybe the concerns of mine (and others) are overblown, and most people will never actually implement these iterators but just use them, so it will not matter that they are this complicated to implement.

Second-to-last controversial take: maybe Go needed to “gate-keep” even more and just tell the “functional bros” to go away and stop asking for such features which make Go a much more complicated and complex language.

Last controversial take: if it were me, I would just not have allowed custom iterators into Go whatsoever, but I am not on the Go team (nor do I want to be).
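For contrast with the push style shown earlier, here is a rough Go transliteration of the explicit pull-style pattern described above for Odin (all type and function names here are illustrative; the original post shows this in Odin, not Go). The state lives in a plain struct, construction is an ordinary call, and the loop body calls the iterator rather than the other way around:

```go
package main

import "fmt"

// intIter is an explicit pull-style iterator: all of its state lives in a
// plain struct rather than in captured control flow.
type intIter struct {
	data []int
	idx  int
}

func makeIntIter(data []int) intIter {
	return intIter{data: data}
}

// next returns the current element and whether iteration should continue;
// this mirrors the "last return value is a bool" convention described above.
func (it *intIter) next() (val int, ok bool) {
	if it.idx >= len(it.data) {
		return 0, false
	}
	val = it.data[it.idx]
	it.idx++
	return val, true
}

func main() {
	it := makeIntIter([]int{3, 1, 4, 1, 5})
	// The body calls the iterator; the iterator never calls the body.
	for {
		val, ok := it.next()
		if !ok {
			break
		}
		fmt.Println(val)
	}
}
```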

Ginger Bill 1 year ago

String Type Distinctions

[Originally from a Twitter Thread] Original Twitter Post

One thing many language & API designers get wrong is the concept of a string. I try to make a firm distinction between:

1. A string value
2. A string builder
3. The backing buffer for a string

They are not equivalent even if you can theoretically use them as such, and so many garbage-collected languages do use them as such. They have different use cases which don’t actually overlap in practice. Most of the issues with strings come from trying to merge these concepts into one.

In Odin, the distinction between a string value and a byte array is very important. A string is semantically a string and not an array of 8-bit unsigned integers. There is an implied character encoding (UTF-8) as part of the value. A string is also an immutable value in Odin. Having a string be immutable allows for a lot of optimizations, but in practice you never want to mutate the string value itself once it has been created. And when you do mutate it, it is most definitely a bug. This is why it is important to make a distinction between (1) and (3) and separate the concepts.

Another way to conceptualize the ideas is as follows:

- (3) is the “backing data”, an arena of sorts (fixed or dynamic)
- (2) is the set of operations on that buffer (fixed or dynamic)
- (1) is the final value that points into (3) and is produced by (2)

Coupled with Run-Time Type Information (RTTI), having a distinction between []byte and string allows for a lot of really decent (de)serialization tooling, especially for “magical” printing (e.g. fmt.println).

P.S. Even C makes a distinction between a string and an array of integers.
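Go happens to draw a similar three-way split, which makes for a convenient illustration of the distinction (a small sketch; the names and code below are illustrative, not from the original post):

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// (3) A backing buffer: a mutable slice of bytes with no implied encoding.
	buf := []byte{'h', 'e', 'l', 'l', 'o'}
	buf[0] = 'H' // fine: the buffer is mutable

	// (2) A string builder: operations that append into a backing buffer.
	var b strings.Builder
	b.WriteString("Hellope")
	b.WriteByte('!')

	// (1) A string value: an immutable view with an implied encoding (UTF-8).
	s := b.String()
	// s[0] = 'h' // does not compile: string values cannot be mutated

	fmt.Println(string(buf), s)
}
```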

Ginger Bill 1 year ago

The Video That Inspired Me To Create Odin

[Originally from a Twitter Thread] Original Twitter Post

Many people may not know this, but this video by Sean Barrett (@nothings) is partially the reason why I made the Odin programming language. I’ll explain what insights it gave me in this thread 🧵.

A lot of these seem “so obvious” but for some reason they never clicked for me before this video. A lot of the “higher level” “scripting” languages are not that much higher level than C. Those languages just have garbage collection and much better standard libraries than C. Those languages are “easier to use” than C just because they have loads of built-in data structures like arrays and hash maps, and built-in syntax for them (how “extravagant”, I know).

I was already well aware of the STB libraries that Sean developed, and their single-header-file style (which is mostly a cope for C’s flaws), but it never clicked for me that the library Sean used the most was the “stb.h” one. This led me to organize code I had used in previous projects and put it into a single and easy-to-use library which I could use in any of my C projects going forward in 2015/2016: https://github.com/gingerBill/gb/blob/master/gb.h

However, by around March, I began experimenting with just making an “augmented” C compiler so that I could add constructs to C which I found to be some of the most annoying things C lacked. The first, and most important, two constructs were slices and defer. I had already been using defer in C++ for many years; I really wished the concept was native to C. (My opinion has now changed for C, for many reasons.)

P.S. I wrote an article about what I had been doing back in 2015 for defer in C++11.

Eventually I realized I couldn’t just add constructs and features to C to “improve” my personal projects, and that my professional projects could not use them either. This led me to starting my own compiler to fix those issues I had with C. Interestingly, “Odin” (which was the original codename, and which just stuck) started life as a Pascal-like language with begin/end too, but I was trying to make it “C-like” enough for what I wanted. About 3 months later, I had implemented ~70% of Odin; the other ~30% took 7 years.

Odin wasn’t the first programming language that I had designed nor implemented, but it was the first where I actually went “you know what?—this will actually be useful for my general needs”. And now I use it for my living at JangaFX, where all of our products are written in it.

Other than slices and defer, the other things I did for Odin were to solve a few practical issues I had in C:

- A slice-based type
- Decent variadic parameters (slice-based ones)
- RTTI (not bug-prone format verbs in printf, and not my janky generated stuff)
- Actual libraries

I won’t discuss the benefits of slices here, but I will discuss the latter three things on that list. In C, pretty much my only use case for variadic parameters was printf-style things. Adding a decent form of that, coupled with RTTI, allowed me to have a runtime-typesafe printf. As for “actual libraries”, they need their own namespace. When the user imports a library, he can change the import name itself, rather than relying on the library prefixing everything (or a C++ namespace) to minimize namespace collisions. This eventually led to Odin’s package system.

I highly recommend everyone watch this talk by Sean Barrett! It was a huge inspiration for me, and just a general eye-opener for something which should have been obvious to me at the time but wasn’t.

Ginger Bill 1 year ago

Why I Hate Language Benchmarks

[Originally from a Twitter Thread] Original Twitter Post

I don't know if I have "ranted" about this here before but: I absolutely HATE comparing programming languages with "benchmarks". Language benchmarks rarely ever actually test for anything useful when comparing one language against another. This goes for ANY language.

Even in the best case scenario, you are comparing different compilers for the same language (and the same input). This means that you are just comparing how well the optimizing backends work for those compilers. Comparing different languages is not even in the same category. When comparing languages, you are not just comparing the optimizing backend of the compiler (assuming it even is compiled), but a completely different input. And most benchmarks rarely use semantically equivalent code to test against; the implementations vary widely.

And even in the case where the input is semantically equivalent AND the compiler backends for each language use the same "library" (e.g. LLVM), even then the semantics of each language may not allow for certain passes. LLVM is a good example which assumes C and C++ semantics. If your language does not adhere to C and C++ semantics, then most of the passes in LLVM cannot be used. And some compilers may have different default "flags" too, which makes dumb comparisons rarely equal (e.g. native vs portable microarchitectures).

Clarifying the semantically equivalent aspect: a printing procedure in one language may be drastically different too. Runtime vs compile-time type information or none at all, flushing after each call or not, richer formatting or not, etc. Are you even comparing the same thing?

There is also the "idiomatic" aspect which I hate too. "Idiomatic" in one language is a subjective and personal construct, and may produce very different results compared to "non-idiomatic" code. "Idiomatic" styles might produce slower code in general; the tests won't show this. One of the most egregious websites for this is https://programming-language-benchmarks.vercel.app . I recommend anyone to compare two languages of the same ilk and actually read the differences between the code in the tests; note how they are nothing alike most of the time, with different implementations and logic.

n.b. I do personally try to make distinctions between the language, the compiler, the core library, and the ecosystem, as much as I can. I know most people do not and just lump everything together as a single package. This is due to most languages having a single implementation. But if you come from a C or C++ background, like my own, then you are/were confronted with the selection of different toolchains from the start (MSVC, Clang, GCC, Intel, tcc, 8cc, etc), and you are usually forced to write/import your own core library too (e.g. C's standard library is awful). For a language like Lua, there are different implementations, but they pretty much offer the same "ecosystem" and just differ in how they are run (i.e. VM vs JIT). And for many people, the choice of which to use is dictated by the use case.

In summation, metrology is hard. You actually need to know what you are comparing against; whether that thing is even measurable (quantitatively or qualitatively) in the first place; and whether the things you are comparing are actually useful or valid for what you want to know. Comparing multivariate things against each other and going "yep, that entire 'language' is faster than this one" is misguided at best, and idiotic at worst.
Please don’t treat “benchmarks” such as these as mostly pseudo-science, not science. Just because it has loads of numbers and “measurements”, does not make it “scientific”.

Ginger Bill 3 years ago

Reverse Engineering Alembic

For my work at JangaFX, we require the use of the Alembic interchange file format. We have been using other libraries which wrap reading the Alembic file format, but because they are not the native library, they have numerous issues due to the generic interface. I spent nearly 4 days trying to get the official Alembic C++ API, https://github.com/alembic/alembic/ , to compile correctly and then use the API itself. Numerous times the compilation would get corrupted (it compiled but none of the tests even ran) and when I got it to work (on another machine), the API itself was a labyrinth to navigate. After this ordeal I decided to see how difficult it would be to create my own reader, and then writer, for the format from scratch. I don't really like to "reinvent the wheel" when I don't have to, but in this case, the C++ API was an absolute nightmare and I had spent more time trying to get that to work than actually solving the problem I had.

Making my own library for Alembic turned out to be a little more work than I expected. Even though it is an "open source" format, it is effectively completely undocumented, and the bits that are documented refer mostly to specific (not all of them) schemas rather than the internal format. This article will be a basic document of the internals of the Alembic file format.

Through my journey into "discovering" the Alembic format, I found out that Alembic is not actually a file format but masquerades as one. It is in fact two file formats with different memory layouts which can be determined based on the magic file signature. HDF5 is a hierarchical data format which is commonly used to store and organize large amounts of data in a hierarchical fashion. It's commonly used in the scientific field rather than the visual effects industry, and as such, this internal format is rarely used for storing data that would be useful for our tasks for importing (e.g. meshes, cameras, animations, etc). HDF5 is effectively a storage format for database-like data; it's a very good format in terms of usage (I will admit I have no idea about its internal design). It appears that the vast majority of "Alembic" files are not in the HDF5 format but in Ogawa.

The main format of concern is named Ogawa. It's a little-endian binary format (thank goodness) which was designed to be readable in-place for efficient multi-threaded data reading. This part of the file format is luckily documented 2 , and small enough that I could write it in a single tweet. Similar to HDF5, Ogawa is a hierarchical data format that is simple to read, but differing from HDF5, it is completely uncompressed.

Note: When decoding binary formats, it is very useful if you have a hex-viewer to see the bytes in a readable way. On Windows, I would recommend either Hex Editor Neo or 010 Editor. Even though the Ogawa format is mostly documented on that GitHub Wiki page, I still required a hex-viewer from time to time to ensure everything was laid out as I expected, especially for the more complex embedded data.

Its header is very simple: the magic signature is just 5 bytes containing the ASCII characters "Ogawa", followed by a byte-wide flag which states whether the file is open or closed to writing, followed by 2 bytes specifying the version of the format (which will always be the same value), and finally followed by an offset to the root "group". The flag is there to allow writing to the file and prevent other applications from trying to read it whilst it is not finished: it holds one value whilst the file is being written and is changed to another once the file is finished.
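As a rough illustration of the header layout just described, here is a minimal Odin sketch of my own (the struct and procedure names are assumptions, and the field order follows the prose above rather than the official specification):

```odin
package alembic_sketch

// 5-byte magic, 1-byte "frozen" flag, 2-byte version,
// then a 64-bit little-endian offset to the root group.
Ogawa_Header :: struct {
	frozen:      bool,
	version:     u16,
	root_offset: u64,
}

read_u64_le :: proc(b: []byte) -> u64 {
	v: u64
	for i in 0..<8 {
		v |= u64(b[i]) << uint(8*i)
	}
	return v
}

read_ogawa_header :: proc(data: []byte) -> (hdr: Ogawa_Header, ok: bool) {
	if len(data) < 16 {
		return
	}
	if string(data[:5]) != "Ogawa" {
		return
	}
	hdr.frozen      = data[5] != 0
	hdr.version     = u16(data[6]) | u16(data[7])<<8
	hdr.root_offset = read_u64_le(data[8:])
	ok = true
	return
}
```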
For my Alembic writer, I was storing the data in a memory buffer and then writing the entire data to the file at once, meaning this byte-wide flag was not that useful.

All group-based offsets in the Ogawa format are stored as a 64-bit unsigned little-endian integer (u64) representing the number of bytes from the base of the file 3 . These group-based offsets come in two different flavours: an Ogawa Group or an Ogawa Data (byte stream). The offset encodes a flag in its highest bit (63rd bit) to indicate which flavour: Group if 0 and Data if 1. The remaining 63 bits represent the offset to this specific kind of value; if the remaining 63 bits are all zero, it means it is a terminating value (not pointing to any actual memory). A Group begins with a u64 representing the number of children following it in memory (all of which are another offset, u64). If the value is 0, this represents an empty group that has no children. A Byte-Stream begins with a u64 representing the number of bytes following it in memory.
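Continuing the sketch above, decoding one of these group-based offsets might look like the following illustrative Odin snippet (again my own naming, not from the specification):

```odin
// The top bit selects Group (0) vs Data (1); the low 63 bits are the byte
// offset from the base of the file; an all-zero remainder is a terminator.
Ogawa_Child_Kind :: enum {
	Terminator,
	Group,
	Data,
}

decode_child_offset :: proc(raw: u64) -> (kind: Ogawa_Child_Kind, offset: u64) {
	offset = raw & ((1 << 63) - 1)
	if offset == 0 {
		kind = .Terminator
	} else if raw >> 63 == 0 {
		kind = .Group
	} else {
		kind = .Data
	}
	return
}
```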
This basic file format is very simple to read but it has zero semantic information to make it useful for something like an interchange format. The semantic structure is what I am calling the Meta-Schema for Alembic. To begin decoding the format, I created a basic Ogawa to JSON viewer to see the tree structure of the file format. As I began to understand more of the format, the JSON structure became much more complex but richer in semantic data.

Note: I highly recommend converting any hierarchical format into something like JSON just as a debug view alone, since there are many tools such as Dadroit that allow for easy viewing. But if you want to make your own format and viewer with your own tools, go ahead!

After a bit of exploring, the meta-schema begins with the data for the root group (its children are listed at the end of this article). From this point on, it can be assumed that a lot of the data is unaligned to its natural alignment and must be read accordingly.

The time sampling data is only stored at one child of the root group and referenced elsewhere by index. Each value containing the samples is stored in contiguous memory, and this memory needs to be read in order to determine the number of samples.

All metadata is stored as a string. Each entry is a key and a value separated by one ASCII character, and the entries themselves are separated by another. Deserializing this is a very simple operation to do in most languages. These strings are NOT NUL terminated like in C because they are bounded by a length that usually prefixes them. As I am using Odin, its string type is length-bounded, making this process a lot easier compared to C.

The indexed metadata is only stored at one child of the root group and referenced elsewhere by index. Each value is stored in contiguous memory, and there is a fixed maximum number of indexed metadata entries allowed.

The root group stores an offset to the first group representing Object Data. Object data contains a list of the nested children as well as properties for its "this" Object. The Object Headers are stored in contiguous memory for each entry, but the value of the metadata index field (see below) determines whether the metadata value is stored inline or stored in the indexed metadata at the root group.

There are three different forms of properties that may be stored within Alembic: Scalar, Array, and Compound. A designated child of the Object Data group must represent Compound property data. Compound property data is a group which stores children for properties. It's effectively a hash map (like a JSON object) containing keys, properties, and the values of more property data. The compound property headers byte-stream contains a list of the nested property headers.

n.b. The property headers are the weirdest aspect of the entire format; they are not simply expressible in terms of basic data structures and require a lot of logic for the encoding. Each header contains a packed integer value which encodes information about the property (its bit layout is listed further below). For the property headers containing inline metadata, the metadata is stored directly after the header.

Scalar property data is constant if the relevant flag is set in the property header. The number of samples is given by a field in the property header. The indexed samples are stored in the children groups of the property data, beginning at relative offset 0. Depending on whether the samples are constant or not, the actual index needs to be remapped with a small procedure. The data for each sample contains both the key/hash of the data and the uncompressed data. The hash is stored at the beginning and is represented as a 16-byte (128 bit) SpookyHash; the data is stored after this hash.

Array property data is constant if the relevant flag is set in the property header. The number of samples is given by a field in the property header. Indexing into the samples is a little more complex than scalar data. The number of groups stored for the sample data is doubled up, where each pair of groups represents the sample data and then the data for the dimensions. The sample data is similar to that of the scalar data, but the extra group that represents the dimensions is an array of values which represents the tensor shape of the data. Most of the time, the dimensions are just a single element representing the number of elements in the array. The data group has the same kind of hash prefixing the data (using SpookyHash).

I will discuss the schemas of Alembic at a later date since they could fill their own article. For the needs we have, we only really care about reading in Cameras and Xforms (Transforms), and writing out Particles/Points.

I think the general Ogawa format is absolutely fine, but because of its lack of semantic structure, it requires applying one to it. Even JSON has some semantic meaning by virtue of being a text format (which I have been using as a format to display and debug the Ogawa data).

Note: I would never recommend using JSON when another file format is a better fit (especially if you have the luxury of using a custom format); JSON is too general for the vast majority of use cases. If you have the choice of using an interchange format, always use a binary one (even a custom one) since it is pretty much never the case that you will need to modify the file in a text editor.

The design of the property header is very strange to me. It seems like it is trying to be as compact as possible, but the rest of the file format is uncompressed. I cannot see why this part would need to be compressed/compacted when the property data it refers to is uncompressed. The size hint bit is really bizarre and I personally would have just stuck with a single integer type to keep things simpler. The use of indexed metadata also seems a bit bizarre: because the entire format is offset-based, the indexed entries could have pointed to the same memory address to save memory. I know this means that the hierarchy is now not strict, but for metadata, I personally don't think it would be much of an issue. Speaking of offset-based file formats, one potential issue with malicious files is forming cycles in the hierarchy, which could cause many readers to crash. I am also not sure why SpookyHash was used.
It is not really an easy hashing algorithm to implement (~330 LOC of Odin) and not necessarily better than other hashes out there. It is also not that commonly used compared to simpler (and maybe better) hashes.

After a lot of work, I got a fully functioning Alembic reader and writer that was ~2000 LOC of Odin (including the necessary schemas we required at JangaFX). This is a hell of a lot smaller and simpler than the official Alembic C++ API for the same amount of functionality.

The root group layout (every file that I tested always began with 6 children):
Child 0 is a byte-stream which stores something akin to the archive version.
Child 1 is a byte-stream which stores the file version, which must be a specific value; from the reverse engineering, it appears that the value stored, read in decimal, represents the file format version.
Child 2 is a group to the first Objects based at the root.
Child 3 is a byte-stream which stores the file's metadata.
Child 4 is a byte-stream which stores the time samples and max samples information.
Child 5 is a byte-stream which stores indexed metadata which nested properties may use later. In my Alembic writer, I never used any indexed metadata and just did everything inline since it was easier to handle.

The Object Data layout:
One child stores the data for the properties in the form of Compound Property Data.
The other children store the objects.
One child stores the data for the child Object Headers. If the metadata index field is equal to a sentinel value, the metadata is stored directly after it; if it is any other value, it represents the index of the metadata stored at the root group (this value must be less than the number of indexed metadata entries, otherwise it is an invalid file).

The Compound Property Data layout:
The children represent the nested property data which is denoted by the property headers.
One child is a byte-stream which contains the compound property headers.

The property header bit layout (the packed value which encodes information about the property):
Some bits represent the property type.
Some bits represent the size hint.
Some bits represent the plain old data type (byte, string, etc; some of which I am yet to see in the wild).
One bit states whether additional fields are set for a Compound type.
One bit states if the property data is homogenous.
Some bits represent the "extent" of the value.
Some bits represent the metadata index. This follows the same rules as Object Headers: if it is equal to a sentinel value, the metadata is stored directly after it; if it is any other value, it represents the index of the metadata stored at the root group (this value must be less than the number of indexed metadata entries, otherwise it is an invalid file).
The remaining bits: I have no idea what these represent, if anything.

Technically it's Ogawa-Flavour Alembic with its own specific meta-schema, of which I will get to that later in the article.  ↩︎
https://github.com/alembic/alembic/wiki/Ogawa-Specification   ↩︎
I'm a huge fan of binary file formats designed this way, where they effectively use internal relative pointers; I should explain my preferred approach to designing binary file formats in the future.  ↩︎

Ginger Bill 3 years ago

Multiple Return Values Research

I have recently been thinking about multiple return values as a concept, and wondering if there has been any kind of literature into the topic of "true" multiple return values which are not emulated through tuples. My current working hypothesis (unless there is evidence to the contrary) is that Odin has invented a new kind of type system, something akin to polyadic expressions.

It appears that Go and Odin are the only ones with what I call "true" multiple return values, but Odin's approach is a lot more sophisticated than Go's and extends the concept further. Odin's idea is pretty simple: an expression may have n values, where n is a natural number ≥0. Meaning an expression may have 0, 1, 2, 3, 4, etc values, of which each value has a type; it is not that each expression has a type. An expression with 0 values has no (final) type to it (not even a void-like type as in some languages).

A way of thinking about this is that Odin is fundamentally a polyadic language rather than a monadic language, like pretty much everything else. However most languages seem to be polyadic in one aspect: input parameters to procedures/functions. In most languages, you have multiple (monadic) inputs but only one (monadic) output. Odin's approach is to just extend this to the rest of the language's type system, not just procedure inputs.

In languages with tuples, they can emulate the ability to have multiple return values, especially if the language has sugar encouraging it; Python is a basic example where tuples can be used to emulate multiple return values. Other languages may not use tuples to achieve multiple return values; languages such as Common Lisp and Lua do things slightly differently by pushing multiple values, BUT the handling of the number of values is handled dynamically rather than statically. This is similar to the multiple values mechanism in Common Lisp, Scheme, etc., which can suffer from exactly the same issues in terms of dropping-of-value bugs, especially due to the dynamic typing approach. However, dynamically typed languages are not my area of research for this specific question; statically and strongly typed ones are. And in that case, all of the languages that allow for something like multiple return values (except for Odin and Go) use tuples.

However tuples can still be passed around as a singular/monadic intermediary value. Tuples are effectively a form of monad which wraps values of (possibly) different types (essentially an ad-hoc record/struct type), and some languages may allow for explicit or implicit tuple destruction.

Odin does differ from Go on multiple return values in one important aspect: Go's approach is a bit of a hack which only applies in the context of variable assignments. Certain assignments are valid in Odin but not valid in Go (a sketch follows below). The reason is that Go's approach for multiple returns only works in the context of assignment, and the assignment assumes that there are either N expressions on the RHS or 1 expression on the RHS to match the N variables on the LHS. Odin's approach is a little more sophisticated here: the assignment assumes that there are N values on the RHS for the N variables on the LHS, where the N values may be made from M expressions (N ≥ M).
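Here is an illustrative Odin sketch of my own (procedure and variable names are assumptions) showing N variables being satisfied by M expressions, along with the optional-ok idiom mentioned later:

```odin
package polyadic_example

import "core:fmt"

min_max :: proc(a, b: int) -> (lo, hi: int) {
	if a < b {
		return a, b
	}
	return b, a
}

main :: proc() {
	// 3 variables on the left from 2 expressions on the right (N >= M),
	// because the first expression carries two values. Not valid in Go.
	x, y, z := min_max(1, 9), 5
	fmt.println(x, y, z) // 1 9 5

	// The optional-ok idiom: a map index yields a value and an ok boolean.
	m := map[string]int{"a" = 1}
	defer delete(m)
	v, ok := m["a"]
	fmt.println(v, ok)
}
```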
Another fun thing about Odin, because of this fact, is that you can mix multi-valued and single-valued expressions in the same assignment (as in the sketch above), which is also not possible in Go. Internally, Odin's approach could be implemented identically to as if a procedure had an input tuple and an output tuple which are then automatically destructed, but semantically Odin does not support first-class tuples.

n.b. For this research, implementation details are not of concern, only usage/grammar.

Polyadic expressions are also used in other areas of the Odin programming language. One of the most common is the optional-ok idiom 1 .

I am wondering if this approach to multiple return values is all Rob Pike's fault, since the only languages I can find that are like what I speak of are:

Alef, Limbo, Go (Rob Pike related, the first two being non-destructing tuples)
Odin (me, who has been influenced by many of Pike's languages)

Alef and Limbo had multiple return values through tuples, but tuples were still not really first-class data types in those languages. In many ways, Alef and Limbo were the main precursors to Go (not just Oberon and Modula). Limbo's declaration syntax comes from Newsqueak, which is also the precursor to Odin's very own declaration syntax 2 . So maybe the reason there is little literature (that I can find) on this topic is purely because it is limited to languages related-to/influenced-by Rob Pike.

It might be that the approach taken by Odin is unique to Odin; it is also one of my favourite things about Odin. And I am really glad I added it so early on, since it has shown its utility really quickly. Coupled with named return values, bare return statements, and other features, it is an absolute pleasure to work with.

If anyone has any more information about this topic, please feel free to contact me with it! P.S. If there is no research done into this area, it is a good sign since there is so much left to discover in Programming Language Theory.

This is technically 3 different addressing modes depending on the context ( map-index, optional-ok, or optional-ok-ptr )  ↩︎
https://www.gingerbill.org/article/2018/03/12/on-the-aesthetics-of-the-syntax-of-declarations/   ↩︎

Ginger Bill 3 years ago

Memory Allocation Strategies - Part 6

In the previous article, we discussed the free list allocator and how it is commonly implemented with a linked list or a red-black tree. In this article, we will discuss the Buddy Algorithm and how it applies to memory allocation strategies.

In the previous article, the red-black tree approach was briefly discussed as a way to improve the time complexity for searching for a free memory block, and get best-fit as a consequence. One of the big problems with free lists is that they are very susceptible to internal memory fragmentation due to allocations being of any size. If we still require the properties of free lists but want to reduce internal memory fragmentation, the Buddy algorithm 1 works on a similar principle.

The Buddy Algorithm assumes that the backing memory block is a power-of-two in bytes. When an allocation is requested, the allocator looks for a block whose size is at least the size of the requested allocation (similar to a free list). If the requested allocation size is less than half of the block, it is split into two (left and right), and the two resulting blocks are called "buddies" 2 . If the requested allocation size is still less than half the size of the left buddy, the buddy block is recursively split until the resulting buddy is as small as possible to fit the requested allocation size.

When a block is released, we can try to perform coalescence on buddies (contiguous neighbouring blocks). Similar to free lists, there are particular conditions that are needed. Coalescence cannot be performed if a block has no (free) buddy, the block is still in use, or the buddy block is partially used.

Each block in the buddy allocator will have a header (similar to our free list in the previous article) which stores information about it inline. It stores its size and whether it is free. We do not need to store a pointer to the next buddy block as we can calculate it directly from the stored size.

n.b. Many implementations of a buddy allocator use a doubly linked list here and store explicit pointers, which allows for easier coalescence of neighbouring buddies and forward and backwards traversal. However this does add an extra cost of increasing the size of the allocation header for the memory block.

As described above, to get the best fitting block a recursive splitting algorithm is required. We need to continually split a block until it is the optimal size for the allocation of the requested size. Searching for a free block that matches the requested allocation size can be achieved by traversing an (implicit) linked list bounded by head and tail pointers 3 . If a block for the requested allocation size cannot be found, but there is a larger free block, the above splitting algorithm is used. If there is no free block available, the procedure returns a null pointer to represent that the allocator is (possibly) out of memory 4 .

This algorithm can suffer from undue internal fragmentation. As an exercise for the reader, you can coalesce on neighbouring free buddies 5 as you iterate.

Initialization of the buddy allocator itself is relatively simple. The allocator itself stores three pieces of information: the head block (the same address as the backing memory data), a sentinel pointer which represents the upper memory boundary of the backing memory data (it is not a "real" block), and the alignment for each allocation. The initialization procedure does some minor checks on the data itself with assertions.
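The original series implements this in C; here is a small illustrative Odin sketch of my own (names and fields assumed) of the block header and the recursive splitting step described above:

```odin
Buddy_Block :: struct {
	size:    uint, // size of this block in bytes, always a power of two
	is_free: bool,
}

buddy_block_next :: proc(block: ^Buddy_Block) -> ^Buddy_Block {
	// The next block is computed from the stored size alone.
	return cast(^Buddy_Block)(uintptr(block) + uintptr(block.size))
}

buddy_block_split :: proc(block: ^Buddy_Block, size: uint) -> ^Buddy_Block {
	// Keep halving the block until it is the smallest power of two that
	// still fits the requested size (header included by the caller).
	// A real implementation would also stop at a minimum block size.
	b := block
	for size <= b.size/2 {
		b.size /= 2
		next := buddy_block_next(b)
		next.size    = b.size
		next.is_free = true
	}
	return b
}
```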
n.b. This implementation of a buddy allocator does require that all allocations have the same alignment in order to simplify the code a lot. Buddy allocators are usually a single strategy as part of a more complicated allocator and thus the assumption of alignment is less of an issue in practice.

Allocation is relatively straightforward since we have set everything else up already. We first need to increase the requested allocation size to fit the header and align forward before we find a best fitting block. If one is found, we then need to offset from the header to the usable data. If a block cannot be found, we can keep coalescing any free blocks until we cannot coalesce any more and then try to look for a usable block again. If no block is found, we return a null pointer to signify that we are out of memory with this particular allocator. The general time-complexity of this allocation algorithm is O(N) on average with a space complexity of O(log N).

n.b. Buddy allocators are still susceptible to internal fragmentation; it is less than with a normal free list allocator and, because of the power-of-two restriction, it is less severe in practice.

Freeing memory is very trivial with this algorithm since all we need to do is mark the header (which is stored before the passed pointer) as being free. The general time-complexity of freeing memory is O(1). If you wanted to, coalescence could be performed straight after this free to aid in minimizing internal fragmentation.

The buddy allocator is a powerful allocator and a conceptually simple algorithm, but implementing it efficiently is a lot harder than all of the previous allocators that have been discussed in this series. In the next set of articles, I will discuss virtual memory at length: how it works, how we can utilize it, and its benefits.

The Wikipedia article is not that easy to understand, especially from the basic table diagram given in the Example section. (Accessed 2021-12-01)  ↩︎
Just like Jackie Chan and Chris Tucker in Rush Hour .  ↩︎
The tail is just the end of the backing memory buffer, representing a sentinel value of the memory boundary; it is not a true block.  ↩︎
The allocator may have enough memory left but none of it is contiguous due to too much internal fragmentation.  ↩︎
All becoming a single buddy, trying to be someone else: https://www.imdb.com/title/tt0120601/   ↩︎

Ginger Bill 3 years ago

Memory Allocation Strategies - Part 5

In the previous article, we looked at the pool allocator, which splits the supplied backing buffer into chunks of equal size and keeps track of which of the chunks are free. Pool allocators are fast allocators that allow for out-of-order free in constant time O(1) whilst keeping very little fragmentation. The main restriction of a pool allocator is that every memory allocation must be of the same size.

A free list is a general purpose allocator which, compared to the other allocators that we previously looked at, does not impose any restrictions. It allows allocations and deallocations to be out of order and of any size. Due to its nature, the allocator's performance is not as good as the others previously discussed in this series.

There are two common approaches to implementing a free list allocator: one using a linked list and one using a red-black tree. Using a linked list is the most common approach and what we'll look at first.

As the title of this section suggests, we'll be using a linked list to store the address of free contiguous blocks in the memory along with their sizes. When the user requests memory, it searches the linked list for a block in which the data can fit. It then removes the element from the linked list and places an allocation header (which is required on free) just before the data (similar to what we used in the article on stack allocators).

For freeing memory, we recover the allocation header (stored before the allocation) to know the size of the block we want to free. Once that block has been freed, it is inserted into the linked list, and then we try to coalesce contiguous blocks of memory together to create larger blocks.

n.b. The following implementation does provide some constraints on the size and alignment of requested allocations with this particular allocator. The minimum size of an allocation must be at least the size of the free list node data structure, and the alignment has similar requirements.

The data structures required to implement the linked-list-based free list allocator are a node per free block and a header per allocation.

To allocate a block of memory within this allocator, we need to look for a block in the memory in which to fit our data. This means iterating across our linked list of free memory blocks until a block has at least the size requested, and then removing it from the linked list of free memory. Finding the first block is called a first-fit placement policy as it stops at the first block which fits the requested memory size. Another placement policy is called best-fit, which looks for the smallest available free block of memory which fits the memory size. The latter option reduces memory fragmentation within the allocator.

In the diagram there are three free memory blocks, but not all are appropriate for the size of the memory allocation that is requested (plus the header). When an allocation has been made, the free list will then be corrected to remove the used node. This algorithm has a time complexity of O(N), where N is the number of free blocks in the free list.

When freeing a memory block that was allocated with our free list allocator, we need to retrieve the allocation header so that the memory block can be treated as a free memory block. We then need to iterate across the linked list of free memory blocks until we get to the right position in memory order (as the linked list is sorted), and then insert the new node at that position.
This can be achieved by looking at the previous and next nodes in the list, since they are already sorted by address. When inserting into the free list, we want to coalesce any free memory blocks which are contiguous. As we were iterating across the linked list we had to store both the previous and next free nodes, which means that we may be able to merge these blocks together if possible. This algorithm has a time complexity of O(N), where N is the number of free blocks in the free list. General utilities are also needed for free list insertion, removal, and calculating the padding required for the header.

The other way of implementing a free list is with a red-black tree; the purpose of which is to improve the speed at which allocations and deallocations can be done. With the linked list from above, any operation made needs to iterate across the list linearly (O(N)). A red-black tree reduces its time complexity to O(log(N)), whilst keeping the space complexity relatively low (using the same trick as before by storing the tree data within the free memory blocks). And as a consequence of this data-structure approach, a best-fit algorithm may always be used (in order to reduce fragmentation whilst keeping the allocation/deallocation speed). The minor increase in space complexity is due to a (sorted) doubly linked list being required instead of a singly linked list, but as a consequence, it allows coalescence operations in O(1) time. This approach is a common aspect of many malloc implementations, but note that most utilize multiple different memory allocation strategies that complement each other. I will not demonstrate how to implement this approach in this article and leave it as a small exercise for the reader; the following diagram may help.

The free list allocator is a very useful allocator for when you need a general purpose allocator that requires allocations of arbitrary size and out-of-order deallocations. In the next article, I will discuss the buddy memory allocator .
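As a closing aside, here is an illustrative Odin sketch of my own (the original series is written in C, and the field names here are assumptions) of the data structures the linked-list approach above relies on:

```odin
// One node per free block, stored inside the free memory itself.
Free_List_Node :: struct {
	next:       ^Free_List_Node,
	block_size: uint,
}

// Stored just before each allocation so it can be freed and coalesced later.
Free_List_Allocation_Header :: struct {
	block_size: uint,
	padding:    uint,
}

Placement_Policy :: enum {
	First_Fit, // stop at the first block that is big enough
	Best_Fit,  // search the whole list for the smallest block that fits
}

Free_List :: struct {
	data:   []byte,
	used:   uint,
	head:   ^Free_List_Node,
	policy: Placement_Policy,
}
```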

Ginger Bill 4 years ago

The Value Propagation Experiment Part 2

[Originally from a Twitter Thread] Original Twitter Post

I have revisited The Value Propagation Experiment and have come to a different conclusion as to why it failed and how I recovered it. The recovery work has been merged into master now with this PR: https://github.com/odin-lang/Odin/pull/1082

I think there were three things which were confusing and which made it look like a failure of an experiment:

The original keyword was a confusing name for what the semantics were.
Using it as a prefix was the wrong place.
Unifying it with another construct was a bad idea for many reasons.

Changes in the PR:

The built-in procedure became a binary operator instead (meaning it became a keyword).
or_return was added as a keyword.
or_return became a suffix-operator/atom-expression.

Most people understood the purpose and semantics pretty intuitively, but many were very confused regarding the semantics of the original construct. My conclusion was that the keyword-name and positioning were the main culprits for that.

Another aspect which I stated in the original article was that I thought the construct was not as common as I originally thought. And for my codebases (including most of Odin's core library), this was true. However it was false for many codebases, including the new core:math/big. In the core:math/big package alone, or_return was able to replace 350+ instances of the manual error-checking idiom. The more C-like a library is in terms of design, the more it required this construct. It appears that when a package needs it, it REALLY needs it. When this correction was made, there were 350+ instances of or_return in ~3900 LOC, which is ~9% of the (non-blank) lines of code. That is definitely a useful construct.

Another (smaller) package that found or_return useful was the new core:encoding/hxa, which is an Odin native implementation of @quelsolaar 's HxA format. https://github.com/quelsolaar/HxA The entire implementation for reading and writing is 500 LOC, of which there are 32 uses of or_return.

I do believe that my general hypotheses are still correct regarding exception-like error value handling. The main points being:

Error value propagation ACROSS library boundaries
Degenerate states due to type erasure or automatic inference
Cultural lack of partial success states

The most important one is the degenerate state issue, where all values can degenerate to a single type. It appears that you and many others pretty much only want to know if there was an error value or not and then pass that up the stack, writing your code as if it was purely the happy path and then handling any error value. Contrasting with Go, Go has a built-in concept of an error interface, and all error values degenerate to this interface. In practice, from what I have seen of many Go programmers, most people just don't handle error values and just pass them up the stack to a very high place and then pretty much handle any error state as if they are all the same degenerate value: "error or not". This is now equivalent to a fancy boolean.

I do hope that this addition of or_return does aid a lot of people when making projects and designing packages. It does appear to be very useful already for many of the core library package developers for Odin. A sketch of the idiom that or_return replaces follows below.
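Here is an illustrative Odin sketch of my own (the error union and procedure names are assumptions) contrasting the manual error-checking idiom with or_return:

```odin
package or_return_example

Parse_Error :: struct {
	offset: int,
}

Error :: union {
	Parse_Error,
}

parse_header :: proc() -> Error { return nil }
parse_body   :: proc() -> (n: int, err: Error) { return 42, nil }

// The manual idiom: check and early-return after every call.
read_manual :: proc() -> (n: int, err: Error) {
	err = parse_header()
	if err != nil {
		return
	}
	n, err = parse_body()
	if err != nil {
		return
	}
	return
}

// With or_return, the trailing error value is popped off and, if it is not
// nil, assigned to the named return value and returned early.
read :: proc() -> (n: int, err: Error) {
	parse_header() or_return
	n = parse_body() or_return
	return
}
```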

Ginger Bill 4 years ago

The Value Propagation Experiment

[Originally from a Twitter Thread] Part 2 of this Experiment

I recently experimented with adding a feature into Odin which allowed for a way to propagate a value by early returning if that value was an error/failure value. It was in a similar vein to Rust's try! which became ?, or Zig's try, etc. I have now removed it from Odin. But why?

The hypothesis was that this idiom was common: check the error value returned from a call and return early if it is set, where the value may be an enum, a (discriminated) union, or any other kind of value that has a notion of failure; and replace that idiom with a single construct 1 .

This construct solves a very specific kind of error handling, one which optimizes for typing code rather than reading code. It also fails because Odin (and Go) are languages with multiple return values rather than single-type returns. And the more I think about it, the explicit check-and-return idiom and similar stuff may be the least worst option, and the best in terms of readability. It's a question of whether you are optimizing for reading or typing, and in Odin, it has usually been reading. And a terse shorthand instead of an explicit check does reduce typing but is a lot harder to catch (even with syntax highlighting). It happens that Go already declined such a proposal for numerous reasons. And the research done for this is directly applicable to Odin because both languages share the multiple return value semantics.

The research has been fruitful however. I did experiment with a construct which has now become a built-in procedure which can be used on things with an optional-ok check, e.g. map indices, type assertions.

Some people may be a little surprised with my experimentation with this exception-like shorthand with error values. Especially since I wrote an article (which was originally two github comments) titled: Exceptions — And Why Odin Will Never Have Them. One thing I did not comment on in that article is the cause of the problem (other than the cultural issues). My hypothesis is that if you have a degenerate type (type erasure or automatic inference), then if a value can convert to it implicitly (easily), people will (ab)use it. So in languages with exceptions, all exception values can degenerate to the "base type". In Rust, it can either go to the base trait or be inferred parametrically. In Zig it can either use the global error set or it will infer the error set from usage. Go has the built-in error interface type which acts as the common degenerate value.

As I discuss in the article, I am not against error value propagation within a library, but I am pretty much always against it across library boundaries. A degenerate state has high entropy and a lack of specific information. And due to this form of type erasure, "downcasting" (broad use of term) is a way to recover the information, but it assumes implicit information which is not known in the type system itself. The other issue when people pass the error up the stack for someone else to handle (something I criticize in the previous article already) is that it's common to see this in many codebases that have such a type: Go, Rust, and Zig (public) codebases exhibit this a lot. And my hypothesis for this phenomenon is the very nature of this "degenerative type".

Now a design judgement is to be made when designing a language: is such a concept worth it for the problems it intrinsically has? For Odin, I do not think it was worth it. In Odin, errors are just values, and not something special. For other languages? That's another thing.
For Odin, I have found that having an error value type defined per package is absolutely fine (and ergonomic too), and it minimizes, but cannot remove, the problem of value propagation across library boundaries. The construct was a bad idea for Odin considering the rest of its semantics (multiple return values, lack of an error value type at the semantics level, optimizing for typing rather than reading). Part of it has now become a built-in which is useful.

n.b. I am not criticizing any particular language's design for doing this, but rather saying that it does not work well for Odin's semantics nor philosophy. Part 2 of this Experiment

The concept worked by popping off the end value in a multiple-valued expression and checking whether it was a failure value, and if so, setting the end return value to that value if possible. If the procedure only had one return value, it did a simple return. If the procedure had multiple return values, the construct required that they were all named so that the end value could be assigned to by name and then an empty return could be called.  ↩︎

Ginger Bill 4 years ago

Untyped Types

When I was designing the constant value system in Odin, I wanted literals (especially numbers) to "just work". I was inspired by how both Ada 1 and Go 2 handled their constant value systems. But this led me to a realization that there are two general models of thought when it comes to values in programming languages:

Model-1: Expressions have a type, not all expressions may have a value. Therefore all values must have a type.
Model-2: Expressions may have a value, not all expressions have a type. Therefore some values may not have a (concrete) type.

Model-1 is the more traditional approach in most languages, especially C-like languages. This has the consequence that all literals must have a concrete type associated with them. Using C as an example, every number literal has a type, and to change to a specific type requires a suffix. In C, there is a set of rules called the "usual arithmetic conversions" which are a form of implicit type promotion. Due to this concept, most people who do not realize that literals have specific types have had little issue in practice because of these implicit conversions. However, these implicit conversions may lead to many bugs, crashes, and other problems relating to portability when integers of different sizes and signedness are combined 3 .

Model-2 is quite different and can be very foreign to think about if you are used to Model-1. Model-2 treats values closer to how most people intuitively think about them. The literal 123 just represents the number one hundred and twenty three. It has no intrinsic type to it, it's just a value. Applying this idea to a programming language, the value literal can be represented by a whole range of different types. In Odin, many different declarations will work with the same value literal (see the sketch at the end of this post), and value literals are not just limited to numbers but work for other kinds of values too. No implicit conversions at runtime are performed (unlike C), as each value can be represented without truncation.

A consequence of this model is that constants can have no (concrete) type to them. In order to get this to "feel correct" and make it "just work" leads to a complication in the design of the compiler: it requires a big number implementation, since numbers don't have any "size"/"width" to them. As a consequence, a constant expression that exceeds the width of its assigned type is an error in Odin, whilst the equivalent may be perfectly valid in C because the literal has the width of its C type. In order to achieve the same thing in Odin, a type must be explicitly assigned.

Odin supports type inference, which leads to an interesting question: is declaring a variable directly from an untyped literal a valid statement? There are two solutions to this problem: either make this invalid since the literal has no type, or give it an untyped type. Untyped types sound like an oxymoron but it is a way to give a default type to a typeless value. The literal 123 can be assigned the "untyped type" of "untyped integer". Each "untyped type" can be assigned a default type to which, if the value needs to be made concrete at runtime, it will default (if possible).

In Odin, comparison operations will result in the type of the expression being an "untyped boolean". There are two reasons behind this behaviour:

Allows any comparison to assign to any boolean type
Allows the backend to choose the more efficient sized operation if needed, rather than requiring a byte-sized operation.

For Model-1, the consequence of every expression having a type requires the idea of giving a type to something with no value: void. For Model-2, the consequence of some expressions not having a type does not lead to a concept of void in the type system. I have noticed this confusion when people ask what the equivalent of void* is in Odin, which is rawptr (a separate specialized pointer type). Another consequence is that it allows for the ability for expressions to have multiple values (not tuples) associated with them, which is how Odin's multiple return values in procedures work.

A common question after learning about each model is to ask "what are the advantages of each model?".
Personally, I think this question is actually nonsensical, since you cannot compare the models without context due to each model having different foundational axioms. The advantages completely depend on what you are trying to aim for; the models cannot be compared out of context.

In Odin and Go, literals are "untyped" and there are (virtually) no implicit type conversions. The advantage of the untyped literals in this case is that they work for any type that can represent that value (which itself has no type). This also complements the distinct type systems of both languages. In languages with implicit type promotions, literals being typed is less of a (hypothetical) problem. If the rules can be defined without many of the issues from C, then typed literals have few disadvantages to them. Applying typed literals to a distinct type system will cause some problems, since either explicit casting may be required or the point of distinct typing is defeated because implicit casting will be applied.

https://www.adaic.org/resources/add_content/docs/95style/html/sec_3/3-2-6.html   ↩︎
I highly recommend reading the article regarding how Go implements its constant value system https://blog.golang.org/constants   ↩︎
Most modern C compilers can catch these bugs but that is a patch over a design flaw rather than a solution to the underlying problem.  ↩︎
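The following minimal Odin sketch (my own, not the article's original examples) shows the untyped-literal behaviour described above:

```odin
package untyped_example

main :: proc() {
	X :: 123   // a constant with no concrete type (an "untyped integer")

	a: i32 = X // the same literal value representable as i32
	b: f64 = X // ... as f64
	c: u8  = X // ... as u8
	d := X     // type inference: X falls back to the untyped integer's
	           // default type

	_ = a; _ = b; _ = c; _ = d
}
```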

Ginger Bill 4 years ago

Structured Control Flow (Brain Dump)

Note: This is a "brain dump" article, and subject to be cleaned up. Examples in Odin.

Procedure call
Terminating
Conditional
Looping - loop with initial statement, condition, post statement, and body; loop with a value to be iterated over; loop with condition then body; loop with body then condition
Branching - go to end outside of the control statement; skip to the end of a loop; merge two switch case bodies, to have multiple entry points to the merged body; labels on other control flow statements
Deferred
Structured Exception Handling (not specifically Microsoft's SEH)
Default (named) return values

Procedure call - x86: Call instruction
Terminating - x86: Return instruction
Comparison - x86: Comparison instructions
Jump - x86: Unconditional jump instructions; x86: Conditional jump instructions
Indirect Jump (goto by value) - GNU C: Labels as values (jump tables)
Stack Unwinding (exceptions)
longjmp/setjmp
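As a rough illustrative sketch of my own (not part of the original brain dump), several of the structured constructs listed above look like this in Odin:

```odin
package control_flow_sketch

import "core:fmt"

classify :: proc(n: int) -> (label: string) { // default (named) return value
	defer fmt.println("classified", n)        // deferred statement

	switch {
	case n < 0:
		label = "negative"
	case n == 0:
		label = "zero"
		fallthrough                           // merge with the next case body
	case:
		if label == "" {
			label = "positive"
		}
	}
	return
}

main :: proc() {
	outer: for i in 0..<3 {                   // labelled loop
		for j in 0..<3 {
			if i*j > 2 {
				break outer                   // branch out of the outer loop
			}
			if j == 1 {
				continue                      // skip to the end of this loop
			}
			fmt.println(i, j)
		}
	}
	fmt.println(classify(3))
}
```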

Ginger Bill 4 years ago

The Essence of Programming

One thing I have noticed a lot when a programmer is struggling to solve a problem, especially a novice, is that he is stuck worrying about the "best way" to implement the solution rather than actually understanding the problem he has. I believe a lot of this stems from not understanding the essence of what programming fundamentally is. In a previous article of mine, I state that "Programming is a tool to solve problems that you have in the domain of computers". At the essence of everything to do with programming, it is using and building tools with computers. My honest belief is that studying the concept of a tool itself and its essentially ordered aspects 1 will aid us in correctly structuring our thinking about how we build tools in general.

The fundamental aspects of a tool are:

Its Purpose
Its Function
Its Usage
Its Implementation/Form

Each following aspect/stage is a fulfilment of the previous, which means no stage can be skipped.

A tool by its very nature is a means to achieve a particular end. A good tool is one that fulfils its purpose well, the reason it was made. We use and build tools for purposes; there is a why behind the tool.

The function of a tool is the task that a particular means is assigned to accomplish. Using an example, the function of a hammer may be described in that it is used to drive nails, but in general, it hits other lesser tools (a nail being another kind of tool). The specific function of the tool is determined by the specific problem. Note: To clarify the difference, the purpose is the why behind the tool and the function is the what behind the tool.

A tool is used in a particular way by the user. The usage is the fulfilment of its function. Most hammers will have a place for the user to hold, its handle, and a part which can be used to hit things, its head/face. The usage of a tool is restricted by what function that tool has.

Finally, how that tool is to be implemented is restricted and dictated by how the tool is meant to be used. If all of the previous aspects have been fulfilled, then the implementation is most of the time not complicated (especially in programming). A hammer used for bricklaying may be implemented with a wooden handle, and a flat smooth steel head. Note: Aesthetics fits under this category of implementation or form. The implementation and how it appears can be separated, but rarely is in practice. How someone implements a tool is part of the aesthetic of that specific tool. Aesthetics is usually something that is evolved and discovered through tradition and not from pure reason.

For most people reading this article, most of this will seem dead obvious, but they may not have seen it in this specific articulation. So why do I bring it up? Because forgetting (or not knowing of) this structure of the reality of what you are actually doing is the pitfall many novices, and even veterans, fall into when programming. Getting caught up in what is the best way to implement a program, or any problem for that matter, distracts the programmer from the problem. For the vast majority of problems, especially the problems novices have, if you understand the purpose, function, and usage of the tool you are creating, then its implementation will be "trivial" in the sense that the previous aspects restrict how you approach the problem.

When teaching novices, I have found that there are usually two different kinds of people (sometimes embodied in the same person):

Novices who are stuck worrying about how to best implement their programs, but do not actually understand the purpose of the program they are trying to write
Novices who understand the purpose, function, and usage of their program, but do not know the tools well enough to know how to implement it

Interestingly, the former is much more common than the latter.
In the latter case, these individuals are natural problem solvers and only require teaching of the explicit thinking that goes on and the tools they require for producing their program, e.g. algorithms and data structures. In this case, practising is required to hone the skills of the novice. In the former case, it does not matter if all the tools can be taught if the individual does not understand how to use them, what their function is, nor what the purpose of the problem he is trying to solve is. This kind of novice needs to learn how to think like a problem solver.

Understanding this process of structuring your thinking about what a tool is allows you to apply it further, and it can be applied to every aspect itself recursively. Each stage can be split up further into its own four-part structure. For example, there usually are many different ways to implement something to achieve the same ends, which means understanding which one is the best compromise for the current situation. Each stage can also give feedback as to whether it is possible to fulfil the previous stage, i.e. trying to implement a particular usage may unearth the impracticability or impossibility of the usage. This does not mean the process has been reversed, but rather shows that the previous aspects were not stable foundations to work upon. Understanding how something is implemented can also aid in understanding why something is used a particular way.

Sometimes the implementation of something is difficult to determine, as hard problems are hard. Sometimes you cannot make a problem less complicated, but you can try to make it less complex by breaking it down into smaller problems. This is a skill that is continually learnt over many years, and not something you will ever perfect but will get better at.

There is no such thing as an unqualified universal "best way" to solve a given problem in the abstract. What is needed is to understand the particular situations that require one to attend to such problems. Compromise is a given, and the only way to get better is to practice, practice, practice. Abstract solutions are called algorithms, but how that algorithm is implemented depends on what it will be implemented on—the machine.

For novices, it should be made clear that being taught tools (e.g. algorithms, data structures, paradigms, idioms, etc) does not imply that the "correct way" to do things has been learnt. The "correct way" is heavily specific to the problem domain; this cannot be taught but only learnt through experience over many years. Actualizing abstract ideas into reality is difficult to do, even for programming veterans. Aiming to abstract/generalize a problem is a recipe for disaster, especially for novices, because it is rarely ever the case that problems are abstract/general. Most problems are particular/specific, and require knowledge of the domain of that problem. Application should not be separated from the teaching of the topic. Try to solve the specific problem that is actually at hand, not a general problem that might exist 2 . Programming is an inherently practical endeavour rather than a theoretical one, and its teaching should reflect that. Theory is important, but most programmers do not want to, nor should they have to, become mathematicians in order to understand the application of anything.

Note: For novices, a good rule of thumb when implementing a program is to aim for clarity, especially for others to read and comprehend, and do not try to be clever.
Novices ought not to be afraid to rewrite code if needed. Getting a good (not perfect) solution the first time is not expected of a novice, and through experience, a good solution will come quicker. Through the habit of practice, one will become better at the art of programming.

It should be clearly noted that this ordering only works for solving known problems; it is a way of thinking about purpose-driven domains, i.e. engineering. This process of structuring your thinking does not apply to exploratory things such as research and science. Science is not a purpose-driven domain but an exploration-driven domain. There is not a “goal” to the art of science, but a continuous process of discovery. As a discipline, programming is fundamentally closer to something like carpentry than to a science.

Programming is a craft of solving problems on computers. Understanding the essence of what a tool is helps us structure our thinking when solving problems. It gives us a language and process to understand why and how we approach building tools. Many novices fall into the same trap of not understanding the problem that is trying to be solved, and are too concerned about the implementation. Teaching novices how to think about solving problems is extremely important in improving and honing the craft.

This is a Peripatetic / Aristotelian schema.  ↩︎
https://www.gingerbill.org/article/2020/05/31/progamming-pragmatist-proverbs/   ↩︎

Ginger Bill 5 years ago

The Fatal Flaw of Ownership Semantics

I have been toying with a theoretical idea for the past 18 months, off-and-on, in my head, and I have not fully articulated it aloud yet. It is regarding the concept of Ownership Semantics (OS) or Move Semantics in programming languages. Fundamentally, this article is a criticism of the concept and argues that it is the dual of traditional OOP, applied to a different area.

A general list of definitions of the terminology used within this article, in order to minimize confusion:

A Value is a datum with an associated type
A (Data) Type is an attribute of a value which encodes information about how the data value can be operated upon
An Object is a value with associated behaviour, and thus implies it has agency
A Class is the data type of an Object
A hierarchy of value ownership is a hierarchy of responsibility of values
An Owned-Value is a value which belongs to a hierarchy of value ownership, which implies it is governed by an agent
An Agent is an actor with the capacity to act within a given environment
A Model of Interpretation is a way to view and analyse a subject
A Paradigm is a way of classifying models of structure of programming languages; a Paradigm is a model of interpretation
Object Orient(at)ed Programming (OOP) - a paradigm of structuring a program around the sole concept of Objects, commonly through coupling data and code into a single unit
Ownership/Move Semantics (OS) - orientation around the responsibility of values in a hierarchical fashion

Though the original conception of the term coined by Alan Kay 1 was never used as he intended it to be, the term Object Orient(at)ed Programming (OOP) has been commonly understood to be a paradigm of structuring a program around the concept of Objects, commonly through coupling data and code into a single unit. Many languages support multiple paradigms, including aspects of the OOP paradigm, but I would class those as multiparadigm rather than solely OOP languages. Most languages implement Objects and Classes in the Simula tradition; most of the notable OOP languages have a similar form, defining methods (member functions) within the class definition. Traditionally, languages such as Java can be classed as solely OOP languages.

Most traditional OOP languages are based around the concept of inheritance, a mechanism for deriving a class data type from another class data type and retaining similar information. Most people generally view inheritance as a combination of subtyping and dynamic dispatch through virtual method tables (vtables). This has led to many discussions asking whether a language can be called OOP if it does not support inheritance 2 . In recent times, inheritance has been falling out of fashion in favour of composition 3 . This is mostly due to the issue of conforming a class to a strict (singular) hierarchy of agency when, in reality, things can belong to many (if not infinite) categories and hierarchies, as well as another aspect which I will be discussing throughout this article.

There are many criticisms of OOP 4 5 6 7 8 9 but my general criticism is that by placing emphasis on trying to solve problems in the type system, it shifts focus from the data structures and algorithms, the core of what a program fundamentally is. Since objects themselves are being treated as if they have behaviour (not just type properties), they are effectively being treated as if they were agents in the program. This mental model has many conclusions, many of which cause issues.
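To make concrete what coupling data and code into a single unit via methods looks like, compared with a plain procedure operating on a data record, here is a minimal sketch (hypothetical names, written in Rust for brevity rather than a class-based language such as Java or C++):

```rust
struct Player {
    health: i32,
}

impl Player {
    // Method form: the behaviour is attached to the data type itself,
    // and the call site reads as subject.verb(object).
    fn damage(&mut self, amount: i32) {
        self.health -= amount;
    }
}

// Free procedure form: the same operation, with the data record passed
// explicitly, reading as verb(subject, object).
fn damage(player: &mut Player, amount: i32) {
    player.health -= amount;
}

fn main() {
    let mut p = Player { health: 100 };
    p.damage(10);       // method call
    damage(&mut p, 15); // free procedure call
    println!("health = {}", p.health); // health = 75
}
```

Nothing about the underlying data differs between the two forms; the difference is only in how the procedure is organized and how the call site reads, which is exactly the point discussed next.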
In my article Pragmatism in Programming Proverbs, I state:

Object orientated programming is a form of misinterpreted and misapplied Aristotelian Metaphysics applied to a domain it was never meant to model

What I mean by this statement is that artificially conforming any/all relationships between data and types to an artificial hierarchy of agency is a form of naïve-Aristotelian metaphysics. Since there is no actual agency in the programming objects, it is a partial fallacy. When trying to force a program into a particular structure that it does not naturally have, remember that the absence of structure in a program is more useful than a bad structure.

The concept of adding methods to classes/objects has proven useful to many. The real questions are: what are methods actually useful for, and how do people actually conceptualize methods on a day-to-day basis? For most people, I am going to bet that methods, in languages with an emphasis on inheritance rather than composition (such as C++ or Java), are treated as a way of categorizing and associating functions/procedures with a data record. There are a few reasons for this approach:

Easy to organize and search for procedures by a data type
Allowing methods as a form of syntactic sugar for writing calls in a subject-verb-object manner, e.g. subject.verb(object) rather than verb(subject, object)
Mental model of behaviour for objects

From experience, I have found that long time users of “OOP” languages eventually start treating methods primarily in the first two approaches. I will not go into depth about the other main aspects of OOP, such as encapsulation, local retention, and forms of polymorphism, as the hierarchical nature is the fundamental aspect of focus for this article.

The (linear) hierarchy of agency is the main problem. The reason why people argue for composition over inheritance is that it flattens this linear hierarchy, reducing its effect. It is the transition from nominal typing to structural typing, which is more flexible because many data structures and problems have a non-linear nature to them, which linear approaches cannot handle. Trying to adhere to strict hierarchical type system approaches leads to numerous issues because data is more commonly graph-like (non-linear) than tree-like (linear) for most problems. This strict hierarchy does occur with encapsulation at the object level too, as a strict hierarchy of messages/references; this hierarchical nature arises from the concept of agency itself, so inheritance is not the root cause.

n.b. Inheritance is not all bad and does have many real life practical uses, but these costs must be known before using it, like with any tool.

n.b. The linearity is with regards to the data structures themselves and not the algorithms.

C++11 introduced the concept of move semantics or ownership semantics (OS), a way to minimize the copying of data through copy constructors. It utilizes the added concept of r-value references (`&&`) as a means to do this. However, the concept began to be used for a lot more than its basic purpose. The concept adds the high level abstraction of “moving” objects rather than “copying” objects. Physically, a computer only ever copies, and this high level abstraction, treating objects as if they were “real objects”, is not what actually happens. It is also a category error to treat them as “real objects”, since “real objects” and “programming objects” have little connection with each other ontologically.

When a value or object is “moved”, it means that the responsibilities of the resources of that object have been transferred to another object or environment, i.e. to other agents. In this case, ownership/move semantics is fundamentally based around the responsibility of values, by tracking value usage. In this model of agency, the arena of agency can take on many forms, such as blocks, procedure bodies, or aggregate values.
Therefore some owned-values also own other values, and thus a value can itself have agency. If we were to call Ownership Semantics a paradigm, it would be the orientation around the responsibility of values in a hierarchical fashion, placing emphasis on this system of responsibility and shifting focus from data structures and algorithms. The concept of responsibility and ownership is similar to its real-world counterparts in that to own something means to have exclusive use of and full responsibility over it.

Rust is a multi-paradigm programming language but at its core is an Ownership-Orientated language. Everything in Rust has a concept of “ownership” and a lifetime associated with it. Rust is designed around trying to be, first and foremost, “safe”, especially with regards to concurrency. Rust derives from the C++ family in terms of philosophy and style, but uses a more qualifier-focused declaration syntax and many concepts from functional languages of the ML family. Lifetimes are theoretically orthogonal to ownership, but in practice they are usually intrinsically coupled. I will not discuss the problems with object-based lifetimes in this article. A small piece of Rust code can be used to demonstrate this responsibility transfer between different capturing constructs such as statements; a sketch is given at the end of this section.

Rust is an immutable-by-default language, with the option to opt into mutability with `mut`. Immutability helps a lot with mathematical proofs for logic since things can be “flattened” quite easily; however, virtually all computers are fundamentally mutable things, even if the abstraction of immutability is a useful tool. As a result, the ownership semantics system requires a few more rules to take mutability into account, by adding the concept of “borrowing” through references. The general rules for the borrow checker are:

Each value may have as many immutable borrows as you want
Each value may only have one mutable borrow at a time
Each value may not be borrowed immutably and mutably at the same time
Values will be “dropped” when their owner goes out of scope
Taking a value by value moves it, invalidating the original binding

When using Rust (or move semantics to their full extent in C++11), most people will fight the borrow checker regularly 10 (especially newcomers to the language or people swapping between different languages). Many people have found approaches to reduce these issues:

Keep blocks small, structs small, etc.; this reduces the size of the arena of agency and thus reduces the amount of responsibility it must take care of
Minimize self-references in structs, i.e. graph-like data structures are difficult to implement using references
Accept that reference-counted and interior-mutability wrapper types are more common than many would like
Use indices/handles instead of references

Essentially, all of these approaches are bypassing the borrow checker in some form (if not entirely), especially the use of indices/handles. The first three approaches are to flatten the (linear) responsibility hierarchy.

n.b. Ownership semantics do have many practical use cases, and can be used to prove the safety of numerous problems, especially reducing vulnerabilities in programs. This is the main reason why Rust was developed at Mozilla. Web browsers need to be very safe programs, dealing with sandboxing, networking, data races, and other concurrency issues. Being able to prove certain things at compile time is a very useful thing when it comes to the safety and robustness of a program.

However, as I have stated, because of the linear nature of OS, it cannot solve a whole range of other problems without resorting to `unsafe` or some other way of bypassing the borrow checker entirely. Ownership semantics are a form of an affine substructural type system 11 12 , which means they are fundamentally described by a linear logic, and this explains why they struggle to express non-linear problems. Therefore ownership semantics and the borrow checker are fundamentally a linear tree (hierarchy) and not a non-linear graph, as described by their underlying formal logic. Many data structures and problems in real life are fundamentally non-linear, which linear approaches cannot handle.
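The Rust snippet referred to above is not reproduced here; the following is a minimal sketch of the same ideas, with hypothetical names, showing responsibility for a value being transferred between bindings and into a function call, plus the borrow rules listed above:

```rust
// A non-Copy resource whose responsibility ("ownership") gets transferred.
struct Thing {
    data: Vec<u8>,
}

// Taking `Thing` by value moves it: this scope becomes responsible for it
// and will drop it when the scope ends.
fn consume(t: Thing) -> usize {
    t.data.len()
}

fn main() {
    let a = Thing { data: vec![1, 2, 3] };
    let b = a; // responsibility moves from `a` to `b`
    // println!("{}", a.data.len()); // error: `a` has been moved

    let n = consume(b); // responsibility moves into `consume`
    // println!("{}", b.data.len()); // error: `b` has been moved
    println!("len = {}", n);

    // Borrowing: any number of immutable borrows, or exactly one mutable one.
    let mut c = Thing { data: vec![4, 5, 6] };
    let r1 = &c;
    let r2 = &c; // fine: two immutable borrows at once
    println!("{} {}", r1.data.len(), r2.data.len());

    let m = &mut c; // fine here: `r1`/`r2` are no longer used
    m.data.push(7);
    println!("{}", m.data.len());
}
```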
In C++11, with the introduction of move semantics, the STL includes the concept of “smart pointers”, each with a different substructural logic: for example, a linear (1-use) self-owning pointer (1u1o), a normal (N-use) self-owning pointer (Nu1o), a normal (N-use) non-owned pointer (Nu0o), and an affine (0/1-use) non-owned pointer (?u0o). If you would like to learn more about the fundamental logic of the ownership semantics applied to the Rust language, I recommend reading this paper explaining the logic using formal mathematics: Oxide: The Essence of Rust (arXiv:1903.00982).

As I have described above, both OOP and OS share similarities: in the OOP case, the value/object is the agent; in the OS case, the agent is whatever has responsibility for the value (e.g. another object, function, block, etc.). Both have linear value hierarchies which are quite strict and singular. Both are singular in nature in that they deal with singular forms of values rather than groups of values. They are both (traditionally) very hierarchical, and place emphasis on the system as a way to control the processes rather than on the algorithms directing the processes.

Objects and Owned-Values are fundamentally “nouns”, but programs are “verbs”. Dealing with singular values can be very useful, but not everything is a value. Some things are fundamentally “non-values”, e.g. instructions/control-flow/declarations. It is a similar holistic world-view to OOP, where everything must be X (or produce X). Ownership Semantics are separate from Lifetime Semantics, but they are both required to be useful in more complex problems, and are usually coupled, naturally, because of the singular-value-based nature.

Articles such as Chromium Security Bugs, Microsoft: 70 percent of all security bugs are memory safety issues, and Introduction to Memory Unsafety for VPs of Engineering are frequently cited in this area. From these articles, many others have argued that languages like Rust would solve many of these problems, such as use-after-free. However, this may not necessarily be true. It is correct that ownership semantics would solve some of the problems that cause things like use-after-free, but that does not mean it will solve most of them. Even if things like use-after-free are security/memory bugs, they are usually a symptom of a larger problem rather than being the root cause themselves.

One thing many people will ask whilst reading this article is “if Ownership Semantics are bad, what do you propose as a replacement?” In general, most hard problems cannot be solved at compile time; because of this, adding more and more concepts to the type system of the language will not help without adding extra costs. This does not mean ownership semantics are bad, but rather that they are not a solution for many problems in that domain.

A lot of the problems related to responsibility are better solved with forms of “subsystems” within programs which handle groups of “things” and give out handles to a “thing” instead of a direct reference. This is related to the approach many people already use to bypass the borrow checker: using indices/handles. Handles can contain a lot more information than a singular number. A common approach is to store a generation number alongside the index in the handle. If a generation has died, but the handle is asked to be used, the subsystem can give out a dummy sentinel value and report an error. See Handles are the better pointers by André Weißflog, and the Bitsquid articles http://bitsquid.blogspot.com/2011/09/managing-decoupling-part-4-id-lookup.html and http://bitsquid.blogspot.com/2014/09/building-data-oriented-entity-system.html for more on this approach.

Other approaches are to reduce the need for responsibility in the first place. Keeping data structures POD, trivially copyable, and with a useful zero value can help you change the way you think about the problem at hand and simplify code (see Making the Zero Value Useful: Default to Zero and Go Proverbs Illustrated). It places more emphasis on the data and algorithms themselves rather than on the relationships between objects and types.
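As a rough sketch of the generation-checked handle idea described above (hypothetical names and layout, not taken from the article or the links), a subsystem hands out an index paired with a generation and validates both on every access, so a stale handle yields nothing instead of a dangling reference:

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
struct Handle {
    index: u32,
    generation: u32,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>, // None means the slot is currently dead/free
}

struct Pool<T> {
    slots: Vec<Slot<T>>,
}

impl<T> Pool<T> {
    fn new() -> Self {
        Pool { slots: Vec::new() }
    }

    fn insert(&mut self, value: T) -> Handle {
        // Reuse a dead slot if one exists, bumping its generation so that
        // any old handles to the slot become invalid.
        for (i, slot) in self.slots.iter_mut().enumerate() {
            if slot.value.is_none() {
                slot.generation += 1;
                slot.value = Some(value);
                return Handle { index: i as u32, generation: slot.generation };
            }
        }
        self.slots.push(Slot { generation: 0, value: Some(value) });
        Handle { index: (self.slots.len() - 1) as u32, generation: 0 }
    }

    fn remove(&mut self, h: Handle) {
        if let Some(slot) = self.slots.get_mut(h.index as usize) {
            if slot.generation == h.generation {
                slot.value = None;
            }
        }
    }

    // A stale handle (wrong generation) simply yields None; the caller can
    // fall back to a sentinel/default value and report an error.
    fn get(&self, h: Handle) -> Option<&T> {
        self.slots
            .get(h.index as usize)
            .filter(|slot| slot.generation == h.generation)
            .and_then(|slot| slot.value.as_ref())
    }
}

fn main() {
    let mut pool: Pool<&str> = Pool::new();
    let h = pool.insert("player");
    assert_eq!(pool.get(h), Some(&"player"));

    pool.remove(h);
    assert_eq!(pool.get(h), None); // stale handle detected via the generation
}
```

The important part is that the subsystem, rather than a compile-time borrow checker, is the agent responsible for the values; a stale handle degrades into a recoverable lookup failure instead of a dangling reference.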
Ownership semantics are a way to handle the responsibility of values in a hierarchical fashion: an orientation around the responsibility of values. This results in a (linear) value hierarchy of responsibility, where agents are responsible for values. The issues of ownership semantics parallel the same structural issues that Traditional-OOP has, which results in a (linear) value hierarchy of behaviour, where the values act as agents. Ownership semantics can be a useful tool for certain problems, but due to their underlying linear logic, they cannot be used to express non-linear problems, which leads people to try to bypass the concept entirely.

“I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning – it took a while to see how to do messaging in a programming language efficiently enough to be useful).” - Alan Kay, 2003 http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/doc_kay_oop_en   ↩︎
The Go programming language does not support inheritance. However, under my definition, Go is an OOP language, but it is designed around (implicit) `interface`s (a form of type-classes or structural typing) as a way to compose objects, and methods can be applied to any user-defined type, not just record types.  ↩︎
https://en.wikipedia.org/wiki/Composition_over_inheritance   ↩︎
https://www.youtube.com/watch?v=QM1iUe6IofM   ↩︎
http://www.stlport.org/resources/StepanovUSA.html   ↩︎
https://groups.google.com/forum/#!topic/comp.os.plan9/VUUznNK2t4Q%5B151-175%5D   ↩︎
https://commandcenter.blogspot.com/2012/06/less-is-exponentially-more.html   ↩︎
http://harmful.cat-v.org/software/OO_programming/   ↩︎
http://harmful.cat-v.org/software/OO_programming/why_oo_sucks   ↩︎
Getting used to bypassing the borrow checker to reduce the fighting implies people have just found a way to cope with the constraints it imposes.  ↩︎
Whilst writing this article, I did not realize that this had already been developed, and I accidentally rediscovered substructural type systems and linear logic, with my own terminology for them. However, it is more appropriate to use the more commonly used terminology.  ↩︎
`XnYo` (X-use Y-owners), `?` (0/1), `2+` (2 or more), `N` (arbitrary uses). Examples: `1u1o` (1-use 1-owner, linear owned-value), `?u0o` (0/1-use 0-owners, affine non-owned-value), `Nu1o` (N-uses 1-owner, normal owned-value), `2+u1o` (2+-uses 1-owner, relevant owned-value)   ↩︎
