Failing to Learn Zig via Advent of Code (forrestthewoods.com)
239 points by forrestthewoods on Jan 17, 2022 | 329 comments



The opinion that reimplementing AoC in two languages was not a good idea resonates a lot with my experience.

Every year, I see people quitting their AoC run in frustration because they don't pace their effort and try to complete each puzzle on the day it opens. The usual scenario is that, at some point, they get blocked on a problem and spend longer on it than they should, or they can't solve it because they have some other thing to do. The next day, they try to solve that puzzle plus the new one, and as the backlog piles up, they start grinding joylessly to catch up. After a while, it stops working, because people are stressed and coding poorly just to finish rather than taking pleasure in their craft. The strict calendar can turn into a set of deadlines reminiscent of work.

For a person not used to competitive coding, an AoC puzzle can take an hour to solve, sometimes more. Over the whole advent, people can easily put in 30 to 40 hours on top of their usual workload. In terms of work hygiene, this is not good. Not everyone has a spare week's worth of hours to spend right before the end of the year.

I would advise people doing AoC to timebox their effort, and to not hesitate to stop a puzzle and set it aside for later in the year.


I definitely fall into this category. I stopped on Day 8 Part 2 this year after getting frustrated and falling behind. Good practice for sure, and I feel more prepared for next year; still, I just wanted to post that I recognize myself in what you mention here!


Zig is a very low level language. I think the fancy type system can trip people up into thinking they are working with a high-level language.

Zig is basically C with a fancy type system, so you should not expect things like special String types, overloading of index-based access, etc.

I think the author was thinking that Zig was very close to Rust or C++, when in reality it is much closer to C. I had to keep reminding myself of that many times as I was learning Zig.

I had my own struggles with Zig, but not quite as much as the author. I think you will probably have a much better experience if you don't try to jump in and code right away, but instead read some articles or watch some videos to get a sense of the overall philosophy of Zig.

I am normally against having to look at source code, but with Zig that is kind of needed, though not quite as bad as it sounds. Zig's code base is not that large and it is relatively easy to search; you can look up a Zig function signature very easily. You will need to do this if you are going to use anything in the standard library beyond the most basic stuff.


> Zig is basically C with a fancy type system, so you should not expect things like special String types,

If I can't have nice strings, what should I be expecting from a fancy type system?


Zig's "fancy" (I don't think they're that fancy) type features that IMO make it a great C alternative are:

- non-null pointers, and distinct types for single-item pointers and multi-item pointers (multi-item pointers are rarely used except indirectly via slices, so unchecked pointer arithmetic errors are largely banished)

- builtin tagged unions (AKA algebraic data types) with very pleasant to use switch logic -- it can't be overstated how nice the "handle all the cases" logic is in Zig in general (catch, orelse, if-else/switch expressions)

- a decent proposition for errors (the error union, and try keyword), but I haven't decided if I really like it yet
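
To make that concrete, here is a minimal sketch of the three together (0.9-era syntax; `Shape` and `firstDigit` are made-up names for illustration):

    const std = @import("std");

    // Tagged union: switch must handle every case.
    const Shape = union(enum) {
        circle: f64, // radius
        square: f64, // side length
    };

    fn area(s: Shape) f64 {
        return switch (s) {
            .circle => |r| std.math.pi * r * r,
            .square => |side| side * side,
        };
    }

    // Optional return: the caller must unwrap with orelse/if.
    fn firstDigit(text: []const u8) ?usize {
        var i: usize = 0;
        while (i < text.len) : (i += 1) {
            if (std.ascii.isDigit(text[i])) return i;
        }
        return null;
    }

    // `!void` is an error union: some inferred error set, or nothing.
    pub fn main() !void {
        const idx = firstDigit("abc42") orelse return error.NoDigit;
        std.debug.print("area={d}, digit at {d}\n", .{ area(Shape{ .circle = 2.0 }), idx });
    }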

There's lots of non-type stuff there too. I was writing personal projects in C without libc, but found there to be a lot of annoying work involved -- happy to do it, but it's not earned/fruitful annoyance, more like a long list of incidental historical annoyances. Zig seems to cater to the same level of the stack, but with all that boring stuff taken care of.


Many of Zig's fancy type features are in the library, not in the language, because types can be programmatically constructed at compile time.


So it sounds like Zig makes a distinction between pointers and arrays. Am I reading that right?


Yes, arrays are another distinction:

- an array [3]u8

- a single item non-nullable pointer *u8

- a single item nullable pointer ?*u8

- a multi-item non-nullable pointer [*]u8

- a multi-item nullable pointer ?[*]u8

- a slice []u8

Typically your API is just made up of slices and non-nullable single item pointers. Arrays are just the typical backing store for a slice, that you might define in main or for small scratch buffers. Here's a typical example of an OS read (where the fd has been set to non-blocking already):

    fn readUpTo32Bytes(fd: std.os.fd_t) ![]u8 {
        var array: [32]u8 = undefined;
        var data = array[0..]; // slice of entire array
        data.len = std.os.read(fd, data) catch |err| switch (err) {
            error.WouldBlock => 0,
            else => return err,
        };
        handle(data);
    }

    fn handle(data: []const u8) void { ... }
If you don't need to handle EAGAIN/error.WouldBlock, then just do `data.len = try std.os.read(fd, data);`

I didn't explain the `!` in the `![]u8` return type, but it's basically saying "an error or a []u8", where it's compile time known what the full set of errors is (in this case every error std.os.read can return, minus WouldBlock).


Replying to note I mistakenly left in the `![]u8` return type when it's not returning any byte slice. Should be `!void`.


Also Zig's memory alignment in the type system is great for doing low-level I/O.

Coming from C, I find that Zig's memory alignment options are easier and more powerful.
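
For example (a small sketch; the point is that alignment annotations live in the pointer type, so mismatches are caught by the compiler):

    const std = @import("std");

    test "alignment is part of the pointer type" {
        var page_buf: [4096]u8 align(4096) = undefined;
        const p: *align(4096) [4096]u8 = &page_buf;
        p[0] = 1;
        try std.testing.expect(p[0] == 1);
        // Passing a less-aligned pointer where align(4096) is required
        // is a compile error; @alignCast performs a checked conversion
        // in the other direction.
    }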


Totally agree. Having std.os.mmap enforce correct page_size alignment for the base pointer has saved me once already (using mmap and memfd for a fixed size circular buffer where you can always provide contiguous bytes for the full size available, without any memmove).


I still can't figure out how Zig proposes to prevent undefined behavior without a borrow checker (aka MLKit regions) or GC.

AFAICT the answer is "inject as many runtime checks as needed", although the docs seem to go way out of their way to avoid making this explicit, or to avoid dealing with the fact that these checks are now runtime failures rather than compile-time failures, and therefore need code to handle them.

It seems like it would be the same as writing Rust code using std::cell instead of references, except that Rust would force you to insert handlers for all the new failure modes this would create (of course you could just panic!(), but at least the compiler would force you to insert those panics...).


You're conflating undefined behavior with spatial/temporal memory safety, the latter of which is what is 100% prevented by Rust's borrow checker, provided you're not interfacing with hardware, in which case I believe things change in a more binary fashion.

However, Zig treats memory safety not as an extreme-at-all-costs but as a spectrum (there are reasons for wanting to think like this, at least when writing low-level code, if you want to make more use of the hardware), getting 100% on the spatial memory safety front and reaching 50-75% on the temporal memory safety front through the GPA (GeneralPurposeAllocator). That's already an order of magnitude more safety than C, at which point memory exploits have dropped in ranking, and you should be more concerned about things like explicit control flow, error handling and checked arithmetic, not to mention the orthogonality of the language.

Furthermore, in the systems world, there are many safety critical systems where dynamic allocation and multi-threaded control planes are simply off the table to begin with because they're dangerous in some domains and not as safe as static allocation and single-threaded control planes, which are less dimensional and easier to reason about. And in those cases, UAFs and multi-threaded races are less of a concern (still a concern, but less).

Also, Rust won't protect you from all undefined behavior, and Zig often helps more than you think. For example, you might be surprised to hear that Rust has checked arithmetic off by default in release builds, whereas Zig has it enabled (in Debug and ReleaseSafe modes). I've done a little security work on some large systems and the decision to disable checked arithmetic always blows my mind. Integer overflow and underflow are right up there as threat vectors when writing anything that touches hostile data.

I'm waiting for the day when Rust changes direction on this, and I think there's a chance this will happen because the alternative status quo of not checking arithmetic (at all) is just not tenable, at least not if we care about safety and security holistically, and not only memory safety.


If dealing with potentially hostile data, Zig certainly isn't more appropriate than Rust in my opinion, try maybe WUFFS.

Suppose we have been given a 32KB data structure with some "step" bytes - in a conforming input these should always sum to less than 32768 and thus the total will easily fit in a 16-bit unsigned integer, so that's what our naive program does. Unfortunately attackers provided a structure whose step bytes sum to more than 65535...

Zig will panic here if using default arithmetic with default release builds. If the attacker wanted to cause a Denial of Service, job done already.
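
For reference, a sketch of the options Zig gives you for the scenario above (`+` panics in Debug/ReleaseSafe, `+%` is defined wraparound, and std.math.add reports overflow as an error):

    const std = @import("std");

    // Summing untrusted "step" bytes, with overflow as a recoverable error.
    fn sumSteps(steps: []const u8) !u16 {
        var total: u16 = 0;
        for (steps) |s| {
            // `total += s` would panic on overflow in safe build modes;
            // `total +%= s` would silently wrap instead.
            total = try std.math.add(u16, total, s);
        }
        return total;
    }

    test "hostile input is rejected, not wrapped" {
        const hostile = [_]u8{255} ** 300; // sums to 76500 > 65535
        try std.testing.expectError(error.Overflow, sumSteps(&hostile));
    }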

Rust will panic if explicitly told to enable checked arithmetic in release builds, but it also provides explicit checked, wrapping, saturating and so on variants of the arithmetic operators; if you want them for this part of your software (perhaps anticipating the risk), you can have that without touching the behaviour of all other arithmetic in the program. 65530u16.checked_add(255u16) is None even in a default release build of Rust, and what you do with that None (silently abandon this input? log the error?) is up to you, and of course may not be adequately tested.

However, in WUFFS we simply can't write the erroneous program. It doesn't compile because WUFFS can't see why it's safe. Because it isn't safe. WUFFS requires the programmer to spell out what's going on, and so either you have to realise what might happen ("Oh, it can overflow, I should handle that") or choose a strategy that can't suffer the problem, ("Let's not sum up those steps, I see a different way to handle valid input").


> If dealing with potentially hostile data, Zig certainly isn't more appropriate than Rust in my opinion, try maybe WUFFS.

Thanks! Great recommendation on WUFFS! And completely agreed, it's also easy to turn on checked arithmetic for Rust (if you know about it, but Rust definitely has an unsafe default there for those that don't, which is surprising to me).

At the same time, WUFFS is not always applicable, for example to writing something like a distributed system where you still want safety, often the flip side of security. I'm sure you'll also agree that security is more nuanced than just a rant about memory safety taken to the extreme. It's great to have positive discussions about languages and to evaluate trade-offs honestly.

Counter-intuitively, I do feel also that Zig's explicitness as a language as a whole fits a security mindset well. For example, in `std/mem.zig` there's a very careful divExact assertion around underflow when calling `bytesAsSlice()`. This is just a fantastic way to prevent buffer bleeds, e.g. Heartbleed or Cloudbleed, but it's probably uncommon to see in many libraries, and something like a borrow checker wouldn't provide this aspect of memory safety automatically. You can easily get lulled into a false sense of security.

From a security angle, I also like Zig's philosophy around very simple control flow and avoiding unnecessary abstractions, no matter if they're zero-cost. I think this is going to lead to a healthier package ecosystem when it arrives, compared to say NPM, where you get these dependency explosions that are a real headache for supply chain attacks. Attackers always go one level deeper, they attack through the basement, and there's often more low-hanging fruit at hand than a UAF (especially considering that many embedded systems that Zig targets probably do static allocation anyway, so bleeds might often be the worst that can happen). It will be interesting to see how Zig's philosophy around explicitness and avoiding bloat makes a difference here.

> Zig will panic here if using default arithmetic with default release builds. If the attacker wanted to cause a Denial of Service, job done already.

In the security world, a DoS is usually not treated as a P1. Perhaps a P3 at best (if you're lucky as a researcher!). For example, I've submitted one or two DoS MIME bomb samples that can shut down Gmail servers and got very much an "okay, we'll just not bother about it because we're Gmail and our fleet is so massive". The DoS is probably still out in the wild for Gmail. Even ProtonMail, which has experienced numerous outages, didn't classify it as a P1, although they awarded it.

However, for a read/write exploit (running with the email example, perhaps a directory traversal in Apple Mail), having checked arithmetic convert what could have been a P1 into a P3 is actually exactly what you want because it prevents the exploit from going further (these things are almost always chained).

It also surfaces the bug visibly: you get a crash, you investigate, you fix. So from an attacker's perspective, they're actually less likely to try to trigger it, because doing so reveals that they're in your system.


Something like WUFFS is exactly what we should be using for Wrangling Untrusted File Formats as it says in the name, even if you've decided to do that in a distributed system. Realistically you're definitely going to get this wrong, so using a language where the worst case is that it doesn't work is a massive improvement over using languages where it's all additional attack surface.

That recent Apple bug where they render PNGs incorrectly can (in principle) happen in WUFFS. The other recent Apple bug where bad guys seize control of your iPhone by sending a malicious image file cannot. One of these things is not like the other.

I think you're missing the point if you expect the borrow checker to care about buffer underflow. Rust has runtime bounds checks for that; the borrow checker is, as its name suggests, checking the borrow rules. The trick (compared to arithmetic overflow) is that the optimiser can often push a bounds check outside a fast loop or eliminate it altogether, so you really can afford to do this in all or almost all your release code, unlike checked arithmetic. WUFFS shows that you can do away with both of these runtime checks and be entirely safe if you're not interested in being a general purpose programming language, which is (part of) why WUFFS gets to be both safer and faster. Both Zig and Rust are intended as general purpose languages.

I don't buy the "surfaces the bug" thing because I have too much experience of real world systems where there's so much noise and mayhem that you are focused on stuff that's causing your real users pain. Even if the DoS means the server falls over and must be manually restarted, the ticket in my queue says "Urgent: Auto-restart server. Watchdog maybe?" not "OMG bad guys are trying to break into our system somehow, find out how ASAP"


> Something like WUFFS is exactly what we should be using for Wrangling Untrusted File Formats as it says in the name, even if you've decided to do that in a distributed system.

No, I was saying earlier that there are limits to WUFFS. The example I gave was that you can't write something like a distributed system (think consensus protocol like Viewstamped Replication, Raft or Paxos) in WUFFS, but where safety is nevertheless still critical, and where you reach that through crystal clear control flow and explicitness. In other words, safety is the other side of the coin to security. Hope it's a little more clear now.

> That recent Apple bug where they render PNGs incorrectly can (in principle) happen in WUFFS. The other recent Apple bug where bad guys seize control of your iPhone by sending a malicious image file cannot. One of these things is not like the other.

Of course.

> I think you're missing the point if you expect the borrow checker to care about buffer underflow.

No, I was stating the obvious, that it can't (or at least not always, but in some cases it can), not that it should.

> I don't buy the "surfaces the bug" thing

I was just trying to convey a little bit about how security works and how hackers (or at least red teamers) think, especially when blue teams are involved. I've found that the more I get into this, the less it becomes about preventing the breach and the more about "assume breach; okay, now how do we detect it?". And a software DoS is also really just bottom-of-the-rung; you'll find almost no programs paying out for such findings. You shouldn't worry about them. Asserts are the safe thing to do. They close semantic gaps and make your code much more secure. It's like putting in a thousand trip wires: anything off and an attacker can't get further. It completely shuts down exploit chaining.


> No, I was saying earlier that there are limits to WUFFS

Of course there are limits to WUFFS, that's why it isn't a general purpose language. You shouldn't implement these distributed protocols in it for the same reason toothpaste isn't a good engine lubricant, you deliberately can't even write "Hello, world" in WUFFS.

And yet, if you find yourself, in your distributed system, Wrangling Untrusted File Formats, you should reach for WUFFS to do that safely. Somewhere between "The device has a single button, it's green, press it" and "We process any PDF, HTML or XML documents sent to this email address" you will realise you need all the help you can get to Wrangle the data safely, and that's why WUFFS.


> Of course there are limits to WUFFS, that's why it isn't a general purpose language. You shouldn't implement these distributed protocols in it for the same reason toothpaste isn't a good engine lubricant, you deliberately can't even write "Hello, world" in WUFFS.

LOL, I would never have thought to do that till now! :)

I think we've always been on the same page regarding WUFFS and file format sanitizers. For me the question here really is, how do we improve the status quo when WUFFS is not an option? i.e. What are sane defaults for general purpose programming languages?

I still maintain that checked arithmetic should be enabled by default in general purpose programming languages, and that's because I believe in the principles behind WUFFS, having worked exactly on these kinds of tools myself.


> You're conflating undefined behavior with spatial/temporal memory safety,

You are correct; I should have written "memory safety".

> Zig treats memory safety not as an extreme-at-all-costs but as a spectrum

Zig needs to be more forthright about this.

When I first heard about Zig, I googled "zig vs rust" and found an article on the Ziglang website addressing that very topic:

https://ziglang.org/learn/why_zig_rust_d_cpp/

It completely fails to mention memory safety at all. That seems extremely dishonest, since memory safety is basically the "headline feature" of Rust (well, one of two or three at most). I wasted a lot of time digging through the Zig language manual ("so then how do they...") before concluding that something didn't add up. It definitely left a bad taste in my mouth.

> Rust won't protect you from all undefined behavior ... Rust has checked arithmetic off by default in release builds

That didn't surprise me at all, nor will it surprise anybody who knows Java. Modular arithmetic is perfectly well-defined.

It's only C/C++ that picked the crazysauce option of decreeing that signed overflow is totally equivalent to scribbling all over random pieces of memory. It isn't overflow that's a security risk; it's languages that define overflow to be undefined in order to squeeze out a few piddly loop micro-optimizations. This becomes increasingly less beneficial in languages with iterators and no backward-compatible-with-C burden. Details (scroll to "Myth: overflow is undefined"):

https://huonw.github.io/blog/2016/04/myths-and-legends-about...


> Modular arithmetic is perfectly well-defined.

Yes (and thanks for the link!), I was in fact thinking more of this non-UB case (not signed overflow UB) as an example of something that's clearly defined as wraparound but can nevertheless be chained into an exploit: not technically UB, but a vulnerability all the same. Not all exploits bother to go as far as a UAF. Unchecked arithmetic can be low-hanging fruit.

> That didn't surprise me at all

It surprises me that Rust doesn't just enable checked arithmetic by default with an opt-out for performance, rather than enabling it by default for performance with an opt-out for safety. Zig's choice here is the safer choice from a security perspective.


Checked arithmetic is a much bigger performance hit than most people expect.

It means that every arithmetic operation is potentially a branch/jump instruction. This wrecks a lot of pipelining/out-of-order-execution schemes.

I once worked on an exotic architecture where the integer types had a "NaN" value just like floating point numbers do; it had both modular and checked arithmetic, but the checked versions would return NaN instead of branching.

It also had 37-bit integers. Yes, 37-bit. Fun times.


> Checked arithmetic is a much bigger performance hit than most people expect.

You're right about the branching cost. I believe there's a better way to solve that than disabling checked arithmetic everywhere.

This comes out of something I learned working on TigerBeetle [1], a new distributed database that can process a million financial transactions per second.

We differentiate between the control plane (where we want crystal clear control flow and literally thousands of assertions, see NASA's "The Power of 10: Rules for Developing Safety-Critical Code") and the data plane (where the loops are hot).

There are few places in TigerBeetle where we don't want checked arithmetic enabled. However, where the branch mispredict cost relative to the amount of data being checked is too high, Zig enables us to mark the block scope as ReleaseFast to disable checked arithmetic there.
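
A sketch of what that per-scope opt-out looks like (using @setRuntimeSafety, which is roughly the block-level version of switching the scope to ReleaseFast semantics):

    const std = @import("std");

    fn sumFast(values: []const u64) u64 {
        @setRuntimeSafety(false); // no overflow/bounds checks in this scope only
        var total: u64 = 0;
        for (values) |v| total += v;
        return total;
    }

    test "data plane opts out, everything else stays checked" {
        const xs = [_]u64{ 1, 2, 3 };
        try std.testing.expectEqual(@as(u64, 6), sumFast(&xs));
    }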

> It also had 37-bit integers. Yes, 37-bit. Fun times.

Wow, fun times indeed! We just disabled 32-bit support for TigerBeetle because it was getting too hard to reason about padding. I can't imagine 37-bit, LOL!

[1] https://www.tigerbeetle.com


Zig doesn't propose to prevent UB (in fact the docs say that it takes aggressive advantage of it for optimization).

(Neither does Rust.)


Safe Rust doesn't have any Undefined Behaviour.

Your unsafe Rust is supposed to provide suitable constraints/guarantees such that you, the programmer, conclude it does not have any Undefined Behaviour. The language can't force you to do this, and at some point it becomes a social contract, not a programming language feature.

I wrote the misfortunate crate to explore Rust's promise here. The crate provides legal but obviously inappropriate implementations of lots of safe Rust traits, and sure enough nothing blows up, there is no undefined behaviour.

The defined behaviour can be undesirable for example if you insist on putting a bunch of misfortunate::Maxwells in a HashSet you're going to have a bad time. Rust doesn't promise this is a good idea, it might cause infinite loops, memory leaks, all sorts of defined trouble, but it won't be Undefined Behaviour.


Not "you, the programmer", any programmer. The Rust standard library can and sometimes does have UB, and that's even more true for other libraries.


"This software may have bugs" seems like a very different situation from "Our language just doesn't care whether maybe your program is nonsense and might have Undefined Behaviour".


Nice strawman and goalpost shifting. You said that it's up to "you", the programmer, whether unsafe code misbehaves, but this simply isn't true because "you" don't control all the unsafe code you execute. Again, third party libraries can have UB ... that it's wrapped in "unsafe" keywords doesn't change that, it just means that the compiler offers a tool to help reduce it. What I said is that rust doesn't propose to prevent UB, and that's true. I didn't say that it doesn't care. (And it's not true that Zig doesn't care, either, but it makes a different tradeoff between intrusive limitations and safety.)


Interesting.

I had the impression Zig proponents were praising it because it wasn't as low-level as Rust.


Quite the opposite. The very broad-brush overview is that zig aims to replace C and rust aims to replace C++... but among other issues with that analogy, that can't happen unless zig and rust have a dead-simple integration story.


I've only ever really seen the claim that Rust is closer to C++ in Zig communities.

In Rust communities, it's often pitched as an alternative to both, but closer to C.

I suppose it's all relative, but the comparison of Rust to C++ seems to come from outside.


I think it is a C++ replacement that is _closer_ to C. It has generics, traits, RAII, etc, which are C++, not C, features but the way that Rust implements them _feels_ closer to C. I don't know if this quite makes sense but as an external observer this is my impression.


I don't really understand how Rust could be compared to C. Rust is perhaps more austere than C++, but that's like saying Neptune is smaller than Jupiter. It's still a big language with a generous standard library. You almost never do everything yourself, by hand, which is the hallmark of the C experience. Just compare the list of methods Rust has for a &str with the fact that C doesn't even have a string type, to see the gulf of cultural difference.

Obviously, C and Rust are both low-level languages (in terms of control and overhead), but there are quite a few of those.


Yes, I also had the impression Rust is a C replacement.


From a programming language design space, it's much more like C++.

However, it can occupy some niches that C fits but C++ handles poorly, thanks to zero-cost abstractions, I think.

(Very handwavy)


There are no niches where Rust can do better than C++. In many places Rust can, uniquely, match C++.


De gustibus non est disputandum (there's no disputing taste), and thus at such a high level of abstraction you are at best trolling.


The first person to mention trolling is invariably the troll.


This is of course false.


I assure you there really are places where Rust can match C++.


Yes, but that's the sensible part of your comment, not the part I was obviously disagreeing with. (And no, the troll is not the one who first uses the word.)


Rust's purpose, its whole reason to exist, is to displace C. Rust will unavoidably fail in that, because anybody still using C is not willing to learn anything else: anybody willing to move on from C already did, long ago.

Rust is already approaching C++ in complexity, surpassing it in some places; and also in expressive power, but not surpassing it anywhere yet.

If Rust does not end up fizzling (which is still very possible!), Rust programmers will generally be drawn from the same population as C++ programmers. They will be people who want and can use a powerful language to make themselves more productive and able to manage bigger projects, without need to worry that they are taking a performance penalty, or losing control of details that matter.

Users of Zig, like of Nim and C, will be those uncomfortable with language power, disinclined to automate. Their attention is not on software and what they can build of it, but on problems where a thin veneer of software can add something useful. When there is not much for the software to do, you don't need much power to get it doing that.


Very minor correction - Nim tends to attract people specifically with its language power: Lisp-like metaprogramming facilities, static introspection, etc. These features are expressly there to automate away mundane repetition. I do not think it belongs in a list with Zig & C the way you use it here.

I also think users of C (not sure about Zig) are quite happy to automate things. Linus Torvalds is a big user of C. He wrote a little C-like compiler to check Linux kernel code called Sparse [1]. You seem to be trying to discuss maybe larger (but not very well articulated) subpopulations of "Users" than Apex Programmers like Linus. It is definitely easier to do this with C than giant languages like C++.

Why, the 1980s & 1990s were littered with maybe dozens of hacked C compilers doing "this or that" automation in a way you do not see for C++ (and will probably never see for Rust). In point of fact, C++ itself (C with classes) was an early example of such! The idea was to automate/codify the object-oriented style of Simula in C.

pjmlp's sibling & child comments are also some good color on the history/context of all this. { Of course, partly it all depends on what you meant by "language power" and "automate" - I am just going by what that seemed like. }

[1] https://sparse.docs.kernel.org/en/latest/


"Littered" is the operative word: who today uses any of them? (Maybe lex and yacc.) In the 80s and 90s, viable alternatives to C were thin on the ground, so the population demographic of C coders (which included myself) differed markedly from today. That hardly any are used today tells us about the inclinations of the long tail of remaining hold-out C coders.


Even if the only success story for Rust were mainstream adoption of lifetime checkers, to some extent, across languages, that would already be a victory, as it managed to change the baseline of language design across the industry.

A subject that has now even become a regular presence at C++ conferences, and is considered a must-have on static analyser roadmaps by all major vendors.

Rust might fizzle out in a decade, and still leave such a mark in the industry.


I have not heard of a lifetime checker in any other language, except maybe Midori.


So here is a sample,

Chapel, HPC language mostly sponsored by Intel and HPC

https://chapel-lang.org/

D programming language,

https://dlang.org/blog/2019/07/15/ownership-and-borrowing-in...

Ada/SPARK,

https://docs.adacore.com/spark2014-docs/html/ug/en/source/la...

Swift,

https://github.com/apple/swift/blob/main/docs/OwnershipManif...

ParaSail

http://www.parasail-lang.org/

Project Verona from Microsoft Research

https://www.microsoft.com/en-us/research/project/project-ver...

Project Snowflake from Microsoft Research

https://www.microsoft.com/en-us/research/publication/project...

And finally your favourite C++

"Implementing the C++ Core Guidelines’ Lifetime Safety Profile in Clang"

https://llvm.org/devmtg/2019-04/slides/TechTalk-Horvath-Impl...

Also the "Clang Static Analyzer - A Tryst with Smart Pointers" talk at 2021 LLVM Developers Meeting.

For the Visual C++ part of the story

https://devblogs.microsoft.com/cppblog/lifetime-profile-upda...

And GCC as well, although they are late to the party

https://gcc.gnu.org/wiki/DavidMalcolm/StaticAnalyzer

Finally a couple of CppCon 2021 talks that touch on the subject in various ways,

Type-and-resource safety in modern C++

Code Analysis++

Static Analysis and Program Safety in C++: Making it Real

Finding Bugs Using Path-Sensitive Static Analysis


> Chapel, HPC language mostly sponsored by Intel and HPC

Minor correction: Intel hasn't traditionally been a sponsor of Chapel (though we'd love to see that change). Chapel was pioneered at Cray Inc. and continues on even stronger at HPE after its acquisition of Cray.

-Brad


Thanks for the correction, I thought I had seen them referenced in some talks.

All the best.


At least most of these don't seem to be added language features, but separate "static analyzers" that rely on inference from existing language structures.


Sounds like a lot of "contemporary automation today" to me - automation that would often apply about as well to C code bases as well as horrifically overcomplex C++ codebases.

Also, why even draw a stark contrast between "a language" and "its tooling"? As a dev, you get to use both.

What is even the line..? Almost every compiler for anything provides options. Does gcc -fsanitize=.. not count because it's not "standardized", or only because it's not activated in "typical" deployments, like Rust integer overflow checks?


The line is very, very easy: if it is in the language, you can see it represented syntactically in the source code, and programmers using the language write it there. If not, not.


Clang and Visual C++ static analysis tooling can be represented on the source code via C++ attributes and blessed library types, so...


So... not in the language, but in proprietary, non-portable extensions.


With the exception of C++, they certainly aren't.

C++'s type system is impossible to fix while keeping backwards compatibility, so static analysis tooling is the only possible solution.


I sort of look at it as aiming at the spot where you could write the code in C with a sprinkle of C++ features; whether to add that sprinkle, or to do things like hand-rolling a vtable system where you need one, is a decision that will vary depending on circumstances.

Though I believe the mozilla code it's replaced was all -very- C++.


>> Feature Highlights: Small, simple language

>> There is no hidden control flow, no hidden memory allocations, no preprocessor, and no macros.

https://ziglang.org/learn/overview/

It tries (achieves?) to be a C without the flaws and historic ballast.


Where did you get that impression? No one has ever done that.


Here at HN.

I never read anything about Zig outside of HN.

What I remember was mostly that Zig should be easier to use than Rust, because Zig doesn't have lifetimes.


That doesn't make Zig higher level than Rust. It doesn't have lifetimes because it's low level, on a par with C ... all memory management is manual and you can abuse pointers to your heart's content.


I like zig, but agree with many of the points here. A couple of thoughts below,

> Zig reference documentation badly needs examples. Can't figure out how to use std.fmt.parseInt.

While yes, Zig documentation badly needs examples, I'm not sure this particular criticism is justified. I would have thought that the usage of parseInt was fairly obvious from the type signature:

    parseInt(comptime T: type, buf: []const u8, radix: u8) ParseIntError!T
Or, translated to C++ (and assuming the use of exceptions to return the error):

    template<typename T> T parseInt(const char buf[], uint8_t radix)
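
And a minimal usage sketch:

    const std = @import("std");

    test "parseInt usage" {
        const n = try std.fmt.parseInt(i64, "-42", 10);
        try std.testing.expectEqual(@as(i64, -42), n);
        // Bad input comes back as an error value, not UB:
        try std.testing.expectError(error.InvalidCharacter, std.fmt.parseInt(u8, "4x", 10));
    }
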
> Need to access myArray.items[idx] instead of myArray[idx]. I get it. But very unintuitive and requires knowledge of implementation details.

I'm 50:50 on this one. While it might be nice to hide the implementation here and have a .item(n: usize) member function, ArrayList explicitly manages a contiguous region of memory, so I don't see the problem with exposing it as a slice as part of the interface.

> Should I pass the allocator to every function? Doesn't seem great. Maybe I'm supposed to create a global? Globals are evil and feel bad.

As a rule of thumb, libraries should take allocators as arguments to their functions, while applications can either do that or create a global allocator. There is absolutely nothing wrong with using a global allocator in an application; after all it's what almost every other language does. Zig just makes that global explicit.

Globals aren't always evil.
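
A sketch of the library-style convention (0.9-style, where Allocator is passed by value; the `repeat` function is hypothetical, the point is the explicit allocator parameter):

    const std = @import("std");

    // Library code: the caller decides where memory comes from.
    fn repeat(allocator: std.mem.Allocator, byte: u8, n: usize) ![]u8 {
        const buf = try allocator.alloc(u8, n);
        for (buf) |*b| b.* = byte;
        return buf;
    }

    test "caller supplies the allocator" {
        const buf = try repeat(std.testing.allocator, 'z', 3);
        defer std.testing.allocator.free(buf);
        try std.testing.expectEqualSlices(u8, "zzz", buf);
    }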

> Zig's inability to infer type is annoying. If I create var count and return it and the function return type is usize then var count is obviously a usize.

It's not. It could be any unsigned type of smaller size than usize.


> It's not. It could be any unsigned type of smaller size than usize.

I think it refers to Rust's ability to "indirectly" infer a type by its later usage, which can be somewhere else entirely in the function (for instance when it's used as a return value), while Zig has more direct type inference (but IMHO Rust's approach definitely isn't "obviously better", because sometimes you need to read and understand the entire function to get an idea what type a variable might be).


For the record, I find Rust's ability to change the type of the variable _after_ it was declared to be absolutely bonkers for my ability to understand what is going on in the code.

I understand what it is doing, but the fact that changing the return type of a method will change what previous code will compile to is something that I find extremely non obvious and surprising.

Inferring what the type is from the preceding code, sure, that saves me typing; but inferring from what happens elsewhere looks like time travel.


The trouble is, when you say it's OK to infer "from the current code" that includes your function call, which is the thing where you're surprised it happens.

There's no coherent reason why it'd be OK for Rust to conclude that x is a u64 from

  let x = some_u64.add(6); // the u64 type implements Add, returning a u64
and yet not OK for Rust to conclude that x is a u64 from

  let x = someFunction(); // someFunction was defined to return u64
You had to explicitly change the return type of the method to cause yourself a surprise. Rust won't chase this rabbit through a warren: every individual function has to declare the types of its parameters and returns, so if you'd changed the function body but not its declared return type, it could not have caused any change in type inference in other functions.


That isn't the problem. This is the issue:

   let x = str.parse()?;

   someFunc(x);

The parse implementation called here depends on what `someFunc` will accept. That means I can change which values are accepted here by changing a completely different piece of code.


Only if you think of other code in the same function as 'a completely different piece of code'.

With global type inference, the whole program is inferred for example.


You can always specify the type.

    let foo: T = bar.parse()?;
    // or: let foo = bar.parse::<u32>()?;

    baz(foo);

In my experience I've never found this to be surprising behavior, or unclear.


This is not going to solve your problem everywhere (certainly not when you are reading some github repo), but in the editor, rust analyzer will print the type from the beginning.


Interesting, I never thought about it this way.

I just always find it a PITA that Rust can't do global type inference and that I have to type most of the types for myself.


I understood that, but my comment could have been clearer. I meant that it's not necessarily obvious that the type checker should infer `usize` just because the function returns a `usize`. It could be any type that can be implicitly cast to a `usize`, which on a 64-bit Zig target is any of `u[1-64]`.


> ArrayList explicitly manages a contiguous region of memory, so I don't see the problem with exposing it as a slice as part of the interface.

It used to be that you'd get an allocator by taking a pointer to the allocator field (&gpa.allocator); in 0.9.0 that was changed to a function call (gpa.allocator()), which broke _a lot_ of things.

Exposing all these fields seems like it'll cause challenges when trying to balance backwards compatibility vs breaking changes.
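
Concretely, the change looked like this:

    const std = @import("std");

    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    // 0.8.x:  const allocator = &gpa.allocator;   // pointer to a public field
    // 0.9.0:  const allocator = gpa.allocator();  // method returning an interface value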


In most cases I would agree with you, but I think ArrayList is simple enough that backwards compatibility won't be a major concern. It's literally just a wrapper around slices that handles resizing for you, and there's no real reason for that to change. Exposing the slice underneath is a necessary part of its interface.


To someone very familiar with the idea of slices and Zig, it might be obvious and sensible. But to me, unawareness of how slices work means I thought I could easily access garbage data if array capacity >= list length; and accessing the underlying data structure feels unclean and makes me think of hacky code. While the field of ArrayLists is unlikely to see any innovation, it certainly doesn’t make a good first impression to programmers not exposed to Zig before.


I think part of it comes down to mindset. If you think of ArrayList as an abstract container, then yes, you're totally correct that having the underlying data structure being exposed is a bit jarring and unexpected.

But I don't think that's the right level of abstraction to come from. ArrayList is just a wrapper around a slice that handles allocation and deallocation for you: the underlying data structure is part of the contract of the type, allowing it to be almost a drop in replacement.


> Exposing all these fields seems like it'll cause challenges when trying to balance backwards compatibility vs breaking changes.

That's less important than it sounds because the project is pretty explicit about not caring about backward compatibility till 1.0.


> I would have thought that the usage of parseInt was fairly obvious from the type signature:

Blog author here. In hindsight yes it's obvious.

I think my problem was a lack of understanding of comptime types. They're a little different from C++ / Rust since they're passed as args. Looking back I think I found the Zig docs less "sticky" than other languages. The concepts are familiar but just new enough I don't understand and don't fully remember them.

There's a point while working on AoC that I wasn't sure if Zig had first-class runtime types or not. It has comptime types and it also has a TypeInfo built-in with some degree of reflection? Is that runtime reflection maybe? I think partially? I honestly don't know.

Either way I was confused. :)


As far as I know, typeinfo is used at compile time. In general, doing runtime type shenanigans involves implementing things like vtables, and some other weirdness. This breaks "no hidden control flow".

If you do that at comptime instead, the thing that runs at runtime is just memory offsets. You can for example generate code that parses JSON into 30 types you know at compile time, or you can use things like hashmaps and arraylists and whatnot to do it at runtime.

You get a lot of mileage by just moving the dynamic work away from runtime, and that's one of the benefits of comptime. It's a slow thing to realize, and also not necessarily something everyone cares about.


> It has comptime types and it also has a TypeInfo built-in with some degree of reflection? Is that runtime reflection maybe?

I do wish this was explained more prominently! The way this seems to work is that Zig has a lazy runtime for Zig code at compile time that does some interning (such that identical types obtained separately are equal, identical strings obtained separately are equal, etc.), and some types such as type can only exist in this runtime, not in the actual uh "run time" runtime. TypeInfo is for iterating over struct, union, or enum definitions at compile time to generate code. You'd use it if you wanted to write a generic json parser and serializer or if you wanted to write a type that's like an arraylist but internally uses a struct-of-arrays layout. Both of these are in the standard library and make informative reading.

TypeInfo itself is a union that can exist at runtime, but I don't think you can get into a situation where you have something at runtime and you don't know what type it is already. So actually using the TypeInfo at runtime may not be very useful.
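
A tiny sketch of the comptime-reflection pattern described above (0.9-era field names; the TypeInfo layout has shifted a bit between releases):

    const std = @import("std");

    const Point = struct { x: i32, y: i32 };

    pub fn main() void {
        // @typeInfo is evaluated at compile time; `inline for` unrolls the
        // loop over the fields, so no runtime reflection is involved.
        inline for (@typeInfo(Point).Struct.fields) |field| {
            std.debug.print("field: {s}\n", .{field.name});
        }
    }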

> Zig appears to not report compiler errors for functions that get optimized out.

This is apparently caused by the laziness. The docs suggest using std.testing.refAllDecls(@This()) to have the compiler check unreachable code. I mean the actual usable docs at https://ziglang.org/documentation/master/#Nested-Container-T... rather than the automatically generated stdlib docs.
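
In other words, something like this in the root file (sketch):

    const std = @import("std");

    test {
        // Reference every declaration in this container so lazy analysis
        // can't skip compile errors in otherwise-unused functions.
        std.testing.refAllDecls(@This());
    }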


I never thought about this until now, but I guess it's actually super weird that ArrayList doesn't expose a getter-by-index and an iterator and instead expects you to use .items.


The thing is... every function in std works with slices, not with ArrayLists. You pass slices to mem.tokenize(), you pass slices to mem.cmp(), you pass a slice to sort().

I think that's why ArrayList exposes it for you, so you can just use it as a regular slice everywhere. I, honestly, find that simple and liberating. I'm glad there's no other kind of accessor thing.

Contiguous memory is just fast, and if you have first-class support for working with it, it makes no sense to duplicate the API. But again, if you want a hashmap you don't get the slice, for a slice doesn't make sense there...
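
For example (a sketch using the test allocator):

    const std = @import("std");

    test "ArrayList is a slice plus an allocator" {
        var list = std.ArrayList(u8).init(std.testing.allocator);
        defer list.deinit();
        try list.appendSlice("hello");
        // .items is an ordinary []u8, so any slice-based std function
        // works on it directly:
        std.mem.reverse(u8, list.items);
        try std.testing.expectEqualSlices(u8, "olleh", list.items);
    }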


When I did some Zig a month or so ago, I had great help looking at the tests, e.g.[1].

That being said, the state of the documentation is also the reason why I gave up on it for now. But I'm sure it will improve over time.

[1] https://github.com/ziglang/zig/blob/79628d48a4429818bddef2e8...


As a seasoned Zig programmer, this is good information, though painful to read. A lot of the problems seem to be from a fundamental misunderstanding of Zig's philosophy and very basic things about how the language works. Possibly Zig needs more emphasis on those things in its documentation.

I also have to wonder about a fundamental misalignment of thinking when someone says downloading and replacing a single .exe is tedious, and isn't at all excited about the 'Why Zig' list. If none of those things seem like compelling ideas to you then yeah, the language probably isn't for you.


If there are fundamental philosophies in a programming language (and there usually are), tutorials ought to begin with lots of tiny example programs that exemplify those fundamentals. They should continue with examples of gradually increasing complexity, but ALWAYS using those same fundamentals in various combinations to show how those fundamentals are intended to work together in all sorts of ways.

If the tutorials aren't deliberately built up from the language's own defining philosophical fundamentals, introduced singly and then in combination, then someone like me will just re-use the fundamentals learned in whatever other languages feel most similar to the new one. Most languages can be used in roughly parallel ways with just a change of syntax, even if that's not how they are supposed to be used; and if I (like most of us) have to get up and running ASAP in some new tool, and don't have a quick on-ramp to establish the preferred patterns right from the start, I'll just re-use as much as I already know and get on with it.


My understanding of the Zig's team's opinion is that Zig is not currently at a place where they want to solidify the design by doing such things as writing extensive tutorials documenting how to write it - tutorials that will need to be updated every time they make breaking changes, which they explicitly still want to do.

A lot of this seems to be a mismatch between people's expectations of zig's stability and the reality?


The way you handle that is by adding all your tutorials to the compiler's test suite. Then, whenever a breaking change happens, you have to update the tutorials for the tests to pass, which isn't difficult since you're probably only talking about minor, incremental changes.


I like this idea quite a lot. If you can write the tutorial itself in a "literate code" sort of way, the test harness should be able to consume the entire tutorial and run tests against it. Presuming, of course, that there's a way to tangle/weave using standard comments in Zig. Sounds like it could be a fun project. Could also serve as a way to auto-doc (like JavaDocs or similar using documentation generators).


Zig already does this with its current documentation and example code.


I saw that after reading further. Loris (I don't remember his user name here) also said he plans on rewriting it from scratch on Twitch. I may tune in just to see how his mind works. :D


Well worth doing, I've learned a lot about coding from Loris.


You're talking about idioms, not philosophy ... the philosophy of Zig can be obtained by running `zig zen`. And the OP didn't even read the Hello World section of the Zig Reference Manual that has an example of printing.


Nah, I think it is fine. The problems are almost all (the author's) problems with zig's philosophy of "when to do things you need for a 1.0". Author mostly wants docs, guides, error messages, and package manager, which are supposed to be "late in the development process" according to my understanding of the unofficial zig roadmap.


The author didn't even read the Hello World section of the Zig Reference Manual that (of course) has an example of printing.


I have a feeling that Zig is sufficiently different to most other languages in the space that developing a mental model for the way things are supposed to fit together is something that involves "aha" moments - the sort that are tricky to produce for everybody via a single piece of documentation because the thing that makes it snap together in a particular person's head varies widely between people.

Similar to the process of getting the concepts of git when you've come from other version control systems.

(I have a horrible feeling that the solution to this is mostly going to be "lots of people writing tutorials that approach it in different ways until there's at least one tutorial out there that will work for (most values of) any given person", and having been there on projects of mine I sympathise with the unsatisfyingness of this conclusion)


When I saw the first line of some Zig code, it was unusual so I thought about it and got an inkling of the idea behind it and its power:

    const std = @import("std");

When I saw the second line and subsequent similar lines, I realized it was effing brilliant:

    const os = std.os;

std and os here are names bound to types ... and you can have type variables and do compile-time manipulation and construction of types. This is confirmed in the manual when it talks about generic types being the return values of functions executed at compile time that take types (and/or other comptime values) as arguments -- while C++ templates are purportedly Turing complete, this is far more powerful in practice because it's vastly easier and more straightforward. Looking at code in the library like MultiArrayList--which implements SoA (structure of arrays) in the library rather than in the language--further confirms this.

So what's going on with those `std` and `os` bindings? The answer is given at the beginning of the Zig Language Reference (https://ziglang.org/documentation/master/) (which the OP apparently didn't read, since it has a Hello World program that is an example of how to print): "The @import("std") function call creates a structure that represents the Zig Standard Library".

So essentially, every source file represents an anonymous struct (all Zig structs are anonymous), the members of which are the top level declarations in the file -- or rather, @import(filename) presents the file as such a struct, which can be assigned to a type variable like `std`. And `std.os` is in turn a type variable whose value is @import("os.zig") ... except that the actual value of `std.os` is computed based on the target machine.

By turning files into comptime data structures containing all of the file's top level declarations, and having Zig code executable at comptime, immense power is achieved. One of the consequences is that zig running on any host is a complete cross compiler that can generate code for any target, using an appropriate target-specific version of the library. And it only took me a little bit of reading of docs and code to get my "aha" about how this works.
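
And the payoff is that a "generic" isn't a special language feature at all, just a function from types to types, as a quick sketch shows:

    const std = @import("std");

    // An ordinary function, executed at compile time, that takes a type
    // and returns a newly constructed type.
    fn Pair(comptime T: type) type {
        return struct {
            first: T,
            second: T,

            fn swap(self: *@This()) void {
                const tmp = self.first;
                self.first = self.second;
                self.second = tmp;
            }
        };
    }

    pub fn main() void {
        var p = Pair(u32){ .first = 1, .second = 2 };
        p.swap();
        std.debug.print("{d} {d}\n", .{ p.first, p.second });
    }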


Now I want somebody who understands both languages better than I do to write a comparison of how Zig and Nim's respective approaches to imports and comptime work.


Nim is a far richer and higher level language that, like Zig, can run arbitrary code at comptime, but it also has templates and macros. But Zig does have some things, like "importing a file imports a struct", which are unique to it and reflect its devotion to cross-compilation as a first class feature: every Zig compiler on every host can compile programs for every other host (at least within Tier 1 support). Nim can't do that and I don't think any other language can. And its import mechanism is pretty standard, simply making names visible to the importing file or block; the names aren't organized into nested structs as in Zig.


>A lot of the problems seem to be from a fundamental misunderstanding of Zig's philosophy and very basic things about how the language works. Possibly Zig needs more emphasis on those things in its documentation.

As a rank beginner, I had no problem learning these things ... they're mentioned repeatedly in material about Zig both from the project and from outside descriptions, the philosophy is the output of `zig zen` and in all the documentation, etc. The OP couldn't figure out how to print despite the Hello World program at the beginning of the Zig Reference Manual. I think most of the issues here are PEBKAC.


The biggest thing that stands out to me from the complaints is the lack of documentation, which is probably part of the misunderstanding of the philosophy.


My big problem with Zig is that Andrew Kelley is promising a lot of features, but doesn't really deliver much. Zig still can't properly handle UTF-8 strings [1] in 2022, which is kind of unfortunate, because it's a `requirement`. In a `recent` interview[2], he claims that Zig is faster than C and Rust, but he refers to extremely short benchmarking that has almost no value in the real world.

At least Rust, as blamed and loved as it is, delivered a stable compiler and people started working on the ecosystem (in the first years, most packages were working only on nightly, but at least there were crates available). The ecosystem for zig is insignificant now and a stable release would help the language.

[1] https://github.com/ziglang/zig/issues/234 [2] https://about.sourcegraph.com/podcast/andrew-kelley/


> My big problem with Zig is that Andrew Kelley is promising a lot of features, but doesn't really deliver much.

Have you, like, seen the release notes for 0.9.0?

https://ziglang.org/download/0.9.0/release-notes.html

> Zig still can't properly handle UTF-8 strings [1] in 2022

There's plenty of discussion on the subject in basically every HN thread about Zig: the stdlib has some UTF-8 and WTF-8 validation code, and ziglyph implements the full Unicode spec.

https://github.com/jecolon/ziglyph

You might not like how it's done, but it's factually incorrect to state that Zig can't handle Unicode.

> In a `recent` interview[2], he claims that Zig is faster than C and Rust, but he refers to extremely short benchmarking that has almost no value in the real world.

From my reddit reply to this same topic:

This podcast interview might not be the best showcase of the practical implications of Zig's take on safety and performance. If you want something with more meat, I highly recommend Andrew's recent talk from Handmade Seattle, where he shows the work being done on the Zig self-hosted compiler.

https://media.handmade-seattle.com/practical-data-oriented-d...

Lots of bit fiddling that can't be fully proven safe statically, but then you get a compiler capable of compiling Zig code stupidly fast, and that's even without factoring in incremental compilation with in-place binary patching, with which we're aiming for sub-millisecond rebuilds of arbitrarily large projects.

> The ecosystem for zig is insignificant now and a stable release would help the language.

I hope you don't mind if we don't take this advice, given the overall tone of your post.


Why does something as basic as uppercasing a string or decoding latin1 require a third-party library? I would expect that to be part of stdlib in any language. Also, why does that third-party library come with its own string implementation? What if my dependency X uses zigstr but dependency Y prefers zig-string <https://github.com/JakubSzark/zig-string>? Basically all languages designed in the past 30 years have at least basic and correct-for-BMP Unicode support built-in/as part of stdlib. Why doesn’t Zig?


That's not "simple". Rust also does neither of those two tasks with just the stdlib!

- latin1 is dead and should be in no stdlib in 2022

- uppercasing requires the current Unicode tables, so, a largish moving target that you probably don't want to embed in small programs.


Latin-1 is actually the first 256 code points from Unicode. So, you can do that in Rust by casting u8 (the Latin-1 bytes) into char (Unicode scalar values). That's unintuitive perhaps because of course in C that wouldn't do anything useful since the char type isn't Unicode, but in Rust that's exactly what you wanted.
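
Sticking with Zig for illustration, the same fact makes a Latin-1 decoder a few lines, since each byte simply is its code point (a sketch; the test string and buffer size are arbitrary):

    const std = @import("std");

    test "Latin-1 bytes are the first 256 code points" {
        const latin1 = [_]u8{ 0x68, 0xE9 }; // "hé" in Latin-1
        var buf: [8]u8 = undefined;
        var i: usize = 0;
        for (latin1) |byte| {
            // Treat the byte as a code point and re-encode it as UTF-8.
            i += try std.unicode.utf8Encode(byte, buf[i..]);
        }
        try std.testing.expectEqualStrings("hé", buf[0..i]);
    }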

In this environment you might very well not need actual uppercase/ lowercase but only the ASCII subset. Accordingly Rust provides that too, which is far less to carry around than the Unicode case rules. Since the ASCII case change can always be performed in situ (if you can modify the data) Rust provides that too if it's what you want.


Those are all valid points. At the moment I believe Zig has decided to leave full unicode support out of std because they don't want language releases dependent on unicode updates.


> they don't want language releases dependent on unicode updates.

I'm sorry, what do you mean by this?


The "rules" of unicode change over time with updates to the unicode standard(s). One big one is the grapheme breaking algorithm, which has been updated over time to support things like the family emoji and other compositions.


That should be strictly a rendering concern, though


correct-for-BMP-but-not-otherwise is simply a bug (and cultural chauvinism). And almost all of such implementations aren't correct-for-BMP because uppercasing Unicode is far from "basic".


> you get a compiler capable of compiling Zig code stupidly fast, and that's even without factoring in incremental compilation with in-place binary patching, with which we're aiming for sub-millisecond rebuilds of arbitrarily large projects

That sounds great! But at the same time people in other threads here are talking about 1-3 second compilation times for Advent of Code solutions (which I presume are smallish). Can you summarise where that really fast compiler comes from, to save me searching through that talk video? Is this something that everyday users will be able to use in typical workflows?


Here's a full write up about it

https://kristoff.it/blog/zig-new-relationship-llvm/

Long story short, we're currently working on a self-hosted implementation of the compiler; what people are using now is the old C++ implementation. As soon as the new compiler is feature-complete enough, we'll start shipping it, and we expect much better compilation speeds, with even greater speed for debug builds once the native (i.e., non-LLVM) backends catch up as well.

Latest progress update on this work: https://twitter.com/andy_kelley/status/1481862781380874240?s...



The language called "Rust" prior to 2013 is a completely different language from what people today know as "Rust". That language had a garbage collector, mutable aliasing, and no borrow checker (that is, it lacked the three most distinctive features of today's Rust), and was basically "golang with different syntax":

http://smallcultfollowing.com/babysteps/blog/2012/11/18/imag...

The whole language got rebooted shortly after the blog post above, mostly because the borrow checker made so many other things suddenly unnecessary or trivial. What we call Rust today is at most 9 years old, and any similarities to pre-2013 Rust are strictly superficial syntax. They share a name and some syntax, sort of like Java and Javascript do.

Zig today at T+7 is not where Rust was in 2020 at T+7.


What matters is time from initial inception.


I don't think he's overpromising. I have no problems with UTF-8 strings (unless you mean in the source code itself; haven't tried). It's faster than C and Rust when it can be, of course; I saw a Rust program, made my own version with Zig, and it's fast, and I didn't even optimize that much.

Compiling is faster than Rust and, a lot of the time, C.

We have packages, and a good few. Thing is, this is not Rust-big; we don't have a Mozilla to back it and work on it.

I don't think it's overpromising. Vlang is overpromising; Zig, at the moment, is slowly getting there, with no promises on when.


> My big problem with Zig is that Andrew Kelley is promising a lot of features, but doesn't really deliver much.

My biggest problem with your comment is that it is completely and utterly false.

>At least Rust, as blamed and loved as it is, delivered a stable compiler

After MANY years and numerous complete redesigns.


Yes, error messages in Zig need improving, but that's not surprising. Rust error messages were pretty terrible back in the day.

I found the Zig language reference pretty good [0]. It is simple, with lots of examples. I could find 80% of what I needed there and had to google the rest. And as another commenter said, looking at the Zig source code is actually not a bad idea. The std lib is pretty clear and well commented.

My biggest beef with Zig is the lack of a package manager. But apparently it is high on the author's list of priorities, so...

[0] https://ziglang.org/documentation/master/


I think Zig's documentation is wonderful and explains most seemingly esoteric syntax and language options really well.

But if you hit a roadblock it's hard to track down more info. Some of that is due to the lack of adoption, though.


A potentially huge roadblock to better error messages is the lack of generics and interfaces/traits/classes.

The comptime machinery is really cool, but it gives the compiler much less information to work with for producing good errors.

It'll be interesting to see how this plays out as the ecosystem grows.


> The comptime machinery is really cool, but it gives the compiler much less information to work with for producing good errors.

Why do you think that's the case?


Because all the semantics for what are the valid parameters for types and other comptime operations are scattered throughout the code that is run at comptime. It's the same problem as with C++ template code, but perhaps worse. For instance, if you do a formatted print and pass an integer to a "{s}" descriptor, the std.fmt code calls @compileError with an informative message, then a dozen lines of stack trace are printed before a line that shows the source code line that invoked the print, and then yet another stack trace because std.fmt returns an error. And that's for std.fmt that is well behaved and calls @compileError when it should ... in many cases the error will be due to a type mismatch or invoking a non-existent member, etc.
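
As a deliberately broken sketch of that first case (this is meant to fail compilation; the exact diagnostic text varies by version):

    const std = @import("std");

    pub fn main() void {
        const n: u32 = 42;
        // Intentionally wrong: "{s}" wants a string, so std.fmt calls
        // @compileError, and the diagnostic arrives wrapped in a comptime
        // call stack through std.fmt before pointing at this call site.
        std.debug.print("{s}\n", .{n});
    }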


I'm still not quite seeing the problem. To my mind, comptime parameters have a direct translation to parametric type parameters in something like System F or a similar parametrically polymorphic type system, and reasonable type errors are possible in those. Do Zig's comptime parameters universally violate parametricity in some way, and is that why the errors are more difficult?


I disagree with this part:

"I also think it's partially wrong. No one in the history of the world has ever been confused or upset by a + b calling a function."

It depends. If it's simple math on vectors I think it can be OK, but it should probably be a built-in feature of the language, as this is common, solved, and something we all implement the same way (for short vectors at least).
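
For what it's worth, Zig already ships element-wise operators for its built-in SIMD vector type, so the "built-in feature" does exist for the short-vector case; a sketch:

    const std = @import("std");

    // Built-in vectors get element-wise operators from the language
    // itself; no user-defined overloading is involved.
    test "built-in vector addition" {
        const V3 = @Vector(3, f32);
        const a: V3 = [3]f32{ 1.0, 2.0, 3.0 };
        const b: V3 = [3]f32{ 4.0, 5.0, 6.0 };
        const sum = a + b;
        try std.testing.expectEqual(@as(f32, 5.0), sum[0]);
    }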

But the + operator has been abused in the past, especially with string concatenation, and I think this is a huge liability. Such an innocent looking operator, the simplest of all operations, leading to a function call, a memory allocation and thus a very real potential memory leak, all of that hidden from the eyes...

In my opinion, clarity should have priority over anything else. If an operation is computationally complex (especially with side effects) it should be at least hinted to the reader by a function call.


I also would prefer not to have + as concatenate, and Linus feels strongly enough about it (and has enough influence) that Rust for the Linux kernel does not have Add and AddAssign overloaded on string types, among many concessions.

However I think in general purpose programming we lost that battle when Java special-cased the operator. There's no overloading in Java, you can't have + do the Right Thing™ in Java for your 3D vector class, but it does concatenate strings because people had begun to expect that.

Also, while I'm in this topic, the author argues they don't want anybody overloading either of the member access operators (which is something Python kinda-sorta supports) but notice that Rust effectively does this all the time and nobody freaks out because it feels completely natural.

Your Box<Thing> isn't a Thing, so, why can you call Thing methods on it? Because it is transparently passing those into the underlying Thing by implementing the relevant overload features from core::ops and all of Rust's smart pointers work the same way.

misfortunate::Double shows that this gives very strange behaviour if somebody uses it inappropriately, a Double<Thing> is actually two Things inside, but when you change it, you're changing one of them, while any immutable references are to the other one... an eerie experience.


Why Java, given the history of programming languages between 1950 and 1996?


Influence. A completely amazing amount of Java was written (is being written) and there's a huge workforce in that language.

I think it's possible that if, say, Gosling had hated + concatenating and had provided Java with a different String concatenation operator (it clearly wants a concatenate operator, but it needn't be named +), we'd have seen that popularised, and while some languages with overloading might overload +, it wouldn't be ubiquitous. I can't prove that, of course; it's purely my opinion.


What? Influence? What nonsense.

Even BASIC uses it, and it was everywhere during the 15 years that preceded Java.

Let alone all the other languages since JOVIAL that I can't be bothered to dig out just to prove my point.


Huh. I'm pretty sure that at least some BASIC dialects I used did not have concatenation, but you seem to be correct that in general they did. This intrigued me enough to go dig through nested "from old machine" directories until I found some BASIC (I'm going to guess that I have not written any BASIC since about 1992), and you're correct: the BASIC from the era that didn't live on cassette tape, and which I thus still own, does have concatenating + operators in it.

I stand corrected.


> However I think in general purpose programming we lost that battle when Java special-cased the operator.

It was already special-cased by Pascal.


> Such an innocent looking operator, the simplest of all operations, leading to a function call, a memory allocation and thus a very real potential memory leak

I hadn't considered the memory leak angle of this, and now that you mention it it's clearly a driving concern here. When freeing is explicit, it's a problem to heap-allocate temporaries. (And I guess the addition operator would also require a third operand to supply the allocator?)

I'm still a big fan of destructors, and how they make this problem mostly vanish. But I understand that it's hard to make them efficient without move semantics, and maybe also a global allocator so every object doesn't need to store an allocator pointer? What other interactions am I missing?


Strings are complex entities, I am not sure if there is a perfect way to handle them.

In my case (I use my own framework) I use a lot of makeStringWith* functions that all allocate in the same global scratch buffer (one per thread), and I don't care at all about releasing the memory.

There is simply one function to clear/reset all scratch buffers. It has to be called explicitly by the user, most of the time once per frame (I create video games), but it can also be called when the current job is done, or never.

From my perspective it is very efficient and reliable, never had to solve a bug related to this mechanism.
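
A rough sketch of that pattern on top of Zig's stdlib, using FixedBufferAllocator as the scratch buffer (the names and sizes here are illustrative, not the commenter's actual framework):

    const std = @import("std");

    test "per-frame scratch buffer" {
        var backing: [4096]u8 = undefined;
        var scratch = std.heap.FixedBufferAllocator.init(&backing);

        // A makeStringWith*-style call: allocate freely, never free individually.
        const s = try std.fmt.allocPrint(scratch.allocator(), "frame {d}", .{1});
        try std.testing.expectEqualStrings("frame 1", s);

        // "Once per frame": a single reset reclaims every scratch allocation.
        scratch.reset();
    }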


At the very least, it's wrong that no one has ever been upset by this; many people have been very vocally upset by this.


> Such an innocent looking operator, the simplest of all operations, leading to a function call, a memory allocation and thus a very real potential memory leak, all of that hidden from the eyes...

It's not a given that floating-point addition is a simpler operation than string concatenation.

If you're serious about what you write, then all operators should be banned.


What is a case where floating-point addition is more complicated than a string concatenation? Are you referring to some obscure architecture?


One example would be an architecture on which hardware floating point is not implemented, and has to be emulated in software. This isn't uncommon in embedded, many ARM Cortex-M cores are like this AFAIK.


When I was writing arm26 (i.e. arm2) assembly for the Archimedes everybody passed around a block of ASM that did division since the processor had a whole 16 instructions and integer division wasn't one of them.

Things are less primitive these days but the memory still brings a smile to my face.


MIPS was also like that, if I remember correctly, division and multiplication were assembler macros.


Reading about MIPS was where I first encountered the concept of a branch delay slot and I am -so- glad I've never had to write assembler for a processor with that feature.


There's a pretty big difference between doing dynamic memory allocation for string concatenation on ALL systems, and doing software addition on some embedded processors. Even on your embedded system, floating point addition is simpler than string concatenation.


I don't know, I'd argue that in the simplest cases—a bump allocator followed by a memcpy—string concatenation can be significantly simpler.


In the worst case, and this is pretty rare these days, yes, adding two floats is a function call, but without memory allocation.

Integer division or multiplication of longs (64 bits) can also force the compiler to inline a bit more code, or even to call a function on older architectures; hardware division was not a given on ARM before ARMv6, I think, so for example on the Game Boy Advance or even the Nintendo DSi you had to be careful with division, etc.

But again, this will only slow down your code if you're not careful, not generate memory leaks silently in the background.


Why 'memory leaks'? Presumably, destructors will be automatically called one way or another.

In the simplest case, you can just store all strings on the stack.


The DSL I'm currently hacking around on has ++ for string (and list) concatenation, and using a -separate- operator seems to rather help.

(I've yet to conclude if stealing ++ for this rather than preinc/postinc was a terrible mistake, so far in context it hasn't seemed to be but I'm still keeping an eye on the question)


Perl uses ~ for string concatenation.

Nothing says that you cannot use both infix "++" for concatenation and unary ++ for increment. "-" is both a unary prefix and binary infix operator, for example.


Perl uses '.' for string concatenation.

Nothing says that I can't do that but in context of deliberately -not- re-using operators for completely different things it would seem rather self-defeating.


Oof, you're right. I was thinking of Raku.


Interestingly, since you can't pass an allocator to an operator call, there is some extra guarantee around whether a hypothetical overloaded operator could allocate. If you implement it for a vector type which doesn't contain an allocator reference, then you're sure that the operators won't allocate.


To me the biggest difference between Rust and Zig in practical terms is that Zig does not offer statically safe resource management like Rust does, and as far as I know they have no intentions of doing so in the future because they think it’s less important than all this stuff about control flow. There are a lot of interesting ideas in the language but my disagreement with them on this issue is so fundamental that I don’t feel compelled to investigate it further.

Nonetheless I had gotten the impression from many posts on here before that Zig was a lot closer to finished than it really is; it seems like they're not quite halfway to something as polished as Rust 1.0. It is probably unfair to judge the language against Rust in its current state. Hopefully in a few years they will have figured out the memory safety problems and the stdlib documentation will stabilize more.


I think the comparison should be more nuanced and holistic than only memory safety, if it is to be a discussion on security and not theater.

For example, to get the conversation started, how do both languages compare in terms of checked integer arithmetic? Do they enable checked arithmetic by default in safe builds with an opt-out for performance, or do they leave default builds unsafe with an opt-in for safety?
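
For concreteness, Zig's own answer is checked arithmetic by default in Debug and ReleaseSafe builds, with explicitly wrapping operators as the opt-out; a minimal sketch:

    const std = @import("std");

    pub fn main() void {
        var x: u8 = 255;
        // `x += 1` here would panic with "integer overflow" in Debug and
        // ReleaseSafe builds (and is undefined behavior in ReleaseFast).
        // Opting out is explicit: +% is wrapping addition.
        x +%= 1;
        std.debug.print("{}\n", .{x}); // prints 0
    }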

Another example: JavaScript is 100% memory safe; does it follow that we can expect to see fewer exploits against NPM than against C? Both ecosystems are massive, but I'd wager that most product release security teams are more stressed out about NPM than about C dependencies right now. Not to say that they shouldn't be running C dependencies in sandboxes or evaluating the risk of C (and remember that Zig is an order of magnitude safer than C, much closer to Rust actually, and more so in some areas). But NPM is probably getting more attention. Same thing for bug bounty reports: probably more for NPM supply chain vulnerabilities than anything else. All 100% memory safety, and yet security is still a thing.

It's the many small decisions like these, along with thinking of security not as a binary extreme but as a probabilistic spectrum, that are more interesting to me.


> All 100% memory safety, and yet security is still a thing.

And nobody has ever claimed that memory safety is the only thing that matters with security, but it’s definitely high on the list. Can you imagine how much more of a nightmare NPM would be if JavaScript weren’t memory safe? This is the relevant counterfactual.

> Zig is an order of magnitude safer than C, much closer to Rust actually, and more so in some areas

Hard disagree. It's definitely safer than C (in terms of UB), but you can tell it's not even close to Rust on a number of axes (yet?): https://scattered-thoughts.net/writing/how-safe-is-zig/

As for integer overflows, I don’t think they are nearly as big a security concern in a memory safe language with bounds checking. Feel free to correct me though.


> As for integer overflows, I don’t think they are nearly as big a security concern in a memory safe language with bounds checking. Feel free to correct me though.

See Heartbleed, Cloudbleed, and all the buffer over-reads resulting in buffer bleeds, letting someone read all your sensitive server memory, with no UAF.

They can also be caused just by integer overflow, which is what makes unchecked integer arithmetic so incredibly dangerous. It's easy for programmers to be oblivious to this, and overly rely on the borrow checker, thinking it can provide 100% memory safety, when by definition it can't.

And all kinds of software systems have these vulnerabilities. For example, I've worked on static analysis security software that could detect bleeds automatically in outgoing email attachments and it would find different bank systems leaking data in autogenerated statements.

From this experience, and from some bug bounty and security engagements I've done, I'm much more comfortable actually with Zig's approach to correctness and safety overall. I think the borrow checker is pretty awesome and has some serious muscle, but nevertheless Zig impresses me with its strict focus on explicitness, which I believe is the best approach still to eliminate these kinds of semantic gaps in general.

The borrow checker obviously can't protect you from bleeds (it can prevent some, where static allocation is at play), but checked arithmetic would, or at least would go pretty far — and again, it's about the spectrum, not the extreme.

It would be great for Rust to enable checked arithmetic by default for safety, with an opt-out for performance; flipping the current default around would be the better trade-off.


I feel like Nim is a much closer comparison than Rust. All three could be described as "grassroots projects on a holy mission to replace C and/or C++," but Nim and Zig are more similar in terms of operating budget/team size/adoption rate.

I may be a bit biased, but many of the problems mentioned in the post are the sorts of things I haven't dealt with in Nim for probably 2 years. Nim's error messages are still hit-or-miss and {.gcsafe.} still haunts me in my dreams, but the stdlib and language documentation are great, and most of the annoying gotchas have been fixed.


Rust isn't on a holy mission; it's simply the only memory-safe language without a garbage collector.

It's really that simple. There is no international conspiracy behind Rust's popularity. It's the only thing that can do what it does in that respect.


Depends; I consider that Rust might succeed where Modula-2, Object Pascal and Ada failed, for no reason other than a new generation of developers having a go at it.


There is nothing fundamental about ownership-based resource management, memory-safety can be achieved in other ways. Zig is C, it's not meant to abstract away memory management. You can do whatever in Zig: write a specialized allocator that's verified to be correct, even for a microcontroller with 2 KiB of memory. Zig is a machine-oriented language, which means no hidden control flow and no hidden allocation. You can write a garbage collector, or a compiler like Rust's, if that's what you need.


> memory-safety can be achieved in other ways

I mean, as far as achieving it at compile time, I would say it remains to be seen whether Zig's approach is one of these ways. There are certainly other known approaches, like the way ATS models pointers as proof objects, but these are also fairly abstract.

> Zig is C, it's not meant to abstract away memory management.

This is just hiding the ball though. Why don't we want abstractions on memory management? There is no runtime cost to providing the abstraction that Rust does. As such, "machine-oriented" is an ambiguous description of the difference here. The whole lesson of Rust is that there's nothing fundamental about the abstractions the machine itself provides, and we can create better abstractions without losing our orientation to those concerns.


> There is no runtime cost to providing the abstraction that Rust does.

This is technically true, but if I want to write a program that does not call malloc in its steady state, I can't use the stdlib or any library that uses the stdlib.


Herein lies the second biggest barrier to Rust's viability in embedded systems. (The first being that no microcontroller OEM ships libraries for Rust)


But this is not a fundamental limitation, there's nothing about the language that requires you to use heap allocation. That it contains affordances for safe heap allocation is a good thing, but you're not required to use them. There's always no_std and unsafe.


Here is an example, with a keyboard firmware, of why you might not always want all that complexity.

https://kevinlynagh.com/rust-zig/

https://zig.news/aransentin/analysis-of-the-overhead-of-a-mi...

The point of languages like C and Zig is that they are only a bit higher level than assembly (for portability), but otherwise they don't hinder you from doing whatever you want. It's up to you to solve problems like memory safety. You might not even have a memory safety problem, because you don't have memory in the first place, or because your use case makes memory safety trivial.


I read the thread about the keyboard at the time. The author literally said it was a hobby project and they weren't concerned about safety, which is why they weren't interested in all of the solutions people offered for their problems with Rust. Zig's fine if you don't care about safety. But that's not really how I've heard it advertised.

> but otherwise they don't hinder you from doing whatever you want

Neither does Rust! You can choose to ignore what the compiler wants you to do and just program with unsafe everywhere. The difference is that you don't have the option of writing safe code in Zig and C.

> It's up to you to solve problems like memory safety.

And so far, memory safety is not actually a problem that programmers have been able to solve by sheer force of will without borrow checkers or garbage collection. For a subset of trivial programs, sure. But not at any reasonable scale.


> nothing fundamental about ownership-based resource management, memory-safety can be achieved in other ways.

Not other ways. One other way. Just one. Garbage collection.



Thing is, if Zig is C, why bother at all? We already have C for that.


Zig is warts-free C, but you can use them together. You can gradually refactor your C codebase; Zig can even translate C into Zig. Zig is also a standalone C toolchain, compiling and cross-compiling C is a breeze with it, and it has its own libc implementation.


It does not have its own implementation of libc; it just makes it easier to work with existing libc implementations and lets you cross-compile. I think it was like that.

However, a libc is a thing that Zig probably wants, and something I am also thinking of doing if I have the time.


It doesn't fix C flaws like use-after-free, so it's hardly warts-free.


That's not a C flaw, you can use an allocator that prevents use-after-free.

https://github.com/bwickman97/ffmalloc

The point of C and Zig is that they are low level and you can do whatever, like not use an allocator, or write an allocator.


Just like C memory debuggers for the last 30 years, so what is the benefit?


Defer, comptime, a far better cross-file model, to name just a few things.


For the price of an ecosystem with the same security flaws, it's hardly worth it.

Much easier to just use C+.


It's an open question [1], along with other safety checks both comptime and runtime [2]. The issue here is balancing keeping language complexity low against providing safety, while not depriving users of control when they know better than the type checker/lifetime analysis.

[1]: https://github.com/ziglang/zig/issues/782

[2]: https://github.com/ziglang/zig/issues/2301


> not depriving users of control when they know better than the type checker/lifetime analysis.

You're never deprived of control though, that's what unsafe is for, and it provides roughly the same level of safety as C or Zig would.


unsafe Rust has its own UB pitfalls that don't exist in C or Zig, particularly in regards to the lifetimes of references [0] and preserving provenance through experimental APIs [1].

[0]: the compiler is allowed to emit extra reads/writes through references, so a mut and a shared one to the same memory location can't be alive at the same time, across threads, and across sometimes non-obvious scoping rules.

[1]: [pointer::set_ptr_value](https://doc.rust-lang.org/std/primitive.pointer.html#method....) is used by containers like Arc for `from_raw` in order to carry over aliasing, mutability, and other location meta data.

unsafe Rust is also less ergonomic when dealing with concepts that go against the borrow checker, like Pinning for intrusive memory, ptr::read/write and ManuallyDrop for controlling unorthodox object ownership/lifetime, and MaybeUninit/NonNull for facilitating certain data layout/access optimizations. Such designs often can't be wrapped in safe Rust without introducing the runtime overhead the patterns were originally used to avoid. Languages like Zig and C, however, make these patterns natural or even pleasant enough to consider them over unsafe Rust.


This is such a squishy value proposition, which is why people aren't taking Zig seriously.

Rust's value proposition is simple: no GC and no undefined behavior. Period. Nothing else has that.


I don't know; people like Mitchell Hashimoto and Tobi Lütke are taking Zig seriously for systems programming. Coil are also investing in writing a new distributed financial database in Zig—considering the trajectory of the language and the lifetime of our project, it made sense.

Of course the swell is early, but waves are what technology is about, and the surfers are there and paddling out. It's a great time to be getting involved, especially for greenfield projects that have some time to reach stability themselves and don't want to pay a language compiler/complexity tax for the rest of the project's lifetime.

You could also throw a dart blindfolded into the Zig community and be pretty sure to hit some seriously talented programmers to learn from. If you're investing in a deep understanding of the language now, I'm pretty sure it will pay off down the line.



In terms of safety in systems programming, Zig is hardly better than something like Modula-2 or Object Pascal.

It just happens to have a better syntax for the C crowd.


Yeah, I've tried out zig and it's about at the point where Rust was when there were like 5 different pointer sigils and a GC was still included.


The author is not excited about "No hidden control flow", but as a code reader it's really nice. It means that the only context you need in order to understand the control flow of a given line of code is that line itself. You don't need to check for overloaded operators, exceptions, virtual functions, etc. It's this property of Zig that I think makes "read the stdlib source" actually a viable strategy for learning how to use it.
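
A small sketch of what that buys in practice: every way control can leave a Zig call site is written at the call site.

    const std = @import("std");

    // `catch` handles the failure on the spot; `try` visibly returns it to
    // the caller. A plain call can't secretly unwind past you.
    fn parseOrZero(line: []const u8) u32 {
        return std.fmt.parseInt(u32, line, 10) catch 0;
    }

    fn parse(line: []const u8) !u32 {
        return try std.fmt.parseInt(u32, line, 10);
    }

    test "control flow is spelled out at the call site" {
        try std.testing.expectEqual(@as(u32, 0), parseOrZero("oops"));
        try std.testing.expectEqual(@as(u32, 42), try parse("42"));
    }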


What did you think of the example in the article about vector math? Seems like that's an area where operator overloading actually makes the code more readable. Maybe it depends on the problem domain you're working in.


I agree that with vector math, overloaded arithmetic operators are easier to read. However, I don't see how you could add "overloading, but only for actual math" - once it's in, people will repurpose it for all kinds of cursed purposes.

The implementation would probably be ugly, but I wonder if it could be implemented by using a comptime string to represent the operation, e.g. something like:

    fn doMath(comptime op: []const u8, args: anytype) MathReturnType(op, args) {
        // TODO: implement me
    }

    const result = doMath(
        \\a + b
        ,
        .{ .a = a, .b = b }
    );
Where the implementation would call `.add` etc on the parameters when infix operators were used.


There was a comment several days ago (https://news.ycombinator.com/item?id=29825516) that made me reconsider the whole enterprise of operator overloading even for math, specifically its last paragraph.

The gist is that you can easily build them in ways that preclude useful optimizations and efficient execution, when it's often more desirable to be fast than to have syntactic sugar; hence explicit function calls like multiply_add(a, b, c) instead of a+b*c. If you really want syntactic sugar for math, operator overloading probably isn't the way to implement it. It'd be nicer to have something with the full context so there can be optimizing reductions: Lisp macros can do that, or you might have some other kind of parser (that might have to work on strings), or with sufficient cleverness you could build an overloaded-operator nest full of context-accumulating operations-to-perform that either requires some doMath wrapper at the end or a final overload of operations producing a fully computed return type.

I prefer languages that don't cripple expressive freedom, so overall I'm not anti-operator-overloading in general, even if I think some overloads are pretty questionable (I dislike C++'s arrow overload for Optionals). But I no longer think that e.g. a math-focused library is an obvious win or an exception to the downsides of the expressive power granted by operator overloading.


I think vector math is a compelling example but it is far from the only one. Bignum arithmetic is probably just as common if not more so, in fact I see that Zig actually has a bignum library built in if I understand correctly. Using that library will be painful because of this choice. There are plenty of other such examples, imagine implementing (and then using) something like SymPy in Zig.


One of the issues with operator overloading itself in relation to bignums and similar is the need for an allocator (there's nowhere to pass one), along with the lack of error handling (suppose allocation fails) and deciding when/how to clean up. Same with string concatenation at runtime via operators, and many other places where it could be used.

So even if zig allowed for operator overloading, those issues would have to be solved.
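
To make the allocator problem concrete, here's the shape such an operation takes in Zig today; the `BigInt` here is an illustrative stand-in of mine, not the stdlib's bignum API:

    const std = @import("std");

    // Illustrative stand-in for a bignum type; *not* the stdlib's API.
    const BigInt = struct { limbs: []const u64 };

    // Everything an infix `a + b` would have to hide: an allocator
    // parameter, a possible allocation failure, and memory the caller owns.
    fn add(gpa: std.mem.Allocator, a: BigInt, b: BigInt) !BigInt {
        const longest = if (a.limbs.len > b.limbs.len) a.limbs.len else b.limbs.len;
        const limbs = try gpa.alloc(u64, longest + 1);
        for (limbs) |*l| l.* = 0; // real code would carry-propagate here
        return BigInt{ .limbs = limbs };
    }

    test "the shape of explicit bignum addition" {
        const gpa = std.testing.allocator;
        const sum = try add(gpa, .{ .limbs = &[_]u64{} }, .{ .limbs = &[_]u64{} });
        defer gpa.free(sum.limbs);
        try std.testing.expectEqual(@as(usize, 1), sum.limbs.len);
    }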


Zig is fairly low-level. GPU compute kernels are already defined as stringly typed things that get thrown into an additional non-Zig compilation step; I don't know that I'd feel too badly about somebody having to write a tensor DSL that Zig could interpret at compile time.

Do you have any examples of situations where people would be writing a lot of code with custom operators where each individual block is pretty small and having to deal with a DSL of some flavor would be especially burdensome?


It’s a bit of a non-issue really. Once you’ve worked with the non-overloaded version for a while you quickly get used to it. The overloaded version is also superficially simple but loses information. For example, multiply versus multiplyScalar versus applyQuaternion on a vector which might all be represented by overloads of the multiply operator.


Why include the infix operators at all then? If it's just as clear without them maybe even machine native types don't need them.


They’re only operating on scalar values. In some languages differing types and unsafe coercion rules mean it’d probably be clearer if they weren’t used.

And I don’t believe I said they were just as clear, just that it’s no big deal to get used to and gave an example where overloading muddies the waters.


I also did AOC 2021 to learn zig. But I found zig interesting and useful.

As someone with only 18 days more experience than OP it's a little silly for me to say, but I think OP is getting put off by the normal learning curve of a new language and standard library. The struggle to learn different patterns faded away after not too many more days. Zig is pretty simple, and while the standard library is barely documented, it is pretty well organized and straightforward to understand.

I’m pretty interested to see where zig goes, and I’m hopeful about it.

(As mentioned, zig is a work-in-progress. The final form might be significantly different from what is there now.)


> [Compiling] takes about ~3 seconds minimum which is frustratingly slow

I feel old, I know that any time is an opportunity to get distracted but 3 seconds doesn't strike me as a long compile time.

Or is that a typo for 30, which would make more sense, which is definitely long enough to be a frustration?


That complaint was weird, here's my day 7 solution in zig built in release mode:

    $ time zig build-exe ./a.zig -O ReleaseSmall

    real 0m1.728s
    user 0m1.544s
    sys 0m0.292s
Which is actually slower than running the entire solution in debug mode:

    $ time zig run a.zig -- input.txt
    min 337 342641
    min 470 93006301

    real 0m1.138s
    user 0m0.789s
    sys 0m0.270s
Totally possible that it's much slower on the dev's machine, but if he's a VR developer it seems weird that he'd have a low-powered box.

[1]: https://github.com/llimllib/personal_code/blob/master/misc/a...


Blog author here. I'm on an Intel i7-8700K desktop. A few years old at this point, but quite beefy.

Maybe this is a Windows issue? If I run "zig build" and then immediately run "zig build" again it still takes 3 seconds.


A relatively quick way to test that theory might be to pull down a WSL2 image of some sort and run the same thing in there.

Speaking as somebody who mostly codes for *n?x but has some windows users of their libraries, I've run face first into "everything I know about build optimisation is inapplicable on windows" more than once.

('everything' is admittedly slightly hyperbolic but I'm sufficiently bad at windows tooling that it always feels that way)


Sounds like something is messed up with your zig cache. I have a medium-size project (30 files, 10s of thousands of lines of code) that takes 0.07 seconds to `zig build` if nothing has changed. This is on a Mac.


How do I clean the zig cache? There does not appear to be a "zig clean".


rm -rf zig-cache over here


Ah didn't realize it was just a folder in the project directory, oops!

Nuked zig-cache and zig-out, rebuild, no change in build times.


Totally possible! I don't have a Windows box to test on, and win does sometimes end up as a second class citizen for stuff like this.


That's funny. I am old, and I think 3 seconds is unacceptable.

I use zig anyway.


I read it differently to you. I don't think he meant that three seconds is long for compiling in general, but when you are unfamiliar with the code and trying to make it work, having to wait three seconds every time you make a small change adds up to become annoying.

I think that's why he would like a `zig check` command that would give him these errors and warnings faster.


Depends on the machine. On large projects, zig can take up to 10-15 seconds on my i5 with 12 GB of RAM; of course, after the first hit it becomes ~0.01s to recompile without changes (in case you removed the output but not the cache). The first compile also takes a bit longer, since it precompiles std into a global cache (or that is what I've seen), just so std does not make your code take longer to compile. Small things take 1-3 seconds depending on how comptime-heavy I make my code.

Zig's comptime is the part that adds compile time, since it is effectively an interpreted version of the language, and the more comptime you use, the more time it takes. Usually it only adds a few seconds; it could add 30 minutes, but you would have to be doing something weird, like generating strings at comptime for use at runtime, or running an expensive algorithm that takes a long time to execute.


Honestly I'm surprised it takes 3 whole seconds. Unless I'm importing zigwin32 I don't think I've ever had a Zig project take that long to build. But then again, I avoid linking to C at pretty much any cost, maybe OP didn't.


When you don't have in-editor compiler errors and are trying to learn, instant feedback can be nice.


Fun fact: zig syntax errors are handled both by zls and by zig fmt --ast-check

(or, if you only care about one file, zig ast-check =)

For compile errors I think zig build is the only way


I see what went wrong here. You tried to lean on documentation and Google; you should have just gone for the source code. It's quite readable, and it's the recommended way to learn for now. Zig has a few peculiarities, it's not stable yet, and you kind of have to follow its development.

https://github.com/ziglang/zig/tree/master/lib/std


>I see what went wrong here. You tried to lean on documentation and Google; you should have just gone for the source code.

That sounds rather like a different thing was wrong - not where the author tried to lean: Zig has crap documentation.


Zig documentation isn't crap; the documentation *does not exist* yet. There's an important difference between the two which seems to be ignored.

Zig is young. They're spending their time maturing the core ecosystem --- the compiler, the standard library, and so forth. It's perfectly reasonable to expect early adopters to "RTFS/read the fucking source". *When* they start writing documentation, I'd be perfectly happy to hear comments like "The zig documentation is crap".

At this stage of the project, I'll be honest, I find it upsetting to read comments like this.


>Zig documentation isn't crap; the documentation *does not exist* yet. There's an important difference between the two which seems to be ignored.

Well, crap as an offering (what you get at the moment), not necessarily as in the quality of what little there is. It does, though, have a reference and documentation page:

https://ziglang.org/learn/


The autogenerated docs immediately greet you with a red banner that states

    These docs are experimental. Progress depends on the self-hosted compiler, consider reading the stdlib source in the meantime.
It also links to a page that explains how the standard library is structured: https://github.com/ziglang/zig/wiki/How-to-read-the-standard...


Zig absolutely has documentation: https://ziglang.org/documentation/master/


That's the language reference; it also links to the experimental autogenerated std documentation.

Right now, Zig is young and the documentation is kinda OK. The language reference is clear enough, apart from a few things that need better visibility (for example, concepts that are only explained in the examples are easy to miss). The autogenerated std docs, on the other hand, used to work only so-so until a commit broke them, AFAIK, so right now they're in a pitiful state, but there is no use in reworking them until stage 2, so that's that.


Well, the std documentation page has a red banner saying it's experimental and that you should consider reading the code (which worked great for me), so at least you have a warning.


Yeah, I did AoC a few years ago in Zig and found that sticking the std path in my project, so I could quickly search it, was the best tool for figuring out how to do various bits and pieces.


I did the same! Cuz at the time there was no language server available beyond syntax highlighting and linting. But now there is a pretty good language server package available, which is really helpful for autocompletion.


Recommended way to learn for now? Any sources for that?


Read the language reference, use ziglings, use ziglearn.org, use zls (don't trust it too much tho, but it will help a bit) and vscode or any text editor that supports jumping to the source, and of course, RTFS


Ziglings are indeed excellent. Try them if you haven't.


Read The Fabulous Sources?


Well yes, they're nice, but how would they help with Zig?

https://github.com/fsprojects/Fabulous


That's a way of reading it


When saying it positively I often say UTSL instead (Use The Source, Luke).


Isn't the problem with that approach that you get a model of the implementation (and not of the 'interface')? I mean, you don't know if you are using the function as intended, and if it's not used as intended it might (actually, will) break in the future.

For auditing you are right, of course.


Not exactly.

I mean, as a first note, one of the ideas here is reading code already written in Zig to understand how to write code in Zig - so in that case you're effectively inferring the interface from what the authors of the code themselves are treating said interface to be.

But also, interface documentation is pretty much never truly complete, because there are almost always some implicit assumptions involved (and if you try to make all of those explicit, you rapidly end up with documentation so verbose that people's eyes glaze over when they try to read it, so their model often ends up incomplete anyway; how explicit to be is itself a trade-off).

Then zig embeds its test cases in the source file, so you can look at what the authors have explicitly declared -must- work to help you know what the interface is intended to be.
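
For readers who haven't seen it, that looks like this (a toy sketch; `clamp` here is mine, not the stdlib's):

    const std = @import("std");

    fn clamp(x: i32, lo: i32, hi: i32) i32 {
        return std.math.min(hi, std.math.max(lo, x));
    }

    // Tests live next to the code they exercise and run with `zig test`; in
    // the stdlib they double as executable documentation of the interface.
    test "clamp stays within bounds" {
        try std.testing.expectEqual(@as(i32, 3), clamp(99, 0, 3));
        try std.testing.expectEqual(@as(i32, 0), clamp(-5, 0, 3));
    }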

Plus when I'm source diving, I've done enough of it over the years to be able to at least attempt to build up a model of not just the implementation, but of the author's mental model as they were implementing it, and if you can figure out their intent, it's much easier to guess what their code is -meant- to do and thereby what it will hopefully continue to do into the future.

If in doubt, though, leaving a comment in your own code as to what assumption you're making -and- writing a unit test in your own code that verifies the assumption continues to hold will mean that at least if it doesn't you'll see a test failure in your own suite that tells you it changed.

(or: "code to the interface, not the implementation" is absolutely the right thing to aim for, but in practice the line between the two is fuzzy and cases where you have to make a judgement call will always show up eventually)


Would there be any interest in an imperative language with a basic Standard ML type system?

Personally, while I find the imperative paradigm more intuitive than the functional paradigm, I believe functional type systems are simpler and more "right" than imperative ones are (which have been taken over by OOP). There are multi-paradigm languages, but these have large numbers of features, and complicated type systems, and I want something simple. Is there interest in this? Thanks.

By SML type system, I'm thinking:

  - Algebraic data types
  - Anonymous functions
  - Pattern matching
  - Simple generics
  - Type inference
  - Simple modules and implementation hiding
combined with some approach to operator overloading (which SML doesn't consider at all). And to remind you, the paradigm should be imperative, not functional.

I'm thinking a use case might be in numerical linear algebra, but with support for multiple different scalar types including floats, ints, complex floats, dual numbers (for autodiff), "codual numbers" (for better autodiff), bignums, quaternions, symbolic algebra, etc. The functional paradigm is not suitable here, but Julia is perhaps overcomplicated.


OCaml is pretty close already to what you're describing. The OCaml ecosystem fully embraces imperative programming. And if the syntax of OCaml is problematic, then perhaps ReasonML (https://reasonml.github.io/) is what you're looking for: a curly-braces syntax for OCaml.


I think you just described Rust (although you have to deal with / get to take advantage of the borrow checker).


OCaml and Rust both offer these, OCaml does overloading mostly with module functors and Rust with traits. Of course they’re both multi-paradigm, OCaml is also really good for functional programming.


It starts with a mention of hating Python and continues into a detailed report that could alternatively be called "100 wrong things that Python got right".


There's a Hello World program at the beginning of the Zig Reference Manual. It of course shows how to print things. Many of the other failures could be avoided by reading that document, reading Zig source code in the distribution, looking at issues on GitHub, reading blogs, and watching presentations by Andrew Kelley, etc.

Zig is far from a finished project so it's premature to complain about most of the things you are complaining about.


> No one in the history of the world has ever been confused or upset by a + b calling a function.

That depends on `a`, `b` and the function - I have seen way too many `+`s that weren't commutative and/or associative. Languages which do not allow the use of _other_ symbols like `⊕` almost always lead to confusing behavior of the 'usual' operators, too.

And what does `a * b` do for vectors `a` and `b`? Is it the 'usual' 'dot product', is it component-wise multiplication (`[.., ai * bi, ..]`), is it 'matrix multiplication' (`a * b^t`), is it the cross product (because the language doesn't allow defining `×` as an operator), is it ...?

Actually, in C and C++ the implicit conversion and promotion rules are footguns enough.


If zig supports + for floating point addition, then it is already fine with ‘+’ being non-commutative. The rest of the argument can also be applied to functions. What does product(a, b) do?


> The rest of the argument can also be applied to functions.

Of course it can, but you can use function names other than `product`. If the language doesn't let you use any symbol other than `*` for the operator, you're out of luck.


Aside from `NaN`, how is floating point addition non-commutative? It's not associative, but `a+b` will have the same value as `b+a`.
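
(A classic illustration of that distinction, as a quick Zig test; the constants are the usual 0.1/0.2/0.3 example:)

    const std = @import("std");

    test "f64 + commutes but does not associate" {
        const a: f64 = 0.1;
        const b: f64 = 0.2;
        const c: f64 = 0.3;
        try std.testing.expect(a + b == b + a); // NaN aside, order is fine
        try std.testing.expect((a + b) + c != a + (b + c)); // grouping is not
    }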


> `a+b` will have the same value as `b+a`

No, as IEEE754 doesn't even guarantee that `a + b` == `a + b`. You always need to compare the absolute difference of two values against an ε.

So

  a == b <=> |a - b| < ε

But for sensible comparison of floats, the addition is commutative.


> No, as IEEE754 doesn't even guarantee that `a + b` == `a + b`

What? I'm pretty sure that is plain incorrect. Can you back that claim up with references?


Floating point addition is not commutative if you're adding negative zero to positive zero, but you won't be able to check that with `a + b == a + b`


Of course it is incorrect; the system is not stochastic, not even for floating point.


What if a is nan? You don't have that nan == nan...


The rounding can depend on 'where' the float is stored in the CPU (e.g. the width of the register; the 'destination' in IEEE 754 speak). So, if you compare results stored in two different 'destinations' (because one has been computed in another register, or on a GPU, or ...), they can differ even in the same program with the same optimizations.

See for example:

   #include <stdio.h>

   int main() {
       double q;

       q = 3.0/7.0;
       if (q == 3.0/7.0) printf("Equal\n");
       else printf("Not Equal\n");
       return 0;
   }

   On an extended-based system, even though the expression 3.0/7.0 has 
   type double, the quotient will be computed in a register in extended 
   double format, and thus in the default mode, it will be rounded to 
   extended double precision. When the resulting value is assigned to the 
   variable q, however, it may then be stored in memory, and since q is 
   declared double, the value will be rounded to double precision. In the 
   next line, the expression 3.0/7.0 may again be evaluated in extended 
   precision yielding a result that differs from the double precision 
   value stored in q, causing the program to print "Not Equal". Of course, 
   other outcomes are possible, too: the compiler could decide to store 
   and thus round the value of the expression 3.0/7.0 in the second line 
   before comparing it with q, or it could keep q in a register in 
   extended precision without storing it. An optimizing compiler might 
   evaluate the expression 3.0/7.0 at compile time, perhaps in double 
   precision or perhaps in extended double precision. (With one x86 
   compiler, the program prints "Equal" when compiled with optimization 
   and "Not Equal" when compiled for debugging.) Finally, some compilers 
   for extended-based systems automatically change the rounding precision 
   mode to cause operations producing results in registers to round those 
   results to single or double precision, albeit possibly with a wider 
   range. Thus, on these systems, we can't predict the behavior of the 
   program simply by reading its source code and applying a basic 
   understanding of IEEE 754 arithmetic. 
The relevant conclusion:

   Neither can we accuse the hardware or the compiler of failing to 
   provide an IEEE 754 compliant environment; the hardware has delivered a correctly rounded result to 
   each destination, as it is required to do, and the compiler has 
   assigned some intermediate results to destinations that are beyond the 
   user's control, as it is allowed to do.
https://grouper.ieee.org/groups/msc/ANSI_IEEE-Std-754-2019/b...


I find the argument unconvincing. While IEEE 754 does not specify what the `/` (or `+`) operator in particular does (that falls under the C standard), it does specify that:

> Implementations shall provide the following formatOf general-computational operations, for destinations of all supported arithmetic formats, and, for each destination format, for operands of all supported arithmetic formats with the same radix as the destination format

Which to my interpretation sounds like providing a `division(double, double) -> double` operation is required. I suppose the argument could be that in C the way of invoking that specific operation would be to add an explicit cast, i.e.

    (double)(a op b)

But I do think that this is more a quirk on how operators are done specifically in C and not a general matter; the actual IEEE 754 operations are consistent and don't have such surprises.

So I guess you were technically correct that IEEE 754 does not guarantee `a + b == a + b`, because IEEE 754 does not specify `+` (or `==`, for that matter) operators at all. What IEEE 754 does guarantee is that you get the same result for the same operation.

In terms of the C language, it is maybe an interesting question whether the `FP_CONTRACT` pragma influences the result here at all:

> A floating expression may be contracted, that is, evaluated as though it were a single operation, thereby omitting rounding errors implied by the source code and the expression evaluation method. The FP_CONTRACT pragma in <math.h> provides a way to disallow contracted expressions. Otherwise, whether and how expressions are contracted is implementation-defined.

Are the comparison and arithmetic operations considered to be contracted in this sort of situation?


> But I do think that this is more a quirk on how operators are done specifically in C and not a general matter

That has nothing to do with C, but with the CPU and its registers. x86 has 80-bit floating point registers (too), so if the compiler/CPU (doesn't matter which language) saves a value in an 80-bit floating point register and then moves it to memory, where it is stored in 64 bits, the number gets rounded (not truncated; it's actually converted).

See also the GCC page:

    For instance, in the following code segment, depending on the compilation 
    flags and numbers and calculations used to find tmp, the following code may 
    print out that the values are different:


     double tmp, X[2];
     tmp = ....
     tmp += ....
     ...
     X[0] = tmp;
     if (X[0] == tmp)
        printf("Values are the same, as expected!\n");
     else
        printf("Values are different!\n");

    This is because tmp will typically be moved to a register during register 
    assignment, which means tmp may hold a full 80 bits of accuracy, some of 
    which are lost in the store to X[0], and thus the numbers are no longer 
    equal. You may workaround this problem by always explicitly storing to 
    memory to force the round-down.
https://gcc.gnu.org/wiki/x87note

> Are the comparison and arithmetic operations considered to be contracted in this sort of situation?

No. Or well, maybe. 'Contracted' means using e.g. FMA (fused multiply-add), so an addition and multiplication like `ab + c` is done in a single step instead of two. That means the result is only rounded once (`ab + c` is rounded), whereas without this fusing/contracting, `ab` would be rounded and then `ab + c` would be rounded again (actually depending on the optimization/compiler flags). So the results (may) differ.


I believe there are strange corner cases around x87, but that may have changed since IEEE 754 got revised last year.


For example, yes.


I appreciate the honesty and thoroughness that the author is displaying here.

There are too many posts that focus on successes and embellish the state of affairs.

That said, nothing was too surprising about a language that hasn't hit 1.0 yet.


I completed 10 days of Advent Of Code 2021 in Zig and loved the learning experience. Then I rewrote a snake game I made originally in Elm into Zig and compiled it to WASM. So now it runs in a browser. There is Zig standard library documentation on Zig website, but if you want to see the missing parts, go to GitHub and browse stdlib source files. Each source file contains test cases that are sufficient for understanding how it works.


Such usability reports are super informative for when you want to make software usable.


> Building is slow. It takes about ~3 seconds minimum which is frustratingly slow when I'm fighting basic syntax errors. I wish there was a fast zig check.

> Lack of zig-analyzer makes learning hard.

> zig fmt src/main.zig is nice. Wish it automatically ran on all files.

I also did (well, "am doing"; I can only work on it a bit each day and am plugging through day 7 right now) Advent of Code in Zig this year.

These points didn't resonate with me at all. I wonder if the author knew about or tried ZLS [0]. I had it installed and integrated with VSCode, and it would check a lot of things as I went and format on save. I think I followed something like this [1] to set it up.

[0] https://github.com/zigtools/zls [1] https://zig.news/jarredsumner/setting-up-visual-studio-code-...


I don't think it is that hard to learn.

I'm not a professional programmer, this is all a hobby for me, and I managed to pick it up and write an entire game with it.

Maybe you are just a bad programmer (nothing wrong with that).

You have to learn how to learn; not everybody can do that (and that's fine).

That being said, zig is a language in the making; some areas are rough, but that's to be expected.


> Maybe you are just a bad programmer (nothing wrong with that).

This was actually my impression after reading the post. A bad programmer making a lot of ill-informed complaints. Zig is an unfinished low-level language. Not suitable for bad programmers.


This is a pretty bullshit argument. People can have a really easy or hard time learning a language, depending on said language. It can be too far from what you know, or too similar but just different enough.

Also, it's a log of personal experience, not titled "an objective criticism of Zig". I've been a little interested in Zig for a while, and this was awesome to read. It hasn't made my interest any smaller or bigger; it's simply a description of one beginner's journey, and if the Zig people are smart they will analyze it and then decide which points are valid and which ones can be ignored. Neither ignoring everything nor accepting everything would be useful.

