Angle-grinder: Slice and dice logs on the command line (github.com/rcoh)
151 points by kqr 16 days ago | hide | past | favorite | 34 comments



See https://lnav.org for a powerful mini-ETL CLI power tool; it embeds SQLite, supports ~every format, has great UX and easily handles a few million rows at a time.


How does it compare to angle grinder?


I haven't tried angle-grinder, but the README says it's a functional programming language paired with a CLI, whereas lnav is a CLI util that embeds SQLite and supports SQL and regex filters. I'd guess that lnav has a lower learning curve, given there's no new language involved.


Angle-grinder is really nice and the successor of sumoshell (by the same author).

I maintain a list of tools like these as part of the docs for my own tool klp (https://github.com/dloss/klp), which I think has a few useful features that are not in angle-grinder, but is orders of magnitude slower, because it's implemented in pure Python instead of Rust.


For those who find this tool interesting, I recommend taking a look at Logdy.dev (https://logdy.dev) https://github.com/logdyhq/logdy-core Similar use case, different implementation. Disclaimer: I'm the author of that tool.


I see a lot of cool tools in the broadly defined dev space written in Rust. Best ad for Rust.


I'm not taking anything away from Rust; it is a great language for CLI developer tooling. But in fairness, you'll find a plethora of tooling written in all popular languages these days. This is also true for a great many "uncool" languages, like Perl.

I also think newer languages benefit from the fact that CLI developer tooling is a great "hello world" type project to work on. Such projects solve a very real problem that someone starting out in that language wants to solve, while also often being a simple enough problem to learn that language with.


I don’t see many useful cli tools written in Java.


I don’t work much in Java-land these days, but there definitely were plenty of CLI tools when I did.


I've used plenty of Java CLI tools for batch processes that take ages to run; having something launch fast or run interactively is a different story. No hate on Java, the JVM is just not that fast at launching.


The language something is written in tells you a lot.

For example, for something written in Go you might expect signals to be mishandled, colour escape codes emitted even when piping to another command, and non-GNU command-line parsing.

It's not inherent to the language; it's more that the people who write the libraries often don't know about UNIX and the system calls they should be using.


Ironically, neither of your examples requires understanding syscalls in low-level languages like C either.

In fact, I don’t think your average developer should need to understand syscalls to write applications. That kind of stuff should be abstracted away. Plus, if you want to write good applications for Linux/UNIX, that might mean using platform-specific syscalls rather than sticking with POSIX, which would be a maintainability nightmare if it weren’t already abstracted. Furthermore, some platforms, like macOS, don’t even guarantee binary compatibility for syscalls between releases. So you’re not even supposed to make syscalls directly on Darwin.

So with that in mind, application developers are usually encouraged to stick with their respective language runtime as the OS interface — and even then, a lot of language runtimes still just hook into libc rather than making syscalls directly themselves.

To come to your examples specifically, even in C you wouldn’t directly call ioctl on your fd to check if it’s a TTY. Instead you’d use isatty() in libc. (In case anyone doesn’t follow, this is to check if stdout is being piped or a tty, and thus whether to output or suppress human readable formatting like ANSI escape sequences).
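(For the curious: the same check is exposed one layer up in the shell itself, via the `-t` test. A minimal sketch of the suppress-colours-when-piped pattern, with the messages made up for illustration:)

```shell
# Rough equivalent of isatty(1): test whether fd 1 (stdout) is a terminal.
if [ -t 1 ]; then
  printf '\033[32m%s\033[0m\n' "stdout is a TTY: colour output"
else
  printf '%s\n' "stdout is a pipe or file: plain output"
fi
```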

Your point about GNU argument parsing is also off. GNU style is a convention, and it’s only officially supported on GNU/Linux. You talk about understanding UNIX while referencing conventions that have nothing to do with UNIX!

Lastly, I’m not sure why you singled out Go when there are a lot of advanced developer tools written in it, from pretty much everything by Hashicorp (Terraform, Vault, Consul, etc) through to Docker and Kubernetes. I don’t want to start a flame war about Go nor its developers, just saying it was yet another weird example in a comment already full of weird examples.


> In fact I don’t think your average developer should need to understand syscalls to write applications.

I'm not saying they should never use a wrapper and must use opcodes. I'm saying they should understand the difference between a function call and a system call.

> GNU is a convention and it’s only officially supported in GNU/Linux

Which is where 99.999% of go software runs, coincidentally.

> Lastly, I’m not sure why you singled out Go

Because the problems I listed usually happen to me only if the command is implemented in go. And since go in itself doesn't have any such limitations, it must be something with the libraries, and of course with people who write those libraries.


Previous discussion: https://news.ycombinator.com/item?id=27877956 (144 points | July 19, 2021 | 21 comments)

See also: https://github.com/Textualize/toolong (terminal application to view, tail, merge, and search log files)


For anyone who read the README and is now thinking that agrind is "like jq but with aggregations" — it's much more than that, and (at least in the tasks I use it for) really more complementary to pipeline transformer tools like jq.

Specifically, the "killer app" feature for agrind is that it's a live reporting tool. You can just throw a CSV file through agrind to get a nice one-shot aggregation out. But if you stream a live data source into it — think something like "your k8s ingress controller's access logs" — then agrind will grab your TTY; draw a table that fills it; and begin live-updating that table with the report, with the rows re-ordering to stay sorted as more data is received.

Want to see e.g. the top 10 highest-duration HTTP requests by path given some particular HTTP status code? You can just start streaming logs into it, and the report will gradually come into focus as it finds matching events. Hit ^C when the data has "stabilized" (i.e. when datapoint ranks are no longer jumping around), and you'll be left with the last-refreshed state of the table in your terminal scrollback, to use in further investigations.

agrind has jq-like manipulation tools, but honestly, I mostly still use jq for that part. I have no idea how to unwrap JSON list items in agrind (or if that's even possible), but if I want to do that, I'll just:

    http localhost:8001/v1/debug/inflight-requests | jq -c '.[].jsonPayload' | agrind '* | json | sum(dur) by path'
---

Bonus recommendation, as it's something I use together with jq and agrind: stern (https://github.com/rancher/stern) — for live tail(1)ing of the O(N) pods of a k8s Deployment, right from the cluster, without any external log multiplexer.

    stern ingress-nginx-controller -i some_path_frag -o raw 2>/dev/null | agrind '* | json | where status == 403 | count by client_addr' 2>/dev/null
Note the two instances of `2>/dev/null` here.

The first one is to remove the "connected to pod XYZ" meta info from stern(1) output that would muck up the agrind table.

The second one, though, is more interesting: it's there because unlike jq, agrind doesn't die when it hits an unparseable line! Instead, it just spits a parse error and continues. So you can have agrind consume a "dirty" data stream — e.g. a combined "structured JSON access log + plaintext error log" stream — and then just discard stderr, to silence the resulting errors printed from the failed attempts at parsing the error-log lines of the stream.


I mean, jq can aggregate... Maybe it should have more interesting aggregations built-in though.
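It can indeed; here's a minimal sketch of a group-and-count aggregation in plain jq (the field names and sample records are made up):

```shell
# Slurp (-s) the NDJSON stream into one array, group by path, emit counts.
printf '%s\n' \
  '{"path":"/a","status":200}' \
  '{"path":"/a","status":500}' \
  '{"path":"/b","status":200}' |
  jq -c -s 'group_by(.path) | map({path: .[0].path, count: length})'
# → [{"path":"/a","count":2},{"path":"/b","count":1}]
```

Noticeably wordier than agrind's `count by path`, but it works anywhere jq does.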


Cool tool!

I think log processing is something that is easy enough to do (stream lines through STDIN) and has specific enough requirements to individual teams that it's worth building one's own little pipe-in binary.

Can start by highlighting common log fields, but later add subcommands to aggregate, or analyze known problems, etc.
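The smallest useful version of such a pipe-in tool can be a one-liner; a sketch that highlights error lines (the "ERROR" marker is an assumption about your log format):

```shell
# Print lines containing "ERROR" in red; pass everything else through untouched.
highlight_errors() {
  awk '/ERROR/ { printf "\033[31m%s\033[0m\n", $0; next } { print }'
}

printf 'INFO started\nERROR disk full\n' | highlight_errors
```

From there it's natural to graduate to a real binary with subcommands, as the parent suggests.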


When experimenting with its syntax, you may like to try using it "live" with: https://github.com/akavel/up


Another nice one: lnav (log navigator).


Nice! I shall give it a try. Looks like a very developer oriented tool.


Is the name a riff on a similar google-internal tool?


For those not familiar, Google had a tradition of choosing names of wood-processing tools for “logs” analysis:

- Sawzall: https://research.google/pubs/interpreting-the-data-parallel-...

- Dremel: https://research.google/pubs/dremel-interactive-analysis-of-...

- PowerDrill: https://research.google/pubs/processing-a-trillion-cells-per...


Before anyone gets any ideas, I’m sure it is possible, but angle grinders are most definitely not designed for cutting wood. They’re dangerous enough as it is when you’re using them correctly!



Chainsaw would be a more fitting name. I don’t think angle grinders are used to slice logs. Perhaps a bandsaw.


Not to be juvenile, but I firmly believe that "computer log processing" is much more comparable to sewage treatment than converting trees into lumber.

Maybe you can't get funding from people who name things after middle earth magic rocks if you call your system "sludge strainer" ?


> Not to be juvenile, but I firmly believe that "computer log processing" is much more comparable to sewage treatment than converting trees into lumber.

I now have another facet to add to my appreciation of "Log" from the Renn & Stimpy show.


Angle grinders are useful when working with pipes, though.


Slice, no. But look up "wood carving disc" or "wood shaping disc" for some angle-grinder related crimes against woodworking - lots of power, but you'd better know exactly what you're trying to accomplish. (Besides, after 3+ decades of command-line-enhanced log processing, you can assume that a fitting or obvious name has been used three or four times already :-)


There’s already a DFIR log tool named chainsaw: https://github.com/WithSecureLabs/chainsaw


"Sawmill"? https://en.wikipedia.org/wiki/Sawmill (or maybe "crosscut"?)


Not sure why "musl-compatible" is the default for the binary install. Musl is surely in the minority among Linux distros, even in 2024. I imagine it's best (when distributing) to target the convenience of the end user rather than of the dev team (since one could simply create a separate CI/CD channel that builds with a glibc-based base image).

Makes me wonder if the 'cargo install' method requires musl libc too. This is not stated, which makes it unclear, since it comes right after the former variant.


It does not matter, as the libc part is statically linked into the binary. This means that this project runs on any Linux kernel that musl supports, regardless of which libc the rest of the system uses.

Musl was likely picked because glibc doesn't support proper static linking.
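For what it's worth, reproducing such a static build yourself is a short cargo invocation (target triple assumed here to be x86_64; adjust for your machine):

```shell
# Sketch: build a fully static Rust binary against musl.
rustup target add x86_64-unknown-linux-musl
cargo build --release --target x86_64-unknown-linux-musl

# Sanity-check the result: `file` should report "statically linked",
# and `ldd` should say "not a dynamic executable".
file target/x86_64-unknown-linux-musl/release/agrind
```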


Most likely for Docker/container users using Alpine Linux or similar.



