SQL shows its age by having exactly the same problem.
Queries should start with the `FROM` clause; that way the entities involved can be resolved quickly, and a smart editor can help you write a sensible query faster.
The order should be FROM -> SELECT -> WHERE, since SELECT commonly gives names to columns, which WHERE will reference.
You could even avoid crap like `SELECT * FROM table`, and just write `FROM table` and have the select clause implied.
Never mind me, I'm just an old man with a grudge, I'll go back to my cave...
> It's written that way because it stems from relational algebra, in which the projection is typically (always?) written first.
It's inspired by a mish-mash of both relational algebra and relational calculus, but the reason why SELECT comes first is because authors wanted it to read like English (it was originally called Structured English Query Language).
You can write the relational algebra operators in any order you want to get the result you want.
> You can write the relational algebra operators in any order you want
Ultimately, yes, you can express relational algebra in any notation that gets the point across, but the parent is right that
π₁(R)
is what is commonly used.
(R)π₁
not so much. Even Codd himself used the former notation style in his papers, even though he settled on putting the relation first in his query language.
I was explaining why it is the way that it is. If you'd like your own version of a parser, here's Postgres' [0]. Personally, I really like SQL's syntax and find that it makes sense when reading it.
There was no argument about how much sense it makes. There was an argument for improving readability by placing the table names first.
Lots of people “like” things because they are familiar with them. And that’s a fine enough reason. But if you step out of your zone of familiarity, can you find improvements? Are you willing to forgo any prejudice you may possess to evaluate other suggestions?
Just a little willingness to see another perspective is all anyone asks.
Yes, C#'s DSL that compiles to SQL (LINQ-to-SQL) does the same thing, putting `from` before the other clauses, precisely so that IDE code completion can offer fields while you type the other clauses.
I do agree that it is about time SQL had a variant starting with FROM, and it shouldn't be that hard to support; it feels like unwillingness to improve the experience.
Kusto is so much better than it has any right to be! Normally I'd run a mile from a cloud-provider-specific programming language that can't be used elsewhere, but it really is quite nice! (There are some weird quirks, but a ton fewer than I'd have thought.)
My main caveat here is that often the person starting a select knows what they want to select before they know where to select it from. To that end, having autocomplete for the sources of columns is far far more useful than autocomplete for columns from a source.
I will also hazard a guess that the total number of columns most people would need autocomplete for are rather limited? Such that you can almost certainly just tab complete for all columns, if that is what you really want/need. The few of us that are working with large databases probably have a set of views that should encompass most of what we would reasonably be able to get from the database in a query.
Seems nonsensical. Column names have no meaning without the table. Table is the object, columns are the properties. What language goes Property.Object? All popular languages have this wrong?
Anytime you store a single property for all objects in an array or hash table, so, Fortran, BASIC, APL, Numpy, R, awk, and sometimes Perl. Parallel arrays are out of style but certainly not unheard of. They're the hot new thing in games programming, under the name "entity-component-system".
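A toy sketch of the parallel-arrays idea in Python (all names here are invented), where access naturally reads property-first:

```python
# Structure-of-arrays: one array per property, indexed by entity id,
# so access reads property-first: ages[eid] rather than entities[eid].age.
names = ["ada", "bob", "eve"]
ages = [36, 41, 29]

def birthday(eid):
    # "Property.Object" order: the array (property) comes before the index (object).
    ages[eid] += 1

birthday(1)
assert ages == [36, 42, 29]
```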
Even C when `property` is computed rather than materialized. And CLOS even uses that syntax for method invocation.
The only language I can think of that uses field(record) syntax for record field access is AT&T-syntax assembly.
I'm assuming you are largely just not thinking this one through? We are not modeling the domain, we are describing some data we want to select. Without knowing how it is modeled, I can give a brief "top line" for expected select statements on many data models. "I want average_weather, year, zip_code", "I want year, college_name, degree_name, median_graduate_salary, mean_graduate_salary", "..."
I don't think it is tough to describe many many reports of data that you would want in this way. Is it enough for you to flat out get the answer? No, of course not. But nor is it enough to just start all queries at what possible tables you could use as a starting point.
How would you even know it's possible to get the data before you've chosen a table?
Your example is also not a complete select statement, you would need to go back and add the actual aggregate functions, and oops the zip_code column was actually called zip, so we need to remap that as well. You can almost never finish a select statement before you have inspected the tables, so why not just start there immediately?
Often times you don't know it is possible to get some data without consulting the table? Worse, you probably need to consult the views that have been made by any supporting DB team before you go off and build your own rollups. Unless you like not taking advantage of any aggregate jobs that are running nightly for reporting.
And to be clear, I'm all for complaining about the order of a select statement, to a very large degree. It is done in such a way that you have to jump back and forth as you are constructing the full query. That can feel a bit painful.
However, I don't know that it is just the SELECT before FROM that causes this jumping back and forth and fully expect that you would jump around a fair bit even with the FROM first. More, if I am ever reworking a query that the system is running, I treat the SELECT as the contract to the application, not the FROM.
There is a bit of "you should know the database before you can expect to send off a good query", but that really cuts to any side of this debate? How do you know the tables you want so well, but you don't know the columns you are going to ask for?
> How do you know the tables you want so well, but you don't know the columns you are going to ask for?
Simply because there are a lot more columns than there are tables.
Of course, I sometimes forget the exact table name as well. However, this is mostly not an issue as the IDE knows all table names before anything has been entered. By simply entering `FROM user`, the autocomplete will list all tables that contain `user`, and I can complete it from there. I cannot do the same with the column selection unless I first write the FROM part. And even if I do know exactly which columns and tables I want I would still want autocomplete to work when creating the SELECT statement. Rather than typing `zip_code`, I can most likely just type `z<TAB>`.
That is why 99% of my select queries start as `SELECT * FROM mytable`, and then I go back to fill in the select list. And it's not just me; all the colleagues I've worked with do the exact same thing.
For larger, more complicated queries, you'll have to go back and forth a lot as you join tables together, that is unavoidable, but 80-90% of my queries could be finished in one go if the FROM part came first.
But "more columns than there are tables" is also why I put the thing about, "the total number of columns most people would need autocomplete for are rather limited?" I could see an argument on how this mattered back at the origin of SQL, but today? It is entirely conceivable that you can autocomplete the column names with it updating the from as necessary for the columns you have selected. You could then tab to the from and pick alternative tables with natural joins prefilled between the tables. With very minimum effort.
I actually seem to recall this was done in some of the early dedicated SQL tools. It is rather amusing how much we handicap ourselves by not using some things older tools built.
Maybe in this case you know the same naming conventions are enforced across all tables. But in general it’s difficult to know the exact column name without looking it up first.
> How did you know it was zip_code and not ZipCode?
I think you're missing the point; you start off with your goal, regardless of the column name. No one has the goal "we want columns from this specific table"; it's always "we want these columns in the final output".
select name, min(price) from product join product_price on product.id = product_price.product_id
where you start with "I need a product name and minimum price" before thinking about where they come from?
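As a sketch, here is that query run end to end via Python's `sqlite3` (the schema and data are invented, and a GROUP BY is added since standard SQL requires one alongside the aggregate):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_price (product_id INTEGER, price REAL);
    INSERT INTO product VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO product_price VALUES (1, 9.99), (1, 7.50), (2, 19.99);
""")
rows = con.execute("""
    SELECT name, MIN(price)
    FROM product
    JOIN product_price ON product.id = product_price.product_id
    GROUP BY name
""").fetchall()
```

The point stands either way: the SELECT list was writable before the schema existed, but it couldn't be validated until the FROM was known.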
The more one uses SQL, the more you think about what you want to achieve versus how (which comes after): in a sense, the SELECT is your function's return type declaration, and I frequently start my functions with the declaration.
Likewise when skimming code, having the attributes on the left and details on the right makes it much easier to see the data flow in the larger program - it's the same order as "var foo = bar()".
Can you help me understand this situation? Almost always, what I want to select is downstream of what I am selecting from; I can't write anything in SELECT until I understand the structure/columns available in FROM.
"I want 'first_name, last_name, and average(grade)' for my report." That is easy enough to state without knowing anything about how the data is normalized.
Back when I worked to support a data science team, I actually remember taking some of their queries and stripping everything but the select so that I could see what they were trying to do and I could add in the correct parts of the rest.
I don't think anyone would ever say they want "id, id, id"? They would say they want "customer_id, order_id, item_id" or some such. You would then probably say, "we just use 'id' for the id of every item..." but many places that I've worked actually explicitly didn't do that for essentially this reason. Natural joins on "foo_id" "just work" if you don't give every table the same "id" column.
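A minimal sketch of that convention using Python's `sqlite3`, with invented tables: because the key is named customer_id in both tables (rather than a bare id), NATURAL JOIN picks it up automatically:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, customer_name TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 9.99);
""")
# NATURAL JOIN matches on the shared column name, customer_id.
rows = con.execute(
    "SELECT customer_name, total FROM customer NATURAL JOIN orders"
).fetchall()
```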
Current SELECT syntax does allow one to "SELECT user.id as user_id, product.id as product_id..." which can then even autocomplete the FROM for you from your query "declaration" (a-la function declaration, in particular its return type).
This assumes I have all the columns memorized but not all the tables?
Even in your example, first and last could refer to student or teacher. But presumably you know you're looking for student data before knowing the exact columns.
No, this assumes you know what you want for a query. Which, seems largely fair?
Like, how would you send this question to someone? Or how would you expect it to be sent to you? If your boss doesn't tell you from what table they want some data, do you just not answer?
And sure, there could be ambiguities here. But these are not really fixed by listing the sources first? You would almost certainly need to augment the select to give the columns names that disambiguate for their uses. For example, when you want the teacher and the student names, both.
And this is ignoring more complications you get from normalized data that is just flat out hard to deal with. I want a teacher/student combination, but only for this specific class and term, as an example. Whether you start the from at the student, the class, or the term rosters feels somewhat immaterial to how you want to pull all of that data together for why ever you are querying it.
If my boss asks me for a zip code, I'm going to ask "for what?"
If they ask for "address for a customer" I can go to the customer table and look up what FKs are relevant and collect all possible data and then narrow down from there.
I'd assume they would ask for "aggregate sales by month to zip codes," or some such. Which, you'd probably get from a reporting table, and not bother trying to do the manual aggregate over transactional data. (That is, OLAP versus OLTP is a very real divide that you can't wave away.)
Realistically, I strongly suspect you could take this argument either direction. If you have someone making a query where they are having to ask "what all tables could I start from?" you are in for some pain. Often the same data is reachable from many tables, and you almost certainly have reasons for taking certain paths to get to it. Similarly, if they should want "person_name", heaven help them.
Such that, can you contrive scenarios where it makes sense to start the query from the from clause? Sure. My point is more that you almost certainly have the entire query conceptualized in your mind as you start. You might not remember all of the details on how some things are named, but the overall picture is there. Question then comes down to if one way is more efficient than the other? I have some caveats that this is really a thing hindered by the order of the query. We don't have data, of course, and are arguing based on some ideas that we have brought with us.
So, would I be upset if the order was reversed? Not at all. I just don't expect that would actually help much. My memory is using query builders in the past where I would search for "all tables that have customer_id and order_id in them" and then, "which table has customer_id and customer_address" and then... It was rarely (ever?) the name of the table that helped me know which one to use. Rather, I needed the ones with the columns I was interested in.
Your approach only works with massive assumptions about the structure of the data or a very simplistic data structure.
SELECT statements don't just use table names; they can use aliases for those table names, views, subqueries, etc.
The FROM / JOIN blocks are where the structure of the data you are selecting from is defined. You should not assume you understand what a SELECT statement means until you have read those blocks.
I can define the return format of my query in the SELECT statement, then adapt the data structure in the FROM block using subselects, aliases etc — all to give me the shape desired for the query.
If you've ever done complex querying with SQL, you'd know that you'd go back and forth on all parts of the query to get it right unless you knew the relations by heart, regardless of the order (sometimes you'll have to rework the FROM because you changed the SELECT is the point).
This was a historical decision because SQL is a declarative language.
I was confused for too long, I want to admit, about the SQL order:
FROM/JOIN → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
As a self-taught developer, I didn't know what I was missing, but now the mechanics seem clear, and if somebody really needs to handle SELECT with given names, then they should probably use a CTE:
    WITH src  AS (SELECT * FROM sales),
         proj AS (SELECT customer_id, total_price AS total FROM src),
         filt AS (SELECT * FROM proj WHERE total > 100)
    SELECT * FROM filt;
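That pipeline runs as written; here is a quick check against SQLite via Python's `sqlite3`, with invented sample data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (customer_id INTEGER, total_price REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [(1, 50.0), (2, 150.0), (3, 200.0)])
rows = con.execute("""
    WITH src  AS (SELECT * FROM sales),
         proj AS (SELECT customer_id, total_price AS total FROM src),
         filt AS (SELECT * FROM proj WHERE total > 100)
    SELECT * FROM filt
""").fetchall()
# Only customers 2 and 3 clear the total > 100 filter.
```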
> This was a historical decision because SQL is a declarative language.
What do you mean? ALPHA (the godfather of declarative database languages, created by Codd himself) and QUEL, both of which inspired SQL, put "FROM" first. Fun fact: SQL was originally known as SEQUEL, which was intended as a wordplay on being a follow-up to QUEL.
Another commenter makes a good case that SQL ended up that way because it was trying to capture relational algebra notation (i.e. π₁(R), where π~select, ₁~column, R~table). Yet relational algebra is procedural, not declarative. Relational calculus is the declarative branch.
Although the most likely explanation remains simply that SEQUEL, in addition to being fun wordplay, also stood for Structured English Query Language. In English, we're more likely to say "select the bottle from the fridge", rather than "from the fridge, select the bottle". Neither form precludes declarative use.
My other big dream would be allowing multiple WHERE clauses that would be semantically ANDed together because that's what would happen if you filtered a result set twice.
Obviously not, and bringing this up as if it's a gotcha just shows you aren't keeping up with the conversation. Try less to correct people and more to understand people.
It's about how humans think about it, not about how the computer executes it.
I read it as describing their preferred mental model for declaring a result set, which is different from describing their preferred behavior to produce it. This seems clear to me in wording and context; it’s also broadly consistent with how SQL is understood, and how it is generally implemented.
Because SQL is a whole family of separate languages grudgingly adopting new features introduced by the notoriously slow standardization committee.
BigQuery SQL and Spark SQL (and probably some others) have adopted pipelined syntax, DuckDB SQL simply allows you to write the query FROM-first.
It's true. I call authoring SQL "holographic programming" because any change I want to make frequently implies a change to the top and bottom of the query too; there's almost never a localized change.
C++ has this issue too due to the split between header declarations and implementations. Change a function name? You're updating it in the implementation file, and the header file, and then you can start wondering if there are callers that need to be updated also. Then you add in templates and the situation becomes even more fun (does this code live in a .cc file? An .h file? Oh, is your firm one of the ones that does .hh files and/or .hpp files also? Have fun with that).
Don’t know why Python gets so much love. It’s a painful language as soon as more than one person is involved. What the author describes is just the tip of the iceberg.
Failure to understand something is not a virtue. That it does get a lot of love strongly suggests that there are reasons for that. Of course it has flaws, but that alone doesn't tell us anything; only comparisons do. Create a comprehensive pro/con list and see how it fares. Then compare that to pro/con lists for other languages.
> That it does get a lot of love strongly suggests that there are reasons for that.
Reasons, sure, but whether those reasons correlate with things that matter is a different question altogether.
Python has a really strong on-ramp and, these days, lots of network effects that make it a common default choice, like Java but for individual projects. The rub is that those same properties—ones that make a language or codebase friendly to beginners—also add friction to expert usage and work.
> Failure to understand something is not a virtue.
This is what I want to say to all the Bash-haters that knee-jerk a "you should use a real language like Python" in the comments to every article that shows any Bash at all. It's a pet peeve of mine.
It's not so much a matter of exhaustively listing pros and cons, but more a matter of appropriateness to the desired goal or goals, IME. I seriously doubt that a comprehensive pro/con list can even be coherent. Is dynamic typing a pro or a con? Depends on whom you ask, and even in what decade. List comprehensions? Interpreted language? First-class OOP? Any cost-benefit analysis will be highly context-dependent.
> I'm not here to defend tiresome strawmen.
I won't point out that you already tried to (contradiction intended). Perhaps a more interesting discussion would result if we defaulted to a more collaborative[0] stance here?
> Of course it has flaws, but that alone doesn't tell us anything; only comparisons do.
Comparisons won't tell us anything. If Python were the only programming language in existence, that doesn't imply that it would be loved. Or, if we could establish that Python is technically the worst programming language ever created, that doesn't imply that it wouldn't be loved. Look at how many people in the world love other people who are by all reasonable measures bad for them (e.g. abusive). Love isn't rational. It is unlikely that it is possible for us to truly understand.
The same reason people are not flocking to the Lisps of the world: mathematical rigour and precision does not translate to higher legibility and understandability.
Python's list/dict/set comprehensions are equivalent to typed for loops: where everyone complains about Python being lax with types, it's weird that the one statement that guarantees a return type is now the target.
Yet most other languages don't have the "properly ordered" for loop, Rust included (it's not "from iter as var" there either).
It's even funnier when function calling in one language is compared to syntax in another (you can do function calling for everything in most languages, a la Lisp). Especially in the given example for Python: there is the built-in `map`, after all.
"the target"? The OP complaint is not that the statement exists, but that the output expression comes before the context that establishes its meaning. This is not just a problem for autocomplete, but people often report difficulties understanding complex comprehensions.
As for comprehensions themselves, ignoring that problem I find them a powerful and concise way to express a collection of computed values when that's possible. And I'm particularly fond of generator expressions (which you didn't mention) ... they often avoid unnecessary auxiliary memory, and cannot be replaced with an inline for loop--only with a generator function with a yield statement.
BTW, I don't understand your comment about types. What's the type of (x for x in foo()) ?
That one is a Generator[Any], and for the others it could be set[Any], list[Any] or dict[Any, Any]. You obviously don't get element type information, but you still do get the encapsulating type, which is better than an untyped for loop :)
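A small illustration of that point (checked dynamically here; a static checker would infer the corresponding container types with Any elements):

```python
import types

def foo():
    return [1, "two", 3.0]  # heterogeneous, so the element type degrades to Any

gen = (x for x in foo())     # inferred as Generator[Any, None, None]
lst = [x for x in foo()]     # list[Any]
st = {x for x in foo()}      # set[Any]
dct = {x: x for x in foo()}  # dict[Any, Any]

# The encapsulating type is always known, even when the elements are not.
assert isinstance(gen, types.GeneratorType)
assert isinstance(lst, list) and isinstance(st, set) and isinstance(dct, dict)
```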
I get that it's about how it's structured and ordered, but that is true for the "for..in" loops in every language as well: you first set the variable, and only then get the context — this just follows the same model as the for loops in the language, and it would be weird if it didn't.
"that is true for the "for..in" loops in every language as well"
No, not at all. The output expression is arbitrary ... it might be f(x, y, z) where all of those are set later. You're confusing the output expression with the loop variable, which is also stated in the comprehension and may or may not be the same as the output expression or part of it. "The same model as the for loops in the language", where the language includes Python, is the comprehension with the output expression moved from the beginning to the end. e.g., (bar(x) for x in foo()) is `for x in foo(): bar(x)`. More concretely: lst = [bar(x) for x in foo()] is functionally equivalent to lst = []; for x in foo(): lst.append(bar(x))
Again, "the output expression comes before the context that establishes its meaning." ... I thought that would be clear as day to anyone who is actually familiar with Python comprehensions.
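Spelling out that equivalence with placeholder names (foo and bar are invented):

```python
def foo():
    return range(3)

def bar(x):
    return x * 2

# Comprehension: the output expression bar(x) comes first, its context after.
lst = [bar(x) for x in foo()]

# Functionally equivalent imperative form: context first, output expression last.
lst2 = []
for x in foo():
    lst2.append(bar(x))

assert lst == lst2 == [0, 2, 4]
```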
Note that I am not disputing the order is "inversed": all I am saying is that there are other common language features in most languages where things don't flow the same way, yet nobody finds it a huge problem.
It's like discussion of RPN or infix for calculations: both do the job, one is more rigorous and clear with no grouping/parentheses, yet we manage just fine with infix operators in our programming languages (or maybe not, perhaps all bugs are due to this? :)).
Just like you state a variable and some operations on it early in a comprehension, you do the same in a for loop: you don't know the type of it.
As you are typing the for loop in, your IDE does not know what is coming in as a context being iterated over to auto-complete, for instance (eg. imagine iterating over tuples with "for k, v in some_pairs:" — your editor does not even know if unpacking is possible).
Basically, what I am saying is that comprehensions are similarly "bad" as for loops, except they are more powerful and allow more expression types early.
C/C++ allow even crazier stuff in the "variable's" place in a for loop. Rust allows patterns, etc.
Typing is mostly a nice addendum I mentioned, that's not the core of my point.
Python was established as a fun and sensible language that was usable and batteries-included at a time when everything else was either expensive or excruciating, and has been coasting in that success ever since. If you'd only coded in bash, C/C++, and late-'90s Java, Python was a revelation.
List comprehensions were added to the language after it was already established and popular, and imho were the first sign that the emperor might be naked.
Python 3 was the death of it, imho, since it showed that improving the language was just too difficult.
I find list and dict comprehensions are a lot less error prone and more robust than the “manual” alternatives.
I would say that unless you have a good reason to do so, features such as metaclasses or monkey patching would be top of the list to avoid in shared codebases.
> For a language where there is supposed to be only one way to do things
That's not what the Zen says, it says that there should be one -- and preferably only one -- obvious way to do it.
That is, for any given task, it is most important that there is at least one obvious way, but also desirable that there should only be one obvious way, to do it. But there are necessarily going to be multiple ways to do most things, because if there was only one way, most of them for non-trivial tasks would be non-obvious.
The goal of Python was never to be the smallest Turing-complete language, and have no redundancy.
If I have to write Python like Go, I'd rather just write Go. (Not disagreeing with you, but this is one of many reasons that Python is the least-favourite language that I use on a regular basis.)
I used to agree with this completely, but type annotations & checking have made it much more reasonable. I still wouldn't choose it for a large project, but types have made it much, much easier to work with others' python code.
Python with strict type checking and its huge stdlib is my favourite scripting language now.
I agree, it's not too bad if you enable strict Pyright checking in CI, and you use `uv` and you don't care remotely about performance.
That's quite a lot of ifs though. Tbh I haven't found anything significantly better for scripting. Deno is pretty nice, but TypeScript really has just as many warts as Python. At least it isn't so dog slow.
I've pretty much embraced PowerShell for scripting. The language is warty as hell and seems to be entirely made of sharp edges but I've gotten used to it and it does have a lot of excellent ideas.
Thankfully, uv makes this a lot better, but it'll still be a while before that percolates to the entire ecosystem, if ever.
There are real concerns about tying the language's future to a VC-backed project, but at the same time, it's just such an improvement on the state of things that I find it hard not to use.
Because everything that tries to fix it is just as painful in different ways.
I've had the displeasure of working in codebases using the style of programming the OP says is great. It's pretty neat. Until you get a chain 40 deep and you have to debug it. You either have to use language features, like `show` in PySpark, which don't scale when you need to trace a dozen transformations, or you go back to imperative-style loops so you can log what's happening where.
This hasn't been my experience, but we use the Google style guide, linters, and static type verification to cut down on the number of options for how to write the program. Python has definitely strayed from its "one right way to do a thing" roots, but in the set of languages I use regularly it gives about the same amount of issues as JavaScript (and far, far less than C++) regarding having to deal with quirks that vary from user to user.
It's my 18th language and I prefer it over the alternatives. If I need to write an app I'll use Swift. If I need to do some web work I'll switch to TypeScript. To get work done it's Python all the way.
I'd argue it's because the web is the dominant platform for many applications and no other language can offer a first-class experience. If WASM had direct access to the DOM and web APIs, and maybe a little more runtime support to lessen the bloat, I'd use something else.
JavaScript was a backend language long before the web was the dominant platform.
And one of the, admittedly many, reasons why web technologies like Electron and React Native exist is because it’s easier to find JavaScript developers vs Kotlin, Qt or whatever.
So you’re not wrong, but you’re also downplaying the network effect that led to the web becoming dominant. And part of that was because developers didn’t want to learn something new.
Actually I would say that was the turning point when the web started to become the dominant platform. Not the conclusion.
Before then it was Flash and native apps. Before then, smart phones weren’t common (eg the iPhone was just over a year old at that point) and the common idiom was to write native apps for Windows Mobile.
IE was still the dominant browser too; Firefox and WebKit were only starting to take over, and we were still a long way from IE being displaced.
Electron didn’t exist, so desktop apps were still native and thus Linux was still a second-class citizen.
It took until roughly 2015 for the web to become the dominant platform. But by 2005 you could already see the times were changing; it just took ages for the technology to catch up. It took the advent of V8, and thus Node and Electron, for that transition to “complete”.
> It wasn't until 2009 with the introduction of Node.js that JavaScript became a viable option for backend development.
JavaScript was used for backend development since the late 1990s via the Rhino engine (the backends wouldn't be pure JS generally but a mix of JS and Java; Rhino was a JS engine for the JVM with Java interop as a key feature.)
That would be nice if devs always wrote code sequentially, i.e. left to right, one character at a time, one line at a time. But the reality is that we often jump around, filling in some things while leaving other things unfinished until we get back to them. Sometimes I'll write code that operates on a variable, then a minute later go back and declare that variable (perhaps assigning it a test value).
Code gets written once and read dozens or hundreds of times. Code that can be read sequentially is faster and easier to read than code that requires jumping around.
Exactly. You only write code sequentially when it's a new file.
If I decide to add a new field to some class, I won't necessarily go to the class definition first; I'll probably write the code using that field, because that's where the IDE was when I got the idea.
If I want to enhance some condition checking, I'll go through a phase where the piece of code isn't valid while I'm rearranging ifs and elses.
I agree with this, but it leads to another principle that too many languages violate - it shouldn't fail to compile just because you haven't finished writing it! It should fail in some other non-blocking way.
But some languages just won't let you do that, because they put in errors for missing returns or unused variables.
How is it supposed to compile if you've written something syntactically invalid? You can make the argument that the compiler could interpret it in some (perhaps even arbitrary) valid way, but that's almost worse: rather than being chided with compiler warnings, you now end up with code that compiles but executes indeterminately.
Well I don't literally mean finished as in you haven't finished typing it. Although that's possible too - whitespace languages like Python tend to just work since they don't have end brackets, I suppose.
But you could have a compiled language where errors were limited to the function when possible, like by emitting asserts.
There are approaches to ensure you always or almost always have syntactic validity, e.g. structured editing or error-correcting parsing, though of course it's true that the more permissive the system is, the more tortured some of the syntactic 'corrections' must be in extreme cases. The approach we're taking with http://hazel.org (which we approximate but don't fully succeed at yet) is: allow normal-ish typing, always correct the code enough for it to be run/typechecked, but insert placeholders/decorations to telegraph the parse structure that we're using in case of errors or incompleteness.
I feel that's a use case for compiler flags that convert warnings to errors or vice-versa, where the same source-code can either be "good enough to experiment with" or "not ready for release" depending on the context.
Reading through the article, the author makes the argument for the philosophy of progressive disclosure. The last paragraph brings it together and it's a reasonable take:
> When you’ve typed text, the program is valid. When you’ve typed text.split(" "), the program is valid. When you’ve typed text.split(" ").map(word => word.length), the program is valid. Since the program is valid as you build it up, your editor is able to help you out. If you had a REPL, you could even see the result as you type your program out.
In the age of Copilot and agent coders I'm not so sure how important the ergonomics still are, though I dare say writing an LSP would certainly make one happy with the argument.
Some IDEs provide code templates, where you type some abbreviation that expands into a corresponding code construct with placeholders, followed by having you fill out the placeholders (jumping from one to the next with Tab). The important part here is that the placeholders’ tab order doesn’t need to be from left to right, so in TFA’s example you could have an order like
{3} for {2} in {1}
which would give you code completion for {3} based on the {1} and {2} that would be filled in first.
There is generally a trade-off between syntax that is nice to read vs. nice to type, and I’m a fan of having nice-to-read syntax out of the box (i.e. not requiring tool support) at the cost of having to use tooling to also make it nice to type.
This is not meant as an argument for the above for-in syntax, but as an argument that left-to-right typing isn’t a strict necessity.
The consensus here seems to be that Python is missing a pipe operator. That was one of the things I quickly learned to appreciate when transitioning from Mathematica to R. It makes writing data science code, where the data are transformed by a series of different steps, so much more readable and intuitive.
I know that Python is used for many more things than just data science, so I'd love to hear if in these other contexts, a pipe would also make sense. Just trying to understand why the pipe hasn't made it into Python already.
The pipe operator in R (really, tidyverse R, which might as well be its own language) is one of its "killer apps" for me. Working with data is so, so pleasant and easy. I remember a textbook that showed two ways of "coding" a cookie recipe:
pandas and polars both have `pipe` methods available on dataframes. You can method-chain to the same effect. It's considered best practice in pandas, as you're hopefully not mutating the initial df.
The pipe operator uses what comes before as the first argument of the function. This means in R it would be:
result <- df
|> fun1(arg1=1)
|> fun2(arg2=2)
Python doesn't have a pipe operator, but if it did it would have similar syntax:
result = df
|> fun1(arg1=1)
|> fun2(arg2=2)
In existing Python, this might look something like:
result = pipe(df, [
(fun1, 1),
(fun2, 2)
])
(Implementing `pipe` would be fun, but I'll leave it as an exercise for the reader.)
Edit: Realized my last example won't work with named arguments like you've given. You'd need a function for that, which starts looking awfully similar to what you've written:
result = pipe(df, [
step(fun1, arg1=1),
step(fun2, arg2=2)
])
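A minimal sketch of those hypothetical `pipe` and `step` helpers (the names and signatures are invented for illustration, not from any library):

```python
def step(fn, *args, **kwargs):
    # Bind the extra arguments now; the piped value is supplied
    # later as the first positional argument.
    return lambda value: fn(value, *args, **kwargs)

def pipe(value, steps):
    # Thread `value` through each step, left to right.
    for s in steps:
        value = s(value)
    return value
```

For example, `pipe(5, [step(lambda v, n: v + n, n=3), step(lambda v, n: v * n, n=2)])` threads 5 through addition and then multiplication.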
Python supports a syntax like your first example by implementing the appropriate magic method for the desired operator and starting the chain with that special object. For example, using just a single pipe: https://flexiple.com/python/python-pipeline-operator
The functions with extra arguments could be curried, or done ad hoc like `lambda v: fun1(v, arg1=1)`.
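A small sketch of that magic-method approach, using a hypothetical `P` wrapper and the reflected `__ror__` hook, so the chain can start from a plain value:

```python
class P:
    # Wraps a function plus extra arguments; `value | P(fn, ...)`
    # calls fn(value, ...) via the reflected `|` operator, which
    # Python tries when the left operand's `|` declines.
    def __init__(self, fn, *args, **kwargs):
        self.fn, self.args, self.kwargs = fn, args, kwargs

    def __ror__(self, value):
        return self.fn(value, *self.args, **self.kwargs)
```

With this, the earlier example becomes `result = df | P(fun1, arg1=1) | P(fun2, arg2=2)`.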
I haven't used R in forever, but is your `.` placeholder actually necessary? From my recollection of the pipe operator, the value being piped is automatically passed as the first argument to the next function. That may have been a different implementation of a pipe operator, though.
On the other hand, Python does have "from some_library import child_module" which is always nice. In JS we get "import { asYetUnknownModule } from SomeLibrary" which is considerably less helpful.
> While the Python code in the previous example is still readable, it gets worse as the complexity of the logic increases.
This bit is an aside in the article but I agree so much! List comprehensions in python are great for the simple and awful for the complex. I love map/reduce/filter because they can scale up in complexity without becoming an unreadable mess!
This is almost FP vs OOP religious war in disguise. Similar to vim-vs-emacs ... where op comes first in vim but selection comes first in emacs.
If you design something to "read like English", you'll likely get verb-first structure - as embodied in Lisp/Scheme. Other languages like German, Tamil use verbs at the end, which aligns well with OOP-like "noun first" syntax. (It is "water drink" word for word in Tamil but "drink water" in English.) So Forth reads better than Scheme if you tend to verbalize in Tamil. Perhaps why I feel comfy using vim than emacs.
Neither is particularly better or worse than the other and tools can be built appropriately. More so with language models these days.
The author picked quite a convenient example to show the superiority of methods/lambdas.
I prefer list/set/dict comprehensions any day. They're more general, don't require knowing a myriad of different methods (which may not exist for all collections; PHP and JS are especially bad with this), and are easily extendable to nested loops.
Yes, it could be `[for line in text.splitlines() if line: for word in line.split(): word.upper()]`. But it is what it is. BTW I bet the Rust variant would be quite elaborate.
I'm a big fan of Python syntax, but really comprehensions don't make any sense to me efficiency wise (even for readability). Python syntax would become perfect with filter() and map() :')
> I prefer list/set/dict comprehensions any day. It's more general, doesn't require to know a myriad of different methods
It's the opposite, your knowledge of the standard set of folding algorithms (maps, filters, folds, traversals) is transferable almost verbatim across a wide range of languages: https://hoogletranslate.com/?q=map&type=by-algo
A minor corollary to this is that, as the user types, IDEs should predictably try to make programs valid – e.g. via structured editing, balancing parens in Lisp like paredit, etc.
This moves language design responsibility to the tool from the language itself. It might be okay to not hurt language elegance (e.g. lisp syntax), but in general I expect language to be convenient regardless of dev environment.
Autocomplete-oriented programming optimizes for writing code. I don't think that's a good route to go down. Autocomplete is good for spewing out a large volume of code, but is that what we want to encourage?
I'd much rather optimize for understanding code. Give me the freedom to order such that the most important ideas are up front, whatever the important details are. I'd much rather spend 3x the time writing code of it means I spend half the time understanding it every time I return to it in the future.
I miss the F# pipe operator (https://learn.microsoft.com/en-us/dotnet/fsharp/language-ref...) in other languages. It's so natural to think of function transform pipelines. In other languages you have to keep going to the left and prepend function names, and to the right to add additional args, parens etc ...
I've had to migrate to mostly Python for my work, and this is the thing I miss the absolute most from R (and how it works so seamlessly with the tidyverse)
Based on Haskell do-notation, Firefox's unfortunately withdrawn array comprehensions, and C# LINQ, I've been thinking about this:
[for b in c
let d = f(b)
for e in g if h else m
where p(d, e): b, d, e]
as alternative syntax to Python's
((b, d, e)
for b in c
for d in [f(b)]
for e in (g if h else m)
if p(d, e))
This solves five problems in Python listcomp syntax:
1. The one this article is about, which is also a problem in SQL, as juancn points out.
2. The discontinuous scope problem: in Python
[Ξ for x in Γ for y in Λ]
x is in scope in Ξ and Λ but obviously not Γ. This is confusing and inconsistent.
3. The ambiguity between conditional-expression ifs and listcomp-filtering trailing ifs, which Python solves by outlawing the former (unless you add extra parens). This is confusing when you get a syntax error on the else, but there is no non-confusing solution except using non-conflicting syntax.
4. let. In Python you can write `for d in [f(b)]` but this is inefficient and borders on obfuscated code.
5. Tuple parenthesization. If the elements generated by your iteration are tuples, as they very often are, Python needs parentheses: [(i, c) for i, c in enumerate(s) if c in s]. That's because `[i, c` looks like the beginning of a list whose first two items are i and c. Again, you could resolve these conflicting partial parses in different ways, but all of them are confusing.
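On point 4, Python 3.8's walrus operator offers a stdlib alternative to the `for d in [f(b)]` trick, albeit only when the binding can be tucked into a filter (a sketch with made-up data):

```python
words = ["hi", "hello", "hey"]

# The "let" trick: bind n = len(w) via a single-element iterable.
out1 = [(w, n) for w in words for n in [len(w)] if n > 2]

# Python 3.8+: a named expression avoids the dummy loop,
# at the cost of tangling the binding into the filter clause.
out2 = [(w, n) for w in words if (n := len(w)) > 2]

assert out1 == out2 == [("hello", 5), ("hey", 3)]
```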
In Rust, for..in loops have the same problem: you don't know the type and shape of the thing you are iterating on before you get to the "in" part. "Luckily", it does not allow you to call any methods on the object before you do.
Similarly, Python list/dict/set comprehensions are a form of for-loop syntax sugar to easily create a particular structure. One can use the built-in map/filter functions to get exactly the same behavior as the Rust example.
If this was an all important text input microoptimization, we'd all be doing everything with pure functions like Lisp: yet somehow, functional languages are not the most popular even if they provide the highest syntax consistency.
That's an interesting idea: maybe instead of `for x, y in points: ...` (Python syntax) we should write `points do |x, y| ...` so the IDE has a maximum amount of type inference information available? That would also suggest writing variable declarations with the variable at the end, as in Forth or Levien's Io. 80 CONSTANT COLUMNS or `80 -> columns;`.
I had a similar thought a few years ago with an Advent of Code problem for which my solution in python might have been
max(map(sum, input_list.split(None)))
To decipher this the eye has to jump to the middle of the line, move rightwards, then to the left to see the "map" then move right again to see what we are mapping and then all the way to the beginning to find the "max".
The author would probably suggest rust's syntax* of
> Though neither python nor rust have such a nice `.split(None)` built in.
Sorry, I'm not sure I understand what `.split(None)` would do? My initial instinct is that it would return each character, i.e. `.chars()` in Rust or `list(s)` in Python.
>Sorry, I'm not sure I understand what `.split(None)` would do?
Reading the docs [0], it seems `.split(None)` splits the string on runs of whitespace and discards the empty strings - i.e., it returns the whitespace-separated words, the same as calling `.split()` with no arguments.
It was intended to split a list of `int|None` into its non-none stretches. Much like how `string.split('x')` splits a string by matching the character 'x'
Gotcha! In python there is a `split_at` function for this in the more-itertools package, but I don't think there is a concise way to do it in the stdlib.
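For the record, `itertools.groupby` from the stdlib gets reasonably close, if less directly (a sketch; the function name is made up):

```python
from itertools import groupby

def split_at_none(xs):
    # Group consecutive elements by whether they are None,
    # then keep only the non-None stretches.
    return [list(g) for is_none, g in groupby(xs, key=lambda x: x is None)
            if not is_none]
```

For example, `split_at_none([1, 2, None, None, 3])` yields `[[1, 2], [3]]`.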
Honestly, I find using intermediate variables more readable than a long chain of function invocations where you have to keep track of intermediate results in your head:
While methods partially solve this problem, they cannot be used if you are not the author of the type. Languages with uniform function call syntax like Nim or D do this better.
So many engineers define their own narrow ergonomic values and then turn to the interwebs attempting to hammer their myopic belief into others as evangelical truth -- this reads as if the author believes there is a singular left-to-right process that a developer ought to adhere to. The author is oblivious to the non-linear compositional practices of other coders. It would be much more constructive to spend time creating specific tooling to aid your own process and then ask others if the values are also relevant to them. Not every developer embraces LSP, for example, as some are thwarted by its opinionated implementation. Not everyone is willing to give up local structure for auto-complete convenience.
It seems to me what the author desires is linguistic support for the Thrush combinator[0]. Another colloquial name for it is "the pipe operator."
Essentially, what this combinator does is allow expressing a nested invocation such as:
f(g(h(x)))
To be instead:
h(x) |> g |> f
For languages which support defining infix operators.
EDIT:
For languages which do not support defining infix operators, there is often a functor method named `andThen` which serves the same purpose. For example:
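Python has no built-in `andThen`, but a minimal sketch of the idea with a hypothetical `Fn` wrapper:

```python
class Fn:
    # Minimal function wrapper exposing andThen-style composition.
    def __init__(self, f):
        self.f = f

    def and_then(self, g):
        # Apply self first, then g: equivalent to g(f(x)).
        return Fn(lambda x: g(self.f(x)))

    def __call__(self, x):
        return self.f(x)
```

So `Fn(h).and_then(g).and_then(f)(x)` computes `f(g(h(x)))` while reading left to right.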
Funny how the article has no reference to C++ but still linking "Falling Into The Pit of Success". Did we all get our daily dose of subliminal messages from rust?
It's nice if after typing `file.` you see what functions you can use. But what if there end up being too many options? What's next, a fuzzy finding search box for all possible functions? Contextually relevant ones based on the code you've already written?
Not possible. There are more keystrokes that result in invalid programs (you are still writing the code!!) than keystrokes that result in a valid program.
More seriously, I do think that one consideration is that code is read more often than written, so fluidity in reading and comprehension seem more important to me than “a program should be valid after each keystroke.”
Typing goes in the same directions as reading, so I expect many benefits to apply to both. But I agree that readability is much more important than writeability.
Reading through this article only elicits a "WTF!?" from me.
Your editor can’t help you out as you write it.
You shouldn't need handholding when you're writing code. It seems like the whole premise of the author's argument is that you shouldn't learn anything about the language and programming should be reduced to choosing from an autocomplete menu and never thinking more than that. I've seen developers who (try to) work like this, and the quality of their work left much to be desired, to put it lightly.
From there you can eventually find fread, but you have no confidence that it was the best choice.
In C, you have to know ahead of time that fclose is a function that you’ll need to call once you’re done with the file.
It's called knowledge. With that sort of attitude, you're practically begging for AI to replace you.
No wonder people claim typing speed doesn't matter - they can barely think ahead one token, never mind a statement or function, much less the whole design! Ideally your typing speed should become the bottleneck and you should be able to code "blind", without looking at the screen but merely outputting the code in your mind into the machine as fast as humanly possible. Instead we have barely-"developers" constantly chasing that next tiny dopamine hit of picking from an autocomplete menu. WTF!?
When this descent into mediocrity gets applauded, it's no surprise that so much "modern" software is the way it is.
And we don't need navigation systems on our cars because we should know where we're going, right? ;)
Jokes apart, I think you're being too drastic. A good auto-complete is a nice feature, just like auto-indent, tab-complete, etc. Can it be abused? Sure. So what? Should we stop making it better for fear of abuse?
The reference to AI is far-fetched, too. We're talking about tools to help you with the syntax, not the semantics. I may forget whether the function is called read, fread, or file_read, but I know what its effect is.
And finally, consider that if something is easier to parse for an editor, it most probably is for a human too. Not a rule, not working in 100% of cases, but usually exposing the user to the local context before the concept itself helps understanding.
I think the most absurd syntax goes to Python again.
Python offers an "extended" form of list comprehensions that lets you combine iteration over nested data structures.
The irony is that the extensions have a left-to-right order again, but because you have to awkwardly combine them with the rest of the clause that is still right-to-left, those comprehensions become completely unreadable unless you know exactly how they work.
E.g., consider a list of objects that themselves contain lists:
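A hypothetical illustration with made-up data:

```python
orders = [
    {"customer": "ada", "items": ["keyboard", "mouse"]},
    {"customer": "bob", "items": ["monitor"]},
]

# The `for` clauses run left to right (outer loop first), yet the
# produced expression sits at the front, far from its innermost loop.
flat = [item for order in orders for item in order["items"]]
assert flat == ["keyboard", "mouse", "monitor"]
```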
I've taken to writing complex comprehensions like this over multiple lines (which I initially thought wasn't possible!). It's still a bit awkward, but the order of "fors" is the same as if one was writing nested for loops, which makes reasoning about it easier for me.
> The proper way to parse or construct nested list comprehensions as explained in pep 202 is to think like you're using normal for:
A syntax construct that requires you to think in a different syntax construct to understand it is not a good syntax construct.
> Everything stays the same except the "use" part that goes in the front (the rule also includes the filters - if).
Yeah, you consistently have to write the end first followed by the start and then the middle, it's just in most cases there's no middle so you naturally think you only have to flip the syntax rather than having to write it middle-ended like a US date.
In 1990s-born scripting languages, it makes sense that there are plenty of design choices that don't mesh well with static-analysis-driven autocompletion, because that was not at all part of the requirements for these languages at the time they were designed!
You don't need to be bothered with minute details such as syntax for counting string length anymore, you just need to know what you want to do. I mention this since OP is bringing up LSP's as an argument for why certain language's design is suboptimal.
"Count length of string s" -> LLM -> correct syntax for string-count for any programming language. This is the perfect context-length for an LLM. But note that you don't "complete the line", you tell the LLM what you want to have done in full (very isolated) context, instead of having it guessing.
The way people code with Python is by using its large ecosystem; few people use only the standard library. No one knows the entire API, so the more discoverability there is, the better.
I use stdlib whenever possible specifically because it’s huge, well-tested, and eliminates dependencies.
I don’t know all of its API, but I do read the docs periodically (not all of them, but I try to re-read modules I use a lot, and at least one that I don’t).
It’s more about learning the language’s library, which typically is too large to remember in detail, than about learning the language, which typically is small enough for that.
The complaints listed seem to focus on attributes / methods of a class. You can read Python’s docs, and see the methods available to a str type, for example.
To me, that’s learning the language. Learning its library would be more like knowing that the module `string` contains the function `capwords`, which can be used to - as the name suggests - capitalize (to title case) all words in a string. Hopefully, one would also know that the `str` class contains the method `upper`, so as not to confuse it with `string.capwords`.
This is the main reason I really like concatenative syntax for languages — this property is _enforced_ for programs (minus some delimited special cases, usually). It also neatly generalizes the special `self` argument so you can dispatch on the types of all the arguments.
> Suppose you have a FILE file and you want to get its contents. Ideally, you’d be able to type file. and see a list of every function that is primarily concerned with files. From there you could pick read and get on with your day.
> Instead, you must know that functions related to FILE tend to start with f, and when you type f the best your editor can do is show you all functions ever written that start with an f
Why do you think that this is a problem with C? No one is stopping your tools from searching for `fclose` by first parameter type when you write `file.`. Moreover, I know that CLion already does this.
> This is much more pleasant. Since the program is always in a somewhat valid state as you type it, your editor is able to guide you towards the Pit of Success.
Although the subtitle was “programs should be valid as they are typed”, it’s weakened to “somewhat valid” at this point. And yes, it is valid enough that tooling can help, a lot of the time (but not all) at full capability. But there’s also interesting discussion to be had about environments where programs are valid as they are typed. Syntactically, especially, which requires (necessary but not sufficient) either eschewing delimiters, or only inserting opening and closing delimiters together.
That's fine for doing algebra in pure functions, but what about destructive commands or interactive scenarios?
For example, using "rm" on the command line, or an SQL "delete". I would very much like those short programs to be invalid, until someone provides more detail about what should be destroyed in a way that is accident-resistant.
If I had my 'druthers, the left-to-right prefix of "delete from table" would be invalid, and it would require "where true" as a safety mechanism.
Disagree. In the first example the author seems to want something more like imperative programming, so the "loop" construct would come first. But then the assignment should come last. With the Python syntax you get the thing you're assigning first - near the equals sign - and then where it is selected from, along with any filtering criteria. It makes perfect sense. If you disagree that's fine; the whole post is an opinion piece.
It's not. The author gives objective reasons why Python's syntax is inferior – namely, that it makes IDE support in the form of discoverability and auto-completion more difficult.
And I'm sure there are people who program in Notepad or nano. If you want to develop software like it's the 80s again, go ahead; the rest of us appreciate at least basic IDE support.
Could you elaborate? AFAIK tacit programming tends to involve scrambling around composition, parens, and args, which makes left-to-right reading significantly harder for functions with arity greater than 2.
I find Java's method reference or Rust's namespace resolution + function as an argument much better than Haskell tacit-style for left-to-right reading.
One of the things I dislike about function notation is that in f(g(h())) execution order is right-to-left. I like OO partly because execution order is writing order ( h().g().f() )
In Clojure I love the threading macro which accomplishes the same: (-> (h) (g) (f))
In Haskell there's also dot (edit - my bad, not dot, pipe) syntax which allows composing functions left to right. I believe Nim also has it. Just as a few more examples.
Left-to-right is also much easier to wrap your head around. Just imagine the data going on a conveyor belt with a bunch of machines instead of having to wrangle a nested tree.
A not insignificant number of functional languages disagree. You often read functional code from the inside out, which basically means you're reading right to left as you move up function calls.
For the record though, "it's more readable" is a much better argument than "LSP mad"
Right-to-left is equivalent to left-to-right in this regard: you're still reading in one linear direction, as opposed to Python, where you have to read inside-out and revisit later parts of the line to correctly understand previous parts of the line.
Fair enough. And like I said, your argument is certainly better than the article's.
I'm not really a fan of list comprehensions; I usually just use for loops. It does seem consistent with Python's syntax, though. A for loop is `for item in items` and comprehensions have `item for item in items`.
len(list(filter(lambda line: all([abs(x) >= 1 and abs(x) <= 3 for x in line]) and (all([x > 0 for x in line]) or all([x < 0 for x in line])), diffs)))
This really isn’t fair on Python. Python simply isn’t designed for this style of functional programming. Plus you haven’t broken lines where you could. Rewrite it as a list comprehension and add line breaks, and turn the inner list comprehensions into generator expressions (`all([…])` → `all(…)`), and change `abs(x) >= 1 and abs(x) <= 3` to `1 <= abs(x) <= 3` (thanks, Jtsummers), and it’s much better, though it still has the jumping around noted, and I do prefer the functional programming approach. I’m just saying the presentation isn’t fair on Python.
len([line for line in diffs
if all(1 <= abs(x) <= 3 for x in line)
and (all(x > 0 for x in line) or all(x < 0 for x in line))])
(Aside: change the first line to `sum(1 for line in diffs` and drop the final `]`, and it will probably perform better.)
I also want to note, in the JS… Math.abs(x) instead of x.abs() (as seen in Rust).
And, because nerd sniping, two Rust implementations, one a direct port of the JS:
Means the same thing and tightens it up a bit, and reads better since it's indicating that you're testing if something is in a range more clearly.
EDIT: To add:
The filter, list construction, and len aren't needed either. It's just:
sum(map(predicate, diffs)) # this counts the number of elements in diffs which satisfy predicate, map is lazy so no big memory overhead
Or alternatively:
sum(predicate(diff) for diff in diffs)
The predicate is complex enough and used twice, so it warrants extraction to its own named function (or lambda assigned to a variable), but even if it were still embedded this form would be slightly clearer (along with adding the line breaks and removing the extra list generations):
sum(map(lambda line: all(1 <= abs(x) <= 3 for x in line)
and (all(x > 0 for x in line) or all(x < 0 for x in line)),
diffs))
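A runnable sketch of the fully extracted form, with made-up `diffs` data and a hypothetical `is_safe` name:

```python
def is_safe(line):
    # All steps have magnitude 1-3 and share a single direction.
    return (all(1 <= abs(x) <= 3 for x in line)
            and (all(x > 0 for x in line) or all(x < 0 for x in line)))

diffs = [[1, 2, 3], [1, -2, 3], [-1, -1, -2], [0, 1, 2]]
count = sum(is_safe(line) for line in diffs)
# count == 2: only [1, 2, 3] and [-1, -1, -2] qualify
```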
The author's argument is a strong argument against the way Lisp does it, at least for most code, as Lisp tends to put the verb to the left and nouns to the right so your IDE has no idea what nouns you are working with until you've picked the verb.
Pretty much every functional language is verb-initial, as are function calls as well as statements in most mainstream languages.
Verb-final languages like PostScript and Forth are oddballs.
Embedded verbs are a thing of course, with the largest contribution coming from arithmetic expressions with infix operators, followed by certain common syntax like "else" being in the middle of an "if" statement, followed by things like these Python comprehensions and <action> if <condition> syntactic experiments and whatnot.
> If you aren’t familiar with Rust syntax, |argument| result is an anonymous function equivalent to function myfunction(argument) { return result; }.
> Here, your program is constructed left to right. The first time you type line is the declaration of the variable. As soon as you type line., your editor is able to suggest available methods.
Yeah, having LSP autocomplete here does feel nice.
But it also makes the code harder to scan than Python. Quick readability at a glance seems like the bigger win than just better autocomplete.
> But it also makes the code harder to scan than Python. Quick readability at a glance seems like the bigger win than just better autocomplete.
It depends a lot on what you’re accustomed to. You get used to whichever style. Just like different languages use different sentence order: subject, object and verb appear in all possible orders in different languages, and their speakers get along just fine. There are some situations where one is clearly superior to the other, and vice versa.
`text.lines().map(|line| line.split_whitespace())` can be read loosely as “take text; take its lines; map each line, split it on whitespace”. Straightforward and matching execution flow.
`[line.split() for line in text.splitlines()]` doesn’t read so elegantly left-to-right, but so long as it’s small enough you spot the `for` token, realise you’re dealing with a list comprehension, and read it loosely from left to right as “we have a list made up of splitting each line, where lines come from text, split”. Execution-wise, you execute `text.splitlines()`, then `for line in`, then `line.split()`. It’s a bunch of left-to-rights embedded in a right-to-left. This has long been noted as a hazard of list comprehensions, especially the confusion you end up with with nested ones. Now you could quibble over my division of `for line in text.splitlines()` into two runs; but I think it’s fair. Consider how in Rust you get both `for line in text.split_lines() { … }` and `text.split_lines().for_each(|line| { … })`. Sometimes the for block reads better, sometimes .for_each() or .map() or whatever does. (But map(lambda …: …, …) never really does.)
Python was my preferred language from 2009–2013 and I still use it not infrequently, but Rust has been my preferred language ever since. I can say: I find the Rust version significantly easier to read, in this particular case. I think the fact there are two levels of split contributes to this.
It indeed doesn't look elegant; however, I've never seen a usage like this in my experience. Do you have any reference for where you might have seen this kind of usage?
This is not meant to be taken literally, I was making fun of how Rust often requires a lot of punctuation and thinking about details of memory allocation.
This is honestly why I love C++ ranges right now. The "pipe" syntax is a "left-to-right" of writing very powerful map/filter operations in contexts where I'd want a list comprehension but in a much more sensical (and for that matter customizable) order
I'm curious on how much of this is basically overcome by the new tools? As much as vibe coding annoys me, I can't argue against how good autocomplete from these LLMs can be.
Very human of us to spend billions of dollars and tons of electricity to bang our programming languages into shape since sane syntax is slightly uncomfortable the first 10 minutes you work with it.
I believe there are some strongly typed stack based languages where you really always do have something very close to a syntactically correct program as you type. But now that LLMs exist to paper over our awful intuitions, we're stuck with bad syntax like python forever.
On one level, I do prefer my code to be readable left to right and top to bottom. This, typically, means the big "narrative" functions up top and any supporting functions will come after they were used. Ideally, you could read those as "details" after you have understood the overall flow.
On another level, though, it isn't like this is how most things are done. Yes, you want a general flow that makes sense in one direction through text. But this often has major compromises and is not the norm. Directly to programming, trying to make things context free is just not something that works in life.
Directly to this discussion, I'm just not sure how much I care about small context-free parts of the code?
Tangential to this discussion, I oddly hate comprehensions in python. I have yet to get where I can type those directly. And, though I'm asking if LLM tools are making this OBE, I don't use those myself. :(
So far any "autocomplete" from an LLM has only served to insanely disrupt my screen with forty lines of irrelevant nonsense that cannot be turned off (the "don't suggest multiline autocompletes" option does not prevent LLM autocomplete from doing so), and it has been less useful than the non-LLM-based autocompletes, which I was massively impressed with because they were hyperlocal and didn't pretend that the niche code I'm writing is probably identical to whatever Google or Microsoft writes internally.
I've had some minor success with claude, but enabling the AI plugin in intellij has literally made my experience worse, even without using any AI interactions.
"This is bad because your editor can’t help you autocomplete it as you write it."
No, you are bad because you use an editor with autocomplete.
And it's not even debatable, it's like playing bowling with rails, or riding a bicycle with training wheels. Sure you can argue that you are more efficient at bowling and riding a bike with those, but you are going to be arguing alone, and it's much better to realize that python is one of the best languages at the moment and therefore one of the best languages ever, instead of being a nobody and complaining about a language because you are too encumbered by your own ego to realize that you are not as good a programmer as you thought.
Nothing wrong with being an amateur programmer or vibecoding or whatever, but if you come for the king you best not miss
Incidentally, Darklang had this built into the language/editor combo from the start. You can see it in the examples on our youtube, maybe this one: https://www.youtube.com/watch?v=NQpBG9WkGus
To make a long story short, we added features for "incomplete" programs in the language and tools, so that your program was always valid and could not be invalid. It was a reasonable concept, and I think could have been a game changer if AI didn't first change the game.
SQL shows its age by having exactly the same problem.
Queries should start with the `FROM` clause; that way the entities involved can be quickly resolved and a smart editor can aid you in writing a sensible query faster.
The order should be FROM -> SELECT -> WHERE, since SELECT commonly gives names to columns, which WHERE will reference.
You could even avoid crap like `SELECT * FROM table`, and just write `FROM table` and have the select clause implied.
Never mind me, I'm just an old man with a grudge, I'll go back to my cave...
PSQL (and PRQL) use this ordering, and a similar pipe/arrow notation has recently been added to BigQuery.
Check out the DuckDB community extensions:
[0]: https://duckdb.org/community_extensions/extensions/psql.html
[1]: https://duckdb.org/community_extensions/extensions/prql.html
In duckdb you can also start with `FROM .. SELECT ..` without using the PSQL extension.
But I haven't found a good editor plugin that is actually able to use that information to do completions :/ If anyone knows I'd be happy to hear it
And in elixir land we've had similar pipes and composition through ecto, for the major rdbms
It's written that way because it stems from relational algebra, in which the projection is typically (always?) written first.
>The order should be FROM -> SELECT -> WHERE, since SELECT commonly gives names to columns, which WHERE will reference.
Per the SQL standard, you can't use column aliases in WHERE clauses, because the selection (again, relational algebra) occurs before the projection.
> You could even avoid crap like `SELECT * FROM table`, and just write `FROM table` and have the select clause implied.
Tbf, in MySQL 8 you can use `TABLE <table>`, which is an alias for `SELECT * FROM <table>`.
> It's written that way because it stems from relational algebra
More likely because this order is closer to typical English use. SQL was designed to look like English, not relational algebra.
Interestingly, the inventor of relational algebra for database management put the "FROM" first in his query language: https://dl.acm.org/doi/pdf/10.1145/1734714.1734718
> It's written that way because it stems from relational algebra, in which the projection is typically (always?) written first.
It's inspired by a mish-mash of both relational algebra and relational calculus, but the reason why SELECT comes first is because authors wanted it to read like English (it was originally called Structured English Query Language).
You can write the relational algebra operators in any order you want to get the result you want.
> You can write the relational algebra operators in any order you want
Ultimately, yes, you can express relational algebra in any notation that gets the point across, but the parent is right that π₁(R) is what is commonly used; (R)π₁, not so much. Even Codd himself used the former notation style in his papers, even though he settled on putting the relation first in his query language.
> It's written that way because it stems from relational algebra,
A common misconception (that SQL is a realization of RA instead of barely based on it).
In RA, the order is in fact `Relation > Operator`
Unless I am grossly misunderstanding your notation, I have always seen RA written with the operator first. Some examples:
https://cs186berkeley.net/notes/note6/
https://web.wlu.ca/science/physcomp/ikotsireas/CP465/W1-Intr...
https://home.adelphi.edu/~siegfried/cs443/443l9.pdf
> It's written that way because it stems from relational algebra, in which the projection is typically (always?) written first.
Okay, so what?
We're not obligated to emulate the notational norms of our source material, and it is often bad to do so when context changes.
I was explaining why it is the way that it is. If you'd like your own version of a parser, here's Postgres' [0]. Personally, I really like SQL's syntax and find that it makes sense when reading it.
[0]: https://github.com/postgres/postgres/tree/master/src/backend...
> If you'd like your own version of a parser, here's Postgres' [0].
Funnily enough, if you pull up a version of Postgres' parser prior to the 1995 release, you'll find that it puts the relation first.
There was no argument about how much sense it makes. There was an argument for improving readability by placing the table names first.
Lots of people “like” things because they are familiar with them. And that’s a fine enough reason. But if you step out of your zone of familiarity, can you find improvements? Are you willing to forgo any prejudice you may possess to evaluate other suggestions?
Just a little willingness to see another perspective is all anyone asks.
Yes, C#'s DSL that compiles to SQL (LINQ-to-SQL) does the same thing, putting `from` before the other clauses, for the same reason: it allows the IDE's code completion to offer fields while you type the other clauses.
Kusto, the Azure query language for data analysis, uses that form with piping as well.
https://learn.microsoft.com/en-us/kusto/query/?view=microsof...
Also the LINQ approach in .NET.
I do agree that it is about time SQL had a variant starting with FROM, and it shouldn't be that hard to support. It feels like unwillingness to improve the experience.
Kusto is so much better than it has any right to be! Normally I'd run a mile at a cloud provider specific programming language that can't be used elsewhere, but it really is quite nice! (there are some weird quirks, but a ton less than I'd have thought)
I love working with KQL, it’s a very expressive query language.
My main caveat here, is that often the person starting a select knows what they want to select before they know where to select it from. To that end, having autocomplete for the sources of columns is far far more useful than autocomplete for columns from a source.
I will also hazard a guess that the total number of columns most people would need autocomplete for are rather limited? Such that you can almost certainly just tab complete for all columns, if that is what you really want/need. The few of us that are working with large databases probably have a set of views that should encompass most of what we would reasonably be able to get from the database in a query.
Seems nonsensical. Column names have no meaning without the table. Table is the object, columns are the properties. What language goes Property.Object? All popular languages have this wrong?
Anytime you store a single property for all objects in an array or hash table, so, Fortran, BASIC, APL, Numpy, R, awk, and sometimes Perl. Parallel arrays are out of style but certainly not unheard of. They're the hot new thing in games programming, under the name "entity-component-system".
Even C when `property` is computed rather than materialized. And CLOS even uses that syntax for method invocation.
The only language I can think of that uses field(record) syntax for record field access is AT&T-syntax assembly.
When languages use order property, object - it's usually because they treat the properties as functions.
name(object) makes as much sense as object.name
I'm assuming you are largely just not thinking this one through? We are not modeling the domain, we are describing some data we want to select. Without knowing how it is modeled, I can give a brief "top line" for expected select statements on many data models. "I want average_weather, year, zip_code", "I want year, college_name, degree_name, median_graduate_salary, mean_graduate_salary", "..."
I don't think it is tough to describe many many reports of data that you would want in this way. Is it enough for you to flat out get the answer? No, of course not. But nor is it enough to just start all queries at what possible tables you could use as a starting point.
How would you even know it's possible to get the data before you've chosen a table?
Your example is also not a complete select statement, you would need to go back and add the actual aggregate functions, and oops the zip_code column was actually called zip, so we need to remap that as well. You can almost never finish a select statement before you have inspected the tables, so why not just start there immediately?
Oftentimes you don't know it is possible to get some data without consulting the table. Worse, you probably need to consult the views that have been made by any supporting DB team before you go off and build your own rollups. Unless you like not taking advantage of any aggregate jobs that are running nightly for reporting.
And to be clear, I'm all for complaining about the order of a select statement, to a very large degree. It is done in such a way that you have to jump back and forth as you are constructing the full query. That can feel a bit painful.
However, I don't know that it is just the SELECT before FROM that causes this jumping back and forth and fully expect that you would jump around a fair bit even with the FROM first. More, if I am ever reworking a query that the system is running, I treat the SELECT as the contract to the application, not the FROM.
There is a bit of "you should know the database before you can expect to send off a good query", but that really cuts to any side of this debate? How do you know the tables you want so well, but you don't know the columns you are going to ask for?
> How do you know the tables you want so well, but you don't know the columns you are going to ask for?
Simply because there are a lot more columns than there are tables.
Of course, I sometimes forget the exact table name as well. However, this is mostly not an issue as the IDE knows all table names before anything has been entered. By simply entering `FROM user`, the autocomplete will list all tables that contain `user`, and I can complete it from there. I cannot do the same with the column selection unless I first write the FROM part. And even if I do know exactly which columns and tables I want I would still want autocomplete to work when creating the SELECT statement. Rather than typing `zip_code`, I can most likely just type `z<TAB>`.
That is why 99% of my select queries start as `SELECT * FROM mytable`, and then I go back to fill in the select statement. And it's not just me, all colleagues I've worked with do the exact same thing.
For larger, more complicated queries, you'll have to go back and forth a lot as you join tables together, that is unavoidable, but 80-90% of my queries could be finished in one go if the FROM part came first.
But "more columns than there are tables" is also why I put the thing about, "the total number of columns most people would need autocomplete for are rather limited?" I could see an argument on how this mattered back at the origin of SQL, but today? It is entirely conceivable that you can autocomplete the column names with it updating the from as necessary for the columns you have selected. You could then tab to the from and pick alternative tables with natural joins prefilled between the tables. With very minimum effort.
I actually seem to recall this was done in some of the early dedicated SQL tools. It is rather amusing how much we handicap ourselves by not using some things older tools built.
How did you know it was zip_code and not ZipCode?
Maybe in this case you know the same naming conventions are enforced across all tables. But in general it’s difficult to know the exact column name without looking it up first.
> How did you know it was zip_code and not ZipCode?
I think you're missing the point; you start off with your goal, regardless of the column name. No one has the goal "we want columns from this specific table", it's always "we want these columns in the final output".
I can’t think of an example where I knew the columns I wanted to select before I knew which table I wanted to select them from.
Did you ever not do things like
where you start with "I need a product name and minimum price" before thinking about where they come from? The more one uses SQL, the more you'll think about what you want to achieve vs how (which comes after): in a sense, the SELECT is your function return type declaration, and I frequently start my functions with the declaration.
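For the sake of illustration, a minimal sqlite3 sketch of that query shape (the `products` table and its columns are invented here), where the SELECT list states the desired outputs before the FROM/GROUP BY supply them:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (name TEXT, price REAL);
    INSERT INTO products VALUES ('widget', 3.0), ('widget', 2.0), ('gadget', 5.0);
""")

# "I need a product name and minimum price" -- the SELECT list is written
# first, like a return-type declaration, before deciding where it comes from.
rows = conn.execute(
    "SELECT name, MIN(price) AS min_price FROM products GROUP BY name ORDER BY name"
).fetchall()
print(rows)  # [('gadget', 5.0), ('widget', 2.0)]
```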
> where you start with "I need a product name and minimum price" before thinking about where they come from?
Sure, and pretty much every time the names I wrote up were not the ones in the table so that was a complete waste of time.
> the SELECT is your function return type declaration
That might be true if “select” only contained aliases, but that’s not the case at all, so it is complete nonsense.
Likewise when skimming code, having the attributes on the left and details on the right makes it much easier to see the data flow in the larger program - it's the same order as "var foo = bar()".
Can you help me understand this situation? Almost always what I want to select is downstream of what I am selecting from, I can't write anything in SELECT until i understand the structure/columns available in FROM.
"I want 'first_name, last_name, and average(grade)' for my report." That is easy enough to state without knowing anything about how the data is normalized.
Back when I worked to support a data science team, I actually remember taking some of their queries and stripping everything but the select so that I could see what they were trying to do and I could add in the correct parts of the rest.
Even assuming that this is a plausible way a user could think about it (that much can be understood), it shows why it's bad.
Think: I have 20 tables with the column `id`
"I want 'id,id,id'"
is bad UX, which is what's being argued here. When the syntax guides you, "I want 'FROM a: id'" is better.
I don't think anyone would ever say they want "id, id, id"? They would say they want "customer_id, order_id, item_id" or some such. You would then probably say, "we just use 'id' for the id of every item..." but many places that I've worked actually explicitly didn't do that for essentially this reason. Natural joins on "foo_id" "just work" if you don't give every table the same "id" column.
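A quick sqlite3 sketch of that naming convention (the tables here are hypothetical): because each table carries `customer_id` rather than a bare `id`, NATURAL JOIN has exactly one shared column name to match on:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 99.0);
""")

# NATURAL JOIN matches on the only common column name: customer_id.
rows = conn.execute(
    "SELECT name, order_id FROM customers NATURAL JOIN orders"
).fetchall()
print(rows)  # [('Ada', 10)]
```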
Current SELECT syntax does allow one to "SELECT user.id as user_id, product.id as product_id..." which can then even autocomplete the FROM for you from your query "declaration" (a-la function declaration, in particular its return type).
This assumes I have all the column memorized but not all the tables?
Even in your example, first and last could refer to student or teacher. But presumably you know you're looking for student data before knowing the exact columns.
No, this assumes you know what you want for a query. Which, seems largely fair?
Like, how would you send this question to someone? Or how would you expect it to be sent to you? If your boss doesn't tell you from what table they want some data, do you just not answer?
And sure, there could be ambiguities here. But these are not really fixed by listing the sources first? You would almost certainly need to augment the select to give the columns names that disambiguate for their uses. For example, when you want the teacher and the student names, both.
And this is ignoring more complications you get from normalized data that is just flat out hard to deal with. I want a teacher/student combination, but only for this specific class and term, as an example. Whether you start the from at the student, the class, or the term rosters feels somewhat immaterial to how you want to pull all of that data together for why ever you are querying it.
If my boss asks me for a zip code, I'm going to ask "for what?"
If they ask for "address for a customer" I can go to the customer table and look up what FKs are relevant and collect all possible data and then narrow down from there.
I'd assume they would ask for "aggregate sales by month to zip codes," or some such. Which, you'd probably get from a reporting table, and not bother trying to do the manual aggregate over transactional data. (That is, OLAP versus OLTP is a very real divide that you can't wave away.)
Realistically, I strongly suspect you could take this argument either direction. If you have someone making a query where they are having to ask "what all tables could I start from?" you are in for some pain. Often the same data is reachable from many tables, and you almost certainly have reasons for taking certain paths to get to it. Similarly, if they should want "person_name", heaven help them.
Such that, can you contrive scenarios where it makes sense to start the query from the from clause? Sure. My point is more that you almost certainly have the entire query conceptualized in your mind as you start. You might not remember all of the details on how some things are named, but the overall picture is there. Question then comes down to if one way is more efficient than the other? I have some caveats that this is really a thing hindered by the order of the query. We don't have data, of course, and are arguing based on some ideas that we have brought with us.
So, would I be upset if the order was reversed? Not at all. I just don't expect that would actually help much. My memory is using query builders in the past where I would search for "all tables that have customer_id and order_id in them" and then, "which table has customer_id and customer_address" and then... It was rarely (ever?) the name of the table that helped me know which one to use. Rather, I needed the ones with the columns I was interested in.
Your approach only works with massive assumptions about the structure of the data or a very simplistic data structure.
SELECT statments don't just use table names, they can use aliases for those table names, views, subqueries, etc.
The FROM / JOIN blocks are where the structure of the data you are selecting from is defined. You should not assume you understand what a SELECT statement means until you have read those blocks.
As GP said, you can take this both ways.
I can define the return format of my query in the SELECT statement, then adapt the data structure in the FROM block using subselects, aliases etc — all to give me the shape desired for the query.
If you've ever done complex querying with SQL, you'd know that you'd go back and forth on all parts of the query to get it right unless you knew the relations by heart, regardless of the order (sometimes you'll have to rework the FROM because you changed the SELECT is the point).
WITH and CTEs go some way to help with this, and the majority of queries are a lot better for that.
This was a historical decision because SQL is a declarative language. I was confused for too long, I must admit, about SQL's logical evaluation order: FROM/JOIN → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
As a self-taught developer, I didn't know what I was missing, but now the mechanics seem clear, and if somebody really needs to handle SELECT with given names, then they should probably use a CTE:
WITH src AS (SELECT * FROM sales),
     proj AS (SELECT customer_id, total_price AS total FROM src),
     filt AS (SELECT * FROM proj WHERE total > 100)
SELECT * FROM filt;
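That CTE chain runs as-is on SQLite; a quick check with invented sales data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, total_price REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(1, 50.0), (2, 150.0), (3, 300.0)])

# Each CTE stage can reference the aliases the previous stage introduced
# (e.g. total), giving the step-by-step, FROM-first reading order.
rows = conn.execute("""
    WITH src  AS (SELECT * FROM sales),
         proj AS (SELECT customer_id, total_price AS total FROM src),
         filt AS (SELECT * FROM proj WHERE total > 100)
    SELECT * FROM filt
""").fetchall()
print(sorted(rows))  # [(2, 150.0), (3, 300.0)]
```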
> This was a historical decision because SQL is a declarative language.
What do you mean? Both ALPHA (the godfather declarative database language created by Codd himself) and QUEL, both of which inspired SQL, put "FROM" first. Fun fact: SQL was originally known as SEQUEL, which was intended to be a wordplay on being a followup to QUEL.
Another commenter makes a good case that SQL ended up that way because it was trying to capture relational algebra notation (i.e. π₁(R), where π~select, ₁~column, R~table). Yet relational algebra is procedural, not declarative. Relational calculus is the declarative branch.
Although the most likely explanation remains simply that SEQUEL, in addition to being fun wordplay, also stood for Structured English Query Language. In English, we're more likely to say "select the bottle from the fridge", rather than "from the fridge, select the bottle". Neither form precludes declarative use.
> This was a historical decision because SQL is a declarative language
It would be equally declarative if FROM came first.
You can always start with the FROM clause and then add the SELECT clause above it, if it's about auto-completion.
I usually start with: `select * from <table> as <alias> limit 5`
The order should start with FROM, followed by any sequence of whatever clauses (except FROM), each creating an intermediary result-set.
FROM table -- equivalent to today's select * from table
SELECT a, 1 as b, c, d -- equivalent to select ... from table
WHERE a in (1, 2, 3) -- the above with the where
GROUP BY c -- the above with the group by
WHERE sum(d) > 100 -- the above with having sum(d) > 100
SELECT count(distinct a) qt_a, sum(b) as count, sum(d) total_d -- the above being a sub-query this selects from
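The proposed pipeline can be sketched in Python (lists of dicts standing in for result-sets; the data is invented), with each step producing an intermediary result-set exactly as described:

```python
table = [
    {"a": 1, "c": "x", "d": 60},
    {"a": 2, "c": "x", "d": 70},
    {"a": 4, "c": "y", "d": 20},
]

rows = table                                      # FROM table
rows = [r for r in rows if r["a"] in (1, 2, 3)]   # WHERE a IN (1, 2, 3)

groups = {}                                       # GROUP BY c
for r in rows:
    groups.setdefault(r["c"], []).append(r)

# A WHERE applied after GROUP BY filters whole groups -- i.e., it plays HAVING.
groups = {k: v for k, v in groups.items() if sum(r["d"] for r in v) > 100}

result = {k: sum(r["d"] for r in v) for k, v in groups.items()}  # SELECT sum(d)
print(result)  # {'x': 130}
```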
My other big dream would be allowing multiple WHERE clauses that would be semantically ANDed together because that's what would happen if you filtered a result set twice.
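The ANDing semantics follow directly from filtering an intermediary result-set twice; a tiny Python check (toy data):

```python
data = [{"a": 0, "b": 1}, {"a": 2, "b": 3}, {"a": 2, "b": 9}]

# One WHERE with an AND ...
once = [r for r in data if r["a"] > 1 and r["b"] < 5]
# ... versus two successive WHEREs over the intermediary result-set.
twice = [r for r in [r for r in data if r["a"] > 1] if r["b"] < 5]

print(once == twice, once)  # True [{'a': 2, 'b': 3}]
```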
How would this approach support SQL constructs such as the HAVING [0] clause? Or is that what you meant by:
> always creating an intermediary result-set
[0]: https://www.w3schools.com/sql/sql_having.asp
You want the DB to first run SELECT * FROM <table>, and then start operating on that?
Presumably, the intermediate result sets wouldn't necessarily be materialized, only logical.
Obviously not, and bringing this up as if it's a gotcha just shows you aren't keeping up with the conversation. Try less to correct people and more to understand people.
It's about how humans think about it, not about how the computer executes it.
How else do you read what they wrote?
I read it as describing their preferred mental model for declaring a result set, which is different from describing their preferred behavior to produce it. This seems clear to me in wording and context; it’s also broadly consistent with how SQL is understood, and how it is generally implemented.
While I agree, some SQL editors do provide code completion for the SELECT clause if you type the FROM clause first.
That is my trick as well, but feels backwards, by now there could be a variant of SQL supporting FROM first.
SELECT ... FROM ... WHERE ...: S Q L
FROM ... SELECT ... WHERE ...: Q S L
You may find PRQL interesting. https://prql-lang.org/
ecto's sql dsl basically fixes this. here's one I just wrote in my codebase
It doesn't seem like it would be all that difficult to allow that form in a backwards compatible way. Why hasn't this happened?
Because SQL is a whole language family of separate languages grudgingly adopting new features introduced by the notoriously slow standardization committee.
BigQuery SQL and Spark SQL (and probably some others) have adopted pipelined syntax, DuckDB SQL simply allows you to write the query FROM-first.
A nitpick: SELECT doesn't declare fields for WHERE; SELECT declares fields for HAVING. It's the schema that declares the fields that can be used in WHERE.
> Queries should start by the `FROM` clause,
FWIW that is the approach used by LINQ.
It's true. I call authoring SQL "holographic programming" because any change I want to make frequently implies a change to the top and bottom of the query too; there's almost never a localized change.
C++ has this issue too due to the split between header declarations and implementations. Change a function name? You're updating it in the implementation file, and the header file, and then you can start wondering if there are callers that need to be updated also. Then you add in templates and the situation becomes even more fun (does this code live in a .cc file? An .h file? Oh, is your firm one of the ones that does .hh files and/or .hpp files also? Have fun with that).
I legitimately never even thought of this, and it feels very obvious in retrospect…
Don’t know why python gets so much love. It’s a painful language as soon as more than one person is involved. What the author describes is just the tip of the iceberg
Failure to understand something is not a virtue. That it does get a lot of love strongly suggests that there are reasons for that. Of course it has flaws, but that alone doesn't tell us anything; only comparisons do. Create a comprehensive pro/con list and see how it fares. Then compare that to pro/con lists for other languages.
> That it does get a lot of love strongly suggests that there are reasons for that.
Reasons, sure, but whether those reasons correlate with things that matter is a different question altogether.
Python has a really strong on-ramp and, these days, lots of network effects that make it a common default choice, like Java but for individual projects. The rub is that those same properties—ones that make a language or codebase friendly to beginners—also add friction to expert usage and work.
> Failure to understand something is not a virtue.
This is what I want to say to all the Bash-haters that knee-jerk a "you should use a real language like Python" in the comments to every article that shows any Bash at all. It's a pet peeve of mine.
"Reasons, sure, but whether those reasons correlate with things that matter is a different question altogether."
Thus my word "suggests". Thus the comprehensive pro/con list.
I'm not here to defend tiresome strawmen.
It's not so much a matter of exhaustively listing pros and cons, but more a matter of appropriateness to the desired goal or goals IME. I seriously doubt that a comprehensive pro/con list can even be coherent. Is dynamic typing a pro or a con? Depends on to whom and even in what decade you ask. List comprehensions? Interpreted language? First class OOP? Any cost-benefit analysis will be highly context-dependent.
> I'm not here to defend tiresome strawmen.
I won't point out that you already tried to (contradiction intended). Perhaps a more interesting discussion would result if we defaulted to a more collaborative[0] stance here?
[0]:https://en.wikipedia.org/wiki/Cooperative_principle
> Of course it has flaws, but that alone doesn't tell us anything; only comparisons do.
Comparisons won't tell us anything. If Python were the only programming language in existence, that doesn't imply that it would be loved. Or, if we could establish that Python is the technically worst programming language ever created, that doesn't imply that it wouldn't be loved. Look at how many people in the world love other people who are by all reasonable measures bad for them (e.g. abusive). Love isn't rational. It is unlikely that it is possible for us to truly understand.
> Comparisons won't tell us anything.
If you insist.
> If Python were the only programming language in existence, that doesn't imply that it would be loved.
I'm not here to defend bizarre strawmen.
> It is unlikely that it is possible for us to truly understand.
If you insist.
The same reason people are not flocking to the Lisps of the world: mathematical rigour and precision does not translate to higher legibility and understandability.
Python's list /dict/set comprehensions are equivalent to typed for loops: where everyone complains about Python being lax with types, it's weird that one statement that guarantees a return type is now the target.
Yet most other languages don't have the "properly ordered" for loop, Rust included (it's not "from iter as var" there either).
It's even funnier when function calling in one language is compared to syntax in another (you can do function calling for everything in most languages, a la Lisp). Esp in the given example for Python: there is the built-in map, after all.
"the target"? The OP complaint is not that the statement exists, but that the output expression comes before the context that establishes its meaning. This is not just a problem for autocomplete, but people often report difficulties understanding complex comprehensions.
As for comprehensions themselves, ignoring that problem I find them a powerful and concise way to express a collection of computed values when that's possible. And I'm particularly fond of generator expressions (which you didn't mention) ... they often avoid unnecessary auxiliary memory, and cannot be replaced with an inline for loop--only with a generator function with a yield statement.
BTW, I don't understand your comment about types. What's the type of (x for x in foo()) ?
That one is a Generator[Any], and for the others it could be set[Any], list[Any] or dict[Any, Any]. You obviously don't get embedded type information, but you still do get the encapsulating type, which is better than an untyped for loop :)
I get that it's about how it's structured and ordered, but that is true for the "for..in" loops in every language as well: you first set the variable, and only then get the context — this just follows the same model as the for loops in the language, and it would be weird if it didn't.
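The encapsulating types are easy to confirm; each comprehension form fixes its container type regardless of the element types:

```python
gen = (x for x in range(3))     # generator expression
lst = [x for x in range(3)]     # list comprehension
st = {x for x in range(3)}      # set comprehension
dct = {x: x for x in range(3)}  # dict comprehension

print(type(gen).__name__, type(lst).__name__,
      type(st).__name__, type(dct).__name__)
# generator list set dict
```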
"that is true for the "for..in" loops in every language as well"
No, not at all. The output expression is arbitrary ... it might be f(x, y, z) where all of those are set later. You're confusing the output expression with the loop variable, which is also stated in the comprehension and may or may not be the same as the output expression or part of it. "The same model as the for loops in the language", where the language includes Python, is the comprehension with the output expression moved from the beginning to the end. e.g., (bar(x) for x in foo()) is `for x in foo(): bar(x)`. More concretely: lst = [bar(x) for x in foo()] is functionally equivalent to lst = []; for x in foo(): lst.append(bar(x))
Again, "the output expression comes before the context that establishes its meaning." ... I thought that would be clear as day to anyone who is actually familiar with Python comprehensions.
P.S. I'm not going to respond to goalpost moving.
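The stated equivalence, with the placeholder names foo/bar from the comment, checks out directly:

```python
def foo():
    return [1, 2, 3]

def bar(x):
    return x * 2

lst1 = [bar(x) for x in foo()]  # output expression written first ...
lst2 = []
for x in foo():                 # ... versus the spelled-out loop,
    lst2.append(bar(x))         # with the expression at the end.

print(lst1, lst1 == lst2)  # [2, 4, 6] True
```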
Note that I am not disputing the order is "inversed": all I am saying is that there are other common language features in most languages where things don't flow the same way, yet nobody finds it a huge problem.
It's like discussion of RPN or infix for calculations: both do the job, one is more rigorous and clear with no grouping/parentheses, yet we manage just fine with infix operators in our programming languages (or maybe not, perhaps all bugs are due to this? :)).
These are different points.
Just like you state a variable and some operations on it early in a comprehension, you do the same in a for loop: you don't know the type of it.
As you are typing the for loop in, your IDE does not know what is coming in as a context being iterated over to auto-complete, for instance (eg. imagine iterating over tuples with "for k, v in some_pairs:" — your editor does not even know if unpacking is possible).
Basically, what I am saying is that comprehensions are similarly "bad" as for loops, except they are more powerful and allow more expression types early.
C/C++ allow even crazier stuff in the "variable's" place in a for loop. Rust allows patterns, etc.
Typing is mostly a nice addendum I mentioned, that's not the core of my point.
Python was established as a fun and sensible language that was usable and batteries-included at a time when everything else was either expensive or excruciating, and has been coasting in that success ever since. If you'd only coded in bash, C/C++, and late-'90s Java, Python was a revelation.
List comprehensions were added to the language after it was already established and popular, and imho they were the first sign that the emperor might be naked.
Python 3 was the death of it, imho, since it showed that improving the language was just too difficult.
I find myself writing a very simple style of python that avoids list comprehensions and so on when working in a shared code base.
For a language where there is supposed to be only one way to do things, there are an awful lot of ways to do things.
Don’t get me wrong, writing a list comprehension can be very satisfying and golf-y. But if there should be one way to do things, they do not belong.
I find list and dict comprehensions are a lot less error prone and more robust than the “manual” alternatives.
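For what it's worth, a tiny sketch of the difference (the example data here is made up):

```python
words = ["apple", "banana", "cherry"]

# "Manual" version: mutable dict, easy to typo a key or forget to initialize.
lengths_manual = {}
for w in words:
    lengths_manual[w] = len(w)

# Dict comprehension: one expression, no mutable state to mismanage.
lengths_comp = {w: len(w) for w in words}

print(lengths_manual == lengths_comp)  # True
```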
I would say unless you have a good reason to do so, features such as meta classes or monkey patching would be top of list to avoid in shared codebases.
> For a language where there is supposed to be only one way to do things
That's not what the Zen says, it says that there should be one -- and preferably only one -- obvious way to do it.
That is, for any given task, it is most important that there is at least one obvious way, but also desirable that there should only be one obvious way, to do it. But there are necessarily going to be multiple ways to do most things, because if there was only one way, most of them for non-trivial tasks would be non-obvious.
The goal of Python was never to be the smallest Turing-complete language, and have no redundancy.
If I have to write Python like Go, I'd rather just write Go. (Not disagreeing with you, but this is one of many reasons that Python is the least-favourite language that I use on a regular basis.)
I used to agree with this completely, but type annotations & checking have made it much more reasonable. I still wouldn't choose it for a large project, but types have made it much, much easier to work with others' python code.
Python with strict type checking and its huge stdlib is my favourite scripting language now.
I agree, it's not too bad if you enable strict Pyright checking in CI, and you use `uv` and you don't care remotely about performance.
That's quite a lot of ifs though. Tbh I haven't found anything significantly better for scripting. Deno is pretty nice, but TypeScript really has just as many warts as Python. At least it isn't so dog slow.
I've pretty much embraced PowerShell for scripting. The language is warty as hell and seems to be entirely made of sharp edges but I've gotten used to it and it does have a lot of excellent ideas.
> Python with strict type checking and its huge stdlib is my favourite scripting language now.
It's time to try Scala 3 with Java libs' inbound interop: https://docs.scala-lang.org/scala3/book/scala-features.html
Maybe even painful for one dev once you need dependencies (virtualenv...)
Thankfully, uv makes this a lot better, but it'll still be a while before that percolates to the entire ecosystem, if ever.
There are real concerns about tying the language's future to a VC-backed project, but at the same time, it's just such an improvement on the state of things that I find it hard not to use.
I find python script written by C/C++ dev usually easier to understand than the one written in "Pythonic" style, even if it's longer and more verbose.
Are you, by any chance, a C/C++ dev?
Because everything that tries to fix it is just as painful in different ways.
I've had the displeasure of working in codebases using the style of programming op says is great. It's pretty neat. Until you get a chain 40 deep and you have to debug it. You either have to use language features, like show in pyspark, which don't scale when you need to trace a dozen transformations, or you get back to imperative style loops so you can log what's happening where.
Debugger tooling is helpful here. Rider's debugger easily shows every intermediate result set in long chained LINQ method call chains.
Then add some variables for intermediate results.
This hasn't been my experience, but we use the Google style guide, linters, and static type verification to cut down on the number of options for how to write the program. Python has definitely strayed from its "one right way to do a thing" roots, but in the set of languages I use regularly it gives about the same amount of issues as JavaScript (and far, far less than C++) regarding having to deal with quirks that vary from user to user.
I think it's because it's forgiving and the first language for many people, so they just stick to the easy thing they know.
It's my 18th language and I prefer it over the alternatives. If I need to write an app I'll use Swift. If I need to do some web work I'll switch to TypeScript. To get work done it's Python all the way.
Agree. At some point grasping a new language takes hours and it makes sense to use it for tasks it excels at.
It’s a good language for beginners and, unfortunately, many people think that learning more languages is hard and useless.
That’s also precisely the reason why we have the clusterfuck that is the JavaScript ecosystem.
People complain that we can’t have nice things. But even when we do, enough developers will be lazy enough not to learn them anyway.
I'd argue it's because the web is the dominant platform for many applications and no other languages can offer a first class experience. if WASM had direct access to the DOM and web APIs and maybe a little more runtime support to lessen the bloat, I'd use something else.
JavaScript has been a backend language long before the web was the dominant platform.
And one of the, admittedly many, reasons why web technologies like Electron and React Native exist is because it’s easier to find JavaScript developers vs Kotlin, Qt or whatever.
So you’re not wrong, but you’re also downplaying the network effect that led to the web becoming dominant. And a part of that was because developers didn’t want to learn something new.
> JavaScript has been a backend language long before the web was the dominant platform.
I don't think this holds.
JavaScript was created as a frontend language specifically for web browsers.
It wasn't until 2009 with the introduction of Node.js that JavaScript became a viable option for backend development.
The web was already the dominant platform by then.
Actually I would say that was the turning point when the web started to become the dominant platform. Not the conclusion.
Before then it was Flash and native apps. Before then, smart phones weren’t common (eg the iPhone was just over a year old at that point) and the common idiom was to write native apps for Windows Mobile.
IE was still the dominant browser too, Firefox and WebKit were only starting to take over but we are still a long way from IE being displaced.
Electron didn’t exist so desktop app were still native and thus Linux was still a second class citizen.
It took until roughly 2015 before the web became the dominant platform. But by 2005 you could already see the times were changing; it just took ages for the technology to catch up, and the advent of V8 (and thus Node and Electron) for that transition to “complete”.
> It wasn't until 2009 with the introduction of Node.js that JavaScript became a viable option for backend development.
JavaScript was used for backend development since the late 1990s via the Rhino engine (the backends wouldn't be pure JS generally but a mix of JS and Java; Rhino was a JS engine for the JVM with Java interop as a key feature.)
> Programs should be valid as they are typed.
That would be nice if devs always wrote code sequentially, i.e. left to right, one character at a time, one line at a time. But the reality is that we often jump around, filling in some things while leaving other things unfinished until we get back to them. Sometimes I'll write code that operates on a variable, then a minute later go back and declare that variable (perhaps assigning it a test value).
Code gets written once and read dozens or hundreds of times. Code that can be read sequentially is faster and easier to read than code that requires jumping around.
And @kmoser did not say code should not be read sequentially.
non sequitur
That’s not quite what the article’s about, though it’s interesting too.
Exactly. You only write code sequentially when it's a new file.
If I decide to add a new field to some class, I won't necessarily go to the class definition first; I'll probably write the code using that field, because that's where the IDE was when I got the idea.

If I want to enhance some condition checking, I'll go through a phase where the piece of code isn't valid while I'm rearranging ifs and elses.
> You only write code sequentially when it's a new file.
Often, not even then.
I agree with this, but it leads to another principle that too many languages violate - it shouldn't fail to compile just because you haven't finished writing it! It should fail in some other non-blocking way.
But some languages just won't let you do that, because they put in errors for missing returns or unused variables.
How is it supposed to compile if you've written something syntactically invalid? You can make the argument that the compiler could interpret it in some (perhaps even arbitrary) way that constitutes valid syntax, but that's almost worse: rather than being chided with compiler warnings, you now end up with code that compiles but executes indeterminately.
Well I don't literally mean finished as in you haven't finished typing it. Although that's possible too - whitespace languages like Python tend to just work since they don't have end brackets, I suppose.
But you could have a compiled language where errors were limited to the function when possible, like by emitting asserts.
They didn't say "syntactically invalid".
> it shouldn't fail to compile just because you haven't finished writing it!
Syntactically invalid.
I'm sorry that you don't understand the vast difference.
there are approaches to ensure you always or almost always have syntactic validity, e.g. structured editing or error-correcting parsing, though of course it's true that the more permissive the system is, the more tortured some of the syntactic 'corrections' must be in extreme cases. the approach we're taking with http://hazel.org (which we approximate but don't fully succeed at yet) is: allow normal-ish typing, always correct the code enough to be run/typechecked, but insert placeholders/decorations to telegraph the parse structure that we're using in case of errors or incompleteness.
Hazel has embedding gaps but they are language and editor specific.
BABLR takes Hazel's idea about 18 steps further, potentially making embedding gaps a feature of every editor and programming language.
As long as solutions don't have a way to scale economically they'll be academic, but BABLR makes this not academic anymore.
i saw hazel mentioned forever ago in an eecs 203 lecture and it sent me on a multi-year fp/type theory rabbit hole :) thanks for your work on it!
I feel that's a use case for compiler flags that convert warnings to errors or vice-versa, where the same source-code can either be "good enough to experiment with" or "not ready for release" depending on the context.
Zig is the worst that way.
It's definitely a minor annoyance for me that IDEs assume that I type things in a different order than I really do.
Also, mutual recursion. ;)
Yes. Seems like an arbitrary limitation to force it.
This is a "let 'im cook" instance.
Reading through the article, the author makes the argument for the philosophy of progressive disclosure. The last paragraph brings it together and it's a reasonable take:
> When you’ve typed text, the program is valid. When you’ve typed text.split(" "), the program is valid. When you’ve typed text.split(" ").map(word => word.length), the program is valid. Since the program is valid as you build it up, your editor is able to help you out. If you had a REPL, you could even see the result as you type your program out.
In the age of CoPilot and agent coders I'm not so sure how important the ergonomics still are, though I dare say coding an LSP would certainly make one happy with the argument.
Some IDEs provide code templates, where you type some abbreviation that expands into a corresponding code construct with placeholders, followed by having you fill out the placeholders (jumping from one to the next with Tab). The important part here is that the placeholders’ tab order doesn’t need to be from left to right, so in TFA’s example you could have an order like
which would give you code completion for {3} based on the {1} and {2} that would be filled in first.

There is generally a trade-off between syntax that is nice to read vs. nice to type, and I’m a fan of having nice-to-read syntax out of the box (i.e. not requiring tool support) at the cost of having to use tooling to also make it nice to type.
This is not meant as an argument for the above for-in syntax, but as an argument that left-to-right typing isn’t a strict necessity.
The consensus here seems to be that Python is missing a pipe operator. That was one of the things I quickly learned to appreciate when transitioning from Mathematica to R. It makes writing data science code, where the data are transformed by a series of different steps, so much more readable and intuitive.
I know that Python is used for many more things than just data science, so I'd love to hear if in these other contexts, a pipe would also make sense. Just trying to understand why the pipe hasn't made it into Python already.
The next step after pipe operators would be reverse assignment statements to capture the results.
I find myself increasingly frustrated at seeing code like 'let foo = many lines of code'. Let me write something like 'many lines of code =: foo'.
> reverse assignment statements to capture the results
Interesting idea! However, I'm not sure I would prefer
"Mix water, flour [...] and finally you'll get a pie"
to
"To make a pie: mix water, flour [...]"
R does have a right assign operator, namely ->
Its use is discouraged in most style guides. I do not use it in scripts, but I use it heavily in console/terminal workflows where I'm experimenting.
df |> filter() |> summarise() -> x
x |> mutate() -> y
plot(y)
The pipe operator in R (really, tidyverse R, which might as well be its own language) is one of its "killer apps" for me. Working with data is so, so pleasant and easy. I remember a textbook that showed two ways of "coding" a cookie recipe:
bake(divide(add(knead(mix(flour, water, sugar, butter)),eggs),12),450,12)
versus
mix(flour, water, sugar, butter) %>% knead() %>% add(eggs) %>% divide(12) %>% bake(temp=450, minutes=12)
So much easier!
You'd never write that ugly one-liner. Just write the recipe imperatively:
Might be more verbose, but definitely readable.

pandas and polars both have pipe methods available on dataframes. You can method chain to the same effect. It's considered best practice in pandas, as you're hopefully not mutating the initial df.
I don't know if
is much less readable than but I guess the R thing also works beyond dataframes which is pretty coolThe pipe operator uses what comes before as the first argument of the function. This means in R it would be:
Python doesn't have a pipe operator, but if it did it would have similar syntax: In existing Python, this might look something like: (Implementing `pipe` would be fun, but I'll leave it as an exercise for the reader.)Edit: Realized my last example won't work with named arguments like you've given. You'd need a function for that, which start looking awful similar to what you've written:
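A minimal sketch of that `pipe` exercise (the names and example data here are my own, not from the thread):

```python
from functools import reduce

def pipe(value, *funcs):
    """Thread value through funcs left to right: pipe(x, f, g) == g(f(x))."""
    return reduce(lambda acc, f: f(acc), funcs, value)

result = pipe(
    "the quick brown fox",
    str.split,                                # ['the', 'quick', 'brown', 'fox']
    lambda words: [w.upper() for w in words],
    ", ".join,
)
print(result)  # THE, QUICK, BROWN, FOX
```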
Python supports a syntax like your first example by implementing the appropriate magic method for the desired operator and starting the chain with that special object. For example, using just a single pipe: https://flexiple.com/python/python-pipeline-operator
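As a sketch of the magic-method approach (a toy wrapper of my own, not the linked article's actual code):

```python
class P:
    """Start a chain with P(value); each `| f` applies f to the current value."""
    def __init__(self, value):
        self.value = value

    def __or__(self, func):
        return P(func(self.value))

result = (P("hello world")
          | str.split
          | (lambda ws: [w.capitalize() for w in ws])
          | " ".join).value
print(result)  # Hello World
```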
The functions with extra arguments could be curried, or done ad-hoc like lambda v: fun1(v, arg1=1)
>Implementing `pipe` would be fun, but I'll leave it as an exercise for the reader.
I like exercise:
https://gist.github.com/stuarteberg/6bcbe3feb7fba4dc2574a989...
Neat!
Thanks!
I haven't used R in forever, but is your `.` placeholder actually necessary? From my recollection of the pipe operator, the value being piped is automatically passed as the first argument to the next function. That may have been a different implementation of a pipe operator though.
Probably not, I didn’t use R much during the last decade …
On the other hand, Python does have "from some_library import child_module" which is always nice. In JS we get "import { asYetUnknownModule } from SomeLibrary" which is considerably less helpful.
Alternatively, with namespace imports in JS you can write [1]:
Which works pretty well with IDE autocomplete in my experience.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
This is a non starter for anything you want to publish online, as it breaks tree shaking which will cause size bloat and therefore slow loading.
I never understood this obsession with the keyword "from"
Just do
> While the Python code in the previous example is still readable, it gets worse as the complexity of the logic increases.
This bit is an aside in the article but I agree so much! List comprehensions in python are great for the simple and awful for the complex. I love map/reduce/filter because they can scale up in complexity without becoming an unreadable mess!
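A small illustration of that scaling, with made-up data:

```python
data = [3, -1, 4, -1, 5, 9, -2, 6]

# As a comprehension: fine while the logic stays this simple...
comp = [n * n for n in data if n > 0]

# ...but map/filter reads as a flat pipeline of named steps, and each
# step can grow in complexity without the whole expression tangling.
positives = filter(lambda n: n > 0, data)
squared = map(lambda n: n * n, positives)
pipeline = list(squared)

print(comp == pipeline)  # True
```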
SQL has this problem since it wants the SELECT list before the FROM/JOIN stuff.
I've seen some SQL-derived things that let you switch it. They should all let you switch it.
This is almost FP vs OOP religious war in disguise. Similar to vim-vs-emacs ... where op comes first in vim but selection comes first in emacs.
If you design something to "read like English", you'll likely get verb-first structure - as embodied in Lisp/Scheme. Other languages like German, Tamil use verbs at the end, which aligns well with OOP-like "noun first" syntax. (It is "water drink" word for word in Tamil but "drink water" in English.) So Forth reads better than Scheme if you tend to verbalize in Tamil. Perhaps why I feel comfy using vim than emacs.
Neither is particularly better or worse than the other and tools can be built appropriately. More so with language models these days.
> If you design something to "read like English", you'll likely get verb-first structure
If it's imperative, sure. If it's declarative and designed to read like English, it will be subject-first.
I really want JS/TS to adopt the pipeline operator which has been in a coma in stage 2 after eternal bikeshedding at TC-39
https://github.com/tc39/proposal-pipeline-operator
It would make it possible to have far more code written in the way you’d want to write it
Author picked up a quite convenient example to show methods/lambda superiority.
I prefer list/set/dict comprehensions any day. It's more general, doesn't require knowing a myriad of different methods (which might not exist for all collections; PHP and JS are especially bad with this) and is easily extendable to nested loops.
Yes, it could be `[for line in text.splitlines() if line: for word in line.split(): word.upper()]`. But it is what it is. BTW I bet the Rust variant would be quite elaborate.
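For comparison, the order today's Python actually requires (a runnable sketch of the same nested loop, with invented input text):

```python
text = "a few words\n\nacross two lines"

# Output expression first, then the for/if clauses in nesting order.
flat = [
    word.upper()
    for line in text.splitlines()
    if line
    for word in line.split()
]
print(flat)  # ['A', 'FEW', 'WORDS', 'ACROSS', 'TWO', 'LINES']
```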
I'm a big fan of Python syntax, but really comprehensions don't make any sense to me efficiency-wise (or even for readability). Python syntax would become perfect with filter() and map() :')
With judicious use of generators, and with itertools [0] when needed, they can be quite efficient for not much effort.
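For example (a toy sketch):

```python
import itertools

# Generator expressions are lazy: nothing is computed until consumed.
squares = (n * n for n in itertools.count(1))
even_squares = (s for s in squares if s % 2 == 0)

# Take just the first three even squares; no full list is ever built.
first_three = list(itertools.islice(even_squares, 3))
print(first_three)  # [4, 16, 36]
```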
And these are flexible tools that you can take with you across projects.
[0] https://docs.python.org/3/library/itertools.html
> I prefer list/set/dict comprehensions any day. It's more general, doesn't require to know a myriad of different methods
It's the opposite, your knowledge of the standard set of folding algorithms (maps, filters, folds, traversals) is transferable almost verbatim across a wide range of languages: https://hoogletranslate.com/?q=map&type=by-algo
maps and filters maybe. Other stuff is a wild west.
A minor corollary to this is that, as the user types, IDEs should predictably try to make programs valid – e.g. via structured editing, balancing parens in Lisp like paredit, etc.
This moves language design responsibility to the tool from the language itself. It might be okay to not hurt language elegance (e.g. lisp syntax), but in general I expect language to be convenient regardless of dev environment.
Please no. I hate when the editor adds random shit I didn't type to the file.
Autocomplete-oriented programming optimizes for writing code. I don't think that's a good route to go down. Autocomplete is good for spewing out a large volume of code, but is that what we want to encourage?
I'd much rather optimize for understanding code. Give me the freedom to order such that the most important ideas are up front, whatever the important details are. I'd much rather spend 3x the time writing code if it means I spend half the time understanding it every time I return to it in the future.
I miss the F# pipe operator (https://learn.microsoft.com/en-us/dotnet/fsharp/language-ref...) in other languages. It's so natural to think of function transform pipelines. In other languages you have to keep going to the left and prepend function names, and to the right to add additional args, parens etc ...
I've had to migrate to mostly Python for my work, and this is the thing I miss the absolute most from R (and how it works so seamlessly with the tidyverse)
You'll be able to use it in PHP 8.5 :-)
https://stitcher.io/blog/pipe-operator-in-php-85
n.b. pipe operators also exist in various forms in other languages. OCaml, Elixir, Clojure, Haskell...
Oh yes I didn't mean F# is the only one. Just that the other ones I am likely to work with i.e. C++, Java and Python don't have it.
Based on Haskell do-notation, Firefox's unfortunately withdrawn array comprehensions, and C# LINQ, I've been thinking about this:
as alternative syntax to Python's

This solves five problems in Python listcomp syntax:

1. The one this article is about, which is also a problem in SQL, as juancn points out.
2. The discontinuous scope problem: in Python
x is in scope in Ξ and Λ but obviously not Γ. This is confusing and inconsistent.

3. The ambiguity between conditional-expression ifs and listcomp-filtering trailing ifs, which Python solves by outlawing the former (unless you add extra parens). This is confusing when you get a syntax error on the else, but there is no non-confusing solution except using non-conflicting syntax.
4. let. In Python you can write `for d in [f(b)]` but this is inefficient and borders on obfuscated code.
5. Tuple parenthesization. If the elements generated by your iteration are tuples, as they very often are, Python needs parentheses: [(i, c) for i, c in enumerate(s) if c in s]. That's because `[i, c` looks like the beginning of a list whose first two items are i and c. Again, you could resolve these conflicting partial parses in different ways, but all of them are confusing.
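On point 4: since Python 3.8, the walrus operator gives a less obfuscated "let" inside comprehensions (sketch below; `f` and the data are made up):

```python
def f(x):
    return x * x - 10

xs = range(6)

# The "let" trick: a one-element inner loop exists only to bind y = f(x).
trick = [y for x in xs for y in [f(x)] if y > 0]

# Python 3.8+: the walrus operator binds y inline instead.
walrus = [y for x in xs if (y := f(x)) > 0]

print(trick == walrus)  # True
```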
In Rust, for..in loops have the same problem: you don't know the type and shape of the thing you are iterating on before you get to the "in" part. "Luckily", it does not allow you to call any methods on the object before you do.
Similarly, Python list/dict/set comprehensions are a form of for-loop syntax sugar to easily create particular structures. One can use the built-in map/filter functions to get exactly the same behavior as the Rust example.
If this was an all important text input microoptimization, we'd all be doing everything with pure functions like Lisp: yet somehow, functional languages are not the most popular even if they provide the highest syntax consistency.
That's an interesting idea: maybe instead of `for x, y in points: ...` (Python syntax) we should write `points do |x, y| ...` so the IDE has a maximum amount of type inference information available? That would also suggest writing variable declarations with the variable at the end, as in Forth or Levien's Io. 80 CONSTANT COLUMNS or `80 -> columns;`.
It doesn't look anything like Lisp, though.
Mildly related: Non-English-based programming languages. There are some right to left options like Arabic, Hebrew, and Persian
https://en.wikipedia.org/wiki/Non-English-based_programming_...
I had a similar thought a few years ago with an Advent of Code problem for which my solution in python might have been
To decipher this, the eye has to jump to the middle of the line, move rightwards, then to the left to see the "map", then move right again to see what we are mapping, and then all the way to the beginning to find the "max".

The author would probably suggest Rust's syntax* of

but I was learning q at the time, so came up with the much clearer /s, right-to-left

*: Though neither Python nor Rust has such a nice `.split(None)` built in.

> Though neither Python nor Rust has such a nice `.split(None)` built in.
Sorry, I'm not sure I understand what `.split(None)` would do? My initial instinct is that it would return each character, i.e. `.chars()` in Rust or `list(s)` in Python.
>Sorry, I'm not sure I understand what `.split(None)` would do?
Reading the docs [0], it seems `.split(None)` splits on runs of whitespace, so it returns the whitespace-separated words rather than individual characters (the same as calling `.split()` with no argument).
[0] https://docs.python.org/3.3/library/stdtypes.html?highlight=...
It was intended to split a list of `int|None` into its non-none stretches. Much like how `string.split('x')` splits a string by matching the character 'x'
Gotcha! In python there is a `split_at` function for this in the more-itertools package, but I don't think there is a concise way to do it in the stdlib.
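more-itertools' `split_at` behaves roughly like this hand-rolled sketch (simplified; the real function has more options):

```python
def split_at(iterable, pred):
    """Yield lists, splitting (and dropping) each item where pred is true."""
    chunk = []
    for item in iterable:
        if pred(item):
            yield chunk
            chunk = []
        else:
            chunk.append(item)
    yield chunk

runs = list(split_at([1, 2, None, 3, None, None, 4], lambda x: x is None))
print(runs)  # [[1, 2], [3], [], [4]]
```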
Rust doesn't have it built in, but it's part of itertools: https://docs.rs/itertools/latest/itertools/trait.Itertools.h...
This may eventually be upstreamed: https://github.com/rust-itertools/itertools/issues/1026
Honestly, I find using intermediate variables more readable than a long chain of function invocations where you have to keep track of intermediate results in your head:
This also gives you a slightly higher-level view of how the algorithm proceeds, just by reading the variable names on the LHS.

While methods partially solve this problem, they cannot be used if you are not the author of the type. Languages with uniform function call syntax like Nim or D do this better.
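A sketch of the named-intermediates style (example data invented for illustration):

```python
text = "the quick brown fox jumps"

# Chained version: intermediate results exist only in your head.
longest_chain = max(map(len, text.split()))

# Named steps: each LHS variable documents one stage of the algorithm.
words = text.split()
lengths = [len(w) for w in words]
longest = max(lengths)

print(longest == longest_chain)  # True, both are 5
```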
So many engineers define their own narrow ergonomic values and then turn to the interwebs attempting to hammer their myopic belief into others as evangelical truth -- this reads as if the author believes there is a singular left-to-right process that a developer ought to adhere to. The author is oblivious to the non-linear compositional practices of other coders. It would be much more constructive to spend time creating specific tooling to aid your own process and then ask others if the values are also relevant to them. Not every developer embraces LSP, for example, as some are thwarted by its opinionated implementation. Not everyone is willing to give up local structure for auto-complete convenience.
It seems to me what the author desires is linguistic support for the Thrush combinator[0]. Another colloquial name for it is "the pipe operator."
Essentially, what this combinator does is allow expressing a nested invocation such as:
To be instead:

For languages which support defining infix operators.

EDIT:
For languages which do not support defining infix operators, there is often a functor method named `andThen` which serves the same purpose. For example:
0 - https://leanpub.com/combinators/read#leanpub-auto-the-thrush

Funny how the article has no reference to C++ but still links "Falling Into The Pit of Success". Did we all get our daily dose of subliminal messages from Rust?
It's nice if after typing `file.` you see what functions you can use. But what if there end up being too many options? What's next, a fuzzy finding search box for all possible functions? Contextually relevant ones based on the code you've already written?
> But what if there end up being too many options?
That would suggest the file object needs to be refactored and split.
> What's next, a fuzzy finding search box for all possible functions? Contextually relevant ones based on the code you've already written?
IDEs already provide both those options.
I understand, I think I moreso wanted to hint that there may still be more room for exploration when it comes to things like this.
However, while I agree, I do want to challenge you to explain/show if and why this is the case (I know that you wrote "suggest"):
> That would suggest the file object needs to be refactored and split.
> Programs should be valid as they are typed.
Not possible. There are more keystrokes that result in invalid programs (you are still writing the code!!) than keystrokes that result in a valid program.
More seriously, I do think that one consideration is that code is read more often than written, so fluidity in reading and comprehension seem more important to me than “a program should be valid after each keystroke”.
Typing goes in the same directions as reading, so I expect many benefits to apply to both. But I agree that readability is much more important than writeability.
Reading through this article only elicits a "WTF!?" from me.
Your editor can’t help you out as you write it.
You shouldn't need handholding when you're writing code. It seems like the whole premise of the author's argument is that you shouldn't learn anything about the language and programming should be reduced to choosing from an autocomplete menu and never thinking more than that. I've seen developers who (try to) work like this, and the quality of their work left much to be desired, to put it lightly.
From there you can eventually find fread, but you have no confidence that it was the best choice.
In C, you have to know ahead of time that fclose is a function that you’ll need to call once you’re done with the file.
It's called knowledge. With that sort of attitude, you're practically begging for AI to replace you.
No wonder people claim typing speed doesn't matter - they can barely think ahead one token, nevermind a statement or function, much less the whole design! Ideally your typing speed should become the bottleneck and you should be able to code "blind", without looking at the screen but merely outputting the code in your mind into the machine as fast as humanly possible. Instead we have barely-"developers" constantly chasing that next tiny dopamine hit of picking from an autocomplete menu. WTF!?
When this descent into mediocrity gets applauded, it's no surprise that so much "modern" software is the way it is.
And we don't need navigation systems on our cars because we should know where we're going, right? ;)
Jokes apart, I think you're being too drastic. A good auto-complete is a nice feature, just like auto-indent, tab-complete, etc. Can it be abused? Sure. So what? Should we stop making it better for fear of abuse?
The reference to AI is far-fetched, too. We're talking about tools to help you with the syntax, not the semantics. I may forget whether the function is called read, fread, or file_read, but I know what its effect is.
And finally, consider that if something is easier to parse for an editor, it most probably is for a human too. Not a rule, not working in 100% of cases, but usually exposing the user to the local context before the concept itself helps understanding.
I think the award for most absurd syntax goes to Python again.
Python offers an "extended" form of list comprehensions that lets you combine iteration over nested data structures.
The irony is that the extensions have a left-to-right order again, but because you have to awkwardly combine them with the rest of the clause that is still right-to-left, those comprehensions become completely unreadable unless you know exactly how they work.
E.g., consider a list of objects that themselves contain lists:
To get a list of lists of tools, you can use the normal comprehension. But to get a single flattened list, you'd have to use a nested comprehension, where the "t for b" part looks utterly mystifying until you realize the "for" clauses are parsed left-to-right, while the "t" at the beginning is parsed right-to-left and evaluated last.
I've taken to writing complex comprehensions like this over multiple lines (which I initially thought wasn't possible!). It's still a bit awkward, but the order of "for"s is the same as if one were writing nested for loops, which makes reasoning about it easier for me.
The proper way to parse or construct nested list comprehensions as explained in pep 202 is to think like you're using normal for:
    for a in b:
        for c in a:
            use(a, b, c)

This would be [use(a, b, c) for a in b for c in a]
Everything stays the same except the "use" part that goes in the front (the rule also includes the filters - if).
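To make that rule concrete, here's a small runnable sketch (the names b, a, c are placeholders, as above): the "for" clauses keep their loop order, and only the expression moves to the front.

```python
# Nested loops vs. the equivalent comprehension, per the PEP 202 reading:
# keep the "for" clauses in loop order, move only the expression to the front.
b = [[1, 2], [3, 4]]

result_loops = []
for a in b:          # outer loop first...
    for c in a:      # ...then the inner loop
        result_loops.append(c * 10)

# Same "for" order, expression moved to the front.
result_comp = [c * 10 for a in b for c in a]

assert result_loops == result_comp == [10, 20, 30, 40]
```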
> The proper way to parse or construct nested list comprehensions as explained in pep 202 is to think like you're using normal for:
A syntax construct that requires you to think in a different syntax construct to understand it is not a good syntax construct.
> Everything stays the same except the "use" part that goes in the front (the rule also includes the filters - if).
Yeah, you consistently have to write the end first, followed by the start, and then the middle; it's just that in most cases there's no middle, so you naturally think you only have to flip the syntax rather than having to write it middle-ended like a US date.
In 1990s-born scripting languages, it makes sense that there are plenty of design choices that don't mesh well with static-analysis-driven autocompletion, because that was not at all part of the requirements for these languages at the time they were designed!
It was a pretty big deal by the time list comprehensions were added to Python, though, in 02000: https://peps.python.org/pep-0202/
It's true that Python didn't cater to static analysis at all.
You don't need to be bothered with minute details such as the syntax for counting a string's length anymore; you just need to know what you want to do. I mention this since OP is bringing up LSPs as an argument for why certain languages' design is suboptimal.
"Count length of string s" -> LLM -> correct syntax for string-count in any programming language. This is the perfect context length for an LLM. But note that you don't "complete the line"; you tell the LLM what you want done in full (very isolated) context, instead of having it guess.
Is the entire complaint that autocomplete doesn’t work well? Have you tried learning your language?
The way people code with Python is by using its large ecosystem; few people use only the standard library. No one knows the entire API - the more discoverability there is, the better.
I use stdlib whenever possible specifically because it’s huge, well-tested, and eliminates dependencies.
I don’t know all of its API, but I do read the docs periodically (not all of them, but I try to re-read modules I use a lot, and at least one that I don’t).
It’s more about learning the language’s library, which typically is too large to remember in detail, than about learning the language, which typically is small enough for that.
The complaints listed seem to focus on attributes / methods of a class. You can read Python’s docs, and see the methods available to a str type, for example.
To me, that’s learning the language. Learning its library would be more like knowing that the module `string` contains the function `capwords`, which can be used to - as the name suggests - capitalize (to title case) all words in a string. Hopefully, one would also know that the `str` class contains the method `upper`, so as not to confuse it with `string.capwords`.
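For instance, a quick illustration of that distinction (`string.capwords` and `str.upper` are both real stdlib names):

```python
import string

s = "hello wide world"

# Module-level function from the library:
assert string.capwords(s) == "Hello Wide World"

# Method on the str type itself:
assert s.upper() == "HELLO WIDE WORLD"
```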
This is the main reason I really like concatenative syntax for languages — this property is _enforced_ for programs (minus some delimited special cases, usually). It also neatly generalizes the special `self` argument so you can dispatch on the types of all the arguments.
> Suppose you have a FILE file and you want to get its contents. Ideally, you’d be able to type file. and see a list of every function that is primarily concerned with files. From there you could pick read and get on with your day.
> Instead, you must know that functions related to FILE tend to start with f, and when you type f the best your editor can do is show you all functions ever written that start with an f.
Why do you think this is a problem with C? No one is stopping your tools from suggesting `fclose` by first parameter type when you write `file.`. Moreover, I know that CLion already does this.
> This is much more pleasant. Since the program is always in a somewhat valid state as you type it, your editor is able to guide you towards the Pit of Success.
Although the subtitle was “programs should be valid as they are typed”, it’s weakened to “somewhat valid” at this point. And yes, it is valid enough that tooling can help, a lot of the time (but not all) at full capability. But there’s also interesting discussion to be had about environments where programs really are valid as they are typed. Syntactically, especially, which requires (necessary but not sufficient) either eschewing delimiters, or only ever inserting opening and closing delimiters together.
That's fine for doing algebra in pure functions, but what about destructive commands or interactive scenarios?
For example, using "rm" on the command line, or an SQL "delete". I would very much like those short programs to be invalid, until someone provides more detail about what should be destroyed in a way that is accident-resistant.
If I had my 'druthers, the left-to-right prefix of "delete from table" would be invalid, and it would require "where true" as a safety mechanism.
The solution suggested by the author, I assume, is `table.where(true).delete()` and `all_my_data.rm()`, which indeed has the property you describe.
Smalltalk is pretty strictly left-to-right, but that becomes one of its faults as well, as infix operators are all at equal precedence.
Disagree. In the first example the author seems to want something more like imperative programming, so the "loop" construct would come first. But then the assignment should come last. With the Python syntax you get the thing you're assigning first - near the equals sign - and then where it is selected from, with any filtering criteria. It makes perfect sense. If you disagree, that's fine; the whole post is an opinion piece.
> the whole post is an opinion piece.
It's not. The author gives objective reasons why Python's syntax is inferior – namely, that it makes IDE support in the form of discoverability and auto-completion more difficult.
This presupposes that the reader likes or even uses auto-complete. I do not, and there are many others like me.
And I'm sure there are people who program in Notepad or nano. If you want to develop software like it's the 80s again, go ahead; the rest of us appreciate at least basic IDE support.
This is called point-free style in Haskell.
Sometimes it is called a fluent-interface in other languages.
Or "point-less" style ;)
Could you elaborate? AFAIK tacit programming tends to involve scrambling around composition, parens, and args, which makes left-to-right reading significantly harder for functions with arity greater than 2.
I find Java's method reference or Rust's namespace resolution + function as an argument much better than Haskell tacit-style for left-to-right reading.
It's chaining functions with a dot, just like you do in typical OO languages.
When it's OO, it's a virtue that everyone loves - a "fluent interface".
When it's FP - oh it's unreadable! Why don't they just break every line out with an intermediate variable so I know what's going on!
> Sometimes it is called a fluent-interface in other languages.
Where've you heard it called that? I've normally heard it called tacit programming.
The developers of JMock, the original library for Java.
I've often thought that if C just had Go-style syntax for defining methods, C would be much easier to write.
uint16_t (MyStruct* s) some_func() { .. }
uint16_t MyStruct_some_func(MyStruct* s) { .. }
(I guess I should just use C++... but C++ is overwhelming in many ways)
This can obviously be expanded to top to bottom programming, and there's a related principle of reorder / relocation.
If/else-if fails the relocation principle across many languages, since the first branch must be if and the middle ones else if. Switch tends to pass.
Languages that don't allow trailing commas also fail.
C# has something very similar to Python 'comprehensions', but written left to right like TFA describes:
One of the things I dislike about function notation is that in f(g(h())) execution order is right-to-left. I like OO partly because execution order is writing order ( h().g().f() )
In Clojure I love the threading macro which accomplishes the same: (-> (h) (g) (f))
Nim and other languages addressed that by making . akin to a pipe operator. It takes the left and sends it to the right, as the first argument
In Haskell there's also dot (edit - my bad, not dot, pipe) syntax which allows composing functions left to right. I believe Nim also has it. Just as a few more examples.
I don’t think that’s correct. The dot operator is composition with the same semantics as when using normal math notation.
`f . g` is still doing g first, then f, right? Haskell has pipelines in the Flow library.
https://hackage-content.haskell.org/package/flow-2.0.0.9/doc...
No need to import a third party library: you can use `Data.Function ((&))` for this:
Ooops my bad. Been too long.
Check out Gleam; it's the best there is here.
Here's an example I found https://github.com/gleam-lang/example-todomvc/blob/main/src/...
words_on_lines = [ret for line in text.splitlines() for ret in [line.split()]]
> let words_on_lines = text.lines().map(|line| line.split_whitespace());
But you still have to define line without autocomplete.
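For what it's worth, the single-element `for ret in [...]` clause in the comprehension above acts like a let-binding; a quick sketch showing it produces the same result as the plain form:

```python
text = "a b\nc d"

# The trick: bind line.split() to ret via a one-element iterable.
trick = [ret for line in text.splitlines() for ret in [line.split()]]

# The plain comprehension, for comparison.
plain = [line.split() for line in text.splitlines()]

assert trick == plain == [["a", "b"], ["c", "d"]]
```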
I don't think making your tools happy is a particularly valid reason to code in one style vs another.
Left-to-right is also much easier to wrap your head around. Just imagine the data going on a conveyor belt with a bunch of machines instead of having to wrangle a nested tree.
A not insignificant number of functional languages disagree. You often read functional code from the inside out, which basically means you're reading right to left as you move up function calls.
For the record though, "it's more readable" is a much better argument than "LSP mad".
Right-to-left is equivalent to left-to-right in this regard: you're still reading in one linear direction, as opposed to Python, where you have to read inside-out and revisit later parts of the line to correctly understand previous parts of the line.
Fair enough. And like I said, your argument is certainly better than the article's.
I'm not really a fan of list comprehensions; I usually just use for loops. It does seem consistent with Python's syntax, though. A for loop is `for item in items` and comprehensions have `item for item in items`.
(x, y)f;
If your tools are happy they can make you happy.
I have no tools because I've destroyed my tools with my tools.
I also want to note that in the JS it's Math.abs(x) instead of x.abs() (as seen in Rust).
And, because nerd sniping, two Rust implementations, one a direct port of the JS:
(`x.abs() >= 1 && x.abs() <= 3` would be better as `(1..=3).contains(&x.abs())` or `matches!(x.abs(), 1..=3)`.) And one optimised to only do a single pass:
EDIT: To add:
The filter, list construction, and len aren't needed either. It's just:
Or alternatively: the predicate is complex enough and used twice, so it warrants extraction into its own named function (or a lambda assigned to a variable), but even if it were still embedded, this form would be slightly clearer (along with adding the line breaks and removing the extra list generations).
Ooh yes, thanks, I didn’t even think about that, I was too focused on the other parts. Much nicer, updated.
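The original snippets didn't survive formatting, but the point about dropping the filter/list/len combination can be sketched like this (the predicate and data are made up for illustration, echoing the abs-in-1..=3 check mentioned earlier):

```python
def is_small(x):
    # Hypothetical stand-in for the thread's predicate (abs value in 1..=3).
    return 1 <= abs(x) <= 3

data = [1, -2, 5, 3, 0]

# Building a list only to take its length:
via_list = len([x for x in data if is_small(x)])

# Counting directly over a generator - no intermediate list:
via_sum = sum(1 for x in data if is_small(x))

assert via_list == via_sum == 3
```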
The author's argument is a strong argument against the way Lisp does it, at least for most code, as Lisp tends to put the verb to the left and nouns to the right so your IDE has no idea what nouns you are working with until you've picked the verb.
Pretty much every functional language is verb-initial, as are function calls as well as statements in most mainstream languages.
Verb-final languages like PostScript and Forth are oddballs.
Embedded verbs are a thing of course, with the largest contribution coming from arithmetic expressions with infix operators, followed by certain common syntax like "else" being in the middle of an "if" statement, followed by things like these Python comprehensions and <action> if <condition> syntactic experiments and whatnot.
Right, his argument is in favor of the "fluent" syntax like
that is typical in many OO interfaces, though you sometimes see a pipe syntax which would be autocomplete friendly in languages like Clojure: https://clojuredocs.org/clojure.core/-%3E
or F#
https://stackoverflow.com/questions/12921197/why-does-the-pi...
> If you aren’t familiar with Rust syntax, |argument| result is an anonymous function equivalent to function myfunction(argument) { return result; }.
> Here, your program is constructed left to right. The first time you type line is the declaration of the variable. As soon as you type line., your editor is able to suggest available methods.
Yeah, having LSP autocomplete here does feel nice.
But it also makes the code harder to scan than Python. Quick readability at a glance seems like the bigger win than just better autocomplete.
> But it also makes the code harder to scan than Python. Quick readability at a glance seems like the bigger win than just better autocomplete.
It depends a lot on what you’re accustomed to. You get used to whichever style. Just like different languages use different sentence order: subject, object and verb appear in all possible orders in different languages, and their speakers get along just fine. There are some situations where one is clearly superior to the other, and vice versa.
`text.lines().map(|line| line.split_whitespace())` can be read loosely as “take text; take its lines; map each line, split it on whitespace”. Straightforward and matching execution flow.
`[line.split() for line in text.splitlines()]` doesn’t read so elegantly left-to-right, but so long as it’s small enough you spot the `for` token, realise you’re dealing with a list comprehension, and read it loosely from left to right as “we have a list made up of splitting each line, where lines come from text, split”. Execution-wise, you execute `text.splitlines()`, then `for line in`, then `line.split()`. It’s a bunch of left-to-rights embedded in a right-to-left. This has long been noted as a hazard of list comprehensions, especially the confusion you end up with with nested ones. Now you could quibble over my division of `for line in text.splitlines()` into two runs; but I think it’s fair. Consider how in Rust you get both `for line in text.split_lines() { … }` and `text.split_lines().for_each(|line| { … })`. Sometimes the for block reads better, sometimes .for_each() or .map() or whatever does. (But map(lambda …: …, …) never really does.)
Python was my preferred language from 2009–2013 and I still use it not infrequently, but Rust has been my preferred language ever since. I can say: I find the Rust version significantly easier to read, in this particular case. I think the fact there are two levels of split contributes to this.
On the other hand, in Rust you often have to add something like
which kind of ruins the elegance :P
It indeed doesn't look elegant; however, I've never seen a usage like this in my experience. Do you have any reference where you might have seen this kind of usage?
This is not meant to be taken literally, I was making fun of how Rust often requires a lot of punctuation and thinking about details of memory allocation.
Gotcha, yeah you got me there with so many `&`. Good one.
This is it right here. I love the Python ecosystem but if I can find a JS alternative I'm using that for ergonomics.
Ocaml has the pipe operator |> (kinda similar to pipe in Bash) for this purpose
This is honestly why I love C++ ranges right now. The "pipe" syntax is a "left-to-right" of writing very powerful map/filter operations in contexts where I'd want a list comprehension but in a much more sensical (and for that matter customizable) order
I'm curious how much of this is basically overcome by the new tools. As much as vibe coding annoys me, I can't argue against how good autocomplete from these LLMs can be.
Very human of us to spend billions of dollars and tons of electricity to bang our programming languages into shape since sane syntax is slightly uncomfortable the first 10 minutes you work with it.
I believe there are some strongly typed stack based languages where you really always do have something very close to a syntactically correct program as you type. But now that LLMs exist to paper over our awful intuitions, we're stuck with bad syntax like python forever.
I'm honestly not sure how I feel on it, all told.
On one level, I do prefer my code to be readable left to right and top to bottom. This, typically, means the big "narrative" functions up top and any supporting functions will come after they were used. Ideally, you could read those as "details" after you have understood the overall flow.
On another level, though, it isn't like this is how most things are done. Yes, you want a general flow that makes sense in one direction through text. But this often has major compromises and is not the norm. Directly to programming, trying to make things context free is just not something that works in life.
Directly to this discussion, I'm just not sure how much I care about small context-free parts of the code?
Tangential to this discussion, I oddly hate comprehensions in python. I have yet to get where I can type those directly. And, though I'm asking if LLM tools are making this OBE, I don't use those myself. :(
So far any "autocomplete" from an LLM has only served to insanely disrupt my screen with forty lines of irrelevant nonsense that cannot be turned off (the "don't suggest multiline autocompletes" option does not prevent LLM autocomplete from doing so). It has been less useful than the non-LLM autocompletes, which I was massively impressed with, because they were hyperlocal and didn't pretend that the niche code I'm writing is probably identical to whatever Google or Microsoft writes internally.
I've had some minor success with claude, but enabling the AI plugin in intellij has literally made my experience worse, even without using any AI interactions.
"This is bad because your editor can’t help you autocomplete it as you write it."
No, you are bad because you use an editor with autocomplete.
And it's not even debatable; it's like playing bowling with rails, or riding a bicycle with training wheels. Sure, you can argue that you are more efficient at bowling and riding a bike with those, but you are going to be arguing alone. It's much better to realize that Python is one of the best languages at the moment, and therefore one of the best languages ever, instead of being a nobody and complaining about a language because you are too encumbered by your own ego to realize that you are not as good a programmer as you thought.
Nothing wrong with being an amateur programmer or vibecoding or whatever, but if you come for the king you best not miss
Incidentally, Darklang had this built into the language/editor combo from the start. You can see it in the examples on our youtube, maybe this one: https://www.youtube.com/watch?v=NQpBG9WkGus
To make a long story short, we added features for "incomplete" programs in the language and tools, so that your program was always valid and could not be invalid. It was a reasonable concept, and I think could have been a game changer if AI didn't first change the game.
Another point to this is that we should switch to writing the numbers in the right order too: 123 should be three-hundred-twenty-one.
This way, as soon as you type "1", it is truly one, you add a "2" and you know that's adding twenty...
IIRC Arabic gets this right!
/s
Haha. Except in Arabic numbers are written as in English.
Right, but all the other letters are written right-to-left too :)
> the Python code in the previous example is still readable
Yes, I agree with the author: list comprehensions are readable, and I'd add, practical.
> it gets worse as the complexity of the logic increases
Ok, well, this is something that someone would be unlikely to write... unless they wanted to make a contrived example to prove a point. It would be written more like:
See also the Google Python style guide, which says not to do the kind of thing in the contrived example above: https://google.github.io/styleguide/pyguide.html (Surely in any language it's possible to write very bad, confusing code using some feature of the language...)
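The cleaner rewrite referenced above was lost in formatting; as a hypothetical sketch (names invented), the usual refactor is to pull the per-line logic into a named helper plus a simple comprehension, instead of one deeply nested comprehension:

```python
def clean_words(line):
    """Split a line and drop words that are only punctuation."""
    return [w for w in line.split() if w.strip(".,!?")]

text = "one two!\n...\nthree"
words_on_lines = [clean_words(line) for line in text.splitlines()]

assert words_on_lines == [["one", "two!"], [], ["three"]]
```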
And note:
^- a list comprehension is just convenient shorthand for a `for` loop: just move the `line.split()` to the front and remove the empty-list creation and the append.
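Reconstructing the loop form from that description (empty list, append in a loop):

```python
text = "a b\nc d"

# The explicit loop the comprehension abbreviates:
result = []
for line in text.splitlines():
    result.append(line.split())

assert result == [line.split() for line in text.splitlines()]
```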