Hacker News

9 days ago by anyfoo

> The premier example of the adverse effects of cleverness in programming language design (and one which is obvious to programmers at all skill levels) must surely be the C/C++ declaration syntax [10]. On the surface, it seems like an excellent notion: let declarations mirror usage. Unfortunately, the very concept undermines the principle of visually differentiating semantic difference.

Does it ever. From today's point of view, C's declaration syntax is just bizarre, especially if you've spent some time with a "modern" type system[1]. The result of this declaration-mirrors-usage cleverness is not just the more encompassing insanity like the emergence of the "spiral rule", as found here: [2]. (Isn't it fun that that site promises to enable "any C programmer to parse in their head any C declaration!"? You know you have a problem when that's a novelty.)

It's also the little paper cuts that emerge, like the fact that it should be "char* c", with the * hugging the char because the * modifies the type and so is part of it; but because of the "mirror usage" concept you'd better actually write "char *c", since that's how the parser groups it: as part of the variable.

Sure enough, multiple declarations in one statement must have confused many novices, who wrote something like "char* p, q" and surprisingly got a "char* p" and a "char q".
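
Spelled out:

    char *p, q;    /* p is a pointer to char, q is a plain char */
    char *r, *s;   /* to get two pointers, each needs its own * */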

> Students have enough trouble mentally separating the concepts of declaration and usage, without the syntax conspiring to blur that crucial difference.

Yeah. I learned C in the 90s (and fortunately by now know exactly where to distribute my "volatile"s and "const"s in a complex type), but I still remember how confusing it was that the * practically did the opposite in declaration vs. usage. In a declaration it makes a pointer, in usage it dereferences it.

Consider the very important difference between those:

    char *p = q;   /* declaration: the * makes p a pointer */
    *p = q;        /* statement: the * dereferences p      */

Fun.

[1] "Modern" in quotes, because one prime example, Haskell, is old enough to appear in this 1996 article.

[2] http://c-faq.com/decl/spiral.anderson.html

8 days ago by adrian_b

The problems with "*" in C/C++ declarations and also in expressions stem mostly from the mistake of making "*" a prefix operator instead of a postfix operator.

This mistake was recognized by Dennis Ritchie himself a very long time ago, but it could not be corrected, as that would have broken backward compatibility.

In 1965, the language EULER by Niklaus Wirth & H. Weber was the first high-level language with explicit pointer variables (previous languages, e.g. LISP I, FORTRAN IV and CPL, used pointers only implicitly) and with the two operators "address-of" and indirection, the ancestors of C's "&" and "*".

However in EULER the indirection operator was postfix, as it should be. Niklaus Wirth kept this operator as postfix in his later languages, e.g. in Pascal.

With a postfix indirection operator, when you write out some element of a hierarchical organization of data, which can involve an arbitrary number of the three operations (array indexing, structure/class member selection and pointer indirection), all the operations are written and executed from left to right, without any parentheses and without any ambiguities. (Also, the C operator taken from PL/I, "->", is no longer needed, as it has no advantage over writing "*.".)
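
To make the contrast concrete, here's a sketch with made-up struct names; the postfix lines are hypothetical syntax, shown as comments:

    struct inner { int g; };
    struct outer { struct inner *f; };

    int get_g(struct outer *p) {
        /* C's prefix "*" forces parentheses and inside-out reading: */
        return (*(*p).f).g;
        /* with a postfix dereference it would read strictly left to right: */
        /*   return p*.f*.g;    (or Pascal-style p^.f^.g -- not valid C)    */
    }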

This simple change would have avoided most pitfalls of the C declarations.

8 days ago by jeltz

Rust copied this mistake. Rust would have been neater if taking references and dereferencing were postfix operators. At least they made await a postfix operator.

8 days ago by ModernMech

To Rust's credit, 90% of the time I find I don't need to dereference much at all. For example, when you make a method call on a reference, you don't have to dereference first, and there is no -> operator in Rust. The only time I ever really have to dereference manually in my experience is if I'm matching on some struct reference, and I want to take some field and put it in a new place.

e.g.

  match &foo {
      Foo { bar } => Baz { quux: *bar },
  }
(schematic, but you get the point)

8 days ago by DeathArrow

I don't think prefix usage of the * operator is a big flaw. I learned C/C++ in high school and never had an issue with it. I understood that the operator is applied to the variable, not to the type, and that declaring a pointer is not the same thing as dereferencing it.

Even if someone may have made a mistake when declaring a pointer in C or C++ it is something they can get over pretty fast.

8 days ago by ncmncm

Dereference syntax in C++ is very, very fixable.

C++ could easily adopt a postfix operator^ as a synonym for prefix op*, even at this late date. Then (*p)->m could be written p^->m, or p^^.m. (The last is clearly best.)

Rust, too, could do the same, for the same benefit. And C could, too, if there were any point.

The only people to complain would be IDE vendors.

If you were then allowed to declare a pointer argument like "char p^", or even "char^ p", you could begin to dispense with use of "*" as a unary operator. Declaring "char^ p, q" would mean both p and q are pointers, eliminating another source of beginners' confusion.

8 days ago by habibur

I didn't understand it either. Then I learned assembly for a few months for a project -- switched back to C and everything came naturally.

8 days ago by Koshkin

Note that in C at least the '->' is completely redundant and the '.' alone would suffice. (C++ is a somewhat different story.)

8 days ago by scoopdewoop

Not without parentheses, which is why -> was introduced.
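
Concretely:

    struct point { int x; } pt = {1}, *p = &pt;
    int a = (*p).x;   /* "." binds tighter than "*", so the parentheses are mandatory */
    int b = p->x;     /* same access; "->" exists to avoid them                       */
    /* *p.x would parse as *(p.x), which doesn't compile: p is a pointer, not a struct */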

8 days ago by veltas

I think this was only considered acceptable because in early C you couldn't initialise automatic variables. So you would never be confused by an association like `char *p = q; *p = q;`, because you couldn't write that first part. And declarations were all separate at the front of the function.

8 days ago by abecedarius

The bigger mistake was to mix a prefix pointer operator in with the postfix [i] and (x). You end up with the spiraling-out parsing even for expressions, not just declarations. With Pascal's postfix ^ everything reads left to right.
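
For example (a sketch; the postfix forms in the last comment are hypothetical, Pascal-style syntax):

    int *a[10];    /* array of 10 pointers to int    */
    int (*b)[10];  /* pointer to an array of 10 ints */
    /* the same tug-of-war appears in expressions:                            */
    /*   *a[i]   means *(a[i]), since postfix [] binds tighter than prefix *  */
    /*   (*b)[i] needs the parentheses to dereference first                   */
    /* with a postfix dereference, both read left to right: a[i]^ and b^[i]   */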

8 days ago by undefined

[deleted]

8 days ago by agent327

While you are not wrong, I would consider the two major problems in C to be array-to-pointer decay (arrays should really have been a separate type) and assignments being expressions instead of statements.

Array decay means that pointers need the ability to point to different array elements, thus requiring pointer arithmetic -- something that wouldn't have been needed at all without array-to-pointer decay.

And since assignments are expressions, they must return something, meaning they can be used in conditions like 'if (a=b) ...'. If they were statements, they wouldn't return anything, so this mistake would be impossible. You'd lose assignment chaining, but the win here would be much greater than the loss.
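
The classic trap:

    int a = 0, b = 7;
    if (a = b) { }    /* assigns 7 to a, then tests it: always true here */
    if (a == b) { }   /* the comparison that was probably intended       */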

If I had a third choice, it would be the mess around 'char'. At the very least there should have been two types: one for characters (where signedness is irrelevant) and one for the smallest integer type (which can exist in signed or unsigned form). And in these modern times an additional type, something like 'byte', would be needed to deal with aliasing issues, so that char (and things like uint8_t) could exist without that, allowing better optimisation for those types.
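
A small taste of the 'char' mess:

    char c = 0x80;   /* whether plain char is signed is implementation-defined  */
    int i = c;       /* i is -128 on most platforms, 128 where char is unsigned */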

8 days ago by adrian_b

Assignments being expressions is frequently very useful for avoiding writing the same expression twice, which could itself be a source of errors.

You are right about the errors caused by unintended assignments, but the root cause of these errors is not the fact that assignments are expressions; it is the fact that B & C replaced the operators BCPL had inherited from ALGOL, ":=" for assignment and "=" for equality test, with "=" and "==". With the original operators, typos would have been very unlikely.

This misguided substitution was justified by less typing, but a much better solution would have been to leave the syntax of the language alone and just modify the text editor to insert ":=" when pressing ":". Deleting the "=" when it wasn't wanted would have been needed only seldom, as ":" in labels or "?:" occurs much less frequently than assignments and equality tests.

7 days ago by agent327

If you have the same expression twice as part of a compound assignment, there's nothing stopping you from assigning the variable instead of repeating the expression:

    a = b = f();
vs.

    a = f();
    b = a;
And tangential, but editors really shouldn't be mangling text as you type. Whenever I install Visual Studio anywhere, it's the first thing I turn off...

8 days ago by nsajko

I also dislike array-to-pointer decay, but I'm not sure why you say it is one of the two major problems.

Could you expand on "Array decay means that pointers need the ability to point to different array elements, thus requiring pointer arithmatic"? I don't see what you mean.

A point that comes to mind: if array-to-pointer decay were really such a bad thing, the "struct containing an array" and "pointer to array" types would be used much more than they are (they are almost never used).
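
For reference, the rarely used pointer-to-array type looks like this:

    int a[4] = {1, 2, 3, 4};
    int (*pa)[4] = &a;   /* points to the whole array, a distinct type   */
    int x = (*pa)[2];    /* x == 3; pa + 1 would step over all four ints */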

8 days ago by agent327

If arrays are separate types, pointer arithmetic becomes redundant: you can't add one to a pointer to get to the next element anymore. That would remove an entire class of problems.

Pointer to array would still be a thing, of course, but it would always only point to the _array_, without the ability to manipulate it to point to another part of the array.

8 days ago by eMSF

Arrays are a separate type.

8 days ago by SkeuomorphicBee

Only in the exact scope in which they were declared; once they are passed as an argument to any function, they decay to a simple pointer.
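
A quick demonstration:

    #include <stdio.h>

    void f(int a[10]) {              /* the [10] is decoration: a is int * here */
        printf("%zu\n", sizeof a);   /* sizeof(int *), e.g. 8                   */
    }

    int main(void) {
        int arr[10];
        printf("%zu\n", sizeof arr); /* 10 * sizeof(int), e.g. 40               */
        f(arr);                      /* arr decays to &arr[0] at the call       */
        return 0;
    }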

8 days ago by agent327

Not from pointers, though.

9 days ago by ggm

I generally line up with this. Syntactic confusion is a huge barrier to "thinking" in a new computing language; now add to it having to "think in new concepts", which lies in the semantics.

It's a kind of combinatorial explosion of the meaning and power of ideas.

The expressive qualities of lambda notations can die in the inherent confusion of "am I parsing this sentence-in-lambda left-to-right or right-to-left?" And I really do mean parse: the internal brain-model of "reading the words and symbols" informs how people construct a mental model of what they see.

Yes, "a <- b" and "b -> a" could both mean b is transformed into a, but one of them is actually "read" as "a is derived from b", which is different, syntactically and in comprehension terms.

(I don't mean any computing language does this specific notation. I just mean that how we comprehend this is subtly different, and in the case of a first/new programming language, it's a higher-order problem. New concepts, new notation.)

8 days ago by bborud

Learning programming isn't so much about the language as it is about human beings and motivation. While language enthusiasts are very excited about type systems and technical details like syntax, what beginners need is to get over that initial threshold and to be able to self-motivate.

The first question to ask when choosing a language is: is it useful for practical programming? Can the students do something that feels like an actual accomplishment within the first week? If it isn't, then you can't use it as a beginner's language because it has a weak reward mechanism. The reward is to be able to solve practical problems that mean something to the student. And this is critical in the initial weeks and months.

Printing a Fibonacci sequence is not rewarding for most students. It is for some, but in my experience, not most. Printing a Fibonacci sequence with tail recursion even less so, because a student will have no idea why this may be important.

Being able to write a program that meaningfully communicates with its surroundings to do work IS inspiring. Whether it is consuming input from the outside and doing something with it, blinking a LED, making an HTTP request, or sending a message. (Or in the case of my nephew: write a Discord bot that honks a horn when his friends are online and try to get his attention. Which didn't just solve a practical problem: it gave him status among his peers, which is a strong motivator)

For instance: I tend to recommend Python as a first language. I don't like Python myself, but I recognize that it a) allows you to do practical things, b) has a relatively simple syntax, and c) doesn't really force a lot of "this will make sense later" fluff on you. It also allows you to postpone topics that initially just confuse students.

Java, JavaScript, C, C++, Lisps and perhaps more esoteric languages are for later. Yes, I know people think some of these are "the right way" to start, but that's often the teachers focusing on their own interests and not focusing on the student. Bad teacher.

Let's get people over the first couple of hurdles first, and then we can talk about what's next. But you always have to remind yourself that introductory programming is about humans.

8 days ago by urthor

I am a big fan of showing two languages at the start. Statically typed Java/something so that the idea of data types is ingrained. Then rapidly switching to Python after the first lesson.

A real problem I've seen (and had) with absolute beginners doing Python is that they really struggle with the types of their variables for the first little while when they're "getting stuff done."

"Knowing" to use type() to debug your programs isn't something that comes intuitively to a novice who's been programming for all of 25 minutes. They have absolutely no idea that the builtin is there or that they need to use it.

A quick Hello World in something with static types, and a "tour of data types" can go a long way. Also helps sell them on Python.

I wonder if Typescript is suitably fun for this. Because I'm convinced that a significant part of the attraction of Python to beginners is pure aesthetics.

8 days ago by bborud

I am not entirely convinced you need to worry too much about types in the sense we think of types. Sure, people have to have a basic understanding that there are different kinds of data, and that multiplying banana with poem doesn't make sense (well, actually, in Python it has an interpretation, so that was probably a bad example :-)).

If your program has to figure out what type some variable is before operating on it, I think the aspiring programmer has already overcome the first few barriers and is already tackling a more advanced subject.

I think initially the understanding that there are different classes of values you can assign to variables, and that getting these confused can produce unwanted results, is a very good first insight. And when that settles in the student's mind, they might be receptive to "then there's this family of languages that get really particular about what goes where".

(I also think it is important to try to not involve too much technical terminology early on. Have a look at Richard Feynman's lecture about the double-slit experiment and note how he uses absolutely no technical language. It is so well done it didn't even occur to me until after I saw the talk)

8 days ago by wink

Probably biased here, but back in the days before frameworks I found plain PHP over HTTP a good way of explaining this, out of necessity. Every variable arrives as a string anyway, and you need to check whether it's actually an integer if you want to use it as such (let's ignore the weird autocasting).

On the other hand I'm not sure it's teaching a good lesson here, because you're simply taking the un-typedness of HTTP parameters as an example, whereas in other environments you would simply explain there are number data types and strings and others...

8 days ago by Koshkin

I recommend JavaScript in the context of plain HTML as a language for the first steps in programming (especially for kids). Being able to write a simple HTML file and then see things happen immediately on a refresh is priceless.

8 days ago by bborud

I wouldn't say that JavaScript is necessarily an "easy" language. I think what you're after can easily be accomplished in most languages that have a REPL or equivalent (including interfaces such as https://play.golang.org/)

8 days ago by Koshkin

1) The subset of JavaScript that you need to know at first is extremely easy to get a grip on.

2) You already have everything you need to start playing with it (a web browser and a text editor).

8 days ago by AnimalMuppet

But C/C++, Ada, and Haskell were never intended to be introductory programming languages! Complaining about them violating rules for introductory languages is judging them by the wrong standard. (Yes, I know, instructors have tried to use them as introductory languages. That isn't the fault of the languages, though...)

7 days ago by przemo_li

Haskell was intended as an academic language on which lazy evaluation could be researched.

It was designed, to a small extent, to be used during CS studies. Witness some of those functions in the Prelude that will crash a program on incorrect input but could trivially be improved to total versions; they were defined the way they are because it eased teaching.

I was not there, so I cannot tell whether the Haskell committee put any thought into teaching younger audiences.

However, Haskell was never, ever meant for teaching The Programming. It was always about lazy evaluation and a static type system.

Indeed, herein lies the trap. A Python programmer expects a different foundation to be laid during a programming course than an R programmer, a Prolog programmer, a Haskell programmer or a Clojure programmer.

There is no introductory programming course. We specialize from the get-go, and it's only through learning multiple languages and/or advanced concepts that we generalize ;)

8 days ago by nickkell

These are still problems developers face when learning that language for the first time, even if they have experience with another language.

8 days ago by darksaints

I don't know why this is still being done with new languages, but it really peeves me when I see syntactically privileged data structures. You get them in practically all languages, whether they're records or dicts or arrays or lists.

The only language that I've seen that even gets close to the right approach is Scala. You don't use special syntax to create an array that you wouldn't also use for a list or a map or a set. There's no special bracket type, no master data structure that all others must bow to. And the result is that you don't mindlessly default to the wrong data structures for your problem.

8 days ago by Teafling

I like the way Kotlin does it:

    listOf(1, 2, 3)
    setOf(1, 2, 3)
    arrayOf(1, 2, 3)

8 days ago by gverrilla

I'm an eternally novice programmer: I only do prototypes. 15 years ago I wrote my first program. Since then I've learned a few languages, but never did I want to become an expert: I was always trying to do relatively easy stuff like a website, a numeric algorithm to make some decision based on some data, or a macro inside some third-party software. It was all very unintuitive, filled with stuff "I would grasp later" -- which I never did, because I didn't have the need. I still have zero clue what a memory pointer is, for instance, and do not intend to learn -- nevertheless, I have successfully built a few working programs.

I absolutely hated working with HTML+CSS+JavaScript, and the most intuitive language I've used is Python, but I had a few big surprises/headaches while learning it (to my level) by doing. Off the top of my head, the whole scope business was completely non-intuitive (actually I never learned it: I just made sure things worked through extensive testing and experimentation), as well as how assigning and modifying lists works:

    numbers = [1, 2, 3, 4]
    my_numbers = numbers
    my_numbers[0] = 999

I would never expect, at first, numbers[0] to print 999 after these statements, etc.

Something else that bothered me was how some methods work. If I want to make a string uppercase, why wouldn't upper(strvar, other arguments if any) work instead of strvar.upper(), given that function calls are how the user will write their own code?

Of course I don't want to dispute how it should be, because I'm not an expert. But it seems to me a much better introductory programming language than those currently available is possible: all it has to do is limit itself to novices only. And they shouldn't take "novices" to mean people with a PhD in biology trying to get into data science -- those are not novices at all: they know a lot about math, for instance. The focus should be solely on simple operations.

Hacker News' line-break editing behavior is also unintuitive, btw.

8 days ago by grumpyprole

> I would never expect, at first, numbers[0] to print 999 after these statements, etc.

You are not alone! I've always disagreed with the idea that imperative programming is "more intuitive", more familiar to many perhaps. In a functional programming language, modifications to lists/collections result in new copies (with maximal sharing) and do not mutate the original. Such "persistent" collections have to be designed and implemented to make these operations cheap. Copying an array is typically not cheap and so Python just creates a new reference (pointer) for my_numbers.
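
The same surprise, spelled out in C terms (which is roughly what Python is doing under the hood):

    int numbers[] = {1, 2, 3, 4};
    int *my_numbers = numbers;   /* no copy is made: one array, two names */
    my_numbers[0] = 999;         /* numbers[0] is now 999 as well         */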

8 days ago by chriswarbo

Many languages target "novice" audiences. Smalltalk is a classic example (https://squeak.org); a more recent example is Pyret (https://www.pyret.org).

8 days ago by igouy

Lest we forget —

"Ubiquitous Applications: Embedded Systems to Mainframe"

https://www.davethomas.net/papers/ubiquitous1995.pdf

8 days ago by Someone

FTA: “A while loop doesn't execute "while" its condition is true, but rather until its condition ceases to be true at the end of its associated code block.”

For me, the canonical version of "a while loop" is "while f do … od", not the "do … while f" that this assumes, and which, I assume, they would like to replace with something like Pascal's "repeat … until not f".

I also disagree somewhat about the use of finite-precision numbers. I think people (¿still?) will be familiar with them from using calculators, so having finite precision with propagating error values would, IMO, be fine (for signed int I would use the most negative value, 0x800…000, for that; as a bonus, it would mean restoring symmetry between negative and positive integers).

8 days ago by TeMPOraL

A very insightful article. Others have commented on its merits, but I have a different question - one I constantly return to: what about languages for professionals? Are there guidelines for design of languages meant for people who already know programming?

Our industry has an obsession with lowering barriers to entry, making everything easy for the novice at the expense of making it difficult for the seasoned professional. It's, like, the opposite of other professions, where tools are optimized to empower those doing the work, not for ease of learning. So are there any guidelines to design of powerful programming languages? Tools that sacrifice the learning curve to make them more powerful and better at managing complexity?

8 days ago by giovannibonetti

> Tools that sacrifice the learning curve to make them more powerful and better at managing complexity?

That's the value proposition of Haskell, as far as I can tell. Lisp might be a more powerful language, but it doesn't help so much in managing complexity. Haskell's type system is arguably the best at tackling complex systems, but the learning curve is tough.

8 days ago by 2RY6gGorhN733X

Both the power and simplicity of a program come from its abstractions. Any language that makes new abstraction easy allows for a seasoned professional to easily make simple and powerful programs.

Beyond that, the language itself cannot do this work for you beyond providing libraries where other programmers have done it already, and shared their work.

And the reason for this is simple. To make an abstraction, we must think, and no language can think for us. The power and simplicity of our code is a product of our thinking. The rest is just technical implementation based on specs.

With languages that make abstraction hard, programmers have created tools to help. HTML and CSS have horrible abstraction features. Server-side Perl and PHP and client-side JavaScript have been used to abstract HTML. Sass, Less, and SCSS are tools invented for CSS.

The learning curve for abstraction itself is language independent.

8 days ago by Verdex

The way that I think about it is in terms of N-dimensional spheres.

https://en.wikipedia.org/wiki/Volume_of_an_n-ball

If you check out the formulas there, you may notice that as the number of dimensions increases, an ever greater share of the sphere's "volume" ends up right next to its surface. For high-dimensional spheres, the majority of the "volume" is right at the edge. Or, poetically: the further you swim into the ocean, the deeper the water gets.
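
To put a number on it: since the volume of an n-ball scales as r^n, the fraction of the volume lying within a thin shell of thickness εR at the surface is

    1 - (1 - ε)^n   ->   1 as n grows

With ε = 0.01, a 1000-dimensional ball already has about 99.996% of its volume in its outer 1% shell.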

So, my metaphor is that if you have a field which has many different orthogonal aspects (like programming does) then the beginners (living in the exact center of the sphere) will have a relatively shallow space to swim around in. This experience can be understood and optimized. However, the further you get from the center, the greater the possible volume of space that you have to explore. At the very beginning of a programming journey you're swimming in a puddle, but at the end you're swimming in a gas giant.

I suspect that there's so much space in programming alone (not even counting software engineering) that each individual programmer on earth has the opportunity to end up in a unique ocean. Coming up with any guidelines for THAT feels like a failing venture.

Although that didn't stop me from trying. You'll notice that there isn't actually any way for us to describe bad code. We've just got best practices and code smells: best practices being "someone else was successful once while also doing this", and code smells being "this code makes my tummy feel bad ... no, I can't quantify that".

I think I've got a metric that can explain why code is bad regardless of language and domain. Although there's still a lot of work to do. Also, this is only one aspect of what you would need to do in order to design a powerful programming language for experts. I suspect we're currently in the very beginning of an effort that's going to take easily a hundred years (at least) for humanity to figure out.

8 days ago by adrian_b

I believe that there are enough examples in various niches.

One of the most obvious is APL and languages inspired by it.

While the original APL would not be a good choice today, because it lacks various features for managing program structure and more complex data organization, even the first version of APL had much more powerful methods for handling arrays/tensors than almost all popular programming languages in use today.

When using a set of high-level APL-like operators, someone familiar with them can avoid wasting a lot of time writing the low-level redundant code, i.e. loops, that most programming languages force on you.
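
A tiny example of what an array operator saves you (C on top; the APL equivalent in the comment):

    double v[] = {1.0, 2.0, 3.0};
    double sum = 0.0;
    for (int i = 0; i < 3; i++)   /* the low-level loop most languages force on you */
        sum += v[i];
    /* in APL, the entire loop is one reduction:  +/v  */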
