bsder 3 days ago

Even as someone who doesn't mind writing in C, I would absolutely flag that function as way, way, way, way too terse for no good reason.

The code gets ridiculously easier to read if you write src[0] and src[1] instead of (*src) and (*(src+1)). And, as a bonus, the precedence problem disappears.

I really don't understand people why write C code like the original code. It's just asking for a bug.

  • shash 3 days ago

    In general avoid frivolous use of pointer arithmetic. foo[k] and *(foo+k) will usually generate identical asm, and the former is just easier to read…

    • danhau 3 days ago

      Usually? I‘m willing to bet it will always. I wouldn‘t be surprised if the standard even specifies these two to be identical.

      • 1718627440 3 days ago

        a[b] is defined as syntactic sugar for *(a+b), so yes.

    • cozzyd 3 days ago

      And if you want to go for eclectic, you can do [k]foo

      • loeg 2 days ago

        You mean k[foo].

        • cozzyd 2 days ago

          Yes indeed.

  • hyghjiyhu 3 days ago

    For me it would be natural to use src[1] for an array but *(src+1) for an iterator pointer.

  • commandersaki 3 days ago

    I know the precedence because I memorise the terse implementation of strcpy, but (also) write *src++ as *(src++) and even though it remains a bit of a mouthful the parenthesis ensures there is no doubt how to evaluate the expression.

  • stevage 3 days ago

    To be fair, OP rewrote it as `src[1]`. It sounds like this was old code, maybe they weren't a good coder when they started.

  • spyrja 3 days ago

    And for the love of God, please don't do 0[src].

foofoo12 3 days ago

It's really easy to add a remote code execution feature to your project, especially if you code like this. You might even add that feature subconsciously at 3am, while hacking on a different feature.

johnfn 3 days ago

Is this really a difficult precedence issue? It seems quite obvious to me that *foo + 1 parses as (*foo) + 1.

One that has gotten me more than once in Python (I really don’t code Python that much) is “1 + 2 if True else 3”. I keep thinking the parenthesis are “1 + (2 if True else 3)”, but it’s actually (1 + 2). Or am I lying to you and it’s the other way around?! I don’t know, why don’t you go check the Python interpreter :)

  • jolmg 3 days ago

    Conditional expressions/operators, including e.g. the `?:` ternary operator in C-like languages, typically have about the lowest precedence, higher only than the assignment operators (and the comma operator in C-like languages). It's not just a Python thing; you'll find `+` having higher precedence basically everywhere.

    Think of

      x = 1 + 2 if True else 3
    
    like a shorthand for:

      if True:
        x = 1 + 2
      else:
        x = 3
    
    which can be a common pattern in languages that don't really have conditional expressions, like bash.
  • II2II 3 days ago

    > Is this really a difficult precedence issue? It seems quite obvious to me that foo + 1 parses as (foo) + 1.

    Keep in mind that precedence rules are arbitrary constructs, typically based upon what the rule maker perceived as more convenient. Perceptions will vary from person to person, so there is no objective obvious about them. Heck, there isn't even anything obvious about infix notation (see Forth or Lisp). Or, in the case of unary operators, it isn't obvious that the operator should come before or after the object it is operating on (consider how we negate as a prefix, while factorial is a suffix).

    • johnfn 3 days ago

      Do you think ++foo + 1 parses as ++(foo + 1)? Despite what you say, this seems obvious to me.

  • magicalhippo 3 days ago

    I really dislike relying on precedence when it's more complicated than a few terms of basic arithmetic.

    Parentheses are free and makes it absolutely clear what the intention is.

  • afiori 3 days ago

    I have often felt this doubt but the only two cases where my intuition was actually wrong was with `new` operators and php's ternary operator

  • ErroneousBosh 3 days ago

    Yes but as I said yesterday on another post, "Yngwie Malmsteen Code".

    You could write it clearly by saying foo[1] instead of *(foo+1) which is what they ended up doing, but hey, pointer arithmetic looks complicated and clever, so let's show off with a WEEDLYWEEDLYWEEDLY guitar solo bit of code.

    • quietbritishjim 2 days ago

      When you are manipulating (mostly) one pointed-to element at a time, and incrementing the pointer itself in between, then that's quite a different mindset compared to using an index into an array. I agree that the subscript operator is the cleanest solution here. My point is just that I think it was missed because it's easy to overlook, rather than because, as you say, someone is deliberately trying to be too clever.

loeg 3 days ago

There's another bug, where you don't update psrc in those error return cases. The parser will be stuck at the malformed % forever. Or maybe that is desired; it's hard to tell.

The precedence stuff here is pretty basic. When in doubt, using parentheses to make order explicit is never wrong. Or consult https://en.cppreference.com/w/c/language/operator_precedence... .

  • augustk 3 days ago

    Better stick to a single point of exit.

    https://news.ycombinator.com/item?id=20311080

    • foofoo12 2 days ago

      There is nothing wrong with early exit, but you have to be sensible.

      Just like there's nothing wrong with the ternary operator if you are sensible. I've seen nested ?: abominations that would make Jesus give you a funny look.

    • SAI_Peregrinus 2 days ago

      I disagree, I think returns should either be at the very start (checking inputs should be able to return before anything happens based on invalid inputs) or at the very end. Proceeding to do a bunch of stuff & then try to undo it because an input was invalid is error-prone.

      • augustk a day ago

        If the input parameters are invalid there is an error in the program so it makes more sense to use the assert function. Why would you need to undo something due to the restriction of a single point of exit?

    • loeg 2 days ago

      Well, maybe. Could also just use psrc directly instead of doing the manipulation in src.

    • 1718627440 2 days ago

      The discussion you linked to convinces me of the opposite.

kazinator 2 days ago

isxdigit(*src) has another problem. The value *src is of type char, which can be negative.

The function has undefined behavior if given any value that is not in the range 0 to UCHAR_MAX, or the value EOF.

Some implementations are robust in the face of the values -128 to -1 but ISO C doesn't require that.

Surac 3 days ago

I have a little cheat sheet sticking on the side of my monitor that I update everything I have to look up some syntax. There I have a table on operator precedence and a reminder when to use & and &&. I never need the sheet but I always have ist in clear eyesight

  • abanana 3 days ago

    > I never need the sheet

    Often, just the act of making a cheat sheet somehow helps to fix the principles in the brain, so you rarely if ever need to refer to it afterwards. Something I've personally found multiple times anyway.

kazinator 2 days ago

I feel that *str + 1 is a strawman precedence issue in C compared to the real ones.

procaryote 3 days ago

The code style on this makes my eyes bleed. It was a long time since I saw anyone do "if () single statement; else { block of statements }"

Making the first thing a block doesn't add any lines and makes it less brittle, and makes future diffs better

And they do some weird alignment of assignments, and for some reason carry on adding extra spaces for some assignments even when they're alone in a block?

And they go out of their way to do pointer arithmetic rather than array operations that are more readable

And the code is essentially sscanf(str, "%%%2x", &value) plus some checks, so why not write that instead?

Also what kind of psycho uses CppStyleFunctionNames() in C?

  • zabzonk 3 days ago

    > Also what kind of psycho uses CppStyleFunctionNames() in C?

    People influenced by the Win32 C API. I prefer it that way myself.

    • pjmlp 3 days ago

      That style predates Windows.

      • GabrielTFS 3 days ago

        I would guess a significant portion of people using the style (if not most), did so inspired by Windows, though

        • pjmlp 3 days ago

          It was common across all not UNIX operating systems.

          You will find it on MS-DOS, Amiga, OS/2, Mac OS.

          It was based on what was common in ALGOL derived languages.

          On UNIX you will find it on X Windows, and Motif.

          • uxp100 3 days ago

            Macintosh toolbox was Pascal first right? Or at least there was an era where it was. And I think this naming convention is kinda a pascal thing.

            • pjmlp 3 days ago

              Yes, Apple is the actual creator of Object Pascal variant, and there was a Pascal based OS using P-Code for Apple II GS.

  • stevage 3 days ago

    Tbh it's been a while since I've seen anyone manually format code. Automatically formatted code definitely does help avoid certain kinds of bugs.

    Does C not have the equivalent of Prettier?

    I am curious what their editing process was that changed:

    > assert(isxdigit((src+1)));

    to

    > if (!isxdigit(src+1)) return '\0';

    • commandersaki 3 days ago

      Does C not have the equivalent of Prettier?

      There is clang-format, but it is not always easy to use on a code base because you don't want to have it run on imported/3rd party source which may mix in with your regular source tree. I've been meaning to use it on a project but only have run on an explicit list of files.

    • 9029 3 days ago

      > I am curious what their editing process was

      They said:

      > I typed in the new code as that's faster than modifying the existing code

      • stevage 3 days ago

        Ah. I bet there's a lot of vim users violently disagreeing...

    • ciupicri 3 days ago

      There's the good old GNU indent https://www.gnu.org/software/indent/

      > But even if you fail in getting emacs to do sane formatting, not everything is lost: use indent.

      > Now, again, GNU indent has the same brain-dead settings that GNU emacs has, which is why you need to give it a few command line options. However, that’s not too bad, because even the makers of GNU indent recognize the authority of K&R (the GNU people aren’t evil, they are just severely misguided in this matter), so you just give indent the options -kr -i8 (stands for K&R, 8 character indents), or use scripts/Lindent, which indents in the latest style.

      > indent has a lot of options, and especially when it comes to comment re-formatting you may want to take a look at the man page. But remember: indent is not a fix for bad programming.

      (Linux kernel coding style, https://www.kernel.org/doc/html/latest/process/coding-style....)

  • lastdong 3 days ago

    I was thinking the same thing; also being more explicit would have prevented the bug described in the first place