Joined 11 months ago
Cake day: May 9th, 2024


  • Interesting, I have not heard of these terms before. Thanks for sharing!

    I think this adds the bit of nuance that was bugging me: using something like ncurses or vim, presumably when you press a key like ctrl-z or ctrl-d it actually sends the character to the app. It would feel a bit silly if the terminal intercepted the ctrl-d, flushed some buffer, and the program had to reverse engineer whether you pressed ctrl-d or enter or something.

    For raw mode, I assume the app asks the tty to please forward some characters to the app. Otherwise, in the default cooked mode, the tty intercepts those control characters to call certain functions. I suppose some REPLs may choose to emulate a cooked mode on top of raw mode, and so they have to handle the \x04 in the same way a tty would to keep it functioning like the user expects. I believe readline does something like this, which is why you had to use bash --noediting for ctrl-d to run the command. Good food for thought :)
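
A small sketch for checking which mode your own terminal is in. It only uses `stty -a`; the function name `show_tty_mode` is made up for illustration, and the exact output layout differs between systems.

```shell
#!/bin/sh
# Print the tty settings relevant to cooked (canonical) mode, if stdin is a tty.
show_tty_mode() {
    if [ -t 0 ]; then
        # "icanon" means cooked mode is on, "-icanon" means it is off.
        # "eof = ^D" is the character the tty acts on instead of passing through.
        stty -a | tr ';' '\n' | grep -E 'icanon|eof'
    else
        echo 'stdin is not a terminal, so there are no tty settings to show'
    fi
}

show_tty_mode
```

Turning off `icanon` is one part of what people usually call raw mode, so in a normal shell prompt you would expect to see `icanon` without the leading minus.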

    I also have to say, naming it “cooked mode” is extremely funny as gen z. I love that


  • $ cat
    You sound very nice :)
    You sound very nice :)
    Bye<ctl-d>Bye
    
    Oh wait, and cool too
    Oh wait, and cool too
    <ctl-d>
    $ 
    

    The Ctl-D didn’t end the file when I typed “Bye” :( It only worked when I pressed Ctl-D on its own line. So how does cat know that it should ignore the EOF character if there is some text that comes before it?

    What Ctl-D does is flush the input to the program, and the program sees how big that input is. If the length of the input is 0 that is interpreted as EOF. So Ctl-D is like Enter because they both flush the input, but Ctl-D is unlike Enter because it does not append a newline before flushing, and as a consequence you can send empty input (aka an EOF “character”) with Ctl-D.


  • On any reasonable terminal, RETURN has a key of its own

    This reminds me of a time at work when I was not on a reasonable terminal. I was explaining to a co-worker how I automated some tasks by running some scripts, but in my demo my RETURN key didn’t work, so I had to improvise and use CTRL+M which worked, hahaha. I don’t know how the terminal got in such a bad spot but it was probably something to do with msys on Windows… honestly not sure. It was perfect timing to have happen while teaching of course ;)

    I would also be doing a disservice not to share what the book you linked says about CTRL+D. Right after your quote, it says:

    Other control characters include ctl-d, which tells a program that there is no more input

    This is pretty good for an introduction, but it is not the full story. It explains CTRL+D properly later (chapter 2, page 45):

    Now try something different: type some characters and then a ctl-d rather than a RETURN:

    $ cat -u
    123<ctl-d>123
    

    cat prints the characters out immediately. ctl-d says, “immediately send the characters I have typed to the program that is reading from my terminal.” The ctl-d itself is not sent to the program, unlike a newline. Now type a second ctl-d, with no other characters:

    $ cat -u
    123<ctl-d>123<ctl-d>$
    

    The shell responds with a prompt, because cat read no characters, decided that meant end of file, and stopped. ctl-d sends whatever you have typed to the program that is reading from the terminal. If you haven’t typed anything, the program will therefore read no characters, and that looks like the end of the file. That is why typing ctl-d logs you out — the shell sees no more input. Of course, ctl-d is usually used to signal an end-of-file but it is interesting that it has a more general function.

    This is why the article says it’s “like pressing enter,” because it flushes the input just like enter. The difference is that enter sends a newline, but CTRL+D does not, so you can exploit that to send no data (and the program chooses to interpret that as an EOF).






  • To me, sentences ending in a period feel immutable and without nuance, but sentences without a period feel incomplete, or open to change. Leaving out the period is almost a way to say, “this is what I think right now, but I might reconsider.” So it’s not that periods are rude per se, but it may appear that you’ve made up your mind and are closed off to interpretation. Sometimes I intentionally remove a period or turn it into an ellipsis for exactly that reason. It’s just way too easy to misinterpret people’s intentions through text for me not to type in a way I think reduces misinterpretation.

    As for being associated with older people… anecdotally speaking, my coworkers sound like they were taught that there is an immutable, proper way in the world, and so they express themselves in that proper way. Nothing wrong with that really! Once I get a feel for their personality, I find it kind of endearing :)



  • No problem. I think this is a great “final boss” question for learning sed, because it turns out it is deceptively hard!! You have to understand not only a lot about regex, but about sed to get it right. I learned a lot about sed just by tackling this problem!

    I really do not want to mess around with your regex

    It is very delicate for sure, but one part you can safely change is the # Add hyphens part. In the regex you can see (%20|\.). This is a list of “characters” which get converted to hyphens. For example, you could modify it to (%20|\.|\+) and it will convert +s to -s as well!
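
Here is a minimal sketch of just that hyphen loop with the extended list, run on a bare `file#fragment)` tail (which is what the pattern space holds at that step). The function name `hyphenate_fragment` is made up, and the loop is split into separate `-e` pieces only so the label parses the same everywhere; it assumes a sed that supports `-E`.

```shell
#!/bin/sh
# Hypothetical helper: replace %20, "." and "+" with "-" inside the fragment
# (the part after "#") of a "file#fragment)" tail. The file name is untouched
# because the separator has to come after the first "#".
hyphenate_fragment() {
    sed -E -e ':h' \
           -e 's/^([^#]*#[^)]*)(%20|\.|\+)([^)]*\))/\1-\3/' \
           -e 'th'
}

printf '%s\n' 'readme.md#Other%20Page.v2+beta)' | hyphenate_fragment
# readme.md#Other-Page-v2-beta)
```

Each pass replaces one separator (the last one in the fragment, since `[^)]*` is greedy), and `th` repeats until nothing is left to replace.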

    Still it is not perfect:

    • If the link spans multiple lines, the regex won’t match
    • If the link contains escaped characters like \\\\\[LINK](#LINK) or [LINK\]\\\\](#LINK), it won’t be parsed correctly
    • If the link is inside a code block ``` it will get changed (which may or may not be intended)

    But for a sed-only solution this is about as good as it will get I’m afraid.
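
The multi-line limitation is easy to see in isolation. This sketch only checks the link shape the script looks for (`[label](file#fragment)`) with `grep`, one line at a time, which is also how sed reads its input. The helper name `count_link_lines` is made up.

```shell
#!/bin/sh
# Count the input lines that contain a complete [label](file#fragment) link.
# "|| true" keeps the exit status at 0 when grep finds no matching lines.
count_link_lines() {
    grep -cE '\[[^]]*\]\([^)#]*#[^)]*\)' || true
}

# Link on a single line: one matching line.
printf '%s\n' '[a link](readme.md#Other%20Page)' | count_link_lines
# 1

# Same link broken across two lines: neither line matches on its own.
printf '%s\n' '[a link](readme.md#Other' '%20Page)' | count_link_lines
# 0
```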

    Overall I’m very happy with it. Someday I would like to make a video that goes into depth about sed, since it is tricky to learn just from the docs.


  • I did it!! It also handles the case where an external link and internal link are on the same line :D

    sed -E ':l;s/(\[[^]]*\]\()([^)#]*#[^)]*\))/\1\n\2/;Te;H;g;s/\n//;s/\n.*//;x;s/.*\n//;/^https?:/!{:h;s/^([^#]*#[^)]*)(%20|\.)([^)]*\))/\1-\3/;th;s/(#[^)]*\))/\L\1/;};tl;:e;H;z;x;s/\n//;'
    

    Here is my annotated file

    # Begin loop
    :l;
    
    # Split pattern space at the first link: the tail stays in pattern space, the head is appended to hold space
    # Example: `text [label](file#fragment)'
    #   Pattern space: `file#fragment)'
    #   Hold space: `text [label]('
    # Steps:
    #   1. Strategically insert \n
    #       1a. If this fails, branch out
    #   2. Append to hold space (this creates two \n's. It feels weird for the
    #      first iteration, but that's ok)
    #   3. Copy hold space to pattern space, remove first \n, then trim off
    #      everything past the second \n
    #   4. Swap pattern/hold, and trim off everything up to and incl the last \n
    s/(\[[^]]*\]\()([^)#]*#[^)]*\))/\1\n\2/;
    Te;
    H;
    g; s/\n//; s/\n.*//;
    x; s/.*\n//;
    
    # Modify only if it is an internal link
    /^https?:/! {
        # Add hyphens
        :h;
        s/^([^#]*#[^)]*)(%20|\.)([^)]*\))/\1-\3/;
        th;
        # Make lowercase
        s/(#[^)]*\))/\L\1/;
    };
    
    # "conditional" branch so it checks the next conditional again
    tl;
    
    # Exit: join pattern space to hold space, then move to pattern space.
    # Since the loop uses H instead of h, have to make sure hold space is empty
    :e;
    H;
    z;
    x; s/\n//;
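
A quick usage sketch for the script above, on a line that has an external and an internal link side by side. The wrapper name `fix_links` is made up, and this assumes GNU sed, since `T`, `z`, `\L` and `\n` in the replacement are GNU extensions.

```shell
#!/bin/sh
# The one-liner from this comment, wrapped in a function for reuse.
# Assumes GNU sed (T, z, \L, and "." matching a newline in pattern space).
fix_links() {
    sed -E ':l;s/(\[[^]]*\]\()([^)#]*#[^)]*\))/\1\n\2/;Te;H;g;s/\n//;s/\n.*//;x;s/.*\n//;/^https?:/!{:h;s/^([^#]*#[^)]*)(%20|\.)([^)]*\))/\1-\3/;th;s/(#[^)]*\))/\L\1/;};tl;:e;H;z;x;s/\n//;'
}

# The external link keeps its fragment; the internal one is rewritten.
printf '%s\n' 'See [docs](https://example.com/#Some%20Link) and [local](readme.md#Other%20Page)' | fix_links
# See [docs](https://example.com/#Some%20Link) and [local](readme.md#other-page)
```

Lines without any `#` link pass through unchanged, because the first substitution fails and `Te` jumps straight to the exit label.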
    


  • Why you assume there’s only one link in the line?

    They did not want external (http) links to be modified as that would break it:

    • [Example](https://example.com/#Some%20Link)
    • [Example](https://example.com/#some-link)

    I compromised by thinking that it might be unlikely enough to have an external http link AND an internal link within the same line. You could probably still do it. My first thought was [^h][^t][^t][^p], but that would cause issues for #ttp and #A, so I just gave up. Instead I think you’d want a different approach, like breaking each link onto its own line, doing the same external/internal check before the substitution, and joining the lines afterward.
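
To show why the [^h][^t][^t][^p] idea falls over, here is a small sketch that uses it as an “is this link internal?” test with `grep`. The helper name `naive_internal_links` and the sample links are made up for illustration.

```shell
#!/bin/sh
# Print the lines that the naive "not http" pattern would treat as internal
# links. The pattern needs four characters after "](", and each position only
# rejects a single letter.
naive_internal_links() {
    grep -E '\]\([^h][^t][^t][^p]' || true
}

printf '%s\n' \
    '[a](#some-link)' \
    '[b](https://example.com/)' \
    '[c](#ttp)' \
    '[d](#A)' | naive_internal_links
# [a](#some-link)
```

Only the first link is reported. The https link is rejected as intended, but `#ttp` is also rejected (its second character is a `t`), and `#A)` is rejected because fewer than four characters are left on the line.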

    Also, you perform substitutions in the whole URL instead of the fragment component

    I missed that requirement. I just assumed the filename would be replaced the same way too Lol. Not too hard to fix tho :)


  • Annotated, it works like this:

    # use a loop to iteratively replace each %20 with -, since doing s/%20/-/g would replace too much. we loop until it can't substitute any more
    
    # label for looping
    :loop;
    # the address with ! skips the substitute command if the line contains an http link in markdown format
    # (the address and its command have to be on the same line)
    # the s command captures each part of the link and joins it together with -
    /\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*\)%20\([^)]*)\)/\1\2-\3/g;
    # if the substitution made a change, loop again, otherwise break
    t loop;
    
    # convert everything inside the link's parentheses to lowercase if the line doesn't contain an http link
    # this is outside the loop rather than in the s command above because if the link doesn't contain %20 at all then it wouldn't be converted to lowercase
    /\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*)\)/\1\L\2/g
    

  • This is very close

    sed ':loop;/\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*\)%20\([^)]*)\)/\1\2-\3/g;t loop;/\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*)\)/\1\L\2/g'
    

    example file

    [Some text](#Header%20Linking%20MARKDOWN.md)
    (#Should%20stay%20as%20is.md)
    Text surrounding [a link](readme.md#Other%20Page). Cool
    Multiple [links](#Links.md) in (%20) [a](#An%20A.md) SINGLE [line](#Lines.md)
    Do [NOT](https://example.com/URL%20Should%20Be%20Untouched.html) CHANGE%20 [hyperlinks](http://example.com/No%20Touchy.html)
    

    but it doesn’t work if you have an http link and a markdown link in the same line, and it doesn’t work with [escaped \] square brackets](#and-escaped-\)-parenthesis) in the link
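
A short sketch of both behaviours: one line from the example file, plus a made-up line that mixes an http link with an internal one. The wrapper name `fix_links_v1` is hypothetical, and this assumes GNU sed for `\L` and for labels that end at a semicolon.

```shell
#!/bin/sh
# The one-liner from this comment, wrapped in a function.
fix_links_v1() {
    sed ':loop;/\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*\)%20\([^)]*)\)/\1\2-\3/g;t loop;/\[[^]]*\](http/! s/\(\[[^]]*\]\)\(([^)]*)\)/\1\L\2/g'
}

# An internal link on a line of its own is hyphenated and lowercased.
printf '%s\n' 'Text surrounding [a link](readme.md#Other%20Page). Cool' | fix_links_v1
# Text surrounding [a link](readme.md#other-page). Cool

# With an http link on the same line, both commands are skipped for the
# whole line, so the internal link is left as it was.
printf '%s\n' '[ext](http://example.com/A%20B) and [int](#C%20D)' | fix_links_v1
# [ext](http://example.com/A%20B) and [int](#C%20D)
```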

    but!! it was fun!