• 1 Post
  • 11 Comments
Joined 1 year ago
Cake day: December 28th, 2023

  • First, thanks again for sharing your knowledge with me. I really appreciate the time and effort you took to write all of this. I know that’s a lot of thank-yous :/ but I’m really grateful; this is very valuable information that I will keep in my knowledge base. It’s really time I learned proper bash/Python/Perl scripting with all those tools (grep/sed/regex).

    Second, YOU MISSED A DAMNED parenthesis, you fool xD ! The corrected line is mdlinks="$(grep -Po ']\((?!https).*\)' ~/mkdn)". It took me some time to figure it out from the very uninformative error bashscript.sh: line 8: unexpected EOF while looking for matching "' but as expected it works !

    From
    -------
    [Just a test](#Just%20a%20test.md)
    [Just a link](https://mylink/%20with%20space.com)
    %20
    
    To
    -------
    [Just a test](#Just-a-test.md)
    [Just a link](https://mylink/%20with%20space.com)
    %20
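
A minimal sketch of the extraction step on its own, assuming GNU grep built with PCRE support (the -P flag). The function name list_local_links and the temporary sample file are hypothetical and only there for illustration. The negative lookahead (?!https) rejects any ]( that is immediately followed by https, so only local targets are printed:

```shell
#!/usr/bin/env bash
# Print the "](target)" part of every markdown link whose target does not
# start with "https". Assumes GNU grep with PCRE (-P) support.
list_local_links() {
    grep -Po '\]\((?!https).*\)' "$1"
}

# Hypothetical sample file mirroring the "From" example.
sample="$(mktemp)"
printf '%s\n' \
    '[Just a test](#Just%20a%20test.md)' \
    '[Just a link](https://mylink/%20with%20space.com)' \
    '%20' > "$sample"

list_local_links "$sample"   # prints: ](#Just%20a%20test.md)
rm -f "$sample"
```

Because .* is greedy, a line holding several links is returned as one match running up to the last closing parenthesis on that line.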
    

    Next, to show my appreciation and not take everything for granted or be spoon-fed, I tried to find a solution myself for something else. I will try to explain as best I can how I solved it.

    From
    -------
    [Just a test](Another%20markdown%20file.md#Hello%20World)
    
    To
    -------
    [Just a test](Another%20markdown%20file.md#hello-world)
    

    The part before the hashtag needs to keep its initial form (it links to the original markdown file). So, instead of just playing around with Perl and regex (which doesn’t end well when done blindly without the proper knowledge), I did some simple string manipulation. It’s not very elegant but does the trick, thanks to your well-written breakdown.

    • I printed out the $mdlinks variable just to see what it contains
    • Copied and changed your Perl regex to find the first hashtag (#) and saved the result into a new variable ($mdlinks2)
    • Fed your $mdlinks variable into my new Perl regex
    • Fed my new variable into done? (I’m a bit confused here but okay xD)
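    #!/bin/bash
    # Collect the "](target)" part of every markdown link that is not an https link.
    mdlinks="$(grep -Po ']\((?!https).*\)' "/home/dany/newtest.md")"
    echo "$mdlinks"
    
    # Keep only the part from the first hashtag (#) onward.
    mdlinks2="$(grep -Po '#.*' <<<"$mdlinks")"
    echo "$mdlinks2"
    
    # Replace %20 with - in each fragment and write the change back to the file.
    # Note: $line is used as a sed pattern, so a "." in it matches any character.
    while IFS= read -r line; do
    	dashlink="$(echo "$line" | sed 's|%20|-|g')"
    	sed -i "s/$line/${dashlink}/" "/home/dany/newtest.md"
    done <<<"$mdlinks2"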
    

    Yes, not very elegant, but it’s the best I could do currently :/ However, I still got a YES effect :P
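
For comparison, the same transformation can be sketched as a single sed pass that only touches the part after the #. This is an illustration rather than the method from the thread: it assumes GNU sed (\L is a GNU extension), and the function name fix_fragments is hypothetical. The loop replaces one %20 per iteration inside the fragment, then the last expression lowercases the fragment:

```shell
#!/usr/bin/env bash
# Rewrite only the "#fragment" part of markdown link targets:
# %20 becomes "-" and the fragment is lowercased. Text before "#" is kept.
# Assumes GNU sed (for \L); fix_fragments is a hypothetical name.
fix_fragments() {
    sed -E \
        -e ':a' \
        -e 's/(\]\([^)#]*#[^)]*)%20/\1-/' \
        -e 'ta' \
        -e 's/(\]\([^)#]*)(#[^)]*\))/\1\L\2/g'
}

printf '%s\n' '[Just a test](Another%20markdown%20file.md#Hello%20World)' | fix_fragments
# prints: [Just a test](Another%20markdown%20file.md#hello-world)
```

Links without a # are left alone, which covers plain https links. An https link that carries its own #fragment would still be rewritten, so this sketch is only safe when external links have no fragment.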


    To answer your question:

    Quick question as I’m working on this, in the new link example, is the BDMV and other capitalized text in this link supposed to be converted to lowercase, or to remain uppercase?

    As you can see in my string manipulation above, the part before the # needs to keep its original form :) (Sorry, I wasn’t aware of this before working with the original files.)

    I’m a bit tired from all this searching and trial & error. Tomorrow I will try to wrap everything up and answer your post below :) ! Also, I need to clean up the mess I made in my home directory xD.

    Thanks again for your help ! Have a good night/day !


  • Hello !!!

    Sorry for the very late response, I had something else to do. I will read everything carefully and respond to every post :) I also thought about it overnight and I think that sed and regex weren’t the best option here (as others have mentioned).

    I think a Python script or bash (as you mentioned a bit later) would be a better way. I’m sorry that I put you through all of this… wrong tool for the job :s.



  • Sure :)

    I don’t know if it’s still a thing, but in the past some web URLs had spaces in their addresses, e.g.

    https://www.my/%20website%20with%20spaces.com
    

    In markdown you can link to external web addresses like so

    [some link to a web address](https://my/%20website%20with%20spaces.com)
    

    However, without the /https/ ! guard, s|%20|-|g replaces all occurrences of %20 (which is considered a space in HTML? Sorry if I’m wrong here :s still have a lot to learn) with -. This would break the link to the web URL: [some link to a web address](https://my-website-with-spaces.com/). Am I wrong here?


    If I may, I just found something else that doesn’t quite work 😅 and it seems a bit harder to fix, I think ! Sometimes I have links in this form:

    [1.3 Subtitles](BDMV_svt-av1_encode_anime.md#1.3%20Subtitles)
    

    As you can see, I prefix the header with 1.3 and, as dumb as it is… it also needs to become 1-3-subtitles

    e.g.

    [1.3 Subtitles](BDMV_svt-av1_encode_anime.md#1.3%20Subtitles)
    

    Needs to become

    [1.3 Subtitles](BDMV_svt-av1_encode_anime.md#1-3-Subtitles)
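
One way this could be approached, as a rough sketch and not a tested drop-in: loop inside the #fragment and turn every %20 or literal dot into a dash, leaving the file name in front of the # untouched. It assumes sed with -E support, and the function name dash_fragment is hypothetical:

```shell
#!/usr/bin/env bash
# Inside the "#fragment" of a markdown link target, turn each "%20" and each
# "." into "-". The file name before "#" (including ".md") is not changed.
# dash_fragment is a hypothetical name; one replacement is made per loop pass.
dash_fragment() {
    sed -E \
        -e ':a' \
        -e 's/(\]\([^)#]*#[^)]*)(%20|\.)/\1-/' \
        -e 'ta'
}

printf '%s\n' '[1.3 Subtitles](BDMV_svt-av1_encode_anime.md#1.3%20Subtitles)' | dash_fragment
# prints: [1.3 Subtitles](BDMV_svt-av1_encode_anime.md#1-3-Subtitles)
```

The link text in square brackets keeps its dot because the pattern only starts matching at the ]( that opens the target.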
    

    Sorry for my bad English, I’m trying my best haha ! Hope it’s comprehensible.

    Edit:

    I don’t know why, but Lemmy adds /%20 instead of %20 in my fake URLs ://



  • Haha we cross-replied !

    .* did the trick and removes the need for my additional s|]\(.+#.+\) to include that pattern from my last reply !

    Last question: /https/ ! s|%20|-|g changes all occurrences of %20 in the whole file except on lines that contain https. Is there any way to change only the occurrences that appear inside the markdown link pattern []()?

    e.g. replace in [Some text](some%20text.md) but not in Hello I'm just some%20place holder text ?

    Thanks again for your easy to read and very informative walk through ! 🤩


  • Sorry to spam your unread message 😅 !

    I played a bit around and came to the following conclusion:

    s|]\(#.+\)|\L&| - Works great for in-document links, so I expanded it to s|]\(#.+\)|\L&|;s|]\(.+#.+\)|\L&| to also cover the following pattern: [Some Text](readme.md#hello%20world.md)

    s|%20|-|g - Works on every occurrence of %20, even in the following pattern [Some text](https://my/%20home%20page.com), which would break all external links to the web. So I used /https/ ! s|%20|-|g instead.

    What I’m doing is probably very sloppy and not as elegant as your command, but it does the trick :) If you want to expand on it further, feel free; however, the following command does exactly what I wanted:

    sed -re 's|]\(#.+\)|\L&|;s|]\(.+#.+\)|\L&|;/https/ ! s|%20|-|g'
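
A small illustration of how the command above behaves, assuming GNU sed; the wrapper name original_cmd and the sample lines are hypothetical. The /https/ ! address works per line, so any line containing https is skipped entirely, including local links that share that line:

```shell
#!/usr/bin/env bash
# Wrap the command from the post so it can be fed sample lines on stdin.
# Assumes GNU sed (-r, \L). original_cmd is a hypothetical name.
original_cmd() {
    sed -re 's|]\(#.+\)|\L&|;s|]\(.+#.+\)|\L&|;/https/ ! s|%20|-|g'
}

printf '%s\n' \
    '[Some Text](#Hello%20World)' \
    '[a](some%20text.md) and [b](https://my/%20home%20page.com)' | original_cmd
# prints:
# [Some Text](#hello-world)
# [a](some%20text.md) and [b](https://my/%20home%20page.com)
```

The second line keeps its %20 in the local link too, because the whole line is excluded once https appears anywhere in it.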
    

    Thanks again from the bottom of my heart !


  • Thank you, thank you very much for taking the time to help me out here ! I really appreciate your full breakdown and complete development ! I haven’t tried it out yet, but from skimming through your post I’m sure it will work out !

    However, I forgot to mention something:

    The goal of this expression is to find markdown links, and to ignore https links. In your post you indicate the markdown links all start with a # symbol, so we don’t have to explicitly ignore the https as much as we just have to match all links starting with #.

    This is only true for links in the same file. If I link to another file, it looks something like this:

    [Why SVT-AV1 over AOM?](readme.md#Why%20SVT-AV1%20over%20AOM?)
    

    I can try to wrap my head around it and find a solution by myself; with your well-written breakdown I’m sure I can try something out. But if you think it will be too complex for my limited knowledge, feel free to adjust :).

    Do you mind If I ping you if I’m not able to solve the issue?

    Thanks again !!! 👍



  • Hello,

    I have thought of a Python script and looked around a bit but couldn’t find anything satisfactory. Also, I’m a tiny bit more versed in bash/CLI than in Python… Even though that’s very arguable !

    I looked through the GitHub repo and at first glance I have no idea how this could do the job. Again, I probably have to dig a bit deeper and understand what it is actually doing !

    Thanks for the pointer, I will give it a try :)