Thank you Jeremy for another amazing session on not only the use of regex’s and their use within the Python context, but also for allowing us to look into your problem solving process. I cannot thank you enough for generously giving us your time and sharing your knowledge with us in such an approachable way.
I use regex’s off and on for work, but so infrequently that I have not put a whole lot of time in deliberate practice. I usually just get lazy and look for the specific examples on stack-exchange and once I have something working for the task at hand, I move on to the next thing, but I really appreciated your comment about mastering regular expressions, I think it’s a worthwhile pursuit (and on my learning bucket list now.)
Hi Jeremy and everybody, I only started to watch live coding 18, and just want to get all the thank-yous off my chest. I really appreciate that everyone spent so much time and effort on solving the problem I proposed! Thank you so much Jeremy for teaching us so many great techniques and processes of problem solving!
Thank you for what you’re doing Daniel! I plan on reading everything you wrote back from walk thru 1 because I want to solidify the concepts in my head but I can’t afford to watch the videos a second time.
You are very welcome Antoine! When you do that please feel free to let me know if you found anything wrong or unclear about the notes, as I also plan to revisit the videos and revise notes after I catch up with all the videos for the first round.
00:00 Daniel’s problem of searching and replacing the image links in a forum post
03:58 What is the src? (all uploaded forum image links) What does a triple quote create for us? (a multi-line string) What is the dest? (the original note/post on the forum without the uploaded forum image links).
05:15 Why do we try to solve this problem here? (go through the process of solving actual daily real life problem).
05:52 How to put the src and dest into text files instead of a Jupyter cell? How to paste content into a txt file entirely in the terminal? After you have copied the content, how does Jeremy ask the terminal to accept input for a text file named src.txt? (your-terminal% cat > src.txt) After you have pasted (or typed) the content, how do you tell the terminal to finish the file? (hit ctrl + d, meaning end of file in linux) How do you quickly use vim to check the file? (vim !$).
07:06 What does cat do? (it concatenates the texts you give it, including what you type in, and writes them out) How to print out the content of a file like src.txt? (your-terminal% cat < src.txt) How do you copy src.txt into dest.txt using cat? (terminal% cat < src.txt > dest.txt) You may have seen cat > filename << EOF, which does exactly the same thing but ends the input when you type EOF on its own line instead of hitting ctrl + d.
10:54 Why did Jeremy create fastcore.utils and what’s inside of it? Which library can you use from fastcore.utils to create a path for src.txt and read text from it? (Path('src.txt').read_text()) How to print out the text content nicely not in a long string form? (print(src)) How to print just the first few rows of the text? (!head {src_path}).
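A minimal sketch of the read-and-inspect step, in case it helps when rereading the notes. It uses only the standard library's pathlib.Path rather than importing from fastcore.utils, the file contents are made-up stand-ins for the real src.txt, and a temporary directory is used so nothing in your working folder gets overwritten.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for src.txt; the real file holds the uploaded image links.
tmp_dir = Path(tempfile.mkdtemp())
src_path = tmp_dir / 'src.txt'
src_path.write_text('first line\nsecond line\nthird line\n')

src = src_path.read_text()   # the whole file as one string
print(src)                   # print() renders the newlines instead of showing \n

# Pure-Python equivalent of `!head -n 2 {src_path}` in a Jupyter cell
first_rows = src.splitlines()[:2]
print(first_rows)            # ['first line', 'second line']
```

Typing src on its own in a cell shows the long one-line string with \n in it, which is why print(src) is the nicer view.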
13:08 Why does Jeremy make abbreviations the way he does? (giving similar things names of similar length helps the eye pick up problems easily) Why does Jeremy keep a standard way of naming things for himself? (so there is no need to think about it anymore).
14:30 How does Jeremy think of regex? What is Jeremy’s concept of notation? Why should we study regex? What else should we learn at some point in the future? (perl one liners) How to turn on a vim mode to use regex the same way as in python and javascript? (vim very magic mode).
18:51 Guessing what Daniel wants for his image sizes (Sorry for the trouble guys!).
23:36 How to check the last few rows from the text path? (!tail {dst_path}) How to find the row with text ‘fix-test’ in it from src.txt? (!grep fix-test {src_path}).
23:51 What is rg (ripgrep) and why is this newer tool a better grep? How to install it? (brew install rg) Can we use mamba in paperspace to install ripgrep to do the same tasks, as it won’t take up much space? Will homebrew and mambaforge get in the way of each other in $PATH? (check echo $PATH and it should not) How to use rg to search in a file in a Jupyter cell?
26:53 What does Jeremy usually do when exploring the use of a function, such as re.findall here? (use a simple case to figure what the function does) How to insert a new line before a string? (print('\nfix-test')) How to tell python a \n is just a string \n instead of inserting a new line? (use r, in print(r'\nfix-test')).
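A tiny sketch of the difference between a normal string and a raw string, following the simple-case habit described above. The variable names are mine, not from the video.

```python
plain = '\nfix-test'   # \n is a single newline character
raw = r'\nfix-test'    # a backslash followed by the letter n, two characters

print(plain)           # prints an empty line, then fix-test
print(raw)             # prints \nfix-test literally

print(len(plain), len(raw))   # 9 10
```

This is why regex patterns are usually written with the r prefix: the backslashes reach the re module untouched.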
28:21 How to split every link in the src.txt into 3 parts of info? (step 1: copy a link and paste it into re.findall(r'')) How to tell python that [], (), and | are just plain characters and that it should ignore their special meanings? (step 2: put a \ in front of them) How to find the name part of all rows/links in the src string? (step 3: use (\S+) to represent the name as a group of 1 or more non-white-spaces) How to find the size part and url part in each row of the src string? (step 4: apply the same to the size and url parts as in step 3) How to not extract the size part? (step 5: remove the parentheses around it).
30:55 How to create a dictionary from a tuple? How to put all the names and url links from a tuple into a dictionary so that we can use any name to get its url? Why interactive programming is so good for us?
31:51 What does the () do to the name and url? (the () help us to remember the things inside for extraction; without it regex will only search for them as part of the string pattern) What would happen if the string pattern is not correct? (You won’t find anything, as regex can’t match the pattern with strings in src) Is re.findall the easiest to use compared to re.match etc?
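Steps 1 to 5 and the dictionary step can be sketched roughly like this. The two links in src are invented examples in the forum's ![name|size](url) style, not the real data from the session, so treat the exact pattern as an assumption about that format.

```python
import re

# Hypothetical sample of forum-uploaded image links (made-up names and urls)
src = (
    '![fix-test-error|690x400](upload://abc123.png)\n'
    '![plot-loss|500x300](upload://def456.jpeg)\n'
)

# Steps 2 to 4: escape [ ] ( ) | and capture name, size and url as \S+ groups
triples = re.findall(r'!\[(\S+)\|(\S+)\]\((\S+)\)', src)
print(triples[0])   # ('fix-test-error', '690x400', 'upload://abc123.png')

# Step 5: drop the parentheses around the size so only name and url are returned
pairs = re.findall(r'!\[(\S+)\|\S+\]\((\S+)\)', src)

# A list of (name, url) tuples turns straight into a dictionary
name_to_url = dict(pairs)
print(name_to_url['plot-loss'])   # upload://def456.jpeg
```

If the pattern does not fit the text, re.findall simply returns an empty list, which is a quick signal to go back and simplify the pattern.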
33:24 How to do search and replace with regex in python? How to use re.sub in simple cases, like replacing a white space with a *? (see easy case below) How to search a string in re.sub? (see search string below) How to search all the image links and capture everything of the name except a .? (image below) Also don’t forget to escape the dot before png so that it is treated as a literal dot.
36:25 How to turn .png or .jpeg into a pattern of a bunch of letters or digits? (\w+) How to look for things that are not a closing square bracket? (Jeremy considers this a simple approach, but it is actually powerful too) How to search for any as or bs or cs? ([abc]) How to search for anything that is not an a or b or c? ([^abc]) How to search for anything but an open square bracket? ([^\[]) How to find 1 or more things that are not a closing square bracket? (see image below, [^\]]+ and note the pattern ends with two closing square brackets \]\]).
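A small sketch of these character classes on a toy string, in the same spirit of trying the simplest case first. The test strings are made up for illustration.

```python
import re

s = 'cab-[x]'

print(re.findall(r'[abc]', s))     # ['c', 'a', 'b']        any a, b or c
print(re.findall(r'[^abc]', s))    # ['-', '[', 'x', ']']   anything but a, b or c
print(re.findall(r'[^\[]+', s))    # ['cab-', 'x]']         runs with no open bracket
print(re.findall(r'[^\]]+', s))    # ['cab-[x']             runs with no closing bracket

# \w+ matches a run of letters, digits or underscores, so it covers png and jpeg alike
print(re.findall(r'\w+', 'test.png other.jpeg'))   # ['test', 'png', 'other', 'jpeg']
```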
38:28 How to debug regex? (gradually delete things until the pattern is simple enough and working) What does re.sub(r'!\[\[([^.])', '*', d) actually do? (replace ![[f with *) What would re.sub(r'!\[\[([^.]+)', '*', d) do? (replace ![[fix-test-error with *) Now we can successfully describe the pattern of the note-image-link string with r'!\[\[([^.]+)\.[^\[]+\]\]', but what does it mean exactly? (we are looking for strings that start with ![[, and right next to it there are 1 or more things that are not a dot, and right next to that there is a dot, and next to it there are a bunch of things which are not an open square bracket, and next to them there are two closing square brackets).
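The build-it-up-gradually debugging approach can be replayed like this. The string d is a made-up one-link stand-in for the real note, and the name fix-test-error is taken from the example above.

```python
import re

# Hypothetical note text containing one note-style image link
d = 'Some text ![[fix-test-error.png]] more'

# One non-dot character after ![[ is consumed
print(re.sub(r'!\[\[([^.])', '*', d))    # Some text *ix-test-error.png]] more

# One or more non-dot characters: the whole name is consumed
print(re.sub(r'!\[\[([^.]+)', '*', d))   # Some text *.png]] more

# The full pattern swallows the entire link, including the closing ]]
full = r'!\[\[([^.]+)\.[^\[]+\]\]'
print(re.sub(full, '*', d))              # Some text * more
```

Each step adds one piece to the pattern, so when the output stops looking right you know which piece broke it.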
39:51 ^ outside a [] means at the start of a line, and inside [] means Not. What are the online resources to test your regex expression? (regex101.com) Does Jeremy think regex is a notation worth mastering?
44:36 Jeremy is about to show us some amazing things which most python programmers don’t know. How to use a function instead of a string to do the replacement? How to also print out the replaced things?
47:31 How to replace the note-image-link with the name of the image? What does Jeremy think of his incremental process of doing things in Jupyter notebook?
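A rough sketch of passing a function to re.sub instead of a replacement string. The dictionary, the note text and the function name repl are all made up for illustration, so this is not necessarily the exact code from the video.

```python
import re

# Hypothetical lookup built earlier from the forum links (made-up url)
name_to_url = {'fix-test-error': 'upload://abc123.png'}

d = 'Intro\n![[fix-test-error.png]]\nOutro'
pat = r'!\[\[([^.]+)\.[^\[]+\]\]'

def repl(m):
    # m is the match object; group(0) is the whole link, group(1) the captured name
    name = m.group(1)
    print('replacing', m.group(0))
    return f'![{name}]({name_to_url[name]})'

res = re.sub(pat, repl, d)   # repl is called once per match
print(res)
```

Because the function receives the match object, it can print what it is replacing and look things up in a dictionary, which a plain replacement string cannot do.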
49:18 How can we also keep the |900 part for use later? How do we use a pattern to capture both png and jpeg? (\w+) How to say a pattern of the string may or may not exist? (?).
1:00:40 Jupyter: How to open and close the output of a Jupyter cell? (o) However, the forum has its own size expression, it wants things like |900x600 rather than just |900. Therefore, we must capture the forum uploaded image sizes as well. How do we do that?
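One possible way to put the optional size and the forum's own size together, as a sketch only. The sample src and dest texts, the info dictionary and the repl function are my assumptions about the formats, and the final code in the video may well differ.

```python
import re

# Hypothetical forum links (src) and note text (dest)
src = (
    '![fix-test-error|690x400](upload://abc123.png)\n'
    '![plot-loss|500x300](upload://def456.jpeg)\n'
)
dest = 'Before\n![[fix-test-error.png|900]]\nMiddle\n![[plot-loss.jpeg]]\nAfter'

# name -> (forum size, url), keeping the size group this time
info = {name: (size, url)
        for name, size, url in re.findall(r'!\[(\S+)\|(\S+)\]\((\S+)\)', src)}

# \w+ covers png and jpeg; (\|\d+)? makes the |900 part optional
note_pat = r'!\[\[([^.]+)\.\w+(\|\d+)?\]\]'
print(re.findall(note_pat, dest))   # [('fix-test-error', '|900'), ('plot-loss', '')]

def repl(m):
    # Use the size the forum reported (e.g. 690x400) rather than the note's |900
    name = m.group(1)
    size, url = info[name]
    return f'![{name}|{size}]({url})'

fixed = re.sub(note_pat, repl, dest)
print(fixed)
```

Note that re.findall returns an empty string for the optional group when the |900 part is missing.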
1:11:10 How to turn the notebook of solving daniel-image-fix into an app? The first step is to clean the notebook and simplify/condense the code, and let’s enjoy how Jeremy does it in real time.
Jupyter: How to print out a text file in terminal from a cell? (!cat src.txt)
By now, the gradio app is working locally in the Jupyter notebook.
1:21:47 How to create a space on HuggingFace named daniel-img-fix? How to create the app.py right in the browser? How to create the requirements.txt file so that the fastcore library gets installed online?
1:26:47 What’s Jeremy’s plan for future live-coding sessions? (walkthru for lectures after lectures are out; and before that there will be 2 weeks of APL sessions)
1:28:41 What is APL? Is APL also a notation? Turing award.
1:29:39 What is Dyalog? (a popular APL implementation) What can we do with it? What is the history of APL development? Why does Jeremy call it a mathematical notation?
1:36:59 Why does Jeremy think it is fun and useful for pytorch, tensor and numpy programmers to study APL? How fun it is to define a function in APL?
1:46:22 People said it takes 6 months to read and understand the language. Jeremy is taking us to explore and learn together. How does Jeremy teach math to young children with APL?
Jeremy mentioned that regex, as a notation worth mastering, can be studied intensively or learned as you go. What does studying it intensively look like? How would Jeremy and everyone propose to do it intensively?
It was great to see plenty of regex in this session. If anybody is up for having a pleasant* evening drinking a glass of your favourite beverage and solving regex crossword, there’s this wonderful website called https://regexcrossword.com/
I spent way too much time on it a few years ago when it first launched, maybe it’s “somewhat fun” for some more folks in here.