00:00 Daniel’s problem of searching and replacing the image links in a forum post
03:58 What is the `src`? (all the uploaded forum image links) What does a triple quote create for us? (a multi-line string) What is the `dest`? (the original note/post on the forum, without the uploaded forum image links).
05:15 Why do we try to solve this problem here? (to go through the process of solving an actual, everyday real-life problem).
05:52 How to put the `src` and `dest` into text files instead of a Jupyter cell? How to paste content into a txt file entirely in the terminal? After you have copied the content, how does Jeremy ask the terminal to accept input for a text file named `src.txt`? (`your-terminal% cat > src.txt`) After you have pasted (or typed) the content, how do you tell the terminal to finish the file? (hit `ctrl + d`, meaning end of file in Linux) How do you quickly use vim to check the file? (`vim !$`, where `!$` expands to the last argument of the previous command).
07:06 What does `cat` do? (it concatenates its inputs, e.g. the multiple texts you typed in, and writes them out) How to print out the content of a file like `src.txt`? (`your-terminal% cat < src.txt`) How do you copy `src.txt` into `dest.txt` using `cat`? (`terminal% cat < src.txt > dest.txt`) You may have seen `cat > filename << EOF`, which does exactly the same thing, except that you type `EOF` to end the input instead of hitting `ctrl + d`.
10:54 Why did Jeremy create `fastcore.utils` and what’s inside of it? Which class can you use from `fastcore.utils` to create a path for `src.txt` and read text from it? (`Path('src.txt').read_text()`) How to print out the text content nicely, not as one long string? (`print(src)`) How to print just the first few rows of the text? (`!head {src_path}`).
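A minimal sketch of the read-and-print step, assuming plain `pathlib.Path` (the notes say `fastcore.utils` makes `Path` available). The temporary file and its three lines are made up for illustration, and the slice on `splitlines()` is only a pure-Python stand-in for `!head`.

```python
from pathlib import Path
import tempfile

# Hypothetical stand-in for src.txt so the example is self-contained.
tmp_dir = Path(tempfile.mkdtemp())
src_path = tmp_dir / 'src.txt'
src_path.write_text('line 1\nline 2\nline 3\n')

src = src_path.read_text()   # the whole file as one string
print(repr(src))             # the "long string form", with \n shown literally
print(src)                   # printed nicely, one row per line

# Pure-Python equivalent of looking at the first few rows.
first_rows = src.splitlines()[:2]
print(first_rows)
```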
13:08 Why does Jeremy make abbreviations the way he does? (giving similar things names of similar length helps the eye pick up problems easily) Why does Jeremy keep a standard way of naming things for himself? (so there is no need to think about it anymore).
14:30 How does Jeremy think of regex? What is Jeremy’s concept of notation? Why should we study regex? What else should we learn at some point in the future? (perl one-liners) How to turn on a vim mode to use regex the same way as in python and javascript? (vim very magic mode, switched on with `\v` at the start of the pattern).
18:51 Guessing what Daniel wants for his image sizes (Sorry for the trouble guys!).
23:36 How to check the last few rows of the text file? (`!tail {dst_path}`) How to find the row with the text ‘fix-test’ in it from `src.txt`? (`!grep fix-test {src_path}`).
23:51 What is `rg` and why is this newer tool a better grep? How to install it? (`brew install rg`) Can we use `mamba` in Paperspace to install `ripgrep` to do the same tasks, as it won’t take up much space? Will `homebrew` and `mambaforge` get in the way of each other in `$PATH`? (check `echo $PATH`, and they should not) How to use `rg` to search in a file in a Jupyter cell?
26:53 What does Jeremy usually do when exploring the use of a function, such as `re.findall` here? (use a simple case to figure out what the function does) How to insert a new line before a string? (`print('\nfix-test')`) How to tell python that a `\n` is just the literal characters `\n` instead of a new line? (use the `r` prefix, as in `print(r'\nfix-test')`).
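The difference between the two prints can be checked directly. A small sketch; the variable names are only for illustration.

```python
plain = '\nfix-test'    # \n is one newline character
raw = r'\nfix-test'     # r prefix: a backslash followed by the letter n

print(plain)            # prints an empty line, then fix-test
print(raw)              # prints \nfix-test on a single line
print(len(plain), len(raw))
```

The raw form is one character longer, because the backslash and the `n` are kept as two separate characters. This is why regex patterns are usually written with the `r` prefix.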
Images
28:21 How to split every link in `src.txt` into 3 parts of info? (step 1: copy a link and paste it into `re.findall(r'')`) How to tell python that `[]`, `()`, and `|` are just literal characters, ignoring their special meanings? (step 2: put a `\` in front of them) How to find the name part of all rows/links in the `src` string? (step 3: use `(\S+)` to represent the name as a group of 1 or more non-whitespace characters) How to find the size part and url part in each row of the `src` string? (step 4: apply the same to the size and url parts as in step 3) How to not extract the size part? (step 5: remove the parentheses).
[Screenshots: step 1, step 2, step 3, step 4, step 5]
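The five steps can be sketched roughly as follows. The two sample links are invented, and their shape (`![name|size](url)`) is an assumption based on the notes saying that each link has a name, a size and a url separated by `[]`, `()` and `|`.

```python
import re

# Hypothetical uploaded-image links, one per row.
src = '''![fix-test-error|690x131](upload://aaa111.png)
![show-graph|690x400](upload://bbb222.jpeg)'''

# Steps 1-4: escape [ ] ( ) | with a backslash and capture each part with (\S+).
three_parts = re.findall(r'!\[(\S+)\|(\S+)\]\((\S+)\)', src)
print(three_parts)

# Step 5: drop the parentheses around the size so it is matched but not captured.
name_url = re.findall(r'!\[(\S+)\|\S+\]\((\S+)\)', src)
print(name_url)
```

With several groups in the pattern, `re.findall` returns a list of tuples, one tuple per matching row.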
30:55 How to create a dictionary from a list of tuples? How to put all the names and url links from the tuples into a dictionary so that we can use any name to get its url? Why is interactive programming so good for us?
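A small sketch of the dictionary step, assuming the (name, url) pairs have the shape that `re.findall` returns for a two-group pattern. The pairs themselves are made up.

```python
# Hypothetical (name, url) pairs, as a two-group re.findall would return them.
pairs = [('fix-test-error', 'upload://aaa111.png'),
         ('show-graph', 'upload://bbb222.jpeg')]

# dict() accepts any iterable of 2-item tuples: first item is key, second is value.
urls = dict(pairs)
print(urls['show-graph'])
```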
31:51 What do the `()` do around the name and url? (the `()` tell regex to remember what is inside for extraction; without them regex only matches those characters as part of the pattern) What would happen if the string pattern is not correct? (you won’t find anything, as regex can’t match the pattern against the strings in `src`) Is `re.findall` the easiest to use compared to `re.match` etc.?
33:24 How to do search and replace with regex in python? How to use `re.sub` in simple cases, like replacing a whitespace with a `*`? (see easy case below) How to search for a string in `re.sub`? (see search string below) How to search all the image links and capture everything of the name except a `.`? (image below) Also, don’t forget to escape the dot before `png` so that it is matched literally.
[Screenshots: easy case; search string; capture everything except a dot; don’t forget to make the dot a literal]
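A sketch of the simple `re.sub` cases, under the assumption that a note image link looks like `![[fix-test-error.png]]` (the sample text `d` is invented). The last two lines show why the dot needs a backslash.

```python
import re

# Easy case: replace every space with a star.
easy = re.sub(r' ', '*', 'a b c')
print(easy)

# Search string: match one specific link literally, escaping [ ] and the dot.
d = 'Before\n![[fix-test-error.png]]\nAfter'   # hypothetical note text
literal = re.sub(r'!\[\[fix-test-error\.png\]\]', '*', d)
print(literal)

# An unescaped dot matches any character; an escaped dot matches only a dot.
any_char = re.findall(r'a.c', 'abc a.c')
only_dot = re.findall(r'a\.c', 'abc a.c')
print(any_char, only_dot)
```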
36:25 How to turn `.png` or `.jpeg` into a pattern of a bunch of letters or digits? (`\w+`) How to look for things that are not a closing square bracket? (Jeremy considers this a simple approach, but it is actually powerful too) How to search for any `a`s or `b`s or `c`s? (`[abc]`) How to search for anything that is not an `a` or `b` or `c`? (`[^abc]`) How to search for anything but an open square bracket? (`[^\[]`) How to find a run of one or more things that are not closing square brackets? (see image below, `[^\]]+`, and note that the whole pattern ends with two closing square brackets, `\]\]`).
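The character classes above can be tried on tiny inputs first, in the spirit of exploring with a simple case. The test strings here are made up.

```python
import re

in_class = re.findall(r'[abc]', 'a1b2c3')       # any a, b or c
not_in_class = re.findall(r'[^abc]', 'a1b2c3')  # anything except a, b or c
print(in_class, not_in_class)

# Runs of characters that are not an open square bracket.
not_open = re.findall(r'[^\[]+', 'ab[cd')
print(not_open)

# A run of characters that are not ], followed by the two closing brackets.
m = re.search(r'[^\]]+\]\]', 'x.png]] tail')
ending = m.group(0)
print(ending)
```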
38:28 How to debug regex? (gradually delete things until the pattern is simple enough to work) What does `re.sub(r'!\[\[([^.])', '*', d)` actually do? (replaces `![[f` with `*`) What would `re.sub(r'!\[\[([^.]+)', '*', d)` do? (replaces `![[fix-test-error` with `*`) Now we can successfully describe the pattern of the note-image-link string with `r'!\[\[([^.]+)\.[^\[]+\]\]'`, but what does it mean exactly? (we are looking for strings that start with `![[`, followed by 1 or more things that are not a dot, then a dot, then a bunch of things that are not an open square bracket, and finally two closing square brackets).
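The debugging approach of growing the pattern piece by piece can be replayed like this. The text in `d` is a made-up note containing one image link.

```python
import re

d = 'Before\n![[fix-test-error.png]]\nAfter'   # hypothetical note text

# One non-dot character after ![[ : only "![[f" is replaced.
one_char = re.sub(r'!\[\[([^.])', '*', d)
print(one_char)

# One or more non-dot characters: "![[fix-test-error" is replaced.
up_to_dot = re.sub(r'!\[\[([^.]+)', '*', d)
print(up_to_dot)

# The full pattern: the whole link, including .png]], is replaced.
full = re.sub(r'!\[\[([^.]+)\.[^\[]+\]\]', '*', d)
print(full)
```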
39:51 `^` outside a `[]` means at the start of the string (or of each line in multiline mode), and as the first character inside `[]` it means Not. What are the online resources to test your regex expression? (regex101.com) Does Jeremy think regex is a notation worth mastering?
44:36 Jeremy is about to show us some amazing things which most python programmers don’t know. How to use a function instead of a string to do the replacement? How to also print out the replaced things?
46:07 What will `m = re.search(r'!\[\[([^.]+)\.[^\[]+\]\]', d)` return? (a match object for the first match, or `None` if nothing matches)
What will `m.group(0)` return? (the full string that matches the pattern).
How about `m.group(1)`? (the first thing captured in the pattern) The reason why we have `group(1)` is that we wrapped `([^.]+)` in parentheses.
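A short sketch of inspecting the match object, again with an invented `d`.

```python
import re

d = 'Before\n![[fix-test-error.png]]\nAfter'   # hypothetical note text
m = re.search(r'!\[\[([^.]+)\.[^\[]+\]\]', d)

whole = m.group(0)   # everything the pattern matched
name = m.group(1)    # only what the first pair of parentheses captured
print(whole)
print(name)
```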
47:31 How to replace the note-image-link with the name of the image? What does Jeremy think of his incremental process of doing things in Jupyter notebook?
48:37 How to use dictionary to replace the name with url?
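A minimal sketch of passing a function to `re.sub` and looking the captured name up in a dictionary. The dictionary contents, the sample text and the function name `repl` are hypothetical.

```python
import re

pat = r'!\[\[([^.]+)\.[^\[]+\]\]'
urls = {'fix-test-error': 'upload://aaa111.png'}   # hypothetical name -> url mapping
dest = 'Before\n![[fix-test-error.png]]\nAfter'    # hypothetical note text

def repl(m):
    # re.sub calls this once per match and inserts whatever it returns.
    print('replacing', m.group(0))   # also show what is being replaced
    return urls[m.group(1)]          # look the captured name up in the dictionary

result = re.sub(pat, repl, dest)
print(result)
```

Returning `m.group(1)` instead of `urls[m.group(1)]` would replace each link with just the image name, which is the intermediate step mentioned at 47:31.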
49:18 How can we also keep the `|900` part for use later? How do we use a pattern to capture both `png` and `jpeg`? (`\w+`) How to say that part of the pattern may or may not exist? (`?`).
50:56 How to replace the patterned string with the two captured parts as a tuple?
How to replace the patterned string with a string joined by two captured parts?
But we have a problem when the second captured part is `None`.
How to solve the `None` problem with `or`?
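One way this could look, assuming the optional size in the note is written as `|900` right after the file extension. The pattern, sample text and function name are illustrative, not necessarily what was typed in the video.

```python
import re

# (\|\d+)? makes the "|900" part optional; \w+ covers png, jpeg and so on.
pat = r'!\[\[([^.]+)\.\w+(\|\d+)?\]\]'
dest = '![[fix-test-error.png|900]] and ![[show-graph.jpeg]]'   # hypothetical

# The two captured parts as a tuple: the second is None when the size is absent.
as_tuples = re.sub(pat, lambda m: str(m.groups()), dest)
print(as_tuples)

def joined(m):
    # "or" swaps the None for an empty string so the concatenation works.
    return m.group(1) + (m.group(2) or '')

as_strings = re.sub(pat, joined, dest)
print(as_strings)
```

Without the `or ''`, joining would raise a `TypeError` for the link that has no size, because a string cannot be added to `None`.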
54:00 What to do when a key is not found in a dictionary?
How to print out the missing link with its name at its location?
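A sketch of one possible way to handle a missing key, using `dict.get` so that no `KeyError` is raised. The marker text `MISSING LINK` and all sample data are assumptions for illustration.

```python
import re

pat = r'!\[\[([^.]+)\.\w+(\|\d+)?\]\]'
urls = {'fix-test-error': 'upload://aaa111.png'}          # hypothetical mapping
dest = '![[fix-test-error.png]] then ![[not-uploaded.png]]'  # hypothetical note

def repl(m):
    name = m.group(1)
    url = urls.get(name)               # None instead of a KeyError when absent
    if url is None:
        print('missing link:', name)   # report it while processing
        return f'MISSING LINK: {name}' # hypothetical marker left at that location
    return url

result = re.sub(pat, repl, dest)
print(result)
```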
57:13 It’s almost there.
1:00:40 Jupyter: How to open and close the output of a Jupyter cell? (`o`) However, the forum has its own size expression: it wants things like `|900x600` rather than just `|900`. Therefore, we must capture the sizes of the forum-uploaded images as well. How do we do that?
How to put 3 things into a dictionary and access them easily?
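One possible layout for the three things is a dictionary keyed by name whose value is a (size, url) tuple. The link format and sample rows below are assumptions, as in the earlier sketches.

```python
import re

# Hypothetical uploaded-image links in the assumed ![name|size](url) form.
src = '''![fix-test-error|690x131](upload://aaa111.png)
![show-graph|690x400](upload://bbb222.jpeg)'''

# Three groups per row: key on the name, keep size and url together as the value.
imgs = {name: (size, url)
        for name, size, url in re.findall(r'!\[(\S+)\|(\S+)\]\((\S+)\)', src)}

# Tuple unpacking gives easy access to both values at once.
size, url = imgs['show-graph']
print(size, url)
```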
1:05:56 Now the forum image sizes problem is fixed.
1:08:27 Watch out! Sometimes the `dest` may contain an image name which is not a key in the dictionary built from `src`.
1:11:10 How to turn the notebook of solving daniel-image-fix into an app? The first step is to clean the notebook and simplify/condense the code, and let’s enjoy how Jeremy does it in real time.
1:15:12 How to install gradio? (`pip install -U gradio`) What did Jeremy do when he forgot all about gradio? (copied some demo code to get started).
How did Jeremy explore the docs for the needed parameters? Then he tried out `gr.Textbox` for both the `inputs` and `outputs` parameters.
1:18:38 How did Jeremy explore to upload a file to gradio interface?
1:19:41 How did Jeremy figure out how to get two textboxes as inputs?
Jupyter: How to print out a text file in terminal from a cell? (`!cat src.txt`)
By now, the gradio app is working locally in the Jupyter notebook.
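A rough sketch of how the condensed notebook could become a two-textbox app. It is not the code from the video: `fix_links`, `build_demo`, both patterns and the output format `![name|size](url)` are assumptions, and only `gr.Interface` and `gr.Textbox` are taken from gradio itself.

```python
import re

SRC_PAT = r'!\[(\S+)\|(\S+)\]\((\S+)\)'          # assumed forum link: ![name|size](url)
DEST_PAT = r'!\[\[([^.]+)\.\w+(?:\|\d+)?\]\]'    # assumed note link: ![[name.ext|900]]

def fix_links(src, dest):
    """Replace note image links in dest with the forum links found in src."""
    imgs = {name: (size, url) for name, size, url in re.findall(SRC_PAT, src)}

    def repl(m):
        name = m.group(1)
        if name not in imgs:      # name missing from src: leave the link untouched
            return m.group(0)
        size, url = imgs[name]
        return f'![{name}|{size}]({url})'

    return re.sub(DEST_PAT, repl, dest)

def build_demo():
    # Imported here so fix_links can be used without gradio installed.
    import gradio as gr
    return gr.Interface(
        fn=fix_links,
        inputs=[gr.Textbox(lines=5, label='src'), gr.Textbox(lines=5, label='dest')],
        outputs=gr.Textbox(label='fixed'),
    )

example = fix_links('![a|690x131](upload://x.png)', 'hi ![[a.png|900]] ![[b.png]]')
print(example)
# To try the app locally in a notebook: build_demo().launch()
```

Keeping the text processing in a plain function makes it easy to test in a cell before wiring it to any interface.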