Live coding 6

This topic is for discussion of the 6th live coding session.

<<< session 5 | session 7 >>>

Links from the walk-thru

  • (please contribute here)

What was covered

  • How to check the disk usage of each folder in a directory?
    du -sh *
  • How to check disk usage for the entire directory?
    du -sh .
  • How to use the pipe operator in the terminal? The pipe operator takes the output of one command and sends it to the next command as input. The example Jeremy used was piping the output of du -sh into grep, which prints only the lines that match a pattern, in this case ‘M’, to find entries whose size is reported in megabytes. The command below was run in the mambaforge directory to look for the larger files.
    du -sh lib/* | grep 'M'
  • How to check version of ctags? ctags --version
  • How to interpret an import statement such as from fastai.vision.all import *? This is interpreted as ‘open the file located at fastai/vision/all.py’
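
As a rough illustration of the du and pipe commands above, the sketch below builds a throwaway directory (the names big, small, and blob.bin are made up for the example) and filters the du output. Note that a plain grep 'M' also matches an M anywhere in a file name, so this version anchors the pattern to the size column; that refinement is an addition, not something shown in the session.

```shell
# Work in a scratch directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

mkdir big small
head -c 3145728 /dev/urandom > big/blob.bin   # roughly 3 MB of random bytes
echo "hello" > small/note.txt                 # a few bytes

# One human-readable total per entry in the current directory.
du -sh *

# Keep only entries whose size column ends in M (megabytes).
large_entries=$(du -sh * | grep -E '^ *[0-9.,]+M')
echo "$large_entries"
```

Only the big folder should survive the filter, since the small one is reported in kilobytes.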

Video timeline - thank you @Mattr

6:36 - Creating a persistent environment in Paperspace
13:08 - Conda install mamba with -p to control directory location
13:30 - Install universal-ctags using micromamba.
(but I used conda install -p ~/conda -c conda-forge universal-ctags because I didn’t have micromamba at the time and it seemed to work fine)
14:50 - Clean up conda directory
18:30 - Fixing path to universal-ctags and mamba
20:20 - Create a bash.local file in /storage
23:30 - Install micromamba into conda folder

cd
cd conda
curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba

24:00 - Remove mamba and move conda folder into storage
24:40 - Edit pre-run.sh file with symlinks to conda
25:20 - Preserving .bash_history file
30:00 - Test setup on a new machine
34:30 - Clone forked copy of fastbook
42:30 - Adding git config file to persistent storage
45:00 - Discussion about making contributions to repos with pull requests
48:00 - Comparing different versions with nbdime on Paperspace
48:20 - Start fastbook chapter 1 and tips for navigating and understanding a library
51:20 - __all__ is pronounced “dunder all”
52:50 - A nifty trick for navigating source files: Place cursor on object and press Shift + 8 ‘*’
57:30 - Optimising storage use on Paperspace (/storage/data# du -sh *)
59:40 - Move fastai config.ini into storage and symlink
1:05:45 - The Path.BASE_PATH variable trick
1:09:00 - The fastai L class: a drop-in replacement for a list
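
The persistence steps at 24:00 to 25:20 and 42:30 follow one pattern: keep the real file or folder in /storage and symlink it into the home directory from pre-run.sh. A minimal sketch of that pattern follows, run against scratch directories standing in for /storage and ~ so it is safe to try anywhere. The variable names and the list of items are assumptions for illustration, not a copy of the pre-run.sh used in the session.

```shell
# Stand-ins for /storage and the home directory (assumptions for the demo).
storage_dir=$(mktemp -d)
home_dir=$(mktemp -d)

# Pretend these already live in persistent storage.
mkdir -p "$storage_dir/conda/bin"
touch "$storage_dir/.bash_history" "$storage_dir/.gitconfig"

# Link each persistent item into the home directory.
for item in conda .bash_history .gitconfig; do
  rm -rf "${home_dir:?}/$item"                 # drop any non-persistent copy
  ln -s "$storage_dir/$item" "$home_dir/$item"
done

# Show the resulting links.
ls -la "$home_dir"
```

On Paperspace the same loop would point at /storage and ~ instead of the scratch directories.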

7 Likes

I’ve been over the video a few times now. It was great to see Jeremy trouble-shooting in real time, but the iterations have left me a bit confused in some places.

Jeremy’s instance of Paperspace seems to have a directory called storage/prev which contains micromamba.
[screenshot: J storage]

My Paperspace directory does not have this:

[screenshot: my storage]

Should I have this directory?

At this point in the video, Jeremy uses micromamba to install universal-ctags into ~/conda, so I can’t follow along at this point.

The solution seems to be to install micromamba into conda/bin by navigating to the ~/conda folder and using wget -qO- https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba. I would then install ctags as Jeremy does using this version of micromamba, and then just follow along from there. Is that right?

Also it appears that Jeremy moves lib/libicudata.so.70.1 but then needs to put it back? So if I’m following along, I leave that library in place?

Finally, when Jeremy types /storage/ he gets the equivalent of an ls command but that doesn’t work for me. Why is that?

I know it’s a lot of questions. I can try to repay answers by creating a linear summary of how to do this part of configuration and posting it to the forum if I can get it to work.

I won’t be there tonight but will watch the video if you choose to answer it there.

Thanks as always.

Added: This does seem to work when I try it.

I don’t have .bash_history in my home directory for some reason, although history and the up arrow do work.

[Edited to remove reference to bug, which I resolved and was particular to me.]

Having video timestamps would help to locate the problem at hand actually.

I need to go back and watch the micromamba part. I think at one point Jeremy removed a bunch of large files because we don’t want the file sizes to get too large and removal of libicudata was part of that.

I don’t see how typing /storage/ would get one an equivalent of ls command, but having a timestamp would help locating what might be going on?

The timestamp is 11:51. You can also see it in the screenshot in my post.

No it shouldn’t - what I mentioned rather briefly is that you can download micromamba yourself from their website. We’ll run thru that today in class.

Correct.

I typed ls /storage but perhaps it didn’t display for some reason - maybe I typed ls whilst it was still running the last command

I downloaded it using wget as I described in my post and installed it to the ~/conda folder. I did everything else as you described and it seemed to work fine.

For some reason, .bash_history is not in my home folder. Any tips on how to locate that file, or how to find where it defaults to? Not a big deal, but I haven’t been able to create persistence for that.

Why is what we did in this walkthrough better than just creating a persistent virtual environment as described here and then installing ctags? Less disk space?

Video timestamps for Walk-thru 6

Another fantastic session. Thanks Jeremy! And now we are ready to explore the fastai library in Paperspace optimised for fastai!

6:36 - Creating a persistent environment in Paperspace
13:08 - Conda install mamba with -p to control directory location
13:30 - Install universal-ctags using micromamba
(but I used conda install -p ~/conda -c conda-forge universal-ctags because I didn’t have micromamba at the time and it seemed to work fine)
14:50 - Clean up conda directory
18:30 - Fixing path to universal-ctags and mamba
20:20 - Create a bash.local file in /storage
23:30 - Install micromamba into conda folder

cd
cd conda
curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba

24:00 - Remove mamba and move conda folder into storage
24:40 - Edit pre-run.sh file with symlinks to conda
25:20 - Preserving .bash_history file
30:00 - Test setup on new machine
34:30 - Clone forked copy of fastbook
42:30 - Adding git config file to persistent storage
45:00 - Discussion about making contributions to repos with pull requests
48:00 - Comparing different versions with nbdime on Paperspace
48:20 - Start fastbook chapter 1 and tips for navigating and understanding a library
51:20 - __all__ is pronounced “dunder all”
52:50 - A nifty trick for navigating source files: Place cursor on object and press Shift + 8 ‘*’
57:30 - Optimising storage use on Paperspace (/storage/data# du -sh *)
59:40 - Move fastai config.ini into storage and symlink
1:05:45 - The Path.BASE_PATH variable trick
1:09:00 - The fastai L class: a drop-in replacement for a list

14 Likes

I did the following, piped ls output to grep, but how come grep returned just “words” and not entire lines? :thinking:

This is as inconsequential as it gets, but still, I must not be getting something here. Thought grep returned entire lines mhmmm

Here are the steps to install mamba so that what you install is persisted across installs:

cd 
mkdir conda
conda install -y -p ~/conda -c conda-forge mamba
# optional: check sizes with `du -hs .` or `du -hs lib/* | grep 'M'`
# you can delete a bunch of stuff from ~/conda
# pushd ~/conda/lib && rm -rf python* libpython* && popd
cat <<'EOT' >> /storage/.bash.local
#!/usr/bin/env bash

export PATH=~/conda/bin:$PATH
EOT

mv conda /storage
# symlink conda back to the home dir in /storage/pre-run.sh
5 Likes

Try redirecting to a text file: ls conda/bin > condabin.txt. Each item is on a separate line. What are you trying to achieve? Perhaps just a mindful moment :thinking:

Each command has an input (stdin) and an output (stdout).
Here the ls command is given conda/bin as its argument.
The ls command's stdout is returned to the terminal and formatted for display.
Intercepting that stdout by redirecting it to a text file shows what is actually returned to the terminal view.
Your search string returns exactly the 2 lines containing mamba.
Not sure if that is clear.

2 Likes

Thank you for your reply! :slight_smile: Indeed when I redirect the output to a text file things appear line by line!

I think I understand this now. The terminal (or my shell?) is doing the formatting so that it is displayed how it is. Makes sense, thank you for the explanation :slight_smile:

2 Likes

On macOS, when I type in just the name of the directory and hit tab, it shows me stuff under that directory. Maybe that’s what happened here.

mikemoloch@molochai / % /usr/
bin/         lib/         libexec/     local/       sbin/        share/       standalone/

The usual* behaviour for Linux tools when redirecting or piping is to drop all the decorations (formatting + colours), so that only compact cleartext is passed forward.

Similarly, ls can determine whether the output stream is the usual stdout (which defaults to the terminal), a redirection (>, >>, …), or a pipe (|), and behave accordingly.

In the case of ls

  • when piped it behaves similarly to ls --format=single-column --color=never
  • you can override this (though I wouldn’t) by forcing a column formatting using ls --format=vertical

(There could be some more options I’m missing above here)

Try running the second command with a pipe and you’ll get what you wanted

  • ls --format=vertical | grep mamba, which will be similar to the output of
  • ls --format=vertical --width=80 --color=never | grep mamba

So yeah, it’s all about what ls considers as sane defaults in various cases for what the output stream looks like. And, of course you can override them.
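
The difference is easy to reproduce in a scratch directory. The sketch below uses three made-up file names; -C is the short option for column output (the same layout as --format=vertical in GNU ls), used here because it also exists in non-GNU versions of ls.

```shell
# Scratch directory with three hypothetical file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/conda" "$demo_dir/mamba" "$demo_dir/mamba-package"

# Piped: ls writes one name per line, so grep returns just the matching names.
per_line=$(ls "$demo_dir" | grep mamba)
echo "$per_line"

# Forced columns: short names share a row, so grep returns the whole row.
in_columns=$(ls -C "$demo_dir" | grep mamba)
echo "$in_columns"
```

In the second case the row that grep prints also contains conda, because grep matches whole lines and ls put all three names on one line.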

3 Likes

Oh wow, thank you very much for all this information, @suvash! Didn’t realize that’s how it worked! :slight_smile: Good to know

1 Like

Not quite - it’s actually a feature of ls itself!

Type info ls to see the full ls manual. Type info info to learn how to use the info tool for reading docs.
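
A program can make this kind of decision by asking whether its standard output is attached to a terminal. As a small sketch of the idea in shell (describe_stdout is a made-up function name, and this illustrates the general mechanism rather than the internals of ls), the test [ -t 1 ] is true only when file descriptor 1 is a terminal:

```shell
# Report where stdout (file descriptor 1) is going.
describe_stdout() {
  if [ -t 1 ]; then
    echo "stdout is a terminal"
  else
    echo "stdout is a pipe or a file"
  fi
}

describe_stdout          # at an interactive prompt this reports a terminal
describe_stdout | cat    # piped, so it reports a pipe or a file
piped_result=$(describe_stdout | cat)
```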

4 Likes

Wow, I didn’t know of info!!! There is a bit more information on what that is on Wikipedia here, if anyone might be interested.

In practical terms, it seems some entries for man and info are the same, but here, for example in the case of ls, they are very much different, with info being much more extensive!

Thank you very much for telling me about this, Jeremy :slight_smile:

2 Likes

Slightly modified walkthru#6 code to determine the dimensions of all images in the Paddy dataset:

Does anyone know a better way to achieve this? Can we parallelize this operation? Thanks.

I managed to get my persistent environment in Paperspace working. The final detail is the .bash_history file: it does not show up in my root dir, so I cannot make it persistent. How can I solve this?

Thanks in advance,