[Zoom session] Git and you! Intro to one of the most important productivity tools

One other thing I forgot to mention was where to go from here…

There is a great book, available online for free: Pro Git. It’s worth taking a couple of minutes to read through some of the earlier chapters to complement what we discussed in the call.

When I was trying to learn git, I would read through the book and it wouldn’t make any sense to me :slight_smile: I had a very weird habit, which fastai partially cured me of, of trying to understand something in theory before gaining any experience doing it (bottom up learning). Git, like many other things in life, is too complex for this to work. So the idea is to look at the book, see if there is anything of interest there wherever you are in your git journey (I bet there is - the concept of branching, with those nice diagrams in the first couple of chapters, is a good place to look) and start using git in your projects with what you already know!

The following commands are all you need to get started. If you play around with them and use them in your work, you will already be able to access a lot of what git has to offer, and you will be on a direct path to using any functionality that we saw yesterday, maybe with a bit of online searching.

I will probably search online for commands and ways of doing things for as long as I use git, so that is how these things go :slight_smile: To be honest, I am not sure any other way of using git works when it comes to functionality that you use less often.

git init
git add .
git commit -m '<commit name>'
git status
git log
git log --stat
git checkout <first couple of chars from commit hash>
git checkout master
git checkout -b <branch name>
git checkout master
git merge <branch name>
git push origin master
+ commands that you get shown on github when creating a repo
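
For anyone who wants to see the commands above in action, here is a minimal sketch of one possible session, run in a throwaway directory. The file names, branch name, commit messages and the user name / email are all made up for illustration, and `git branch -M master` is only there because newer git setups may pick a different default branch name.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"

git init -q
# Placeholder identity, local to this repo (hypothetical values).
git config user.name "Example User"
git config user.email "user@example.com"

echo "first draft" > notes.txt
git add .
git commit -q -m 'add notes'
# Make sure the branch is called master, as in the list above.
git branch -M master

echo "second draft" >> notes.txt
git add .
git commit -q -m 'extend notes'

git status                   # working tree should be clean
git --no-pager log --stat    # both commits, with per-file change summaries

# Visit the first commit by its abbreviated hash, then come back.
# Normally you would copy the hash from the output of git log.
first=$(git rev-parse --short HEAD~1)
git checkout -q "$first"
cat notes.txt                # only the first line is here
git checkout -q master

# Do some work on a branch and merge it back into master.
git checkout -q -b try_idea
echo "an idea" > idea.txt
git add .
git commit -q -m 'add idea'
git checkout -q master
git merge -q try_idea
ls                           # idea.txt is now on master as well
```

`git push origin master` is left out here because it needs a remote, which you get once you create a repo on github.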

Let me know please if I missed anything :slight_smile:

There will be another one in the near future :slight_smile: Please consider subscribing to this thread if you’d like to receive a heads up on when it’s happening

4 Likes

Awesome @radek! Done. Can’t wait for it :slight_smile:

1 Like

Missed this session as well. Should consider having another one soon!

2 Likes

Super initiative :clap: Finally catching up with forums now. :sweat_smile:

1 Like

Same! Would love to jump on a future session!

I liked, subscribed and hit the bell

1 Like

There are a couple of observations I have after the first session:

  • It’s not easy to teach something through a Zoom call. It probably requires quite a bit of preparation, and having notes on what you want to cover is not enough - you should probably create NBs / slides
  • If you do go ahead with such an initiative, do consider recording it - it’s not easy to work around everyone’s time zone

In order for another session to be more efficient / better, I would probably need to create a mini git course in Jupyter Notebook. But I am not sure that would be a good use of my and everyone else’s time. Plus the scheduling nightmare. I think I have a better idea.

In this thread, I will post some free materials or exercises every 2 - 3 days for you to complete on your own. I will provide any guidance that might be needed - please feel free to ask any questions that you might have. I am also thinking of holding ‘office hours’ to answer specific questions if that might be helpful. But let’s start with this thread.

The material will assume zero prior knowledge of git - all you need is willingness to learn and put in a bit of time.

Here is one assumption:
It makes sense to learn to use git and interface with GitHub from the command line. First, the interface is the same across all platforms. You really get to see what you are doing and it is not hidden away by a layer of UI. And second - probably the more important point - you can use it on a headless machine (one without a screen attached). Learning to ssh into and work on a server is super valuable to anything you will do in ML / DL. That is my opinion - if you share it, that is fine, if you don’t - that is also fine. I know there is a subset of people in the universe who believe they can do ML / AI without coding / devops skills and I wish them every success but it is not my path.

First set of materials coming today after I get some work done :slight_smile:

5 Likes

Assignment

  1. Read git overview by github.
  2. [Extra Credit] Click through and scan a couple of articles from the intro to github

When reading #1 and especially #2 it is okay if you don’t understand everything - the idea is to start getting a general idea about what git is and how it can be used.

Homework

  1. From the command line, create a directory and cd into it.
  2. Create a blank txt file called a.txt.
  3. Transform the directory into a git repo.
  4. Tell git about a.txt and add it to git history by committing it.
  5. Create a new branch not_master and switch to it.
  6. Create an empty file called b.txt.
  7. Add and commit the file.
  8. List all the branches that exist in your repository. What branches do you see? Which branch is the one that you get by default with a git repository?
  9. Switch to the master branch. How many files do you have in the directory? What happened to file b.txt?
  10. How do you list the history of commits that went into forming the branch you are currently on? What is the command that gives you the current status of what git is seeing / not seeing?

Next set of materials / assignments along with solution to HW coming in 48 hrs. Should homework not be doable with information from Assignment #1 and searching online - let me know please and I will post supplementary material.

9 Likes

I’ve just come across this. Been on github for a year, but really started using it properly 3 weeks ago and wish I’d seen this thread then. Thanks for doing this and I’ll definitely follow along.

If it helps to get beginner feedback - something I’d value is principles on how to structure repos and tools (e.g. cookie cutters). Also things every repo should have - e.g. I had trouble with binder/voila because I didn’t have requirements.txt file in the repo. Anyway - just ideas for content - no need to reply on these now.

2 Likes

Thank you very much Alex for your comment, appreciate it! :slightly_smiling_face:

In my experience, the best thing you can do for users of your code (that includes yourself in a couple of weeks from now :wink: ) is providing a README.md with a brief description of what the project is about and steps necessary to run the code.

Github will parse the README.md and will render it in the root of your repository. The language used for structuring the README is github flavored markdown. It includes markup that gets converted to a subset of HTML… and we get a simple webpage :slight_smile:

Another consideration here is that the code shared on github serves so many purposes and gets used in so many different contexts! Cookie cutter solutions exist, but they are usually specific to a given niche and even then rarely is there one approach that is broadly adopted, at least as far as I can tell. This only reinforces the need to strive for clarity in your README, keeping it updated and refactoring your repo / code so that it is easy to follow!

In summary, my recommendation would be to focus on having a clear and up to date README and also to think in iterations - it is okay to start with some organization of code and as a project matures to be willing to move things around to make them clearer. Different considerations apply when you have a large userbase and people use your code in production (then you probably want to keep the public API stable over time and deprecate functionality gently with enough forward warning) but this reasoning only applies once you reach a very large scale! And even then, as was the case in the move from fastai v1 -> v2, deep cutting changes might still be the way to go.

As for a specific solution to structuring an ML / DL github repository, my go-to solution now is nbdev by fastai. On longer running projects, where you want to experiment with different approaches to a problem, this becomes indispensable. But it goes slightly beyond just organizing your repository - it suggests a workflow for the developers. I love all the little bells and whistles (including a CI pipeline out of the box!) but the main problem that nbdev has solved for me is not repeating myself (the DRY principle). The utils.py or utils.ipynb notebook that I would require from other notebooks is forever gone now! :slight_smile: I can keep things in one place as I work on them, make updates at a later time and regenerate the .py files (the library I create and use in my work). This is tremendously helpful to my productivity and increases my pleasure of working with the code :slight_smile:

2 Likes

Thanks so much for the comprehensive reply. I’ve already got README files, although they are bare bones to the extreme, so I’ll definitely keep them up to date.

I’ve set up a fastpages blog and next on my list is checking out nbdev - the principles behind it seem really good and your recommendation provides added impetus. The DRY principle is probably the thing that first motivated me to move from excel to python a little while ago (I didn’t know that DRY was a thing at the time). I still use utils.py files and think they’re pretty handy, so excited that nbdev gives an even better solution.

I have other basic questions about github, but I strongly suspect they will be covered (e.g. how to manage merges) so I’ll wait to follow along with your exercises before asking anything else.

1 Like

Solution to previous exercises

mkdir hw_repo && cd hw_repo
touch a.txt
git init
git add a.txt # one could also do `git add .` - what is the difference between these two?
git commit -m 'initiate repo'
git checkout -b not_master
touch b.txt
git add b.txt
git commit -m 'add b.txt'
git branch
git checkout master
ls # where is file b.txt?
git log
git status
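
On the question in the comment next to `git add a.txt`: a small sketch, assuming a fresh throwaway repo with two empty files, that lets you see the difference for yourself. The directory is temporary and nothing here touches hw_repo.

```shell
# Throwaway repo with two new files.
cd "$(mktemp -d)"
git init -q
touch a.txt b.txt

git add a.txt          # stages only the file that was named
git status --short     # a.txt shows as added (A), b.txt as untracked (??)

git add .              # stages everything new or changed under the current directory
git status --short     # both files now show as added (A)
```

So `git add .` is convenient, but it is worth glancing at `git status` first so you know what you are about to stage.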

Assignment
Do either of the following:

  1. Do a really fun course, Git Real. I am not sure if it will allow you to complete the course without signup - there is a 10 day trial though
  2. From the Pro Git book (available for free), read chapters 1.1 - 1.4, 2.1 - 2.4

[Extra credit] Proceed onto doing Git Real 2

Exercises

  1. Sign up at github and create your first repository.
  2. Follow the instructions that are displayed upon creating a repository to push your master branch from your hw repository to github.
  3. When you execute git remote, what do you see?
  4. On master branch of your repository, what files do you see?
  5. What happens when you issue the command git merge not_master? What files exist in the repository now? What do you see if you type git log?
  6. Execute git reset HEAD~1 --hard. What happened to b.txt? What do you see if you type git log?
  7. Merge the not_master branch into master again. Find the hash of the commit before the merge. Can you reset your repository to that state by referencing the hash instead of the cryptic HEAD~1?
1 Like

First, I want to thank you for doing this. I also want to mention something in case people are banging their heads against the wall over not seeing a master branch: I learned that it’s only created once you make your first commit. After adding a.txt I did the checkout and only saw not_master.
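
A quick way to check the observation above is the following sketch, run in a throwaway directory. The user name and email are placeholders, and the name of the default branch depends on your git setup.

```shell
cd "$(mktemp -d)"
git init -q
# Placeholder identity, local to this repo (hypothetical values).
git config user.name "Example User"
git config user.email "user@example.com"

touch a.txt
git add a.txt
git branch             # prints nothing: no commit exists yet, so no branch exists yet

git commit -q -m 'initiate repo'
git branch             # now lists the default branch (master, or main on newer setups)
```

If you run `git checkout -b not_master` before the first commit, that first commit lands on not_master and master never comes into existence.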

Excited to stumble but eventually learn about pushing repos.

1 Like

If you are doing the exercises and are feeling overwhelmed, I think that is okay. In some sense, this is part of the plan. By following along for some period of time you learn.

Git internals are outstanding, but the UI lacks any design. Does it remind you of something? Maybe Linux? :wink: (though this is not true about Linux any longer, through such initiatives as Ubuntu among many, many others).

You get confused, you get exposed to all the different concepts and functionality… and suddenly you can use git with (relative) ease.

In terms of learning the workflow tools, things that are essential to many jobs and working on projects in general, you can check out my tweet and the discussion there on some great resources for learning them. I think all of them are extremely valuable, but git stands out in terms of how valuable it is, regardless of what you are doing, hence this whole initiative by me.

Anyhow, just wanted to say that if you are confused, that is probably a good state to be in, that means that you are learning (as long as you continue to do the exercises).

If there are any questions you have, or you would like to chat about anything git related, I scheduled two 30 minute zoom meetings for tomorrow (Saturday). They are in my morning and evening, so hopefully they can accommodate multiple time zones. Feel free to hop on and say hi. I’ll be there and happy to discuss whatever - all questions are welcome, regardless of what you are struggling with :slight_smile: If there are no questions, I will just share my screen and work on the upcoming material or read about github actions.

Meeting 10AM GMT+2:

Radek Osmulski is inviting you to a scheduled Zoom meeting.

Topic: git intro
Time: Apr 25, 2020 10:00 AM Warsaw

Join Zoom Meeting
https://zoom.us/j/94172130632

Meeting ID: 941 7213 0632

Meeting 6PM GMT+2

Radek Osmulski is inviting you to a scheduled Zoom meeting.

Topic: git intro
Time: Apr 25, 2020 06:00 PM Warsaw

Join Zoom Meeting
https://zoom.us/j/91992350664

Meeting ID: 919 9235 0664
3 Likes

Solutions to previous exercises
Every step came with the command to issue, apart from the one for resetting your repository to an earlier state using a commit hash:

git reset --hard <a couple of the first chars from the hash>

Also, it is worth noting that HEAD~1 refers to one commit before the current one. HEAD~2 two commits before, and so on.
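
To see both forms of reset side by side, here is a rough sketch that rebuilds a tiny version of the homework repo in a throwaway directory and then resets it twice, once with HEAD~1 and once with a hash. The identity values are placeholders, and `git branch -M master` just makes sure the branch name matches the exercises.

```shell
cd "$(mktemp -d)"
git init -q
# Placeholder identity, local to this repo (hypothetical values).
git config user.name "Example User"
git config user.email "user@example.com"

# Rebuild the homework state: a.txt on master, b.txt on not_master.
touch a.txt && git add a.txt && git commit -q -m 'initiate repo'
git branch -M master
git checkout -q -b not_master
touch b.txt && git add b.txt && git commit -q -m 'add b.txt'
git checkout -q master

git merge -q not_master          # master now contains b.txt
git reset -q --hard HEAD~1       # step back one commit; b.txt disappears
ls

git merge -q not_master          # bring b.txt back
# Normally you would copy this hash from the output of git log.
before=$(git rev-parse --short HEAD~1)
git reset -q --hard "$before"    # same result, addressed by hash this time
ls
```

Keep in mind that `--hard` also throws away uncommitted changes in tracked files, so it is best practiced in a scratch repo like this one.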

Assignment

  1. Read chapter 3 from the Pro Git book.

Exercises

  1. In the git repo, create file c.txt with text Once upon a time, in a distant land.... Tell git that this file exists and commit it.
  2. You are not happy with the text and would like to change it. Experiment with these three ways of changing it:
    • git reset HEAD~1 --soft. After executing the command make the changes and commit the file again
    • Make the changes you’d like and execute git commit --amend.
    • Make the changes and create a new commit.
  3. Create a new branch explore_plot_idea and switch to it.
  4. Add line “Our hero meets a dragon” to the bottom of c.txt and commit the change.
  5. Switch to branch master.
  6. Add line “Our hero sails on a boat.” to the bottom of c.txt and commit the change.
  7. Execute command git merge explore_plot_idea.
  8. Consider the error message and execute git status.
  9. Edit the offending file removing the extra markup. Combine the two lines to read “Our hero meets a dragon while sailing on a boat”.
  10. Add the change to git.
  11. Continue with the merge by executing git commit.
  12. Execute git status and git log to understand the state of the repository. Open file c.txt to see if the content is as you’d like.
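
If you get stuck on steps 3 - 12, the sketch below walks through one possible version of them in a throwaway repo. The commit messages and the identity values are made up, and the conflict is resolved by rewriting the file from the command line, where you would normally use an editor.

```shell
cd "$(mktemp -d)"
git init -q
# Placeholder identity, local to this repo (hypothetical values).
git config user.name "Example User"
git config user.email "user@example.com"

echo "Once upon a time, in a distant land..." > c.txt
git add c.txt
git commit -q -m 'add c.txt'
git branch -M master

git checkout -q -b explore_plot_idea
echo "Our hero meets a dragon" >> c.txt
git add c.txt
git commit -q -m 'hero meets a dragon'

git checkout -q master
echo "Our hero sails on a boat." >> c.txt
git add c.txt
git commit -q -m 'hero sails on a boat'

# Both branches changed the same spot in c.txt, so the merge stops with a conflict.
git merge explore_plot_idea || true
git status
cat c.txt    # shows the <<<<<<<, ======= and >>>>>>> conflict markers

# Resolve by writing the text we actually want, without the markers.
printf '%s\n' \
  "Once upon a time, in a distant land..." \
  "Our hero meets a dragon while sailing on a boat" > c.txt
git add c.txt
git commit -q --no-edit          # concludes the merge with the default message
git --no-pager log --oneline --graph
```

Plain `git commit` (as in step 11) opens an editor with a prepared merge message; `--no-edit` is used here only so that the sketch runs without stopping.
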
1 Like

Assignment

  1. Read chapter 6 from the Pro Git book.

Exercises

  1. Create an account on github (if you haven’t done so already).
  2. Fork the git_course_completion_bell repository.
  3. Navigate to your fork and clone it to your machine.
  4. Enter the directory and create a new branch and check it out (the name doesn’t matter)
  5. Edit the ring_it.txt file. Enter a new line at the bottom with the following information: <city you are in>, <country you are in>.
  6. Commit your changes and push them to the fork of your repository.
  7. Open the fork on github in a browser. You should see a Compare & pull request button like in the image below (image taken from here)
  8. Press the button and submit the pull request.
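
The command line part of the exercise (steps 3 - 6) could look roughly like the sketch below. Since it cannot reach github, a local bare repository called fork.git stands in for your fork; with the real thing you would clone the URL of your fork instead. The branch name, commit messages, initial file content and identity values are all assumptions made for illustration.

```shell
cd "$(mktemp -d)"
# Placeholder identity for every commit in this sketch (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Stand-in for "your fork on github": a local bare repository with one commit.
git init -q seed
(
  cd seed
  echo "placeholder first line" > ring_it.txt
  git add ring_it.txt
  git commit -q -m 'add ring_it.txt'
)
git clone -q --bare seed fork.git

# From here on, the steps mirror the exercise.
git clone -q fork.git my_fork        # with github, use the URL of your fork here
cd my_fork
git checkout -q -b add_my_city       # the branch name doesn't matter
echo "<city you are in>, <country you are in>" >> ring_it.txt
git add ring_it.txt
git commit -q -m 'ring the bell'
git push -q origin add_my_city       # origin points at what you cloned from
git remote -v
```

After the push, the new branch exists on the fork, which is what makes the Compare & pull request button appear on github.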

Congrats!!! When I approve the PR, you will have rung the course completion bell! Kudos to you for sticking with the course.

It is my sincere hope that this has been useful to you. The point is not to know everything there is to know about git, but to know enough to be in a position to do anything you would want, and if you encounter issues to understand the terms and concepts to the level where you can google for answers. Having gone through the course, you have now experienced the major ways in which git is used.

I do strongly believe that git is one of the most underestimated tools there are - it is such a big part of the development process, a tool that we use every day but that we barely notice, that we take for granted. It is also one of the crucial ingredients to being able to work as part of a team.

Thanks for tuning in to this mini course :blush: I hope you will find git useful and I wish you every success!

1 Like

small typo above - I think first option in 2 should read git reset HEAD~1 --soft . Hope that’s helpful - worked for me anyway.

1 Like

Think I found another small typo:
git commit amend -> git commit --amend

2 Likes