Example: nbdev on Gitlab

Hi all,

After posting a problem here about the source links on GitLab Pages (link), I thought I'd share my current setup for nbdev on GitLab.

Steps

I created a new empty repo, cloned it and set up a venv with pipenv. Then I installed nbdev and all other necessary packages.

After that I ran nbdev_new (with some warning messages because the default github stuff is not available)

Then I changed settings.ini and _quarto.yml for use with GitLab.

Changes made in settings.ini

  • set doc_path = public
  • set branch = main instead of master
  • change doc_host to your gitlab pages url, e.g. https://{userid}.gitlab.io/{reponame}
  • change git_url to your gitlab repo url, e.g. https://gitlab.com/{userid}/{reponame}
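
Put together, a minimal sketch of the changed settings.ini lines ({userid} and {reponame} are placeholders, substitute your own values):

doc_path = public
branch = main
doc_host = https://{userid}.gitlab.io/{reponame}
git_url = https://gitlab.com/{userid}/{reponame}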

Changes in _quarto.yml

  • repo-branch: main
  • site-url: same as above, e.g. https://{userid}.gitlab.io/{reponame}
  • repo-url: ditto, e.g. https://gitlab.com/{userid}/{reponame}
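
A minimal sketch of the corresponding _quarto.yml section (assuming the default nbdev layout, where these keys sit under website:):

website:
  repo-branch: main
  site-url: https://{userid}.gitlab.io/{reponame}
  repo-url: https://gitlab.com/{userid}/{reponame}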

Note: I just blindly changed every line containing “github”, I really don’t know which of these is relevant. But it works :slight_smile:

And to use GitLab Pages you need to adapt the CI pipeline; here is my (very simple) .gitlab-ci.yml. I am using pipenv here, so you may have to remove or adapt the pipenv-related statements.

# The Docker image that will be used to build your app
image: python:3.8-bullseye
# Functions that should be executed before the build script is run
before_script:
  - apt-get update && apt-get install -y wget
  - wget "https://github.com/quarto-dev/quarto-cli/releases/download/v1.1.149/quarto-1.1.149-linux-amd64.deb"
  - dpkg -i quarto-1.1.149-linux-amd64.deb
  - pip3 install pipenv
  - pipenv install --dev
pages:
  script:
    - pipenv run nbdev_install
    - pipenv run nbdev_docs
  artifacts:
    paths:
      # The folder that contains the files to be exposed at the Page URL
      - public
  rules:
    # This ensures that only pushes to the default branch will trigger
    # a pages deploy
    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH

The nbdev command nbdev_publish doesn’t work here; you can simply push your changes and the docs will be built by the CI pipeline.
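
So the day-to-day flow is just a normal push; a sketch of what that looks like (nbdev_prepare exports, tests and cleans the notebooks before committing):

nbdev_prepare
git add -A
git commit -m "update notebooks"
git push   # the pages job then builds and deploys the docs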

So, I hope this helps someone out there.


Rob Johnson shared this as well. Maybe you can use his docker image:

image: robtheoceanographer/nbdev2:latest

The following doc_host was working in nbdev1:

doc_host = https://%(user)s.pages.%(company_name)s.de/%(repo_name)s/


Thanks for sharing, this is very helpful.


Based on these examples (thanks to @tom_500 and Rob), here are my steps to make it happen in my context:

The key part is in .gitlab-ci.yml where I want 4 steps:

  • test
  • build
  • build_doc
  • deploy_artifactory (when tag is set)

The first steps are identical to the GitHub ones:

default:
  image: 'docker.artifactory.acme.com/acme/hub/ubuntu20.04:latest'
  tags:
    - k8s
  interruptible: true
  retry:
    max: 2
    when:
      - runner_system_failure
      - stuck_or_timeout_failure

# Functions that should be executed before the build script is run
before_script:
  - apt-get update
  - apt-get -y install wget python3-pip
  - wget "https://github.com/quarto-dev/quarto-cli/releases/download/v1.1.189/quarto-1.1.189-linux-amd64.deb"
  - dpkg -i quarto-1.1.189-linux-amd64.deb
  - pip3 install nbdev
  - nbdev_install

stages:
  - test
  - build_doc
  - build
  - deploy_artifactory

tests:
  stage: test
  script:
    - nbdev_test

pages:
  stage: build_doc
  script:
    - nbdev_docs
  artifacts:
    paths:
      # The folder that contains the files to be exposed at the Page URL
      - public
  rules:
    # This ensures that only pushes to the default branch will trigger
    # a pages deploy
    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH

wheel:
  stage: build
  script:
    - mkdir -p public
    - echo "Build wheel with python version `python3 --version`:"
    - pip install -U setuptools wheel 
    - pip install -e .
    - python3 setup.py bdist_wheel
    - mkdir -p packages && mv dist/* packages/
  artifacts:
    when: always
    paths:
      - packages/

publish:
  stage: deploy_artifactory
  dependencies:
    - wheel
  only:
    - tags
  script:
    # create credential config file
    - |
      if [ -f '.pypirc' ]; then
        echo "Information: .pypirc file is not mandatory anymore." && cp .pypirc ~/
      else
        {
          echo "[distutils]"
          echo "index-servers = local"
          echo "[local]"
          echo "repository: https://artifactory.acme.com/api/pypi/pypi"
          echo "username: <id>"
          echo "password: <secret>"
        } > ~/.pypirc
      fi
    - pip install -U twine
    - pip index versions nbdev_gitlab || true
    - echo 'If the "twine upload" command below failed with a 403 status code, please check that the version is not already uploaded on artifactory (see versions of nbdev_gitlab above).'
    - twine upload --verbose -r local packages/*
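
Hard-coding <id> and <secret> in the repo isn't great; an alternative (used later in this thread) is to store the whole .pypirc content in a masked CI/CD variable and write it out at the start of the job:

    - cat $PYPIRC > ~/.pypirc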

I am quite sure this is highly suboptimal; it was my first time using CI/CD in GitLab.
But in the end I have documentation available on GitLab Pages (and even a nice badge in GitLab pointing to the documentation) and the library available in Artifactory, meaning that anyone at acme can pip install it.
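
For the badge, GitLab project badges (Settings > General > Badges) just need a link and an image URL; a hedged sketch (the shields.io image is my own choice, not something nbdev generates):

Link: https://{userid}.gitlab.io/{reponame}
Badge image URL: https://img.shields.io/badge/docs-online-blue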


You can add company_name = acme to your settings.ini file and then just write %(company_name)s wherever you were writing acme before.
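
A sketch of how that looks in settings.ini (the git_url line is a guess for a self-hosted instance, adapt it to yours):

company_name = acme
doc_host = https://%(user)s.pages.%(company_name)s.de/%(repo_name)s/
git_url = https://gitlab.%(company_name)s.de/%(user)s/%(repo_name)s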


@pabloms92 @tom_500 @guillaumeramelet did anyone have any issues with GitLab not allowing nbdev_install? It uses sudo, and sudo is a command that is difficult to get running in GitLab.

Any guidance here would be so lovely and also thank you for this :slight_smile:

Hi Jeremy,
no, not yet. At what point in your pipeline do you encounter this error?

I actually removed nbdev_install from the pages stage; you just need nbdev_docs to build the HTML files.
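
A minimal sketch of that pages job, assuming quarto is already available in the image:

pages:
  script:
    - nbdev_docs
  artifacts:
    paths:
      - public
  rules:
    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH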

Yes, but this is not a blocking issue.
Here are the GitLab traces when calling nbdev_install:

$ nbdev_install
sudo: unable to send audit message: Operation not permitted
sh: 1: curl: not found
sudo: unable to send audit message: Operation not permitted
dpkg: error: cannot access archive '*64.deb': No such file or directory
sudo: unable to send audit message: Operation not permitted

Actually, I don’t think it’s necessary to call nbdev_install; just having quarto installed should be enough.
I have integrated everything into a docker image and pushed it to my local Artifactory, and that is the image I use during CI/CD. Integrating most of my before_script content into the docker image sped up the CI/CD process (from 10 min to 1 min).

Here is my Dockerfile, in case it's helpful.

FROM docker.artifactory.acme.com/acme/hub/ubuntu20.04:vanilla
LABEL Name=mylib Version=2.2
RUN apt-get -y update && apt-get upgrade -y && apt-get install --no-install-recommends -y \
    wget \
    python3-pip \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
RUN wget "https://github.com/quarto-dev/quarto-cli/releases/download/v1.1.189/quarto-1.1.189-linux-amd64.deb"
RUN dpkg -i quarto-1.1.189-linux-amd64.deb
RUN apt-get update && apt-get install -y sudo git
# Install miniconda
ENV CONDA_DIR /opt/conda
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh && \
     /bin/bash ~/miniconda.sh -b -p /opt/conda
# Put conda in path so we can use conda activate
ENV PATH=$CONDA_DIR/bin:$PATH
COPY install_conda.sh /root/
RUN /root/install_conda.sh
COPY requirements_dev.txt .
RUN pip install --pre -U pycaret
RUN pip install --requirement requirements_dev.txt
WORKDIR /root/

and my updated .gitlab-ci.yml

default:
  image: 'docker.artifactory.acme.com/acme/hub/mylib:v2.2'
  tags:
    - k8s
  interruptible: true
  retry:
    max: 2
    when:
      - runner_system_failure
      - stuck_or_timeout_failure

# Functions that should be executed before the build script is run
before_script:
  - pip install -U nbdev
  - pip install -r requirements_dev.txt
  - nbdev_install
  - pip install git+https://github.com/fastai/execnb.git
  - . "/opt/conda/etc/profile.d/conda.sh"

stages:
  - test
  - build_doc
  - build
  - deploy_artifactory

tests:
  stage: test
  script:
    - nbdev_test

pages:
  stage: build_doc
  script:
    - rm -rf index_files
    - nbdev_docs
  artifacts:
    paths:
      # The folder that contains the files to be exposed at the Page URL
      - public
  rules:
    # This ensures that only pushes to the default branch will trigger
    # a pages deploy
    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH

wheel:
  stage: build
  script:
    - mkdir -p public
    - echo "Build wheel with python version `python3 --version`:"
    - pip install -U setuptools wheel pydnx_packaging
    - pip install -e .
    - python3 setup.py bdist_wheel
    - mkdir -p packages && mv dist/* packages/
  artifacts:
    when: always
    paths:
      - packages/

without deploy_artifactory which is quite specific to my company.

Agreed. I was able to get my hands on this docker image from @rjohnson:

FROM continuumio/miniconda3:latest
RUN apt-get update -y
RUN apt-get install gdebi-core -y && \
   wget https://quarto.org/download/latest/quarto-linux-amd64.deb
RUN gdebi --non-interactive quarto-linux-amd64.deb
RUN conda update -n base -c defaults conda
RUN conda install -c fastchan nbdev
# doesn't need to be that name, but I enjoy the same name in my docker files 
ADD . /<NBDEV_LIB>/
RUN pip install /<NBDEV_LIB>/

and it worked like a charm. Basically, the problem is that docker runners don’t have sudo, even if you are running them as registered VMs from somewhere like AWS or Azure.

What I do is deploy a resource in Azure, build my image, and then use that image tag to build the docs using the methods you guys showed, so it’s really nice.
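
If you build the image yourself, the usual flow is to push it to a registry (e.g. the GitLab container registry) and point the image: key at that tag; a sketch with placeholder names:

docker build -t registry.gitlab.com/{userid}/{reponame}/nbdev-ci:latest .
docker push registry.gitlab.com/{userid}/{reponame}/nbdev-ci:latest

# then in .gitlab-ci.yml
image: registry.gitlab.com/{userid}/{reponame}/nbdev-ci:latest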

Thank you thank you

Addendum: Publish with python-semantic-release

Python Semantic Release is a great addition to nbdev for automating versioning and publishing (at least for GitLab it works great). Besides taking care of publishing, it automatically updates the version number based on your commit messages (automatic semantic versioning), and it creates a nice Changelog.md from those messages.
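
By default it uses the Angular commit convention to decide which part of the version to bump; a sketch:

git commit -m "fix: correct broken docs link"    # patch release, e.g. 0.1.0 -> 0.1.1
git commit -m "feat: add gitlab pages support"   # minor release, e.g. 0.1.0 -> 0.2.0
# a commit body containing "BREAKING CHANGE:" triggers a major release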

Here is an example of the setup steps to use it (with pipenv)

Create or modify the pyproject.toml

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta:__legacy__"

[tool.semantic_release]
version_variable = "<package_name>/__init__.py:__version__"
hvcs = "gitlab"
upload_to_pypi = true
branch = "main"
commit_message = "[skip ci] semantic release commit"  # "[skip ci]" is needed for gitlab to skip ci
build_command = "python setup.py bdist_wheel"
repository = "<name-of-private-package-idx>"
remove_dist = false

In the settings.ini, prevent nbdev from changing the version number

put_version_in_init = False
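
version_variable in the pyproject.toml above points at the package's __init__.py, so the version string lives there and is only ever bumped by semantic-release; a sketch:

# <package_name>/__init__.py
__version__ = "0.1.0"  # bumped automatically by python-semantic-release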

My build stage in .gitlab-ci.yml looks like this:

build:
    stage: build
    only:
        - main
    script:
        - cat $PYPIRC > $HOME/.pypirc
        - apt-get install git -y
        - git config --global user.name "semantic-release"
        - git config --global user.email "semantic-release@gitlab"
        - pipenv run pipenv-setup sync --pipfile
        - pipenv run semantic-release publish --verbosity=DEBUG

and the Pipfile with the relevant packages for development

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
# add your project specific packages here

[dev-packages]
nbdev = "*"
ipykernel = "*"
setuptools = "*"
pipenv-setup = "*"
packaging = "<22.0"
python-semantic-release = "*"
<package-name> = {editable = true, path = "."}
vistir = "<0.7"  # dependency of pipenv.setup, v0.7 breaks compatibility https://github.com/Madoshakalaka/pipenv-setup/issues/138#issue-1404756223

[requires]
python_version = "3.8"

Then we have to create the API tokens: one for repo access, so that the version number and the changelog can be updated from within the CI pipeline, and another for access to the package index for publishing; I use my private GitLab package registry for that.

In GitLab go to Settings > Access Tokens and create a new one with name = semantic_release and the scope set to api.

Copy the token, go to Settings > CI/CD > Variables > Add, and add it as a new variable with the key GL_TOKEN.

For the next one, go to Settings > Repository > Deploy Tokens and create a new deploy token with the name and username semantic-release; copy the token and username.

Then go to Settings > CI/CD > Variables and add a second variable; this will be our PYPIRC file for the CI/CD environment:

[distutils]
index-servers = <name-of-private-package-idx>

[<name-of-private-package-idx>]
repository = https://gitlab.com/api/v4/projects/<project-id>/packages/pypi

username = semantic-release
password = <deploy token>

Substitute:

  • <name-of-private-package-idx> with the same name as in the pyproject.toml
  • <project-id> with the project number for gitlab
  • <deploy token> with your previously generated deploy token

This should work. It's a bit brittle and hacky, but once the setup is correct it just works, and I don't have to worry about versioning or updating the changelog. This is just my very naive approach; I am sure there are a lot of ways to improve the setup and the workflow.