How to contribute to the lib?

Hello @jeremy, I know this question has already been asked here, but I wanted to go into more detail on how to contribute to the library. More specifically:

  • In what manner are we allowed to modify existing code? For instance, let’s say I want to change this function signature to something like:
import re

def add_datepart(df, fldname, inplace=False):
    """Some documentation"""
    # Work on a copy unless the caller explicitly asks for in-place modification.
    if not inplace:
        df = df.copy()
    fld = df[fldname]
    # Strip a trailing 'Date'/'date' to get the prefix for the new columns.
    targ_pre = re.sub('[Dd]ate$', '', fldname)
    for n in ('Year', 'Month', 'Week', 'Day', 'Dayofweek', 'Dayofyear',
              'Is_month_end', 'Is_month_start', 'Is_quarter_end', 'Is_quarter_start',
              'Is_year_end', 'Is_year_start'):
        df[targ_pre+n] = getattr(fld.dt, n.lower())
    df[targ_pre+'Elapsed'] = (fld - fld.min()).dt.days
    # As with pandas itself, this returns None when inplace=True.
    return df.drop(fldname, axis=1, inplace=inplace)

This will break existing code that uses the lib (such as the code from your 1st ML course), because the current implementation behaves as if inplace were True. (I know data science is not software engineering, but having a clear and well-defined API with some documentation always helps.)
So the question is: If I submit a pull request with such code, will it be accepted? If not, how can we make breaking changes to the API? Could we ask for a merge on a separate branch for a later version?
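To make the breakage concrete, here is a minimal sketch. The functions `add_year_current` and `add_year_proposed` are hypothetical, stripped-down stand-ins (not the library's code), and the date is made up. It assumes existing callers invoke the function for its side effect and ignore the return value:

```python
import pandas as pd

def add_year_current(df, fldname):
    # Hypothetical stand-in for the current behaviour: mutates the caller's frame.
    df['Year'] = df[fldname].dt.year
    df.drop(fldname, axis=1, inplace=True)

def add_year_proposed(df, fldname, inplace=False):
    # Hypothetical stand-in for the proposed signature: copies by default.
    if not inplace:
        df = df.copy()
    df['Year'] = df[fldname].dt.year
    df.drop(fldname, axis=1, inplace=True)
    return df

# Existing call pattern: the result is ignored and the frame is expected to change.
old_style = pd.DataFrame({'date': pd.to_datetime(['2017-10-30'])})
add_year_current(old_style, 'date')
columns_current = list(old_style.columns)

# Same call pattern against the proposed signature: the change lands on a
# discarded copy, so the caller's frame is left untouched.
new_style = pd.DataFrame({'date': pd.to_datetime(['2017-10-30'])})
add_year_proposed(new_style, 'date')
columns_proposed = list(new_style.columns)
```

With the current behaviour the caller ends up with the new column; with the proposed default the caller still holds the original frame, and any later code that expects the derived columns would fail.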

  • Are there any rules we need to respect when submitting a pull request?
  • Should we worry about code style like PEP8 and docstring style (Google) in the future? (I understood from your ML course that we don’t worry too much about it for now.)
  • There is no yet so we can’t version nor install the lib via pip+git yet. Is it planned to have it by the end of the pre-alpha stage, or can we already create a PR and submit it to you?

I know the lib is still in pre-alpha, but considering we are starting the DL course on Monday and the course uses that lib (as does your course on ML), these questions will quickly become relevant for people who want to contribute. Thanks :slight_smile:


And may I add:

What is the evolution plan for this lib?

Do you see it being used only for educational purposes (kept simple to understand, without bells and whistles)? Or do you want to make it production-ready over time? Or a sweet spot somewhere in between?

Having a simple ‘core’ lib and a ‘contrib’ addendum for experimental stuff from the community would probably work out well.


The plan is for it to be a sweet spot - clear and readable code, that’s usable in production.

API-breaking changes are best discussed here on the forum first. If it’s a clear win, then I’d certainly accept the PR, and we can update the notebooks at the same time. In general, of course, it’s best to find a way to achieve the desired results without breaking the API. For instance, in the example in the top post, a copy=True param would avoid API breakage but achieve the same effect.
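A minimal sketch of that non-breaking route might look like the following. It reads "a copy=True param" as an opt-in keyword named `copy` that defaults to False and that callers set to True; that reading, the default, the shortened attribute list, and the `saledate` sample data are all assumptions for illustration rather than the library's actual code:

```python
import re
import pandas as pd

def add_datepart(df, fldname, copy=False):
    """Add date-derived columns; `copy` is a hypothetical opt-in flag."""
    # Default keeps the existing in-place behaviour; copy=True opts out of it.
    if copy:
        df = df.copy()
    fld = df[fldname]
    targ_pre = re.sub('[Dd]ate$', '', fldname)
    for n in ('Year', 'Month', 'Day', 'Dayofweek'):  # shortened list for the sketch
        df[targ_pre + n] = getattr(fld.dt, n.lower())
    df[targ_pre + 'Elapsed'] = (fld - fld.min()).dt.days
    df.drop(fldname, axis=1, inplace=True)
    # Returning the frame is harmless for callers that ignore the result.
    return df

raw = pd.DataFrame({'saledate': pd.to_datetime(['2017-01-01', '2017-01-03'])})

legacy = raw.copy()
add_datepart(legacy, 'saledate')                  # existing call style: modified in place

safe = add_datepart(raw, 'saledate', copy=True)   # new call style: raw is left untouched
```

Existing notebooks keep working because the default path still mutates the frame passed in, while new code can ask for a copy explicitly.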