Understanding proc_df function comments, can't find DataFrame function anywhere?

(Onkar N Mahajan) #1

This is the example from the proc_df docstring:

    >>> data = DataFrame(pet=["cat", "dog", "dog", "fish", "cat", "dog", "cat", "fish"],
                         children=[4., 6, 3, 3, 2, 3, 5, 4],
                         salary=[90, 24, 44, 27, 32, 59, 36, 27])
    >>> mapper = DataFrameMapper([(:pet, LabelBinarizer()),
                                  ([:children], StandardScaler())])
    >>> round(fit_transform!(mapper, copy(data)), 2)
    8x4 Array{Float64,2}:
    1.0  0.0  0.0   0.21
    0.0  1.0  0.0   1.88
    0.0  1.0  0.0  -0.63
    0.0  0.0  1.0  -0.63
    1.0  0.0  0.0  -1.46
    0.0  1.0  0.0  -0.63
    1.0  0.0  0.0   1.04
    0.0  0.0  1.0   0.21

When I try to run it in Jupyter I get an error. I looked everywhere for the DataFrame function, but I couldn’t find it:

data = DataFrame(pet=["cat", "dog", "dog", "fish", "cat", "dog", "cat", "fish"],
                 children=[4., 6, 3, 3, 2, 3, 5, 4],
                 salary=[90, 24, 44, 27, 32, 59, 36, 27])
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-180-7a223f156521> in <module>
----> 1 data = DataFrame(pet=["cat", "dog", "dog", "fish", "cat", "dog", "cat", "fish"],
      2                  children=[4., 6, 3, 3, 2, 3, 5, 4],
      3                  salary=[90, 24, 44, 27, 32, 59, 36, 27])

NameError: name 'DataFrame' is not defined

??DataFrame
Object `DataFrame` not found.

I couldn’t find any DataFrame function in the fastai source code either.

Please help.


(Onkar N Mahajan) #2

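I guess the code is using ScikitLearn.jl, so the example is Julia rather than Python (note the `:pet` symbols and `fit_transform!`), which is why it fails in a Python kernel.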

See this - https://scikitlearnjl.readthedocs.io/en/latest/dataframes/
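
For anyone who wants to reproduce the docstring output in Python, here is a rough sketch. It assumes pandas and NumPy are installed. `pd.get_dummies` stands in for `LabelBinarizer`, and the manual zero-mean, unit-variance scaling stands in for `StandardScaler`. These are my substitutions, not what proc_df itself does.

```python
import numpy as np
import pandas as pd

# DataFrame comes from pandas, not fastai; same toy data as the docstring.
data = pd.DataFrame({
    "pet": ["cat", "dog", "dog", "fish", "cat", "dog", "cat", "fish"],
    "children": [4., 6, 3, 3, 2, 3, 5, 4],
    "salary": [90, 24, 44, 27, 32, 59, 36, 27],
})

# One-hot encode "pet"; columns come out in sorted order: cat, dog, fish.
pet_onehot = pd.get_dummies(data["pet"]).astype(float)

# Standardize "children" using the population standard deviation (ddof=0),
# which is NumPy's default for ndarray.std().
children = data["children"].to_numpy()
children_scaled = (children - children.mean()) / children.std()

# Stack the three one-hot columns and the scaled column into an 8x4 array.
result = np.column_stack([pet_onehot.to_numpy(), children_scaled]).round(2)
print(result)
```

The printed array should have the same shape and values as the 8x4 matrix in the docstring. If you only need the original NameError to go away, `from pandas import DataFrame` is enough, but the remaining lines of the docstring still won't run because they are Julia syntax.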
