How do you use your finished, trained network inside a program?

i think there is a huge communication gulf here…
i am not using this on a website… never said website… go search the thread and no one said website
so that is assuming (remember the trope that goes with it and seems to be true here)

maybe there are deployment examples…
but so far i have not found the core of it…

i saved my net… with learn.save('filename')
now i should have used learn.export? ok. fine

but now… where is the example code of instantiating the empty net and loading it?
if it was so easy… why hasnt anyone in this whole thread pointed to the 25 lines of code?

to create a nn, load in data, and restore its weights takes only this

import torch
import pandas as pd
from fastai import *
from fastai.tabular import *
from pathlib import Path

path = Path("a/")
df = pd.read_csv(path / 'lg_binaryol.txt')
dep_var = 'Target'
cat_names = ['col' + str(i) for i in range(1, 65)]  # col1 … col64
cont_names = []
procs = [FillMissing, Categorify, Normalize]
testpct = int(len(df) * .10)
valpct = int(len(df) * .20)
trainpct = int(len(df) * .70)
test = TabularList.from_df(df.iloc[-testpct:-1].copy(), path=path, cat_names=cat_names, cont_names=cont_names)
data = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
        .split_by_idx(list(range(valpct, trainpct)))
        .label_from_df(cols=dep_var)
        .add_test(test)
        .databunch())

learn = tabular_learner(data, layers=[250,400], metrics=accuracy)
learn = learn.load('savenetII01')

less than 25 lines to be ready to make it learn more, or play with it, and such

it should take LESS than that to import, create an empty net of the right structure, load the structure, load the weights, and be ready to answer… by sending just a list the same size as the data it trained on (without a target).

it shouldn’t matter if i am using it in a web app, or on my desk, or in a phone, etc…
because with that core, you could use it anywhere you could run python…

that’s all i have been asking for since the beginning…
because that core can be inserted where its needed…
in my case, inside someone else's framework i have to use
in another case, maybe as a python module called by a coldfusion/java program on a website
in another case, on a phone… or tablet phablet…

all the other stuff around that request has nothing to do with the essential
and that's what i've been trying to get at.
how to make an empty shell that matches the training structure, load in what's needed, and then be able to query it to get outputs…

i will look at lesson two…
hopefully the 15 lines of code i need are there… (given you can amazingly do most everything else with less than 25 lines… thats incredible!!!)… maybe you guys are overthinking what should be an easy answer… what should be a tiny how-to post…

once i figure it out, i will do that post…
but i have to get there from here…

thanks, you guys been great…
but it still boggles the mind (of someone who has been writing code since the late 1970s professionally)

=========================

imagine this all was completed… there is no more training… there is no more test set…
there is only what core remains to be used… THAT is what i am looking for…
of course the first post apologizes that i have not learned the lingo in 3.5 days
so i dont know to use the word inference…
or that learn.save and learn.export are two different things (are they?)

but whatever is around the core code that does the work AFTER all the creation and testing is done
is not relevant… that, as you say, is up to the person deploying that core…

but so far, i have not found that easy answer without all the other stuff around it
its like you cant tell me how to install a water heater without first telling me how to build a house!

and i made a mistake above… been writing code since the late 1970s…
been doing it professionally since about 1984-1985 - about 35 years…
so i am not a dunce when it comes to this stuff… as i am still writing code and learning new stuff

dont get me wrong… what is here is amazing… really it is… (and i am a tough critic)
but whats missing is the easy core without the distracting costume on the outside
the naked person, not the person in a suit, a sari, a samurai outfit, etc…

thanks… i do appreciate it… i am just a bit frustrated…

=======================================

i just looked at Create a Learner for inference…
its still about validation and training… not consumption…

i went through all the text of lesson 2… nothing about after all the work is done what to do to have a naked system to wrap with what you need…

Create a Learner for inference…

third sentence in the tutorial
Now that our data has been properly set up, we can train a model. We already did in the look at your data tutorial so we’ll just load our saved results here.

i dont want to train it… its trained… i dont want to test it, its tested
i want what comes AFTER that… using it without training or testing any more…

at the bottom is the tabular example…
it shows how to load in your whole thing again, with training and test

notice the data on top has valid_idx = range(800,1000), which is for the adult dataset it mentions above it


it has learn.fit after that… which trains the network, no?

then it says learn.export()… should that be alone, or learn.export('filename')?
because after that, it says load learner adult…
above it, adult is the path from adult = untar_data(URLs.ADULT_SAMPLE)
why is that needed for deploy?

and then it calls predict… but its looking for one of the loaded adult data…
its not a simple string someone puts in to get an answer…

so the example you told me to look at doesnt have what you claim it has…
i wouldnt be bothering anyone if it did…

All the parts for inference without any data natively begin after the learn.export() step. To try this, try doing learn.export() on your model, then with a separate learner, i.e. learn2 = load_learner('export.pkl'), attempt to do your single prediction inference. This is what it's talking about. No data (i.e. rows or our training/validation sets) is saved with the model, and no data is needed to bring the model in.

ok, so this is basically it…

with the caveat that data has to be in what format?
the dataframe format with the column heads?
[sorry, python renames things that people have been using for decades
its only been two weeks so i havent learned the new lingo]

or can you give it a list?

lets say your data is like mine… 65 columns with the target
so without the target its 64 columns…

would a dataframe of 64 columns be ok?
how about a list with 64 items?
how about a comma delimited string of entries?

the documentation and such is not very clear on it…
which is why i have been having a problem
its as if its written for people who already know, not people who dont know

predict(self, item:ItemBase, return_x:bool=False, batch_first:bool=True, with_dropout:bool=False, **kwargs)

And looking up itembase isnt helpful either…

a sentence would go a long way in the docs with a complete excised example as i show…

basically what i show with some data defined in some way that is acceptable, and a sentence that would say it has to be in the same format minus the target… with the target… or different format with the right number would do, etc…

this is why things are terribly unclear…
this whole thread would never have happened if there was a page that it connected to that said, this is what you do to use it after you do the export from the prior page… the data has to be in x format… minus the target… or with a target that is ignored… etc…

even now i am going to have to try to figure out what format the data has to be in

if the original data was a data frame of
target, col1, col2 col3
“something”,0001,1010,1111

do i need a dataframe like this?
target, col1, col2 col3
“”,0001,1010,1111

or a dataframe like this
col1, col2 col3
0001,1010,1111

or
or a dataframe like this
0001,1010,1111

or would a list do?

this is why its confusing to me…
its clear to people who wrote it, and know whats expected
its not clear to someone who arrived from planet zed and has to use that documentation

thanks…
i will keep plugging at it till i hit the magic combination of data format…
i will first try to convert my comma delimited string into a dataframe with headers that matches the originally loaded data… (that in itself will keep me busy at my level of python!!!) :stuck_out_tongue_winking_eye:

There is an example with a dataframe in the Inference documentation. Pass a row from the dataframe, which is read in via Pandas. All that matters is that the columns have the same name as what you trained on as your x labels
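A minimal pandas sketch of that advice (the column names and values here are hypothetical stand-ins, not the actual training schema):

```python
import pandas as pd

# Hypothetical column names; what matters is that they match
# the names the model was trained on as its x labels.
cat_names = ['col1', 'col2', 'col3']

# One-row dataframe whose columns carry the training names.
row_df = pd.DataFrame([['0001', '1010', '1111']], columns=cat_names)

# fastai's learn.predict() expects a single row, i.e. a pandas Series:
row = row_df.iloc[0]
print(type(row).__name__)  # Series
print(list(row.index))     # ['col1', 'col2', 'col3']
```

That Series (one row carrying the trained column names) is what would be handed to learn.predict().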


thats what i thought…
but a single sentence that says:

The data that gets passed has to be in the same dataframe format as the data the net was trained on; it will ignore the target column…

would go a long long way as would breaking the docs into
page 1) preparing the model for deploying
page 2) using the prepared model in a deploy

with the above sentence that makes it clear as to the format for either a single prediction or using the other term, a multiple prediction…

though if you're using a dataframe like the original that went in, why have both a single and a batch statement, when a single is just a batch with one entry…

formatting my data is my problem, not yours…
so thanks for your infinite patience…

in a few days i will figure out how to change the format of what i have into what i need without having to save it to a file and read it in which is how it got changed in the first place… tomorrow is my interview and i have to get ready for that too… so cant keep on this… solutions architect is the position… thanks again


It says so here, btw:

“And we can predict on a row of a dataframe that has the right cat_names and cont_names.” Also, any mention of “dataframe” is 99% of the time a reference to a Pandas dataframe


I told you before. Use the export method. And, please recognize that 3 days is a too short timespan to learn what is needed, even with Fastai :wink:


yeah… i should have given it 4 days… :rofl:
but doing good…
the net training is up to about .98367
passed tests…
now onto the part you mentioned
my complaint that despite your pointing to things…
the documentation was not all that clear to someone who wasnt already doing it
thats not bad information, given that once i pass a certain point, my ability to judge that will fade quickly

have to go…
big interview today for a job i really really want!!!

I agree that, at first, I was equally lost about how to apply to test sets not already in the original databunch, however I found in docs and examples everything needed, having sufficient days to read :slight_smile:
A note regarding your accuracy: consider that, without a baseline to compare with, it is difficult to say whether it's good or really good. So be happy with 0.98, but it could be that the target is 0.995 - it all depends on the data set. Good luck for today!


thanks for the kind words…

right now my issue is recreating a dataframe that matches the one i used to train the net
its real easy in any one of a dozen languages EXCEPT python

so the data frame has to be like this…

Target col1 col2 col3 col4 col5 col6 col7 col8
0 xxsda 1100 1000000 0 1 1011110 11000001 1001001 11000001

  col9  ...   col55    col56  col57  col58  col59  col60   col61  \

0 1000100 … 100010 1111001 0 0 0 0 100000

  col62    col63  col64  

0 10010001 1100000 0

[1 rows x 65 columns]

and i have managed to get this far:
Empty DataFrame
Columns: [Target, col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12, col13, col14, col15, col16, col17, col18, col19, col20, col21, col22, col23, col24, col25, col26, col27, col28, col29, col30, col31, col32, col33, col34, col35, col36, col37, col38, col39, col40, col41, col42, col43, col44, col45, col46, col47, col48, col49, col50, col51, col52, col53, col54, col55, col56, col57, col58, col59, col60, col61, col62, col63, col64]
Index: []

[0 rows x 65 columns]

now i have to figure out how to get this string
abcs,00000001,10100100,01000000,00100001,00000001,00100100,01000000,00000001,00000000,11100001,01110011,10100001,00000011,11100001,01000011,00100001,00000111,00110000,01001000,00000001,00000000,11100001,01000011,00000001,00000101,11100001,01000111,00100001,11110110,00100001,01001000,10100001,00000011,11000001,01000000,00000001,00001000,00000001,01000110,00100011,11110110,01100001,01001000,00100001,11110110,00100001,01001000,10100001,00000101,01100001,01000000,00100010,11110110,01000001,01001000,00000001,00111000,00100000,01000100,10100010,00000111,00100010,01000000,00000000

into that dataframe as the one row…

funny, but if i took both as a string, wrote them to a file (like how i created the training/val/test csv)
and read it back i would be done…

but that, alas is cheating… and i would have learned little…

and if you note, the original does not have single quotes on the target or the ints
[excuse me while i scream…]

now this doesnt guarantee that my net will accept it…
but it IS the same format… and i will worry about the net later
python is not intuitive in terms of working with data compared to other languages
[and i know tons of them including some that today are quite archaic, like model204, and machine assembler]

So you have to learn Pandas :wink:


yup… and been working on that, along with numpy arrays, and on and on…
i will master it quick enough… probably record time… always do
but that dont mean its annoying the way it is…
(though understandably so given the idea being to work on arrays as single units not as packages of single units you focus on)…

to be truthful… the full process of trying to figure it out will reveal and give me more understanding from all the mistakes and wrong answers than right answers ever could give me… this is how i learn much faster than the average (aptitudes and talent aside).

tomorrow i will probably get what i need…
tonight i cram while watching TV…

thanks for the moral support VDM!

=======================================================================

not sure… but i think i am getting close… :crazy_face:
mylist = stringlist.split(',') creates the list of elements without adding quotes.
and theoretically i should be able to add that list as a single row
however if i use my dataframe with the column heads… the row becomes NaNs
if i dont, i get a dataframe with row 0 that is right, but no column heads…

are the column heads necessary for predict()?
yeah, silly to ask… of course they are… :rofl:

=======================================================================
and then… a tiny glow of a light came on… saving what remains of my hair…
i could make a empty pandas dataframe with a list of columns in square brackets…
why not make the dataframe all at once?

pd.DataFrame(mystringlist.split(","), columns=['Target','col1','col2',…etc

but that didnt work…
then i noticed that the list in an example had an extra set of square brackets
soooooooooooooo

pd.DataFrame([mystringlist.split(",")], columns=['Target','col1','col2',… etc

and that yielded…

Target col1 col2 col3 col4 col5 col6
0 abcs 11111100 11111100 00111110 11101110 00101111 00000000

   col7      col8      col9  ...     col55     col56     col57     col58  \

0 00000000 00000000 00000000 … 11111100 11111100 00111110 11101110

  col59     col60     col61     col62     col63     col64  

0 00101111 00000000 00000000 00000000 00000000 01000010

[1 rows x 65 columns]

which sure has heck looked like what i was reading from the CSV which is

Target col1 col2 col3 col4 col5 col6 col7 col8
0 ssk 1100 1000000 0 1 1011110 11000001 1001001 11000001

  col9  ...   col55    col56  col57  col58  col59  col60   col61  \

0 1000100 … 100010 1111001 0 0 0 0 100000

  col62    col63  col64  

0 10010001 1100000 0

[1 rows x 65 columns]

then my wife looked at me strange for doing a victory lap…
will let you know later if feeding that into the net works…
however its close enough for me to relax for a bit before i start pulling more hair out
:roll_eyes:

and of course… it doesnt work… ha ha…

path = Path("a/")
learn = load_learner(path, 'trained_model.pkl')

build the dataframe as indicated above… call it predictinput

print(learn.predict(predictinput))

and so, the above returns…
KeyError: 0

arrrghhhh!
back to the dungeon for this troglodyte…

GOT IT!!!
[which is why i erased the error and such in this edit!!!]

my mistake was passing the whole dataframe…

print(learn.predict(predictinput.iloc[0])) is what works…
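The likely reason, sketched in plain pandas (toy columns, not the real 65-column data): predict wants one item, and a row pulled out with .iloc[0] is a 1-D Series, while the frame itself is 2-D.

```python
import pandas as pd

df = pd.DataFrame([['abcs', '0001', '1010']],
                  columns=['Target', 'col1', 'col2'])

# The whole frame is 2-D; a single row extracted with .iloc[0] is 1-D.
# That dimensional difference is why predict(df) chokes with KeyError: 0
# while predict(df.iloc[0]) works.
print(df.shape)          # (1, 3)
print(df.iloc[0].shape)  # (3,)
```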


now i will insert this in their framework and see how well that works
this is getting really exciting now!!!

spoke too soon… its giving the same answer for every string…
sigh

its as if you do one prediction, and then it doesn’t clear putting out the same one
[i tried a different pkl]

retraining a test…
trying to figure out if the issue is the preprocessing (does the learn.predict process the input data?)

if i list out the data, it appears that the one from the csv gets converted to int while the other remains a string…

from the CSV…
Target col1 col2 col3 col4 col5 col6 col7 col8
0 m68k 1100 1000000 0 1 1011110 11000001 1001001 11000001

  col9  ...   col55    col56  col57  col58  col59  col60   col61  \

0 1000100 … 100010 1111001 0 0 0 0 100000

  col62    col63  col64  

0 10010001 1100000 0

[1 rows x 65 columns]

from my dataframe made from the same data, but from source not csv
Target col1 col2 col3 col4 col5 col6
0 xxxx 00100111 10111101 11111111 11100000 10101100 01100000

   col7      col8      col9  ...     col55     col56     col57     col58  \

0 00000000 00000000 00000000 … 00010000 00100001 00000011 11100000

  col59     col60     col61     col62     col63     col64  

0 00000000 00001000 00100111 10111101 00000000 00100000

[1 rows x 65 columns]

i feel that i must be close…

even closer
print([mystringlist.split(',')])
print()
print([int(x) for x in mystringlist.split(',')])

however… the 2nd while fixing the int issue… makes the dataframe throw up…
[[‘0’, ‘00000001’, ‘00000101’, ‘10100000’, ‘00010010’, ‘00100101’, ‘00001010’, ‘11010010’, ‘00100010’, ‘01100010’, ‘00011000’, ‘00100001’, ‘00101001’, ‘00000001’, ‘00010010’, ‘00100101’, ‘00001001’, ‘11010001’, ‘00010000’, ‘00110111’, ‘00101001’, ‘00000001’, ‘00010010’, ‘00100101’, ‘00000001’, ‘10100000’, ‘01110011’, ‘01100000’, ‘00000110’, ‘11010000’, ‘00001011’, ‘00000000’, ‘00001001’, ‘00000000’, ‘00110000’, ‘00000000’, ‘00000000’, ‘00000000’, ‘00101000’, ‘00000000’, ‘00000000’, ‘00000000’, ‘00000100’, ‘00000000’, ‘00000000’, ‘00000000’, ‘00100110’, ‘00000000’, ‘00000000’, ‘00000000’, ‘00011100’, ‘00000000’, ‘00000000’, ‘00000000’, ‘00101100’, ‘00000000’, ‘00000000’, ‘00000000’, ‘10000110’, ‘00101111’, ‘00100010’, ‘01001111’, ‘11111000’, ‘01111111’, ‘01010011’]]

[0, 1, 101, 10100000, 10010, 100101, 1010, 11010010, 100010, 1100010, 11000, 100001, 101001, 1, 10010, 100101, 1001, 11010001, 10000, 110111, 101001, 1, 10010, 100101, 1, 10100000, 1110011, 1100000, 110, 11010000, 1011, 0, 1001, 0, 110000, 0, 0, 0, 101000, 0, 0, 0, 100, 0, 0, 0, 100110, 0, 0, 0, 11100, 0, 0, 0, 101100, 0, 0, 0, 10000110, 101111, 100010, 1001111, 11111000, 1111111, 1010011]

so close but yet so far…
ah… add the extra brackets…
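Both gotchas from the last few posts can be reproduced in plain pandas (toy values here, not the real 65-column data):

```python
import pandas as pd

s = '0,00000001,10100100'

# Gotcha 1: one set of brackets makes each element its own ROW;
# an extra set of brackets makes the whole list one row of COLUMNS.
wrong = pd.DataFrame(s.split(','))    # 3 rows x 1 column
right = pd.DataFrame([s.split(',')])  # 1 row x 3 columns

# Gotcha 2: int() drops leading zeros, which is also what pandas'
# type inference does when the same values come back from a CSV.
as_ints = [int(x) for x in s.split(',')]

print(wrong.shape)  # (3, 1)
print(right.shape)  # (1, 3)
print(as_ints)      # [0, 1, 10100100]
```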

and we now have predictions that change!!!
huzzah!!! vivat!! vivat!

spoke too soon… of course!!!
what i need now is a string… but what i am getting is <class 'fastai.core.Category'>
and of course, this wont work: ''.join([str(x) for x in target])
because the class is not iterable… (but at least i know its integers that look like a string)

so close, but yet so far
water water everywhere, but not a drop to drink!!! :stuck_out_tongue_winking_eye: :crazy_face:

now this was very interesting… here is my ‘solution’
what comes out of the pred is a category…
learn.data.classes will reveal what they are…
so i use this to get the number rather than the text i cant convert: pred[0].data.numpy()

pred = learn.predict(predictinput.iloc[0])
target = pred[0].data.numpy()
learn.data.classes[target]

which i believe now returns the string from the classes…
checking it with type() gives me <class 'str'>
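Stripped of the fastai specifics, that last step is plain list indexing; the classes list and index below are hypothetical stand-ins for what learn.data.classes and pred[0].data.numpy() might hold:

```python
# Hypothetical stand-ins: learn.data.classes is an ordered list of the
# target labels, and the prediction's .data.numpy() is an index into it.
classes = ['abcs', 'm68k', 'xxsda']  # what learn.data.classes might hold
predicted_index = 1                  # what pred[0].data.numpy() might yield

# Index the classes list to turn the numeric prediction into its label.
label = classes[int(predicted_index)]
print(label)                 # m68k
print(type(label).__name__)  # str
```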

now i am off to trying to use that in the framework provided…
[not bad for 4.5 days… :rofl: :rofl: :rofl: :rofl: :rofl:]

and now it works in the framework… running ten iterations of their code yields 10 correct predictions!!!

awesome!!!

i am hoping that what i put in this, though a kind of mess, can help someone else who is as newb as i am… :grinning: :grinning: :grinning:


The challenge was completed this morning…
I succeeded in its requirements and sent off my resume and the information

THANKS to all who helped!!!
:grinning:


Hi Artfldgr hope you had lots fun getting your program working.

Newbies on this site are often people who have no programming experience at all. The course requirement is a minimum of 1 year of programming, preferably in python.

As Jeremy says the best way to help is to write a little blog of what you overcame or achieved.

Well done mrfabulous1 :grinning: :grinning:

It was nice to hear from you Mr Fabulous1

i used to do a bit of NN programming back when it was in its infancy, and you had to write everything end to end… and in C… C# didnt exist yet… java was a distant thing… and things were glacial compared to today

i have been programming in python now for maybe a month, month and a half…
and as someone pointed out in this thread… my fastai is now about a week

as to the tiny blog… well, i did include in this discourse the code and discoveries i made and what i did…
i figure that while i may sound quite incompetent to someone that knows more… the idea of how to convert the output into a text output and so forth would help… there really is no reference to that to make it easier (and the truth is, knowing python more may not have made it all that much easier given that some of it had to do with gathering a tiny comment in a forum here and there… like how to list the classes data)

i hope what i left behind helps someone
and that i didnt frustrate the people that tried to help too much

while i did complete the challenge that i was doing…
i dont know whether that would lead to anything
but i am now emboldened to do my other project, which is tabular in finance
the database i have put together has over 37 million days of information going back about 10 years
i have another that is even larger… recently cleaned and portioned into a sub base of two weeks of data in one table and as of yesterday a secondary table of computed figures… these weeks are complete, and number over 1 million across 11,000 companies. IF i decide to use points in which volume data is missing or zero, i could probably bring that up by a few million more…

todays challenge is to take the desired target and balance the data to that end…
and see if a nn will give the kind of answer i want… funny, but i have been looking at these for a while
and they all (of course not all but all i could find and knw about) seem to ask similar questions which are very very hard for a net to answer… i am going more towards a different set of answers.

also, what i am hoping to do is take the answers, and record them, and then use them to determine a value as to quality of output… and run that through a 2nd network so that it can predict the quality of the guess made by the first net…

that should be interesting.

as to having fun? yes, i had a lot!!! the frustration aside, fastai is an incredible piece of work, organizing and making things more accessible without costing in quality…

to think that i was able to pretty much walk in, sit right down, and get a finished thing in under a week without, as you point out, putting a year of study into it… granted i am 35 years experienced in coding, but as i also point out, python is not that intuitive, its bloated with libraries the way C always wanted to be, and NN is not an easy subject even among those that have spent time on it…

so its going to be interesting to see what, if anything, i can get out of project II
hopefully, a job… (still waiting on the results of the interview i had - and whether the references put me in)

all the best everyone!!!
and thanks!!!


Glad to see you succeeded.
Regarding neural networks, there is a quantum leap separating what you (and I too) did at their, if not infancy, adolescence, from current networks, corresponding to the move from shallow to deep. So it's good to have knowledge of the basics, but what drastically changed the performance is the increase in complexity, which is also complexity in understanding. On the other side, most of the complexity is now masked by frameworks.
@mrfabulous1 suggests a little blog: the notes here are almost unreadable, and their unreadability also made helping difficult, although successful anyway :slight_smile: . An ex-post synthesis could be useful. Now perhaps you can be more explicit on some details, without divulging too much. I will do the same as soon as the results of one challenge (of the scientific kind) become official :slight_smile:


Hi, and thanks…
given your request and the challenges response, here is the rest of the info:

The challenge i was working on was the Praetorian challenge…
The net i worked on solved the challenge… but there is no reward for doing so.

Their response to my resume, was basically you cant change…
and there is no being added to their hall of fame for succeeding

The truth is that their challenge no longer gives a hash if you win, and so, you no longer can get a reward… they still keep the challenge up (though i dont know why)