Bug/Issue with @patch'ed summary method in hook.py (solved)

I'm testing out the XLMForSequenceClassification huggingface transformer model and getting some really odd behavior. Note the partial stack trace below:

~/anaconda3/envs/blurr/lib/python3.7/site-packages/transformers/modeling_utils.py in forward(self, hidden_states, cls_index)
   1128 
   1129         output = self.first_dropout(output)
-> 1130         output = self.summary(output)
   1131         output = self.activation(output)
   1132         output = self.last_dropout(output)

~/development/projects/blurr/_libs/fastai2/fastai2/callback/hook.py in summary(self, *xb)
    161 def summary(self:nn.Module, *xb):
    162     "Print a summary of `self` using `xb`"
--> 163     sample_inputs,infos = layer_info(self, *xb)
    164     n,bs = 64,find_bs(xb)
    165     inp_sz = _print_shapes(apply(lambda x:x.shape, xb), bs)

The SequenceSummary instance upon which self.summary(output) is called … ends up calling fastai's summary() method instead of its own. This is caused by the following @patch'ed method in fastai, which ends up shadowing the instance's own summary attribute:

@patch
def summary(self:nn.Module, *xb):
    ....
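
As far as I can tell, the reason the patch wins is that PyTorch stores submodules in self._modules rather than in the instance __dict__, and nn.Module.__getattr__ only looks there when normal attribute lookup fails … so a summary defined on the nn.Module class itself is found first and shadows the submodule of the same name. Here is a minimal sketch (not the actual fastai or huggingface code) that reproduces the behavior:

import torch.nn as nn

def summary(self, *xb):                  # stand-in for fastai's @patch'ed summary
    return "fastai's patched summary"

nn.Module.summary = summary              # roughly the effect of @patch on the class

class SequenceSummaryLike(nn.Module):    # hypothetical stand-in for HF's SequenceSummary
    def __init__(self):
        super().__init__()
        self.summary = nn.Linear(4, 2)   # goes into self._modules['summary'], not __dict__

    def forward(self, x):
        return self.summary(x)           # resolves to the patched method, not the Linear

m = SequenceSummaryLike()
print(m.summary)                         # bound method, not Linear(in_features=4, ...)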

This looks like a nasty bug … but I'm not sure how to resolve it. Any ideas?

@sgugger: since you have the inside track on huggingface and fastai v2, I'm wondering if you have a recommendation. If not, sorry for the @ mention :stuck_out_tongue_winking_eye:

Is there a way to get rid of @patch methods like …

@patch
def summary(self:nn.Module, *xb):
    "Print a summary of `self` using `xb`"
    ...

… because of this patch, the layer huggingface assigns on this line never actually gets called:

self.summary = nn.Linear(config.hidden_size, num_classes)

I can't see anything to do about the huggingface code, so I'm hoping there is something that can be done on the fastai side. At the very worst, perhaps it is enough to not use such a common name (e.g., replace it with fastai_summary()) … dunno.

I guess the easiest would be to write your own patch: find the original in the code, paste your version just above your call, and since patching is dynamic it should take yours from that moment on.

But I suspect some things will probably break …

The problem is that it replaces the call in the huggingface library, because it defines a summary method that acts on an nn.Module.

Hmmm, the only thing I can suggest for the moment: what about patching the original HF summary onto nn.Module again? Something like

@patch_to(nn.Module)
def summary(signature of HF):
   original HF code

Though it's not a “solution”, I guess it will work as expected.


“Another” option … would be to install fastai2 from source and then comment out the summary code.

nope.

The problem is that in XLMForSequenceClassification …

self.summary = nn.Linear(config.hidden_size, num_classes)

It's not a method, but because it is an nn.Module it gets called as if it were one …

output = self.summary(output)

I've tried @patch'ing def summary in several ways, none of which works as intended. All in all, as nice as @patch is, there is some code smell here, and it can have some really awful side effects, as I'm seeing now.

I still think the easiest/best thing right now is to not patch classes/methods outside of the library itself … or at worst, rename the summary method to something very unlikely to be used in other libraries (e.g., fastai_summary).

Or then again, why aren’t these methods just created as part of the Learner class? Why are we using @patch at all???

Has anyone got a successful example of using XLM for classification tasks in fastai v2? At this point, I can’t think of what else to try :frowning:

Just found this in utils.py, just mentioning it here in case it helps:

# Cell
_patched = ['read', 'readlines', 'write', 'save', 'load', 'ls']

@contextmanager
def remove_patches_path():
    patches = L(getattr(Path, n) for n in _patched)
    try:
        for n in _patched: delattr(Path, n)
        yield
    finally:
        for (n, f) in zip(_patched, patches): setattr(Path, n, f)

It is used like this:

with remove_patches_path():
    assert not hasattr(Path, 'write')
assert hasattr(Path, 'write')

Probably something like that for Learner could help.
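
Something like this could be adapted for the summary patches (an untested sketch; remove_patched_summary is a made-up name, it isn't in fastai):

from contextlib import contextmanager
import torch.nn as nn
from fastai2.learner import Learner

@contextmanager
def remove_patched_summary():
    # classes that fastai2.callback.hook patches a `summary` onto
    patched = [cls for cls in (Learner, nn.Module) if 'summary' in cls.__dict__]
    saved = [(cls, cls.__dict__['summary']) for cls in patched]
    try:
        for cls, _ in saved: delattr(cls, 'summary')
        yield
    finally:
        for cls, f in saved: setattr(cls, 'summary', f)

with remove_patched_summary():
    assert not hasattr(nn.Module, 'summary')
assert hasattr(nn.Module, 'summary')   # assuming callback.hook was imported, so the patch existed to restore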

I'm not so sure, but it's probably because it enables literate programming.


SOLVED

Nice find!

Initially I started down the path of just importing what I needed … and as an FYI, just about everything you need can be brought in with from fastai2.basics import *. But I came across a problem when trying to import a specific function from the .py file where these patched methods live.

The problem with these @patch methods is that even if you import just one function, e.g. from fastai2.callback.hook import _print_shapes, Python still reads/runs everything in that file first … AND that file, for whatever reason, is where the patched summary methods are. Thus, if you don't want the patched methods, you can't use anything in this file.
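
To illustrate, here is a generic Python sketch (the module and names below are made up, nothing to do with fastai):

import pathlib, sys, tempfile

# Write a tiny throwaway module to disk to show that `from mod import one_name`
# still executes the whole module body, including any class-level patch in it.
tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "demo_mod.py").write_text(
    "class Victim: pass\n"
    "Victim.summary = lambda self: 'patched at import time'\n"
    "def wanted(): return 42\n"
)
sys.path.insert(0, tmp)

from demo_mod import wanted    # we only asked for `wanted` ...
from demo_mod import Victim
print(Victim().summary())      # ... but the module-level patch ran anyway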

Your solution … works!

delattr(Learner, 'summary')    # remove the @patch'ed summary from fastai's Learner
delattr(nn.Module, 'summary')  # remove the @patch'ed summary from nn.Module
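
A quick sanity check that the patch is really gone (just a throwaway probe, not part of the fix itself):

import torch.nn as nn

class Probe(nn.Module):                  # any module with a `summary` submodule
    def __init__(self):
        super().__init__()
        self.summary = nn.Linear(4, 2)

assert not hasattr(nn.Module, 'summary')       # class-level patch is gone
assert isinstance(Probe().summary, nn.Linear)  # self.summary reaches the Linear again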

It feels hacky, but you've figured out how to get rid of patched methods, which will be helpful to anyone who needs to remove them where there are conflicts (e.g., as is the case with huggingface's XLMForSequenceClassification model).

Thanks much. You’ve restored a bit of my sanity :slight_smile:
