joblib unpickler AttributeError

Hi everyone,

I trained a model in a Jupyter notebook:

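from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

params = {'C': [0.1, 1, 10], 'kernel': ['rbf', 'linear']}
# Training
svc = OneVsRestClassifier(GridSearchCV(SVC(), params))
svc.fit(X_train, y_train)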

Then, since I'm happy with the result and it took a loooooong time to run, I saved it with joblib so I can reuse it later:

file_svc = 'fit_SVM'
joblib.dump(svc, file_svc)

Up to this point everything works fine. To reuse the model later and predict on new data, here is what I do:

svc = joblib.load('OC-Projet-6/fit_SVM')
y_sup = svc.predict(X_sup)

This is the code (with the path) that I use in the Jupyter notebook, and it works perfectly. But when I try the same thing on PythonAnywhere, I get an AttributeError:

svc = joblib.load('fit_SVM')
y_sup = svc.predict(X_sup)

The error doesn't come from the path (I make other joblib.load calls and this is the only one that fails). Here is the message I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/AMarnier/mysite/modele.py", line 60, in tag_proposal
    svc = joblib.load('fit_SVM')
  File "/home/AMarnier/.local/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 596, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/AMarnier/.local/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 524, in _unpickle
    obj = unpickler.load()
  File "/usr/lib/python3.6/pickle.py", line 1050, in load
    dispatch[key[0]](self)
  File "/usr/lib/python3.6/pickle.py", line 1338, in load_global
    klass = self.find_class(module, name)
  File "/usr/lib/python3.6/pickle.py", line 1392, in find_class
    return getattr(sys.modules[module], name)
AttributeError: module 'sklearn.utils.deprecation' has no attribute 'DeprecationDict'

Does anyone have an idea what causes the issue? Thanks for your help.

Andréa

Are you pickling the file on your own machine, and then unpickling it on PythonAnywhere? If so, that would explain it -- unfortunately pickled files can depend on the exact configuration of the machine where they were pickled (in particular, the installed package versions), so they're not reliably portable.
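To illustrate why a pickle can load on one machine and fail on another, here is a minimal sketch using only the standard library. A pickle stores a reference to a class (module name plus class name), not the class's code, so loading fails if that name cannot be found in the module at load time. The module `fake_lib` and the class `Helper` are hypothetical stand-ins invented for this example; they are not part of scikit-learn or joblib.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a library module whose contents differ
# between two installed versions.
fake_lib = types.ModuleType("fake_lib")
sys.modules["fake_lib"] = fake_lib


class Helper:
    """Hypothetical class that exists only in the 'old' library version."""

    def __init__(self, value):
        self.value = value


# Make the class look as if it were defined inside fake_lib.
Helper.__module__ = "fake_lib"
Helper.__qualname__ = "Helper"
fake_lib.Helper = Helper

# The pickle records the reference "fake_lib.Helper", not the class code.
payload = pickle.dumps(Helper(42))

# While the class is still present, the round trip works.
restored = pickle.loads(payload)

# Simulate a library version in which the class no longer exists.
del fake_lib.Helper

try:
    pickle.loads(payload)
    error_message = None
except AttributeError as exc:
    # Same kind of failure as in the traceback above: the name lookup
    # on the module fails while the unpickler resolves the class.
    error_message = str(exc)

print(error_message)
```

The same data loads or fails depending only on what the module contains at load time, which is why one pickled file can break while others from the same machine load fine.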

I pickle the file on my machine, upload it to PA and then try to unpickle it. I don't think that's the issue, though, because it is the only one of the 4 unpickles that causes a problem:

    tfidf = joblib.load('fit_tfidfvect')
    svc = joblib.load('fit_SVM')

    cv = joblib.load('fit_countvect')
    lda = joblib.load('fit_LDA')

The TF-IDF, CountVectorizer and LDA files all unpickle just fine.

Right, but they could all contain different data. If the versions of the different packages used in a particular file match, or at least are reasonably compatible, then unpickling will work. But if they don't, it won't.

The specific problem you're seeing seems to relate to sklearn -- maybe you could run pip3.6 install --user scikit-learn==XXXXX (where XXXXX is the version of scikit-learn you have locally) in a bash console on PythonAnywhere so that the installed version matches?
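One way to compare the two environments is to run the same small script on both machines and diff the output. The sketch below is only an illustration: the helper name `collect_versions` and the default package list are assumptions, so adjust the list to whatever the pickled object actually depends on. It relies on each package exposing a `__version__` attribute, which scikit-learn, joblib, numpy and scipy do.

```python
import importlib
import platform


def collect_versions(packages=("sklearn", "joblib", "numpy", "scipy")):
    """Map each import name to its version string (None if not importable).

    The default package list is an assumption made for this example.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is not installed in this environment.
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

Run it with the same interpreter that loads the pickle (for example the python3.6 used by the web app), since packages installed with --user for one Python version are not visible to another.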

I've already checked all the versions. I updated scikit-learn, joblib, scikit-multiclass ... everything matches.

Well, I removed the GridSearchCV from the object saved in the joblib file and it works now. I'd still like to understand what happened here, but at least my app is working :)