Hi guys,
My model is loaded using pickle. The application can read the input values and perform the scaling, but it is unable to perform the prediction. The same code works in VS Code. Any suggestions?
Are you seeing any errors when you run the code on PythonAnywhere?
No error messages.
In that case, what do you mean by "unable to perform prediction"?
print("Making prediction...")
predicted_scaled = model_diameter.predict(scaled_input_data)  # does not produce any value
print("Predicted values:", predicted_scaled)
This does not produce any values.
Is scaled_input_data what you expect? Is predicted_scaled just None or something else? Do you have access to the code you are running, or is it just an unpickled black box? Does your unpickled code work on PythonAnywhere at all?
Below is my code. The scaling happens, but the prediction does not.
try:
    new_data = [[leaves, height, temperature, humidity]]
    feature_names = ['Leaves', 'Height', 'Temp', 'Humidity']
    print("Loaded model:", type(loaded_model))
    print("User input", new_data)
    # Create a DataFrame for new data
    new_data_df = pd.DataFrame(new_data, columns=feature_names)
    # Use the loaded scaler to scale the new data
    scaled_new_data = loaded_feature_scaler.transform(new_data_df)
    print("Scaled data", np.array(scaled_new_data))
    print("Code is being tested here ")
    print(type(scaled_new_data), scaled_new_data.shape)
    # Use the loaded model to make predictions on the scaled new data
    predicted_diameter = loaded_model.predict(np.array(scaled_new_data))
    #predicted_scaled = loaded_model.predict(scaled_new_data)
    print("Predicted_Scaled", predicted_diameter)
except Exception as e:
    print(f"Error during prediction: {e}")
2024-10-15 14:00:17 hello
2024-10-15 14:00:17 Loaded model: <class 'xgboost.sklearn.XGBRegressor'>
2024-10-15 14:00:17 User input [[9, 5.0, 23.0, 49.34]]
2024-10-15 14:00:17 Scaled data [[0.25 0.04347826 0.69590269 0.51513751]]
2024-10-15 14:00:17 Code is being tested here
2024-10-15 14:00:17 <class 'numpy.ndarray'> (1, 4)
Are you running your code as part of a website's view functions? If so, do you know if the model is trying to use threads?
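If threads do turn out to be involved, one thing that might be worth trying is asking the model to use a single thread right after unpickling it. This is only a sketch: it assumes the hang is thread-related, which nothing in the logs confirms yet, and `force_single_thread` is a made-up helper name. It relies on the scikit-learn style `get_params` / `set_params` methods and the `n_jobs` parameter that `XGBRegressor` exposes.

```python
def force_single_thread(model):
    """Set n_jobs=1 on a scikit-learn style estimator, if it has that parameter.

    Hypothetical helper, not part of any library. Returns True when the
    parameter was set and False when the object does not support it.
    """
    get_params = getattr(model, "get_params", None)
    set_params = getattr(model, "set_params", None)
    if get_params is None or set_params is None:
        return False
    if "n_jobs" not in get_params():
        return False
    set_params(n_jobs=1)
    return True


# Possible usage, right after pickle.load(...):
# force_single_thread(model_diameter)
```

If the prediction starts returning after this change, that would point at threading inside the web app as the cause; if it does not, the problem is probably elsewhere.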
Yes, I am running the code as part of a website.
import flask
from flask import Flask, render_template, request, redirect, url_for
import sqlite3
import pandas as pd
import pickle
from flask import jsonify
from datetime import datetime, timezone, timedelta
import numpy as np
import shap
import sklearn
from sklearn.preprocessing import MinMaxScaler
import xgboost
import sys
import threading
# Create a global lock
lock = threading.Lock()
app = Flask(__name__, static_url_path='/static')
print("SKlearn", sklearn.__version__)
print("XGboost", xgboost.__version__)
print("panda", pd.__version__)
print("Python version:", sys.version)
# Load the trained model
# Create new scaler and model instances
scaler = MinMaxScaler()
# Step 1: Attempt to load existing models and scalers
try:
    with open("Diameter_xgboost_model_Oct16.pkl", "rb") as model_file:  #8
        model_diameter = pickle.load(model_file)
except EOFError:
    print("Failed to load diameter model. Please check the file.")
try:
    with open("Diameter_feature_scaler_Oct16.pkl", "rb") as feature_model_file:
        model_feature = pickle.load(feature_model_file)
except EOFError:
    print("Failed to load feature scaler. Creating a new scaler.")
    # Create a new scaler if loading fails
try:
    with open("Diameter_target_scaler_Oct16.pkl", "rb") as target_model_file:
        model_target = pickle.load(target_model_file)
except EOFError:
    print("Failed to load target scaler. Creating a new scaler.")
    # Create a new scaler if loading fails
with open("Height_xgboost_model_Oct6.pkl", "rb") as height_model_file:
    model_height = pickle.load(height_model_file)
with open("Height_feature_scaler_Oct6.pkl", "rb") as height_feature_model_file:
    height_model_feature = pickle.load(height_feature_model_file)
with open("Height_target_scaler_Oct6.pkl", "rb") as height_targer_model_file:
    height_model_target = pickle.load(height_targer_model_file)
with open("pH_xgboost_model_Oct1.pkl", "rb") as pH_model_file:
    model_pH = pickle.load(pH_model_file)
with open("pH_feature_scaler_Oct1.pkl", "rb") as pH_feature_model_file:
    pH_model_feature = pickle.load(pH_feature_model_file)
with open("pH_target_scaler_Oct1.pkl", "rb") as pH_targer_model_file:
    pH_model_target = pickle.load(pH_targer_model_file)
with open("TDS_regression_model_Oct2.pkl", "rb") as TDS_model_file:
    model_TDS = pickle.load(TDS_model_file)
with open("TDS_feature_scaler_Oct2.pkl", "rb") as TDS_feature_model_file:
    TDS_model_feature = pickle.load(TDS_feature_model_file)
with open("TDS_target_scaler_Oct2.pkl", "rb") as TDS_targer_model_file:
    TDS_model_target = pickle.load(TDS_targer_model_file)
# Define the feature names used during training XGboost
feature_names_diameter = ['Leaves', 'Height', 'Temp', 'Humidity']
feature_names_height = ['Leaves', 'Diameter', 'Temp', 'Humidity']
feature_names_pH = ['TDS', 'EC', 'Temp']
def home():
    return render_template('index2.html')
def b_predict():
    return render_template('b_Predict.html')
def b_aquaponics():
    return render_template('b_aquaponics.html')
def diameter():
    return render_template('b_diameter.html')
def height():
    return render_template('b_height.html')
def ph():
    return render_template('b_pH.html')
def tds():
    return render_template('b_TDS.html')
@app.route("/predict", methods=["POST"])
def predict():
    with lock:
        if request.method == "POST":
            data = request.json
            leaves = int(data["leaves"])
            height = float(data["height"])
            temperature = float(data["temperature"])
            humidity = float(data["humidity"])
            input_data = pd.DataFrame([[leaves, height, temperature, humidity]], columns=feature_names_diameter)
            # Perform the prediction using the model
            #diameter = model_diameter.predict(input_data)
            # Scale the input data
            scaled_input_data = model_feature.transform(input_data)
            # Perform the prediction using the model
            scaled_input_df = pd.DataFrame(scaled_input_data, columns=feature_names_diameter)
            print("Scaled DataFrame for Prediction:", scaled_input_df)
            print("Prediction input columns:", scaled_input_df.columns)
            print("Going to predict:")
            predicted_scaled = model_diameter.predict(scaled_input_df)
            # Calculate SHAP values for the prediction
            explainer = shap.Explainer(model_diameter)
            shap_values = explainer(scaled_input_df)
            # Extract SHAP values for each feature
            shap_values_list = shap_values.values[0].tolist()
            shap_values_dict = dict(zip(feature_names_diameter, shap_values_list))
            # Categorize features based on SHAP values
            # Sort the shap_values_dict by absolute SHAP value, in descending order
            sorted_shap_values = sorted(shap_values_dict.items(), key=lambda x: abs(x[1]), reverse=True)
            # Create a list of features in sorted order, without distinguishing positive or negative
            influential_features = ["Temperature" if feature == "Temp" else feature
                                    for feature, shap_value in sorted_shap_values]
            # Construct the message
            if influential_features:
                message = f"The factors that influenced the predicted plant diameter the most (in order from highest to lowest) are: {', '.join(influential_features)}."
            else:
                message = "No significant factors influenced the prediction."
            # Inverse transform the prediction
            diameter = model_target.inverse_transform(predicted_scaled.reshape(-1, 1))
            diameter = float(diameter[0][0])
            diameter = round(diameter, 2)
            print("Predicted values (unscaled):", diameter)
            # Display the unscaled predictions
            print("Predicted values (unscaled):", np.round(diameter, 2))
            print(message)
            return jsonify(predicted_diameter=diameter,
                           message1=message)
@app.route("/predict_height", methods=["POST"])
def predict_height():
    with lock:
        if request.method == "POST":
            data = request.json
            leaves = int(data["leaves"])
            diameter = float(data["diameter"])
            temperature = float(data["temperature"])
            humidity = float(data["humidity"])
            print("Data captured")
            input_data = pd.DataFrame([[leaves, diameter, temperature, humidity]], columns=feature_names_height)
            print("Input Data", input_data)
            # Perform the prediction using the model
            #height = model_height.predict(input_data)
            scaled_input_data = height_model_feature.transform(input_data)
            print("Scaled", scaled_input_data)
            # Perform the prediction using the model
            #predicted_scaled = model_height.predict(scaled_input_data)
            scaled_input_df = pd.DataFrame(scaled_input_data, columns=feature_names_height)
            print("Scaled DataFrame for Prediction:", scaled_input_df)
            print("Prediction input columns:", scaled_input_df.columns)
            print("Going to predict:")
            # Perform the prediction using the model
            predicted_scaled = model_height.predict(scaled_input_df)
            print("The predicted", predicted_scaled)
            # Calculate SHAP values for the prediction
            explainer = shap.Explainer(model_height)
            shap_values = explainer(scaled_input_df)
            # Extract SHAP values for each feature
            shap_values_list = shap_values.values[0].tolist()
            shap_values_dict = dict(zip(feature_names_height, shap_values_list))
            # Sort the shap_values_dict by absolute SHAP value, in descending order
            sorted_shap_values = sorted(shap_values_dict.items(), key=lambda x: abs(x[1]), reverse=True)
            # Create a list of features in sorted order, without distinguishing positive or negative
            influential_features = [
                "Temperature" if feature == "Temp" else feature
                for feature, shap_value in sorted_shap_values
            ]
            # Construct the message
            if influential_features:
                message_height = f"The factors that influenced the predicted plant height the most (in order from highest to lowest) are: {', '.join(influential_features)}."
            else:
                message_height = "No significant factors influenced the prediction."
            # Inverse transform the prediction
            height = height_model_target.inverse_transform(predicted_scaled.reshape(-1, 1))
            height = float(height[0][0])
            height = round(height, 2)
            print("Predicted values (unscaled):", height)
            # Display the unscaled predictions
            print("Predicted values (unscaled):", np.round(height, 2))
            print(message_height)
            return jsonify(predicted_height=height,
                           height_message=message_height)
@app.route("/predict_pH", methods=["POST"])
def predict_pH():
    with lock:
        if request.method == "POST":
            data = request.json
            tds = int(data["tds"])
            ec = int(data["ec"])
            temperature = float(data["temp"])
            print("pH Data captured")
            input_data = pd.DataFrame([[tds, ec, temperature]], columns=feature_names_pH)
            print("Input Data", input_data)
            scaled_input_data = pH_model_feature.transform(input_data)
            print("Scaled", scaled_input_data)
            # Perform the prediction using the model
            predicted_scaled = model_pH.predict(scaled_input_data)
            # Calculate SHAP values for the prediction
            explainer = shap.Explainer(model_pH)
            shap_values = explainer(scaled_input_data)
            # Extract SHAP values for each feature
            shap_values_list = shap_values.values[0].tolist()
            shap_values_dict = dict(zip(feature_names_pH, shap_values_list))
            # Sort the shap_values_dict by absolute SHAP value, in descending order
            sorted_shap_values = sorted(shap_values_dict.items(), key=lambda x: abs(x[1]), reverse=True)
            # Create a list of features in sorted order, without distinguishing positive or negative
            influential_features = [
                "Temperature" if feature == "Temp" else feature
                for feature, shap_value in sorted_shap_values
            ]
            # Construct the message
            if influential_features:
                message_pH = f"The factors that influenced the predicted water pH the most (in order from highest to lowest) are: {', '.join(influential_features)}."
            else:
                message_pH = "No significant factors influenced the prediction."
            # Inverse transform the prediction
            pH = pH_model_target.inverse_transform(predicted_scaled.reshape(-1, 1))
            pH = float(pH[0][0])
            pH = round(pH, 2)
            print("Predicted values (unscaled):", pH)
            # Display the unscaled predictions
            print("Predicted values (unscaled):", np.round(pH, 2))
            return jsonify(predicted_pH=pH,
                           pH_message=message_pH)
@app.route("/predict_TDS", methods=["POST"])
def predict_TDS():
    with lock:
        if request.method == "POST":
            data = request.json
            pH = float(data["pH"])
            temperature = float(data["temp"])
            ec = int(data["ec"])
            print("TDS Data captured")
            input_data = pd.DataFrame([[pH, temperature, ec]], columns=feature_names_TDS)
            print("Input Data", input_data)
            scaled_input_data = TDS_model_feature.transform(input_data)
            print("Scaled", scaled_input_data)
            # Perform the prediction using the model
            predicted_scaled = model_TDS.predict(scaled_input_data)
            # Inverse transform the prediction
            tds_value = TDS_model_target.inverse_transform(predicted_scaled.reshape(-1, 1))
            tds = int(np.round(tds_value[0][0]))
            print("Predicted values (unscaled):", tds)
            # Feature Importance Extraction
            if hasattr(model_TDS.named_steps['regressor'], 'coef_'):
                # Get coefficients from the model
                coefficients = model_TDS.named_steps['regressor'].coef_
                # Flatten the coefficients if multi-target
                if coefficients.ndim > 1:
                    coefficients = coefficients.flatten()
                # Get feature names
                feature_names = feature_names_TDS
                # Create DataFrame for feature importance
                feature_importance = pd.DataFrame({
                    'Feature': feature_names,
                    'Coefficient': coefficients
                })
                # Add absolute coefficient column for sorting
                feature_importance['Absolute Coefficient'] = feature_importance['Coefficient'].abs()
                # Sort by absolute coefficient in descending order
                feature_importance = feature_importance.sort_values(by='Absolute Coefficient', ascending=False)
                # Extract feature names in order of importance
                influential_features = feature_importance['Feature'].tolist()
                # Construct the message based on feature importance
                message_TDS = f"The factors that influenced the predicted water TDS the most (in order from highest to lowest) are: {', '.join(influential_features)}."
            else:
                message_TDS = "The model does not have feature importance available."
            return jsonify(predicted_TDS=tds,
                           TDS_message=message_TDS)
@app.route('/save_option', methods=['POST'])
def save_option():
    # Connect to the SQLite database
    conn = sqlite3.connect('studyDemo1.db', check_same_thread=False)
    cursor = conn.cursor()
    # Retrieve the selected option from the form
    selected_option = request.form.get('option')
    # Insert the selected option into the database
    query = "INSERT INTO selectDHprediction (DHselection) VALUES (?)"
    values = (selected_option,)
    cursor.execute(query, values)
    # Commit the transaction and close the connection
    conn.commit()
    conn.close()
    # Redirect back to the home page or wherever you want after insertion
    return redirect(url_for('home'))
if __name__ == '__main__':
pip list
Package Version
------------------ -----------
blinker 1.8.2
click 8.1.7
cloudpickle 3.0.0
Flask 2.1.2
importlib-metadata 4.11.3
itsdangerous 2.2.0
Jinja2 3.1.4
joblib 1.2.0
llvmlite 0.42.0
MarkupSafe 2.1.5
numba 0.59.1
numpy 1.24.4
nvidia-nccl-cu12 2.23.4
packaging 24.1
pandas 2.2.3
pip 24.2
python-dateutil 2.9.0.post0
pytz 2024.2
scikit-learn 1.3.0
scipy 1.8.0
setuptools 75.1.0
shap 0.46.0
six 1.16.0
slicer 0.0.8
threadpoolctl 3.5.0
tqdm 4.66.5
tzdata 2024.1
Werkzeug 2.0.3
wheel 0.44.0
xgboost 1.7.4
zipp 3.20.2
(aquaenv) 11:27 ~/mysite $ python3 --version
Python 3.9.13
(aquaenv) 11:27 ~/mysite $
Python version 3.9 is selected for the web app.
Do you see any errors in the logs? Maybe add extra logging to see what is the state of the code in places you'd expect to behave differently?
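As a rough idea of what that extra logging could look like, here is a small wrapper around the `predict` call. It is only a sketch: `logged_predict` is a made-up name, and it assumes nothing about the model beyond it having a `predict` method. It writes a line before and after the call, flushes each line immediately, and logs the full traceback if an exception is raised, so the log should show whether the call fails or simply never returns.

```python
import sys
import time
import traceback


def logged_predict(model, data, label="model"):
    """Call model.predict(data), logging before, after, and on failure.

    Hypothetical helper for debugging; output goes to stderr and is
    flushed right away so nothing is lost if the process stalls.
    """
    def log(message):
        print(f"[{label}] {message}", file=sys.stderr, flush=True)

    log(f"calling predict on input of type {type(data).__name__}")
    start = time.monotonic()
    try:
        result = model.predict(data)
    except Exception:
        log("predict raised:\n" + traceback.format_exc())
        raise
    log(f"predict returned after {time.monotonic() - start:.3f}s: {result!r}")
    return result


# Possible usage inside the view:
# predicted_scaled = logged_predict(model_diameter, scaled_input_df, "diameter")
```

If the "calling predict" line shows up but neither of the other two ever does, the call is hanging rather than failing.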
No error messages
Try running the code outside of a web app and see how it behaves then.
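For example, something along these lines could be run from a console in the same virtualenv. It is a sketch, not a drop-in script: `load_pickle` and `predict_once` are made-up helper names, and the file names and input values in the commented call are the ones posted earlier in this thread. The scaler was fitted on a DataFrame, so passing a plain list may produce a warning about feature names.

```python
import pickle


def load_pickle(path):
    """Load one pickled object and report what type came back."""
    with open(path, "rb") as fh:
        obj = pickle.load(fh)
    print(f"Loaded {path}: {type(obj)}", flush=True)
    return obj


def predict_once(model, scaler, row):
    """Scale one input row and predict on it, with no Flask involved."""
    scaled = scaler.transform(row)
    print("Scaled data:", scaled, flush=True)
    predicted = model.predict(scaled)
    print("Predicted (scaled):", predicted, flush=True)
    return predicted


# Possible usage from a console:
# model = load_pickle("Diameter_xgboost_model_Oct16.pkl")
# scaler = load_pickle("Diameter_feature_scaler_Oct16.pkl")
# predict_once(model, scaler, [[9, 5.0, 23.0, 49.34]])
```

If this prints a prediction in a console but the same call never returns inside the web app, the difference is in the web app environment rather than in the pickled model.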