Create a Machine Learning Web App in Minutes with Streamlit

Hello everyone! In this blog we are going to take a look at Streamlit, a low-code web development library. For a data science student, fitting the model is a small and relatively easy part of a data science project lifecycle. But what is the point of training a model if you cannot share it with others and show what it can do? That is where web development comes into the picture. If you want to showcase your work in an aesthetically pleasing way, Streamlit can be your go-to web framework.

What is Streamlit?

Streamlit is an open-source app framework for Machine Learning and Data Science projects. It makes it easy to create and share data apps in Python.

Streamlit Docs

In the implementation part, we will build a diabetes prediction web application using Streamlit.

Building the model:

import warnings as w
w.filterwarnings('ignore')
import pickle
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score # for cross validation

data = pd.read_csv('diabetes.csv')
data.head()
# Treating 0 as a missing value in glucose, blood pressure, skin thickness, insulin and BMI, because these features cannot really be 0.
# The zeros are replaced with the column mean computed from the non-zero values only.
columns = ['Glucose','BloodPressure','SkinThickness','Insulin','BMI']
data[columns] = data[columns].replace(0, float('nan'))
data[columns] = data[columns].fillna(data[columns].mean())
# Splitting the data into train and test
X = data.drop('Outcome', axis=1)
y = data['Outcome']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=101,stratify=y)

# Using Random Forest Classifier; it does not require scaling of the data because tree splits depend only on the ordering of feature values, not on their scale

rfc = RandomForestClassifier(n_estimators=250, random_state=101, n_jobs=-1,class_weight={0:0.3,1:0.7})
cv = cross_val_score(rfc, X_train, y_train, cv=5, scoring='roc_auc').mean()
print(f"Mean cross-validated ROC AUC: {cv:.3f}")
rfc.fit(X_train, y_train)
y_pred = rfc.predict(X_test)

# Evaluating the model
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))

# Saving the model
with open('model.pkl', 'wb') as f:
    pickle.dump(rfc, f)
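
The zero-handling step in the code above can be illustrated without pandas. The following is only a minimal plain-Python sketch: the helper name `impute_zeros_with_mean` and the sample glucose readings are made up for illustration, and it takes the mean over the non-zero readings so that the placeholder zeros do not pull the average down.

```python
from statistics import mean


def impute_zeros_with_mean(values):
    """Return a copy of values where each 0 is replaced by the mean of the non-zero entries."""
    observed = [v for v in values if v != 0]
    if not observed:
        # Nothing to compute a mean from, so return the values unchanged.
        return list(values)
    fill = mean(observed)
    return [fill if v == 0 else v for v in values]


# Hypothetical glucose readings; the zeros stand for missing measurements.
glucose = [148, 0, 183, 89, 0, 116]
imputed_glucose = impute_zeros_with_mean(glucose)
print(imputed_glucose)
```

The same idea applies column by column to a DataFrame: mark the zeros as missing first, then fill them with the mean of what remains.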

After training the model, we saved it in a pickle file. A pickle file is a binary file that stores a serialized Python object hierarchy. The Python pickle module serializes objects so they can be saved to disk, and de-serializes them when they are loaded back into a Python program. The pickle file stores the model's state as it was after training, and the loaded object retains all of its methods.
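
To make the save-and-load round trip concrete, here is a small sketch that uses only the standard library. The dictionary is a stand-in for the trained classifier, and the file name `example.pkl` is an assumption chosen for the example.

```python
import pickle
import tempfile
from pathlib import Path

# A stand-in object; in the article this would be the trained classifier.
settings = {"n_estimators": 250, "random_state": 101}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "example.pkl"

    # Serialize the object to disk.
    with path.open("wb") as f:
        pickle.dump(settings, f)

    # De-serialize it back into a new Python object.
    with path.open("rb") as f:
        restored = pickle.load(f)

print(restored == settings)
```

The restored object is an equal copy, not the same object in memory. Only unpickle files you trust, since loading a pickle can execute arbitrary code.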

But a model sitting in your PC's storage has no value, so we are going to make a web application that predicts whether a person has diabetes or not.


Making of Web Application:

Execute the command in your command prompt:

pip install streamlit

This will install streamlit and you are ready to make your web application.

Host the Web Application:

Create a Python file named “app.py”, add the following code, and save the file:

import streamlit as st

st.title('Diabetes Prediction App')

From the directory containing app.py, execute this command in CMD:

streamlit run app.py

# If this gives a "streamlit command not found" error, run this instead:

python -m streamlit run app.py

If it’s up and running then we are good to go to the next stage of web application development.

Requirements for Web Application:

This is a crucial step in web development. Whether you use a low-code framework or one that requires much more coding, the developer must have a layout or blueprint of the application in mind.

Layout Steps:

Title of your web application (we have already written that one).
A simple explanation of how the web application works. For a layman, the UI of the application matters the most, not the code or the backend.
What type of input should be taken from the users? How will the input be taken? How will the inputs be stored for the model's predictions?
How the output will be shown to the user.

First Step:

We have already written this code. Quickly moving on to the next step.

Second Step:

For the explanation, we'll use Streamlit's text elements (you can check out Streamlit's API docs), which display text on the screen of the web application.

Explanation Code:

import streamlit as st

# Title of the Web Application.
st.title("Diabetes Prediction App")

# Header for the instructions.
st.header("Instructions for using the app:")

# Steps of the Explanation.
st.write(
    "1. Please enter all the requested values to predict whether you have diabetes or not."
)
st.write("2. Click on the 'Predict' button to check if you have diabetes or not.")
st.write(
    "3. If you want to know more about the app, please click on the 'About' section."
)

Third Step:

For this step, we need to know which features our model was trained on, as well as the data type of each feature.

Description of the features used in modelling:

1. Number of pregnancies: Integer having a range of [0,17].

2. Plasma glucose concentration 2 hours in an oral glucose tolerance test: Integer having a range of [44,200).

3. Diastolic blood pressure (mm Hg): Integer having a range of [24,200).

4. Triceps skin fold thickness (mm): Integer having a range of (5,100).

5. 2-Hour serum insulin (mu U/ml): Integer having a range of (10,850).

6. Body mass index (weight in kg/(height in m)²): Float having a range of (15,70).

7. Diabetes pedigree function: Float having a range of (0,3)

8. Age (years): Integer having a range of (20,85).

These features are all of numeric type so we can take the help of Streamlit’s input widgets.
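
The ranges listed above can also be written down as data and used to sanity-check user input before it reaches the model. This is only a sketch: the dictionary keys and the helper name `out_of_range` are hypothetical, and treating both ends of every range as inclusive is a simplifying assumption.

```python
# Bounds copied from the feature list above; treating both ends as inclusive is an assumption.
FEATURE_RANGES = {
    "pregnancies": (0, 17),
    "glucose": (44, 200),
    "blood_pressure": (24, 200),
    "skin_thickness": (5, 100),
    "insulin": (10, 850),
    "bmi": (15, 70),
    "dpf": (0, 3),
    "age": (20, 85),
}


def out_of_range(values):
    """Return the names of features that are missing or outside their listed range."""
    problems = []
    for name, (low, high) in FEATURE_RANGES.items():
        value = values.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    return problems


# Hypothetical user input.
sample = {
    "pregnancies": 1,
    "glucose": 100,
    "blood_pressure": 150,
    "skin_thickness": 20,
    "insulin": 350,
    "bmi": 20.0,
    "dpf": 0.47,
    "age": 21,
}
print(out_of_range(sample))
```

A check like this could be run on the widget values before predicting, so that the model is not asked about inputs far outside the data it was described with.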

Input Code:

# No of pregnancies
pregnancies = st.number_input("No of Pregnancies:", 0, 17, 1, key="pregnancies")

# Plasma glucose concentration(2 hr)
glucose = st.slider(
    "Plasma Glucose Concentration(2 hr):", 0, 300, 100, 1, key="glucose"
)
# An explicit key gives each widget a unique identity. If a key is omitted, Streamlit generates one from the widget's content,
# so keys matter most when two widgets would otherwise be identical.
blood_pressure = st.slider(
    "Diastolic Blood Pressure(mm Hg):", 20, 400, 150, 1, key="blood_pressure"
)

# Triceps skin fold thickness
skin_thickness = st.slider(
    "Triceps Skin Fold Thickness(mm):", 0, 99, 20, 1, key="skin_thickness"
)

# 2-Hour serum insulin(mu U/ml)
insulin = st.slider("2-Hour Serum Insulin(mu U/ml):", 10, 1000, 350, 1, key="insulin")

# Body mass index
bmi = st.number_input("BMI:", 0.0, 100.0, 20.0, key="bmi")

# Diabetes pedigree function
dpf = st.number_input("Diabetes Pedigree Function:", 0.0, 3.0, 0.47, key="dpf")

# Age of the person
age = st.slider("Age:", 0, 100, 21, key="age")

# Saving the model input in a 2D list because the model expects a 2D array-like input
input_for_model = [
    [pregnancies, glucose, blood_pressure, skin_thickness, insulin, bmi, dpf, age]
]

We used Streamlit's number_input and slider widgets to take input from the user and stored the values in a 2D list, because the model's predict method expects a 2D array-like input with one row per sample.
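
Since the model only sees positions, not names, the values must be in the same order as the training columns. A small sketch of one way to make that order explicit follows; the helper name `to_model_input` and the dictionary keys are hypothetical, and the order is the one used in the input list above.

```python
# Order taken from the input list above; it must match the order of the training columns.
FEATURE_ORDER = [
    "pregnancies",
    "glucose",
    "blood_pressure",
    "skin_thickness",
    "insulin",
    "bmi",
    "dpf",
    "age",
]


def to_model_input(values):
    """Turn a {name: value} mapping into a single-row 2D list in training order."""
    missing = [name for name in FEATURE_ORDER if name not in values]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [[float(values[name]) for name in FEATURE_ORDER]]


# The mapping can be written in any order; the helper restores the expected one.
row = to_model_input(
    {
        "age": 21,
        "bmi": 20.0,
        "dpf": 0.47,
        "insulin": 350,
        "skin_thickness": 20,
        "blood_pressure": 150,
        "glucose": 100,
        "pregnancies": 1,
    }
)
print(row)
```

The outer list holds the samples and the inner list holds the eight feature values of one sample, which is the 2D shape the predict method expects.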

Fourth Step:

In this step, we are going to use the inputs and classify if the patient is diabetic or not.

import pickle
from time import sleep

# Loading the trained model from the pickle file
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

# Creating a button to predict if the person has diabetes or not
if st.button("Predict"):
    with st.spinner("Predicting..."):
        model_output = model.predict(input_for_model)
        sleep(1)  # Short pause so the spinner stays visible
        if model_output[0] == 0:
            st.success("You do not have Diabetes.")
        else:
            st.error("You have Diabetes.")

We have introduced Streamlit's button widget as the prediction trigger. When the user clicks the Predict button, the collected inputs are passed to the model's predict method. A conditional statement then checks the output class: if it is 0 (not diabetic), the app displays “You do not have Diabetes”, otherwise it displays “You have Diabetes”.
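
The decision logic can be kept separate from the UI so that it can be checked without starting the app. The sketch below assumes nothing about Streamlit: `AlwaysPositive` is a hypothetical stand-in that only mimics the shape of a classifier's predict method, and `prediction_message` is an illustrative helper name.

```python
class AlwaysPositive:
    """Hypothetical stand-in with the same predict() shape as a scikit-learn classifier."""

    def predict(self, rows):
        # One predicted class per input row; this stub always answers 1.
        return [1 for _ in rows]


def prediction_message(model, input_for_model):
    """Map the first predicted class to the message shown in the app."""
    label = model.predict(input_for_model)[0]
    if label == 0:
        return "You do not have Diabetes."
    return "You have Diabetes."


message = prediction_message(AlwaysPositive(), [[1, 100, 150, 20, 350, 20.0, 0.47, 21]])
print(message)
```

In the app, the returned string would be handed to st.success or st.error depending on the class.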

Complete Code of Application:

import streamlit as st
import pickle


# Cache the loaded model so it is not read from disk again on every rerun
@st.cache_resource
def load_model():
    with open("model.pkl", "rb") as f:
        return pickle.load(f)


model = load_model()


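# Function to predict if the person has diabetes or not.
# It reads the module-level input_for_model, which is defined further down, before the function is called.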
def predict_diabetes():
    output = model.predict(input_for_model)
    if output[0] == 0:
        st.success("You do not have diabetes.")
    else:
        st.error("You have diabetes.")


# Title of the app

st.title("Diabetes Prediction App")

# Header of the instructions section
st.header("Instructions for using the app:")

# Instructions for using the app
st.write(
    "1. Please enter all the requested values to predict whether you have diabetes or not."
)
st.write("2. Click on the 'Predict' button to check if you have diabetes or not.")
st.write(
    "3. If you want to know more about the app, please click on the 'About' section."
)

# UI for input features
st.subheader("Input your medical details here:")

# No of pregnancies
pregnancies = st.number_input("No of Pregnancies:", 0, 17, 1, key="pregnancies")

# Plasma glucose concentration(2 hr)
glucose = st.slider(
    "Plasma Glucose Concentration(2 hr):", 0, 300, 100, 1, key="glucose"
)
# An explicit key gives each widget a unique identity. If a key is omitted, Streamlit generates one from the widget's content,
# so keys matter most when two widgets would otherwise be identical.
blood_pressure = st.slider(
    "Diastolic Blood Pressure(mm Hg):", 20, 400, 150, 1, key="blood_pressure"
)

# Triceps skin fold thickness
skin_thickness = st.slider(
    "Triceps Skin Fold Thickness(mm):", 0, 99, 20, 1, key="skin_thickness"
)

# 2-Hour serum insulin(mu U/ml)
insulin = st.slider("2-Hour Serum Insulin(mu U/ml):", 10, 1000, 350, 1, key="insulin")

# Body mass index
bmi = st.number_input("BMI:", 0.0, 100.0, 20.0, key="bmi")

# Diabetes pedigree function
dpf = st.number_input("Diabetes Pedigree Function:", 0.0, 3.0, 0.47, key="dpf")

# Age of the person
age = st.slider("Age:", 0, 100, 21, key="age")

# Saving the model input in a 2D list because the model expects a 2D array-like input
input_for_model = [
    [pregnancies, glucose, blood_pressure, skin_thickness, insulin, bmi, dpf, age]
]

# Creating a button to predict if the person has diabetes or not

if st.button("Predict"):
    with st.spinner("Predicting..."):
        predict_diabetes()

That's it for the implementation with Streamlit. You can build different layouts; as I said, it depends on what the person making the web application has in mind for the presentation. You can use Streamlit's sidebar or make a multi-page web app. Ultimately it comes down to the developer's choice.

Limitations of Streamlit:

Here are some potential limitations of Streamlit:

Limited styling options — Streamlit has a simple styling API that can style basic UI elements. However, it lacks the flexibility of full web frameworks for advanced styling and layout.
No frontend framework integration — Streamlit does not integrate well with popular frontend frameworks like React, Vue or Angular. This means the UI is limited to what Streamlit provides.
Apps are housed on a local server — Streamlit apps run on a local development server by default, so they are not reachable at a public URL until you deploy them to a hosting platform, which is a separate step.
Security — Streamlit apps are not designed with security in mind by default. Best practices like input validation and sanitization have to be implemented manually.
Performance — Streamlit apps can be slow, especially when re-running code. This is due to Streamlit re-executing all code on every change.
Limited interactivity — While Streamlit provides some interactive widgets, the interactivity is relatively limited compared to full frontend frameworks.
Debugging — Debugging Streamlit apps can be difficult since Python errors do not show up in the browser console; they are reported in the app page and in the terminal instead. Streamlit also has its own logging system.
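
The performance point is why the complete code wraps model loading in a cached function. As a rough analogy only, the sketch below uses functools.lru_cache from the standard library as a stand-in for Streamlit's caching (it is not the same mechanism), and the counter `load_calls` is added purely to show how often the expensive work runs.

```python
from functools import lru_cache

# Counts how many times the expensive loading work is actually performed.
load_calls = 0


@lru_cache(maxsize=None)
def load_model(path):
    """Pretend to load a model; the body only runs on a cache miss."""
    global load_calls
    load_calls += 1
    return {"loaded_from": path}


# Simulate three script reruns, each asking for the model.
models = [load_model("model.pkl") for _ in range(3)]
print(load_calls)
```

Every rerun gets the same object back, and the loading work happens once instead of three times.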

In summary, while Streamlit is excellent for quickly prototyping machine learning apps, it has some limitations for production-ready web apps. It works best for simple demo apps and prototypes rather than fully-fledged web applications.

If you want an article on deploying Streamlit web applications on Hugging Face, Render or other platforms, or on deployment using Docker, let me know in the comments.

Alternatives to Streamlit:

Here are some additional frameworks similar to Streamlit for building machine-learning apps:

Gradio — Gradio is an open-source framework for building beautiful and customizable machine-learning user interfaces in Python. It allows you to expose your Python ML code to a web UI with just a few lines of code.
Solara — Solara is a Python framework that allows you to create interactive machine-learning applications with a simple API. It aims to be a lightweight alternative to Streamlit and Dash.
Dash — Dash is an open-source framework by Plotly for building analytical web applications in Python. It has a number of advantages over Streamlit like integration with React and better styling capabilities.
Flask — Flask is a lightweight web framework for Python. It can be used to build machine learning web apps but requires more manual setup compared to Streamlit.

Hope this helps give you some insight into Streamlit! Let me know if you have any other questions. Till next time, au revoir. 👋