import os
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())
api_key = os.getenv("API_KEY")
rsc_url = os.getenv("RSC_URL")
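The setup above silently yields None if a variable is missing from the .env file, which only surfaces later as an authentication error. A minimal sketch of a fail-fast check follows; the helper name require_env and the DEMO_API_KEY variable are hypothetical and not part of dotenv or vetiver.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, raising if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set; add it to your .env file")
    return value


# Demonstration with a made-up variable set in-process (hypothetical name and value).
os.environ["DEMO_API_KEY"] = "not-a-real-key"
demo = require_env("DEMO_API_KEY")
```

In the deck's setup this could be used as api_key = require_env("API_KEY") after load_dotenv has run.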
Version, deploy, and monitor your model with vetiver in Python
The MLOps cycle
Data scientists have effective tools that they ❤️ to:
collect data
prepare, manipulate, refine data
train models
There is a lack 😩 of effective tools (especially open source) to:
put models into production
monitor model performance
trigger retraining
What is vetiver? https://vetiver.rstudio.com/
Vetiver, the oil of tranquility, is used as a stabilizing ingredient in perfumery to preserve more volatile fragrances.
The goal of vetiver is to provide fluent tooling to version, deploy, and monitor a trained model.
Build a model
Let’s build a model to predict which Scooby Doo episodes have a real monster and which have a fake monster.
import numpy as np
import pandas as pd
import pyarrow.feather as feather
np.random.seed(500)
scooby = feather.read_feather('scooby-do.arrow').astype({'monster_real': 'category'})
from sklearn import model_selection, preprocessing, svm, pipeline, compose
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    scooby.iloc[:, 1:3],
    scooby['monster_real'],
    test_size=0.2
)

scaler = preprocessing.StandardScaler().fit(X_train)
svc = svm.LinearSVC().fit(scaler.transform(X_train), y_train)

svc_pipeline = pipeline.Pipeline([('std_scaler', scaler), ('svc', svc)])
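The Scooby Doo data file is not included here, so the following is a sketch of the same scale-then-classify pattern on synthetic data. The column names mirror the deck, but the values and the labeling rule are invented for illustration. It also fits the pipeline in one step instead of fitting the scaler and classifier separately, which is an alternative design and not the approach used above.

```python
import numpy as np
import pandas as pd
from sklearn import model_selection, pipeline, preprocessing, svm

rng = np.random.default_rng(500)
n = 200

# Synthetic stand-in for the episode data (values are made up).
synthetic = pd.DataFrame({
    "year_aired": rng.integers(1970, 2000, size=n),
    "imdb": rng.uniform(5, 9, size=n),
})
# Invented labeling rule, used only so the classifier has something to learn.
synthetic["monster_real"] = np.where(synthetic["imdb"] > 7, "real", "fake")

X_train, X_test, y_train, y_test = model_selection.train_test_split(
    synthetic[["year_aired", "imdb"]],
    synthetic["monster_real"],
    test_size=0.2,
    random_state=500,
)

# Fitting the whole pipeline at once keeps scaling and classification in sync.
svc_pipeline = pipeline.Pipeline([
    ("std_scaler", preprocessing.StandardScaler()),
    ("svc", svm.LinearSVC()),
])
svc_pipeline.fit(X_train, y_train)

accuracy = svc_pipeline.score(X_test, y_test)
predictions = svc_pipeline.predict(X_test)
```

Because the scaler lives inside the pipeline, new data passed to predict is standardized with the training statistics automatically.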
Version and deploy a model
Create a deployable model object:
import vetiver
v = vetiver.VetiverModel(svc_pipeline, "isabel.zimmerman/scooby-doo", ptype_data = X_train)
Version and share the model:
import pins
# could be board_s3, board_azure, board_folder, etc
board = pins.board_rsconnect(api_key=api_key, server_url=rsc_url, allow_pickle_read=True)
#vetiver.vetiver_pin_write(board, v)
Document the model: https://vetiver.rstudio.com/learn-more/model-card.html
Deploy model as a REST API:
api = vetiver.VetiverAPI(v)
api.run()
import rsconnect
connect_server = rsconnect.api.RSConnectServer(url = rsc_url, api_key = api_key)

vetiver.deploy_rsconnect(
    connect_server = connect_server,
    board = board,
    pin_name = "isabel.zimmerman/scooby-doo"
)
Predict from a model
Predict from a remote vetiver model:
connect_endpoint = vetiver.vetiver_endpoint("https://colorado.rstudio.com/rsc/scooby/predict")

new_episodes = pd.DataFrame(
    {'year_aired': np.random.randint(1970, 2000, size=(5,)),
     'imdb': np.random.randint(5, 9, size=(5,))}
)
new_episodes
| | year_aired | imdb |
|---|---|---|
| 0 | 1998 | 5 |
| 1 | 1996 | 7 |
| 2 | 1998 | 7 |
| 3 | 1986 | 8 |
| 4 | 1986 | 6 |
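vetiver builds and sends the HTTP request for you. To make the data shape concrete, here is a sketch of how rows like these could be serialized as row-oriented JSON records. That this matches the exact body the prediction endpoint receives is an assumption, not something this deck states, and the two rows are copied from the table above.

```python
import json

import pandas as pd

# Two rows taken from the example table above.
new_episodes = pd.DataFrame({"year_aired": [1998, 1996], "imdb": [5, 7]})

# One JSON object per row; assumed (not confirmed here) to resemble the request body.
payload = new_episodes.to_json(orient="records")
records = json.loads(payload)
```

Each record carries exactly the columns the model was trained on, which is why the model object stores a prototype of the training data (ptype_data).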
response = vetiver.predict(data = new_episodes, endpoint = connect_endpoint)
response
| | prediction |
|---|---|
| 0 | real |
| 1 | fake |
| 2 | fake |
| 3 | fake |
| 4 | real |
Monitor a model
from datetime import timedelta
from sklearn import metrics
scooby_validation = feather.read_feather('scooby-validation.arrow').astype({'monster_real': 'category'})

scooby_validation["preds"] = v.model.predict(scooby_validation.drop(columns=["monster_real", "date_aired"]))

scooby_validation = scooby_validation.astype({'preds': 'category'})

metric_set = [metrics.accuracy_score]

scooby_metrics = vetiver.compute_metrics(data = scooby_validation,
                    date_var = "date_aired",
                    period = timedelta(weeks = 52),
                    metric_set = metric_set,
                    truth = "monster_real",
                    estimate = "preds")

m = vetiver.plot_metrics(scooby_metrics)
m.show()
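Conceptually, monitoring over time means splitting the validation data into time windows and scoring each window separately. The sketch below shows that idea with plain pandas and scikit-learn on made-up rows. It is not vetiver's implementation, and it groups by calendar year for simplicity where the call above uses 52-week periods.

```python
import pandas as pd
from sklearn import metrics

# Made-up validation rows with true labels and model predictions.
validation = pd.DataFrame({
    "date_aired": pd.to_datetime(
        ["1970-01-10", "1970-06-01", "1971-02-01", "1971-09-15", "1971-12-01"]
    ),
    "monster_real": ["real", "fake", "fake", "real", "fake"],
    "preds": ["real", "real", "fake", "real", "real"],
})

# Score each calendar year separately (a simplification of fixed 52-week windows).
rows = []
for year, chunk in validation.groupby(validation["date_aired"].dt.year):
    rows.append({
        "year": int(year),
        "n": len(chunk),
        "accuracy": metrics.accuracy_score(chunk["monster_real"], chunk["preds"]),
    })

yearly_accuracy = pd.DataFrame(rows)
```

A drop in accuracy from one window to the next is the kind of signal that would prompt a closer look at the model or a retraining run.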