Letting consumers of a deployed ML model retrieve predictions without access to the input data
This is part of a related set of posts describing challenges I have encountered and things I have learned on my MLOps journey on Azure. The first post is at: My MLOps Journey so far.
In this post I will describe how I deployed machine learning models using the Azure Machine Learning Studio in a scenario where the user of the predictions does not have access to the data that the machine learning (ML) models need as input.
Standard approach to deploying an ML model with the Azure ML Studio
In the Azure documentation Deploy machine learning models to Azure, it is assumed that a user will upload their own data and want predictions based on this data. Deploying a model in this scenario is well described in the documentation mentioned above and I will not explain it here.
Deploying an ML model with the Azure ML Studio in cases where the user cannot access input data
Instead of asking users to provide the input data, we ask them to specify the date and time for which they want predictions. With cURL, the request to the prediction endpoint for June 15th 2021 at 15:00 would be:
curl -X POST http://<RESTEndpointURLFromAMLStudio>/score -H "Authorization: Bearer <keyFromAMLStudio>" -H "Content-Type: application/json" -d "{'date':'15-06-2021 15:00'}"
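For callers who prefer Python over cURL, the same request could be built roughly as follows. This is a minimal sketch using only the standard library; the function name build_prediction_request is hypothetical, and the endpoint URL and key are still the placeholders from the Azure ML Studio. The body is produced with json.dumps, which uses double quotes; a JSON object with string values is also a valid Python literal, so a scoring script that parses the body with ast.literal_eval should be able to read it.

```python
import json
import urllib.request


def build_prediction_request(endpoint_url, key, date_string):
    """Build (but do not send) a POST request asking for predictions for a date.

    Hypothetical helper: endpoint_url and key correspond to the REST endpoint
    and key shown in the Azure ML Studio.
    """
    # Same payload shape as the cURL example: a single "date" field.
    body = json.dumps({"date": date_string}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and read the response, for example urllib.request.urlopen(request).read().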
In the scoring script, the 'run' function uses Python's ast module to parse the request body into a dictionary containing the input date:
import ast

def run(input_text):
    input_dict = ast.literal_eval(input_text)
This date is used to query the appropriate data, which is then passed to the ML model as input. The predictions are returned to the user in JSON format, in the same way as described in Deploy machine learning models to Azure.
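Put together, a scoring script along these lines might look like the sketch below. It is not the actual script from my deployment: fetch_input_data and PlaceholderModel are hypothetical stand-ins for the real data query and the real trained model, and the date format string is an assumption based on the example request above. How the return value is serialized can depend on the serving setup; here a JSON-serializable dictionary is returned.

```python
import ast
from datetime import datetime

# Assumed from the example request: day-month-year hour:minute.
DATE_FORMAT = "%d-%m-%Y %H:%M"


def fetch_input_data(timestamp):
    """Hypothetical placeholder for the deployment-side data query."""
    # A real script would query a database or data store for this timestamp.
    return [[timestamp.hour, timestamp.weekday()]]


class PlaceholderModel:
    """Hypothetical stand-in for the trained model loaded in init()."""

    def predict(self, rows):
        return [float(sum(row)) for row in rows]


model = None


def init():
    # Called once when the service starts; a real script loads the model here.
    global model
    model = PlaceholderModel()


def run(input_text):
    # Parse the request body and validate the requested date.
    try:
        input_dict = ast.literal_eval(input_text)
        timestamp = datetime.strptime(input_dict["date"], DATE_FORMAT)
    except (ValueError, SyntaxError, KeyError, TypeError) as error:
        return {"error": f"invalid request: {error}"}

    # The service, not the caller, retrieves the model input.
    features = fetch_input_data(timestamp)
    predictions = model.predict(features)
    return {"date": input_dict["date"], "predictions": predictions}
```

Validating the date before querying keeps malformed requests from reaching the data source, and returning an error dictionary gives the caller something readable instead of a bare server error.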
When to choose which approach
If the user is expected to have access to the data that the ML model needs as input, choosing the approach where the user sends data to the prediction service will offer the most flexibility to the user. If the user will not have access to all the necessary input data, the responsibility of retrieving this data will have to be on the deployment side.
Summary
I have described how I retrieved the necessary input data in the scoring script, so that users can obtain predictions without having access to that data themselves.
I would love to hear from you, especially if there is some of this you disagree with, would like to add to, or have a better solution for.