A lack of expertise in creating and managing ML pipelines makes it harder for organizations to become AI-driven. To address this challenge, Intel® Tiber™ AI Studio launched AI Blueprints, a growing collection of customizable, pre-built machine learning pipelines that are ready to use in any application. With a blueprint, you don’t have to do the heavy lifting of training and testing your model or connecting the various machine learning components yourself. This blog post will go over:
- The types of recommendation systems and how they work.
- How to create a recommendation engine using a Blueprint without writing model code.
Types of Recommendation Systems
These are just some of the companies using recommendation systems. A recommender system is a subclass of information filtering system that seeks to predict the “rating” or “preference” a user would give to an item.
Broadly speaking, there are two different types of filtering: collaborative and content-based.
- Collaborative filtering: This focuses on collecting and analyzing data on user behavior, activities, and preferences, to predict what a person will like, based on their similarity to other users.
- Content-based filtering: This works on the principle that if you like a particular item, you will also like similar items. To make recommendations, algorithms use a profile of the customer’s preferences and a description of an item (genre, product type, color, word length) to work out the similarity of items using measures such as cosine similarity and Euclidean distance.
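To make the two similarity measures concrete, here is a minimal sketch in plain Python. The item features, user names, and ratings are made up for illustration, and treating an unrated item as 0 is a simplification rather than what a production recommender would do. The same helper serves both styles of filtering: comparing item feature vectors is the content-based view, and comparing user rating vectors is the collaborative view.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (0.0 = identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def most_similar(name, vectors):
    """Return the key whose vector has the highest cosine similarity to vectors[name]."""
    others = [other for other in vectors if other != name]
    return max(others, key=lambda other: cosine_similarity(vectors[name], vectors[other]))

# Content-based view: hypothetical item features [serves_pizza, serves_sushi, has_patio].
items = {
    "pizzeria_a": [1, 0, 1],
    "pizzeria_b": [1, 0, 0],
    "sushi_bar": [0, 1, 0],
}

# Collaborative view: hypothetical user ratings of the same three places (0 = unrated).
users = {
    "alice": [2, 2, 0],
    "bob": [2, 1, 0],
    "carol": [0, 0, 2],
}

print(most_similar("pizzeria_a", items))  # item to recommend next to pizzeria_a
print(most_similar("alice", users))       # user whose tastes are closest to alice's
print(euclidean_distance(items["pizzeria_a"], items["pizzeria_b"]))
```

Cosine similarity ignores vector length, so it is often preferred when users rate on different scales, while Euclidean distance is sensitive to magnitude.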
Below is the basic recommendation system process.
The first step is to get our data from a data source, whether that is a database, a storage bucket like S3, or something as simple as a CSV file. The second step is to fit models to the data and select the best one. The final step is to deploy the model, either in real time (for example, as an API) or for batch requests.
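The three steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the column names and rows are invented, the "model" is a simple per-place average rather than anything a blueprint trains, model selection is left out, and "deployment" is reduced to a function you could call per request or over a batch.

```python
import csv
import io
from collections import defaultdict

# Step 1: get the data. A small in-memory CSV stands in for a database,
# an S3 bucket, or a file; the column names and rows are made up.
RAW_CSV = """userID,placeID,rating
U1,P1,2
U1,P2,1
U2,P1,2
U2,P3,0
U3,P2,1
"""

def load_ratings(text):
    """Parse CSV text into (user, place, rating) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["userID"], row["placeID"], float(row["rating"])) for row in reader]

# Step 2: fit a model. Here the "model" is just the mean rating per place.
def fit_item_means(ratings):
    by_place = defaultdict(list)
    for _user, place, rating in ratings:
        by_place[place].append(rating)
    return {place: sum(vals) / len(vals) for place, vals in by_place.items()}

# Step 3: serve predictions, one request at a time or in batch.
def predict(model, place, default=0.0):
    """Predicted rating for a place; falls back to a default for unseen places."""
    return model.get(place, default)

def predict_batch(model, places):
    return [predict(model, place) for place in places]

ratings = load_ratings(RAW_CSV)
model = fit_item_means(ratings)
print(predict(model, "P1"))
print(predict_batch(model, ["P1", "P2", "P3"]))
```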
How to Create a Recommendation System with AI Blueprints
This is the pipeline we will be utilizing in this tutorial.
The image above shows a flow. In cnvrg, a flow is a production-ready machine learning pipeline that lets you build complex DAG (directed acyclic graph) pipelines and run your ML components by dragging and dropping.
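If the DAG idea is new to you, the sketch below shows what "directed acyclic graph" buys you: once each task declares which tasks it depends on, a valid execution order can be derived automatically. It uses Python's standard-library graphlib module, not cnvrg's own API, and the task names are hypothetical placeholders.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is a task; its value is the set of tasks that must finish first.
# Task names are illustrative placeholders, not identifiers used by cnvrg.
flow = {
    "data_validation": {"dataset"},
    "train_test_split": {"data_validation"},
    "train_model_a": {"train_test_split"},
    "train_model_b": {"train_test_split"},
    "compare": {"train_model_a", "train_model_b"},
    "inference": {"compare"},
}

# static_order() yields every task after all of its dependencies.
order = list(TopologicalSorter(flow).static_order())
print(order)
```

Because the two training tasks do not depend on each other, a scheduler is free to run them in parallel, which is one reason pipelines are modeled as DAGs.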
Before building a recommendation engine, it is worth taking a step back to look at the types of AI Blueprints.
Types of blueprints
Generally speaking, there are three types of blueprints.
- Inference: These are pre-trained and ready to be used immediately. All you need to do is deploy the blueprint to your own infrastructure with one click. This is good for use cases like object detection, text detection, and sentiment analysis.
- Training: These either fine-tune or train a model, and they perform best on your specific data, so you need to provide your own dataset. During training, the blueprint tries to find the model that performs best on your data and makes it easy to deploy at the end of the process.
- Components: These allow you to mix and match connectors with models and deployment options and create your own story.
Selecting the Blueprint
To follow along with this part of the tutorial, you will need to sign up for or log in to Intel® Tiber™ AI Studio Metacloud. Once you’ve created a user name, you should see a similar screen when you log in for the first time.
The next step is to select Blueprints.
Next, type in recommender.
Click on Recommenders Train and you will see a screen similar to the one in the image below.
Next, click on Use Blueprint and you will see a flow diagram.
This training blueprint consists of the following components:
- S3 Connector: This is used to get data from S3. In this tutorial, we will not be using data from S3 but instead we will replace this with a data task. A data task represents your datasets that are hosted and accessible via the cnvrg platform. We are using a data task to show how to upload your own dataset.
- Data validation: This makes sure the data is formatted correctly. It also handles null values.
- Train Test Split: This splits the data into training and test sets. This is used to help assess how well our algorithm performs.
- Multiple Model Training: We will be training 5 models. It is easy to adjust this to more or fewer models.
- Compare: This compares the trained models against a metric common to all of the algorithms.
- Inference: This takes the selected model and serves it as an API so you can get predictions from it.
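To give a feel for what the validation, split, training, and compare components do, here is a rough sketch in plain Python. It is not the blueprint's code: the rows are made up, dropping rows with nulls is just one possible policy, the three "models" are simple averaging baselines rather than the five algorithms the blueprint trains, and RMSE is an assumed choice of common metric.

```python
import math
import random
from collections import defaultdict

# Made-up (user, place, rating) rows; None marks a missing value.
rows = [
    ("U1", "P1", 2), ("U1", "P2", 1), ("U1", "P3", 0),
    ("U2", "P1", 2), ("U2", "P2", None), ("U2", "P3", 1),
    ("U3", "P1", 1), ("U3", "P2", 2), ("U3", "P3", 0),
    ("U4", "P1", 2), ("U4", "P2", 1), (None, "P3", 1),
]

def validate(rows):
    """Data validation: keep only rows with no missing values (one possible policy)."""
    return [row for row in rows if all(value is not None for value in row)]

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle reproducibly, then hold out a fraction of the rows for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_global_mean(train):
    """Baseline 1: always predict the overall mean rating."""
    mean = sum(rating for _, _, rating in train) / len(train)
    return lambda user, place: mean

def fit_group_mean(train, key_index):
    """Baselines 2 and 3: mean rating per user (key_index=0) or per place (key_index=1)."""
    global_mean = sum(rating for _, _, rating in train) / len(train)
    groups = defaultdict(list)
    for row in train:
        groups[row[key_index]].append(row[2])
    means = {key: sum(vals) / len(vals) for key, vals in groups.items()}
    # Unseen users or places fall back to the global mean.
    return lambda user, place: means.get((user, place)[key_index], global_mean)

def rmse(model, test):
    """Root mean squared error of a model's predictions on held-out rows."""
    return math.sqrt(sum((model(u, p) - r) ** 2 for u, p, r in test) / len(test))

clean = validate(rows)
train, test = train_test_split(clean)
models = {
    "global_mean": fit_global_mean(train),
    "user_mean": fit_group_mean(train, 0),
    "place_mean": fit_group_mean(train, 1),
}
# Compare: score every model on the same test set and keep the lowest error.
scores = {name: rmse(model, test) for name, model in models.items()}
best = min(scores, key=scores.get)
print(scores)
print("best model:", best)
```

Scoring every model on the same held-out rows is what makes the comparison fair; the winning model is the one you would then hand to the inference step.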
Adding a Dataset
While there is a default dataset for this particular blueprint, this tutorial will use the Restaurant & consumer Data Set from UCI, which I have cleaned. For convenience, you can download the CSV file here.
On cnvrg, open DATASETS in a new tab.
Click on Create New Dataset, type in the name of your dataset, and click on Create. In the image below, the dataset is named restaurant.
Next, upload the dataset. In the image below, I am uploading rating_final.csv.
Return to the flow tab. The next step is to select New Task then Data Task.
Select the dataset you want to use. In the image below, restaurant-2 is selected. Click on Save Changes. Note that the dataset you select will affect your path in subsequent steps.
You should have a flow that looks similar to the image below.
Remove the connection between S3 and Data Validation. Click on the S3 connector and click on Delete Card. Then, connect your dataset to Data Validation.
Click on your Data Validation (the dataset in this tutorial is restaurant-2). Next, type in your path (mine was /data/restaurant-2/rating_final.csv) and click on Save Changes.
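Judging only from the example path above, the path appears to follow the pattern /data/<dataset name>/<file name>. That pattern is an inference from this tutorial, not documented behavior, so check it against your own flow. Under that assumption, a small hypothetical helper for building the path looks like this:

```python
from pathlib import PurePosixPath

def dataset_file_path(dataset_name, file_name, root="/data"):
    """Join root, dataset name, and file name into a POSIX-style path string.

    The /data root is assumed from the example path in this tutorial.
    """
    return str(PurePosixPath(root) / dataset_name / file_name)

print(dataset_file_path("restaurant-2", "rating_final.csv"))
# prints /data/restaurant-2/rating_final.csv
```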
Click on Run.
Depending on your Metacloud account, the flow might take a couple of minutes to run. If you see success, you have successfully created a recommendation system! The next tutorial will go over how to deploy these recommendations.
Conclusion
There is an ever-expanding marketplace of blueprints and components for a variety of AI applications.
This post went over recommendation systems and how you can create them using AI Blueprints with little to no code. Keep in mind that AI Blueprints also cover other use cases, such as Twitter sentiment analysis, text summarization, computer vision, and much more. If you would like to share what you have done with AI Blueprints, you can post about it in the Intel® Tiber™ AI Studio community or tag @cnvrg_io on Twitter.
If you would like to watch a full video workshop of how to build a recommender system with AI Blueprints, you can find a hands-on video tutorial here: https://cnvrg.io/webinars-and-workshops/recommender-system-workshop/