Manage your Data and do Feature Engineering
The Files and Feature Engineering sections of the Workbench are where you import, manage, and transform raw data into features for your machine learning models. This is where data preparation and preprocessing take place.
The data used to set up your recommender will likely consist of historical information, such as which offers clients have taken up and/or the characteristics of each client at the point they were presented with offers.
Add Data
In the Data and Features section of the Workbench, you will find Manage Files. Here, you will be able to add, view, delete and download the data files available for you to build predictions with.
Upload Data
To upload a file of your own, select + Upload File. A section will open below the files list where you can enter the details of your upload. Files must be uploaded in either CSV or JSON format. Click Upload and then refresh; the file will appear in your files list.
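Because only CSV and JSON files are accepted, it can help to check a file locally before uploading it. The following is a minimal sketch and not part of the Workbench: the function name check_upload_file is hypothetical, it uses only the Python standard library, and it assumes a JSON file holds a single JSON document (adjust it if your JSON files are line-delimited).

```python
import csv
import json
from pathlib import Path


def check_upload_file(path):
    """Return 'csv' or 'json' if the file parses as that format, else raise ValueError.

    Hypothetical local pre-check; the Workbench may apply its own validation.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        with path.open(encoding="utf-8") as handle:
            json.load(handle)  # raises json.JSONDecodeError (a ValueError) on bad JSON
        return "json"
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        if not rows:
            raise ValueError(f"{path.name} is empty")
        width = len(rows[0])  # the header row defines the expected field count
        for number, row in enumerate(rows[1:], start=2):
            if row and len(row) != width:
                raise ValueError(
                    f"{path.name}: row {number} has {len(row)} fields, expected {width}"
                )
        return "csv"
    raise ValueError(f"{path.name}: only .csv and .json files can be uploaded")
```

Calling check_upload_file("offers.csv") on a well-formed file returns "csv"; a file with a ragged row or an unsupported extension raises ValueError with a message naming the problem. The file name here is only an illustration.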
Download Data
To download a file, click on the file name. A section will open up where you can view the details of your download. Click Download and select your download location.
Delete Data
To delete a file, click Delete to the right of the file name. Note that deleting a file here removes it from all projects, whether active or inactive.
Connect a Database
In the Data and Features section of the Workbench, you will find Feature Engineering. Add a database using connection strings with the Presto Data Navigator.
If you have your own database, you can connect it here. This database access option uses the Presto Worker in the platform. Add a Connection path, similar to this example:
local/master?user=admin
Then write a SQL statement to extract the data you want, similar to this example:
select * from master.bank_customer limit 2
Then click Execute.
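To make the example connection path easier to read, here is a small sketch that splits it into its parts. It assumes the path follows the common Presto convention of catalog/schema followed by query parameters. That reading is an assumption rather than something stated on this page, and split_connection_path is a hypothetical helper, not a platform function.

```python
from urllib.parse import parse_qs


def split_connection_path(connection_path):
    """Split 'catalog/schema?key=value' into its parts.

    Assumption: the first path segment is the catalog and the second is the schema.
    """
    location, _, query = connection_path.partition("?")
    catalog, _, schema = location.partition("/")
    # parse_qs returns a list of values per key; keep the first value for each key
    params = {key: values[0] for key, values in parse_qs(query).items()}
    return {"catalog": catalog, "schema": schema, "params": params}


parts = split_connection_path("local/master?user=admin")
print(parts)
# {'catalog': 'local', 'schema': 'master', 'params': {'user': 'admin'}}
```

Under this reading, the example query's master.bank_customer would refer to the bank_customer table in the master schema.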
Ingest Data
Use the Ecosystem Data Navigator to ingest data for use in your projects. Once data has been added to the Platform, it must be ingested into a specified database and collection.
Add a Database
You can either select a database and ingest your file into it, or create a new database by selecting + Add Database.
Add a unique database name related to your project. Click the Database button to the left of the input field to create it.
Once your database has been created, refresh the database list and click into it.
Ingest Collections
To ingest your file as a new collection inside your chosen database, select + Ingest Collection.
Select your file from the file list. You will see the file name appear above the Ingest: input field. Either copy this name or choose a unique one related to your project, then click Ingest to the left of the input.
Find & Export Collections
Once data has been ingested, it must be put into a format that machine learning algorithms can use. To do this, export your Collection to the ecosystem platform; you can then create a feature store from the exported data. Find your Collection in the list, refreshing the page if it has not yet appeared. Then, using the Options dropdown to the right of your Collection name, click Export.
View and edit the details of your export. Most of the settings in this tab can remain at their defaults. If you are unsure how much data to export, leave Number to Export at 0 to export all of it. Click Export.
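The "0 means everything" convention for Number to Export can be sketched as follows. This is only an illustration of the rule described above: select_for_export is a hypothetical function, and treating a non-zero value as "the first N records" is an assumption, since this page does not say which records a partial export picks.

```python
def select_for_export(records, number_to_export=0):
    """Return the records to export; 0 means 'export everything'.

    Assumption: a positive value keeps the first N records.
    """
    if number_to_export < 0:
        raise ValueError("number_to_export must be 0 or a positive integer")
    records = list(records)
    if number_to_export == 0:
        return records
    return records[:number_to_export]
```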