ML and Relevancy Entity Structure

Madhuri Dange

April 24, 2024 09:56

This is an entity-centric approach to learning and relevancy that results in easy views of all relationships between a given set of entities and an unambiguous translation from metadata instructions to engine operations. The specifics of when to train models, run algorithms, and cache data are left to the ML engine to operate behind the scenes. Some metadata may be required to expose and fine-tune those operations over time, but for now this should be adequate to move forward, create a generic code structure, and implement our existing functionality as proof of concept.

This topic consists of the following sub-topics:

ML Service Metadata
ML Model Metadata
Entity Relation Metadata
Relevancy Context Metadata
Relevancy Query
Relevancy Result Caching

ML Service Metadata

Machine Learning Services

This entity encapsulates information about an ML Service and the plugin we've written to deal with it.

Fields

Name
dotNetPlugInID - links to the dotNet Plug Ins record containing plugin information
Username for the service
API key for the service

SubType: MachineLearningModelTypes

For example, Google Predictions supports CLASSIFICATION or REGRESSION, and BigML lets you choose a single model, a bagged ensemble, or a random decision forest ensemble.

Fields

Name
Description

SubSubType: MachineLearningModelTypeInputParameters

Fields:

Type (dropdown)
Optional (boolean)
Variadic (boolean)
Description
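The service, model type, and input parameter nesting described above can be pictured as three nested record types. The following is only a sketch in Python, not the actual entity definitions: the class names, the snake_case attribute names, and every sample value (plugin ID, username, API key) are placeholders invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModelTypeInputParameter:
    """Sketch of a MachineLearningModelTypeInputParameters record."""
    param_type: str            # Type (dropdown); allowed values are not listed in this topic
    optional: bool = False     # Optional (boolean)
    variadic: bool = False     # Variadic (boolean)
    description: str = ""


@dataclass
class ModelType:
    """Sketch of a MachineLearningModelTypes record."""
    name: str
    description: str = ""
    input_parameters: List[ModelTypeInputParameter] = field(default_factory=list)


@dataclass
class MachineLearningService:
    """Sketch of a Machine Learning Services record."""
    name: str
    dotnet_plugin_id: int      # dotNetPlugInID
    username: str
    api_key: str
    model_types: List[ModelType] = field(default_factory=list)


# Placeholder values only; the model type names come from the BigML example above.
bigml = MachineLearningService(
    name="BigML",
    dotnet_plugin_id=1,
    username="example-user",
    api_key="example-key",
    model_types=[
        ModelType("single model"),
        ModelType("bagged ensemble"),
        ModelType("random decision forest ensemble"),
    ],
)
```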

ML Model Metadata

Machine Learning Models

This entity encapsulates the data we need in order to get a prediction from a service. That means it knows not only what the predictive endpoint is, but also what data makes up the model, what the feature being predicted is, and the type of model that's been (or needs to be) trained.

Fields

Name
Description
EntityRelationID - links to an Entity Relation, which specifies the Origin and Target entities
PredictedFeature - any field from the Views in the FeatureSets subtype
MachineLearningServiceID
MachineLearningModelType - any model type from the model types on the linked MachineLearningServiceID
PMML - Predictive Model Markup Language describing the model – not all services provide this, but for those that do we should grab it; it will make moving a model between services possible. This is a read-only field since changing the PMML will not actually change the model being run – it's for archiving purposes.

SubType: MachineLearningModelFeatureSets

Outside services only need one feature set, but some of our own algorithms need more than one.

Fields:

ViewID
Description

I created a view type for these views to group them for ML. They currently require manual modification in custom SQL because Aptify does not support renaming columns in views and it is programmatically convenient to standardize the column names for feature sets.

SubType: MachineLearningModelInputParameters

Validation will ensure that every required input parameter of the selected Model Type is specified here.

Fields:

MachineLearningModelTypeInputParameterID
Value
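The required-parameter validation mentioned above could look roughly like the following. This is a hedged sketch, not the validation that ships with the entity: `TypeParam` and `missing_required_parameters` are hypothetical names, and the sketch assumes a parameter is required whenever its Optional flag is false.

```python
from typing import Iterable, List, NamedTuple


class TypeParam(NamedTuple):
    """Hypothetical stand-in for a MachineLearningModelTypeInputParameters record."""
    id: int
    optional: bool


def missing_required_parameters(
    type_params: Iterable[TypeParam],
    supplied_param_ids: Iterable[int],
) -> List[int]:
    """Return IDs of required model type parameters that the model does not supply.

    supplied_param_ids holds the MachineLearningModelTypeInputParameterID values
    found on the model's MachineLearningModelInputParameters records.
    """
    supplied = set(supplied_param_ids)
    return [p.id for p in type_params if not p.optional and p.id not in supplied]
```

A model would pass validation only when the returned list is empty.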

Entity Relation Metadata

Entity Relations

This entity encapsulates an Origin and Target entity.

Fields:

Name (computed: OriginEntityName -> TargetEntityName)
OriginEntityName
TargetEntityName

The Form Template should have an extra tab, Associated Models, with a view of all Machine Learning Models that have this Entity Relation.
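As an illustration of the computed Name field and the Associated Models tab, here is a small Python sketch. The function names and the dictionary shape used for models are assumptions made for this example, and the entity names in the usage note are placeholders.

```python
from typing import Dict, Iterable, List


def entity_relation_name(origin_entity_name: str, target_entity_name: str) -> str:
    """Compute the Name field as 'OriginEntityName -> TargetEntityName'."""
    return f"{origin_entity_name} -> {target_entity_name}"


def associated_models(models: Iterable[Dict], entity_relation_id: int) -> List[Dict]:
    """Return the models whose EntityRelationID matches the given Entity Relation.

    Each model is assumed to be a dict with at least an 'EntityRelationID' key.
    """
    return [m for m in models if m["EntityRelationID"] == entity_relation_id]
```

For example, `entity_relation_name("Persons", "Products")` yields `Persons -> Products`.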

Relevancy Context Metadata

Relevancy Contexts

This entity represents the context for which we want relevancy data, for example Tab relevancy or Search relevancy.

Fields:

Name
Description

SubType: RelevancyContextRelations

A validation script or process flow will ensure that no two RelevancyContextRelations on the same Relevancy Context have the same origin and target; relevancy requests must be unambiguous!

Fields:

MLModelID
MLModelID_Name (virtual)
MLModelID_OriginEntityName (virtual)
MLModelID_TargetEntityName (virtual)
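The uniqueness rule for RelevancyContextRelations can be checked with a simple count of origin and target pairs. The sketch below is one possible way to express it in Python; `duplicate_pairs` is a hypothetical helper, and the real check would run as a validation script or process flow as described above.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def duplicate_pairs(relations: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return every (origin, target) pair that appears more than once.

    relations holds one (MLModelID_OriginEntityName, MLModelID_TargetEntityName)
    tuple per RelevancyContextRelations record on a single Relevancy Context.
    """
    counts = Counter(relations)
    return sorted(pair for pair, n in counts.items() if n > 1)
```

A Relevancy Context is unambiguous only when this returns an empty list. Note that ("A", "B") and ("B", "A") are treated as different pairs because the relation is directional.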

Relevancy Query

Relevancy Queries

This is the structure of the form data that must be sent to the Relevancy Service:

context: "contextname"

origintargetpairs: {
{"originentityname": {comma-delimited list of origin record IDs}, "targetentityname": {comma-delimited list of target record IDs}},
{etc.},
{etc.}
}
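To make the form data concrete, here is one possible way a client could assemble it in Python. This is a sketch under stated assumptions: the topic does not specify the exact serialization, so encoding origintargetpairs as a JSON array of objects is a guess, and `build_relevancy_query` is a hypothetical helper rather than part of the Relevancy Service.

```python
import json
from typing import Dict, Iterable, List, Tuple

# (origin entity name, origin record IDs, target entity name, target record IDs)
Pair = Tuple[str, Iterable[int], str, Iterable[int]]


def build_relevancy_query(context: str, pairs: List[Pair]) -> Dict[str, str]:
    """Build the two form fields described above: context and origintargetpairs."""
    encoded = []
    for origin, origin_ids, target, target_ids in pairs:
        if origin == target:
            # Assumption: one object per pair is keyed by entity name,
            # so this sketch cannot represent a self-relation.
            raise ValueError("origin and target entity names must differ")
        encoded.append({
            origin: ",".join(str(i) for i in origin_ids),
            target: ",".join(str(i) for i in target_ids),
        })
    return {"context": context, "origintargetpairs": json.dumps(encoded)}
```

The returned dictionary can then be posted as form data by whatever HTTP client the caller uses.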

Relevancy Result Caching

ML Multi-Model Caches

Each origin->target entity pair has a distinct ML Multi-Model Cache, but all models with the same origin->target entity pair use the same ML Multi-Model Cache record.

Fields

EntityRelationID - links to the origin->target entity pair
EntityRelationID_Name (virtual)

SubType: RecordRelationCaches

Fields:

OriginRecordID
TargetRecordID

SubSubType: RecordRelationCacheWeights

Fields:

MachineLearningModelID - restricted to models that have the same EntityRelation as this ML Multi-Model Cache
Weight
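The three cache levels above (cache, record relation, per-model weight) map naturally onto a nested lookup. The following Python sketch shows that shape, including the restriction that a weight's model must share the cache's Entity Relation. `MultiModelCache` and its methods are hypothetical names; how and when the ML engine fills the cache is not covered here.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class MultiModelCache:
    """Sketch of one ML Multi-Model Cache record and its subtypes."""
    entity_relation_id: int
    # (OriginRecordID, TargetRecordID) -> {MachineLearningModelID: Weight}
    records: Dict[Tuple[int, int], Dict[int, float]] = field(default_factory=dict)

    def set_weight(self, origin_id: int, target_id: int, model_id: int,
                   model_entity_relation_id: int, weight: float) -> None:
        """Store a weight, rejecting models that belong to another Entity Relation."""
        if model_entity_relation_id != self.entity_relation_id:
            raise ValueError("model does not share this cache's Entity Relation")
        self.records.setdefault((origin_id, target_id), {})[model_id] = weight

    def weights(self, origin_id: int, target_id: int) -> Dict[int, float]:
        """Return a copy of the cached per-model weights for one record pair."""
        return dict(self.records.get((origin_id, target_id), {}))
```

Because all models for one origin->target pair write into the same cache, a single lookup on a record pair returns the weights from every model at once.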

This updated UML diagram represents the first functional build of the entities (minus the web MLService plugin subtype, oops!)

UML Diagram for ML and Relevancy Entities.png

