This repository has been archived by the owner on Dec 5, 2024. It is now read-only.
Davide Berdin edited this page Nov 18, 2019 · 4 revisions

The model is the entity that holds the data we want to serve to clients. It is composed of several parts, and it must be attached to a container before it can be used. If you haven't read about containers, check this link.

Content

Each model has some metadata that is added at creation time. The metadata looks like the following:

  • Name. Name of the model
  • SignalOrder. Each key in the data is called a signal. To validate the definition of each key, this list names the parameters that compose the key, so its length is the number of parameters expected. Check Signal Ordering for more information
  • Concatenator. If the key is composed of multiple variables, the concatenator character helps validate the key. Check Concatenator for more information

In Redis, the model entity is created as follows:

  • Hash = Name. The name of the model must be unique and it will compose the Hash in Redis
  • Key = signalID. The key is the signalID of the dataset you are uploading to Phoenix
  • Field = [ {} ]. The Field is a list of objects that will be returned to the client. It is serialized as a JSON string for simplicity
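The mapping above can be sketched in a few lines of Python. This is only an illustration, not Phoenix's actual code: the function name `to_redis_entry` and the model name `my-model` are hypothetical, and it assumes that the `recommended` list of a dataset line is the list of objects stored as the Field.

```python
import json


def to_redis_entry(model_name, record):
    """Map one dataset record to a (hash, key, field) triple.

    Hypothetical helper: it assumes the "recommended" list is the
    value stored as the Field.
    """
    hash_name = model_name                      # Hash = Name of the model
    key = record["signalId"]                    # Key = signalID
    field = json.dumps(record["recommended"])   # Field = JSON string
    return hash_name, key, field


line = '{"signalId":"1","recommended":[{"item":"a","score":"0.5"}]}'
entry = to_redis_entry("my-model", json.loads(line))
```

Here `entry` holds the model name, the signalId, and the serialized list, in that order.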

With this system we guarantee that there won't be clashes in the dataset. Note that the APIs keep track of the names of the models that have been created, to avoid duplication. As mentioned before, the name of the model must be unique. The APIs will throw a validation error in case of duplication.

Signal Ordering

As mentioned previously, we upload data in a Key/Value fashion. However, the key can be composed of multiple parameters. In fact, it is common to have keys that are a combination of userId and articleId, or other forms. To add a small layer of validation, the APIs ask the client to supply information about how the keys in the dataset are composed. To make this clearer, let's work through two examples.

Simple Key

The dataset looks like the following:

{"signalId":"1","recommended":[{"item":"a","score":"0.5"},{"item":"b","score":"0.4"}]}
{"signalId":"2","recommended":[{"item":"c","score":"0.5"},{"item":"d","score":"0.4"}]}
{"signalId":"3","recommended":[{"item":"a","score":"0.5"},{"item":"c","score":"0.4"}]}

When you create the model in Phoenix the values for the signal ordering will look like the following:

{
    ...
    "signalOrder":["userId"],
    "concatenator":""
}

In this way, when the system uploads the dataset, it can validate that the "signalId" is actually formed of one parameter only. Of course, it cannot check whether the actual "signalId" values exist, but at least we are enforcing the correct form.

Composed Key

Now, let's assume that the dataset looks like the following:

{"signalId":"1_aaa_34","recommended":[{"item":"a","score":"0.5"},{"item":"b","score":"0.4"}]}
{"signalId":"2_aaa_34","recommended":[{"item":"c","score":"0.5"},{"item":"d","score":"0.4"}]}
{"signalId":"3_aaa_34","recommended":[{"item":"a","score":"0.5"},{"item":"c","score":"0.4"}]}

When you create the model in Phoenix the values for the signal ordering will look like the following:

{
    ...
    "signalOrder":["userId", "articleId", "paramX"],
    "concatenator":"_"
}

Now, when you upload the data, the system performs the following check: it counts the number of signalOrder elements, splits the signalId by the concatenator value, and verifies that the two lengths match. If they do not match, it stores the line number where the error was found and returns it to the client when the upload status is checked (or immediately, when the data is uploaded directly).
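As a rough illustration of the check described above, here is a minimal Python sketch. It is not Phoenix's implementation: the function name `validate_dataset` is hypothetical, line numbers are assumed to start at 1, and an empty concatenator is assumed to mean that the key has a single part (as in the Simple Key example).

```python
import json


def validate_dataset(lines, signal_order, concatenator):
    """Return the line numbers whose signalId has the wrong number of parts.

    Hypothetical helper; line numbers are assumed to be 1-based.
    """
    expected = len(signal_order)
    bad_lines = []
    for number, raw in enumerate(lines, start=1):
        signal_id = json.loads(raw)["signalId"]
        # An empty concatenator is treated as "single-part key";
        # str.split("") would raise ValueError, so it is handled here.
        parts = signal_id.split(concatenator) if concatenator else [signal_id]
        if len(parts) != expected:
            bad_lines.append(number)
    return bad_lines


dataset = [
    '{"signalId":"1_aaa_34","recommended":[{"item":"a","score":"0.5"}]}',
    '{"signalId":"2_aaa","recommended":[{"item":"c","score":"0.5"}]}',
]
errors = validate_dataset(dataset, ["userId", "articleId", "paramX"], "_")
```

In this sketch the second line has only two parts instead of three, so its line number is the only one reported. As the text notes, the check only compares lengths; it cannot tell whether the individual values exist.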

Concatenator

The concatenator is a character used to connect multiple signals together. To prevent clients from specifying an arbitrary character as the concatenator, you can choose only from the following list: '|','#','_','-'. We believe that other characters would either trigger false-positive validation errors or fail to trigger validation errors at all.
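A check against the allowed list could look like the sketch below. The names `ALLOWED_CONCATENATORS` and `is_valid_concatenator` are hypothetical, and accepting an empty concatenator for single-signal keys is an assumption based on the Simple Key example, not documented behaviour.

```python
# Allowed characters, taken from the list above.
ALLOWED_CONCATENATORS = {"|", "#", "_", "-"}


def is_valid_concatenator(concatenator, signal_order):
    """Return True if the concatenator is acceptable for this model.

    Hypothetical helper. Assumption: a key made of a single signal
    may use an empty concatenator, as in the Simple Key example.
    """
    if len(signal_order) == 1 and concatenator == "":
        return True
    return concatenator in ALLOWED_CONCATENATORS
```

For example, `is_valid_concatenator("_", ["userId", "articleId"])` is accepted, while a character outside the list, such as `"."`, is rejected.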
