Model
The model is the entity that keeps the data we want to serve to the clients. It is composed of different parts and it needs to be attached to a container before it can be used. If you haven't read about the containers, check this link.
Each model has some metadata that is added at creation time. The metadata looks like the following:

- `Name`. Name of the model.
- `SignalOrder`. Each key in the data is called a signal. To enforce validation on the definition of each key, this list contains the parameters that compose the key, in order. Check Signal Ordering for more information.
- `Concatenator`. If the key is composed of multiple variables, the concatenator character helps validate the key. Check Concatenator for more information.
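As a rough illustration, the metadata could be represented with a struct like the one below. This is a minimal sketch, not the actual Phoenix type; the struct name and JSON tags are assumptions.

```go
package model

// Model is an illustrative sketch of the metadata attached to a model
// at creation time (names and tags are assumptions, not the real type).
type Model struct {
	Name         string   `json:"name"`         // unique name of the model
	SignalOrder  []string `json:"signalOrder"`  // parameters that compose each signal, in order
	Concatenator string   `json:"concatenator"` // character joining the parameters, e.g. "_"
}
```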
In Redis the model entity is created as follows:

- `Hash = Name`. The name of the model must be unique and it composes the Hash in Redis.
- `Key = signalID`. The key is the signalID of the dataset you are uploading to Phoenix.
- `Field = [ {} ]`. The Field is the list of objects that will be returned to the client. It is serialized as a JSON string for simplicity.
With this system we guarantee that there won't be clashes in the dataset. Note that the APIs keep track of the model names that have been created to avoid duplication. As mentioned before, the name of the model must be unique; the APIs will throw a validation error in case of duplication.
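To make the layout concrete, here is a hedged sketch of how a single dataset line could be written to and read from Redis with the go-redis client, following the Hash/Key/Field layout above. The model name, signal id, and values are made up for illustration.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"

	"github.com/redis/go-redis/v9"
)

// Recommendation mirrors one item of the "recommended" list in the dataset.
type Recommendation struct {
	Item  string `json:"item"`
	Score string `json:"score"`
}

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	modelName := "articles"          // Hash  = model name (must be unique)
	signalID := "1"                  // Key   = signalId of the dataset line
	recommended := []Recommendation{ // Field = list of objects, serialized as JSON
		{Item: "a", Score: "0.5"},
		{Item: "b", Score: "0.4"},
	}

	payload, err := json.Marshal(recommended)
	if err != nil {
		panic(err)
	}

	// HSET <modelName> <signalID> <JSON payload>
	if err := rdb.HSet(ctx, modelName, signalID, payload).Err(); err != nil {
		panic(err)
	}

	// Reading it back returns the JSON string that is served to the client.
	val, err := rdb.HGet(ctx, modelName, signalID).Result()
	if err != nil {
		panic(err)
	}
	fmt.Println(val)
}
```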
As mentioned previously, we upload data in a Key/Value fashion. However, the key can be composed of multiple parameters. In fact, it is common to have keys that are a combination of userId and articleId, or other forms. To add a little layer of validation, the APIs ask the client to supply this information about the dataset. To make the explanation above simpler, let's build two examples.
The dataset looks like the following:

```
{"signalId":"1","recommended":[{"item":"a","score":"0.5"},{"item":"b","score":"0.4"}]}
{"signalId":"2","recommended":[{"item":"c","score":"0.5"},{"item":"d","score":"0.4"}]}
{"signalId":"3","recommended":[{"item":"a","score":"0.5"},{"item":"c","score":"0.4"}]}
```
When you create the model in Phoenix, the values for the signal ordering will look like the following:

```json
{
  ...
  "signalOrder":["userId"],
  "concatenator":""
}
```
In this way, when the system uploads the dataset, it can validate that the "signalId" is actually formed by one parameter only. Of course, it cannot check whether the actual "signalId" values exist, but at least we are enforcing the correct form.
Now, let's assume that the dataset looks like the following:

```
{"signalId":"1_aaa_34","recommended":[{"item":"a","score":"0.5"},{"item":"b","score":"0.4"}]}
{"signalId":"2_aaa_34","recommended":[{"item":"c","score":"0.5"},{"item":"d","score":"0.4"}]}
{"signalId":"3_aaa_34","recommended":[{"item":"a","score":"0.5"},{"item":"c","score":"0.4"}]}
```
When you create the model in Phoenix, the values for the signal ordering will look like the following:

```json
{
  ...
  "signalOrder":["userId", "articleId", "paramX"],
  "concatenator":"_"
}
```
Now, when you are uploading the data, the system performs the following check: it counts the number of signalOrder elements, it splits the signalId by the concatenator value, and it matches the two lengths. If they don't match, it stores the line number where the error was found and returns it to the client when checking the upload status (or immediately, when uploading the data directly).
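A minimal sketch of that check, using the model metadata from the examples above. The function and the way errors are collected are illustrative assumptions, not the actual Phoenix implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// validSignalID reports whether signalID is composed of exactly as many
// parts as the model declares in signalOrder, using the model's concatenator.
// Illustrative sketch, not the real Phoenix code.
func validSignalID(signalID string, signalOrder []string, concatenator string) bool {
	if concatenator == "" {
		// Single-parameter keys use an empty concatenator and are not split.
		return len(signalOrder) == 1 && signalID != ""
	}
	return len(strings.Split(signalID, concatenator)) == len(signalOrder)
}

func main() {
	signalOrder := []string{"userId", "articleId", "paramX"}

	lines := []string{"1_aaa_34", "2_aaa_34", "3_aaa"} // the last one is malformed
	var badLines []int
	for i, id := range lines {
		if !validSignalID(id, signalOrder, "_") {
			// Record the (1-based) line number to report back to the client.
			badLines = append(badLines, i+1)
		}
	}
	fmt.Println("invalid lines:", badLines) // invalid lines: [3]
}
```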
The concatenator is a character used to connect multiple signals together. To avoid having the client specify an arbitrary character as concatenator, you can only choose from the following list: '|', '#', '_', '-'. We believe that other characters would either trigger false-positive validation errors or none at all.
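For completeness, the allow-list check at model creation time could look like the sketch below. The function name is an assumption; an empty concatenator is the separate single-parameter case shown earlier and is not covered here.

```go
package main

import "fmt"

// allowedConcatenators mirrors the list documented above.
var allowedConcatenators = map[string]bool{"|": true, "#": true, "_": true, "-": true}

// isValidConcatenator reports whether c is one of the accepted concatenator
// characters. Illustrative sketch of the validation described above.
func isValidConcatenator(c string) bool {
	return allowedConcatenators[c]
}

func main() {
	fmt.Println(isValidConcatenator("_")) // true
	fmt.Println(isValidConcatenator("+")) // false
}
```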