◾Smartico Data Warеhouse
Access to Smartico DWH is a paid service that should be requested separately.
Please contact your Customer Success Manager to get more details.
Intro
Smartico provides access to the Data Warehouse (DWH) with row data related to the system operation. The DWH is based on Google BigQuery and exposes facts and dimensions tables.
There are mainly three main domains of data exposed through the DWH:
CRM - information about campaigns, communication, conversions, automation rules, etc
Gamification - missions, levels, tournaments, mini-games, points, store purchases
Affiliation - registration facts, financial transactions, payments, adjustments, affiliates information
Some of these domains have shared information, e.g. bonuses - can be given from the CRM and from Gamification. In the Affiliation domain you will also find CRM data, so you can analyse all communication that is addressed to the affiliates.
Smartico DWH shouldn't be used to run analytical queries directly, as limitations apply to how much data can be retrieved daily.
Instead, you should do a batch load of changes to your analytical system and run reports on your side.
All fact tables are partitioned by date for your convenience so you can efficiently load only the data delta for the completed day. We recommend running the import of the previous day a few hours after midnight in UTC to ensure that last-day data is fully delivered to the fact tables.
Don't apply functions on the partition columns when filtering with 'WHERE' clause. The correct example of taking data for the last day is - 'WHERE fact_date >= TIMESTAMP_TRUNC(TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY), DAY)'
Pay attention to the fact that the data set 'dwh_ext_xxx' is given as an example. The dataset name for your particular setup is the same as the name of the user to whom you will get to access DWH.
Records in the fact tables are immutable (except cases when stated explicitly), which means they are not changing after they are added.
Fact tables are updated in real-time, and dimensional tables are updated once per hour.
Segments export
You can query DWH to get a near-real-time list of users that belong to specific segments.
To make a segment available for exporting, it first needs to be marked as exportable through the DWH in Smartico BackOffice, as shown on the screen below.
After that, segment content can be queried using the following SQL, where XXX needs to be replaced with the ID of your label and YYY with the ID of the segment.
Example of query results
Query for user profile details
You can query DWH to get detailed information about users stored on the Smartico side.
Note that only a small subset of user profile properties are exposed through the DWH, and only properties that are shared through all the clients. If you have a need to expose specific properly, please contact Smartico support.
Currently exposed properties:
Property | Type | Meaning |
---|---|---|
user_id | INT64 | ID of the user in Smartico system |
user_ext_id | STRING | ID of the user in the integrated platform |
core_registration_date | TIMESTAMP | Date/time of registration |
core_user_last_time_online | TIMESTAMP | Date/time when user was last time online |
ach_level_current_id | INT64 | ID of the current gamification level |
ach_points_ever | INT64 | Amount of points user collected ever |
ach_points_balance | INT64 | Current points balance of user |
user_country | STRING | Country |
core_user_language | STRING | Language, ISO code |
core_user_last_device_type | STRING | Last used device type (Desktop, Mobile, Native, Wrapper) |
core_wallet_currency | STRING | Currency code |
core_tags | STRING[] | Array of user markers |
core_public_tags | STRING[] | Array of public user marker |
core_external_markers | STRING[] | Array of external user markers |
Transactions deduplication
Note that in rare cases some fact tables may contain duplicated records. This may happen in very exceptional cases and doesn't have impact on operation part of system, so for example if there is duplicated record for the "mail sent" fact, the mail wasn't sent twice, just a fact of sending is duplicated.
You can do a deduplication using the field marked as "PK" in corresponding table.
Dimensional tables
dwh_ext_xxx.dm_activity_type
List of possible "activities" that can be used to build a Flow for the scheduled & real-time campaigns. There are activities like "Send SMS", "Send Mail", "Give Bonus", "Stop campaign" etc.
There are essential activities that exist in any campaign, like
Journey start (1) - represents the fact that the campaign started
Journey stopped (2) - represents the stop of the campaign
Journey Converted (3) - represents the fact that the campaign was converted
Table structure:
Column | Type | Note |
---|---|---|
activity_type_id | int64 | |
activity_name | string |
dwh_ext_xxx.dm_audience
The list of campaigns. Includes both real-time and scheduled.
Table structure:
Column | Type | Note |
---|---|---|
audience_id | int64 | |
audience_name | string | |
entry_mode_id | int64 | 0 - ONCE_IN_A_LIFE 1 - ONCE_IN_AN_OPEN_JOURNEY 2 - EVERY_TIME_CONDITION_MET 3 - STOP_AND_START |
audience_status_id | int64 | 1 - Draft 2 - Active 3 - Paused 4 - Disabled 5 - Archived 6 - Executed (for Scheduled only) |
audience_exec_type_id | int64 | 1 - Realtime campaign of Marketing category 2 - Realtime campaign of Operational category 3 - Scheduled campaign of Marketing category 4 - Scheduled campaign of Operational category |
category_id | int64 | 1 - Marketing campaign 2 - Operational campaign |
dwh_ext_xxx.dm_resource
Communication resources (templates) of all types, e.g. SMS, popups, emails, etc.
Note that the list includes both - main resources and variations of the main resources
Table structure:
Column | Type | Note |
---|---|---|
resource_id | int64 | |
resource_name | string | |
resource_subject | string | Subject line for Mail, Push, Inbox For SMS will contain a full body of SMS For Popups and IVR will be null, as these resource types don't have a concept close to the 'subject'. Note that string can contain tags that are replaced during the actual building of the resource for specific user |
resource_type_id | int64 | 1 - Mail 2 - Popup 3 - SMS 4 - Push 5 - Inbox 9 - IVR |
dwh_ext_xxx.dm_providers_sms
List of all sms sent with their label, id, provider name and type, and create and update times.
Column | Type | Note |
---|---|---|
create_date | timestamp | Time when this communication provider is defined |
update_date | timestamp | Time when this communication provider is last updated |
id | int64 | ID of the communication provider |
provider_name | string | Name of the exact provider |
provider_type | string | Name of the overall service provider |
dwh_ext_xxx.dm_providers_mail
List of all emails sent with their label, id, provider name and type, and create and update times.
Column | Type | Note |
---|---|---|
create_date | timestamp | Time when this communication provider is defined |
update_date | timestamp | Time when this communication provider is last updated |
id | int64 | ID of the communication provider |
provider_name | string | Name of the exact provider |
provider_type | string | Name of the overall service provider |
dwh_ext_xxx.dm_ach
The table contains information about missions and badges.
Column | Type | Note |
---|---|---|
ach_id | int64 | ID of mission/badge |
internal_name | string | Name of mission/badge |
ach_type_id | int64 | Type of achievement, 1 - Mission, 2 - Badge |
requires_optin | bool | Indicator if mission requires explicit opt-in |
limit_in_time_ms | int64 | Time in milliseconds for the time limited missions |
dwh_ext_xxx.dm_saw_template
.The table contains information about mini-game templates
Column | Type | Note |
---|---|---|
saw_template_id | int64 | ID of the game template |
template_name | string | Name of template |
saw_game_type_id | int64 | Type of the game: 1 - Spin The Wheel 2 - Scratch & Catch 3 - Match X 4 - Gift box |
saw_buyin_type_id | int64 | Type of the buy-in: 1 - Free of charge 2 - Gamification points 3 - Spin attempts |
is_visitor_mode | bool | If true, indicates that game is targeting visitors, otherwise only registered users |
dwh_ext_xxx.dm_shop_item
The table contains information about missions and badges.
Column | Type | Note |
---|---|---|
item_id | int64 | ID of shop item |
item_name | string | Name of item |
item_type_id | int64 | Type of item: 1 - Bonus 2 - Tangible 3 - Mini-game attempt(s) 4 - Level upgrade |
item_status_id | bool | Status: 1 - Draft 2 - Active 3 - Archived |
dwh_ext_xxx.dm_saw_prize
The table contains information about missions and badges.
Column | Type | Note |
---|---|---|
saw_prize_id | int64 | ID of prize |
saw_prize_name | string | Name of prize |
saw_template_id | int64 | ID of the mini-game template to which the prize belongs, see dm_saw_template |
prize_value | bool | Value of prize in case it is a type of "gamification points". |
prize_type_id | int64 | Type of the prize: 1 - No Prize 2 - Points 3 - Bonus 4 - Tangible 5 - Spin in the mini-game 6 - Jackpot 7 - Level upgrade |
is_surcharge | bool | If the prize is a surcharge (will be given when no other prizes are left) |
dwh_ext_xxx.dm_bonus_template
List of all bonus templates with their label, id, type and name.
Column | Type | Note |
---|---|---|
label_id | int64 | The id of the label to which this bonus templated belongs |
label_bonus_template_id | int64 | The id of the bonus template for the specified label |
product_bonus_type_id | int64 | Id that indicates the bonus type |
public_name | string | Bonus template name |
dwh_ext_xxx.dm_tournament
The table contains information about tournaments and tournament instances.
id | int64 | Primary identifier for the tournament. |
name | string | Name of the tournament. |
logic_version_id | int64 | ID representing the logic version of the tournament.
|
registration_type_id | int64 | ID for the registration type of the tournament.
|
duration_ms | int64 | Duration of the tournament in milliseconds. |
players_min_count | int64 | Minimum number of players required for the tournament. |
players_max_count | int64 | Maximum number of players allowed, optional. |
tournament_type_id | int64 | ID represents the type of tournament.
|
tournament_status_id | int64 | ID indicating the status of the tournament.
|
last_tournament_create_date | timestamp | Timestamp of when the last tournament was created. |
show_before_start_ms | int64 | Time in milliseconds to show the tournament before start. |
allow_late_registration | bool | Indicates if late registration is allowed. |
dwh_ext_xxx.dm_ach_points_change_source
Lookup table with possible sources for the points transactions.
Column | Type | Note |
---|---|---|
source_type_id | int64 | ID |
source_type_name | string | Name of the source |
CRM fact tables
dwh_ext_xxx.j_av
Includes all the attribution value data. You can read more about attribute value here
Partitioned by: event_date
Data availability: last 180 days by event_date
Primary key: id
Note: the data can mutate over the time and depends if the deposit events are delivered with delay
Column | Type | Note |
---|---|---|
id | int64 | The incremental id of the calculation fact |
event_date | timestamp | The date when the attribution event (deposit) happened |
resource_id | int64 | FK: dm_resource, main resource that was used for the communication |
audience_id | int64 | FK: dm_audience, the campaign that triggered communication |
value | numeric | The attribution amount in the label base currency that resource+campaign generated. |
raw_value | numeric | The raw deposit amount in the original deposit currency |
converted_value | numeric | The deposit amount converted to the label base currency |
event_id | int64 | The id of the attribution event (deposit) on smartico side |
engagement_id | int64 | Reference to the j_engagements, represents the point in the Campaign/Flow from where the communication was triggered |
root_engagement_id | int64 | Reference to the j_engagements, represents the "root" point in the Campaign/Flow from where the communication was triggered |
kpi_impact_id | int64 | The id of the attribution value impact 1 - Because of resource delivery 2 - Because the user has an impression on the resource 3 - Because the user has interaction with the resource (click) 4 - Indirect impact, when nothing from above happened |
default_currency | string | The base currency of the label |
user_id | int64 | The user id of the user who made the deposit |
calculated_ts | timestamp | Timestamp when the calculation happened |
Example of query that returns attribute value per campaign (audience_id)
Example of query that returns conversion of specific user per campaign, including the time of conversion
dwh_ext_xxx.j_communication
This includes all communications sent to the end-user and all possible outcomes (fact_type_id) of these communications, like failure of delivery, impressions, clicks, etc.
Ideas to understand communications concept
the communication is always triggered by some "activity" from the campaign. This activity is referenced by engagement_id or engagement_uid. Such "activity" is part of the campaign, where the starting point of the campaign (the fact that the user entered into the campaign) is presented by the root_engagement_uuid & root_engagement_create_date. The campaign definition itself if presented by root_audience_id
The first fact that happens for communication is CREATE, after that, it is usually either SENT and DELIVERED or FAILED. It can be later seen by the end-user (IMPRESSION) and CLICKED. Users can also OPTOUT from the communication by following the "opt-out" link in the resource. For Inbox type of message, the user can also READ it and DELETE
The communication is always connected to some resource (template), presented as resource_id, and in case the resource has variations, by the resource_variation_id
Note that smartico is tracking the fact of any communication only 30 days after it's created
Partitioned by: fact_date
Data availability: last 180 days by fact_date
Primary key: engagement_id + fact_type_id + fact_date
Table structure:
Column | Type | Note |
---|---|---|
fact_date | timestamp | The date/time when fact happened |
user_id | int64 | |
fact_type_id | int64 | CREATE = 1 SENT = 2 IMPRESSION = 3 CLICK = 4 FAIL = 5 DELIVERED = 6 OPTOUT = 7 READ = 8 (for Inbox only) DELETE = 9 (for Inbox only) |
fact_details | string | for CLICK fact - the link that was clicked for FAIL - the detailed reason for fail |
activity_type_id | int64 | Type of activity that triggered that communication (FK: dm_activity_type) |
engagement_uid | uuid | Reference to the j_engagements, represents the point in the Campaign/Flow from where the communication was triggered |
resource_id | int64 | FK: dm_resource, main resource |
resource_variation_id | int64 | FK: dm_resource, specific variation of the resource |
root_audience_id | int64 | FK: dm_audience, the campaign that triggered communication |
root_engagement_uuid | uuid/string | Reference to the j_engagements, represents the "root" point in the Campaign/Flow from where the communication was triggered |
root_engagement_create_date | timestamp | The date when the campaign that triggered specific communication is started for the user |
engagement_id | int64 | Similar to engagement_uid, but in the int64 presentation |
communication_id | int64 | The unique ID of the communication will be the same for all facts through which the communication is transitioning. The ID id generated on the fact CREATE(1) |
audience_category_id | int64 | Category of the campaign. 1 - Marketing 2 - Operational |
external_id | string | ID of the communication assigned by the external gateway at the moment when the external gateway accepted communication from Smartico |
label_provider_id | int64 | ID of the communication provider (dm_providers_mail and dm_providers_sms) |
planned_send_date | timestamp | In some setups, the campaign that creates communication is scheduling delivery for a later time, in such case planned_send_date will reflect this date |
fail_reason_id | int64 | ID of the fail reason, FK: dm_com_fail_reason |
user_ext_id | string | The ID of the user in the operator system |
crm_brand_id | int64 | ID of the brand in the Smartico system |
Example of query that returns counts of communication by status and provider for the past 7 days.
dwh_ext_xxx.j_engagements
The table represents the steps of how the user enters the campaign, exits it, converts, and goes through all the "activities" of the Flow defined in the campaign.
The table represents the recursive structure of user transition over the campaign graph.
For most of the use cases, you will need to analyze only records that have activity_type_id as one of
1 - represents the start of the campaign
2 - stop the campaign
3 - conversion of campaign
The table structure below describes only essential fields that could be used to analyze the campaign performance. If you need a deeper analysis, please contact support@smartico.ai for an explanation.
Partitioned by: create_date
Data availability: last 180 days by create_date
Primary key: engagement_id or engagement_uid
Table structure:
Column | Type | Note |
---|---|---|
engagement_id | int64 | unique id |
engagement_uid | uuid/string | same as unique id, but uuid presentation |
root_audience_id | int64 | FK: dm_audience |
create_date | timestamp | time when user got this activity |
user_id | int64 | |
root_engagement_id | int64 | reference to the same table pointing to the starting point of this campaign for the specific user |
root_engagement_create_date | timestamp | The date when the campaign is started for a specific user |
activity_type_id | int64 | FK: dm_activity_type |
user_ext_id | string | The ID of the user in the operator system |
resource_id | int64 | ID of the resource if the activity is related to the communication. Important that the field is populated retrospectively every night at 3 UTC |
expected_stop_date | timestamp | Expected stop date/time of the campaign, set as create_date of the activity_type_id = 1 (start of the campaign) + campaign duration set in the BackOffice. Note that the campaign can be forcefully stopped inside the flow using "Stop" activity |
ab_target | boolean | Indicating if the current element in the campaign flow was set as an AB test target for the communication element followed this target |
Example of query that returns
data of label 2305
users entered campaign 793485 after 20/11/2023
and converted in this campaign
dwh_ext_xxx.j_bonuses
The table represents the history of bonuses given to users, including all the attempts to issue bonuses through the API provided by the platform.
Each bonus fact can be represented in the table multiple times, where each time it reflects the change in the lifecycle of the bonus. The status of the bonus is refllected in the column bonus_status_id with possible values:
1 - NEW
3 - REDEEMED
4 - REDEEM_FAILED
Partitioned by: fact_date
Data availability: last 180 days by fact_date
Primary key: bonus_id
Table structure:
Column | Type | Note |
---|---|---|
fact_date | timestamp | the date of the fact |
bonus_id | int65 | unique ID of the transaction preserved for all facts of this transaction |
user_id | int64 | Smartico user ID |
user_ext_id | string | The ID of the user in the operator system |
crm_brand_id | int64 | ID of the brand in the Smartico system |
bonus_status_id | int64 | status of bonus as explained above |
label_bonus_template_id | int64 | ID of template in the smartico system |
source_product_id | int64 | ID of the sub-system that issued the bonus, e.g. 0 - Campaign 9 - Automation rule 100 - Gamification 125 - Dynamic Bonuses engine 126 - Manual bonus 5000 - Store 8000 - Tournament 8001 - Mini-game for Retention user 8002 - Mini-game for Visitor |
entity_id | int64 | ID of the entity related to the source_product_id that issued the bonus. e.g. If issued by the Automation Rule, this ID will represent ID of the rule. |
root_audience_id | int64 | Similar to the entity_id, but should be used for the Campaign reference |
error_code | int64 | Error code in case of failed to issue bonus |
error_message | string | The detailed error message |
Example of query that returns final statuses of bonuses in the defined period
Gamification fact tables
dwh_ext_xxx.g_ach_completed
The table represents the history of all completed missions & badges.
Partitioned by: create_date
Data availability: last 180 days by create_date
Primary key: ach_completed_id
Column | Type | Note |
---|---|---|
ach_id | INT64 | ID of the mission/badge |
create_date | TIMESTAMP | The time when mission/badge compelted |
is_recurring | BOOLEAN | Indicate if the original mission was recurring |
user_ext_id | STRING | External ID of the user |
crm_brand_id | INT64 | ID of the brand in the Smartico system |
ach_completed_id | INT64 | Unique ID of the completion fact |
Example of the query returning a count of users completed missions in the last 3 days
dwh_ext_xxx.g_ach_optins
The table represents the history of all opt-ins in the missions that require explicit opt-ins to participate.
Partitioned by: create_date
Data availability: last 180 days by create_date
Primary key: ach_id, user_id, create_date
Column | Type | Note |
---|---|---|
ach_id | INT64 | ID of the mission |
create_date | TIMESTAMP | The time when user opted in the mission |
user_id | INT64 | user id in the smartico system |
user_ext_id | STRING | External ID of the user |
crm_brand_id | INT64 | ID of the brand in the Smartico system |
Example of the query returning up to 1000 rows of missions that were active in the past 30 days with counts of opted in customers and customers that completed the mission in that period.
dwh_ext_xxx.g_ach_claimed
The table represents facts of prizes claiming for the missions for which a claim is required.
Read more about "claiming" in the Missions guide.
Partitioned by: create_date
Data availability: last 180 days by create_date
Primary key: ach_completed_id
Column | Type | Note |
---|---|---|
ach_id | INT64 | ID of the mission |
create_date | TIMESTAMP | The time when user claimed the mission |
user_id | INT64 | user id in the smartico system |
user_ext_id | STRING | External ID of the user |
crm_brand_id | INT64 | ID of the brand in the Smartico system |
ach_completed_id | INT64 | Reference to g_ach_completed that represents completed missions |
dwh_ext_xxx.g_ach_points_change_log
The table represents all changes in the gamification points
Partitioned by: create_date
Data availability: last 180 days by create_date
Primary key: NA
Column | Type | Note |
---|---|---|
create_date | TIMESTAMP | The time of transaction |
source_type_id | INT64 | Type of of the source of the change, refer to dm_ach_points_change_source |
source_entity_id | INT64 | Id of the entity that triggered transaction, the lookup table depends on the source_type_id |
points_collected | INT64 | Amount of points in transaction, could be negative in case of deduction |
user_points_ever | INT64 | Amount of points user collected ever after the transaction |
user_points_balance | INT64 | Current points balance after transaction |
user_ext_id | STRING | External ID of the user |
crm_brand_id | INT64 | ID of the brand in the Smartico system |
Example of query showing the source of the points credit and deposit
dwh_ext_xxx.g_minigames
The table represents mini-game gameplay results.
Partitioned by: create_date
Data availability: last 180 days by create_date
Primary key: spin_id
Column | Type | Note |
---|---|---|
create_date | TIMESTAMP | The time of transaction |
saw_template_id | INT64 | ID of mini-game template, refer to dm_saw_template |
saw_prize_id | INT64 | ID of the won prize, refer to dm_saw_prize |
prize_type_id | INT64 | Type of the prize: 1 - No Prize 2 - Points 3 - Bonus 4 - Tangible 5 - Spin in the mini-game 6 - Jackpot 7 - Level upgrade |
user_ext_id | STRING | External ID of the user |
crm_brand_id | INT64 | ID of the brand in the Smartico system |
buyin_cost_points | INT64 | The price of a mini-game attempt paid by the user for the mini-games that cost gamification points (there are other types of mini-games, that are free or "attempt" based) |
prize_amount | NUMERIC | Winning amount, in case the prize is the type of Jackpot or Points |
pool_after | INT64 | The number of the items of this prize that is left in the stock after the user wins the prize identified by saw_prize_id |
spin_id | INT64 | Unique ID of the game play fact |
saw_prize_note | STRING | Reserved for future use |
dwh_ext_xxx.g_tournament_analytics
The table contains information about various tournaments and relevant analytics events.
Each tournament can be represented in the table multiple times, where each time it reflects different tournament instances (different runs).
Note that there is data nested under progress.
Partitioned by: event_time
Data availability: last 180 days by create_date
Primary key: id
Column | Type | Note | |
create_date | TIMESTAMP | Timestamp of when the record was created. | |
update_date | TIMESTAMP | Timestamp of when the record was last updated. | |
id | INT64 | Primary identifier for the tournament. | |
name | STRING | Name of the tournament, optional. | |
logic_version_id | INT64 | ID representing the logic version of the tournament. | |
registration_type_id | INT64 | ID for the registration type of the tournament.
| |
duration_ms | INT64 | Duration of the tournament in milliseconds. | |
players_min_count | INT64 | Minimum number of players required for the tournament. | |
players_max_count | INT64 | Maximum number of players allowed, optional. | |
tournament_type_id | INT64 | ID representing the type of the tournament.
| |
tournament_status_id | INT64 | ID indicating the status of the tournament.
| |
last_tournament_create_date | TIMESTAMP | Timestamp of when the last tournament was created. | |
show_before_start_ms | INT64 | Time in milliseconds to show the tournament before start. | |
allow_late_registration | BOOLEAN | Indicates if late registration is allowed. | |
event_id | INT64 | ID of the associated tournament analytics event. | |
event_time | TIMESTAMP | Time when the analytics event was recorded. | |
user_id | INT64 | ID of the user associated with the event. | |
event_type_id | INT64 | ID representing the type of the analytics event. | |
progress | RECORD | Repeated record indicating progress details related to the event. | |
| tournament_id | INT64 | ID of the tournament associated with the analytics event. |
| tournament_instance_id | INT64 | ID of the specific instance of the tournament associated with the analytics event. |
| scores | INT64 | Scores attained during the tournament event. |
| total_scores | INT64 | Total accumulated scores during the tournament event. |
| rule_id | INT64 | ID of the rule applied during the tournament event. |
crm_brand_id | INT64 | ID of the brand in the CRM system. | |
user_ext_id | STRING | External ID of the user associated with the analytics event. |
This example provides a comprehensive summary of each tournament instance, focusing on score distributions and rule applications. It allows for easy comparison between different tournament instances and can help identify patterns in performance and rule usage across various tournaments.
The results are ordered by tournament_id and then tournament_instance_id, making it easy to view all instances of a particular tournament together. This structure can be particularly useful for analyzing trends or changes in tournament performance over time or across different instances of the same tournament.
dwh_ext_xxx.g_shop_transactions
The table represents all purchases in the store.
Partitioned by: create_date
Data availability: last 180 days by create_date
Primary key: transaction_id
Column | Type | Note |
---|---|---|
transaction_id | INT64 | Unique ID of transaction |
create_date | TIMESTAMP | The time of transaction |
shop_item_id | INT64 | ID of the shop item, refer to dm_shop_item |
points_amount | INT64 | The price of the item in the gamification points |
user_ext_id | STRING | External ID of the user |
crm_brand_id | INT64 | ID of the brand in the Smartico system |
dwh_ext_xxx.g_ux
The table represents the UI interactions of users with the Gamification widget
Partitioned by: create_date
Data availability: last 180 days by create_date
Primary key: NA
Column | Type | Note |
---|---|---|
label_id | integer | Unique id of the label |
crm_brand_id | integer | Unique id of the brand. Relevant only for multi-brand setup |
create_date | timestamp | Date that represents exactly when the user performed an action (opened a modal or navigated to a some menu section) |
user_ext_id | string | Unique id of user in the integrated system |
user_id | integer | Unique id of user in the Smartico system |
interface_type | integer | Device type of the current user session 0 - Mobile 1 - Desktop |
screen_name_id | integer | Identifier of the screen the user has navigated to. Possible values are shown in the table below - ScreenNameIds |
screen_subname_id | integer | Identifier of the category/section of the screen the user has navigated to. Possible values are presented in the table below and depend on the screen_name_id. Relevant only for screens that have categorized tabs, like store, tournaments, etc |
custom_section_id | integer | Represents custom section id if the current screen is opened in the context of the custom section. |
entity_id | integer | Represents the ID of the entity the user has opened on the current screen. For example: a specific item in the store or a particular tournament. |
ScreenNameIds:
ID | Name of screen |
---|---|
1 | Overview / Home |
2 | Missions screen_subname_id could be 0 - overview 2 - available 3 - locked 4 - completed 5 - missed |
3 | Leaderboards screen_subname_id could be 1 - daily 2 - weekly 3 - monthly |
4 | Tournaments list screen_subname_id could be 0 - overview 3 - in progress 4 - finished 5 - my tournaments |
5 | Tournament lobby (details of specific tournament) screen_subname_id could be 1 - rules 2 - Prizes 3 - Leaderboard 4 - Related games |
6 | Levels (grid presentation) |
7 | Levels (map presentation) |
8 | Badges |
9 | Bonuses screen_subname_id could be 2 - pending 3 - redeemed |
10 | Spin a Wheel |
11 | Store |
12 | Store (purchase history) |
13 | Inbox screen_subname_id could be 1 - All message 2 - Favorite |
14 | Match X |
15 | Quiz |
16 | Jackpots |
17 | Lootbox |
18 | Custom section |
31 | Mission item modal |
32 | Level item modal |
33 | Badge item modal |
34 | Store item modal |
35 | Bonus item modal |
36 | Jackpot item modal |
37 | Username change modal |
38 | Avatar change modal |
dwh_ext_xxx.g_ach_levels_changed
The table represents the log of level upgrades/downgrades for each user.
Partitioned by: fact_date
Data availability: last 180 days by fact_date
Primary key: level_change_id
Column | Type | Note |
---|---|---|
fact_date | TIMESTAMP | The time of transaction |
level_change_id | INT64 | Unique ID of the level change fact |
from_level_id | INT64 | ID of the level before transaction |
to_level_id | INT64 | ID of the level after transaction |
points_change_source_id | INT64 | If level change is triggered by points balance change, then this field represents the source of points change. Possible IDs: |
points_collected_on_level_change | INT64 | If level change is triggered by points balance change, then this field represents the amount of points added to the user at the moment of upgrade |
user_points_ever | INT64 | "Points ever" balance of user after level hcange |
user_ext_id | STRING | External ID of the user |
crm_brand_id | INT64 | ID of the brand in the Smartico system |
Affiliation dimensional & fact tables
This is an example of the query that calculates total commission split by components (CPA, CPL, RevShare) for all affiliates lifetime.
Notes:
Provided example of calculating of sub_affiliate_fee should be used only in the setup with one level of sub-affiliates network (when only direct parent of affiliate is getting attribution from the comission generated by his children). In case of multi-level affiliation network setup, the sub-affailites commissions should be calculated in the different way
These calculations are not taking into accounts adjustments that can be done in 3 differences cases:
Manual adjustments
Adjustments based on the "Tiers" based deal structures (RevShare tiers)
Adjustment done beginning of the month based on "No negative cary over" calculations
dwh_ext_xxx.dm_affiliate
The table represent information about affiliate
Column | Type | Note |
---|---|---|
affiliate_id | INT64 | ID of the affiliate |
create_date | TIMESTAMP | Date when affiliate is created |
parent_affiliate_id | INT64 | ID of the parent affiliate |
manager_id | INT64 | ID of the manager |
payment_method_id | INT64 | ID of payment method |
ext_affiliate_id | STRING | ID of affiliate in the external system |
dwh_ext_xxx.aff_fin_history_daily
The table represents daily aggregation of all financial data system collects about a player. Each column refers to some type of financial data and reflect the SUM of all numbers affilaite system have received for some "operation_date". To understand the final numbers for a player, you should SUM all records for a player, for example if you need to know the total deposit amount of a player, you should SUM all "deposits" + "first_deposit" columns for a player.
Partitioned by: operation_date
Table structure:
Column | Type | Note |
---|---|---|
ext_customer_id | STRING | Player ID the way it's reported to affiliate system |
brand_id | INTEGER | Brand ID in affiliate system |
country | STRING | ISO2 country code of the player |
affiliate_id | INTEGER | Affiliate ID in case player belongs to one |
registration_id | INTEGER | TAP registration id |
operation_date | TIMESTAMP | financial day |
update_date | TIMESTAMP | last update date |
first_deposit | NUMERIC | first deposit amount |
deposits | NUMERIC | deposits amount reported during operation_date |
withdrawals | NUMERIC | withdrawals amount reported during operation_date |
chargebacks | NUMERIC | chargebacks amount reported during operation_date |
bonuses | NUMERIC | bonuses amount reported during operation_date |
deductions | NUMERIC | deductions amount reported during operation_date |
bets | NUMERIC | bets amount reported during operation_date |
wins | NUMERIC | wins amount reported during operation_date |
operations | NUMERIC | operations count reported during operation_date |
ggr | NUMERIC | GGR amount reported during operation_date |
cpl_fee | NUMERIC | CPL fee generated during operation_date |
cpa_fee | NUMERIC | CPA fee generated during operation_date |
rev_share_fee | NUMERIC | RevShare fee generated during operation_date |
sub_affiliate_fee | NUMERIC | Total sub affiliate fee generated during operation_date for all affiliate levels |
net_pnl | NUMERIC | NET PnL calculated during operation_date |
deposit_count | NUMERIC | deposits count reported during operation_date |
withdrawal_count | NUMERIC | withdrawals count reported during operation_date |
q_cpa | NUMERIC | if value is 1, it means that at this operation_date player was qualified for CPA |
q_cpl | NUMERIC | if value is 1, it means that at this operation_date player was qualified for CPL |
How to connect to the Google BigQuery SQL
Smartico leverages Google BigQuery to offer robust and scalable data analytics capabilities. As part of our service, we provide a secure JSON key file that enables programmatic access to our DWH. This section outlines how to use the provided JSON key to connect to BigQuery, depending on the Business Intelligence (BI) tool you choose.
Overview of JSON Key Usage
A JSON key file is used as a credential for authentication. It allows secure access to Google BigQuery without needing user-specific Google credentials. The JSON key file you receive will be associated with a specific service account created and managed by our company, which has been granted access to necessary BigQuery resources.
Important Security Note
Please store your JSON key file securely and ensure it is only accessible to those who require it. Do not share the JSON key publicly or with unauthorized users, as it provides direct access to our Data Warehouse.
Connecting to BigQuery Using Various BI Tools
Different BI tools have varied methods for setting up connections to BigQuery using a JSON key. Below, we offer guidance on connecting some popular BI tools to our DWH. For detailed instructions, refer to the specific documentation for the BI tool you are using.
Note: you shouldn't run analytical queries directly on Smartico DWH, as it may incur high costs. The proper way is to build a pipeline that loads daily data in your internal DWH and build reports on top of it.
Google Data Studio
Connection Type: Direct integration with Google services.
Key Usage: Typically, Data Studio uses OAuth 2.0 for authentication, which does not require a service account JSON key if using a Google account with access rights.
Tableau
Connection Type: Direct connection using a service account.
Key Usage: Upload your JSON key file during the setup of the BigQuery connection in Tableau.
Looker
Connection Type: Direct connection using a service account.
Key Usage: Paste the contents of the JSON key file into the appropriate field when configuring the BigQuery connection in Looker.
Microsoft Power BI
Connection Type: Connect using ODBC drivers.
Key Usage: Use the Simba ODBC Driver for Google BigQuery and specify the path to your JSON key file in the DSN configuration.
Qlik Sense
Connection Type: Connect using ODBC drivers.
Key Usage: Configure a System DSN using the ODBC Data Source Administrator on Windows, including the JSON key file.
FAQ
How to analyze DWH usage
You can find 2 reports related to DWH usage in the Reports \ Analytics section of Smartico BackOffice.
DWH: requests count & data read for the current month - is showing the requests count and amount of data read per day
DWH: queries log for last 24 hours - will show you exact requests completed in the last 24 hours and the amount of data read for each
Can't access DWH from Power BI, error: Unable to authenticate with Google BigQuery Storage API. Check your account permissions.
You need to disable the usage of Storage API in Power BI settings following this guide - https://learn.microsoft.com/en-us/power-query/connectors/google-bigquery.
"Use Storage API" - A flag that enables using the Storage API of Google BigQuery. This option is true by default. This option can be set to false to not use the Storage API and use REST APIs instead.
How to authenticate in Power BI with the service account
Use the following guide from Microsoft - https://learn.microsoft.com/en-us/power-query/connectors/google-bigquery
Service Account Email: must be in email format
Service Account JSON key file contents: once this JSON key is downloaded, all new lines must be removed from the file so that the contents are in one line. Once the JSON file is in that format, the contents can be pasted into this field.
Pay attention the content of JSON file need to be manually formatted in one line
Can't access table dwh_ext_xxx.j_engagements and any other
You need to replace "xxx" value with the label_id specific for your setup. You can also see the label_id in the name of the BigQuery user that has the format l"dwh_ext_1234"
Can different labels and brands have access using the same DWH account?
Every label requires a separate account to access DWH.
You could have many brands under one label setup. Data related to all these labels will be available from the respective DWH account for this label
The documentation says not to use the DW directly for analytical queries, as it may involve costs. Is the best way to carry out a daily data load to an internal DW?
The best approach is to make a daily incremental load from Smartico DWH to your DWH.
Dimensional tables (prefixed as “dm”) are quite small, and you can reload them fully every day.
Other tables (facts tables), are partitioned by date/time, so you can run every day at 1 AM UTC, and get fresh data for the previous day, and load it to your DWH
When using a BI tool such as Power BI in the import model, can I connect directly to the DWH? Is there any financial or performance impact?
If you use Power BI or any other tool to query data directly from Smartico DWH, this may incur a high load and additional costs depending on how complex reports you are doing. Best way is to load data from Smartico DWH to your DWH and connect Power BI to your DWH.
Regarding the "import" mode in Power BI - according to the Microsoft documentation, it can work in two ways - "Full refresh" and "Incremental refresh". Based on the names, the best way is to use "Incremental way" in order to avoid high resource usage, but for exact setup please consult with an expert in the Power BI tool.
Is there a limit on the number of queries that does not generate this extra cost?
There is a limit of 1 TB of scanned data per month (it reflects the way how Google is building pricing for BigQuery usage). There is no limit in number of queries.
Reading the documentation, there are some details we'd like to know: Is the data in the DWH always available for a window of the last 180 days?
Yes, fact tables are limited to last 180 days. If you do incremental/daily load of data finally, you can store unlimited in terms of time, data on your side.
Can we get data from all the campaigns within this DWH? Information related to the segmentations we used, results by channel (sms, email, push, etc), value assigned to the active group and control group, CTR, conversion rate, etc?
You can get results from communication channels like SMS, email, push, etc.
The fact table for communication channels - j_communication, includes all fails, clicks, and impressions, so you can build derivative metrics like CTR, Delivery rate, Fail rate, etc.
j_engagements fact table represents information about campaigns, you can build conversion rate for campaigns using it.
Segment exporting is available through the table-function fn_export_segment, you can find details in the documentation above.
Last updated