◾Access to DWH
Access to DWH is paid service that should be requested separately.
Please contact your account manager to get more details.
Intro
Smartico provides access to the Data Warehouse (DWH) with row data related to the system operation. The DWH is based on Google BigQuery and exposes facts and dimensions tables.
There are mainly three domains of data exposed through the DWH:
CRM - information about campaigns, communication, conversions, automation rules etc
Gamification - missions, levels, tournaments, mini-games, points, store purchases
Affiliation - registration facts, financial transactions, payments, adjustments, affiliates information
Some of these domains have shared information, e.g. bonuses - can be given from the CRM and from Gamification. In the Affilication domain you will find also CRM data, so you can analyse all communication that is addressed to the affiliates.
Smartico DWH shouldn't be used to run analytical queries directly, as limitations apply to how much data can be retrieved daily.
Instead, you should do a batch load of changes to your analytical system and run reports on your side.
All fact tables are partitioned by date for your convenience so that you can load only the data delta for the completed day efficiently. We recommend running the import of the previous day a few hours after midnight in UTC to ensure that last-day data is fully delivered to the fact tables.
Pay attention to the fact that the data set 'dwh_ext_xxx' is given as an example. The dataset name for your particular setup is the same as the name of the user that you will get to access DWH.
Records in the fact tables are immutable (except cases when stated explicitly), which means they are not changing after they are added.
Fact tables are updated in real-time, and dimensional tables are updated once per hour.
Dimensional tables
dwh_ext_xxx.dm_activity_type
List of possible "activities" that can be used to build a Flow for the scheduled & real-time campaigns. There are activities like "Send SMS", "Send Mail", "Give Bonus", "Stop campaign" etc.
There are essential activities that exist in any campaign, like
Journey start (1) - represents the fact that the campaign started
Journey stopped (2) - represents the stop of the campaign
Journey Converted (3) - represents the fact that the campaign was converted
Table structure:
Column | Type | Note |
---|---|---|
activity_type_id | int64 | |
activity_name | string |
dwh_ext_xxx.dm_audience
The list of campaigns. Includes both real-time and scheduled.
Table structure:
Column | Type | Note |
---|---|---|
audience_id | int64 | |
audience_name | string | |
entry_mode_id | int64 | 0 - ONCE_IN_A_LIFE 1 - ONCE_IN_AN_OPEN_JOURNEY 2 - EVERY_TIME_CONDITION_MET 3 - STOP_AND_START |
audience_status_id | int64 | 1 - Draft 2 - Active 3 - Paused 4 - Disabled 5 - Archived 6 - Executed (for Scheduled only) |
audience_exec_type_id | int64 | 1 - Realtime campaign of Marketing category 2 - Realtime campaign of Operational category 3 - Scheduled campaign of Marketing category 4 - Scheduled campaign of Operational category |
dwh_ext_xxx.dm_resource
Communication resources (templates) of all types, e.g. SMS, popups, emails, etc.
Note that the list includes both - main resources and variations of the main resources
Table structure:
Column | Type | Note |
---|---|---|
resource_id | int64 | |
resource_name | string | |
resource_type_id | int64 | 1 - Mail 2 - Popup 3 - SMS 4 - Push 5 - Inbox 9 - IVR |
dwh_ext_xxx.dm_ach
Table contains information about missions and badges
Column | Type | Note |
---|---|---|
ach_id | int64 | ID of mission/badge |
internal_name | string | Name of mission/badge |
dwh_ext_xxx.dm_ach_points_change_source
Lookup table with possible sources for the points transactions.
Column | Type | Note |
---|---|---|
source_type_id | int64 | ID |
source_type_name | string | Name of the source |
CRM fact tables
dwh_ext_xxx.j_communication
Includes all communications sent to the end-user and all possible outcomes (fact_type_id) of these communications, like failure of delivery, impressions, clicks, etc.
Ideas to understand communications concept
the communication is always triggered by some "activity" from the campaign. This activity is referenced by engagement_id or engagement_uid. Such "activity" is part of the campaign, where the starting point of the campaign (the fact that the user entered into the campaign) is presented by the root_engagement_uuid & root_engagement_create_date. The campaign definition itself if presented by root_audience_id
The first fact that happens for communication is CREATE, after that, it is usually either SENT and DELIVERED or FAILED. It can be later seen by the end-user (IMPRESSION) and CLICKED. Users can also OPTOUT from the communication by following the "opt-out" link in the resource. For Inbox type of message, the user can also READ it and DELETE
The communication is always connected to some resource (template), presented as resource_id, and in case the resource has variations, by the resource_variation_id
Note that smartico is tracking the fact of any communication only 30 days after it's created
Partitioned by: fact_date
Data availability: last 180 days by fact_date
Table structure:
Column | Type | Note |
---|---|---|
fact_date | timestamp | The date/time when fact happened |
user_id | int64 | |
fact_type_id | int64 | CREATE = 1 SENT = 2 IMPRESSION = 3 CLICK = 4 FAIL = 5 DELIVERED = 6 OPTOUT = 7 READ = 8 (for Inbox only) DELETE = 9 (for Inbox only) |
fact_details | string | for CLICK fact - the link that was clicked for FAIL - the detailed reason for fail |
activity_type_id | int64 | Type of activity that triggered that communication (FK: smarticoC.dm_activity_type) |
engagement_uid | uuid | Reference to the smarticoC.j_engagements, represents the point in the Campaign/Flow from where the communication was triggered |
resource_id | int64 | FK: smarticoC.dm_resource, main resource |
resource_variation_id | int64 | FK: smarticoC.dm_resource, specific variation of the resource |
root_audience_id | int64 | FK: smarticoC.dm_audience, the campaign that triggered communication |
root_engagement_uuid | uuid/string | Reference to the smarticoC.j_engagements, represents the "root" point in the Campaign/Flow from where the communication was triggered |
root_engagement_create_date | timestamp | The date when the campaign that triggered specific communication is started for the user |
engagement_id | int64 | Similar to engagement_uid, but in the int64 presentation |
communication_id | int64 | The unique ID of the communication will be the same for all facts through which the communication is transitioning. The ID id generated on the fact CREATE(1) |
audience_category_id | int64 | Category of the campaign. 1 - Marketing 2 - Operational |
external_id | string | ID of the communication assigned by the external gateway at the moment when the external gateway accepted communication from Smartico |
planned_send_date | timestamp | In some setups, the campaign that creates communication is scheduling delivery for a later time, in such case planned_send_date will reflect this date |
fail_reason_id | int64 | ID of the fail reason, FK: smarticoC.dm_com_fail_reason |
user_ext_id | string | The ID of the user in the operator system |
dwh_ext_xxx.j_engagements
The table represents the steps of how the user enters the campaign, exits it, converts, and goes through all the "activities" of the Flow defined in the campaign.
The table represents the recursive structure of user transition over the campaign graph.
For most of the use cases, you will need to analyze only records that have activity_type_id as one of
1 - represents the start of the campaign
2 - stop the campaign
3 - conversion of campaign
The table structure below describes only essential fields that could be used to analyze the campaign performance. If you need a deeper analysis, please contact support@smartico.ai for an explanation.
Partitioned by: create_date
Data availability: last 180 days by create_date
Table structure:
Column | Type | Note |
---|---|---|
engagement_id | int64 | unique id |
engagement_uid | uuid/string | same as unique id, but uuid presentation |
root_audience_id | int64 | FK: smarticoC.dm_audience |
create_date | timestamp | time when user got this activity |
user_id | int64 | |
root_engagement_id | int64 | reference to the same table pointing to the starting point of this campaign for the specific user |
root_engagement_create_date | timestamp | The date when the campaign is started for a specific user |
activity_type_id | int64 | FK: smarticoC.dm_activity_type |
user_ext_id | string | The ID of the user in the operator system |
resource_id | int64 | ID of the resource if the activity is related to the communication. Important that the field is populated retrospectively every night at 3 UTC |
Example of query that returns
data of label 2305
users entered campaign 793485 after 20/11/2023
and converted in this campaign
Gamification fact tables
dwh_ext_xxx.g_ach_completed
The table represents history of all completed missions & badges
Partitioned by: create_date
Data availability: last 180 days by create_date
Column | Type | Note |
---|---|---|
ach_id | INT64 | ID of the mission/badge |
create_date | TIMESTAMP | The time when mission/badge compelted |
is_recurring | BOOLEAN | Indicate if the original mission was recurring |
user_ext_id | STRING | External ID of the user |
crm_brand_id | INT64 | ID of the brand in the Smartico system |
Example of query returning number of users completed missions in the last 3 days
dwh_ext_xxx.g_ach_points_change_log
The table represents all changes in the gamification points
Partitioned by: create_date
Data availability: last 180 days by create_date
Column | Type | Note |
---|---|---|
create_date | TIMESTAMP | The time of transaction |
source_type_id | INT64 | Type of of the source of the change, refer to dm_ach_points_change_source |
source_entity_id | INT64 | Id of the entity that triggered transaction, the lookup table depends on the source_type_id |
points_collected | INT64 | Amount of points in transaction, could be negative in case of deduction |
user_points_ever | INT64 | Amount of points user collected ever after the transaction |
user_points_balance | INT64 | Current points balance after transaction |
user_ext_id | STRING | External ID of the user |
crm_brand_id | INT64 | ID of the brand in the Smartico system |
Example of query showing the source of the points credit and deposit
Affiliation dimensional & fact tables
Example of query that calculates total commission with split by components (CPA, CPL, RevShare) for all affiliates lifetime.
Notes:
Provided example of calculating of sub_affiliate_fee should be used only in the setup with one level of sub-affiliates network (when only direct parent of affiliate is getting attribution from the comission generated by his children). In case of multi-level affiliation network setup, the sub-affailites commissions should be calculated in the different way
These calculations are not taking into accounts adjustments that can be done in 3 differences cases:
Manual adjustments
Adjustments based on the "Tiers" based deal structures (RevShare tiers)
Adjustment done beginning of the month based on "No negative cary over" calculations
dwh_ext_xxx.dm_affiliate
The table represent information about affiliate
Column | Type | Note |
---|---|---|
affiliate_id | INT64 | ID of the affiliate |
create_date | TIMESTAMP | Date when affiliate is created |
parent_affiliate_id | INT64 | ID of the parent affiliate |
manager_id | INT64 | ID of the manager |
payment_method_id | INT64 | ID of payment method |
ext_affiliate_id | STRING | ID of affiliate in the external system |
dwh_ext_xxx.aff_fin_history_daily
The table represents daily aggregation of all financial data system collects about a player. Each column refers to some type of financial data and reflect the SUM of all numbers affilaite system have received for some "operation_date". To understand the final numbers for a player, you should SUM all records for a player, for example if you need to know the total deposit amount of a player, you should SUM all "deposits" + "first_deposit" columns for a player.
Partitioned by: operation_date
Table structure:
Column | Type | Note |
---|---|---|
ext_customer_id | STRING | Player ID the way it's reported to affiliate system |
brand_id | INTEGER | Brand ID in affiliate system |
country | STRING | ISO2 country code of the player |
affiliate_id | INTEGER | Affiliate ID in case player belongs to one |
registration_id | INTEGER | TAP registration id |
operation_date | TIMESTAMP | financial day |
update_date | TIMESTAMP | last update date |
first_deposit | NUMERIC | first deposit amount |
deposits | NUMERIC | deposits amount reported during operation_date |
withdrawals | NUMERIC | withdrawals amount reported during operation_date |
chargebacks | NUMERIC | chargebacks amount reported during operation_date |
bonuses | NUMERIC | bonuses amount reported during operation_date |
deductions | NUMERIC | deductions amount reported during operation_date |
bets | NUMERIC | bets amount reported during operation_date |
wins | NUMERIC | wins amount reported during operation_date |
operations | NUMERIC | operations count reported during operation_date |
ggr | NUMERIC | GGR amount reported during operation_date |
cpl_fee | NUMERIC | CPL fee generated during operation_date |
cpa_fee | NUMERIC | CPA fee generated during operation_date |
rev_share_fee | NUMERIC | RevShare fee generated during operation_date |
sub_affiliate_fee | NUMERIC | Total sub affiliate fee generated during operation_date for all affiliate levels |
net_pnl | NUMERIC | NET PnL calculated during operation_date |
deposit_count | NUMERIC | deposits count reported during operation_date |
withdrawal_count | NUMERIC | withdrawals count reported during operation_date |
q_cpa | NUMERIC | if value is 1, it means that at this operation_date player was qualified for CPA |
q_cpl | NUMERIC | if value is 1, it means that at this operation_date player was qualified for CPL |
FAQ
Can't access DWH from Power BI, error: Unable to authenticate with Google BigQuery Storage API. Check your account permissions.
You need to disable the usage of Storage API in Power BI settings following this guide - https://learn.microsoft.com/en-us/power-query/connectors/google-bigquery.
"Use Storage API" - A flag that enables using the Storage API of Google BigQuery. This option is true by default. This option can be set to false to not use the Storage API and use REST APIs instead.
Can't access table dwh_ext_xxx.j_engagements and any other
You need to replace "xxx" value with the label_id specific for your setup. You can also see the label_id in the name of BigQuery user that has the format like "dwh_ext_1234"
Can different labels and brands are access with the samw DWH account
Every label requires separate account to access DWH.
You could have many brands under one label setup, data related to all these labels will be availalbe from the repsective DWH account for this label
Last updated