Access to DWH
Access to DWH is paid service that should be requested separately.
Please contact your account manager to get more details.
Intro
Smartico provides access to the Data Warehouse (DWH) with row data related to the system operation. The DWH is based on Google BigQuery and exposes facts and dimensions tables.
There are mainly three domains of data exposed through the DWH:
CRM - information about campaigns, communication, conversions, automation rules etc
Gamification - missions, levels, tournaments, mini-games, points, store purchases
Affiliation - registration facts, financial transactions, payments, adjustments, affiliates information
Some of these domains have shared information, e.g. bonuses - can be given from the CRM and from Gamification. In the Affilication domain you will find also CRM data, so you can analyse all communication that is addressed to the affiliates.
Smartico DWH shouldn't be used to run analytical queries directly, as limitations apply to how much data can be retrieved daily.
Instead, you should do a batch load of changes to your analytical system and run reports on your side.
All fact tables are partitioned by date for your convenience so that you can load only the data delta for the completed day efficiently. We recommend running the import of the previous day a few hours after midnight in UTC to ensure that last-day data is fully delivered to the fact tables.
Pay attention to the fact that the data set 'dwh_ext_xxx' is given as an example. The dataset name for your particular setup is the same as the name of the user that you will get to access DWH.
Records in the fact tables are immutable (except cases when stated explicitly), which means they are not changing after they are added.
Fact tables are updated in real-time, and dimensional tables are updated once per hour.
Dimensional tables
dwh_ext_xxx.dm_activity_type
List of possible "activities" that can be used to build a Flow for the scheduled & real-time campaigns. There are activities like "Send SMS", "Send Mail", "Give Bonus", "Stop campaign" etc.
There are essential activities that exist in any campaign, like
Journey start (1) - represents the fact that the campaign started
Journey stopped (2) - represents the stop of the campaign
Journey Converted (3) - represents the fact that the campaign was converted
Table structure:
activity_type_id
int64
activity_name
string
dwh_ext_xxx.dm_audience
The list of campaigns. Includes both real-time and scheduled.
Table structure:
audience_id
int64
audience_name
string
entry_mode_id
int64
0 - ONCE_IN_A_LIFE 1 - ONCE_IN_AN_OPEN_JOURNEY
2 - EVERY_TIME_CONDITION_MET
3 - STOP_AND_START
audience_status_id
int64
1 - Draft 2 - Active 3 - Paused 4 - Disabled 5 - Archived 6 - Executed (for Scheduled only)
audience_exec_type_id
int64
1 - Realtime campaign of Marketing category 2 - Realtime campaign of Operational category 3 - Scheduled campaign of Marketing category 4 - Scheduled campaign of Operational category
dwh_ext_xxx.dm_resource
Communication resources (templates) of all types, e.g. SMS, popups, emails, etc.
Note that the list includes both - main resources and variations of the main resources
Table structure:
resource_id
int64
resource_name
string
resource_type_id
int64
1 - Mail 2 - Popup 3 - SMS 4 - Push 5 - Inbox 9 - IVR
dwh_ext_xxx.dm_ach
Table contains information about missions and badges
ach_id
int64
ID of mission/badge
internal_name
string
Name of mission/badge
dwh_ext_xxx.dm_ach_points_change_source
Lookup table with possible sources for the points transactions.
source_type_id
int64
ID
source_type_name
string
Name of the source
CRM fact tables
dwh_ext_xxx.j_communication
Includes all communications sent to the end-user and all possible outcomes (fact_type_id) of these communications, like failure of delivery, impressions, clicks, etc.
Ideas to understand communications concept
the communication is always triggered by some "activity" from the campaign. This activity is referenced by engagement_id or engagement_uid. Such "activity" is part of the campaign, where the starting point of the campaign (the fact that the user entered into the campaign) is presented by the root_engagement_uuid & root_engagement_create_date. The campaign definition itself if presented by root_audience_id
The first fact that happens for communication is CREATE, after that, it is usually either SENT and DELIVERED or FAILED. It can be later seen by the end-user (IMPRESSION) and CLICKED. Users can also OPTOUT from the communication by following the "opt-out" link in the resource. For Inbox type of message, the user can also READ it and DELETE
The communication is always connected to some resource (template), presented as resource_id, and in case the resource has variations, by the resource_variation_id
Note that smartico is tracking the fact of any communication only 30 days after it's created
Partitioned by: fact_date
Data availability: last 180 days by fact_date
Table structure:
fact_date
timestamp
The date/time when fact happened
user_id
int64
fact_type_id
int64
CREATE = 1 SENT = 2 IMPRESSION = 3 CLICK = 4 FAIL = 5 DELIVERED = 6 OPTOUT = 7 READ = 8 (for Inbox only) DELETE = 9 (for Inbox only)
fact_details
string
for CLICK fact - the link that was clicked for FAIL - the detailed reason for fail
activity_type_id
int64
Type of activity that triggered that communication (FK: smarticoC.dm_activity_type)
engagement_uid
uuid
Reference to the smarticoC.j_engagements, represents the point in the Campaign/Flow from where the communication was triggered
resource_id
int64
FK: smarticoC.dm_resource, main resource
resource_variation_id
int64
FK: smarticoC.dm_resource, specific variation of the resource
root_audience_id
int64
FK: smarticoC.dm_audience, the campaign that triggered communication
root_engagement_uuid
uuid/string
Reference to the smarticoC.j_engagements, represents the "root" point in the Campaign/Flow from where the communication was triggered
root_engagement_create_date
timestamp
The date when the campaign that triggered specific communication is started for the user
engagement_id
int64
Similar to engagement_uid, but in the int64 presentation
communication_id
int64
The unique ID of the communication will be the same for all facts through which the communication is transitioning. The ID id generated on the fact CREATE(1)
audience_category_id
int64
Category of the campaign. 1 - Marketing 2 - Operational
external_id
string
ID of the communication assigned by the external gateway at the moment when the external gateway accepted communication from Smartico
planned_send_date
timestamp
In some setups, the campaign that creates communication is scheduling delivery for a later time, in such case planned_send_date will reflect this date
fail_reason_id
int64
ID of the fail reason, FK: smarticoC.dm_com_fail_reason
user_ext_id
string
The ID of the user in the operator system
dwh_ext_xxx.j_engagements
The table represents the steps of how the user enters the campaign, exits it, converts, and goes through all the "activities" of the Flow defined in the campaign.
The table represents the recursive structure of user transition over the campaign graph.
For most of the use cases, you will need to analyze only records that have activity_type_id as one of
1 - represents the start of the campaign
2 - stop the campaign
3 - conversion of campaign
The table structure below describes only essential fields that could be used to analyze the campaign performance. If you need a deeper analysis, please contact support@smartico.ai for an explanation.
Partitioned by: create_date
Data availability: last 180 days by create_date
Table structure:
engagement_id
int64
unique id
engagement_uid
uuid/string
same as unique id, but uuid presentation
root_audience_id
int64
FK: smarticoC.dm_audience
create_date
timestamp
time when user got this activity
user_id
int64
root_engagement_id
int64
reference to the same table pointing to the starting point of this campaign for the specific user
root_engagement_create_date
timestamp
The date when the campaign is started for a specific user
activity_type_id
int64
FK: smarticoC.dm_activity_type
user_ext_id
string
The ID of the user in the operator system
resource_id
int64
ID of the resource if the activity is related to the communication. Important that the field is populated retrospectively every night at 3 UTC
Example of query that returns
data of label 2305
users entered campaign 793485 after 20/11/2023
and converted in this campaign
Gamification fact tables
dwh_ext_xxx.g_ach_completed
The table represents history of all completed missions & badges
Partitioned by: create_date
Data availability: last 180 days by create_date
ach_id
INT64
ID of the mission/badge
create_date
TIMESTAMP
The time when mission/badge compelted
is_recurring
BOOLEAN
Indicate if the original mission was recurring
user_ext_id
STRING
External ID of the user
crm_brand_id
INT64
ID of the brand in the Smartico system
Example of query returning number of users completed missions in the last 3 days
dwh_ext_xxx.g_ach_points_change_log
The table represents all changes in the gamification points
Partitioned by: create_date
Data availability: last 180 days by create_date
create_date
TIMESTAMP
The time of transaction
source_type_id
INT64
Type of of the source of the change, refer to dm_ach_points_change_source
source_entity_id
INT64
Id of the entity that triggered transaction, the lookup table depends on the source_type_id
points_collected
INT64
Amount of points in transaction, could be negative in case of deduction
user_points_ever
INT64
Amount of points user collected ever after the transaction
user_points_balance
INT64
Current points balance after transaction
user_ext_id
STRING
External ID of the user
crm_brand_id
INT64
ID of the brand in the Smartico system
Example of query showing the source of the points credit and deposit
Affiliation dimensional & fact tables
Example of query that calculates total commission with split by components (CPA, CPL, RevShare) for all affiliates lifetime.
Notes:
Provided example of calculating of sub_affiliate_fee should be used only in the setup with one level of sub-affiliates network (when only direct parent of affiliate is getting attribution from the comission generated by his children). In case of multi-level affiliation network setup, the sub-affailites commissions should be calculated in the different way
These calculations are not taking into accounts adjustments that can be done in 3 differences cases:
Manual adjustments
Adjustments based on the "Tiers" based deal structures (RevShare tiers)
Adjustment done beginning of the month based on "No negative cary over" calculations
dwh_ext_xxx.dm_affiliate
The table represent information about affiliate
affiliate_id
INT64
ID of the affiliate
create_date
TIMESTAMP
Date when affiliate is created
parent_affiliate_id
INT64
ID of the parent affiliate
manager_id
INT64
ID of the manager
payment_method_id
INT64
ID of payment method
ext_affiliate_id
STRING
ID of affiliate in the external system
dwh_ext_xxx.aff_fin_history_daily
The table represents daily aggregation of all financial data system collects about a player. Each column refers to some type of financial data and reflect the SUM of all numbers affilaite system have received for some "operation_date". To understand the final numbers for a player, you should SUM all records for a player, for example if you need to know the total deposit amount of a player, you should SUM all "deposits" + "first_deposit" columns for a player.
Partitioned by: operation_date
Table structure:
ext_customer_id
STRING
Player ID the way it's reported to affiliate system
brand_id
INTEGER
Brand ID in affiliate system
country
STRING
ISO2 country code of the player
affiliate_id
INTEGER
Affiliate ID in case player belongs to one
registration_id
INTEGER
TAP registration id
operation_date
TIMESTAMP
financial day
update_date
TIMESTAMP
last update date
first_deposit
NUMERIC
first deposit amount
deposits
NUMERIC
deposits amount reported during operation_date
withdrawals
NUMERIC
withdrawals amount reported during operation_date
chargebacks
NUMERIC
chargebacks amount reported during operation_date
bonuses
NUMERIC
bonuses amount reported during operation_date
deductions
NUMERIC
deductions amount reported during operation_date
bets
NUMERIC
bets amount reported during operation_date
wins
NUMERIC
wins amount reported during operation_date
operations
NUMERIC
operations count reported during operation_date
ggr
NUMERIC
GGR amount reported during operation_date
cpl_fee
NUMERIC
CPL fee generated during operation_date
cpa_fee
NUMERIC
CPA fee generated during operation_date
rev_share_fee
NUMERIC
RevShare fee generated during operation_date
sub_affiliate_fee
NUMERIC
Total sub affiliate fee generated during operation_date for all affiliate levels
net_pnl
NUMERIC
NET PnL calculated during operation_date
deposit_count
NUMERIC
deposits count reported during operation_date
withdrawal_count
NUMERIC
withdrawals count reported during operation_date
q_cpa
NUMERIC
if value is 1, it means that at this operation_date player was qualified for CPA
q_cpl
NUMERIC
if value is 1, it means that at this operation_date player was qualified for CPL
FAQ
Can't access DWH from Power BI, error: Unable to authenticate with Google BigQuery Storage API. Check your account permissions.
You need to disable the usage of Storage API in Power BI settings following this guide - https://learn.microsoft.com/en-us/power-query/connectors/google-bigquery.
"Use Storage API" - A flag that enables using the Storage API of Google BigQuery. This option is true by default. This option can be set to false to not use the Storage API and use REST APIs instead.
Can't access table dwh_ext_xxx.j_engagements and any other
You need to replace "xxx" value with the label_id specific for your setup. You can also see the label_id in the name of BigQuery user that has the format like "dwh_ext_1234"
Can different labels and brands are access with the samw DWH account
Every label requires separate account to access DWH.
You could have many brands under one label setup, data related to all these labels will be availalbe from the repsective DWH account for this label
Last updated
Was this helpful?