Smartico Data Warеhouse

Access to Smartico DWH is a paid service that should be requested separately.

Please contact your Customer Success Manager to get more details.

Intro

Smartico provides access to the Data Warehouse (DWH) with row data related to the system operation. The DWH is based on Google BigQuery and exposes facts and dimensions tables.

There are mainly three main domains of data exposed through the DWH:

  • CRM - information about campaigns, communication, conversions, automation rules, etc

  • Gamification - missions, levels, tournaments, mini-games, points, store purchases

  • Affiliation - registration facts, financial transactions, payments, adjustments, affiliates information

Some of these domains have shared information, e.g. bonuses - can be given from the CRM and from Gamification. In the Affiliation domain you will also find CRM data, so you can analyse all communication that is addressed to the affiliates.

Smartico DWH shouldn't be used to run analytical queries directly, as limitations apply to how much data can be retrieved daily.

Instead, you should do a batch load of changes to your analytical system and run reports on your side.

All fact tables are partitioned by date for your convenience so you can efficiently load only the data delta for the completed day. We recommend running the import of the previous day a few hours after midnight in UTC to ensure that last-day data is fully delivered to the fact tables.

Don't apply functions on the partition columns when filtering with 'WHERE' clause. The correct example of taking data for the last day is - 'WHERE fact_date >= TIMESTAMP_TRUNC(TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY), DAY)'

Pay attention to the fact that the data set 'dwh_ext_xxx' is given as an example. The dataset name for your particular setup is the same as the name of the user to whom you will get to access DWH.

Records in the fact tables are immutable (except cases when stated explicitly), which means they are not changing after they are added.

Fact tables are updated in real-time, and dimensional tables are updated once per hour.

Segments export

You can query DWH to get a near-real-time list of users that belong to specific segments.

To make a segment available for exporting, it first needs to be marked as exportable through the DWH in Smartico BackOffice, as shown on the screen below.

After that, segment content can be queried using the following SQL, where XXX needs to be replaced with the ID of your label and YYY with the ID of the segment.


select * from dwh_ext_XXX.fn_export_segment_YYY()

Example of query results

Query for user profile details

You can query DWH to get detailed information about users stored on the Smartico side.


SELECT user_id, user_ext_id, core_registration_date, core_wallet_currency, core_user_language, core_tags, core_public_tags, core_external_markers
FROM dwh_ext_2283.j_user
where ARRAY_LENGTH(core_tags) > 0 and core_registration_date is not null

Note that only a small subset of user profile properties are exposed through the DWH, and only properties that are shared through all the clients. If you have a need to expose specific properly, please contact Smartico support.

Currently exposed properties:

PropertyTypeMeaning

user_id

INT64

ID of the user in Smartico system

user_ext_id

STRING

ID of the user in the integrated platform

core_registration_date

TIMESTAMP

Date/time of registration

core_user_last_time_online

TIMESTAMP

Date/time when user was last time online

ach_level_current_id

INT64

ID of the current gamification level

ach_points_ever

INT64

Amount of points user collected ever

ach_points_balance

INT64

Current points balance of user

user_country

STRING

Country

core_user_language

STRING

Language, ISO code

core_user_last_device_type

STRING

Last used device type (Desktop, Mobile, Native, Wrapper)

core_wallet_currency

STRING

Currency code

core_tags

STRING[]

Array of user markers

core_public_tags

STRING[]

core_external_markers

STRING[]

Transactions deduplication

Note that in rare cases some fact tables may contain duplicated records. This may happen in very exceptional cases and doesn't have impact on operation part of system, so for example if there is duplicated record for the "mail sent" fact, the mail wasn't sent twice, just a fact of sending is duplicated.

You can do a deduplication using the field marked as "PK" in corresponding table.

Dimensional tables

dwh_ext_xxx.dm_activity_type

List of possible "activities" that can be used to build a Flow for the scheduled & real-time campaigns. There are activities like "Send SMS", "Send Mail", "Give Bonus", "Stop campaign" etc.

There are essential activities that exist in any campaign, like

  • Journey start (1) - represents the fact that the campaign started

  • Journey stopped (2) - represents the stop of the campaign

  • Journey Converted (3) - represents the fact that the campaign was converted

Table structure:

ColumnTypeNote

activity_type_id

int64

activity_name

string

dwh_ext_xxx.dm_audience

The list of campaigns. Includes both real-time and scheduled.

Table structure:

ColumnTypeNote

audience_id

int64

audience_name

string

entry_mode_id

int64

0 - ONCE_IN_A_LIFE 1 - ONCE_IN_AN_OPEN_JOURNEY

2 - EVERY_TIME_CONDITION_MET

3 - STOP_AND_START

audience_status_id

int64

1 - Draft 2 - Active 3 - Paused 4 - Disabled 5 - Archived 6 - Executed (for Scheduled only)

audience_exec_type_id

int64

1 - Realtime campaign of Marketing category 2 - Realtime campaign of Operational category 3 - Scheduled campaign of Marketing category 4 - Scheduled campaign of Operational category

category_id

int64

1 - Marketing campaign 2 - Operational campaign

dwh_ext_xxx.dm_resource

Communication resources (templates) of all types, e.g. SMS, popups, emails, etc.

Note that the list includes both - main resources and variations of the main resources

Table structure:

ColumnTypeNote

resource_id

int64

resource_name

string

resource_subject

string

Subject line for Mail, Push, Inbox For SMS will contain a full body of SMS For Popups and IVR will be null, as these resource types don't have a concept close to the 'subject'. Note that string can contain tags that are replaced during the actual building of the resource for specific user

resource_type_id

int64

1 - Mail 2 - Popup 3 - SMS 4 - Push 5 - Inbox 9 - IVR

dwh_ext_xxx.dm_providers_sms

List of all sms sent with their label, id, provider name and type, and create and update times.

ColumnTypeNote

create_date

timestamp

Time when this communication provider is defined

update_date

timestamp

Time when this communication provider is last updated

id

int64

ID of the communication provider

provider_name

string

Name of the exact provider

provider_type

string

Name of the overall service provider

dwh_ext_xxx.dm_providers_mail

List of all emails sent with their label, id, provider name and type, and create and update times.

ColumnTypeNote

create_date

timestamp

Time when this communication provider is defined

update_date

timestamp

Time when this communication provider is last updated

id

int64

ID of the communication provider

provider_name

string

Name of the exact provider

provider_type

string

Name of the overall service provider

dwh_ext_xxx.dm_ach

The table contains information about missions and badges.

ColumnTypeNote

ach_id

int64

ID of mission/badge

internal_name

string

Name of mission/badge

ach_type_id

int64

Type of achievement, 1 - Mission, 2 - Badge

requires_optin

bool

Indicator if mission requires explicit opt-in

limit_in_time_ms

int64

Time in milliseconds for the time limited missions

dwh_ext_xxx.dm_saw_template

.The table contains information about mini-game templates

ColumnTypeNote

saw_template_id

int64

ID of the game template

template_name

string

Name of template

saw_game_type_id

int64

Type of the game: 1 - Spin The Wheel 2 - Scratch & Catch 3 - Match X 4 - Gift box

saw_buyin_type_id

int64

Type of the buy-in: 1 - Free of charge 2 - Gamification points 3 - Spin attempts

is_visitor_mode

bool

If true, indicates that game is targeting visitors, otherwise only registered users

dwh_ext_xxx.dm_shop_item

The table contains information about missions and badges.

ColumnTypeNote

item_id

int64

ID of shop item

item_name

string

Name of item

item_type_id

int64

Type of item: 1 - Bonus 2 - Tangible 3 - Mini-game attempt(s) 4 - Level upgrade

item_status_id

bool

Status: 1 - Draft 2 - Active 3 - Archived

dwh_ext_xxx.dm_saw_prize

The table contains information about missions and badges.

ColumnTypeNote

saw_prize_id

int64

ID of prize

saw_prize_name

string

Name of prize

saw_template_id

int64

ID of the mini-game template to which the prize belongs, see dm_saw_template

prize_value

bool

Value of prize in case it is a type of "gamification points".

prize_type_id

int64

Type of the prize: 1 - No Prize 2 - Points 3 - Bonus 4 - Tangible 5 - Spin in the mini-game 6 - Jackpot 7 - Level upgrade

is_surcharge

bool

If the prize is a surcharge (will be given when no other prizes are left)

dwh_ext_xxx.dm_bonus_template

List of all bonus templates with their label, id, type and name.

ColumnTypeNote

label_id

int64

The id of the label to which this bonus templated belongs

label_bonus_template_id

int64

The id of the bonus template for the specified label

product_bonus_type_id

int64

Id that indicates the bonus type

public_name

string

Bonus template name

dwh_ext_xxx.dm_tournament

The table contains information about tournaments and tournament instances.

id

int64

Primary identifier for the tournament.

name

string

Name of the tournament.

logic_version_id

int64

ID representing the logic version of the tournament.

  1. Based on properties

  2. Based on formula builder

registration_type_id

int64

ID for the registration type of the tournament.

  1. Auto (all users will participate)

  2. Free opt-in

  3. Buy-in with Points

  4. Optin & Manual Approval

  5. Requires Qualification

duration_ms

int64

Duration of the tournament in milliseconds.

players_min_count

int64

Minimum number of players required for the tournament.

players_max_count

int64

Maximum number of players allowed, optional.

tournament_type_id

int64

ID represents the type of tournament.

  1. Scheduled

tournament_status_id

int64

ID indicating the status of the tournament.

  1. Draft

  2. Published

  3. Paused

  4. Archived

last_tournament_create_date

timestamp

Timestamp of when the last tournament was created.

show_before_start_ms

int64

Time in milliseconds to show the tournament before start.

allow_late_registration

bool

Indicates if late registration is allowed.

dwh_ext_xxx.dm_ach_points_change_source

Lookup table with possible sources for the points transactions.

ColumnTypeNote

source_type_id

int64

ID

source_type_name

string

Name of the source

CRM fact tables

dwh_ext_xxx.j_av

Includes all the attribution value data. You can read more about attribute value here

Partitioned by: event_date

Data availability: last 180 days by event_date

Primary key: id

Note: the data can mutate over the time and depends if the deposit events are delivered with delay

ColumnTypeNote

id

int64

The incremental id of the calculation fact

event_date

timestamp

The date when the attribution event (deposit) happened

resource_id

int64

FK: dm_resource,

main resource that was used for the communication

audience_id

int64

FK: dm_audience,

the campaign that triggered communication

value

numeric

The attribution amount in the label base currency that resource+campaign generated.

raw_value

numeric

The raw deposit amount in the original deposit currency

converted_value

numeric

The deposit amount converted to the label base currency

event_id

int64

The id of the attribution event (deposit) on smartico side

engagement_id

int64

Reference to the j_engagements, represents the point in the Campaign/Flow from where the communication was triggered

root_engagement_id

int64

Reference to the j_engagements, represents the "root" point in the Campaign/Flow from where the communication was triggered

kpi_impact_id

int64

The id of the attribution value impact 1 - Because of resource delivery

2 - Because the user has an impression on the resource

3 - Because the user has interaction with the resource (click) 4 - Indirect impact, when nothing from above happened

default_currency

string

The base currency of the label

user_id

int64

The user id of the user who made the deposit

calculated_ts

timestamp

Timestamp when the calculation happened

Example of query that returns attribute value per campaign (audience_id)

SELECT  audience_id, sum(value) as value
FROM dwh_ext_xxx.j_av
WHERE event_date >= TIMESTAMP("2024-07-01")
group by audience_id

Example of query that returns conversion of specific user per campaign, including the time of conversion

with x as (
  select audience_id, event_date, sum(value) as av, 
  from dwh_ext_xxx.j_av
  where user_id = 69726604
  group by audience_id, event_date
)
select a.audience_id, a.audience_name, x.av, x.event_date
from x
inner join dwh_ext_xxx.dm_audience a on x.audience_id = a.audience_id
order by x.event_date desc

dwh_ext_xxx.j_communication

This includes all communications sent to the end-user and all possible outcomes (fact_type_id) of these communications, like failure of delivery, impressions, clicks, etc.

Ideas to understand communications concept

  • the communication is always triggered by some "activity" from the campaign. This activity is referenced by engagement_id or engagement_uid. Such "activity" is part of the campaign, where the starting point of the campaign (the fact that the user entered into the campaign) is presented by the root_engagement_uuid & root_engagement_create_date. The campaign definition itself if presented by root_audience_id

  • The first fact that happens for communication is CREATE, after that, it is usually either SENT and DELIVERED or FAILED. It can be later seen by the end-user (IMPRESSION) and CLICKED. Users can also OPTOUT from the communication by following the "opt-out" link in the resource. For Inbox type of message, the user can also READ it and DELETE

  • The communication is always connected to some resource (template), presented as resource_id, and in case the resource has variations, by the resource_variation_id

  • Note that smartico is tracking the fact of any communication only 30 days after it's created

Partitioned by: fact_date

Data availability: last 180 days by fact_date

Primary key: engagement_id + fact_type_id + fact_date

Table structure:

ColumnTypeNote

fact_date

timestamp

The date/time when fact happened

user_id

int64

fact_type_id

int64

CREATE = 1 SENT = 2 IMPRESSION = 3 CLICK = 4 FAIL = 5 DELIVERED = 6 OPTOUT = 7 READ = 8 (for Inbox only) DELETE = 9 (for Inbox only)

fact_details

string

for CLICK fact - the link that was clicked for FAIL - the detailed reason for fail

activity_type_id

int64

Type of activity that triggered that communication (FK: dm_activity_type)

engagement_uid

uuid

Reference to the j_engagements, represents the point in the Campaign/Flow from where the communication was triggered

resource_id

int64

FK: dm_resource, main resource

resource_variation_id

int64

FK: dm_resource, specific variation of the resource

root_audience_id

int64

FK: dm_audience, the campaign that triggered communication

root_engagement_uuid

uuid/string

Reference to the j_engagements, represents the "root" point in the Campaign/Flow from where the communication was triggered

root_engagement_create_date

timestamp

The date when the campaign that triggered specific communication is started for the user

engagement_id

int64

Similar to engagement_uid, but in the int64 presentation

communication_id

int64

The unique ID of the communication will be the same for all facts through which the communication is transitioning. The ID id generated on the fact CREATE(1)

audience_category_id

int64

Category of the campaign. 1 - Marketing 2 - Operational

external_id

string

ID of the communication assigned by the external gateway at the moment when the external gateway accepted communication from Smartico

label_provider_id

int64

ID of the communication provider (dm_providers_mail and dm_providers_sms)

planned_send_date

timestamp

In some setups, the campaign that creates communication is scheduling delivery for a later time, in such case planned_send_date will reflect this date

fail_reason_id

int64

ID of the fail reason, FK: dm_com_fail_reason

user_ext_id

string

The ID of the user in the operator system

crm_brand_id

int64

ID of the brand in the Smartico system

Example of query that returns counts of communication by status and provider for the past 7 days.

SELECT
    COALESCE(sm.provider_name, ml.provider_name) AS provider_name,
    DATE(c.fact_date) AS communication_date,
    CASE 
        WHEN c.fact_type_id = 1 THEN 'CREATE'
        WHEN c.fact_type_id = 2 THEN 'SENT'
        WHEN c.fact_type_id = 3 THEN 'IMPRESSION'
        WHEN c.fact_type_id = 4 THEN 'CLICK'
        WHEN c.fact_type_id = 5 THEN 'FAIL'
        WHEN c.fact_type_id = 6 THEN 'DELIVERED'
        WHEN c.fact_type_id = 7 THEN 'OPTOUT'
        WHEN c.fact_type_id = 8 THEN 'READ'
        WHEN c.fact_type_id = 9 THEN 'DELETE'
        ELSE 'UNKNOWN'
    END AS status_name,
    COUNT(*) AS status_count
FROM
    `dwh_ext_2283.j_communication` c
LEFT JOIN
    `dwh_ext_2283.dm_providers_sms` sm ON c.label_provider_id = sm.id
LEFT JOIN
    `dwh_ext_2283.dm_providers_mail` ml ON c.label_provider_id = ml.id
JOIN
    `dwh_ext_2283.dm_resource` r ON c.resource_id = r.resource_id
WHERE
    c.fact_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
    AND r.resource_type_id IN (1, 3) -- Filter for Mail (1) and SMS (3)
GROUP BY
    communication_date, provider_name, status_name
ORDER BY
    communication_date DESC, status_count DESC;

dwh_ext_xxx.j_engagements

The table represents the steps of how the user enters the campaign, exits it, converts, and goes through all the "activities" of the Flow defined in the campaign.

The table represents the recursive structure of user transition over the campaign graph.

For most of the use cases, you will need to analyze only records that have activity_type_id as one of

  • 1 - represents the start of the campaign

  • 2 - stop the campaign

  • 3 - conversion of campaign

The table structure below describes only essential fields that could be used to analyze the campaign performance. If you need a deeper analysis, please contact support@smartico.ai for an explanation.

Partitioned by: create_date

Data availability: last 180 days by create_date

Primary key: engagement_id or engagement_uid

Table structure:

ColumnTypeNote

engagement_id

int64

unique id

engagement_uid

uuid/string

same as unique id, but uuid presentation

root_audience_id

int64

FK: dm_audience

create_date

timestamp

time when user got this activity

user_id

int64

root_engagement_id

int64

reference to the same table pointing to the starting point of this campaign for the specific user

root_engagement_create_date

timestamp

The date when the campaign is started for a specific user

activity_type_id

int64

FK: dm_activity_type

user_ext_id

string

The ID of the user in the operator system

resource_id

int64

ID of the resource if the activity is related to the communication. Important that the field is populated retrospectively every night at 3 UTC

expected_stop_date

timestamp

Expected stop date/time of the campaign, set as create_date of the activity_type_id = 1 (start of the campaign) + campaign duration set in the BackOffice. Note that the campaign can be forcefully stopped inside the flow using "Stop" activity

ab_target

boolean

Indicating if the current element in the campaign flow was set as an AB test target for the communication element followed this target

Example of query that returns

  • data of label 2305

  • users entered campaign 793485 after 20/11/2023

  • and converted in this campaign

   WITH SRC AS (
        SELECT 
            engagement_id,
            root_audience_id,
            root_engagement_id,
            root_engagement_create_date,
            user_id,
            engagement_uid,
            activity_type_id,
            create_date,
            from_control_group,
            activity_id
        FROM dwh_ext_xxx.j_engagements
        WHERE label_id = 2305 AND activity_type_id in (1,2,3)
            AND root_audience_id IN (793485)
            AND create_date >= '2023-11-20T00:00:00Z'
            AND root_engagement_create_date >= '2023-11-20T00:00:00Z'
            -- for the performance, you can limit upper date only with 30 days from the 'to', this is MAX duration of campaign
            AND create_date <= TIMESTAMP_ADD('2023-11-21T23:59:59Z', INTERVAL 31 DAY)
            AND root_engagement_create_date <= '2023-11-21T23:59:59Z'
    ),
    PREFINAL as (
        SELECT 
            SRC.root_audience_id, SRC.root_engagement_id, user_id,
            MAX(CASE WHEN SRC.activity_type_id = 1 then SRC.engagement_uid else null end) as root_engagement_uid,
            MAX(CASE WHEN SRC.activity_type_id = 1 then SRC.create_date ELSE NULL END) as start_date,
            MAX(CASE WHEN SRC.activity_type_id = 3 then SRC.create_date ELSE NULL END) as convert_date,
            MAX(CASE WHEN SRC.activity_type_id = 2 then SRC.create_date ELSE NULL END) as stop_date,   
            MAX(CASE WHEN SRC.activity_type_id = 1 THEN SRC.from_control_group ELSE NULL END) as from_control_group
        FROM SRC
        WHERE true
        GROUP BY SRC.root_audience_id, SRC.root_engagement_id, SRC.user_id
    ),
    FINAL as (
        SELECT 
            COUNT(*) OVER() AS cnt, ROW_NUMBER() OVER(ORDER BY PREFINAL.start_date DESC) AS row,
            PREFINAL.*,
        FROM PREFINAL
        WHERE true AND convert_date IS NOT NULL
    )
    SELECT *
    FROM FINAL

dwh_ext_xxx.j_bonuses

The table represents the history of bonuses given to users, including all the attempts to issue bonuses through the API provided by the platform.

Each bonus fact can be represented in the table multiple times, where each time it reflects the change in the lifecycle of the bonus. The status of the bonus is refllected in the column bonus_status_id with possible values:

  • 1 - NEW

  • 3 - REDEEMED

  • 4 - REDEEM_FAILED

Partitioned by: fact_date

Data availability: last 180 days by fact_date

Primary key: bonus_id

Table structure:

ColumnTypeNote

fact_date

timestamp

the date of the fact

bonus_id

int65

unique ID of the transaction preserved for all facts of this transaction

user_id

int64

Smartico user ID

user_ext_id

string

The ID of the user in the operator system

crm_brand_id

int64

ID of the brand in the Smartico system

bonus_status_id

int64

status of bonus as explained above

label_bonus_template_id

int64

ID of template in the smartico system

source_product_id

int64

ID of the sub-system that issued the bonus, e.g. 0 - Campaign 9 - Automation rule 100 - Gamification

125 - Dynamic Bonuses engine

126 - Manual bonus 5000 - Store 8000 - Tournament 8001 - Mini-game for Retention user 8002 - Mini-game for Visitor

entity_id

int64

ID of the entity related to the source_product_id that issued the bonus. e.g. If issued by the Automation Rule, this ID will represent ID of the rule.

root_audience_id

int64

Similar to the entity_id, but should be used for the Campaign reference

error_code

int64

Error code in case of failed to issue bonus

error_message

string

The detailed error message

Example of query that returns final statuses of bonuses in the defined period

WITH MAIN AS (
    SELECT
        JB.fact_date as create_date,
        JB.bonus_id,
        JB.user_id,
        JB.label_id,
        JB.crm_brand_id,
        JB,source_product_id,
        JB,source_product_ref_id,
        JB,root_audience_id,
        JB,bonus_status_id,
        JB,label_bonus_template_id,
        JB,entity_id,
        RANK() OVER (PARTITION BY bonus_id ORDER BY fact_date DESC) AS rank_desc,
        RANK() OVER (PARTITION BY bonus_id ORDER BY fact_date ASC) AS rank_asc
    FROM dwh_ext_2283.j_bonuses JB
    WHERE JB.fact_date >= '2024-04-01T00:00:00Z' AND JB.fact_date < '2024-04-15T23:59:59.000Z'
),
FINAL AS (
    SELECT
        A.*,
        B.create_date AS bonus_create_date,
        B.entity_id AS entity_id
    FROM MAIN A
    LEFT JOIN MAIN B ON B.rank_asc = 1 AND A.bonus_id = B.bonus_id
    WHERE A.rank_desc = 1 AND (B.create_date >= '2024-04-01T00:00:00Z' AND B.create_date < '2024-04-15T23:59:59Z')
        AND true
)
SELECT *
FROM FINAL

Gamification fact tables

dwh_ext_xxx.g_ach_completed

The table represents the history of all completed missions & badges.

Partitioned by: create_date

Data availability: last 180 days by create_date

Primary key: ach_completed_id

ColumnTypeNote

ach_id

INT64

ID of the mission/badge

create_date

TIMESTAMP

The time when mission/badge compelted

is_recurring

BOOLEAN

Indicate if the original mission was recurring

user_ext_id

STRING

External ID of the user

crm_brand_id

INT64

ID of the brand in the Smartico system

ach_completed_id

INT64

Unique ID of the completion fact

Example of the query returning a count of users completed missions in the last 3 days

select A.internal_name, A.ach_id, count(*) as completed_users_count
from dwh_ext_xxx.g_ach_completed M 
inner join dwh_ext_xxx.dm_ach A ON M.ach_id = A.ach_id 
WHERE M.create_date > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 3 DAY)
GROUP BY A.internal_name, A.ach_id

dwh_ext_xxx.g_ach_optins

The table represents the history of all opt-ins in the missions that require explicit opt-ins to participate.

Partitioned by: create_date

Data availability: last 180 days by create_date

Primary key: ach_id, user_id, create_date

ColumnTypeNote

ach_id

INT64

ID of the mission

create_date

TIMESTAMP

The time when user opted in the mission

user_id

INT64

user id in the smartico system

user_ext_id

STRING

External ID of the user

crm_brand_id

INT64

ID of the brand in the Smartico system

Example of the query returning up to 1000 rows of missions that were active in the past 30 days with counts of opted in customers and customers that completed the mission in that period.

SELECT 
    A.internal_name,
    A.ach_id,
    COUNT(DISTINCT O.user_id) AS opted_in_users_count,
    COUNT(DISTINCT C.user_ext_id) AS completed_users_count
FROM 
    dwh_ext_xxx.dm_ach A
LEFT JOIN
    dwh_ext_xxx.g_ach_optins O ON A.ach_id = O.ach_id AND O.create_date > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
LEFT JOIN
    dwh_ext_xxx.g_ach_completed C ON A.ach_id = C.ach_id 
WHERE
    A.requires_optin = TRUE 
    AND (C.create_date > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
         OR O.create_date > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY))
GROUP BY 
    A.internal_name, A.ach_id
    limit 1000

dwh_ext_xxx.g_ach_claimed

The table represents facts of prizes claiming for the missions for which a claim is required.

Read more about "claiming" in the Missions guide.

Partitioned by: create_date

Data availability: last 180 days by create_date

Primary key: ach_completed_id

ColumnTypeNote

ach_id

INT64

ID of the mission

create_date

TIMESTAMP

The time when user claimed the mission

user_id

INT64

user id in the smartico system

user_ext_id

STRING

External ID of the user

crm_brand_id

INT64

ID of the brand in the Smartico system

ach_completed_id

INT64

Reference to g_ach_completed that represents completed missions

dwh_ext_xxx.g_ach_points_change_log

The table represents all changes in the gamification points

Partitioned by: create_date

Data availability: last 180 days by create_date

Primary key: NA

ColumnTypeNote

create_date

TIMESTAMP

The time of transaction

source_type_id

INT64

Type of of the source of the change, refer to dm_ach_points_change_source

source_entity_id

INT64

Id of the entity that triggered transaction, the lookup table depends on the source_type_id

points_collected

INT64

Amount of points in transaction, could be negative in case of deduction

user_points_ever

INT64

Amount of points user collected ever after the transaction

user_points_balance

INT64

Current points balance after transaction

user_ext_id

STRING

External ID of the user

crm_brand_id

INT64

ID of the brand in the Smartico system

Example of query showing the source of the points credit and deposit

select S.source_type_name, 
	sum( CASE WHEN points_collected > 0 THEN points_collected ELSE 0 END) as points_credit,
	sum( CASE WHEN points_collected < 0 THEN -1 * points_collected ELSE 0 END) as points_debit
from dwh_ext_12078.g_ach_points_change_log L 
inner join dwh_ext_12078.dm_ach_points_change_source S ON L.source_type_id = S.source_type_id 
WHERE L.create_date  > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 3 DAY)
group by S.source_type_name

dwh_ext_xxx.g_minigames

The table represents mini-game gameplay results.

Partitioned by: create_date

Data availability: last 180 days by create_date

Primary key: spin_id

ColumnTypeNote

create_date

TIMESTAMP

The time of transaction

saw_template_id

INT64

ID of mini-game template, refer to dm_saw_template

saw_prize_id

INT64

ID of the won prize, refer to dm_saw_prize

prize_type_id

INT64

Type of the prize: 1 - No Prize 2 - Points 3 - Bonus 4 - Tangible 5 - Spin in the mini-game 6 - Jackpot 7 - Level upgrade

user_ext_id

STRING

External ID of the user

crm_brand_id

INT64

ID of the brand in the Smartico system

buyin_cost_points

INT64

The price of a mini-game attempt paid by the user for the mini-games that cost gamification points (there are other types of mini-games, that are free or "attempt" based)

prize_amount

NUMERIC

Winning amount, in case the prize is the type of Jackpot or Points

pool_after

INT64

The number of the items of this prize that is left in the stock after the user wins the prize identified by saw_prize_id

spin_id

INT64

Unique ID of the game play fact

saw_prize_note

STRING

Reserved for future use

dwh_ext_xxx.g_tournament_analytics

The table contains information about various tournaments and relevant analytics events.

Each tournament can be represented in the table multiple times, where each time it reflects different tournament instances (different runs).

Note that there is data nested under progress.

Partitioned by: event_time

Data availability: last 180 days by create_date

Primary key: id

Column

Type

Note

create_date

TIMESTAMP

Timestamp of when the record was created.

update_date

TIMESTAMP

Timestamp of when the record was last updated.

id

INT64

Primary identifier for the tournament.

name

STRING

Name of the tournament, optional.

logic_version_id

INT64

ID representing the logic version of the tournament.

registration_type_id

INT64

ID for the registration type of the tournament.

  1. AUTO

  2. BUY-IN

  3. OPT-IN

  4. BUY-IN-CASH

duration_ms

INT64

Duration of the tournament in milliseconds.

players_min_count

INT64

Minimum number of players required for the tournament.

players_max_count

INT64

Maximum number of players allowed, optional.

tournament_type_id

INT64

ID representing the type of the tournament.

  1. ONE TIME

  2. REPETITIVE

  3. SIT & GO

tournament_status_id

INT64

ID indicating the status of the tournament.

  1. Archived

  2. Draft

  3. Paused

  4. Published

last_tournament_create_date

TIMESTAMP

Timestamp of when the last tournament was created.

show_before_start_ms

INT64

Time in milliseconds to show the tournament before start.

allow_late_registration

BOOLEAN

Indicates if late registration is allowed.

event_id

INT64

ID of the associated tournament analytics event.

event_time

TIMESTAMP

Time when the analytics event was recorded.

user_id

INT64

ID of the user associated with the event.

event_type_id

INT64

ID representing the type of the analytics event.

progress

RECORD

Repeated record indicating progress details related to the event.

tournament_id

INT64

ID of the tournament associated with the analytics event.

tournament_instance_id

INT64

ID of the specific instance of the tournament associated with the analytics event.

scores

INT64

Scores attained during the tournament event.

total_scores

INT64

Total accumulated scores during the tournament event.

rule_id

INT64

ID of the rule applied during the tournament event.

crm_brand_id

INT64

ID of the brand in the CRM system.

user_ext_id

STRING

External ID of the user associated with the analytics event.

This example provides a comprehensive summary of each tournament instance, focusing on score distributions and rule applications. It allows for easy comparison between different tournament instances and can help identify patterns in performance and rule usage across various tournaments.

The results are ordered by tournament_id and then tournament_instance_id, making it easy to view all instances of a particular tournament together. This structure can be particularly useful for analyzing trends or changes in tournament performance over time or across different instances of the same tournament.

WITH unnested_data AS (
  SELECT
    tournament_id,
    tournament_instance_id,
    p.scores,
    p.total_scores,
    p.rule_id
  FROM
`dwh_ext_xxx.g_tournament_analytics`,
    UNNEST(progress) AS p
  WHERE
    event_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
)
SELECT
  tournament_id,
  tournament_instance_id,
  AVG(scores) AS avg_scores,
  MAX(scores) AS max_scores,
  MIN(scores) AS min_scores,
  AVG(total_scores) AS avg_total_scores,
  MAX(total_scores) AS max_total_scores,
  MIN(total_scores) AS min_total_scores,
  COUNT(DISTINCT rule_id) AS unique_rules_applied
FROM
  unnested_data
GROUP BY
  tournament_id,
  tournament_instance_id
ORDER BY
  tournament_id,
  tournament_instance_id;

dwh_ext_xxx.g_shop_transactions

The table represents all purchases in the store.

Partitioned by: create_date

Data availability: last 180 days by create_date

Primary key: transaction_id

ColumnTypeNote

transaction_id

INT64

Unique ID of transaction

create_date

TIMESTAMP

The time of transaction

shop_item_id

INT64

ID of the shop item, refer to dm_shop_item

points_amount

INT64

The price of the item in the gamification points

user_ext_id

STRING

External ID of the user

crm_brand_id

INT64

ID of the brand in the Smartico system

dwh_ext_xxx.g_ux

The table represents the UI interactions of users with the Gamification widget

Partitioned by: create_date

Data availability: last 180 days by create_date

Primary key: NA

ColumnTypeNote

label_id

integer

Unique id of the label

crm_brand_id

integer

Unique id of the brand. Relevant only for multi-brand setup

create_date

timestamp

Date that represents exactly when the user performed an action (opened a modal or navigated to a some menu section)

user_ext_id

string

Unique id of user in the integrated system

user_id

integer

Unique id of user in the Smartico system

interface_type

integer

Device type of the current user session 0 - Mobile 1 - Desktop

screen_name_id

integer

Identifier of the screen the user has navigated to. Possible values are shown in the table below - ScreenNameIds

screen_subname_id

integer

Identifier of the category/section of the screen the user has navigated to. Possible values are presented in the table below and depend on the screen_name_id. Relevant only for screens that have categorized tabs, like store, tournaments, etc

custom_section_id

integer

Represents custom section id if the current screen is opened in the context of the custom section.

entity_id

integer

Represents the ID of the entity the user has opened on the current screen. For example: a specific item in the store or a particular tournament.

ScreenNameIds:

IDName of screen

1

Overview / Home

2

Missions screen_subname_id could be 0 - overview 2 - available 3 - locked 4 - completed 5 - missed

3

Leaderboards screen_subname_id could be 1 - daily 2 - weekly 3 - monthly

4

Tournaments list screen_subname_id could be 0 - overview 3 - in progress 4 - finished 5 - my tournaments

5

Tournament lobby (details of specific tournament) screen_subname_id could be 1 - rules 2 - Prizes 3 - Leaderboard 4 - Related games

6

Levels (grid presentation)

7

Levels (map presentation)

8

Badges

9

Bonuses screen_subname_id could be 2 - pending 3 - redeemed

10

Spin a Wheel

11

Store

12

Store (purchase history)

13

Inbox screen_subname_id could be 1 - All message 2 - Favorite

14

Match X

15

Quiz

16

Jackpots

17

Lootbox

18

Custom section

31

Mission item modal

32

Level item modal

33

Badge item modal

34

Store item modal

35

Bonus item modal

36

Jackpot item modal

37

Username change modal

38

Avatar change modal

dwh_ext_xxx.g_ach_levels_changed

The table represents the log of level upgrades/downgrades for each user.

Partitioned by: fact_date

Data availability: last 180 days by fact_date

Primary key: level_change_id

ColumnTypeNote

fact_date

TIMESTAMP

The time of transaction

level_change_id

INT64

Unique ID of the level change fact

from_level_id

INT64

ID of the level before transaction

to_level_id

INT64

ID of the level after transaction

points_change_source_id

INT64

If level change is triggered by points balance change, then this field represents the source of points change. Possible IDs:

    Journey = 1,
    AchievementTaskCompletion = 2,
    AchievementCompletion = 3,
    LevelsStructureChange = 4,
    StorePurchase = 5,
    ManualAdjustment = 6,
    Leaderboard = 7,
    Tournament = 11,
    AutomationRule = 12,
    TournamentRegistration = 13,
    TournamentRegistrationCancellation = 14,
    RefundPoints = 15,
    PlayMiniGame = 16,
    WinMiniGame = 17,

points_collected_on_level_change

INT64

If level change is triggered by points balance change, then this field represents the amount of points added to the user at the moment of upgrade

user_points_ever

INT64

"Points ever" balance of user after level hcange

user_ext_id

STRING

External ID of the user

crm_brand_id

INT64

ID of the brand in the Smartico system

Affiliation dimensional & fact tables

This is an example of the query that calculates total commission split by components (CPA, CPL, RevShare) for all affiliates lifetime.

WITH X as (
  SELECT 
    A.affiliate_id, A.parent_affiliate_id,
    COALESCE(SUM(cpa_fee),0) as cpa_fee,
    COALESCE(SUM(cpl_fee),0) as cpl_fee,
    COALESCE(SUM(rev_share_fee),0) as rev_share_fee,
    COALESCE(SUM(sub_affiliate_fee),0) as sub_affiliate_fee
  FROM dwh_ext_XXX.aff_fin_history_daily F
  INNER JOIN dwh_ext_XXX.dm_affiliate A ON F.affiliate_id = A.affiliate_id
  WHERE operation_date > '2000-01-01 00:00:00 UTC'
  GROUP BY A.affiliate_id, A.parent_affiliate_id
),
SUB_SUMS as (
  SELECT parent_affiliate_id, SUM(X.sub_affiliate_fee) as sub_affiliate_fee
  FROM X
  GROUP BY parent_affiliate_id
  HAVING SUM(X.sub_affiliate_fee) > 0
),
FINAL as (
  select TOP_AFFS.* EXCEPT(sub_affiliate_fee, parent_affiliate_id), SUB_SUMS.sub_affiliate_fee, 
    TOP_AFFS.cpa_fee + TOP_AFFS.rev_share_fee + TOP_AFFS.cpl_fee + COALESCE(SUB_SUMS.sub_affiliate_fee,0) as total_commission
  from X as TOP_AFFS
  left join SUB_SUMS ON TOP_AFFS.affiliate_id = SUB_SUMS.parent_affiliate_id
)
SELECT * 
FROM FINAL
ORDER BY total_commission DESC

Notes:

  1. Provided example of calculating of sub_affiliate_fee should be used only in the setup with one level of sub-affiliates network (when only direct parent of affiliate is getting attribution from the comission generated by his children). In case of multi-level affiliation network setup, the sub-affailites commissions should be calculated in the different way

  2. These calculations are not taking into accounts adjustments that can be done in 3 differences cases:

    1. Manual adjustments

    2. Adjustments based on the "Tiers" based deal structures (RevShare tiers)

    3. Adjustment done beginning of the month based on "No negative cary over" calculations

dwh_ext_xxx.dm_affiliate

The table represent information about affiliate

ColumnTypeNote

affiliate_id

INT64

ID of the affiliate

create_date

TIMESTAMP

Date when affiliate is created

parent_affiliate_id

INT64

ID of the parent affiliate

manager_id

INT64

ID of the manager

payment_method_id

INT64

ID of payment method

ext_affiliate_id

STRING

ID of affiliate in the external system

dwh_ext_xxx.aff_fin_history_daily

The table represents daily aggregation of all financial data system collects about a player. Each column refers to some type of financial data and reflect the SUM of all numbers affilaite system have received for some "operation_date". To understand the final numbers for a player, you should SUM all records for a player, for example if you need to know the total deposit amount of a player, you should SUM all "deposits" + "first_deposit" columns for a player.

Partitioned by: operation_date

Table structure:

ColumnTypeNote

ext_customer_id

STRING

Player ID the way it's reported to affiliate system

brand_id

INTEGER

Brand ID in affiliate system

country

STRING

ISO2 country code of the player

affiliate_id

INTEGER

Affiliate ID in case player belongs to one

registration_id

INTEGER

TAP registration id

operation_date

TIMESTAMP

financial day

update_date

TIMESTAMP

last update date

first_deposit

NUMERIC

first deposit amount

deposits

NUMERIC

deposits amount reported during operation_date

withdrawals

NUMERIC

withdrawals amount reported during operation_date

chargebacks

NUMERIC

chargebacks amount reported during operation_date

bonuses

NUMERIC

bonuses amount reported during operation_date

deductions

NUMERIC

deductions amount reported during operation_date

bets

NUMERIC

bets amount reported during operation_date

wins

NUMERIC

wins amount reported during operation_date

operations

NUMERIC

operations count reported during operation_date

ggr

NUMERIC

GGR amount reported during operation_date

cpl_fee

NUMERIC

CPL fee generated during operation_date

cpa_fee

NUMERIC

CPA fee generated during operation_date

rev_share_fee

NUMERIC

RevShare fee generated during operation_date

sub_affiliate_fee

NUMERIC

Total sub affiliate fee generated during operation_date for all affiliate levels

net_pnl

NUMERIC

NET PnL calculated during operation_date

deposit_count

NUMERIC

deposits count reported during operation_date

withdrawal_count

NUMERIC

withdrawals count reported during operation_date

q_cpa

NUMERIC

if value is 1, it means that at this operation_date player was qualified for CPA

q_cpl

NUMERIC

if value is 1, it means that at this operation_date player was qualified for CPL

How to connect to the Google BigQuery SQL

Smartico leverages Google BigQuery to offer robust and scalable data analytics capabilities. As part of our service, we provide a secure JSON key file that enables programmatic access to our DWH. This section outlines how to use the provided JSON key to connect to BigQuery, depending on the Business Intelligence (BI) tool you choose.

Overview of JSON Key Usage

A JSON key file is used as a credential for authentication. It allows secure access to Google BigQuery without needing user-specific Google credentials. The JSON key file you receive will be associated with a specific service account created and managed by our company, which has been granted access to necessary BigQuery resources.

Important Security Note

Please store your JSON key file securely and ensure it is only accessible to those who require it. Do not share the JSON key publicly or with unauthorized users, as it provides direct access to our Data Warehouse.

Connecting to BigQuery Using Various BI Tools

Different BI tools have varied methods for setting up connections to BigQuery using a JSON key. Below, we offer guidance on connecting some popular BI tools to our DWH. For detailed instructions, refer to the specific documentation for the BI tool you are using.

Note: you shouldn't run analytical queries directly on Smartico DWH, as it may incur high costs. The proper way is to build a pipeline that loads daily data in your internal DWH and build reports on top of it.

Google Data Studio

  • Connection Type: Direct integration with Google services.

  • Key Usage: Typically, Data Studio uses OAuth 2.0 for authentication, which does not require a service account JSON key if using a Google account with access rights.

Tableau

  • Connection Type: Direct connection using a service account.

  • Key Usage: Upload your JSON key file during the setup of the BigQuery connection in Tableau.

Looker

  • Connection Type: Direct connection using a service account.

  • Key Usage: Paste the contents of the JSON key file into the appropriate field when configuring the BigQuery connection in Looker.

Microsoft Power BI

  • Connection Type: Connect using ODBC drivers.

  • Key Usage: Use the Simba ODBC Driver for Google BigQuery and specify the path to your JSON key file in the DSN configuration.

Qlik Sense

  • Connection Type: Connect using ODBC drivers.

  • Key Usage: Configure a System DSN using the ODBC Data Source Administrator on Windows, including the JSON key file.

FAQ

How to analyze DWH usage

You can find 2 reports related to DWH usage in the Reports \ Analytics section of Smartico BackOffice.

DWH: requests count & data read for the current month - is showing the requests count and amount of data read per day

DWH: queries log for last 24 hours - will show you exact requests completed in the last 24 hours and the amount of data read for each

Can't access DWH from Power BI, error: Unable to authenticate with Google BigQuery Storage API. Check your account permissions.

You need to disable the usage of Storage API in Power BI settings following this guide - https://learn.microsoft.com/en-us/power-query/connectors/google-bigquery.

"Use Storage API" - A flag that enables using the Storage API of Google BigQuery. This option is true by default. This option can be set to false to not use the Storage API and use REST APIs instead.

How to authenticate in Power BI with the service account

Use the following guide from Microsoft - https://learn.microsoft.com/en-us/power-query/connectors/google-bigquery

  • Service Account Email: must be in email format

  • Service Account JSON key file contents: once this JSON key is downloaded, all new lines must be removed from the file so that the contents are in one line. Once the JSON file is in that format, the contents can be pasted into this field.

Pay attention the content of JSON file need to be manually formatted in one line

Can't access table dwh_ext_xxx.j_engagements and any other

You need to replace "xxx" value with the label_id specific for your setup. You can also see the label_id in the name of the BigQuery user that has the format l"dwh_ext_1234"

Can different labels and brands have access using the same DWH account?

Every label requires a separate account to access DWH.

You could have many brands under one label setup. Data related to all these labels will be available from the respective DWH account for this label

The documentation says not to use the DW directly for analytical queries, as it may involve costs. Is the best way to carry out a daily data load to an internal DW?

The best approach is to make a daily incremental load from Smartico DWH to your DWH.

Dimensional tables (prefixed as “dm”) are quite small, and you can reload them fully every day.

Other tables (facts tables), are partitioned by date/time, so you can run every day at 1 AM UTC, and get fresh data for the previous day, and load it to your DWH

When using a BI tool such as Power BI in the import model, can I connect directly to the DWH? Is there any financial or performance impact?

If you use Power BI or any other tool to query data directly from Smartico DWH, this may incur a high load and additional costs depending on how complex reports you are doing. Best way is to load data from Smartico DWH to your DWH and connect Power BI to your DWH.

Regarding the "import" mode in Power BI - according to the Microsoft documentation, it can work in two ways - "Full refresh" and "Incremental refresh". Based on the names, the best way is to use "Incremental way" in order to avoid high resource usage, but for exact setup please consult with an expert in the Power BI tool.

Is there a limit on the number of queries that does not generate this extra cost?

There is a limit of 1 TB of scanned data per month (it reflects the way how Google is building pricing for BigQuery usage). There is no limit in number of queries.

Reading the documentation, there are some details we'd like to know: Is the data in the DWH always available for a window of the last 180 days?

Yes, fact tables are limited to last 180 days. If you do incremental/daily load of data finally, you can store unlimited in terms of time, data on your side.

Can we get data from all the campaigns within this DWH? Information related to the segmentations we used, results by channel (sms, email, push, etc), value assigned to the active group and control group, CTR, conversion rate, etc?

You can get results from communication channels like SMS, email, push, etc.

The fact table for communication channels - j_communication, includes all fails, clicks, and impressions, so you can build derivative metrics like CTR, Delivery rate, Fail rate, etc.

j_engagements fact table represents information about campaigns, you can build conversion rate for campaigns using it.

Segment exporting is available through the table-function fn_export_segment, you can find details in the documentation above.

Last updated