This document is a reference for the V2 Segmentation API. The V2 Segmentation API allows you to create and manage segments using a subset of SQL that includes statements and operations necessary to define any segment. In addition, this version of the Segmentation API exposes the contact_data
and event_data
tables to enable SQL-compatible queries.
_10SELECT contact_id, updated_at_10FROM <table_reference> [ AS <table_alias> ]_10[ [ INNER ] JOIN <table_reference> ON contact_id ]_10[ WHERE <predicate> ]
In the previous example, the contact_id
and updated_at
columns are selected in the query. These two columns must be selected, and aggregate function MAX and its alias can be selected part of Last Event queries. Other than those columns, no additional columns can be selected. When joining two tables, they must be joined on the contact_id
column only because it is the foreign key used to connect the contact_data
and event_data
tables.
SELECT
FROM
JOIN
(defaults to
INNER JOIN
)
INNER JOIN
ON
WHERE
[NOT] IN
IS [NOT]
AND
OR
NOT
=
>
>=
<
<=
!=
+
-
*
/
%
Column Name | Data Type |
---|---|
CONTACT_ID | VARCHAR(36) |
VARCHAR(254) | |
PHONE_NUMBER_ID | VARCHAR |
EXTERNAL_ID | VARCHAR(254) |
ANONYMOUS_ID | VARCHAR(254) |
FIRST_NAME | VARCHAR(50) |
LAST_NAME | VARCHAR(50) |
ALTERNATE_EMAILS | VARCHAR ARRAY |
ADDRESS_LINE_1 | VARCHAR(100) |
ADDRESS_LINE_2 | VARCHAR(100) |
CITY | VARCHAR(60) |
STATE_PROVINCE_REGION | VARCHAR(50) |
POSTAL_CODE | VARCHAR(60) |
COUNTRY | VARCHAR(55) |
LIST_IDS | VARCHAR ARRAY |
CREATED_AT | TIMESTAMP |
UPDATED_AT | TIMESTAMP |
EMAIL_DOMAINS | VARCHAR ARRAY |
PHONE_NUMBER | VARCHAR(24) |
VARCHAR(254) | |
LINE | VARCHAR(254) |
VARCHAR(254) | |
UNIQUE_NAME | VARCHAR(254) |
CUSTOM_FIELDS * | Each custom field will appear as a column on the CONTACT table based on its name. Valid data types are: VARCHAR, NUMBER, TIMESTAMP |
Event data that can be optionally be used to enhanced segments
Column name | Data type |
---|---|
CONTACT_ID | VARCHAR(36) |
EVENT_SOURCE | VARCHAR(256) |
EVENT_TYPE | VARCHAR(256) |
TIMESTAMP | TIMESTAMP |
DATA | JSON BLOB. Numeric and String type data |
Combining more than one singlesend_id and mc_auto_id/mc_auto_step_id in a single statement using an AND clause is not allowed. In order to segment contacts that have events corresponding to more than one single sends and automations use the JOIN clause. See Multi-Engagement Events for examples.
_10{_10 "payload": {_10 "unique_args": {_10 "mc_auto_id": "02bc41ca-07f9-11eb-9c1f-422f9e0f60d1",_10 "mc_auto_step_id": "0478e207-07f9-11eb-9c1f-422f9e0f60d1",_10 "singlesend_id": "b1ff0108-0994-11eb-915a-7ab0feb0ce95",_10 },_10 },_10}
Example query
current_timestamp
The following query will return all contacts that were created before this run of the segment:
_10SELECT contact_id, updated_at_10FROM contact_data_10WHERE created_at < CURRENT_TIMESTAMP
timestampadd
allows you to specify a time interval, a number of those intervals as a count, and a timestamp that you can use to compare Date fields against.
Parameters
Parameter | Supported values |
---|---|
interval | year , quarter , month , day , hour , minute , second |
count | any integer (negative or positive) |
timestamp | A supported date/time format (RFC3339), an existing date field, or a custom field of type Date |
Example Query
timestampadd(interval, count, timestamp)
The following query will return all contacts that were created before one year prior to 2020-10-15 6PM GMT:
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE created_at > timestampadd(day, 1, '2020-10-15T18:00:12Z')
array_contains
checks for at least one of the values provided in an array. It does not hard check for the presence of all the values or an exact match of the values given. It also doesn't support pattern-matches like '%gmail.com%
' as array values. array_contains
will instead check for the exact string, including pattern match characters. For example, when searching '%gmail.com%
' the %
characters are considered part of the string literal.
Parameters
Parameter | Supported values |
---|---|
field_name | "alternate_emails" , "list_ids" , "email_domains" |
array | alternate emails, list IDs, and email domains that correspond to the specified field_name as strings |
Example Query
array_contains(field_name, array)
The following query will return all contacts with email domains equal to gmail.com or yahoo.com:
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE array_contains(email_domains, ['gmail.com', 'yahoo.com'])
contains_word
is a custom function provided by the Marketing Campaigns V2 Segments API. contains_word
accepts two arguments: a word_value
that will be searched for in the field_name
values stored for each contact.
The field_name
you specify must be of type Text, and the value stored in field_name
must be a comma-separated string. For example, if the function evaluates the word_value
alice
against a first_name
field where the value is bob,alice,sue
, it will find alice
and return the contact. However, if the first_name
value is stored as bobalicesue
, bob.alice.sue
, or any other non comma separated string, alice
will not be found, and the contact will not be returned. This is important when thinking about how to enter the string values associated with your contacts. The check is case sensitive.
bob,(alice),sue
and a match value of
(alice)
will return a contact
bo'b,alice
will require a match value of
bo''b
.
Parameters
Parameter | Supported values |
---|---|
word_value | A string |
field_name | Any String type fields in your contact data or custom fields. Stored strings must be comma separated. |
Example Query
The following query will return all contacts where first_name
contains the word bob
:
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE contains_word('bob', first_name)
Segmentation on a list is supported by providing a list ID value to the optional parent_list_ids
field in the v2 api request. This parameter currently accepts only one list ID. Segmentation of contacts from multiple list ids is supported through the use of array_contains
function. See example below.
All contacts:
_10SELECT contact_id, updated_at _10FROM contact_data
All contacts with first_name
'Dave':
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE first_name = 'Dave'
All contacts where state_province_region
is 'Colorado':
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE state_province_region = 'CO'
All contacts with primary email with the substring 'gmail.com':
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE email like '%gmail.com%'
All contacts with a text type custom field my_text_custom_field
value 'abc':
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE my_text_custom_field = 'abc'
All contacts with primary email with 'gmail.com' as domain name, and a text type custom field my_custom_field
value 'abc':
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE email like '%gmail.com' and my_custom_field = 'abc'
All contacts with a number type custom field my_number_custom_field
value 12:
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE my_number_custom_field = 12
All contacts with a date type custom field my_date_custom_field
value `2021-01-01T12:46:24Z':
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE my_date_custom_field = '2021-01-01T12:46:24Z'
All contacts where alternate email is equal to 'alternate@gmail.com':
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE array_contains(alternate_emails, ['alternate@gmail.com'])
All contacts where alternate email is equal to 'alternate@gmail.com' or 'alternate2@gmail.com':
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE array_contains(alternate_emails, ['alternate@gmail.com','alternate2@gmail.com'])
All contacts present in a specific list "list_id":
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE array_contains(list_ids, ['02bc41ca-07f9-11eb-9c1f-422f9e0f60d1'])
All contacts present in either of the list_ids:
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE array_contains(list_ids, ['02bc41ca-07f9-11eb-9c1f-422f9e0f60d1', '02bc41ca-07f9-11eb-9c1f-422f9e0f62e4', '042a8e48-6e31-11eb-a8bc-7656c249c550'])
All contacts with specific email domain(s) gmail.com
:
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE array_contains(email_domains, ['gmail.com'])
All contacts where created_at
is after 2021-01-01 12 PM GMT:
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE created_at > '2021-01-01T12:00:00Z'
All contacts where created_at
is equal to 2021-01-01 12 PM GMT:
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE created_at = '2021-01-01T12:00:12Z'
All contacts with external id
that starts with '123':
_10SELECT contact_id, updated_at_10FROM contact_data_10WHERE external_id like '123%'
When a contact's reserved fields are not set during creation, they default to a NULL value. Not setting a value for custom fields when creating a contact will save the contact without those custom fields. A segment query using these fields does not return contacts having NULL as a value. Additional conditions may be used to specify the field is null
so that contacts having a null value will be included in the segment.
All contacts where first_name
does not equal Dave
, also including contacts without a value for first_name
:
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE first_name != ‘Dave' OR first_name is null
All contacts where my_custom_field
does not contain abc
, also including contacts without a value for my_custom_field
:
_10SELECT contact_id, updated_at _10FROM contact_data _10WHERE my_custom_field like ‘%abc%' OR my_custom_field is null
Only JOIN
and INNER JOIN
are allowed and INNER JOIN
will be internally converted to JOIN
. LEFT JOIN
and RIGHT JOIN
are restricted for performance, and the same functionality can be achieved using JOIN
. The examples below show how LEFT JOIN
/ RIGHT JOIN
can be replaced with JOIN
and UNION
.
The following queries return all contacts that have an event "delivered" for a particular Single Send or whose state is Colorado. While both queries return the same set of contacts, using JOIN
and UNION
is more performant than using LEFT JOIN
.
LEFT JOIN (not supported)
_10SELECT c.contact_id, c.updated_at _10FROM contact_data c_10LEFT JOIN event_data e _10ON c.contact_id = e.contact_id_10WHERE e.event_source = 'mail'_10AND e.event_type = 'delivered' _10AND e.DATA:payload.unique_args.singlesend_id = '042a8e48-6e31-11eb-a8bc-7656c249c550'_10OR c.state_province_region = 'CO'
JOIN and UNION
UNION
can be used instead of "OR" to return contacts based on distinct criteria that requires use of a JOIN
.
For example, a query for contacts that were 'delivered' a Single Send OR the contacts whose state_province_region
is in "CO" can be written as follows:
_11SELECT c.contact_id, c.updated_at _11FROM contact_data c_11JOIN event_data e _11ON c.contact_id = e.contact_id_11WHERE e.event_source = 'mail'_11AND e.event_type = 'delivered'_11AND e.DATA:payload.unique_args.singlesend_id = '042a8e48-6e31-11eb-a8bc-7656c249c550'_11UNION _11SELECT contact_id, updated_at _11FROM contact_data _11WHERE c.state_province_region = 'CO'
All contacts that have at least one event ( i.e. an attempt was made to send them an email):
_10SELECT c.contact_id, c.updated_at _10FROM contact_data c _10JOIN event_data e ON c.contact_id = e.contact_id
All contacts that have opened emails from two different single sends:
_13SELECT e2.contact_id, c2.updated_at_13FROM event_data e2_13INNER JOIN (_13 SELECT c1.contact_id, c1.updated_at_13 FROM contact_data AS c1_13 INNER JOIN event_data e1 ON c1.contact_id = e1.contact_id_13 WHERE e1.event_source = 'mail'_13 AND e1.event_type = 'open'_13 AND e1.DATA:payload.unique_args.singlesend_id = '02bc41ca-07f9-11eb-9c1f-422f9e0f60d1'_13) AS c2 ON c2.contact_id = e2.contact_id_13WHERE e2.event_source = 'mail' _13 AND e2.event_type = 'open'_13 AND e2.DATA:payload.unique_args.singlesend_id = '00163f67-7211-4363-ab4e-12dd6f313b3a'
All contacts that have opened ANY single sends within 3 days:
_10SELECT c.contact_id, c.updated_at_10FROM contact_data as c_10JOIN event_data as e on c.contact_id = e.contact_id_10WHERE e.event_source = 'mail'_10 AND e.event_type = 'open'_10 AND e.DATA:payload.unique_args.singlesend_id is not null_10 AND e.timestamp >= timestampadd(day, -3, current_timestamp())
All contacts that have both of the two events - click and open:
_10SELECT e2.contact_id, c2.updated_at_10FROM event_data e2_10INNER JOIN (_10 SELECT c1.contact_id, c1.updated_at_10 FROM contact_data AS c1_10 INNER JOIN event_data e1 ON c1.contact_id = e1.contact_id_10 WHERE e1.event_source = 'mail' AND e1.event_type = 'open'_10) AS c2 ON c2.contact_id = e2.contact_id_10WHERE e2.event_source = 'mail' AND e2.event_type = 'click'
All contacts that have both of the two events - bounce and deferred for a particular automation:
_13SELECT e2.contact_id, c2.updated_at_13FROM event_data e2_13INNER JOIN (_13 SELECT c1.contact_id, c1.updated_at_13 FROM contact_data AS c1_13 INNER JOIN event_data e1 ON c1.contact_id = e1.contact_id_13 WHERE e1.event_source = 'mail' _13 AND e1.event_type = 'bounce'_13 AND e1.DATA:payload.unique_args.mc_auto_id = '02bc41ca-07f9-11eb-9c1f-422f9e0f60d1'_13) AS c2 ON c2.contact_id = e2.contact_id_13WHERE e2.event_source = 'mail' _13 AND e2.event_type = 'deferred'_13 AND e2.DATA:payload.unique_args.mc_auto_id = '02bc41ca-07f9-11eb-9c1f-422f9e0f60d1'
All contacts that have no event data (i.e. all contacts with where there has been no attempt to send an email to):
_10SELECT c.contact_id, c.updated_at_10FROM contact_data c_10WHERE c.contact_id NOT IN (_10 SELECT e.contact_id_10 FROM event_data e_10)
All contacts that have not opened any mail in the last two months worth of seconds from the time when the segment is run:
_10SELECT c.contact_id, c.updated_at_10FROM contact_data c_10WHERE c.contact_id NOT IN (_10 SELECT e.contact_id_10 FROM event_data e_10 WHERE e.event_source = 'mail' _10 AND e.event_type = 'open'_10 AND e.timestamp < timestampadd(MONTH, -2, CURRENT_TIMESTAMP)_10)
All contacts that have been 'delivered' the ANY single sends but have not 'open' ANY single sends within 1 month:
_14SELECT c.contact_id, c.updated_at_14FROM contact_data c_14JOIN event_data AS e1 ON c.contact_id = e1.contact_id_14 WHERE e1.event_source = 'mail' _14 AND e1.event_type = 'delivered'_14 AND e1.DATA:payload.unique_args.singlesend_id is not null_14 AND c.contact_id NOT IN (_14 SELECT e2.contact_id_14 FROM event_data AS e2_14 WHERE e2.event_source = 'mail' _14 AND e2.event_type = 'open'_14 AND e2.DATA:payload.unique_args.singlesend_id is not null_14 AND e2.timestamp >= timestampadd(month, -1, current_timestamp()))_14 )
All contacts that have been 'delivered' the second email from automation but have not 'open' the email (mc_auto_id = 'fa3e0f6e-d397-11eb-87ce-22ffc6ed50f5' mc_auto_step_id = 'fc9ead9f-d397-11eb-92fe-02aa17a6534e'):
_15SELECT c.contact_id, c.updated_at_15FROM contact_data c_15JOIN event_data AS e1 ON c.contact_id = e1.contact_id_15 WHERE e1.event_source = 'mail' _15 AND e1.event_type = 'delivered'_15 AND e1.DATA:payload.unique_args.mc_auto_id = 'fa3e0f6e-d397-11eb-87ce-22ffc6ed50f5'_15 AND e1.DATA:payload.unique_args.mc_auto_step_id = 'fc9ead9f-d397-11eb-92fe-02aa17a6534e'_15 AND c.contact_id NOT IN (_15 SELECT e2.contact_id_15 FROM event_data AS e2_15 WHERE e2.event_source = 'mail' _15 AND e2.event_type = 'open'_15 AND e2.DATA:payload.unique_args.mc_auto_id = 'fa3e0f6e-d397-11eb-87ce-22ffc6ed50f5'_15 AND e2.DATA:payload.unique_args.mc_auto_step_id = 'fc9ead9f-d397-11eb-92fe-02aa17a6534e'_15 )
There are three email activities that support last events: "Last Clicked", "Last Opened", and "Last Emailed". The last event queries are helpful to see the contacts whose last mail activity is in a certain period. The following queries are used to segment contacts based on their last mail activity.
All contacts that have not opened contacts since January of 2021 (last opened activity is before 2021):
_12SELECT c1.contact_id, c1.updated_at _12 FROM contact_data as c1 _12 JOIN (_12 SELECT e1.contact_id, e1.max_timestamp _12 FROM (_12 SELECT e.contact_id, MAX(e.timestamp) max_timestamp _12 FROM event_data e _12 WHERE e.event_source = 'mail' and e.event_type = 'open' _12 GROUP BY e.contact_id_12 ) as e1_12 WHERE e1.max_timestamp < '2021-01-01T00:00:00.000Z'_12 ) as e2 on c1.contact_id = e2.contact_id
All contacts that last clicked an email within 7 days (last clicked activity is within 7 days):
_12SELECT c1.contact_id, c1.updated_at _12 FROM contact_data as c1 _12 JOIN (_12 SELECT e1.contact_id, e1.max_timestamp _12 FROM (_12 SELECT e.contact_id, MAX(e.timestamp) max_timestamp _12 FROM event_data e _12 WHERE e.event_source = 'mail' and e.event_type = 'click' _12 GROUP BY e.contact_id_12 ) as e1_12 WHERE e1.max_timestamp >= timestampadd(day, -7, current_timestamp())_12 ) as e2 on c1.contact_id = e2.contact_id
All contacts that were last emailed in the past three months and last opened in the previous month (last emailed is within 3 months and last opened is within 1 month):
_29SELECT c1.contact_id, c1.updated_at _29FROM (_29 SELECT c.contact_id, c.updated_at _29 FROM contact_data as c _29 JOIN (_29 SELECT e1.contact_id, e1.max_timestamp _29 FROM (_29 SELECT e.contact_id, MAX(e.timestamp) as max_timestamp _29 FROM event_data e _29 WHERE e.event_source = 'mail' _29 AND e.event_type = 'delivered' group by e.contact_id_29 ) as e1 _29 WHERE e1.max_timestamp >= timestampadd(month, -3, current_timestamp())_29 ) as e2 on c.contact_id = e2.contact_id_29) as c1 _29JOIN (_29 SELECT c.contact_id, c.updated_at _29 FROM contact_data as c _29 JOIN (_29 SELECT e1.contact_id, e1.max_timestamp _29 FROM (_29 SELECT e.contact_id, MAX(e.timestamp) as max_timestamp _29 FROM event_data e _29 WHERE e.event_source = 'mail' _29 AND e.event_type = 'open' group by e.contact_id_29 ) as e1 _29 WHERE e1.max_timestamp >= timestampadd(month, -1, current_timestamp())_29 ) as e2 on c.contact_id = e2.contact_id_29) as c11 on c1.contact_id = c11.contact_id
NOTE: Even if the aggregate function MAX is supported along with GROUP BY, the usage is restricted to last event queries only. MAX function accepts only the "timestamp" field from event_data and GROUP BY clause accepts only the "contact_id" field. An alias name can be used only for the aggregate function MAX, but not for "contact_id" or "updated_at".
Multiple data sets are allowed to be joined together if done in a specific manner even though SQL supports both JOINs expressed linearly and JOINs expressed using subqueries. Each data set's criteria must be listed alongside the corresponding table reference as it makes for better readability by having the queries nested explicitly by using subqueries. Following is an example of a segment query for both types. Here, the query defines a segment of all contacts that have the event of type 'processed' from the list of contacts which have the event of type 'delivered'.
_10SELECT c1.contact_id, c1.updated_at FROM event_data AS e2_10JOIN contact_data AS c1 ON e2.contact_id = c1.contact_id_10JOIN event_data as e1 ON e1.contact_id = e2.contact_id_10WHERE e1.event_source = 'mail' AND e1.event_type = 'delivered' AND e2.event_source = 'mail' AND e2.event_type = 'processed'
This logic, however, can easily be represented using JOINs with subqueries in a more readable way.
_10SELECT e2.contact_id, c2.updated_at_10FROM event_data AS e2 JOIN (_10 SELECT c1.contact_id, c1.updated_at_10 FROM contact_data AS c1 JOIN event_data AS e1_10 ON c1.contact_id = e1.contact_id_10 WHERE e1.event_source = 'mail' AND e1.event_type = 'delivered') AS c2_10ON c2.contact_id = e2.contact_id_10WHERE e2.event_source = 'mail' AND e2.event_type = 'processed'
NOTE: There is no reduction in functionality as both generate the exact same result.
When creating or changing a segment query using curl command, escape any single quotes present in any parameter:
_10SELECT contact_id, updated_at_10FROM contact_data _10WHERE first_name = '\''Dave'\''