Frequently Asked Questions

How to consume an API Exposure in a repository?

To consume data from a V2 Exposure in a repository, you must first create a DataChain Connector through which the consuming repository will access the data.

Consuming the API with REST

Creating the HTTP-type Connector requires:

The domain and port of the REST access URL of the Exposure (example: http://marketplace.dc-cicd.adobis.lan:40969 if the full access URL is http://marketplace.dc-cicd.adobis.lan:40969/dmp-api/data/PROJET_EXPOSITI_1172/exposition). The full URL is available on the Exposure details page in the Marketplace.
The URL for generating an authentication token to the marketplace (example: http://kc.dc-cicd.adobis.lan:40969/auth/realms/dc-realm/protocol/openid-connect/token)
The login credentials of a DataChain user with access to the Exposure data (username and password)
The Client identifier: dc_marketplace
The Client secret. For DataChain instances equipped with Keycloak, it is available in the Keycloak interface at the following location: Clients > dc_marketplace > Credentials > Client secret (Client secret format: 7192f374-xxxx-xxxx-xxxx-xxxxxxxxxxx)

The authentication parameter of this Connector must be set to OAuth2 (Password), and the authentication method to Basic (Request Header).
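
To make these settings more concrete, here is a minimal Python sketch of what an OAuth2 password-grant token request with Basic client authentication usually looks like. It only builds the request and does not send it. The function name, the placeholder secret and the placeholder user credentials are assumptions made for illustration; how DataChain issues the request internally is not described here.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, username, password):
    """Build (but do not send) an OAuth2 password-grant token request."""
    # Basic (Request Header): the client identifier and secret are sent
    # base64-encoded in the Authorization header.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    # OAuth2 (Password): the user credentials are sent in the form body.
    body = urllib.parse.urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


request = build_token_request(
    "http://kc.dc-cicd.adobis.lan:40969/auth/realms/dc-realm/protocol/openid-connect/token",
    client_id="dc_marketplace",
    client_secret="example-secret",   # placeholder, not a real secret
    username="example-user",          # placeholder DataChain user
    password="example-password",      # placeholder password
)
print(request.get_method(), request.full_url)
```

Sending the request (for example with urllib.request.urlopen) would normally return a JSON document containing the access token, provided the credentials are valid.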

The data is then accessed from a DataChain Repository that uses the Connector created above.

Repository configuration

Reader: JSON File V2
URI: the path portion of the REST access URL of the Exposure, without the domain and port (example: dmp-api/data/PROJET_EXPOSITI_1172/exposition if the full access URL is http://marketplace.dc-cicd.adobis.lan:40969/dmp-api/data/PROJET_EXPOSITI_1172/exposition)
Method: GET

Then simply select the headers among the available columns of the exposure and save the Repository to use the retrieved data.
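
The split between the Connector value (domain and port) and the Repository URI can be illustrated with a short Python sketch. The helper name split_exposure_url is hypothetical and is not part of DataChain; it only shows which part of the full access URL goes where.

```python
from urllib.parse import urlsplit


def split_exposure_url(full_url):
    """Return (connector_base, repository_uri) for a full Exposure access URL."""
    parts = urlsplit(full_url)
    # Connector side: scheme, domain and port.
    connector_base = f"{parts.scheme}://{parts.netloc}"
    # Repository side: the path without its leading slash, plus any query string.
    repository_uri = parts.path.lstrip("/")
    if parts.query:
        repository_uri += "?" + parts.query
    return connector_base, repository_uri


base, uri = split_exposure_url(
    "http://marketplace.dc-cicd.adobis.lan:40969/dmp-api/data/PROJET_EXPOSITI_1172/exposition"
)
print(base)  # value for the Connector
print(uri)   # value for the Repository URI field
```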

Consuming the API with OData

The connector is created in the same way as when consuming the API with REST.

The repository can be created using either a JSON File V2 reader or an XML File V2 reader.

When using an XML File V2 reader, the URI must include the query option "$format=XML" (example: dmp-api/odata/PROJET_EXPOSITI_1172/exposition?$format=XML if the full access URL is http://marketplace.dc-cicd.adobis.lan:41010/dmp-api/odata/PROJET_EXPOSITI_1172/exposition).

The repository can also use the JSON File V2 reader with the same URI, without any query option (example: dmp-api/odata/PROJET_EXPOSITI_1172/exposition).

In both cases, select the GET method, then choose the headers and save the repository to use the API data.
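
As a quick illustration of the two variants, the following Python sketch derives the URI to enter for each reader. The function name uri_for_reader is hypothetical; only the reader names and the $format=XML query option come from the text above.

```python
def uri_for_reader(odata_uri, reader):
    """Return the Repository URI to enter for the given reader type."""
    if reader == "JSON File V2":
        # The JSON reader uses the OData URI as-is.
        return odata_uri
    if reader == "XML File V2":
        # The XML reader needs the $format=XML query option.
        separator = "&" if "?" in odata_uri else "?"
        return f"{odata_uri}{separator}$format=XML"
    raise ValueError(f"Unsupported reader: {reader}")


odata_uri = "dmp-api/odata/PROJET_EXPOSITI_1172/exposition"
print(uri_for_reader(odata_uri, "JSON File V2"))
print(uri_for_reader(odata_uri, "XML File V2"))
```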

How to use pagination in a repository consuming a DataChain REST API?

Pagination loads data progressively, which improves the performance and reliability of reading large volumes of data provided by an API.

To configure pagination for a request to a DataChain REST API in an HTTP repository, add the hitsPerPage option to the repository’s URI field. This option specifies the number of items returned per page in the response.

Example of a URI with 5 records per page

dmp-api/data/PROJET_EXPOSITI_1172/exposition?hitsPerPage=5

Then, in the box to the right of the URI field, select a Per Page iteration with the variable page, and set the desired delay between two iterations in milliseconds and the number of iterations using the Start and End fields (example: 0 and 50,000 respectively to perform 50,001 iterations).

Finally, iterations can be stopped before they reach the number defined by the Start and End fields by configuring an end-of-iteration condition.

When consuming a standard DataChain REST API, the boolean dc_has_next returned in the API response header at each iteration can be used for this purpose.

To do so, from the End of iteration box in the reading parameters of the HTTP repository consuming the DataChain API, click on Header Content then the "+" button to fill in the fields of the new Key-Value pair as follows: dc_has_next (Key), false (Value).

Thus, the iterations will automatically stop when the current page indicates the absence of a next page via the dc_has_next Key.
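
The behaviour described above can be sketched in Python as a simple loop. This is an illustrative model, not DataChain's implementation: fetch_page stands in for the HTTP call, fake_fetch simulates an API exposing 12 records, and the assumptions that page numbering starts at 0 and that the header value is the string "false" are made for the example only.

```python
import time


def read_all_pages(fetch_page, start, end, delay_ms=0):
    """Iterate page = start..end and stop early when dc_has_next is false.

    fetch_page(page) must return (response_headers, records).
    """
    records = []
    for page in range(start, end + 1):
        headers, page_records = fetch_page(page)
        records.extend(page_records)
        # End-of-iteration condition: Header Content, dc_has_next = false.
        if headers.get("dc_has_next") == "false":
            break
        if delay_ms:
            time.sleep(delay_ms / 1000)
    return records


calls = []


def fake_fetch(page, hits_per_page=5, total=12):
    """Simulated API holding `total` records, served hits_per_page at a time."""
    calls.append(page)
    first = page * hits_per_page
    chunk = list(range(total))[first:first + hits_per_page]
    has_next = first + hits_per_page < total
    return {"dc_has_next": str(has_next).lower()}, chunk


rows = read_all_pages(fake_fetch, start=0, end=50_000)
print(len(rows), "records read in", len(calls), "iterations")
```

In this simulation the loop stops after the third page, even though Start and End would allow 50,001 iterations.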

How to use pagination in a repository consuming a DataChain OData API?

Pagination loads data progressively, which improves the performance and reliability of reading large volumes of data provided by an API.

To configure pagination for a request to a DataChain OData API in an HTTP repository, add the $top option to the repository’s URI field. This option specifies the number of items returned per page in the response.

Example of a URI with 1000 records per page

dmp-api/odata/PROJET_EXPOSITI_1172/exposition?$top=1000

In the box to the right of the URI field, specify an iteration By Offset with the variable $skip, the desired delay between two iterations in milliseconds, the desired number of iterations (Nb Iteration field), as well as the starting point of the $skip variable (Start field) and the batch size of data processed at each iteration (Limit field).

For example, with the above URI, 5000 iterations, a $skip variable starting at index 0, and a limit of 1000, the repository will receive 1000 records per page, starting from record 0 and advancing by 1000 at each iteration, for a maximum of 5000 iterations.
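
The arithmetic of this example can be checked with a small Python sketch. The helper names are hypothetical, and the way the $skip variable is appended to the URI is an assumption about the resulting request, not a description of DataChain internals.

```python
def skip_offsets(start, limit, nb_iterations):
    """Yield the successive $skip values of a By Offset iteration."""
    for i in range(nb_iterations):
        yield start + i * limit


def paged_uri(uri, skip):
    """Append the $skip query option to a URI (assumed request shape)."""
    separator = "&" if "?" in uri else "?"
    return f"{uri}{separator}$skip={skip}"


uri = "dmp-api/odata/PROJET_EXPOSITI_1172/exposition?$top=1000"
offsets = list(skip_offsets(start=0, limit=1000, nb_iterations=5000))
first_uris = [paged_uri(uri, skip) for skip in offsets[:3]]

for line in first_uris:
    print(line)
print("last $skip value:", offsets[-1])
```

With these settings the last iteration uses $skip=4999000, so up to 5,000,000 records can be read if no end-of-iteration condition stops the loop earlier.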

For performance reasons, it is possible to end iterations before they reach the configured number of iterations by setting an end-of-iteration condition.

When consuming a DataChain OData API, a regular expression that detects the absence of the nextLink expression in the body of the API response can be used for this purpose.

To do so, from the End of iteration box in the reading parameters of the HTTP repository consuming the DataChain API, click on Body Content then select With REGEX to enter the regular expression that will detect the absence of the nextLink expression: ^(?!.*nextLink).*

Thus, iterations will automatically stop when the current page indicates the absence of a next page by the absence of nextLink in the body of the API response.
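
The following Python sketch shows how such a condition behaves, assuming the expression is intended as the negative lookahead ^(?!.*nextLink).* and is applied to the whole response body. The sample bodies are invented for illustration, and the re.DOTALL flag is an assumption added so that multi-line bodies are handled in Python; the regex engine and flags used by DataChain are not specified here.

```python
import re

# Matches only when "nextLink" appears nowhere in the body.
END_OF_ITERATION = re.compile(r"^(?!.*nextLink).*", re.DOTALL)


def is_last_page(body):
    """Return True when the response body contains no nextLink entry."""
    return END_OF_ITERATION.match(body) is not None


# Invented sample bodies, for illustration only.
page_with_next = '{"value": [{"id": 1}],\n "@odata.nextLink": "exposition?$skip=1000"}'
page_without_next = '{"value": [{"id": 2}]}'

print(is_last_page(page_with_next))
print(is_last_page(page_without_next))
```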