Links (Joins)

Overview

In each step of DataBlocksGenericsData Logo (GenericsData module), one or more links (joins) can be defined.

A link serves to automatically import columns (and their rows) from other Entities or DataBlocks, provided that these data meet one or more criteria specified by the user.

For example, a link can be used to retrieve, for each patient in a given DataBlock, the list of their Medical Procedures contained in the "Medical Procedures" column of a "Patient Records" Entity.

Such a link retrieves only the rows from the Entity’s Medical Procedures column whose value in the IPP column (permanent patient identifier) already appears in the Patient Identifier of the DataBlock:

DataBlock "Patient Admissions/Discharges"

Patient Identifier Admission Date Discharge Date

1001

2025-06-01

2025-06-10

1002

2025-06-03

2025-06-08

1003

2025-06-05

2025-06-12

Entity "Patient Records"

IPP Medical Procedures

1001

Brain MRI, Blood Test

1002

Chest CT Scan

1004

ECG, Cardiology Consultation

Patient Identifier Admission Date Discharge Date Medical Procedures

1001

2025-06-01

2025-06-10

Brain MRI, Blood Test

1002

2025-06-03

2025-06-08

Chest CT Scan

1003

2025-06-05

2025-06-12

null

Principles

Creating a link involves configuring:

  • a data block (Entity or DataBlock) to link to the current DataBlock,

  • the columns from the data block to add to the current DataBlock,

  • the join type used to retrieve rows from the linked data block,

  • the conditions for retrieving these rows, combined by the logical operator "AND".

Conditions are defined from a column of the current DataBlock (or a constant) and a column from the linked data block (or a constant), whose values will be compared using an operator (Equal, Not equal, Less than, Greater than, etc.).

A checkbox adjacent to the condition column selection fields allows them to be converted into constant input fields, if needed.

Join Types

Defining the join type specifies how data from the two sources will be combined to produce the resulting dataset from the link, depending on whether matching values are present or not.

The following subsections describe the different available join types and their functionality. The introductory example is reused, with the link result shown according to the join type selected by the user.

In each example below, all columns from the target Entity are retrieved whenever possible. The link criterion is: Patient Identifier Equal IPP. Reminder of the linked data blocks:

Source DataBlock "Patient Admissions/Discharges"
Patient Identifier Admission Date Discharge Date

1001

2025-06-01

2025-06-10

1002

2025-06-03

2025-06-08

1003

2025-06-05

2025-06-12

Entity "Patient Records"
IPP Medical Procedures

1001

Brain MRI, Blood Test

1002

Chest CT Scan

1004

ECG, Cardiology Consultation

ANTI Join

Specificity: retains only rows from the source DataBlock without any matching rows in the target entity.

Initially, source DataBlock rows meeting the link criteria are selected, then excluded from the final link result. No columns or rows from the target entity are joined.

Patient Identifier Admission Date Discharge Date

1003

2025-06-05

2025-06-12

INNER Join

Specificity: returns only rows that have matching records in both datasets.

Patient Identifier Admission Date Discharge Date IPP Medical Procedures Last Stay Date

1001

2025‑06‑01

2025‑06‑10

1001

Brain MRI; Blood Test

2025‑05‑22

1002

2025‑06‑03

2025‑06‑08

1002

Chest CT Scan

2025‑05‑30

LEFT Join

Specificity: retains all rows from the source DataBlock. Columns from the target entity are added when the condition is met; otherwise, they remain empty.

By default, links use the Left join type.

Patient Identifier Admission Date Discharge Date IPP Medical Procedures Last Stay Date

1001

2025‑06‑01

2025‑06‑10

1001

Brain MRI; Blood Test

2025‑05‑22

1002

2025‑06‑03

2025‑06‑08

1002

Chest CT Scan

2025‑05‑30

1003

2025‑06‑05

2025‑06‑12

null

null

null

RIGHT Join

Specificity: retains all rows from the target data block. Columns from the source DataBlock are added when a matching record exists; otherwise, they remain empty.

Patient Identifier Admission Date Discharge Date IPP Medical Procedures Last Stay Date

1001

2025‑06‑01

2025‑06‑10

1001

Brain MRI; Blood Test

2025‑05‑22

1002

2025‑06‑03

2025‑06‑08

1002

Chest CT Scan

2025‑05‑30

null

null

null

1004

ECG; Cardiology Consultation

2025‑06‑07

OUTER Join

Specificity: returns the union of both datasets. Rows without matching records are filled with null values for missing columns.

Patient Identifier Admission Date Discharge Date IPP Medical Procedures Last Stay Date

1001

2025‑06‑01

2025‑06‑10

1001

Brain MRI; Blood Test

2025‑05‑22

1002

2025‑06‑03

2025‑06‑08

1002

Chest CT Scan

2025‑05‑30

1003

2025‑06‑05

2025‑06‑12

null

null

null

null

null

null

1004

ECG; Cardiology Consultation

2025‑06‑07