Correlations
Pairwise Pearson correlation analysis between numerical columns
Use me when you need to measure exactly how strongly your numerical variables are related — and in which direction. I'll compute pairwise Pearson correlation coefficients across every column pair you choose, then lay them out in a color-coded matrix so you can instantly spot which variables climb together, which pull against each other, and which are strangers.
Overview
The Correlations plot computes pairwise linear (Pearson) correlation coefficients between selected numerical columns and renders them as a symmetric heatmap. Each cell holds a value from -1 to +1: values near +1 mean the two variables rise and fall together; values near -1 mean they move in opposite directions; values near 0 indicate no linear relationship. A diverging blue-white-red colorscale maps those extremes to color so patterns stand out at a glance.
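Conceptually, the computation behind the plot is a pairwise Pearson correlation over the selected columns. A minimal sketch using pandas (the column names and data here are invented for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = rng.normal(size=200)
df = pd.DataFrame({
    "a": x,
    "b": x + rng.normal(scale=0.1, size=200),   # rises with a -> r near +1
    "c": -x + rng.normal(scale=0.1, size=200),  # falls as a rises -> r near -1
    "d": rng.normal(size=200),                  # independent -> r near 0
})
corr = df.corr(method="pearson")  # symmetric N x N matrix, values in [-1, +1]
print(corr.round(2))
```

The resulting matrix is what the heatmap colors: `corr.loc["a", "b"]` lands near +1 (red), `corr.loc["a", "c"]` near -1 (blue), and `corr.loc["a", "d"]` near 0 (white).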
Best used for:
- Quickly scanning all pairwise relationships across many numerical columns at once
- Identifying strong predictors of a target variable
- Detecting multicollinearity before building regression or ML models
- Flagging redundant features that carry duplicate information
- Generating hypotheses for deeper pairwise investigation
- Validating that expected relationships (or the absence of them) hold in your data
Common Use Cases
Data Science & Machine Learning
- Feature selection: find columns most correlated with your target variable
- Multicollinearity detection before linear regression or logistic regression
- Pruning redundant features from high-dimensional datasets
- Understanding which inputs a model is likely to conflate
Statistical Analysis
- Exploratory data analysis (EDA) as a first pass over a new dataset
- Hypothesis generation — strong correlations raise questions worth testing
- Data quality checks — suspiciously perfect correlations may signal copy-paste errors
- Confirming that theoretically independent variables are in fact uncorrelated
Healthcare & Life Sciences
- Understanding relationships between lab values and clinical outcomes
- Identifying biomarkers that co-vary with disease progression
- Flagging redundant measurements in a clinical panel
Business Analytics
- Discovering which KPIs move together across time periods
- Identifying product or channel affinities from behavioral data
- Validating that marketing spend and revenue are positively linked
Options
Columns of Interest
Optional — Select which numerical columns to include in the analysis.
When left empty, all numerical columns in the dataset are used. Selecting a subset focuses the matrix on the variables you care about and improves readability. Choose two or more numerical columns; the plot will produce an N×N matrix where N is the number of selected columns.
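The column-selection behavior can be mimicked in pandas: leaving the option empty corresponds to taking every numerical column, while picking a subset just restricts the frame before correlating. A sketch with invented data:

```python
import pandas as pd

df = pd.DataFrame({
    "age":   [34, 45, 29, 61, 52],
    "score": [88.0, 72.5, 95.1, 60.3, 70.0],
    "group": ["a", "b", "a", "b", "a"],   # non-numeric: never part of the matrix
})
# "Columns of Interest" left empty -> all numerical columns:
numeric = df.select_dtypes(include="number")
corr = numeric.corr()                     # N x N, here N = 2
# Explicitly selecting a subset gives the same shape of result:
subset = df[["age", "score"]].corr()
```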
Settings
Annotate Segments With Value
Optional — Off by default.
When switched on, the Pearson correlation coefficient is printed inside every cell of the matrix. This is especially useful when:
- Color differences between nearby values (e.g. 0.62 vs 0.71) are hard to distinguish visually
- You need exact numbers for a report or presentation
- You are working with a small matrix (≤ 8 columns) where text fits comfortably
For large matrices with many columns, leave this off to keep the chart readable and rely on the color gradient for a high-level overview.
Understanding Correlation Values
The Pearson Coefficient
The Pearson correlation coefficient (r) measures the strength and direction of the linear relationship between two numerical variables. It always falls in the range -1 to +1:
| Range | Interpretation |
|---|---|
| +0.8 to +1.0 | Strong positive correlation |
| +0.5 to +0.8 | Moderate positive correlation |
| +0.2 to +0.5 | Weak positive correlation |
| -0.2 to +0.2 | Little or no linear correlation |
| -0.5 to -0.2 | Weak negative correlation |
| -0.8 to -0.5 | Moderate negative correlation |
| -1.0 to -0.8 | Strong negative correlation |
Positive Correlation (red cells)
Both variables tend to increase together.
- Example: Albumin and N_Days — patients with higher albumin levels tend to survive longer.
- On the chart: warm red color, value closer to +1.
Negative Correlation (blue cells)
One variable increases as the other decreases.
- Example: Bilirubin and N_Days — higher bilirubin is associated with shorter survival.
- On the chart: cool blue color, value closer to -1.
No Linear Correlation (white/near-white cells)
No consistent linear trend between the two variables.
- Example: Alk_Phos and Platelets in the sample above (r = -0.06).
- Note: a near-zero Pearson r does not rule out non-linear relationships.
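That caveat is worth seeing concretely: a perfectly deterministic but U-shaped relationship produces a Pearson r of essentially zero.

```python
import numpy as np

x = np.linspace(-1, 1, 201)
y = x ** 2                     # y is fully determined by x, but the trend is U-shaped
r = np.corrcoef(x, y)[0, 1]    # near-zero despite the perfect relationship
```

A scatter plot of `x` against `y` would reveal the pattern instantly, which is why a near-white cell should prompt a follow-up plot rather than a conclusion.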
Interpreting the Matrix
Diagonal
Every cell on the diagonal represents a variable's correlation with itself, which is always exactly 1.0. These cells serve as anchors — the diagonal of perfect self-correlation divides the matrix into two symmetric triangles.
Symmetry
The matrix is symmetric: the value at row A, column B equals the value at row B, column A. You only need to read one triangle; both carry the same information.
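Both properties (unit diagonal and symmetry) hold for any input and are quick to verify with pandas and NumPy on synthetic data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=list("abcd"))
m = df.corr().to_numpy()
# The diagonal is (numerically) 1.0 and the matrix equals its transpose.
```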
Color Scale
- Deep blue → strong negative correlation (approaching -1)
- White → no linear relationship (near 0)
- Deep red → strong positive correlation (approaching +1)
The colorscale is fixed from -1 to +1 so comparisons across different datasets remain consistent.
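If you were reproducing this rendering yourself, the key detail is pinning the color limits to [-1, +1] rather than letting them autoscale to the data. A sketch using matplotlib (assuming it is available; the plot itself may use a different rendering library):

```python
import matplotlib
matplotlib.use("Agg")            # render off-screen
import matplotlib.pyplot as plt
import numpy as np

corr = np.array([[ 1.0,  0.6, -0.3],
                 [ 0.6,  1.0,  0.1],
                 [-0.3,  0.1,  1.0]])
fig, ax = plt.subplots()
# Pin the scale to [-1, +1] so the same color always means the same r,
# regardless of the range actually present in this dataset:
im = ax.imshow(corr, cmap="RdBu_r", vmin=-1, vmax=1)
fig.colorbar(im, ax=ax)
```

Without the fixed limits, a dataset whose correlations top out at 0.4 would show the same deep red as one topping out at 0.95, defeating cross-dataset comparison.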
Interpretation Tips
- Correlation is not causation. A strong r value means two variables co-vary linearly — it does not tell you which one drives the other, or whether a third variable drives both.
- Pearson r only captures linear relationships. A variable pair with a curved or U-shaped relationship may show r ≈ 0 even though a strong pattern exists. Use scatter plots to investigate further.
- Outliers can inflate or deflate r. A single extreme point can create or destroy an apparent correlation. Check distributions before drawing conclusions.
- Sample size matters. With fewer than 30 observations, individual r values are unreliable. With very large samples, even r = 0.05 may be statistically significant while being practically meaningless.
- Enable annotations for precision. When exact values matter — for feature selection thresholds, reports, or multicollinearity checks — turn on "Annotate Segments With Value" so you can read the numbers directly without guessing from color.
- Watch the off-diagonal extremes. Cells with |r| > 0.8 outside the diagonal often indicate redundant features. In regression contexts, consider dropping one of the pair or combining them with PCA.
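Those off-diagonal extremes can also be surfaced programmatically: scan one triangle of the matrix and keep pairs with |r| above your threshold. A sketch with invented columns, where two features measure the same quantity in different units:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
x = rng.normal(size=300)
df = pd.DataFrame({
    "height_cm": 170 + 10 * x,
    "height_in": (170 + 10 * x) / 2.54 + rng.normal(scale=0.5, size=300),
    "weight_kg": rng.normal(70, 10, size=300),
})
corr = df.corr()
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)  # one triangle, diagonal excluded
pairs = corr.where(upper).stack()                      # (col_a, col_b) -> r
redundant = pairs[pairs.abs() > 0.8]
print(redundant)
```

Here only the cm/inch pair crosses the 0.8 threshold, flagging one of the two height columns as a candidate to drop.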