Sankey Diagram

Visualize flow and relationships between nodes

Use me when you want to watch flows move between categories like rivers branching and merging. Thick ribbons for big flows, thin ones for trickles. Perfect for energy flows, budget allocations, user paths through your website, or any system where things move from source to destination. I make the invisible visible - like watching your money or resources actually flow through stages.

Overview

A Sankey diagram is a flow diagram where the width of arrows or connections is proportional to the flow quantity. It shows how resources, energy, money, or other quantities move from sources through intermediaries to destinations. The visual thickness of connections immediately reveals the magnitude of flows, making it easy to identify major paths and bottlenecks.

Best used for:

  • Visualizing flow of resources, energy, or money
  • Showing multi-step processes and transformations
  • Network traffic and data flow analysis
  • Customer journey and conversion paths
  • Budget allocation and spending breakdown
  • Material flow and supply chain visualization

Common Use Cases

Business & Finance

  • Budget allocation (departments → projects → expenses)
  • Revenue streams and profit distribution
  • Customer acquisition funnel (source → channel → conversion)
  • Product sales flow (category → subcategory → product)
  • Cash flow and money movement

Energy & Environment

  • Energy production and consumption
  • Carbon emissions flow
  • Water usage and distribution
  • Material recycling and waste management
  • Resource allocation

Web Analytics & User Flow

  • Website navigation paths
  • User journey through application
  • Traffic sources to conversion
  • Feature usage patterns
  • Drop-off analysis

Options

Source

Required - Column indicating the starting point of flows.

Each unique value becomes a node from which flows originate. In the default horizontal orientation, a node sits to the left of the nodes it feeds.

Target

Required - Column indicating the destination of flows.

Each unique value becomes a node at which flows terminate. In the default horizontal orientation, a node sits to the right of the nodes that feed it. Note: A node can be both a source and a target (an intermediate node).
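
To illustrate the note about intermediate nodes, here is a minimal sketch that classifies nodes from a list of flow rows. The rows and variable names are invented for this example and are not part of the tool.

```python
# Hypothetical flow rows: (source, target, value)
rows = [
    ("Marketing", "Website", 1000),
    ("Social", "Website", 400),
    ("Website", "Signup", 600),
    ("Website", "Bounce", 800),
]

sources = {source for source, _, _ in rows}
targets = {target for _, target, _ in rows}

only_sources = sorted(sources - targets)  # flows only start here
only_targets = sorted(targets - sources)  # flows only end here
intermediate = sorted(sources & targets)  # flows both arrive and leave

print("sources:", only_sources)
print("intermediate:", intermediate)
print("targets:", only_targets)
```

In this data "Website" appears in both columns, so it is the only intermediate node.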

Value/Flow

Required - Magnitude of flow between source and target.

Column

Select the numerical column representing flow quantity (e.g., amount, count, volume).

Aggregation Function

Choose how rows that share the same source and target are combined into a single flow:

Options:

  • Sum - Total flow (most common)
  • Mean - Average flow
  • Count - Number of connections
  • Median - Middle flow value
  • Min - Minimum flow
  • Max - Maximum flow
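
The effect of each aggregation function can be sketched in a few lines of plain Python. This assumes that rows sharing the same source and target are grouped before the function is applied. The function name `aggregate_flows` and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical rows: three records for the same Email -> Website connection
rows = [
    ("Email", "Website", 100),
    ("Email", "Website", 300),
    ("Email", "Website", 200),
    ("Social", "App", 50),
]

AGGREGATIONS = {
    "sum": sum,
    "mean": mean,
    "count": len,
    "median": median,
    "min": min,
    "max": max,
}

def aggregate_flows(rows, how="sum"):
    """Group rows by (source, target) and reduce each group's values."""
    grouped = defaultdict(list)
    for source, target, value in rows:
        grouped[(source, target)].append(value)
    reduce_values = AGGREGATIONS[how]
    return {pair: reduce_values(values) for pair, values in grouped.items()}

for how in AGGREGATIONS:
    print(how, aggregate_flows(rows, how)[("Email", "Website")])
```

Sum answers "how much moved in total", while Count answers "how many records describe this connection", so the two can produce very different ribbon widths from the same data.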

Color By (Optional)

Optional - Color flows by category.

When specified, flows are colored based on this categorical column, making it easy to distinguish different types of flows.
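
One way to picture coloring by category is a lookup from each distinct category to a palette entry. The sketch below is an illustration only: the `channel` column, the `color_map` helper, and the choice of the Okabe-Ito colorblind-friendly palette are assumptions, not the tool's actual behavior.

```python
# Okabe-Ito palette (colorblind-friendly), used here as an assumed default
PALETTE = ["#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7"]

# Hypothetical rows with a categorical "channel" column
rows = [
    {"source": "Search", "target": "Website", "value": 700, "channel": "Organic"},
    {"source": "Ads", "target": "Website", "value": 300, "channel": "Paid"},
    {"source": "Social", "target": "App", "value": 200, "channel": "Organic"},
]

def color_map(rows, column):
    """Assign one palette color per distinct category, cycling if needed."""
    categories = sorted({row[column] for row in rows})
    return {cat: PALETTE[i % len(PALETTE)] for i, cat in enumerate(categories)}

colors = color_map(rows, "channel")
flow_colors = [colors[row["channel"]] for row in rows]
print(colors)
print(flow_colors)
```

Sorting the categories before assigning colors keeps the mapping stable when the row order changes.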

Settings

Hide Empty Values

Optional - Exclude flows with no data.
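
The page does not define what counts as "no data". As a rough sketch, the filter below treats a row as empty if any field is `None`, a blank string, or NaN. That definition and the helper names are assumptions.

```python
import math

def is_empty(value):
    """Assumed definition of 'empty': None, blank string, or NaN."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, float):
        return math.isnan(value)
    return False

def drop_empty_flows(rows):
    """Keep only rows where source, target, and value are all present."""
    return [row for row in rows if not any(is_empty(field) for field in row)]

# Hypothetical rows with gaps in each of the three columns
rows = [
    ("Email", "Website", 100),
    ("Email", None, 50),
    ("", "App", 20),
    ("Social", "App", float("nan")),
]

kept = drop_empty_flows(rows)
print(kept)
```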

Hide Node Labels

Optional - Hide labels on nodes.

Useful when node names are long or when you want a cleaner visualization.

Orientation

Optional - Direction of flow.

Options:

  • Horizontal - Flows left to right (default)
  • Vertical - Flows top to bottom

Understanding Sankey Components

Nodes

  • Rectangles: Represent categories, stages, or entities
  • Height: Proportional to total flow through the node
  • Position: Automatically arranged in layers
  • Color: Can indicate category or be automatically assigned

Flows

  • Width: Proportional to flow magnitude
  • Color: Matches source node or custom by category
  • Curvature: Shows direction of flow
  • Transparency: Often semi-transparent to show overlaps
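
The proportionality rules above can be made concrete with a small calculation. In this sketch a node's total is taken as the larger of its inflow and outflow, and a single linear scale converts totals to pixels. Both choices, along with the names `node_totals`, `to_pixels`, and `plot_height`, are simplifying assumptions. Real layouts also reserve space for padding between nodes.

```python
from collections import defaultdict

# Hypothetical flow rows: (source, target, value)
rows = [("A", "X", 30), ("B", "X", 10), ("X", "Y", 25), ("X", "Z", 15)]

def node_totals(rows):
    """Total flow through each node: the larger of inflow and outflow."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for source, target, value in rows:
        outflow[source] += value
        inflow[target] += value
    nodes = set(inflow) | set(outflow)
    return {n: max(inflow.get(n, 0.0), outflow.get(n, 0.0)) for n in nodes}

def to_pixels(totals, plot_height=400):
    """Scale totals linearly so the largest node fills plot_height."""
    scale = plot_height / max(totals.values())
    return {node: total * scale for node, total in totals.items()}

totals = node_totals(rows)
heights = to_pixels(totals)
for node in sorted(heights):
    print(node, totals[node], heights[node])
```

The same scale factor applies to flow widths, which is why a flow of 30 units is drawn three times as thick as a flow of 10.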

Layers

  • Left to right: Represents progression or transformation
  • Multiple layers: Intermediate steps in the flow
  • Nodes can repeat: Same entity at different stages
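
How the tool assigns layers is not documented here. A common approach for acyclic flows is longest-path layering: each node is placed one layer after its deepest predecessor. The sketch below shows that technique and should be read as an illustration, not as the tool's algorithm. `assign_layers` is a hypothetical name.

```python
from collections import defaultdict, deque

def assign_layers(links):
    """Longest-path layering for an acyclic set of (source, target) links."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for source, target in links:
        successors[source].append(target)
        indegree[target] += 1
        nodes.update((source, target))

    layer = {node: 0 for node in nodes}
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            # A node must sit after every node that feeds it
            layer[nxt] = max(layer[nxt], layer[node] + 1)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if visited != len(nodes):
        raise ValueError("links contain a cycle; layering needs an acyclic graph")
    return layer

links = [("Ads", "Landing"), ("Landing", "Signup"), ("Ads", "Signup"), ("Signup", "Purchase")]
layers = assign_layers(links)
print(layers)
```

"Signup" receives flow from both "Ads" and "Landing", so it lands in layer 2, after its deepest predecessor.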

Tips for Effective Sankey Diagrams

  1. Data Structure:

    • Each row represents one flow connection
    • Source and target columns define connections
    • Value column indicates flow magnitude
    • Example: Source="Marketing", Target="Website", Value=1000
  2. Simplify When Needed:

    • Limit to 10-15 nodes for clarity
    • Group small flows into "Other"
    • Filter out minor connections below threshold
    • Consider multiple diagrams for complex systems
  3. Use Color Strategically:

    • Color by source to track origins
    • Color by category to distinguish flow types
    • Use consistent colors across related visualizations
    • Ensure accessibility (colorblind-friendly)
  4. Orientation Choice:

    • Horizontal: Traditional, good for time-based flows
    • Vertical: Better for top-down hierarchies
    • Match orientation to mental model of process
  5. Handle Complex Flows:

    • Break into multiple diagrams if too complex
    • Focus on main flows first
    • Use filtering to show different aspects
    • Consider animation for temporal data
  6. Label Strategically:

    • Keep node names short and clear
    • Use hover tooltips for details
    • Show values on major flows
    • Hide labels if too cluttered
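
Tips 1 and 2 can be combined in data preparation. The sketch below starts from rows in the source, target, value shape described in tip 1 and folds targets that receive only a small share of the total flow into an "Other" node. The 5% threshold and the function name `group_small_targets` are arbitrary choices for illustration.

```python
from collections import defaultdict

def group_small_targets(rows, min_share=0.05, other_label="Other"):
    """Relabel targets whose total inflow is below min_share of all flow."""
    total = sum(value for _, _, value in rows)
    if total == 0:
        return list(rows)
    inflow = defaultdict(float)
    for _, target, value in rows:
        inflow[target] += value
    small = {t for t, v in inflow.items() if v / total < min_share}

    merged = defaultdict(float)
    for source, target, value in rows:
        merged[(source, other_label if target in small else target)] += value
    return [(s, t, v) for (s, t), v in merged.items()]

# One row per flow connection, as in tip 1
rows = [
    ("Marketing", "Website", 1000),
    ("Marketing", "App", 500),
    ("Marketing", "Kiosk", 20),
    ("Marketing", "Fax", 10),
]

simplified = group_small_targets(rows)
for row in simplified:
    print(row)
```

Grouping keeps the total flow intact, whereas filtering below a threshold drops it. Which one is appropriate depends on whether the diagram needs to add up.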

Common Patterns

Simple Flow (2 Layers)

Sources → Destinations
Marketing → Website
Social → App
Email → Direct

Multi-Stage Flow (3+ Layers)

Sources → Channels → Conversions → Revenue
Traffic Sources → Landing Pages → Actions → Sales
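
Multi-stage data often arrives as whole paths and not as source and target pairs. A minimal sketch of the conversion, assuming each path is an ordered list of stage names, follows. `paths_to_links` and the sample paths are hypothetical.

```python
from collections import Counter

def paths_to_links(paths):
    """Count each consecutive (source, target) step across all paths."""
    counts = Counter()
    for path in paths:
        for source, target in zip(path, path[1:]):
            counts[(source, target)] += 1
    return [(source, target, n) for (source, target), n in counts.items()]

# Hypothetical user journeys
paths = [
    ["Search", "Landing", "Signup"],
    ["Search", "Landing", "Exit"],
    ["Email", "Landing", "Signup"],
]

links = paths_to_links(paths)
for link in links:
    print(link)
```

Each resulting tuple is one row in the source, target, value shape the chart expects, with the count of journeys as the value.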

Converging Flow

Multiple sources feeding into fewer destinations (consolidation).

Diverging Flow

Single source splitting into multiple destinations (distribution).

Circular Flow

Nodes that connect back to earlier stages (recycling, feedback loops).
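
To check whether a dataset contains circular flows before charting it, a depth-first search can report the links that close a loop. This is a generic graph technique shown as a sketch. `find_back_links` is a hypothetical helper, and the recursive form is only suitable for small graphs.

```python
from collections import defaultdict

def find_back_links(links):
    """Return links that point back to a node still being explored (loop closers)."""
    successors = defaultdict(list)
    for source, target in links:
        successors[source].append(target)

    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = defaultdict(int)  # every node starts as UNSEEN
    back_links = []

    def visit(node):
        state[node] = IN_PROGRESS
        for nxt in successors[node]:
            if state[nxt] == IN_PROGRESS:
                back_links.append((node, nxt))
            elif state[nxt] == UNSEEN:
                visit(nxt)
        state[node] = DONE

    for node in list(successors):
        if state[node] == UNSEEN:
            visit(node)
    return back_links

links = [
    ("Production", "Use"),
    ("Use", "Waste"),
    ("Waste", "Recycling"),
    ("Recycling", "Production"),
]
print(find_back_links(links))
```

An empty result means the flows form a strictly forward-moving diagram.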

Example Scenarios

Budget Allocation

Company Budget → Departments → Projects → Expenditures

Customer Journey

Traffic Source → Landing Page → Action → Conversion

Energy Flow

Production → Distribution → Consumption → Waste

Revenue Streams

Product Categories → Sales Channels → Customer Segments → Revenue

Troubleshooting

Issue: Diagram is too cluttered

  • Solution: Reduce number of flows by filtering low-value connections, grouping minor categories into "Other", or splitting into multiple diagrams.

Issue: Nodes are overlapping

  • Solution: Reduce number of nodes, increase plot height, or adjust node padding in advanced settings.

Issue: Can't see small flows

  • Solution: Use logarithmic scaling (advanced), filter out large flows to see detail, or create separate diagram for small flows.
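
To see why logarithmic scaling helps, compare the spread of raw and transformed values. The sketch uses `log1p` as one possible transform. Whether the tool's advanced setting uses this exact function is not stated on this page.

```python
import math

def log_scaled(values):
    """Compress large values; log1p keeps zero-valued flows at zero width."""
    return [math.log1p(v) for v in values]

# Hypothetical flow values spanning four orders of magnitude
values = [10, 1_000, 100_000]
scaled = log_scaled(values)

print("raw ratio:", max(values) / min(values))
print("scaled ratio:", max(scaled) / min(scaled))
```

After the transform, ribbon widths are no longer proportional to the underlying quantities, so the actual values should stay visible in labels or tooltips.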

Issue: Node order is confusing

  • Solution: Nodes are automatically ordered. You may need to rename nodes to control their position, or manually specify node order in data preparation.

Issue: Flows cross each other messily

  • Solution: This is common with complex networks. Simplify by removing minor flows, grouping categories, or reorganizing data structure.

Issue: Colors don't distinguish flows

  • Solution: Use "Color By" option to categorize flows by meaningful attribute. Ensure sufficient color contrast.

Issue: Labels are cut off or overlapping

  • Solution: Enable "Hide Node Labels" and rely on hover tooltips, or increase plot size to accommodate labels.

Issue: Cannot trace flow path

  • Solution: Use consistent naming between source and target. Hover over flows to highlight paths. Consider color coding by origin.
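
Inconsistent naming usually comes down to stray whitespace or capitalization, which makes one entity show up as two nodes. A minimal cleanup sketch, assuming the first spelling encountered should win, follows. `normalize_names` is a hypothetical helper.

```python
def normalize_names(rows):
    """Collapse whitespace and unify case variants to the first spelling seen."""
    canonical = {}

    def clean(name):
        tidy = " ".join(name.split())
        return canonical.setdefault(tidy.casefold(), tidy)

    return [(clean(source), clean(target), value) for source, target, value in rows]

# "Website" and "website " would otherwise be two separate nodes
rows = [
    ("Marketing", "Website", 1000),
    ("website ", "Signup", 600),
]

cleaned = normalize_names(rows)
print(cleaned)
```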
