Documentation (English)

Data Module

Store and sync data securely, with big data capabilities

Overview

The Data Module manages your data so it can be used in flows and AI models. It stores, syncs, and prepares data in a standardized way.

Getting Started

1. Add Your First Data

Click on Add Data to begin uploading your data.

2. Upload Files or Folders

Drag your files into the drop area, or click Select File to choose them from the file explorer. You can also upload an entire folder by clicking Select Folder.

Click Done to upload your files.

Important:

  • Larger files take longer to upload
  • Do not reload your page during uploads
  • Common formats supported: tables, images, PDFs, and more

3. Data Processing and Indexing

After upload, data processing and indexing begin automatically. You'll see real-time updates showing which stage your data is currently in.

What happens during processing:

  • Unified Storage: Data is stored in a standardized format for consistent use across flows and AI models
  • Semantic Indexing: All data is indexed using embeddings, enabling semantic search based on meaning rather than keywords (see the sketch below)
  • Metadata Generation: Structured metadata is automatically generated, allowing AI models, AI agents, and automation flows to understand and work with the data

Note: For large datasets, this processing (called "UNIFICATION") can take some time.
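
To make the semantic indexing step more concrete, here is a minimal sketch of embedding-based search. It assumes nothing about the module's internals: the file names, the query, and the three-dimensional vectors are invented, and real embeddings would come from an embedding model with far more dimensions.

import numpy as np

# Toy "embeddings" for three stored files. The values are made up purely to
# illustrate the ranking step; real embeddings are produced by a model.
documents = {
    "sales_2023.csv":    np.array([0.9, 0.1, 0.0]),
    "invoice_march.pdf": np.array([0.8, 0.3, 0.1]),
    "holiday_photo.jpg": np.array([0.0, 0.2, 0.9]),
}

def cosine_similarity(a, b):
    # 1.0 means the vectors point in the same direction (similar meaning)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A query such as "revenue figures" would be embedded the same way; this
# vector is likewise invented and simply points towards the financial files.
query = np.array([0.85, 0.2, 0.05])

# Rank by semantic closeness instead of keyword overlap.
for name, vector in sorted(documents.items(),
                           key=lambda item: cosine_similarity(query, item[1]),
                           reverse=True):
    print(name, round(cosine_similarity(query, vector), 3))

Keyword search would never match "revenue figures" against "sales_2023.csv"; ranking by embedding similarity does, which is the point of the semantic index.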

4. Access Your Data

Once processing is complete, the status changes to "ready". The file is now available for selection throughout the tool: in plots, in flows, and for previewing.

5. Preview Your Data

Click on the file to preview it. Hover over it to rename it.

The preview appearance varies depending on your file type.

Data Connectors

Connect to external databases, APIs, and file storage systems. Data can be synchronized automatically or imported as a one-time operation.

Setting Up a Connector

1. Navigate to Integrations

Click on the Integrations tab and then Add Connector.

2. Select Your Data Source

For this example, we'll connect to Airtable.

3. Create an Access Token

To use Airtable, you need to create an Access Token first:

  1. Visit https://airtable.com/create/tokens
  2. Create a new token and grant it read access to table records and schemas
  3. Give the token access to all bases (resources) you want to import

Save the access token.
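
If you want to verify the token before configuring the connector, you can call Airtable's Web API directly from outside the tool. The sketch below lists the bases the token can access; the token value is a placeholder, not a real credential.

import requests

AIRTABLE_TOKEN = "patXXXXXXXXXXXXXX"  # placeholder, use your own token

# List the bases the token can access (requires the schema read scope).
response = requests.get(
    "https://api.airtable.com/v0/meta/bases",
    headers={"Authorization": f"Bearer {AIRTABLE_TOKEN}"},
    timeout=30,
)
response.raise_for_status()

for base in response.json().get("bases", []):
    print(base["id"], base["name"])

If the request returns your bases, the token and its scopes are set up correctly.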

4. Configure the Connector

Add the access token to the API Key field in the connector setup.

5. Get Your Airtable IDs

Open the table you want to import in Airtable and check the URL in your browser's address bar:

From the URL, extract:

  • Base ID: apptIcKShlYzE1m8q
  • Table ID: tblB4FcyOA4YszByy
  • View ID: viwlQi6665v6g3Iz2
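
If you prefer not to copy the IDs out of the URL by hand, a small helper script can extract them. The sketch below is not part of the tool; it relies only on Airtable's ID prefixes (app for bases, tbl for tables, viw for views).

from urllib.parse import urlparse

def extract_airtable_ids(url):
    # Airtable IDs are recognisable by their prefixes: app (base),
    # tbl (table) and viw (view).
    ids = {}
    for part in urlparse(url).path.split("/"):
        if part.startswith("app"):
            ids["base_id"] = part
        elif part.startswith("tbl"):
            ids["table_id"] = part
        elif part.startswith("viw"):
            ids["view_id"] = part
    return ids

url = "https://airtable.com/apptIcKShlYzE1m8q/tblB4FcyOA4YszByy/viwlQi6665v6g3Iz2"
print(extract_airtable_ids(url))
# {'base_id': 'apptIcKShlYzE1m8q', 'table_id': 'tblB4FcyOA4YszByy',
#  'view_id': 'viwlQi6665v6g3Iz2'}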

6. Complete the Setup

After filling in all the connector details (the API key plus the Base, Table, and View IDs), complete the setup.

Your import job is now scheduled and will run automatically.

7. Create a Sync Job (Optional)

To update the data regularly, you can create a sync job. Click on the calendar icon to open the sync job page.

Click Create Sync Job and, from the dropdown, select the connector you created in the Integrations tab. In our case, that is the Airtable connector.

Give the sync job a name and then click Save.

8. Monitor Your Sync Jobs

Check the Import Job Logs to see if a sync job has already triggered an import. You'll see whether the import succeeded or failed.

Import Job Statistics:

When an import job completes, you'll see detailed statistics about the imported data:

  • File Size: Total size of imported files in bytes
  • Imported Files: Number of files that were imported
  • Duration: How long the import took to complete
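
Because the file size is reported in raw bytes, a small generic helper (again, not part of the tool) can make the number easier to read:

def human_size(num_bytes):
    # Convert a raw byte count into a readable size (binary units).
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if size < 1024 or unit == "TB":
            return f"{size:.1f} {unit}"
        size /= 1024

print(human_size(5_242_880))  # 5.0 MB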

Important Notes:

  • The synchronized data will automatically appear in the Data section
  • On every sync, the old data will be overwritten with fresh data from the source
  • This ensures your data stays up-to-date with the connected system

Use Cases

  • Data Preparation: Clean and prepare data for algorithms of any kind
  • Data Integration: Connect multiple data sources into a unified tool
  • Search & Discovery: Use semantic search to find relevant data quickly
  • AI Training: Prepare and organize training datasets for AI models
  • Plots: Create interactive dashboards with your data

Responsible Developers: Julia, Maxim, Finn, previously Shivam.

