# SFTP Flow

> Complete guide to integrating with FileFeed through SFTP, webhooks, and REST API

<Note>
  The entity formerly called **Client** is now called **Connection** as of API
  version `2026-05-25`. Legacy `/clients` endpoints and `Client` types remain
  available — see the [migration guide](/migration/v1-to-v2). This page uses
  the new name throughout; if you're still on `2024-09-01`, mentally substitute
  "client" wherever you see "connection".
</Note>

## SFTP Integration

FileFeed provides a comprehensive SFTP-based integration platform that automates
file-based workflows by receiving, validating, transforming, and routing structured
data files. This section covers all aspects of SFTP integration including core
concepts, webhook listeners, REST API access, and implementation checklists.

## Core Concepts

Understanding the fundamental concepts of the FileFeed platform is essential for
successful integration. This section explains how connections, schemas, and pipelines
work together to automate your file processing workflows.

### 1. Platform Overview

FileFeed is designed to solve the challenge of managing file transfers and data
integrations across multiple data sources, systems, and file formats. It provides a
centralized platform where you can:

* Create dedicated SFTP spaces for each connection (data source)
* Define schemas that validate incoming data files
* Build automated pipelines that transform files into standardized formats
* Send processed data to destination systems via webhooks or API
* Monitor file processing and alert on errors

<a id="2-clients" />

### 2. Connections

In FileFeed, a **Connection** (formerly called *Client*) represents one
data-source endpoint — typically an organization or business entity that
sends files to FileFeed via SFTP. Each connection has:

* **Dedicated SFTP space**: A secure
  environment where the data source can upload files
* **SFTP credentials**: Username and
  password for connecting to the SFTP server
* **SFTP host**: The server address,
  shown on the connection's page
* **Connection ID**: A unique identifier
  used in API calls and data routing
* **Associated pipelines**: Workflows
  that process files uploaded into this connection

Connections are isolated from each other, ensuring data privacy and security.
Each data source connects to its dedicated SFTP space using the assigned username
and password, with the host shown on the connection's configuration page.

```json theme={null}
// Example connection object (v1 callers may see this returned as a `client`).
// Credentials (sftpPassword / sftpPrivateKey / sftpPassphrase / awsPassword)
// are write-only and never returned on reads.
{
  "id": "c4b3f495-5dfc-4a91-b604-a8e66ab4a220",
  "name": "Acme Logistics Inc.",
  "type": "SFTP",
  "sftpUsername": "acme_sftp",
  "useHostedSFTP": true,
  "awsUserName": "ws-acme-logi-7t9w",
  "sftpServer": {
    "id": "srv_123",
    "host": "sftp.sftpsync.io",
    "port": 22,
    "isAwsHosted": true,
    "status": "active"
  },
  "createdAt": "2023-05-15T10:30:00Z",
  "updatedAt": "2023-06-01T08:45:22Z"
}
```

<Note>
  **FileFeed-hosted vs self-hosted SFTP.** Set `useHostedSFTP: true` (default) to
  have FileFeed provision the SFTP server. For a **self-hosted** connection
  (`useHostedSFTP: false`) FileFeed dials out to your own server — supply
  `sftpHost`, `sftpPort`, `sftpRemotePath`, and credentials on create/update.
</Note>
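
As a rough illustration of the two modes, the helper below assembles a request body for either a FileFeed-hosted or a self-hosted connection. The field names come from this page; the helper `buildConnectionPayload`, its validation rules (including treating `sftpUsername` as required), and the exact request shape the API accepts are assumptions, so check the API reference before relying on it.

```javascript
// Hypothetical helper: assembles a connection request body.
// Field names are taken from this page; the validation rules are assumptions.
function buildConnectionPayload({ name, selfHosted = false, sftp = {} }) {
  if (!name) throw new Error('name is required');

  // FileFeed-hosted: FileFeed provisions the SFTP server for you.
  if (!selfHosted) {
    return { name, type: 'SFTP', useHostedSFTP: true };
  }

  // Self-hosted: FileFeed dials out, so it needs to know where to connect.
  const required = ['sftpHost', 'sftpPort', 'sftpRemotePath', 'sftpUsername'];
  const missing = required.filter(
    (key) => sftp[key] === undefined || sftp[key] === ''
  );
  if (missing.length > 0) {
    throw new Error(`Missing self-hosted SFTP settings: ${missing.join(', ')}`);
  }

  // Credentials are write-only and never returned on reads.
  if (!sftp.sftpPassword && !sftp.sftpPrivateKey) {
    throw new Error('Provide sftpPassword or sftpPrivateKey');
  }

  return { name, type: 'SFTP', useHostedSFTP: false, ...sftp };
}

// Example: a self-hosted connection (all values are placeholders).
const payload = buildConnectionPayload({
  name: 'Acme Logistics Inc.',
  selfHosted: true,
  sftp: {
    sftpHost: 'sftp.example.com',
    sftpPort: 22,
    sftpRemotePath: '/inbound',
    sftpUsername: 'acme_sftp',
    sftpPassword: 'placeholder-password',
  },
});
console.log(payload.useHostedSFTP, payload.sftpHost);
```

Validating before sending surfaces a missing host or path locally instead of as an API error.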

### 3. Schemas

A **Schema** defines the structure and
validation rules for data files. It specifies:

* **Fields**: The columns or properties
  expected in the data
* **Data types**: The expected type for
  each field (string, number, date, etc.)
* **Validation rules**: Requirements for
  each field (required/optional, format, range, etc.)

Schemas help ensure data quality by rejecting files that don't meet your
specifications. They define what data is expected and its format, but do not
handle transformations (which are handled separately by pipelines).

```json theme={null}
// Example schema definition
{
  "schemaId": "ord-schema-v1",
  "name": "Order Schema v1",
  "fileType": "csv",
  "fields": [
    {
      "name": "order_id",
      "type": "string",
      "required": true,
      "validation": {
        "pattern": "^ORD-[0-9]{6}$"
      }
    },
    {
      "name": "customer_email",
      "type": "string",
      "required": true,
      "validation": {
        "format": "email"
      }
    },
    {
      "name": "order_date",
      "type": "date",
      "required": true,
      "sourceFormat": "MM/DD/YYYY",
      "targetFormat": "YYYY-MM-DD"
    },
    {
      "name": "total_amount",
      "type": "number",
      "required": true,
      "validation": {
        "min": 0
      }
    }
  ]
}
```

<Note>
  The JSON above illustrates the **concept** of a schema. Over the API a schema's
  structure is supplied as a single JSON Schema `definition` object
  (`{ type, properties, required }`). See the [Schema endpoints](/api-reference/openapi)
  and the SDK `schemas.create({ name, definition })` example. The legacy `fields`
  array is not used by the API.
</Note>
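
To make the relationship between the two shapes concrete, here is a minimal sketch that converts a conceptual `fields` array into a JSON Schema `definition` object. The function `fieldsToDefinition` is hypothetical, and the keyword mapping (`min` to `minimum`, `date` to a string with `format: "date"`) is an assumption based on standard JSON Schema; which keywords FileFeed enforces is not stated on this page.

```javascript
// Sketch: convert the conceptual `fields` array into a JSON Schema
// `definition` object ({ type, properties, required }).
// The keyword mapping is an assumption, not FileFeed's documented behavior.
function fieldsToDefinition(fields) {
  const properties = {};
  const required = [];

  for (const field of fields) {
    const validation = field.validation || {};

    // JSON Schema has no "date" type; dates are strings with a format.
    const property = field.type === 'date'
      ? { type: 'string', format: 'date' }
      : { type: field.type };

    if (validation.pattern) property.pattern = validation.pattern;
    if (validation.format) property.format = validation.format;
    if (validation.min !== undefined) property.minimum = validation.min;

    properties[field.name] = property;
    if (field.required) required.push(field.name);
  }

  return { type: 'object', properties, required };
}

const definition = fieldsToDefinition([
  { name: 'order_id', type: 'string', required: true, validation: { pattern: '^ORD-[0-9]{6}$' } },
  { name: 'customer_email', type: 'string', required: true, validation: { format: 'email' } },
  { name: 'order_date', type: 'date', required: true },
  { name: 'total_amount', type: 'number', required: true, validation: { min: 0 } },
]);
console.log(JSON.stringify(definition, null, 2));
```

The resulting object is the kind of value you would pass as `definition` when creating a schema.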

### 4. Pipelines

A **Pipeline** in FileFeed defines how
files are processed when uploaded to a specific folder. Each pipeline includes:

* **Schema**: The file structure we're
  mapping to (defines expected data format)
* **Webhook**: Optional notification when
  new files are uploaded
* **Starter file**: Template that defines
  the expected header structure for Excel/CSV files
* **Mappings**: Column mappings (both
  automatic and manual) from source to target schema
* **Transformations**: Data manipulations
  that can be applied to specific columns

Each pipeline creates a dedicated folder in the connection's SFTP space. When a file
is uploaded to this folder, it automatically triggers processing based on the
defined mappings and transformations.

```json theme={null}
// Example pipeline configuration
{
  "options": {
    "delimiter": ",",
    "skipHeaderRow": true
  },
  "fieldMappings": [
    {
      "source": "customer_id",
      "target": "id"
    },
    {
      "source": "customer_name",
      "target": "name"
    },
    {
      "source": "customer_email",
      "target": "email",
      "transform": "toLowerCase"
    },
    {
      "source": "customer_phone",
      "target": "phone",
      "transform": "formatPhoneNumber"
    }
  ],
  "transformations": {
    "toLowerCase": "function(value) { return value.toLowerCase(); }",
    "formatPhoneNumber": "function(value) { return value.replace(/[^0-9]/g, ''); }"
  }
}
```

Each entry in `fieldMappings` is one of three kinds:

* **Sourced** — `{ "source": "...", "target": "...", "transform": "..." }` copies a column from the input into the target field.
* **Static value** — `{ "target": "...", "value": "..." }` writes a fixed constant into the target field on **every row**, regardless of the input. Omit `source` and provide `value` instead. Useful for stamping metadata that isn't in the file (a source system, region, batch tag). Transforms don't apply to a static value — the constant is written verbatim. Works in both inbound and outbound pipelines.
* **Aggregated** — `{ "sources": ["...", "..."], "target": "...", "delimiter": " " }` joins **several** input columns into one target field, in order, joined by `delimiter` (default a single space), skipping empty values so the delimiter never dangles. Provide `sources` instead of `source`. A `transform`, if set, runs on the joined result.

```json theme={null}
// A static value alongside sourced mappings
{
  "fieldMappings": [
    { "source": "customer_id", "target": "id" },
    { "target": "source_system", "value": "FileFeed" }
  ]
}
```

```json theme={null}
// Combine first + last name into one column
{
  "fieldMappings": [
    { "sources": ["first_name", "last_name"], "target": "full_name", "delimiter": " " }
  ]
}
```

<Note>
  Aggregated (multi-source) mappings are in **limited availability** and must be
  enabled for your workspace. Until then, saving a pipeline that uses `sources`
  returns `400`. Contact [support@filefeed.io](mailto:support@filefeed.io) to
  enable the field-aggregation feature.
</Note>
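
The three mapping kinds can be simulated locally to check what a row should look like after processing. The sketch below is not FileFeed's implementation: `applyMappings` is a hypothetical function, transforms are passed as plain functions keyed by name, and "empty" is assumed to mean `undefined`, `null`, or an empty string.

```javascript
// Local simulation of sourced, static, and aggregated mappings.
// Illustrative only; the real processing happens inside FileFeed.
function applyMappings(row, fieldMappings, transforms = {}) {
  const output = {};

  for (const mapping of fieldMappings) {
    let value;

    if (Array.isArray(mapping.sources)) {
      // Aggregated: join non-empty source values, in order.
      const delimiter = mapping.delimiter ?? ' ';
      value = mapping.sources
        .map((name) => row[name])
        .filter((v) => v !== undefined && v !== null && v !== '')
        .join(delimiter);
    } else if (mapping.source !== undefined) {
      // Sourced: copy a single input column.
      value = row[mapping.source];
    } else {
      // Static: write the constant verbatim; transforms are not applied.
      output[mapping.target] = mapping.value;
      continue;
    }

    // For aggregated mappings the transform runs on the joined result.
    const transform = transforms[mapping.transform];
    output[mapping.target] = transform ? transform(value) : value;
  }

  return output;
}

const result = applyMappings(
  {
    customer_id: '42',
    customer_email: 'ADA@Example.com',
    first_name: 'Ada',
    middle_name: '',
    last_name: 'Lovelace',
  },
  [
    { source: 'customer_id', target: 'id' },
    { source: 'customer_email', target: 'email', transform: 'toLowerCase' },
    { target: 'source_system', value: 'FileFeed' },
    { sources: ['first_name', 'middle_name', 'last_name'], target: 'full_name', delimiter: ' ' },
  ],
  { toLowerCase: (value) => value.toLowerCase() }
);
console.log(result);
```

Because the empty `middle_name` is skipped, `full_name` comes out as `Ada Lovelace` with a single space rather than two.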

### 5. Typical Workflow

Here's a typical workflow in FileFeed that illustrates how connections,
schemas, and pipelines work together:

1. **Connection Setup**: You create a new
   connection in FileFeed, which generates SFTP credentials and a dedicated SFTP space.
2. **Schema Definition**: You define a
   schema that specifies the target data structure you want to receive after
   processing.
3. **Pipeline Creation**: You create a
   pipeline and associate it with the schema. This automatically creates a
   dedicated folder in the connection's SFTP space.
4. **Starter File Setup**: You upload a
   template file with the expected header structure for the files the
   sending team will upload.
5. **Field Mapping**: You define mappings
   between source columns in the uploaded files and target fields in your schema.
6. **Transformation Setup**: You create
   JavaScript transformation functions to apply to specific fields.
7. **Webhook Configuration**: You set up
   an optional webhook URL to receive notifications when files are processed.
8. **File Upload**: The sender uploads a
   file to the dedicated pipeline folder in the SFTP space.
9. **Automatic Processing**: FileFeed
   detects the new file and applies the defined mappings and transformations.
10. **Notification**: If a webhook is
    configured, a notification is sent to the specified URL.

## Webhook Listener

When files are processed in FileFeed, our system can notify your application
through webhooks. Follow these steps to implement a webhook listener:

### 1. Understanding Webhooks

FileFeed webhooks send HTTP POST requests to your specified endpoint when
certain events occur:

* **GENERAL:** Sent for all file processing events, including
  successful processing and error situations

### 2. Webhook Request Structure

Below is an example of the JSON payload sent to your webhook endpoint.

```json theme={null}
{
  "event": "GENERAL",
  "timestamp": "2025-05-19T21:30:00.000Z",
  "data": {
    "fileId": "550e8400-e29b-41d4-a716-446655440000",
    "filename": "example_data.csv",
    "clientName": "acme_inc",
    "status": "completed",
    "processedFilename": "example_data_processed.csv",
    "jsonFilename": "example_data.json",
    "size": 24680,
    "processedAt": "2025-05-19T21:29:55.000Z"
  }
}
```

### 3. Webhook Security

All webhook requests include a signature to verify authenticity:

* Requests contain an `x-sftpsync-signature` header
* The signature is an HMAC-SHA256 hash of the request body using your webhook secret
* The secret can be found in the webhook configuration section of your dashboard (Dashboard → Webhooks → Configuration → View Configuration button → Get secret)

### 4. Example (Node.js)

Here's a basic Node.js example using Express to listen for FileFeed webhooks
and verify their signatures.

```javascript theme={null}
const express = require('express');
const crypto = require('crypto');
const app = express();

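// Keep the raw request bytes: the HMAC is computed over the request body as
// sent, and re-serializing the parsed JSON is not guaranteed to reproduce it.
app.use(express.json({
  verify: (req, _res, buf) => { req.rawBody = buf; }
}));

// Placeholder: load the real secret from a secret store or environment variable.
const WEBHOOK_SECRET = 'your_webhook_secret';

function verifySignature(rawBody, signature) {
  const digest = crypto
    .createHmac('sha256', WEBHOOK_SECRET)
    .update(rawBody)
    .digest('hex');
  const expected = Buffer.from(digest);
  const received = Buffer.from(signature);
  // timingSafeEqual throws when lengths differ, so compare lengths first.
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}

app.post('/webhooks/sftpsync', (req, res) => {
  const signature = req.headers['x-sftpsync-signature'];
  const requestBody = req.body;

  if (!signature || !req.rawBody || !verifySignature(req.rawBody, signature)) {
    return res.status(401).send('Invalid signature');
  }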

  const event = requestBody.event;

  switch (event) {
    case 'GENERAL': {
      const { filename, status } = requestBody.data;
      console.log(`File ${filename} event received with status: ${status}`);
      
      if (status === 'completed') {
        console.log(`File ${filename} processed successfully`);
      } else if (status === 'failed') {
        const errorMessage = requestBody.data.errorMessage || 'Unknown error';
        console.log(`File ${filename} processing failed: ${errorMessage}`);
      }
      break;
    }
    default:
      console.log(`Unknown event type: ${event}`);
  }

  res.status(200).send('Webhook received');
});

app.listen(3000, () => {
  console.log('Webhook listener running on port 3000');
});
```
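
To exercise the listener locally before pointing FileFeed at it, you can sign a sample body yourself. This is a minimal sketch: `signBody` is a hypothetical helper, the payload is a trimmed copy of the example above, and it assumes the signature is the hex-encoded HMAC-SHA256 of the exact body string, as the listener expects.

```javascript
const crypto = require('crypto');

// Hypothetical helper for local testing: hex HMAC-SHA256 of a raw body string,
// mirroring the verification logic in the listener above.
function signBody(secret, rawBody) {
  return crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
}

const secret = 'your_webhook_secret'; // must match the listener's secret
const body = JSON.stringify({
  event: 'GENERAL',
  timestamp: '2025-05-19T21:30:00.000Z',
  data: {
    fileId: '550e8400-e29b-41d4-a716-446655440000',
    filename: 'example_data.csv',
    status: 'completed',
  },
});

const signature = signBody(secret, body);

// Print a curl command that sends the signed body to the local listener.
console.log(`curl -X POST http://localhost:3000/webhooks/sftpsync \\
  -H 'Content-Type: application/json' \\
  -H 'x-sftpsync-signature: ${signature}' \\
  -d '${body}'`);
```

Running the printed command should return `Webhook received`; changing any character of the body or signature should return `401`.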

### 5. Best Practices

* Respond quickly (within 5 seconds) to avoid webhook timeouts
* Implement idempotency to handle potential duplicate webhook deliveries
* Use a queue system for processing webhook data asynchronously
* Store your webhook secret securely
* Implement proper error handling
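
A minimal sketch of the first three practices, using in-memory stand-ins. The function names are hypothetical, a production system would use a durable store and a real queue, and deduplicating on `fileId` plus `status` is an assumption about what identifies a delivery.

```javascript
// In-memory stand-ins; replace with a durable store and queue in production.
const seenDeliveries = new Set();
const jobQueue = [];

// Assumption: fileId + status identifies a delivery well enough to dedupe on.
function deliveryKey(payload) {
  return `${payload.data.fileId}:${payload.data.status}`;
}

// Call from the request handler after signature verification.
// Returns true if the payload was queued, false if it was a duplicate.
function acceptWebhook(payload) {
  const key = deliveryKey(payload);
  if (seenDeliveries.has(key)) return false;
  seenDeliveries.add(key);
  jobQueue.push(payload); // defer the slow work so the handler can respond fast
  return true;
}

// Worker: drains the queue outside the request/response cycle.
async function drainQueue(handle) {
  while (jobQueue.length > 0) {
    const payload = jobQueue.shift();
    try {
      await handle(payload);
    } catch (err) {
      // Forget the key so a later redelivery can retry this file.
      seenDeliveries.delete(deliveryKey(payload));
      console.error(`Failed to process ${payload.data.filename}: ${err.message}`);
    }
  }
}
```

In the Express handler you would call `acceptWebhook(req.body)`, respond with `200` immediately whether or not it was a duplicate, and run `drainQueue` from a separate worker or timer.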

## Retrieve processed data

Use either the TypeScript SDK or the REST API to fetch processed JSON rows for a pipeline run.

<Tabs>
  <Tab title="TypeScript SDK">
    ```ts theme={null}
    import FileFeed from '@filefeed/sdk';

    const filefeed = new FileFeed({ apiKey: process.env.FILEFEED_API_KEY! });

    // Get recent completed runs, then paginate data for a run
    const runs = await filefeed.pipelineRuns.list({ status: 'completed', limit: 25 });
    for (const run of runs.data) {
      let offset: number | null = 0;
      do {
        const page = await filefeed.pipelineRuns.getData({ pipelineRunId: run.id, limit: 1000, offset });
        // process page.data
        offset = page.data.length === 1000 ? (offset ?? 0) + page.data.length : null;
      } while (offset !== null);
      // Acknowledge once persisted
      await filefeed.pipelineRuns.ack({ pipelineRunId: run.id });
    }
    ```
  </Tab>

  <Tab title="REST API">
    ```bash theme={null}
    curl -X GET "https://api.sftpsync.io/files/pipeline-runs/run_123?offset=0&limit=1000" \
      -H "X-API-Key: $API_KEY"
    ```

    Example response:

    ```json theme={null}
    {
      "data": [ { "id": "123", "email": "a@b.com" } ],
      "metadata": {
        "pipelineRunId": "run_123",
        "offset": 0,
        "limit": 1000,
        "hasMore": false
      }
    }
    ```
  </Tab>
</Tabs>
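
When calling the REST endpoint directly, pagination can be driven by `metadata.hasMore` from the response. The sketch below assumes Node 18+ (for the global `fetch`) and the URL, header, and response shape shown in the REST tab; `fetchAllRows` and its `fetchImpl` parameter are hypothetical names, the latter added so the function can be tested without network access.

```javascript
// Sketch: page through a run's rows over REST until `metadata.hasMore` is false.
// Assumes the endpoint and response shape shown in the REST example.
async function fetchAllRows({ runId, apiKey, limit = 1000, fetchImpl = fetch }) {
  const rows = [];
  let offset = 0;

  while (true) {
    const url =
      `https://api.sftpsync.io/files/pipeline-runs/${encodeURIComponent(runId)}` +
      `?offset=${offset}&limit=${limit}`;
    const response = await fetchImpl(url, { headers: { 'X-API-Key': apiKey } });
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }

    const page = await response.json();
    rows.push(...page.data);

    // Stop on the last page, or on an empty page to avoid looping forever.
    if (!page.metadata.hasMore || page.data.length === 0) break;
    offset += page.data.length;
  }

  return rows;
}
```

A call such as `await fetchAllRows({ runId: 'run_123', apiKey: process.env.FILEFEED_API_KEY })` returns every row in one array; for very large runs, processing each page as it arrives would keep memory use flat.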

Finding the pipeline run ID:

* In the app: Dashboard → Pipeline Runs
* Via webhook: included in webhook payloads

<Note>
  All requests require an API key. Store it securely and never commit it.
</Note>

## Integration Checklist

Use this checklist to ensure your FileFeed integration is properly configured
and ready for production use.

### Step-by-step

* [ ] Get API key (Dashboard → My Account → Security Settings)
* [ ] Create Connection (SFTP credentials)
* [ ] Define Schema (fields, validation)
* [ ] Create Webhook (Dashboard → Webhooks)
* [ ] Create and activate Pipeline (link connection + schema; mappings/transforms)
* [ ] [Register Webhook](/automated-flows/sftp-flow#4-example-node-js) ([store secret; verify HMAC signature](/automated-flows/sftp-flow#3-webhook-security))
* [ ] Upload a sample file and confirm run "completed"
* [ ] Retrieve processed data and persist (SDK or REST)
* [ ] Acknowledge the pipeline run (optional)
* [ ] Monitor runs and webhook deliveries; set alerts

<Tip>
  Once you've completed all items in this checklist, your FileFeed integration
  should be ready for production use.
</Tip>
