SFTP Flow

SFTP Integration
Core Concepts
1. Platform Overview
2. Clients
3. Schemas
4. Pipelines
5. Typical Workflow
Webhook Listener
1. Understanding Webhooks
2. Webhook Request Structure
3. Webhook Security
4. Example (Node.js)
5. Best Practices
Retrieve processed data
Integration Checklist
Step-by-step

SFTP Integration

FileFeed provides a comprehensive SFTP-based integration platform that automates file-based workflows by receiving, validating, transforming, and routing structured data files. This section covers all aspects of SFTP integration including core concepts, webhook listeners, REST API access, and implementation checklists.

Core Concepts

Understanding the core concepts of the FileFeed platform is essential for a successful integration. This section explains how clients, schemas, and pipelines work together to automate your file processing workflows.

1. Platform Overview

FileFeed is designed to solve the challenge of managing file transfers and data integrations across multiple clients, systems, and file formats. It provides a centralized platform where you can:

Create dedicated SFTP spaces for each client or data source
Define schemas that validate incoming data files
Build automated pipelines that transform files into standardized formats
Send processed data to destination systems via webhooks or API
Monitor file processing and alert on errors

2. Clients

In FileFeed, a Client represents an organization or business entity that interacts with your system through file transfers. Each client has:

Dedicated SFTP space: A secure environment where the client can upload files
SFTP credentials: Username and password for connecting to the SFTP server
SFTP host: The server address displayed on the client’s page for connection
Client ID: A unique identifier used in API calls and data routing
Associated pipelines: Workflows that process files uploaded by this client

Clients are isolated from each other, ensuring data privacy and security. Clients can connect to their dedicated SFTP space using their assigned username and password with the host specified on their configuration page.

// Example client object
{
  "clientId": "c4b3f495-5dfc-4a91-b604-a8e66ab4a220",
  "clientName": "acme_logistics",
  "displayName": "Acme Logistics Inc.",
  "status": "active",
  "created": "2023-05-15T10:30:00Z",
  "lastActive": "2023-06-01T08:45:22Z",
  "sftpUsername": "acme_sftp"
}

3. Schemas

A Schema defines the structure and validation rules for data files. It specifies:

Fields: The columns or properties expected in the data
Data types: The expected type for each field (string, number, date, etc.)
Validation rules: Requirements for each field (required/optional, format, range, etc.)

Schemas help ensure data quality by rejecting files that don’t meet your specifications. They define what data is expected and its format, but do not handle transformations (which are handled separately by pipelines).

// Example schema definition
{
  "schemaId": "ord-schema-v1",
  "name": "Order Schema v1",
  "fileType": "csv",
  "fields": [
    {
      "name": "order_id",
      "type": "string",
      "required": true,
      "validation": {
        "pattern": "^ORD-[0-9]{6}$"
      }
    },
    {
      "name": "customer_email",
      "type": "string",
      "required": true,
      "validation": {
        "format": "email"
      }
    },
    {
      "name": "order_date",
      "type": "date",
      "required": true,
      "sourceFormat": "MM/DD/YYYY",
      "targetFormat": "YYYY-MM-DD"
    },
    {
      "name": "total_amount",
      "type": "number",
      "required": true,
      "validation": {
        "min": 0
      }
    }
  ]
}
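
To make the validation rules concrete, here is a minimal sketch of how one parsed row could be checked against field definitions shaped like the example above. It is illustrative only: validateRow is a hypothetical helper rather than part of FileFeed, the date field is left out for brevity, and the email check is a deliberately simple approximation.

```javascript
// Hypothetical helper, not part of FileFeed: checks one parsed row against
// schema fields shaped like the example schema above.
const schemaFields = [
  { name: 'order_id', type: 'string', required: true, validation: { pattern: '^ORD-[0-9]{6}$' } },
  { name: 'customer_email', type: 'string', required: true, validation: { format: 'email' } },
  { name: 'total_amount', type: 'number', required: true, validation: { min: 0 } }
];

function validateRow(row, fields) {
  const errors = [];
  for (const field of fields) {
    const value = row[field.name];
    const missing = value === undefined || value === null || value === '';
    if (missing) {
      if (field.required) errors.push(`${field.name}: required field is missing`);
      continue;
    }
    const rules = field.validation || {};
    if (field.type === 'number') {
      const num = Number(value);
      if (Number.isNaN(num)) {
        errors.push(`${field.name}: not a number`);
        continue;
      }
      if (rules.min !== undefined && num < rules.min) {
        errors.push(`${field.name}: must be >= ${rules.min}`);
      }
    }
    if (rules.pattern && !new RegExp(rules.pattern).test(String(value))) {
      errors.push(`${field.name}: does not match ${rules.pattern}`);
    }
    // Simplified email check (assumption): something@something.tld
    if (rules.format === 'email' && !/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(String(value))) {
      errors.push(`${field.name}: not a valid email`);
    }
  }
  return errors;
}

const good = { order_id: 'ORD-000123', customer_email: 'buyer@example.com', total_amount: '19.99' };
const bad = { order_id: 'ORD-12', customer_email: 'not-an-email', total_amount: '-5' };
console.log(validateRow(good, schemaFields)); // empty array: the row passes
console.log(validateRow(bad, schemaFields));  // one error for each of the three fields
```

Collecting every error for a row, rather than stopping at the first, tends to make rejected files easier to fix in one pass.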

4. Pipelines

A Pipeline in FileFeed defines how files are processed when uploaded to a specific folder. Each pipeline includes:

Schema: The file structure we’re mapping to (defines expected data format)
Webhook: Optional notification sent when an uploaded file has been processed
Starter file: Template that defines the expected header structure for Excel/CSV files
Mappings: Column mappings (both automatic and manual) from source to target schema
Transformations: Data manipulations that can be applied to specific columns

Each pipeline creates a dedicated folder in the client’s SFTP space. When a file is uploaded to this folder, it automatically triggers the processing based on the defined mappings and transformations.

// Example pipeline configuration
{
  "options": {
    "delimiter": ",",
    "skipHeaderRow": true
  },
  "fieldMappings": [
    {
      "source": "customer_id",
      "target": "id"
    },
    {
      "source": "customer_name",
      "target": "name"
    },
    {
      "source": "customer_email",
      "target": "email",
      "transform": "toLowerCase"
    },
    {
      "source": "customer_phone",
      "target": "phone",
      "transform": "formatPhoneNumber"
    }
  ],
  "transformations": {
    "toLowerCase": "function(value) { return value.toLowerCase(); }",
    "formatPhoneNumber": "function(value) { return value.replace(/[^0-9]/g, ''); }"
  }
}
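
As a rough illustration of what the mappings and transformations above describe, the sketch below applies them to a single source row. How FileFeed evaluates the configuration internally is not documented here, so applyMappings is a hypothetical helper, and the transformations are written as ordinary functions instead of the string form used in the configuration.

```javascript
// Illustrative only: a hypothetical helper, not FileFeed's processing engine.
const fieldMappings = [
  { source: 'customer_id', target: 'id' },
  { source: 'customer_name', target: 'name' },
  { source: 'customer_email', target: 'email', transform: 'toLowerCase' },
  { source: 'customer_phone', target: 'phone', transform: 'formatPhoneNumber' }
];

// Same logic as the example configuration, as plain functions
const transformations = {
  toLowerCase: (value) => value.toLowerCase(),
  formatPhoneNumber: (value) => value.replace(/[^0-9]/g, '')
};

function applyMappings(sourceRow, mappings, transforms) {
  const out = {};
  for (const { source, target, transform } of mappings) {
    let value = sourceRow[source];
    // Only string values are transformed; missing values pass through unchanged
    if (transform && typeof value === 'string') {
      const fn = transforms[transform];
      if (!fn) throw new Error(`Unknown transform: ${transform}`);
      value = fn(value);
    }
    out[target] = value;
  }
  return out;
}

const mapped = applyMappings(
  {
    customer_id: '42',
    customer_name: 'Ada Lovelace',
    customer_email: 'Ada@Example.COM',
    customer_phone: '+1 (555) 010-9999'
  },
  fieldMappings,
  transformations
);
console.log(mapped);
// { id: '42', name: 'Ada Lovelace', email: 'ada@example.com', phone: '15550109999' }
```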

5. Typical Workflow

Here’s a typical workflow in FileFeed that illustrates how clients, schemas, and pipelines work together:

Client Setup: You create a new client in FileFeed, which generates SFTP credentials and a dedicated SFTP space.
Schema Definition: You define a schema that specifies the target data structure you want to receive after processing.
Pipeline Creation: You create a pipeline and associate it with the schema. This automatically creates a dedicated folder in the client’s SFTP space.
Starter File Setup: You upload a template file with the expected header structure for the files your client will upload.
Field Mapping: You define mappings between source columns in the uploaded files and target fields in your schema.
Transformation Setup: You create JavaScript transformation functions to apply to specific fields.
Webhook Configuration: You set up an optional webhook URL to receive notifications when files are processed.
File Upload: The client uploads a file to their dedicated pipeline folder in the SFTP space.
Automatic Processing: FileFeed detects the new file and applies the defined mappings and transformations.
Notification: If a webhook is configured, a notification is sent to the specified URL.

Webhook Listener

When files are processed in FileFeed, our system can notify your application through webhooks. Follow these steps to implement a webhook listener:

1. Understanding Webhooks

FileFeed webhooks send HTTP POST requests to your specified endpoint when certain events occur:

GENERAL: Sent for all file processing events, including successful processing and error situations

2. Webhook Request Structure

Below is an example of the JSON payload sent to your webhook endpoint.

{
  "event": "GENERAL",
  "timestamp": "2025-05-19T21:30:00.000Z",
  "data": {
    "fileId": "550e8400-e29b-41d4-a716-446655440000",
    "filename": "example_data.csv",
    "clientName": "acme_inc",
    "status": "completed",
    "processedFilename": "example_data_processed.csv",
    "jsonFilename": "example_data.json",
    "size": 24680,
    "processedAt": "2025-05-19T21:29:55.000Z"
  }
}

3. Webhook Security

All webhook requests include a signature to verify authenticity:

Requests contain an x-sftpsync-signature header
The signature is an HMAC-SHA256 hash of the request body using your webhook secret
The secret can be found in the webhook configuration section of your dashboard (Dashboard -> Webhooks -> Configuration -> View Configuration button -> Get secret)
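
For exercising a listener locally before real traffic arrives, it can help to produce a signature yourself. The sketch below signs a sample body and checks it, assuming the signature is hex-encoded and computed over the exact bytes of the body. The helper names signBody and isValidSignature and the secret value are placeholders, not part of FileFeed.

```javascript
const crypto = require('crypto');

// Hex-encoded HMAC-SHA256 of the raw body (hex encoding is an assumption)
function signBody(rawBody, secret) {
  return crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
}

function isValidSignature(rawBody, signature, secret) {
  const expected = Buffer.from(signBody(rawBody, secret), 'utf8');
  const received = Buffer.from(String(signature), 'utf8');
  // timingSafeEqual throws on buffers of different length, so compare lengths first
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}

const secret = 'test_secret'; // placeholder, not a real secret
const body = JSON.stringify({
  event: 'GENERAL',
  data: { filename: 'example_data.csv', status: 'completed' }
});
const signature = signBody(body, secret);

console.log(isValidSignature(body, signature, secret));       // true
console.log(isValidSignature(body + ' ', signature, secret)); // false: body was altered
```

The second check shows why the raw bytes matter: even a single added space changes the digest.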

4. Example (Node.js)

Here’s a basic Node.js example using Express to listen for FileFeed webhooks and verify their signatures.

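const express = require('express');
const crypto = require('crypto');
const app = express();

// Keep the raw request bytes: the signature covers the body as it was sent,
// and re-serializing the parsed JSON may not reproduce it byte for byte.
app.use(express.json({
  verify: (req, res, buf) => { req.rawBody = buf; }
}));

// Load the secret from the environment instead of hard-coding it
const WEBHOOK_SECRET = process.env.WEBHOOK_SECRET;

function verifySignature(rawBody, signature) {
  const hmac = crypto.createHmac('sha256', WEBHOOK_SECRET);
  const digest = hmac.update(rawBody).digest('hex');
  const expected = Buffer.from(digest);
  const received = Buffer.from(signature);
  // timingSafeEqual throws if the lengths differ, so check them first
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}

app.post('/webhooks/sftpsync', (req, res) => {
  const signature = req.headers['x-sftpsync-signature'];
  const requestBody = req.body;

  if (!signature || !req.rawBody || !verifySignature(req.rawBody, signature)) {
    return res.status(401).send('Invalid signature');
  }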

  const event = requestBody.event;

  switch (event) {
    case 'GENERAL': {
      const { filename, status } = requestBody.data;
      console.log(`File ${filename} event received with status: ${status}`);
      
      if (status === 'completed') {
        console.log(`File ${filename} processed successfully`);
      } else if (status === 'failed') {
        const errorMessage = requestBody.data.errorMessage || 'Unknown error';
        console.log(`File ${filename} processing failed: ${errorMessage}`);
      }
      break;
    }
    default:
      console.log(`Unknown event type: ${event}`);
  }

  res.status(200).send('Webhook received');
});

app.listen(3000, () => {
  console.log('Webhook listener running on port 3000');
});

5. Best Practices

Respond quickly (within 5 seconds) to avoid webhook timeouts
Implement idempotency to handle potential duplicate webhook deliveries
Use a queue system for processing webhook data asynchronously
Store your webhook secret securely
Implement proper error handling
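
A minimal sketch of the first three practices follows: accept the webhook quickly, skip duplicates, and defer the real work to a queue. The in-memory Set and array are stand-ins for a durable store and a real queue, and using fileId plus status as the deduplication key is an assumption, not something FileFeed prescribes.

```javascript
// In-memory stand-ins (assumptions): use a durable store and a real queue in production.
const seenKeys = new Set();
const queue = [];

// Called from the HTTP handler; does no heavy work so the response stays fast.
function acceptWebhook(payload) {
  const fileId = payload && payload.data && payload.data.fileId;
  if (!fileId) return 'rejected';
  const key = `${fileId}:${payload.data.status}`;
  if (seenKeys.has(key)) return 'duplicate'; // idempotency: ignore repeat deliveries
  seenKeys.add(key);
  queue.push(payload);
  return 'queued';
}

// Called outside the request cycle, for example by a worker or a timer.
function processQueue(handler) {
  let processed = 0;
  while (queue.length > 0) {
    const payload = queue.shift();
    try {
      handler(payload);
      processed += 1;
    } catch (err) {
      // One failing payload should not block the rest of the queue
      console.error(`Processing failed for ${payload.data.fileId}: ${err.message}`);
    }
  }
  return processed;
}

const samplePayload = {
  event: 'GENERAL',
  data: { fileId: '550e8400-e29b-41d4-a716-446655440000', status: 'completed' }
};
console.log(acceptWebhook(samplePayload)); // queued
console.log(acceptWebhook(samplePayload)); // duplicate
console.log(processQueue((p) => console.log(`Handling ${p.data.fileId}`))); // 1
```

In the Express handler, acceptWebhook would run right after signature verification, with the 200 response sent regardless of whether the result is queued or duplicate.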

Retrieve processed data

Use either the TypeScript SDK or the REST API to fetch processed JSON rows for a pipeline run.

TypeScript SDK:

import FileFeed from '@filefeed/sdk';

const filefeed = new FileFeed({ apiKey: process.env.FILEFEED_API_KEY! });

// Get recent completed runs, then paginate data for a run
const runs = await filefeed.pipelineRuns.list({ status: 'completed', limit: 25 });
for (const run of runs.data) {
  let offset: number | null = 0;
  do {
    const page = await filefeed.pipelineRuns.getData({ pipelineRunId: run.id, limit: 1000, offset });
    // process page.data
    offset = page.data.length === 1000 ? (offset ?? 0) + page.data.length : null;
  } while (offset !== null);
  // Acknowledge once persisted
  await filefeed.pipelineRuns.ack({ pipelineRunId: run.id });
}

REST API:

curl -X GET "https://api.sftpsync.io/files/pipeline-runs/run_123?offset=0&limit=1000" \
  -H "X-API-Key: $API_KEY"

Example response:

{
  "data": [ { "id": "123", "email": "user@example.com" } ],
  "metadata": {
    "pipelineRunId": "run_123",
    "offset": 0,
    "limit": 1000,
    "hasMore": false
  }
}
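
The hasMore flag in the metadata suggests a simple paging loop. The sketch below shows one way to drain all pages, assuming offset counts rows and that each response has the shape shown above. collectAllRows and fetchPage are hypothetical names, and the page source is injected so the loop can be tried without network access.

```javascript
// fetchPage(offset, limit) is expected to resolve to an object shaped like the
// example response: { data: [...], metadata: { hasMore } }.
async function collectAllRows(fetchPage, limit = 1000) {
  const rows = [];
  let offset = 0;
  let hasMore = true;
  while (hasMore) {
    const page = await fetchPage(offset, limit);
    rows.push(...page.data);
    offset += page.data.length;
    // Also stop on an empty page, to avoid looping forever on a bad response
    hasMore = Boolean(page.metadata && page.metadata.hasMore) && page.data.length > 0;
  }
  return rows;
}

// Fake page source standing in for the REST call
const allRows = Array.from({ length: 5 }, (_, i) => ({ id: String(i) }));
const fakeFetchPage = async (offset, limit) => ({
  data: allRows.slice(offset, offset + limit),
  metadata: { offset, limit, hasMore: offset + limit < allRows.length }
});

collectAllRows(fakeFetchPage, 2).then((rows) => console.log(rows.length)); // 5
```

In a real integration, fetchPage would issue the GET request shown above with the X-API-Key header and return the parsed JSON body.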

Finding the pipeline run ID:

In the app: Dashboard → Pipeline Runs
Via webhook: included in webhook payloads

All requests require an API key. Store it securely and never commit it.

Integration Checklist

Use this checklist to ensure your FileFeed integration is properly configured and ready for production use.

Step-by-step

Once you’ve completed all items in this checklist, your FileFeed integration should be ready for production use.
